online_advisor is simple Postgres extension which tries to advice you creation of some extra indexes and extended statistics
which can increase speed of your queries.
online_advisor use mostly the same technique as standard Postgres extension auto_explain.
It sets up executor hook to enable instrumentation and analyze instrumentation results at the end of query execution.
Right now online_advisor detects three patters:
- Quals filtering large number of records. It is assumed that such quals can be replaced with index scan.
- Misestimated clauses: nodes with number of estimated rows significantly smaller than actual number of returned rows. It usually caused by lack/inaccuracy of statistics or lack of knowledge about correlation between columns. It can be addressed by creating extended (multicolumn) statistics.
- Large planning overhead caused by not using prepared statements for simple queries.
Create index proposals:
- Detects plan nodes with number of filtered records exceeds some threshold.
- Collects columns used in this predicate.
- Proposes index creation on this columns.
- Tries to cover different sets of columns with one compound index.
- Checks if such index already exists.
Create extended statistic proposals:
- Detects plan node with actual/estimated returned rows number exceeds some threshold.
- Collects columns accessed by this node.
- Proposes create extended statistics statement fort this columns.
- Checks if such statistics already available.
Prepared statements:
- Calculate planning and execution time (you can enable logging of them for each query) and planning overhead (planning_time/execution_time)
- Accumulate maximal/total planning/execution time and maximal/average planning overhead.
- Log statements which planning overhead exceeds
prepare_threshold
- Checks operators used in the predicate. For example, query predicate
x > 10 and y < 100can not be optimized using compound index on (x,y). But right nowonline_advisordoesn't detect it. - Suggests indexes for table joins or eliminating sort for order by clauses.
- Estimate effect of adding suggested index. There is another extension - hypothetical indexes https://github.com/HypoPG/hypopg# which allows to do it. It can be used in conjunction with
online_advisor. online_advisordoesn't create indexes or extended statistics itself. It just makes recommendations to create them. Please also notice that to make optimizer use created indexes or statistics, you should better explicitly callvacuum analyzefor this table.- Generalize queries (exclude constants) and maintain list of queries which should be prepared
- In pg14-16
online_advisorshould be included inpreload_shared_librarieslist. - You should
create extensiononline_advisor` in each database you want to inspect. - To activate online_advisor you need to call any of it's function, for example
get_executor_stats(). - "max_index_proposals" and "max_stat_proposals" can be set only once - prior to activation of online_advisor extension.
online_advisor has the following GUCs:
- "online_advisor.filtered_threshold": specifies threshold for number of filtered records (default 1000)
- "online_advisor.misestimation_threshold": threshold for actual/estimated #rows ratio (default 10)
- "online_advisor.min_rows": minimal number of returns nodes for which misestimation is considered (default 1000)
- "online_advisor.max_index_proposals": maximal number of tracked clauses to propose index creation (default 1000).
- "online_advisor.max_stat_proposals": maximal number of tracked clauses to propose extended statistics creation (default 1000).
- "online_advisor.do_instrumentation": allows to switch on/off instrumentation and so collecting of data by
online_advisor(default on). - "online_advisor.log_duration": log planning/execution time for each query (default off).
- "online_advisor.prepare_threshold": minimal planning/execution time relation for suggesting use of prepared statements (default 1.0).
- Results can be obtained through
proposed_indexesandproposed_statisticsviews having the following columns:
CREATE TYPE index_candidate as (
n_filtered float8, -- total number of filtered columns
n_called bigint, -- how much time this combination of columns was used
elapsed_sec float8, -- elapsed time spent in this nodes
create_index text -- statement to create proposed index
);
CREATE TYPE statistic_candidate as (
misestimation float8, -- maximal actual/estimated ratio
n_called bigint, -- how much time this combination of columns was used
elapsed_sec float8, -- elapsed time spent in this nodes
create_statistics text -- statement to create proposed extended statistics
);
- Suggested indexes and statistics are reported separately for each database.
- By default
online_advisortries to propose single compound index covering different use cases. For example, queriesselect * from T where x=?andselect * from T where x=? and y=?can use one compound index (x,y). But queriesselect * from T where x=? and y=?andselect * from T where x=? and z=?can not use compound index (x,y,z).online_advisoralso provides functionpropose_indexes(combine boolean default true)which is used byproposed_indexesview. You can call it directly withcombine=falseto report information separately for each set of columns and preventonline_advisorfrom covering several cases using compound index. - Query planning and execution time can be obtained through
get_executor_stats(reset boolean default false)function returningexecutor_statstype:
CREATE TYPE executor_stats as (
total_execution_time float8,
max_execution_time float8,
total_planning_time float8,
max_planning_time float8,
avg_planning_overhead float8,
max_planning_overhead float8,
total_queries bigint
);
You should consider use of prepared statements if avg_planning_overhead is greater than 1.