Two common mistakes using METHOD_OPT in DBMS_STATS
When gathering Cost-based Optimizer (CBO) statistics for a Table, most probably you use DBMS_STATS directly (unless your application is EBS, PSFT or Siebel). There are a couple of common mistakes we see around the use of the METHOD_OPT parameter on GATHER APIs out of this DBMS_STATS package.
METHOD_OPT => ‘FOR ALL INDEXED COLUMNS SIZE…’
With this parameter value DBMS_STATS gathers statistics ONLY in columns referenced in at least one index. So, any column that is not referenced by an index does NOT get any new statistics. The misconception is that if we specify let’s say SIZE AUTO or SIZE 254, we would get Histograms for indexed columns and regular statistics (with no Histograms) for the rest of the columns. This is not the case. We simple do not get new statistics for those columns which are not referenced by an index.
WHY is this a problem? Well, if your query has predicates referencing a non-indexed column, the CBO still has to figure out the Selectivity for the predicate. With outdated statistics or no statistics at all, this Selectivity may not be accurate and your plan may suffer.
If you are thinking “if I have no statistics on a column then Dynamic Sampling will kick in” you may be overlooking the fact that in most cases Dynamic Sampling ONLY gets invoked if your Table has no statistics. In a case like this the Table has statistics (DBA_TABLES) but the column of interest may not (DBA_TABL_COLUMNS). In other words, Dynamic Sampling is not the answer here. What you really want is to stay away of “FOR ALL INDEXED COLUMNS“. You want to use instead “FOR ALL COLUMNS“.
Note: Siebel is an exception since Siebel requires all columns ever used on a predicate to have at least one index. Not that I agree with this policy, but if actually implemented then gathering statistics in all indexed columns is just fine.
METHOD_OPT => NULL
If you are using a tool or a script to dynamically generate the DBMS_STATS.GATHER_TABLE_STATS command to gather Table statistics, you may have a front-end or set of input parameters to specify HOW the user of your tool/script wants to gather statistics. A common mistake (I made it myself some time back), is to pass parameter “METHOD_OPT => NULL” (when the user does not specify anything for method). Be aware that passing a NULL in this parameter means: gather statistics for the Table (DBA_TABLES) but nothing for the Columns (DBA_TAB_COLUMNS). In other words, the statistics for Table Columns do not get refreshed, preserving their old values.
On 10g and 11g using “METHOD_OPT => NULL” is wrong since we end-up with outdated and out of sync statistics. Of course this has a negative effect in our execution plans. On 12c it seems the behavior is different. I did some testings and on 12c it seems that we get refreshed column statistics even if we pass a NULL. But if you are on 10g or 11g be aware of this common mistake. What you want to do instead is to NOT pass this parameter at. If the parameter is not passed at all, then it gets a default value that basically means: gather statistics in all columns, but in those referenced by at least one predicate, collect Histograms. This default value is usually fine, and in any case it is what you intended when you were passing NULL in the first place.
Gathering CBO statistics calling DBMS_STATS.GATHER_TABLE_STATS with METHOD_OPT => ‘FOR ALL INDEXED COLUMNS…” or passing a value of NULL are common mistakes. You don’t see the detrimental effects right away. But as the column statistics become gradually older and older, some of your execution plans start to change. The symptoms are: unexpected poorly performing plans pop-up once in a while. The remedy is simple: avoid these two common mistakes!
Keep always in mind: statistics that closely represent your data improve the chances for the CBO to generate an optimal plan. I always think of the CBO as an old temperamental lady: be nice to her and she may be nice to you in return😉