Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Archive for the ‘Statistics’ Category

Two common mistakes using METHOD_OPT in DBMS_STATS


When gathering Cost-based Optimizer (CBO) statistics for a Table, you most probably use DBMS_STATS directly (unless your application is EBS, PSFT or Siebel). There are a couple of common mistakes we see around the use of the METHOD_OPT parameter in the GATHER APIs of this DBMS_STATS package.

METHOD_OPT => ‘FOR ALL INDEXED COLUMNS SIZE…’

With this parameter value DBMS_STATS gathers statistics ONLY on columns referenced in at least one index. So, any column that is not referenced by an index does NOT get any new statistics. The misconception is that if we specify, let’s say, SIZE AUTO or SIZE 254, we would get Histograms for indexed columns and regular statistics (with no Histograms) for the rest of the columns. This is not the case. We simply do not get new statistics for those columns which are not referenced by an index.

WHY is this a problem? Well, if your query has predicates referencing a non-indexed column, the CBO still has to figure out the Selectivity for the predicate. With outdated statistics or no statistics at all, this Selectivity may not be accurate and your plan may suffer.

If you are thinking “if I have no statistics on a column then Dynamic Sampling will kick in” you may be overlooking the fact that in most cases Dynamic Sampling ONLY gets invoked if your Table has no statistics. In a case like this the Table has statistics (DBA_TABLES) but the column of interest may not (DBA_TAB_COLUMNS). In other words, Dynamic Sampling is not the answer here. What you really want is to stay away from “FOR ALL INDEXED COLUMNS“. You want to use “FOR ALL COLUMNS“ instead.
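To make the difference concrete, here is a minimal sketch (hypothetical table T1 owned by the current user); the first call refreshes statistics only on indexed columns and leaves the rest stale, while the second refreshes every column:

-- the common mistake: only columns referenced by at least one index get new statistics
exec dbms_stats.gather_table_stats(user, 'T1', method_opt => 'FOR ALL INDEXED COLUMNS SIZE AUTO')

-- what you usually want instead: new statistics (and Histograms where appropriate) on every column
exec dbms_stats.gather_table_stats(user, 'T1', method_opt => 'FOR ALL COLUMNS SIZE AUTO')

-- verify which columns actually got their statistics refreshed and when
select column_name, num_distinct, histogram, last_analyzed
  from user_tab_col_statistics
 where table_name = 'T1';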

Note: Siebel is an exception since Siebel requires all columns ever used in a predicate to have at least one index. Not that I agree with this policy, but if it is actually implemented then gathering statistics on all indexed columns is just fine.

METHOD_OPT => NULL

If you are using a tool or a script to dynamically generate the DBMS_STATS.GATHER_TABLE_STATS command to gather Table statistics, you may have a front-end or set of input parameters to specify HOW the user of your tool/script wants to gather statistics. A common mistake (I made it myself some time back) is to pass the parameter “METHOD_OPT => NULL” (when the user does not specify anything for method). Be aware that passing a NULL in this parameter means: gather statistics for the Table (DBA_TABLES) but nothing for the Columns (DBA_TAB_COLUMNS). In other words, the statistics for Table Columns do not get refreshed, preserving their old values.

On 10g and 11g using “METHOD_OPT => NULL” is wrong since we end up with outdated and out-of-sync statistics. Of course this has a negative effect on our execution plans. On 12c it seems the behavior is different. I did some testing and on 12c it seems that we get refreshed column statistics even if we pass a NULL. But if you are on 10g or 11g be aware of this common mistake. What you want to do instead is to NOT pass this parameter at all. If the parameter is not passed at all, then it gets a default value that basically means: gather statistics on all columns and, on those referenced by at least one predicate, collect Histograms. This default value is usually fine, and in any case it is what you intended when you were passing NULL in the first place.
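If your tool builds the call dynamically, a safer pattern is to omit METHOD_OPT altogether when the user supplied nothing, so the default applies. A minimal sketch (hypothetical table T1 and variable l_method_opt fed by your front-end):

declare
  l_method_opt varchar2(200) := null;  -- whatever the user typed into your tool, possibly nothing
begin
  if l_method_opt is null then
    -- omit the parameter so DBMS_STATS uses its default (FOR ALL COLUMNS SIZE AUTO)
    dbms_stats.gather_table_stats(ownname => user, tabname => 'T1');
  else
    dbms_stats.gather_table_stats(ownname => user, tabname => 'T1', method_opt => l_method_opt);
  end if;
end;
/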

Conclusion

Gathering CBO statistics by calling DBMS_STATS.GATHER_TABLE_STATS with METHOD_OPT => ‘FOR ALL INDEXED COLUMNS…’ or passing a value of NULL are common mistakes. You don’t see the detrimental effects right away. But as the column statistics gradually become older and older, some of your execution plans start to change. The symptom is: unexpected poorly performing plans pop up once in a while. The remedy is simple: avoid these two common mistakes!

Always keep in mind: statistics that closely represent your data improve the chances for the CBO to generate an optimal plan. I always think of the CBO as an old temperamental lady: be nice to her and she may be nice to you in return 😉

Written by Carlos Sierra

July 27, 2013 at 6:38 am


Gathering CBO Statistics in Oracle Applications


When we refer to Oracle Applications we usually think of E-Business Suite (EBS), but there are other large Oracle Applications that require some special consideration when it comes to CBO Statistics gathering: Siebel and PeopleSoft (PSFT). There are some others, but I would say EBS, Siebel and PSFT make up the bulk of them all.

Gathering Stats on EBS

This is easy, since the only supported method is FND_STATS. That means using plain DBMS_STATS or ANALYZE is simply not supported. FND_STATS is a wrapper on top of DBMS_STATS that includes some extra functionality. The most important addition is controlling which columns get Histograms. The set of columns EBS requires to have Histograms is small and well controlled. FND_STATS can be executed from within the application (Concurrent Request, i.e. a batch job) or directly from SQL*Plus.

If you are using EBS, then be sure FND_STATS is the only active method to gather statistics. In other words, automatic jobs to gather statistics using DBMS_STATS or custom scripts must be disabled.
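As a sketch only (the exact FND_STATS parameters and defaults vary by EBS and database release, and the schema name below is just an example), disabling the database's automatic statistics task and relying on FND_STATS could look like this:

begin
  -- 11g: stop the automatic optimizer statistics collection task
  dbms_auto_task_admin.disable(
    client_name => 'auto optimizer stats collection',
    operation   => null,
    window_name => null);
end;
/

-- then gather statistics the EBS-supported way, here for a single schema from SQL*Plus
exec fnd_stats.gather_schema_stats('MRP')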

Gathering Stats on Siebel

Siebel documents how CBO statistics must be gathered in a white paper titled “Performance Tuning Guidelines for Siebel CRM Applications on Oracle Database“. This paper is authored by James Qiu, Paul Blokhin and Mehdi Gerami (Oracle Development) and its latest version is dated December 2010. It was rewritten back in 2010 with contributions from Abel Macias, Dave Henriquez and myself. Out of this paper we created the coe_siebel_stats.sql and the coe_siebel_profile.sql scripts. Both are well documented. Please refer to Metalink (MOS) note 781927.1 to download the latest version of the white paper and the actual scripts.

If you are using Siebel on databases 10.2 or higher, you really want to follow the white paper provided in MOS 781927.1 and implement coe_siebel_stats.sql. We have noticed that since its inception a couple of years ago, the number of performance issues reported in Siebel applications has been reduced significantly.

Gathering Stats on PSFT

For PeopleTools 8.49 and higher, on databases 10.2.0.4 and higher, I recommend the use of Metalink (MOS) note 1322888.1 (Improving Statistics in Oracle RDBMS for PeopleSoft Enterprise). The author of this note is John Houghton, who works for the Center of Expertise (CoE) on the PSFT side of the house. In this MOS note you will find a zip file that contains a couple of documents explaining how to use the script pscbo_stats.sql, and it also contains the script itself. There is some implementation effort, but once pscbo_stats.sql is implemented your CBO statistics are gathered following best practices for PSFT. I had the opportunity to collaborate with John in the creation and testing of this pscbo_stats.sql script. It is currently on its 2nd major release.

If you are on PSFT 8.49 or higher, you want to implement pscbo_stats.sql out of MOS 1322888.1. Feedback on the results of using this tool has been positive in most cases.

Conclusion

The Cost-based Optimizer (CBO) requires good statistics as input in order to produce an optimal plan. In a perfect world we would use DBMS_STATS with no parameters. Still, some specific application design characteristics, like very skewed data or the use of small tables and temporary tables, require a specific setup to gather statistics tailored to the needs of these applications. FND_STATS, coe_siebel_stats.sql and pscbo_stats.sql are all wrappers on top of DBMS_STATS that make fine-grained use of DBMS_STATS parameters to gather statistics better suited to their application. For EBS the use of FND_STATS is a must; for the other two, the use of their corresponding scripts is not a must but is strongly recommended.

Written by Carlos Sierra

July 25, 2013 at 7:23 am

Oracle Queries are taking different Plans


I recently saw a post with the title “Oracle Queries are taking different Plans” and since there was a lot of guessing I suggested using SQLTXPLAIN to find the root cause there. For some reason or another the thread is still open after 20 days, and the guessing continues. Since the problem of “finding the root cause of a change in an execution plan” is old and still common, I decided to talk about it here.

In Oracle, execution plans do change for one SQL over time; that is a fundamental function of the Cost-based Optimizer (CBO). Not only is this normal, it is expected and desired in most circumstances. What is not desired is when a SQL that had a good-performing plan suddenly changes its plan to one that we consider sub-optimal in terms of performance. This should be the concern and focus. So WHY does the CBO suddenly generate such a plan? The possible reasons are many, but I list here the most common ones I can think of:

  • Schema object statistics changes
  • Small sample sizes when gathering CBO schema statistics (mostly on 10g)
  • Lack of statistics in some columns referenced by predicates
  • Inconsistent values of bind variables referenced by SQL
  • System wide changes like CBO parameters or system statistics
  • Index modifications including new and dropped indexes
  • Invalid or invisible indexes
  • Large SQL text with many predicates where several plans with similar cost are possible
  • Inconsistent ramp-up process of Adaptive Cursor Sharing (ACS)
  • SQL Profiles generated by the SQL Tuning Advisor (STA)
  • An index changing its blevel (for example from blevel 2 to 3)
  • Volatile values in some columns (for example value “Pending” exists in table column for a short time)
  • Asynchronous gathering of stats on volatile data instead of gathering right between load and consumption

The list above is not comprehensive. It only includes what came to my mind while writing it, so I may have forgotten several cases. And of course there are always some corner cases like special conditions activating uncommon heuristics, or simply hitting a CBO bug.

Guessing WHY a plan has changed is very time-consuming and requires multiple queries to the metadata on the database. Since most of the metadata we are interested in is dynamic by nature, we base our guesswork on an ever-changing environment. Of course the root cause becomes elusive and we may give up for some time, or jump to false conclusions.

This is where SQLTXPLAIN (MOS 215187.1) or its younger and lighter sister SQLHC (MOS 1366133.1) come to the rescue. These free tools, especially SQLTXPLAIN (SQLT), capture the static and dynamic environment around the one SQL we want to analyze.

By having a consistent view of all elements affecting the execution plan, it becomes much easier to find the root cause of a plan change. And since SQLT also includes current and past details about parameters, statistics and bind variable values, we can even correlate those values to prior execution plans. So I encourage you to try SQLTXPLAIN on a SQL next time you are confronted with the old question: “WHY are Oracle Queries taking different Plans?”.
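Even before running SQLT, a quick look at the plan history confirms whether the plan really changed and when. A minimal sketch, assuming AWR (Diagnostics Pack) is available and &sql_id is the statement of interest:

select sql_id, plan_hash_value,
       min(snap_id) first_snap, max(snap_id) last_snap,
       sum(executions_delta) executions,
       round(sum(elapsed_time_delta) / greatest(sum(executions_delta), 1) / 1e6, 3) avg_ela_secs
  from dba_hist_sqlstat
 where sql_id = '&sql_id'
 group by sql_id, plan_hash_value
 order by first_snap;

SQLT then provides the surrounding context (statistics versions, parameters, bind values) to explain why each plan_hash_value showed up when it did.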

I am aware SQLTXPLAIN is intimidating. Its output is exuberant, but the point is that it includes what we need to find the root cause of most if not all SQL Tuning cases where a plan has changed.

I have presented this SQLT tool at several conferences during the past year or so. I think it is appropriate now to put together a 3-day workshop to master this tool… Just an idea at this time.

A couple of myths about Plan Stability and CBO Statistics


Every so often I get asked about these two statements:

  1. Bind Peeking requires Histograms in order to generate multiple Execution Plans.
  2. Freezing CBO Statistics freezes an Execution Plan.

Both are myths. The truth is that you don’t have to have Histograms for a Plan using Bind Peeking to change. And even if you freeze your CBO Statistics your Plan may change.

Range Predicates and Bind Peeking

Consider the Range Predicate “COLUMN > :b1”, where the column low/high values are “A” and “Z”. If the value of bind :b1 is “A” at the time of a hard parse, the Selectivity of this Predicate is close to 1. If the CBO peeks the value “Z”, the Selectivity is close to zero. A variation in Selectivity like this may change an Execution Plan. Thus: COLUMN does not need a CBO Histogram for Bind Peeking to produce multiple Execution Plans.
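A minimal sketch of this effect (hypothetical table T, gathered deliberately with no Histograms at all); with a new hard parse between executions, the peeked value alone can flip the plan from a full table scan to an index range scan:

create table t as
select rownum id, chr(65 + mod(rownum, 26)) c1, rpad('x', 200, 'x') pad
  from dual connect by level <= 100000;
create index t_c1 on t (c1);

-- SIZE 1 means: no Histograms on any column
exec dbms_stats.gather_table_stats(user, 'T', method_opt => 'FOR ALL COLUMNS SIZE 1')

variable b1 varchar2(1)

-- peeked value 'A': Selectivity close to 1, a full table scan is likely
exec :b1 := 'A'
select /* peek_demo */ count(pad) from t where c1 > :b1;

-- force a new hard parse (for example, flush the shared pool on a test system), then
-- peeked value 'Z': Selectivity close to zero, an index range scan is likely
exec :b1 := 'Z'
select /* peek_demo */ count(pad) from t where c1 > :b1;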

Frozen CBO Statistics and Plan Stability

If your table contains columns based on sequences (like in a Primary Key), or if it contains DATE type columns, your Execution Plan may change even if the CBO Statistics do not change.

  1. Consider Predicate “N1 = :b1” where N1 is a NUMBER type column based on a sequence with 100 distinct values and low/high values of 1 and 100. Then hard parse this SQL with values of 100 and 200. The first value may result in a Selectivity of 0.01 while the second may be close to zero. That is a significant change that may cause a plan to change.
  2. Consider Predicate “D1 = :b2” where D1 is a DATE type column with 100 distinct values and low/high of 2013/01/01 and 2013/04/17. Then hard parse this SQL with values of 2013/02/14 and 2013/12/31. Same as in the prior case, one value may render a Selectivity close to 0.01 while the other is close to zero.

The question in both cases above is: WHY would I pass a value out of range? As time progresses, users tend to pass more recent values and since CBO Statistics are frozen, these peeked values fall outside the low/high range and depart further from it over time. Thus, the Predicate Selectivity diminishes gradually over time. At some point your otherwise stable Execution Plan may suddenly change. Thus: Freezing CBO Statistics does not freeze Execution Plans.
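A minimal sketch of that gradual decay (hypothetical table T2 with locked statistics); a value inside the recorded low/high range keeps a sensible Selectivity, while values beyond the stored high value get an ever-smaller one, and the same arithmetic applies when the values arrive as peeked binds:

create table t2 as
select rownum n1, sysdate - mod(rownum, 100) d1 from dual connect by level <= 10000;

-- gather once, then "freeze": low/high of N1 are recorded as 1 and 10000
exec dbms_stats.gather_table_stats(user, 'T2')
exec dbms_stats.lock_table_stats(user, 'T2')

-- data keeps growing, statistics stay frozen
insert into t2 select 10000 + rownum, sysdate + mod(rownum, 100) from dual connect by level <= 10000;
commit;

-- within the known range: Selectivity about 1/10000
select count(*) from t2 where n1 = 5000;
-- beyond the stored high value: Selectivity decays toward zero as the value moves further out
select count(*) from t2 where n1 = 20000;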

Conclusion

Since the lack of Histograms or the freezing of CBO Statistics does not guarantee Plan Stability, do not rely on these two myths. If what you are looking for is Plan Stability, then use SQL Plan Management (available since 11g) or SQL Profiles (available since 10g).
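For the SQL Plan Management route, a minimal sketch (hypothetical sql_id and plan_hash_value taken from a known good execution in the cursor cache):

declare
  l_plans pls_integer;
begin
  l_plans := dbms_spm.load_plans_from_cursor_cache(
               sql_id          => 'gat6z1bc6nc2d',  -- hypothetical sql_id
               plan_hash_value => 1234567890,       -- the good plan you want to preserve
               fixed           => 'NO',
               enabled         => 'YES');
  dbms_output.put_line(l_plans || ' plan(s) loaded as SQL Plan Baselines');
end;
/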

Written by Carlos Sierra

April 17, 2013 at 9:50 am

Hotsos Symposium 2013


Today is the 3rd and last day of the Hotsos Symposium 2013. I will stay the extra day so I can attend Maria Colgan’s 1-day class. I am delighted to be here. This annual event is awesome. I will certainly apply to speak next year!

In my opinion, most sessions have been of good quality. Topics are interesting, and having 2 sessions at the same time allows me to pick one out of two instead of many. Of course there are times when I would like to attend both!

Since this is my second year, I feel like coming back home. I know more people than last year, including other speakers. My kudos to Hotsos for keeping up with the quality of this event.

I would love to see more Oracle attendees, especially Support managers, so they can better sense what the Oracle community needs… Anyway, I am happy to see some Oracle faces here… Since many of us work from home these days, participating in events like this helps to reestablish relationships that would otherwise fade over time.

Yesterday, we paid a visit to Enkitec’s offices. Veronica showed us the office and we ate some pizza there. The office is modern, stylish and cool. It invites you to work! One thing I noticed is the natural synergy among employees. I find this intimate networking of smart guys so important to achieve better results. It encourages constant collaboration and kind of explains to me the success of this firm. It brought to my memory my first years at Oracle, when we used to gather in the aisles and talk about anything, including work! Anyway, it was nice to go and visit…

My to-do list after this Symposium grows:

  1. I noticed some misconceptions about statistics gathering. I have to chase down some Metalink notes which are causing this trouble…
  2. Without knowing it, some speakers have created some “enhancement requests” to SQLT, so I have to incorporate some of these soon…
  3. I need to review some SQL tuning complex cases that some participants want to discuss with me. I will review and provide some advice…
  4. I expect to receive some emails with enhancements to either the CBO or the database. I will log them to corresponding groups…
  5. Oracle Support would benefit from a refresher on “diagnostics collection” for SQL tuning issues. That means another round of internal training. Maybe short remote sessions…
  6. I have a list of at least half a dozen topics that could be of interest for future conferences. I need to polish this list and draft some abstracts. Then I will review them with some key OUG players to sense interest.
  7. I need to purchase (and read) like 3-4 books.
  8. I need to evaluate the possibility of writing a new tool. It would be a holistic review of CBO statistics (including schema, system, data dictionary and fixed objects). The tool would perform many health-checks and return a set of observations. A natural component could be the dynamic generation of a second script to actually implement those observations. Statistics is still a hot issue and I noticed very diverse opinions. Some are good and some are simply wrong.
  9. We need to investigate if setting “STATISTICS_LEVEL to ALL” can be improved in terms of performance on EXADATA.
  10. I need to learn STRACE and determine its real value for Support. Then review existing Metalink notes and update them accordingly.
  11. Big data and fuzzy data mining seem to be hot. Performance of data mining may become a concern… I have to research.
  12. Method-R and Delphix have excellent products. I need to learn more about them since I expect their adoption by the Oracle community will make them more visible to all of us. I may write a Metalink note on the former so Support can make better sense of it.
  13. I have to document in this Blog more about the many ramifications of SQLT, especially in the area of Plan Stability and Plan Migration. It seems to be an eternal concern. SQLT provides a lot of help in this area but I think we need better explanations on how to exploit these capabilities.

Ok. Time for a coffee and my daily email race… So far, email is winning by far…

Written by Carlos Sierra

March 6, 2013 at 7:53 am

SQL Tuning Workshop at the NoCOUG


I am delivering a two-day SQL Tuning Workshop at the Northern California Oracle Users Group (NoCOUG) in Pleasanton, California. Today is the 2nd and last day of this class. We discussed yesterday: Cost-based Optimizer (CBO) Statistics, Event 10053, Adaptive Cursor Sharing (ACS) and SQL Plan Management (SPM). We will discuss today some SQL Tuning tools: Event 10046, Trace Analyzer TRCANLZR, SQLTXPLAIN (SQLT), SQL Health-Check (SQLHC) and SQLT Test Case (TC). There are around 35 participants and I would say the level of the class is between intermediate and advanced. Most participants are seasoned DBAs and some have a good level of expertise as Developers. Abel Macias is participating in this session as my co-instructor. He is bringing to class his extensive expertise in SQL Tuning. It is always a challenge to keep the attention of such a large and diverse group, but I would say the level of participation has been quite high. I appreciate all the effort from several contributors of the NoCOUG who have made this training possible. My special thanks to Gwen Shapira, Randy Samberg, Iggy Fernandez and Eric Hutchinson. It is very rewarding to see how Oracle Users Groups are putting together events like this where we can openly share knowledge with our Oracle users community.

I will speak at the actual NoCOUG Conference this Thursday. I will talk about the content of SQLTXPLAIN (SQLT) main report, which is quite comprehensive. I will go over some samples and I will allocate plenty of time for questions during and after my presentation. These days SQLT is widely used within Oracle and now by several enterprise Oracle customers.

Stelios Charalambides’ book about SQL Tuning and SQLTXPLAIN will be released on March 20. The title is Oracle SQL Tuning with Oracle SQLTXPLAIN. This book can now be pre-ordered on Amazon. If you want to learn more about how to use SQLTXPLAIN to diagnose SQL Tuning issues you may want to get a copy of this book.

After NoCOUG I will spend one week at the office, where I will deliver a live Webinar on SQL Tuning to our Oracle Partners. Then my next conference is the popular Hotsos Symposium on March 3-7. In my humble opinion, this annual Hotsos event is the single most relevant one in the area of Oracle Performance. Thus I am very excited to have the opportunity to speak there. A real honor!

2-day open SQL Tuning Class in Pleasanton, CA this month!


I will be delivering a two-day SQL Tuning class on February 19-20 in Pleasanton, CA. It is available to registered Oracle Users thanks to the help of NoCOUG. Space is limited, so if you want to attend you need to hurry up.

This class covers:

  • CBO statistics (fundamentals and good practices);
  • Tracing with Events 10046 and 10053;
  • Plan Stability with Adaptive Cursor Sharing (ACS) and SQL Plan Management (SPM);
  • SQL Tuning tools: Trace Analyzer TRCANLZR (TRCA), SQLTXPLAIN (SQLT) and SQL Health-Check (SQLHC); and
  • SQL Tuning Test Cases (creation and implementation).

This is the first time this 2-day class is offered outside Oracle. Its content is an extract of the internal SQL Tuning Workshops I have developed and delivered to over 700 Oracle Engineers around the globe.

This is not an Oracle University (OU) class, which means the scope is more about the actual trade of day-to-day SQL Tuning and less about product features. So if you have taken a SQL Tuning OU class, then this 2-day session would be a nice addition.

Another thing that you should expect from this class is lots of participation. In my sessions I always encourage participants to bring to the table their current SQL Tuning issues or concerns, so we can openly analyze and discuss them during class. The environment is friendly, welcoming and open.

I hope you can join us on February 19-20.

Materialized Views Health Check and Diagnostics Reports


About mvhcdr.sql

I uploaded to My Oracle Support (MOS) under document_id 1517362.1 a new script mvhcdr.sql to aid in the diagnosis of issues regarding Materialized Views (MV). This script installs nothing in the database and it produces a zip file with 5 zip files inside. The output includes HTML and CSV files from all DBA views related to Materialized Views, as well as Metadata for the MVs and their Logs. Output also includes some Health-Checks like: MV Logs with Outdated Statistics, MV Logs larger than their Master Table, Complete Refresh Candidates and Old Materialized Views. A list of Materialized View capabilities is also included within the set of output reports, as well as Logs per MV and MVs per Log with some important attributes. CBO Statistics, actual row counts, segment sizes in blocks, indexes, partitions and similar related metadata are included as well. The output of mvhcdr.sql is quite comprehensive. You can find an output sample under the same MOS note 1517362.1.

The main reason to have this new script is to aid Oracle Support in collecting diagnostics information around Materialized Views. It can also be used by any Oracle user who may want to analyze the health of the MVs on a system, or to compare their setup to a similar database. Another use is to simply document all the MVs on a database, together with their related metadata.

You can download mvhcdr.sql from its own MOS note, or from SQLTXPLAIN 11.4.5.4 or higher. If the latter, you can find mvhcdr.sql under directory sqlt/utl.

This new script mvhcdr.sql can be used on 10g and 11g databases. You execute it connected as SYS. It has a couple of pre-requisites and just one execution parameter. The output gets generated under the same local directory from where you executed SQL*Plus.

Pre-requisites

Tables PLAN_TABLE and MV_CAPABILITIES_TABLE must be available. Otherwise, create them with the following scripts:

  1. PLAN_TABLE: $ORACLE_HOME/rdbms/admin/catplan.sql; or $ORACLE_HOME/rdbms/admin/utlxplan.sql;
  2. MV_CAPABILITIES_TABLE: $ORACLE_HOME/rdbms/admin/utlxmv.sql;

The MV_CAPABILITIES_TABLE is used to generate and report MV Capabilities as the name implies. The PLAN_TABLE is used as a staging repository for the duration of the execution of mvhcdr.sql.

Execution Parameters

  1. Materialized View name or prefix (optional, defaults to %)
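A sketch of a typical run (exact prompts may differ between versions of the script); launch SQL*Plus from the directory where you want the output zip created:

$ sqlplus / as sysdba

SQL> REM prerequisites, only needed if the tables are missing
SQL> @?/rdbms/admin/utlxplan.sql
SQL> @?/rdbms/admin/utlxmv.sql

SQL> REM run the collector; it asks for its single parameter (MV name or prefix, default %)
SQL> @mvhcdr.sql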

Written by Carlos Sierra

January 31, 2013 at 6:35 am

Deep-dive on SQLTXPLAIN offered externally for the first time


The Rocky Mountain Oracle Users Group (RMOUG) Training Days is approaching soon: February 11-13, 2013. I hope you are attending this relevant Oracle event. There are so many good sessions that it is hard to decide which ones to attend!

For the first time, I will be conducting a 4-hour deep-dive external session on SQLTXPLAIN (SQLT). The title is Using SQLTXPLAIN (SQLT) to Diagnose SQL Statements Performing Poorly. If you want to know SQLT from the basics to an intermediate level, consider attending this 4-hour session. It will be offered on Monday, February 11 from 1 to 5 pm at the Colorado Convention Center in Denver, Colorado (USA). I love interactive sessions, so I will encourage voluntary participation and open discussions.

In addition to the 4-hour deep-dive on SQLT, I will also speak on Wednesday, February 13 at 8:30 am about How to Improve SQL Performance with New SQL Health Check Tool. This is a 1-hour seminar with plenty of time for Q&A.

If you attend the event and want to discuss any issue with regard to SQLTXPLAIN (SQLT), SQL Health-Check (SQLHC) or simply SQL Tuning, please approach me. I will be there to help, share, listen, learn, discuss, etc.

Written by Carlos Sierra

January 23, 2013 at 6:39 am

Mutating histograms, mutating number of rows, mutating blevel, mutating number of distinct levels… What do they mean?


Every so often I get an email from a SQLTXPLAIN user asking the meaning of these “mutating” observations from the health-checks engine in SQLT. In the next SQLT release 11.4.5.0 (due on November 21, 2012), these messages will read “fluctuating” instead of “mutating”, but the health checks will remain the same. So, what do they mean?

Take “fluctuating number of histogram endpoints count”. What it means is the following: every time we gather CBO statistics on a Table, we have a new “version” of the statistics. We store the old one automatically (DBMS_STATS does that starting in 10g) and the new version becomes “current”. We store 31 days of history by default. So if we gather statistics on a Table, let’s say daily, we may have about 30 statistics versions for that Table and its columns (and Indexes). Now, if a column has, let’s say, 5 buckets in its histogram and the next version has 6, there was a recorded change of 20%, which may be relevant for your investigation. If a column during its stored history has a number of buckets that fluctuates more than 10% from one version to the next, then it is flagged as “fluctuating” (it used to be “mutating” in SQLT 11.4.4.8 and older). Changes of less than 10% are ignored.

The same concept applies to other elements like the number of distinct values in a column, the height of an index, or the number of rows in a Table or Partition. If they fluctuate more than 10% from one version of the CBO statistics to the next, then they are flagged and reported in the Observations section of the main SQLT report.
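For reference, the raw material behind these checks is the statistics history that DBMS_STATS keeps automatically. A quick sketch (hypothetical owner and table) to see how far back that history goes and which versions exist:

select dbms_stats.get_stats_history_retention as retention_days from dual;
select dbms_stats.get_stats_history_availability as oldest_available from dual;

select table_name, stats_update_time
  from dba_tab_stats_history
 where owner = 'SCOTT' and table_name = 'EMP'   -- hypothetical owner/table
 order by stats_update_time;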

Are we supposed to do anything with those health-checks? It all depends. Let’s say your SQL suddenly stops using index X and now uses index Y. If the blevel of X has increased recently, you can suspect a cost increase due to the new blevel and investigate accordingly. Keep in mind that the “Observations” section of SQLT, which reports the outcome of the health-checks, is basically a collection of heads-ups: things to investigate, possibly pointing to the root cause of a recent change in the execution plan for the SQL being investigated.

Written by Carlos Sierra

November 14, 2012 at 5:59 am