Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Archive for the ‘Plan Stability’ Category

How a Cursor becomes Bind Aware?

with 7 comments

On 11g we have Adaptive Cursor Sharing (ACS) as a mechanism to allow the generation of multiple optimal execution plans for a query. Before ACS we would only have one optimal execution plan in memory for a query given an optimizer environment. There were some exceptions to the rule like having different plans among RAC nodes, or different plans among child cursors when they were not shared. But in essence, before 11g we had one plan in memory for each query.

With ACS we have “bind sensitive” and “bind aware” cursors. The former stage only means “being monitored”, while the latter means that the selectivity of the predicates during a soft parse determines which optimal plan to use among the ones in memory. So ACS requires two mechanisms: one to control how a cursor changes state from “sensitive” to “aware”, and another to select a plan in memory as per the values of the binds.

Becoming Bind Aware

Column v$sql_cs_statistics.rows_processed seems to drive this mechanism. This column is updated during a hard parse and it is a fuzzy representation of the amount of data the query manipulates during its execution (correlated to actual rows in execution plan).

For small values of v$sql_cs_statistics.rows_processed we increment by one the count of v$sql_cs_histogram bucket_id 0. For medium values we increment bucket_id 1, and for large values bucket_id 2. Observations on a test case built on 11.2.0.3 seem to indicate that the boundaries for these 3 buckets are: less than 1,000; between 1,000 and 1,000,000; and more than 1,000,000.
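To watch this ramp-up in action you can query the cursor sharing views directly. Below is a minimal sketch, assuming you already know the SQL_ID of the query under investigation (&sql_id is just a SQL*Plus substitution placeholder):

SELECT child_number, bucket_id, "COUNT" executions_in_bucket
  FROM v$sql_cs_histogram
 WHERE sql_id = '&sql_id'  -- placeholder: SQL_ID of the bind sensitive query
 ORDER BY child_number, bucket_id;

SELECT child_number, is_bind_sensitive, is_bind_aware, executions
  FROM v$sql
 WHERE sql_id = '&sql_id'  -- same placeholder
 ORDER BY child_number;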

Cursor Sharing histogram buckets 0-2 are updated on every execution of a bind sensitive query. They represent the data volume manipulated on each execution. If there are significant variances, then the query would benefit from ACS and it becomes Bind Aware.

I have observed a couple of scenarios where the values stored on these 3 buckets cause the cursor to become bind aware:

  1. When two of the buckets have the same value, and this value is not zero
  2. When the smallest and the largest buckets (0 and 2) have a value other than zero

Keep in mind that the observations above are not comprehensive and they may change in future releases. But they give you some clues as to WHY in some cases your query seems to be erratic with regard to ACS: WHY sometimes it stays bind sensitive and sometimes it becomes bind aware. Basically, ACS requires a ramp-up process. If your SQL executes mostly in one of the 3 volume ranges (as per the bucket boundaries), it is possible that it becomes bind aware sometimes but not always. In such cases you may want to use the /*+ bind_aware */ CBO Hint. See this Optimizer Blog.
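For reference, this is how the Hint looks inline in the statement text. A minimal sketch on a hypothetical query (table and bind names are illustrative only):

SELECT /*+ BIND_AWARE */ COUNT(*)
  FROM sales               -- hypothetical table
 WHERE channel_id = :b1;   -- bind whose selectivity varies widely between executions

With this Hint the cursor skips the monitoring stage and is marked bind aware from its first hard parse.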

To help you insert the /*+ bind_aware */ CBO Hint into an application that cannot be easily modified, you may want to use sqlt/utl/coe_gen_sql_patch.sql, which is part of SQLTXPLAIN (MOS/Metalink 215187.1). You can also create a SQL Patch as documented in this other Optimizer Blog.

Script sqlt/utl/coe_gen_sql_patch.sql implements a SQL Patch, adding by default 3 CBO Hints to your SQL: GATHER_PLAN_STATISTICS, MONITOR and BIND_AWARE. These 3 Hints help to force the “bind aware” status, and also enable diagnostics on your query. When you execute this script you can change the CBO Hints you want in your SQL Patch. I recommend you only pass diagnostics Hints and not access path modifiers.
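If you prefer to create the SQL Patch manually instead of using the script, on 11g something along these lines should work (the SQL text and patch name below are hypothetical, and the call requires access to SYS.DBMS_SQLDIAG_INTERNAL):

BEGIN
  SYS.DBMS_SQLDIAG_INTERNAL.I_CREATE_PATCH(
    sql_text  => 'SELECT COUNT(*) FROM sales WHERE channel_id = :b1',  -- your exact SQL text
    hint_text => 'BIND_AWARE',             -- diagnostics Hint(s) only, no access path modifiers
    name      => 'my_bind_aware_patch');   -- hypothetical SQL Patch name
END;
/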

Written by Carlos Sierra

March 29, 2013 at 7:04 am

Interesting case where a full table scan is chosen instead of an index

with 13 comments

While I was presenting at Hotsos Symposium 2013, I showed a real-life case where the CBO was choosing a full table scan instead of an index range scan. The root cause was a suspicious “NVL(column, value) = value” predicate seen in the execution plan. But the actual SQL contained only “column = value”. So the question became: WHERE was the NVL coming from? The selection of index or full scan became a simple side effect of the “function(column) = value” predicate.

On 11g there is a new optimization that improves the performance of DDL operations that add a column to an existing table with data, where the new column is declared with a constant default value and made not nullable. In other words, all existing rows are logically expected to be updated with the default value of this new column. If the table is large, this ALTER TABLE ADD COLUMN operation would take a long time. The optimization instead records just the metadata for this new column (default value and not null constraint) and returns in no time regardless of the size of the table.

What was confusing in this real-life case was the fact that the filter predicate looked like this: NVL(USER2.SOME_FLAG,'N')='Y', while the default value of SOME_FLAG was 'Y' instead of 'N' according to DBA_TAB_COLS. How was that possible? At some point we suspected the column default value had been changed from 'N' to 'Y', but that was pure speculation. So HOW do we prove or disprove that the unexpected 'N' in the filter predicate was the product of changing the default value of the affected column?

I created a small test case in order to reproduce the observations from the real-life case. I created a table with data on my 11g test database. Afterwards I added a new column which would use this new optimization (having a constant default and being made not nullable in the same DDL). Last, I modified the default value of the same column with another DDL. Keep in mind that when the column was created only the metadata was recorded and the actual rows were not updated (this is the optimization).

DROP TABLE t1;
CREATE TABLE t1 AS
WITH v1 AS (SELECT ROWNUM n FROM dual CONNECT BY LEVEL <= 10000)
SELECT ROWNUM id,
 DBMS_RANDOM.STRING('U', 40) c1
 FROM v1, v1
 WHERE ROWNUM <= 10000;

-- rows are not updated. only metadata is recorded. (first DDL)
ALTER TABLE t1 ADD (c2 VARCHAR2(3) DEFAULT 'NO' NOT NULL);

-- data default is recorded. (second DDL)
ALTER TABLE t1 MODIFY (c2 DEFAULT 'YES');

SELECT COUNT(*) FROM t1 WHERE c2 = 'NO';

/*
 COUNT(*)
----------
 10000 <<<<<<<<<<<<<<<<< expecting 0!
*/

SET PAGES 100 LIN 300;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'PREDICATE'));

/*
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |    23 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |     3 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 11342 | 34026 |    23   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 2 - filter(NVL("C2",'NO')='NO') <<<< expecting (NVL("C2",'YES')='NO')!
*/

I was expecting a different result and I thought this was a bug. I presented the issue to Mauro Pagano for a peer review and he explained that the results were expected (and correct). If we hadn’t had the DDL optimization, at the time of the first ALTER TABLE we would have updated all rows with the value “NO”. Then at the second DDL we would have updated no rows with the value “YES”, but future rows would get that value “YES” if c2 was not explicitly assigned a value. It made sense. And the optimization mimics the same behavior. So, rows that existed before the first DDL have no actual value for c2, but the stored metadata will supply “NO” when they are accessed. Rows created between DDL 1 and 2 would acquire the default value “NO” if needed (and be stored with that “NO” value). And rows created after the 2nd DDL would get the value “YES” if needed (and be stored with that “YES” value). So, only rows from before DDL 1 still have an internal NULL but an external “NO”, thanks to the metadata stored initially by the DDL optimization.

Conclusion

The use of the DDL optimization improved the performance of the ALTER TABLE ADD COLUMN command, but it introduced a filter predicate which prevented the possible use of a normal index on the same column, affecting the performance of queries having such filter predicates. Possible solutions include:

  1. Adding such columns without the optimization: Add column (DDL), update all rows to default value (DML), add default value to column (DDL), add not null constraint (DDL)
  2. Issue an ALTER SESSION SET “_add_col_optim_enabled” = FALSE; before the column is created
  3. Create either a virtual column with the function seen in the filter predicate, or a function-based index on that expression (a sketch of options 1 and 3 follows this list)
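Using table t1 from the test case above, a rough sketch of options 1 and 3 could look like the lines below (the index name is illustrative, and option 1 implies updating every existing row, so plan for the cost of that DML and the undo/redo it generates):

-- option 1: add the column without triggering the DDL optimization
ALTER TABLE t1 ADD (c2 VARCHAR2(3));
UPDATE t1 SET c2 = 'NO' WHERE c2 IS NULL;
COMMIT;
ALTER TABLE t1 MODIFY (c2 DEFAULT 'NO');
ALTER TABLE t1 MODIFY (c2 NOT NULL);

-- option 3: function-based index matching the expression seen in the filter predicate
CREATE INDEX t1_c2_fbi ON t1 (NVL(c2, 'NO'));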

Written by Carlos Sierra

March 12, 2013 at 8:59 am

Posted in OUG, Plan Stability

Hotsos Symposium 2013

with 5 comments

Today is the 3rd and last day of the Hotsos Symposium 2013. I will stay the extra day so I can attend Maria Colgan’s 1-day class. I am delighted to be here. This annual event is awesome. I will certainly apply to speak next year!

In my opinion, most sessions have been of good quality. Topics are interesting, and having 2 sessions at the same time allows me to pick one out of two instead of many. Of course there are times when I would like to attend both!

Since this is my second year, it feels like coming back home. I know more people than last year, including other speakers. My kudos to Hotsos for keeping up the quality of this event.

I would love to see more Oracle attendees, especially Support managers, so they can better sense what the Oracle community needs… Anyways, I am happy to see some Oracle faces here… Since many of us work from home these days, participating in events like this helps to reestablish relationships that would otherwise fade over time.

Yesterday, we paid a visit to Enkitec's offices. Veronica showed us the office and we ate some pizza there. The office is modern, stylish and cool. It invites you to work! One thing I noticed is the natural synergy among employees. I find this intimate networking of smart guys so important to achieve better results. It encourages constant collaboration and kind of explains to me the success of this firm. It brought back memories of my first years at Oracle, when we used to gather in the aisles and talk about anything, including work! Anyways, it was nice to go and visit…

My to-do list after this Symposium grows:

  1. I noticed some misconceptions about statistics gathering. I have to chase down some Metalink notes which are causing this trouble…
  2. Without knowing it, some speakers have created some “enhancement requests” to SQLT, so I have to incorporate some of these soon…
  3. I need to review some SQL tuning complex cases that some participants want to discuss with me. I will review and provide some advice…
  4. I expect to receive some emails with enhancements to either the CBO or the database. I will log them to corresponding groups…
  5. Oracle Support would benefit from a refresher on “diagnostics collection” for SQL tuning issues. That means another round of internal training. Maybe short remote sessions…
  6. I have a list of at least half a dozen topics that could be of interest for future conferences. I need to polish this list and draft some abstracts. Then I will review it with some key OUG players to sense interest.
  7. I need to purchase (and read) like 3-4 books.
  8. I need to evaluate the possibility of writing a new tool. It would be a holistic review of CBO statistics (including schema, system, data dictionary and fixed objects). The tool would perform many health-checks and return a set of observations. A natural component could be the dynamic generation of a second script to actually implement those observations. Statistics is still a hot issue and I noticed very diverse opinions. Some are good and some are simply wrong.
  9. We need to investigate if “STATISTICS_LEVEL to ALL” can be improved in terms of performance in EXADATA.
  10. I need to learn STRACE and determine its real value for Support. Then review existing Metalink notes and update them accordingly.
  11. Big data and fuzzy data mining seem to be hot. Performance of data mining may become a concern… I have to research.
  12. Method-R and Delphix have excellent products. I need to learn more about them since I expect their adoption by the Oracle community will make them more visible to all of us. I may write a Metalink note on the former so Support can make better sense of it.
  13. I have to document in this Blog more about the many ramifications of SQLT, especially in the area of Plan Stability and Plan Migration. It seems to be an eternal concern. SQLT provides a lot of help in this area but I think we need better explanations on how to exploit these capabilities.

Ok. Time for a coffee and my daily email race… So far, email is winning by far…

Written by Carlos Sierra

March 6, 2013 at 7:53 am

SQLTXPLAIN capabilities – free webinar for Oracle partners (Wednesday 27)

with one comment

This Wednesday, February 27 at 8 am Pacific Time (11 am ET), I will deliver a one-hour webinar to Oracle partners. The topic is SQLTXPLAIN capabilities and how this tool helps to diagnose SQL statements performing poorly. Details are here. The format is 45 minutes of presentation and demo, followed by 15 minutes for a Q&A session.

Written by Carlos Sierra

February 25, 2013 at 11:27 am

SQL Tuning Workshop at the NoCOUG

with 5 comments

I am delivering a two-day SQL Tuning Workshop at the Northern California Oracle Users Group (NoCOUG) in Pleasanton, California. Today is the 2nd and last day of this class. Yesterday we discussed: Cost-based Optimizer (CBO) Statistics, Event 10053, Adaptive Cursor Sharing (ACS) and SQL Plan Management (SPM). Today we will discuss some SQL Tuning tools: Event 10046, Trace Analyzer TRCANLZR, SQLTXPLAIN (SQLT), SQL Health-Check (SQLHC) and SQLT Test Case (TC).

There are around 35 participants and I would say the level of the class is between intermediate and advanced. Most participants are seasoned DBAs and some have a good level of expertise as Developers. Abel Macias is participating in this session as my co-instructor. He is bringing to class his extensive expertise in SQL Tuning. It is always a challenge to keep the attention of such a large and diverse group, but I would say the level of participation has been quite high.

I appreciate all the effort from several contributors of the NoCOUG who have made this training possible. My special thanks to Gwen Shapira, Randy Samberg, Iggy Fernandez and Eric Hutchinson. It is very rewarding to see how Oracle Users Groups are putting together events like this, where we can openly share knowledge with our Oracle users community.

I will speak at the actual NoCOUG Conference this Thursday. I will talk about the content of SQLTXPLAIN (SQLT) main report, which is quite comprehensive. I will go over some samples and I will allocate plenty of time for questions during and after my presentation. These days SQLT is widely used within Oracle and now by several enterprise Oracle customers.

Stelios Charalambides' book about SQL Tuning and SQLTXPLAIN will be released on March 20. The title is Oracle SQL Tuning with Oracle SQLTXPLAIN. This book can now be pre-ordered on Amazon. If you want to learn more about how to use SQLTXPLAIN to diagnose SQL Tuning issues, you may want to get a copy of this book.

After NoCOUG I will spend one week at the office, where I will deliver a live Webinar on SQL Tuning to our Oracle Partners. Then my next conference is the popular Hotsos Symposium on March 3-7. In my humble opinion, this annual Hotsos event is the single most relevant one in the area of Oracle Performance. Thus I am very excited to have the opportunity to speak there. A real honor!

RMOUG Training Days February 11 (day 1)

with 5 comments

The first day of the RMOUG Training Days (February 11-13, 2013) is over. I had the honor of presenting Using SQLTXPLAIN (SQLT) to Diagnose SQL Statements Performing Poorly. I had 14 participants and we were able to go into the details of several topics that were raised by this small group. We covered all the material I had prepared. We also covered SQL Plan Management (SPM) in a live demo branched from the SQLT core presentation and demos. The most rewarding part for me was watching the smiles on the faces of most participants. I think everyone got something out of this session.

I wanted to share my personal experience at this great conference. In short, I am having a blast! From being picked up at the airport by Prashant Khanolkar and sharing the ride with Cary Millsap and Faun deHerry; to speaking at the pre-conference event; and sharing a beer (or two) with other speakers. Just being here at this RMOUG event has been an awesome experience that I highly recommend to all! Last night, I had the chance to socialize with well recognized personalities, including: Kerry Osborne, Maria Colgan, Karl Arao, Kellyn Pot’Vin, Kyle Hailey, Jeff Jacobs, Andy Colvin, Faun deHerry and Scott Spendolini. I also had the chance to meet, or meet for the second time: Tim Gorman, Mogens Nørgaard, Ken Johnson, David Stokes, Lisa Cocklin, George Trujillo and Galo Balda. I met a lot more nice people but I am not so good with names to remember them all, especially this early in the morning…

It feels like yesterday when I participated in the Hotsos Symposium in March 2012. I was so intimidated just by thinking that some of the Oracle performance key players were there, and they might actually sit in my session… I feel deep respect for most of the professionals that gather and are regulars at these events. What I learned at Hotsos 2012 was that most of these “big shots” were so approachable and down to earth. It was a big relief since I had been learning from them for years, through their books and blogs. For some reason I always thought they would be too busy to talk to someone like me… How wrong I was! That Hotsos 2012 event almost a year ago was the inspiration I needed to submit topics and try to participate in other similar events. RMOUG has had the same friendly, open, welcoming environment I discovered at Hotsos. My kudos to the RMOUG organizers for making this event an awesome experience! As long as Oracle is willing to sponsor my travel expenses I will keep submitting topics to the RMOUG annual meeting… and Hotsos, and NoCOUG, and ECO, and some others…

Now, it is time for some workout before enjoying a busy day at the RMOUG!

Written by Carlos Sierra

February 12, 2013 at 7:34 am

2-days open SQL Tuning Class in Pleasanton, CA this month!

with 9 comments

I will be delivering a two-day SQL Tuning class on February 19-20 in Pleasanton, CA. It is available to registered Oracle Users thanks to the help of NoCOUG. Space is limited, so if you want to attend you need to hurry.

This class covers:

  • CBO statistics (fundamentals and good practices);
  • Tracing with Events 10046 and 10053;
  • Plan Stability with Adaptive Cursor Sharing (ACS) and SQL Plan Management (SPM);
  • SQL Tuning tools: Trace Analyzer TRCANLZR (TRCA), SQLTXPLAIN (SQLT) and SQL Health-Check (SQLHC); and
  • SQL Tuning Test Cases (creation and implementation).

This is the first time this 2-day class is offered outside Oracle. Its content is an extract of the internal SQL Tuning Workshops I have developed and delivered to over 700 Oracle Engineers around the globe.

This is not an Oracle University (OU) class, which means the scope is more about the actual trade of day-to-day SQL Tuning and less about product features. So if you have taken a SQL Tuning OU class, then this 2-day session would be a nice addition.

Another thing that you should expect from this class is lots of participation. In my sessions I always encourage participants to bring to the table their current SQL Tuning issues or concerns, so we can openly analyze and discuss them during class. The environment is friendly, welcoming and open.

I hope you can join us on February 19-20.

Deep-dive on SQLTXPLAIN offered externally for the first time

with 8 comments

The Rocky Mountain Oracle Users Group (RMOUG) Training Days is approaching soon: February 11-13, 2013. I hope you are attending this Oracle relevant event. There are so many good sessions that it is hard to decide which ones to attend!

For the first time, I will be conducting a 4-hour deep-dive external session on SQLTXPLAIN (SQLT). The title is Using SQLTXPLAIN (SQLT) to Diagnose SQL Statements Performing Poorly. If you want to know SQLT from the basics to an intermediate level, consider attending this 4-hour session. It will be offered on Monday, February 11 from 1 to 5 pm at the Colorado Convention Center in Denver, Colorado (USA). I love interactive sessions, so I will encourage voluntary participation and open discussions.

In addition to the 4-hour deep-dive on SQLT, I will also speak on Wednesday, February 13 at 8:30 am about How to Improve SQL Performance with New SQL Health Check Tool. This is a 1-hour seminar with plenty of time for Q&A.

If you attend the event and want to discuss any issue with regard to SQLTXPLAIN (SQLT), SQL Health-Check (SQLHC) or simply SQL Tuning, please approach me. I will be there to help, share, listen, learn, discuss, etc.

Written by Carlos Sierra

January 23, 2013 at 6:39 am

Mutating histograms, mutating number of rows, mutating blevel, mutating number of distinct levels… What do they mean?

with 2 comments

Every so often I get an email from a SQLTXPLAIN user asking the meaning of these “mutating” observations from the health-checks engine in SQLT. In the next SQLT release 11.4.5.0 (due on November 21, 2012), these messages will read “fluctuating” instead of “mutating”, but the health checks will remain the same. So, what do they mean?

Take “fluctuating number of histogram endpoints count”. What it means is the following: every time we gather CBO statistics on a Table, we have a new “version” of the statistics. We store the old one automatically (DBMS_STATS does that starting in 10g) and the new version becomes “current”. We store 31 days of history by default. So if we gather statistics on a Table, let's say daily, we may have about 30 statistics versions for that Table and its columns (and Indexes). Now, if a column has, let's say, 5 buckets in its histogram and the next version has 6, there was a recorded change of 20%, which may be relevant for your investigation. If a column during its stored history has a number of buckets that fluctuates more than 10% from one version to the next, then it is flagged as “fluctuating” (it used to be “mutating” in SQLT 11.4.4.8 and older). Changes of less than 10% are ignored.
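To see how much history you actually keep, and when each statistics version was gathered, you can use DBMS_STATS and the statistics history views. A minimal sketch (owner and table name are placeholders):

SELECT DBMS_STATS.GET_STATS_HISTORY_RETENTION FROM dual;     -- days of history kept (31 by default)
SELECT DBMS_STATS.GET_STATS_HISTORY_AVAILABILITY FROM dual;  -- oldest point in time you can restore to

SELECT stats_update_time
  FROM dba_tab_stats_history
 WHERE owner = 'MY_SCHEMA'       -- placeholder
   AND table_name = 'MY_TABLE'   -- placeholder
 ORDER BY stats_update_time;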

The same concept applies to other elements, like the number of distinct values in a column, the height (blevel) of an index, or the number of rows in a Table or Partition. If they fluctuate more than 10% from one version of the CBO statistics to the next, then they are flagged and reported in the Observations section of the main SQLT report.

Are we supposed to do anything with those health-checks? It all depends. Let's say your SQL suddenly stops using index X and now uses index Y. If the blevel of X has increased recently, you can suspect a cost increase due to the new blevel and investigate accordingly. Keep in mind that the “Observations” section of SQLT, which reports the outcome of the health-checks, is basically a collection of heads-ups: things to investigate, maybe pointing to the root cause of a recent change in the execution plan of the SQL being investigated.

Written by Carlos Sierra

November 14, 2012 at 5:59 am

Should I delete my column Histograms to improve plan stability?

with 4 comments

When we combine the use of bind variables in a SQL statement with column Histograms we usually end up having multiple execution plans over time. This plan instability is due to the lack of control we have over the selectivity of the predicates at the time the SQL is hard parsed while using bind peeking. On 11g this behavior is mitigated with the use of Adaptive Cursor Sharing (ACS), and it can be further improved when using ACS together with SQL Plan Management (SPM).

As we generate multiple execution plans for one SQL statement because of the combination of binds and Histograms, some of these plans may perform better than others over a large number of executions. Thus it is natural to consider some of these execution plans as “bad plans”. If the use of ACS and SPM is not an option, then we may need to consider fixing the plan instability issue with the suppression of either bind peeking or Histograms. If we are in such a case, it is better to consider the removal of Histograms first, and leave bind peeking active. The reason is that bind peeking, even with no Histograms, still uses CBO column statistics, and among other things it uses the column low and high values when computing selectivity.

Back to the question: Should I delete my column Histograms to improve plan stability?

If you are on 10g, or on 11g but ACS is not an option, then deleting column Histograms may be your best option to improve your plan stability. Like any other change, you should rather test it first in a non-production environment. Even in such a test environment, you may want to restrict your test to only those tables accessed by your SQL. On 11g, DBMS_STATS.DELETE_COLUMN_STATS provides an option to delete just the Histograms while preserving all other column statistics: this is done by using parameter COL_STAT_TYPE with value HISTOGRAM. If you want to do the same on 10g, you may want to use the SQLT HGRM module. In any case you can always restore column statistics using DBMS_STATS.RESTORE_TABLE_STATS.
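As a minimal sketch on 11g, deleting just the Histogram of one column, and restoring the whole table statistics if the change backfires, could look like this (owner, table and column names are placeholders):

BEGIN
  -- 11g: remove only the Histogram, preserving the rest of the column statistics
  DBMS_STATS.DELETE_COLUMN_STATS(
    ownname       => 'MY_SCHEMA',
    tabname       => 'MY_TABLE',
    colname       => 'MY_COLUMN',
    col_stat_type => 'HISTOGRAM');
END;
/

BEGIN
  -- fallback: restore the table statistics as they were one day ago
  DBMS_STATS.RESTORE_TABLE_STATS(
    ownname         => 'MY_SCHEMA',
    tabname         => 'MY_TABLE',
    as_of_timestamp => SYSTIMESTAMP - INTERVAL '1' DAY);
END;
/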

If you are considering deleting Histograms to test the effect on an execution plan and the performance of your SQL, you may want to first test asking the CBO to simply ignore them. If the patch for bug 9550277 has been applied in your system, then you can use the command below.

ALTER SESSION SET "_FIX_CONTROL"='9550277:1';
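To double-check that the fix control is active in your session before re-testing the SQL, you can query v$session_fix_control (a sketch, run from the same session):

SELECT bugno, value, description
  FROM v$session_fix_control
 WHERE bugno = 9550277
   AND session_id = SYS_CONTEXT('USERENV', 'SID');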

Written by Carlos Sierra

August 13, 2012 at 8:33 am