Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Non-intrusive SQL Trace instrumentation on legacy PL/SQL code

leave a comment »

Problem

Legacy PL/SQL code with intermittent performance degradation.

To improve the performance of this code, the first step is to diagnose it. But this code has no instrumentation whatsoever, it is in Production, and rolling any code into Production usually requires rigorous testing. So, whatever method we implement has to be light-weight and as safe as possible.

Using Oradebug is not a viable solution mainly for two reasons: It misses the “head” of the transaction, so we may not get to know the SQL taking longer; and second, internal procedures from finding about the issue, reporting it, then acting on it may take from several minutes to hours.

Solution

  1. Identify which PL/SQL libraries are known to be problematic in terms of intermittent performance degradation.
  2. At the beginning  of each callable PL/SQL Procedure or Function, add a call to DBMS_APPLICATION_INFO.SET_MODULE to set some appropriate MODULE and ACTION, for example “R252, LOAD”. Call this API also at the end, to NULL out these two parameters. This code change is very small and safe. It introduces practically no overhead. It simply labels every SQL executed by the PL/SQL library with some MODULE and ACTION that uniquely identify the code of concern.
  3. Activate SQL Trace on the module/action that needs to be traced, by calling DBMS_MONITOR.SERV_MOD_ACT_TRACE_ENABLE, passing parameters SERVICE, MODULE and ACTION. With this API request a SQL Trace to be generated with WAITs and BINDs (binds are optional but desirable). Once these traces are no longer needed (reviewed by someone), turn SQL Trace off using API DBMS_MONITOR.SERV_MOD_ACT_TRACE_DISABLE.
  4. Once the SQL Trace is produced, generate a TKPROF on it. You may want to include parameter “sort=exeela fchela”. This way you get the slower SQL at the top of the TKPROF report.
  5. With SQL Trace and TKPROF, identify the slower SQL and use SQL Monitor and/or SQL XTRACT to get more granular diagnostics (you need to identify SQL_ID). On any given PL/SQL library, it is common that 1~5 SQL statements consume > 80% of the Elapsed Time. Focus on these large consumers.

Conclusion

Producing a SQL Trace with EVENT 10046 level 8 or 12 is very useful to properly diagnose the code on a PL/SQL library which performs poorly. A follow-up on the slower SQL with SQL Monitor and/or SQLT XTRACT is in order. The method presented above is very easy and safe to implement.

Written by Carlos Sierra

December 23, 2013 at 12:23 pm

Counting rows fast

with one comment

A friend of mine asked me last night basically this: “How is that SQLTXPLAIN counts rows?”. In particular, he was referring to the use of the SAMPLE clause of the SELECT statement. Look at this SQLT’s log piece:

SQL_ID a9x1kc4ymyhkz
--------------------
SELECT /*+ FULL(t) PARALLEL */ COUNT(*) * 1e4
FROM "XYPZ"."INSTRUMENT" SAMPLE (.01) t

SQL_ID 025v6k1032t69
--------------------
SELECT /*+ FULL(t) PARALLEL */ COUNT(*) * 1e5
FROM "XYPZ"."POSITION_COMPOSITION" SAMPLE (.001) t

SQL_ID 8rby3340xpd9k
--------------------
SELECT /*+ FULL(t) PARALLEL */ COUNT(*) * 1e5
FROM "XYPZ"."POSITION_EVENT" SAMPLE (.001) t

WHY is it that SQLT has to count rows?

SQLT has to count rows so it can report side by side DBA_TABLES.NUM_ROWS and COUNT(*) from each Table. This is an easy way to see if your statistics are way off, and this mechanism exists on SQLT well before DBA_TAB_MODIFICATIONS came to existence. Actually, SQLT uses both methods to health-check how stale are your Table statistics.

The conundrum here is: “I use SQLT because I want to diagnose a performance issue on a QUERY on top of large Tables, but I do not want SQLT to take a long time just to produce a COUNT(*) of my Tables…”.

Fast versus Precise

In Performance tuning, there is always a trade-off. You want X but you sacrifice B. Counting rows is no different. Do you want it faster? Then you sacrifice precision. The SAMPLE clause of the SELECT statement allows you to do exactly that (syntax below):

SAMPLE [ BLOCK ] ( sample_percent ) [ SEED ( seed_value ) ]

So, if you specify a 10% sample size then you have to multiply the COUNT(*) by 10. If you sample 1% you multiply the COUNT(*) by 100. In large Tables if you sample, lets say 0.1%, your multiplier becomes 1,000, which is the same than 1e3 (10**3 or 10^3 depending where you went to school). Sample size can be as small as 0.000,001 and as large as 100 (but without including 100 itself). It represents probabilities more than an actual sample size.

The optional BLOCK clause simple says: use sample blocks instead of rows. And the optional SEED clause tries to provide some consistency in the result of the count when you use the same value for two executions of the exact same count. This SEED clause takes a value between 0 and 4,294,967,295.

How SQLT counts rows?

SQLT has over 40 tool parameters. One of them is count_star_threshold with a seeded value of 10,000.

SQLT includes a small algorithm (below) that determines the size of the SAMPLE according to the estimated size of the Table itself, by looking at its statistics as per DBA_TABLES.NUM_ROWS. No statistics? then skip the sample and do a normal full scan. If the Table is expected to be smaller then the count_star_threshold, then do a full scan. So is up to 10x this threshold. After that, use a sample size proportionally inverse to the Table size. The bigger the Table the smaller the Sample.

SQLT also forces a full Table scan and invokes Parallel Execution (PX) as a method to expedite the count. This count can be really fast on Exadata systems as you can imagine.

 /* -------------------------
 *
 * private perform_count_star
 *
 * called by: sqlt$i.common_calls and sqlt$i.remote_xtract
 *
 * ------------------------- */
 PROCEDURE perform_count_star (p_statement_id IN NUMBER)
 IS
 l_sql VARCHAR2(32767);
 l_number NUMBER;
 l_count NUMBER;
 BEGIN
 write_log('=> perform_count_star');

IF sqlt$a.get_param_n('count_star_threshold') = 0 THEN
 write_log('skip "count_star" as per corresponding parameter');
 ELSE
 FOR i IN (SELECT owner, table_name, num_rows, source
 FROM &&tool_administer_schema..sqlt$_dba_all_tables_v
 WHERE statement_id = p_statement_id
 ORDER BY
 owner, table_name)
 LOOP
 IF i.num_rows IS NULL THEN
 l_sql := 'SELECT /*+ FULL(t) PARALLEL */ COUNT(*)
FROM "'||i.owner||'"."'||i.table_name||'" t WHERE ROWNUM <= :number';
 l_number := sqlt$a.get_param_n('count_star_threshold');
 ELSIF i.num_rows < sqlt$a.get_param_n('count_star_threshold') THEN
 l_sql := 'SELECT /*+ FULL(t) PARALLEL */ COUNT(*)
FROM "'||i.owner||'"."'||i.table_name||'" t WHERE ROWNUM <= :number';
 l_number := sqlt$a.get_param_n('count_star_threshold') * 10;
 ELSIF i.num_rows < (sqlt$a.get_param_n('count_star_threshold') * 1e1) THEN
 l_sql := 'SELECT /*+ FULL(t) PARALLEL */ COUNT(*) * 1e1
FROM "'||i.owner||'"."'||i.table_name||'" SAMPLE (:number) t';
 l_number := 1e1;
 ELSIF i.num_rows < (sqlt$a.get_param_n('count_star_threshold') * 1e2) THEN
 l_sql := 'SELECT /*+ FULL(t) PARALLEL */ COUNT(*) * 1e2
FROM "'||i.owner||'"."'||i.table_name||'" SAMPLE (:number) t';
 l_number := 1e0;
 ELSIF i.num_rows < (sqlt$a.get_param_n('count_star_threshold') * 1e3) THEN
 l_sql := 'SELECT /*+ FULL(t) PARALLEL */ COUNT(*) * 1e3
FROM "'||i.owner||'"."'||i.table_name||'" SAMPLE (:number) t';
 l_number := 1/1e1;
 ELSIF i.num_rows < (sqlt$a.get_param_n('count_star_threshold') * 1e4) THEN
 l_sql := 'SELECT /*+ FULL(t) PARALLEL */ COUNT(*) * 1e4
FROM "'||i.owner||'"."'||i.table_name||'" SAMPLE (:number) t';
 l_number := 1/1e2;
 ELSIF i.num_rows < (sqlt$a.get_param_n('count_star_threshold') * 1e5) THEN
 l_sql := 'SELECT /*+ FULL(t) PARALLEL */ COUNT(*) * 1e5
FROM "'||i.owner||'"."'||i.table_name||'" SAMPLE (:number) t';
 l_number := 1/1e3;
 ELSIF i.num_rows < (sqlt$a.get_param_n('count_star_threshold') * 1e6) THEN
 l_sql := 'SELECT /*+ FULL(t) PARALLEL */ COUNT(*) * 1e6
FROM "'||i.owner||'"."'||i.table_name||'" SAMPLE (:number) t';
 l_number := 1/1e4;
 ELSIF i.num_rows < (sqlt$a.get_param_n('count_star_threshold') * 1e7) THEN
 l_sql := 'SELECT /*+ FULL(t) PARALLEL */ COUNT(*) * 1e7
FROM "'||i.owner||'"."'||i.table_name||'" SAMPLE (:number) t';
 l_number := 1/1e5;
 ELSE
 l_sql := 'SELECT /*+ FULL(t) PARALLEL */ COUNT(*) * 1e8
FROM "'||i.owner||'"."'||i.table_name||'" SAMPLE (:number) t';
 l_number := 1/1e6;
 END IF;

l_sql := REPLACE(l_sql, ':number', l_number);
 write_log('num_rows='||i.num_rows||' sql='||l_sql);
 l_count := NULL;

BEGIN
 EXECUTE IMMEDIATE l_sql INTO l_count;
 write_log(l_count||' rows counted');
 EXCEPTION
 WHEN OTHERS THEN
 write_log('** '||SQLERRM);
 write_log(l_sql||' failed with error above. Process continues.');
 END;

IF l_count IS NOT NULL THEN
 IF i.source = 'DBA_TABLES' THEN
 UPDATE &&tool_repository_schema..sqlt$_dba_tables
 SET count_star = l_count
 WHERE statement_id = p_statement_id
 AND owner = i.owner
 AND table_name = i.table_name;
 ELSIF i.source = 'DBA_OBJECT_TABLES' THEN
 UPDATE &&tool_repository_schema..sqlt$_dba_object_tables
 SET count_star = l_count
 WHERE statement_id = p_statement_id
 AND owner = i.owner
 AND table_name = i.table_name;
 END IF;
 END IF;
 END LOOP;

COMMIT;
 END IF;

write_log('<= perform_count_star');
 END perform_count_star;

Conclusion

Counting rows is like counting beans, you can count one at a time, or you can take some shortcuts. If you are willing to sacrifice some precision for the sake of gaining performance, consider then using the SAMPLE clause of the SELECT statement.

Carlos Sierra’s shared Scripts and Presentations

with 2 comments

I recently delivered 3 sessions at the East Coast Oracle Users Group (ECO). During these sessions I offered to share the actual Presentations and some of the Scripts I used during the 3rd session. I plan to keep updating and expanding both scripts and presentations. They also show now on the right side of this page. Feel free to use, share and recycle any of my scripts and presentations.

Written by Carlos Sierra

November 12, 2013 at 7:23 am

A healthy way to do an Oracle database health-check

with 12 comments

Q: How do I do an Oracle database health-check?

A: It all depends. (hint: we can use this answer for most Oracle related questions in general and Performance related in particular, but don’t try it at home).

This seems like a quite broad question and yes it is. And of course there are many ways to proceed with a database health-check. So at this post I ‘d rather talk about: what I think is a healthy way to approach an Oracle database health-check.

  1. Start with the basics: Listen to the users of this database. If nobody complains then most probably you would have to define the scope by yourself. In any case, go on.
  2. Gather environment information. This includes the understanding of the architecture used, the databases on such architecture and the applications on those databases. Also learn who is who: Users, DBAs and Developers.
  3. Gather metrics. I am referring to OS metrics (CPU, IO and Memory), and also database metrics (AWR) together with alert logs. When gathering these metrics focus on time periods where the pain has been reported, and slice the time as small as possible (i.e. AWR reports for each time slice captured, avoiding the 6-24 hours time frame common mistake).
  4. Let the combination of complains (concerns) and findings on metrics to guide you to the next step. This is where most get lost. So don’t panic and dive in into what you see as contention on your metrics. Keep in mind that the most important metric of all is “user response time”, so anything affecting it must be in your priority list.
  5. There are many more things to check, but they are more in the configuration and sound practices arena. For example: redundancy on control files, archive mode, backups, non-default parameters, PX setup, memory setup, etc. For these, creating a check list would help.
  6. At some point you will have many leads and you will start to lose focus. Do some yoga or go for a walk, then make an A, B, C list with what is really important, what is kind-of and what is mere style.
  7. You are not an expert on every aspect of the database (nobody is). So, do not pretend you can do everything yourself. Rely on your peers and/or contacts. Reach out for help in those areas where you feel insecure (feeling insecure is a good thing, much better than feeling secure without any solid foundation).
  8. Once you think you have it nailed, go to your peers, colleagues, boss(es), friends, partner, or strangers if necessarily, and solicit a review of your findings and recommendations. Accept feedback. This is key. Maybe what you thought was sound it makes absolutely no sense to someone with more experience or simply with a different view.
  9. Reconsider everything. Avoid the pitfall of assuming that what you have learn in your two-digits years of experience can be applied to every case. For example, if you have done mostly SQL Tuning, don’t expect every issue to be SQL Tuning. Health-checks are like fortune cookies, you never know what you will get.
  10. Last but not least: Learn from your new experience, practice listening to others, use your common sense, exercise your knowledge, and work as a team member. Add the word “collaboration” to your daily work and maybe one day you will learn you are not alone.

Cheers — Carlos

Written by Carlos Sierra

November 1, 2013 at 7:27 am

Speaking about SQLT XPLORE – The SQLT hidden child

leave a comment »

Next week I will be participating at the East Coast Oracle Users Group ECO. If you are not familiar with this Oracle Users Group you may want to check it out. It gathers close to 300 Oracle users, and every year it gets bigger and better. The environment is friendly, and I would say even cozy. There are 6 simultaneous tracks packed into 2 full days. Sometimes it is hard to decide which session to attend, but fortunately the diversity of topics usually reduces the choice to one or two (unless you are like me, and want to learn, and learn, and learn…)

Anyways, the schedule is here so you can check what is all about. If you decide to attend this year (next week), you can still register today and get a small discount using code SPEAKERVIP.

I will be speaking about two topics. One is about Adaptive Cursor Sharing, while the second is about SQLT XPLORE, which I call “the SQLT hidden child“. SQLT XPLORE is a stand-alone module inside SQLTXPLAIN (SQLT), but it does not require SQLT to be installed. This SQLT XPLORE uses brute force analysis in order to “discover” Execution Plans that may be elusive. Typical case is when you upgrade your database and your Execution Plan changed.

Stelios Charalambides writes about SQLT XPLORE in his book “Oracle SQL Tuning with Oracle SQLTXPLAIN“. In my session at ECO I will show some XPLORE samples, and will explain how to read its output. I am planing to do a live demo about executing this tool. And of course, all questions are welcomed!

I hope to see some familiar faces next week at ECO, and also to meet some new Oracle users and other speakers. Looking forward to speak at ECO next week!

Written by Carlos Sierra

October 31, 2013 at 7:50 am

Speaking at the SLOUG tomorrow

leave a comment »

Kerry and I will be speaking at the Saint Louis Oracle Users Group. The name of the event is Big Bash 2013. We will talk about problem solving, plan stability and plan flexibility. We expect to have a large audience and we look forward to meet old friends and make new ones. Oracle Users Groups are nice gatherings, and every group has its one charisma. For me, this is my first time speaking at SLOUG, and actually the first time in Saint Louis. Flying today and retuning tomorrow late at night. Feeling excited!

Written by Carlos Sierra

October 17, 2013 at 7:32 am

Posted in Conferences, OUG

SQL Monitor and SQL ASH Scripts are now available

with 12 comments

I just uploaded two new scripts to my set of “Shared Scripts“. These are the sqlmon.sql and the sqlash.sql. They both input a SQL_ID and produce some txt/html/zip output.

The sqlmon.sql produces SQL Monitor Reports for all executions available on SQL Monitor for a given SQL. It is RAC aware and its output includes both text and “active” reports. You must have an Oracle Tuning Pack license on the system where you may want to use it.

The sqlash.sql produces Active Session History (ASH) Reports for one SQL_ID. It is also RAC aware and output includes both text and html formats. It produces these ASH reports from both memory and AWR. You must have an Oracle Diagnostics Pack license on the system where you may want to use it.

These two scripts are light weight. Install nothing, change nothing. If you need a more robust diagnostic tool in these areas, please use SQLHC and/or SQLT. In the other hand, for a quick understanding of one SQL you may want to use the trio: planx.sql, sqlash.sql and sqlmon.sql.

Written by Carlos Sierra

October 15, 2013 at 9:03 pm

YASTS: Yet Another SQL Tuning Script: planx.sql

with 6 comments

Having SQLTXPLAIN and SQLHC available, WHY do I need yet another way to display execution plans?

New script planx.sql reports execution plans for one SQL_ID from RAC and AWR. It is lightweight and installs nothing. It produces list of performance metrics for given SQL out of gv$sqlstats, gv$sqlstats_plan_hash, gv$sql and dba_hist_sqlstat. It also displays execution plans from gv$sql_plan_statistics_all and dba_hist_sql_plan. It is RAC aware. It also reports on io_saved when executed on Exadata.

Most stand-alone light-weight scripts I have seen only report plans from connected RAC node. This script reports from all RAC nodes. The AWR piece is optional. In other words, if your site does not have a Diagnostics Pack License you can specify so when executing this script, thus all access to AWR data is simply skipped. Output is plain text and it executes in seconds.

I will be using this planx.sql as my first step in the analysis of queries performing slowly. If I need more, then I will use SQLHC or SQLTXPLAIN. This planx.sql script, as well as some others, are beginning to populate my new shared directory of “free” scripts. The link is at the right of the screen, and also here. Quite often I write small scripts to do my job, now they will have a new house there. A readme provides a one-line description of each script.

Conclusion

New planx.sql is an alternative to plain DBMS_XPLAIN.DISPLAY_CURSOR. It displays plans from all RAC nodes and from AWR(opt). It also reports relevant performance metrics for all recorded execution plans. It is fast and installs nothing.

Written by Carlos Sierra

October 9, 2013 at 6:06 pm

Carlos Sierra Shared Scripts

with 2 comments

Every so often I write small scripts for my personal use. Recently I wrote this mini tiny version of a snaper on v$mystat. Basically a script with no parameters that if executed right before and after a SQL I am investigating, it spits out the delta on two snaps on v$mystat taken just before and after my SQL. This way I can quickly validate some of the “cell” counts used by Exadata for example. Of course if I want a more robust script, and I want the capability to monitor a session other than the one I connected, I would use snaper.sql created by Tanel Poder. In my case, I just needed something extra light, with no installation of any object into the database. This mini tiny script is named mystat.sql and it is available here. As time passes by, I will be writing other small and useful scripts. I will upload them into same location. So feel free to visit this Google Drive place every so often. If I upload something exiting I will blog about it. Until then, I just have two new scripts: mystat.sql and tkprof.sql.

Written by Carlos Sierra

October 4, 2013 at 10:43 am

Posted in Scripts

Speaking at Oaktable World

with 6 comments

Some of you may not know what is the Oaktable.  I quote here a very precise description right from the Oaktable’s main page: <<< The OakTable network is a network for the Oracle scientist, who believes in better ways of administering and developing Oracle based systems. The network is an informal organisation, which will get together at various Oracle events. >>> To me, it is a lot more, since its members are a very collaborative bunch of smart and dedicated folks. A common denominator I have perceived so far is the honest desire to apply scientific methods to complex issues we encounter in the Oracle world. If you are serious about learning Oracle and its intricacies, you may want to do as I do:  “feast from the crumbles of the Oaktable“.

The Oaktable World is a conference that happens to coincide with OOW.  This year I have the honor to speak at Oaktable World. According to the agenda, I will speak on Tuesday, September 24 at 10am. My session is at the same time than Jonathan Lewis, thus I am prepared to speak to myself (unless lucky me you happen to not know who is JL). Anyways, I will be speaking about Adaptive Cursor Sharing (ACS) and how it actually works. I will do some live demos and my goal is that all participants (if any) can actually fully understand how and when ACS produces multiple optimal execution plans for a given SQL.

If you plan to attend any of the Oaktable World sessions, please be sure you register. Space is very limited.

I will be attending myself as many of the other Oaktable World sessions as I can, else I will be there at the Enkitec booth #501 in Moscone South in case you want to stop-by and say hello.

Written by Carlos Sierra

September 16, 2013 at 5:13 pm