Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

Archive for the ‘Scripts’ Category

Forcing a “Nested Loop only” Execution Plan


Sometimes you do what you have to do. So here I confess to doing something I usually avoid: forcing an Execution Plan (which is not the same as using a more conventional method for Plan stability).

This is a case on an 11.2.0.3.0 base release where the application vendor sets the optimizer to 9i and tweaks other CBO parameters in questionable ways; as expected, some queries then produce suboptimal plans, and you are called to help without changing the obvious.

There is a family of queries from an ad-hoc query generator that permits users to issue queries without a set of selective predicates. These queries join several large tables and their performance is poor (as expected as well!). On top of that, all these queries include the /*+ FIRST_ROWS */ CBO Hint and the questionable DISTINCT keyword. Note: it is quite common for developers to throw in a DISTINCT keyword “to avoid duplicates” where the mere existence of duplicates would indicate an application bug; so “why fix it if I can hide it, right?”.

There is one caveat, though: these queries include a generic predicate “rownum <= :b1”, and the value passed defaults to 5000, so the users’ rationale is “if I only want the first X rows, my query should return fast”. This highlights yet another questionable practice, since it is hard to imagine a user scrolling through 5,000 rows and making any sense of such a large set, especially when the full “filtered” set would be several million rows long. So the original problem is questionable in several ways. Nevertheless, sometimes we are called to help besides providing advice. And no, we are not allowed to slap hands 😉

The good news is that we can use this extra predicate on rownum and make these queries return the first X rows really fast; and I mean less than 5 seconds instead of an hour or more! And if users want not 5000 but 500 or even 50 rows, then we can be in the sub-second range!

You may be thinking FIRST_ROWS optimization, and that was my first try. Unfortunately, on 11.2.0.3.0, even after reversing all the suboptimal CBO parameters at the session level, I would consistently get an Execution Plan with a few Hash Joins and a large Cost; and if I forced a Nested Loop Plan, its cost would be several orders of magnitude larger, so the CBO would not pick it! Nevertheless, such a “Nested Loop only” Execution Plan would fulfill the user’s expectations, regardless of the validity of the initial request. And yes, the CBO statistics are OK; not perfect, but simply OK. One more piece of info: this is not Exadata! (If it were Exadata, most probably these same Execution Plans with Full Table Scans and Hash Joins would simply fly!)

So, my issue became: how do I force an Execution Plan that only contains Nested Loops? If I could do that, then the COUNT STOPKEY operation could halt my SQL execution once the first X rows were fetched (a Hash Join does not allow that, since it must build its entire hash table before returning rows). Remember: these tables have literally millions of rows. I could pepper these queries with a ton of CBO Hints and get my desired “Nested Loop only” Execution Plan… but that would be a lot of work and tricky at best.
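To make the COUNT STOPKEY point concrete, here is a hedged toy sketch (the tables t1 and t2 and the index on t2.id are hypothetical). With Nested Loops, COUNT STOPKEY sits above the join and execution can halt as soon as :b1 rows have been produced; a Hash Join must first build its full hash table, so it cannot stop that early.

EXPLAIN PLAN FOR
SELECT /*+ LEADING(t1) USE_NL(t2) */ *
  FROM t1, t2
 WHERE t2.id = t1.id
   AND ROWNUM <= :b1;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Expected plan shape (abridged):
-- COUNT STOPKEY
--   NESTED LOOPS
--     TABLE ACCESS FULL   T1
--     INDEX RANGE SCAN    T2_ID_IX (hypothetical index)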

SQL Patch to the rescue

I could have used a SQL Profile, but I think this dirty trick of suppressing Hash Joins and Sort Merge Joins is better served by a SQL Patch. I also thought of Siebel: they tweak CBO parameters as well, and they suppress Hash Joins, but they change System- and Session-level parameters… Since I wanted my change to be very localized, a SQL Patch could provide just what I needed.

Under the Downloads section on the margin of this page, there is a “cscripts” link that includes the sqlpch.sql script. I used this script and passed as the second parameter the string below (the 1st parameter is the SQL_ID). With a SQL Patch generated this way, I could systematically produce a “Nested Loops only” Execution Plan for these few queries. I did not have to change the original SQL, nor change the CBO environment at the System or Session level, nor restrict the query generator, and I did not have to “educate” the users to avoid such unbounded queries.

OPT_PARAM("_optimizer_sortmerge_join_enabled" "FALSE") OPT_PARAM("_hash_join_enabled" "FALSE")
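For illustration only, here is a hand-rolled sketch of the kind of call that sqlpch.sql automates. On 11g the SQL Patch API is the undocumented SYS.DBMS_SQLDIAG_INTERNAL.I_CREATE_PATCH procedure (later releases expose a documented DBMS_SQLDIAG.CREATE_SQL_PATCH); the SQL text and patch name below are hypothetical placeholders:

BEGIN
  -- attach the two OPT_PARAM hints above to one specific SQL text
  SYS.DBMS_SQLDIAG_INTERNAL.I_CREATE_PATCH(
    sql_text  => 'SELECT /*+ FIRST_ROWS */ DISTINCT col1 FROM some_app_table WHERE ROWNUM <= :b1',
    hint_text => 'OPT_PARAM("_optimizer_sortmerge_join_enabled" "FALSE") OPT_PARAM("_hash_join_enabled" "FALSE")',
    name      => 'nested_loops_only_patch');
END;
/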

Conclusion

I have to concede doing something questionable: in this case, using a SQL Patch to force a desired Execution Plan instead of fixing the obvious, simply because that was the shortest path to alleviate the user’s pain.

I consider the technique above a temporary workaround and not a solution to the actual issue. In this case the right way to handle the issue would be:

  1. Have the application vendor certify their application to the latest release of the database and reset all CBO related parameters, plus
  2. Have the application vendor remove CBO Hints and DISTINCT keyword from queries, plus
  3. Configure the ad-hoc query generator to restrict users from executing queries without selective predicates, then
  4. Tune those outlier queries that may still need some work to perform as per business requirements, and possibly
  5. Educate the users to provide as many selective predicates as possible

Anyway, the potential of using a SQL Patch to tweak an Execution Plan in mysterious ways is quite powerful, and something we may want to keep in the back of our minds for a rainy day…

Written by Carlos Sierra

August 8, 2015 at 8:30 am

edb360 taking a long time


In most cases edb360 takes less than 1 hour to execute, but I often hear of cases where it takes a lot longer than that. In one corner case it was taking several days and had to be killed.

So the question is: why does edb360 take that long?

Well, edb360 executes thousands of SQL statements sequentially (intentionally). Many of these queries read data from AWR, and in particular from ASH. So, let’s say your ASH historical table has 2 billion rows, and on top of that you have not gathered statistics on the AWR tables in years; the CBO then under-estimates cardinality and tends to use index access and Nested Loops. In such extreme cases you may end up with suboptimal execution plans that expect to return a few rows but actually read a couple of billion rows using index access operations and Nested Loops. A query like this may take hours to complete!

As of version v1515, edb360 has a shortcut algorithm that ends an execution after 8 hours. So you may get an incomplete output, but it ends normally and the partial output can actually be used. This is not a solution but a workaround for those long executions.

How to troubleshoot edb360 taking too long?

Steps:

1. Review files 00002_edb360_dbname_log.txt, 00003_edb360_dbname_log2.txt, 00004_edb360_dbname_log3.txt and 00005_edb360_dbname_tkprof_sort.txt. The first log shows the state of the statistics for the AWR tables. If the stats are old, gather them fresh with script edb360/sql/gather_stats_wr_sys.sql (a minimal sketch of the idea follows).
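This is not the actual gather_stats_wr_sys.sql, which is more thorough; just a hedged sketch of the idea, to be run from a DBA account:

BEGIN
  -- refresh CBO statistics on a few AWR history tables that edb360 reads heavily
  FOR t IN (SELECT table_name
              FROM dba_tables
             WHERE owner = 'SYS'
               AND table_name IN ('WRH$_ACTIVE_SESSION_HISTORY', 'WRH$_SQLSTAT', 'WRH$_SEG_STAT'))
  LOOP
    DBMS_STATS.GATHER_TABLE_STATS(ownname => 'SYS', tabname => t.table_name);
  END LOOP;
END;
/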

2. If the number of rows in WRH$_ACTIVE_SESSION_HISTORY as per 00002_edb360_dbname_log.txt is in the several millions, then you may not be purging data periodically. There are some known bugs and some blog posts in this regard. Review MOS note 387914.1 and proceed accordingly. Execute the query below to validate the ASH age (a hedged retention example follows the query):

SELECT TRUNC(sample_time, 'MM'), COUNT(*)
  FROM dba_hist_active_sess_history
 GROUP BY TRUNC(sample_time, 'MM')
 ORDER BY TRUNC(sample_time, 'MM')
 /
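If the output shows ASH that is months or years old, one hedged option (only after reviewing the MOS note above) is to shrink the AWR retention so the automatic purge gradually removes the excess; note that the retention parameter is expressed in minutes:

-- keep 31 days of AWR/ASH history (retention is in minutes)
EXEC DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(retention => 31 * 24 * 60);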

3. If your edb360 version (first line of its readme) is older than 1 month, download and use the latest version: https://github.com/carlos-sierra/edb360/archive/master.zip (the link is also provided on the right-hand side of this blog under Downloads).

4. Consider suppressing the text and/or CSV reports, each for an estimated gain of about 20%. Keep in mind that when suppressing reports you start losing some functionality. To suppress, let’s say, the text and CSV reports, place the following two commands at the end of script edb360/sql/edb360_00_config.sql:

DEF edb360_conf_incl_text = 'N';
DEF edb360_conf_incl_csv = 'N';

5. If after going through steps 1-4 above edb360 still takes longer than a few hours, feel free to email the author at carlos.sierra.usa@gmail.com and provide the 4 files from step 1.

Written by Carlos Sierra

May 6, 2015 at 8:19 pm

Discovering if a System level Parameter has changed its value (and when it happened)


Quite often I learn of a system where “nobody changed anything” and suddenly the system is experiencing some strange behavior. Then, after diligent investigation, it turns out someone changed a little parameter at the System level but disregarded mentioning it, thinking it had no connection to the unexpected behavior. As we all know, System parameters are big knobs that we don’t change lightly; still, we often see “unknown” changes like the one described.

The script below produces a list of changes to System parameter values, indicating when a parameter was changed and from which value to which value. It does not filter out cache re-sizing operations or resource manager plan changes. Both would be easy to exclude (see the variation after the script), but I’d rather see those global changes listed as well.

Note: the script below should only be executed if your site has a license for the Oracle Diagnostics Pack (or the Tuning Pack), since it reads from AWR.

WITH
all_parameters AS (
SELECT snap_id,
       dbid,
       instance_number,
       parameter_name,
       value,
       isdefault,
       ismodified,
       lag(value) OVER (PARTITION BY dbid, instance_number, parameter_hash ORDER BY snap_id) prior_value
  FROM dba_hist_parameter
)
SELECT TO_CHAR(s.begin_interval_time, 'YYYY-MM-DD HH24:MI') begin_time,
       TO_CHAR(s.end_interval_time, 'YYYY-MM-DD HH24:MI') end_time,
       p.snap_id,
       p.dbid,
       p.instance_number,
       p.parameter_name,
       p.value,
       p.isdefault,
       p.ismodified,
       p.prior_value
  FROM all_parameters p,
       dba_hist_snapshot s
 WHERE p.value != p.prior_value
   AND s.snap_id = p.snap_id
   AND s.dbid = p.dbid
   AND s.instance_number = p.instance_number
 ORDER BY
       s.begin_interval_time DESC,
       p.dbid,
       p.instance_number,
       p.parameter_name
/
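And as promised, the variation: a hedged way to exclude the automatic memory re-sizing noise (the double-underscore parameters) and resource manager plan changes is to append predicates like these to the WHERE clause of the script above (shown here as a fragment, not a runnable statement):

   AND p.parameter_name NOT LIKE '\_\_%' ESCAPE '\'
   AND p.parameter_name <> 'resource_manager_plan'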

Sample output follows, where we can see that a parameter affecting the Degree of Parallelism was changed. This is just to illustrate its use. Enjoy this new free script! It is now part of edb360.
[Screenshot: sample output showing a change to a parallelism-related parameter]

Written by Carlos Sierra

March 25, 2015 at 7:25 pm

SQLTXPLAIN under new administration


During my 17 years at Oracle, I developed several tools and scripts. The largest and most widely used is SQLTXPLAIN. It is available through My Oracle Support (MOS) under document_id 215187.1.

SQLTXPLAIN, also known as SQLT, is a tool for SQL diagnostics, including Performance and Wrong Results. I am the original developer and author, but since the very early stages of its development this tool has encapsulated the expertise of many bright engineers, DBAs, developers and others, who constantly helped to improve it on every new release by providing valuable feedback. SQLT is then nothing but the collection of many good ideas from many people. I was just the lucky guy who decided to build something useful for the Oracle SQL tuning community.

When I decided to join Enkitec back in 2013, I asked Mauro Pagano to look after my baby (I mean SQLT), and sure enough he did an excellent job. Mauro fixed most of my bugs, as he jokes, and also incorporated some of his own :-). Mauro kept SQLT in good shape and was able to continue improving it on every new release. Now Mauro also works for Enkitec, so SQLT has a new owner and custodian at Oracle.

Abel Macias is the new owner of SQLT, and as such he keeps busy maintaining and enhancing this tool among his other duties at Oracle. So, if you have enhancement requests or positive feedback, please reach out to Abel at his Oracle account: abel.macias@oracle.com. If you come across some of my other tools and scripts and they show my former Oracle account (carlos.sierra@oracle.com), please reach out to Abel and he might be able to route your concern or question.

Since one of my hobbies is to build free software that I also consume, my current efforts are on eDB360, eAdam and eSP. The most popular and openly available is eDB360, which basically gives you a 360-degree view of a database without installing anything. Mauro is also building something cool in his own free time: the new SQLd360 tool, which is already available on the web (search for SQLd360). SQLd360, similar to eDB360, provides a 360-degree view, but instead of a database its focus is one SQL statement. And like eDB360, it installs nothing on the database. Both are available as “free software” for anyone to download and use. That is the nice part: everyone likes free! (Especially if it is any good.)

What is the difference between SQLd360 and SQLT?

Both are exceptional tools, and both can be used for SQL Tuning and SQL diagnostics. The main differences, in my opinion, are these:

  1. SQLT has it all. It is huge and covers pretty much all corners. So, for SQL Tuning, SQLTXPLAIN is “THE” tool.
  2. SQLd360, on the other hand, is smaller, newer and faster to execute. It gives me what is most important and most commonly used.
  3. SQLT requires installing a couple of schemas and hundreds of objects. SQLd360 installs nothing!
  4. To download SQLT you need to log in to MOS. In contrast, SQLd360 is wide open (free software license), and no login is needed.
  5. Oracle Support requires SQLT, and Oracle Engineers are not yet exposed to SQLd360.
  6. SQLd360 uses Google Charts (as eDB360 does), which enhance the readability of large data sets, like time series for example. Thus SQLd360 output is quite readable.
  7. eDB360 invokes SQLd360 on SQL of interest (large database consumers), so in that sense SQLd360 enhances eDB360. But SQLd360 can also be used stand-alone.

If you ask me which one I would recommend, I would answer: both! If you can use both, that is better than using just one. Each of these two tools (SQLT and SQLd360) has pros and cons compared to the other. But in the end both are great tools. And thanks to Abel Macias, SQLT continues its lifecycle with frequent enhancements. And thanks to Mauro, we now have a new kid on the block! I would say we have a win-win for our large Oracle community!

Written by Carlos Sierra

March 18, 2015 at 12:37 pm

How to identify SQL performing poorly on an APEX application?


Oracle Application Express (APEX) is a great tool to rapidly develop applications on top of an Oracle database. While developing an internal application, we noticed that some pages were slow, meaning they took a few seconds to refresh. Suspecting there was some poorly performing SQL behind those pages, we tried to generate a SQL Trace so we could review the generated SQL. Well, there is no out-of-the-box instrumentation to turn SQL Trace ON from an APEX page… Thus our challenge became: how can we identify suspect, poorly performing SQL when it is generated by an APEX page?

Using ASH

Active Session History (ASH) requires an Oracle Diagnostics Pack license. If your site has such a license and you need to identify poorly performing SQL generated by APEX, you may want to use the find_apex.sql script below. It asks for an application user and for the APEX session (a list is provided in both cases). It outputs a list of poorly performing SQL statements, indicating the APEX page of origin, the SQL_ID and the SQL text. With the SQL_ID you can use some other tool to gather additional diagnostic details, including the Execution Plan; for that you may want to use planx.sql, sqlmon.sql or sqlash.sql. Note that the find_apex.sql script also references sqld360.sql, but this new tool is not yet available, so use one of the other 3 suggestions for the time being (or SQLHC/SQLT).

To find poorly performing SQL, the find_apex.sql script uses ASH instead of SQL Trace. If the action on a page takes more than a second, then most probably ASH will have captured the poorly performing SQL delaying the page.
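As a quick sanity check before running the script, you can peek at the raw ASH rows it keys on: APEX tags its database sessions with a MODULE containing “/APEX:APP ” and a CLIENT_ID of the form application_user:apex_session_id, which is the same pattern the script below relies on:

SELECT sample_time, module, client_id, sql_id
  FROM gv$active_session_history
 WHERE module LIKE '%/APEX:APP %'
   AND ROWNUM <= 10;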

Script

----------------------------------------------------------------------------------------
--
-- File name: find_apex.sql
--
-- Purpose: Finds APEX poorly performing SQL for a given application user and session
--
-- Author: Carlos Sierra
--
-- Version: 2014/09/03
--
-- Usage: Inputs APEX application user and session id, and outputs list of poorly
-- performing SQL statements for further investigation with other tools.
--
-- Example: @find_apex.sql
--
-- Notes: Developed and tested on 11.2.0.3.
--
-- Requires an Oracle Diagnostics Pack License since ASH data is accessed.
--
-- To further investigate poorly performing SQL use sqld360.sql
-- (or planx.sql or sqlmon.sql or sqlash.sql).
--
---------------------------------------------------------------------------------------
--
WHENEVER SQLERROR EXIT SQL.SQLCODE;
ACC confirm_license PROMPT 'Confirm with "Y" that your site has an Oracle Diagnostics Pack License: '
BEGIN
  IF NOT '&&confirm_license.' = 'Y' THEN
    RAISE_APPLICATION_ERROR(-20000, 'You must have an Oracle Diagnostics Pack License in order to use this script.');
  END IF;
END;
/
WHENEVER SQLERROR CONTINUE;
--
COL seconds FOR 999,990;
COL appl_user FOR A30;
COL min_sample_time FOR A25;
COL max_sample_time FOR A25;
COL apex_session_id FOR A25;
COL page FOR A4;
COL sql_text FOR A80;
--
SELECT COUNT(*) seconds,
       SUBSTR(client_id, 1, INSTR(client_id, ':') - 1) appl_user,
       MIN(sample_time) min_sample_time,
       MAX(sample_time) max_sample_time
  FROM gv$active_session_history
 WHERE module LIKE '%/APEX:APP %'
 GROUP BY
       SUBSTR(client_id, 1, INSTR(client_id, ':') - 1)
HAVING SUBSTR(client_id, 1, INSTR(client_id, ':') - 1) IS NOT NULL
 ORDER BY
       1 DESC, 2
/
--
ACC appl_user PROMPT 'Enter application user: ';
--
SELECT MIN(sample_time) min_sample_time,
       MAX(sample_time) max_sample_time,
       SUBSTR(client_id, INSTR(client_id, ':') + 1) apex_session_id,
       COUNT(*) seconds
  FROM gv$active_session_history
 WHERE module LIKE '%/APEX:APP %'
   AND SUBSTR(client_id, 1, INSTR(client_id, ':') - 1) = TRIM('&&appl_user.')
 GROUP BY
       SUBSTR(client_id, INSTR(client_id, ':') + 1)
 ORDER BY
       1 DESC
/
--
ACC apex_session_id PROMPT 'Enter APEX session ID: ';
--
SELECT COUNT(*) seconds,
       SUBSTR(h.module, INSTR(h.module, ':', 1, 2) + 1) page,
       h.sql_id,
       SUBSTR(s.sql_text, 1, 80) sql_text
  FROM gv$active_session_history h,
       gv$sql s
 WHERE h.module LIKE '%/APEX:APP %'
   AND SUBSTR(h.client_id, 1, INSTR(h.client_id, ':') - 1) = TRIM('&&appl_user.')
   AND SUBSTR(h.client_id, INSTR(h.client_id, ':') + 1) = TRIM('&&apex_session_id.')
   AND s.sql_id = h.sql_id
   AND s.inst_id = h.inst_id
   AND s.child_number = h.sql_child_number
 GROUP BY
       SUBSTR(h.module, INSTR(h.module, ':', 1, 2) + 1),
       h.sql_id,
       SUBSTR(s.sql_text, 1, 80)
 ORDER BY
       1 DESC, 2, 3
/
--
PRO Use sqld360.sql (or planx.sql or sqlmon.sql or sqlash.sql) on SQL_ID of interest

Note

This script, as well as some others, is now available on GitHub.

Written by Carlos Sierra

September 4, 2014 at 5:29 pm

Free script to generate a Line Chart on HTML


Performance Metrics are easier to digest if visualized through some Line Charts. OEM, eDB360, eAdam and other tools use them. If you already have a SQL statement that provides the Performance Metrics you care about and just need to generate a Line Chart for them, you can easily create a CSV file and open it with MS-Excel. But if you want to build an HTML report out of your SQL, that is a bit harder, unless you use existing technologies. Tools like eDB360 and eAdam use Google Charts as a mechanism to easily generate such charts. A peer asked me if we could have such functionality stand-alone, and that challenged me to create and share it.

[Screenshot: HTML Line Chart Report with zoom, generated by line_chart.sql]
The HTML Line Chart Report above was created with the line_chart.sql script shown below. The actual chart, which includes Zoom functionality in HTML, can be downloaded from this Dropbox location. Feel free to use this line_chart.sql script as a template to display your own Performance Metrics. It can display several series in one chart (the example above shows only one), and by reviewing the code below you will see how easy it is to adjust it to your needs. The chart above was created using a simple query against the Oracle Sample Schema SH, but the actual use could be Performance Metrics or any other application time series.

Script

SET TERM OFF HEA OFF LIN 32767 NEWP NONE PAGES 0 FEED OFF ECHO OFF VER OFF LONG 32000 LONGC 2000 WRA ON TRIMS ON TRIM ON TI OFF TIMI OFF ARRAY 100 NUM 20 SQLBL ON BLO . RECSEP OFF;
PRO
DEF report_title = "Line Chart Report";
DEF report_abstract_1 = "<br>This line chart is an aggregate per month.";
DEF report_abstract_2 = "<br>It can be by day or any other slice size.";
DEF report_abstract_3 = "";
DEF report_abstract_4 = "";
DEF chart_title = "Amount Sold over 4 years";
DEF xaxis_title = "Sales between 1998-2001";
--DEF vaxis_title = "Amount Sold per Hour";
--DEF vaxis_title = "Amount Sold per Day";
DEF vaxis_title = "Amount Sold per Month";
DEF vaxis_baseline = ", baseline:2200000";
DEF chart_foot_note_1 = "<br>1) Drag to Zoom, and right click to reset Chart.";
DEF chart_foot_note_2 = "<br>2) Some other note.";
DEF chart_foot_note_3 = "";
DEF chart_foot_note_4 = "";
DEF report_foot_note = "This is a sample line chart report.";
PRO
SPO line_chart.html;
PRO <html>
PRO <!-- $Header: line_chart.sql 2014-07-27 carlos.sierra $ -->
PRO <head>
PRO <title>line_chart.html</title>
PRO
PRO <style type="text/css">
PRO body   {font:10pt Arial,Helvetica,Geneva,sans-serif; color:black; background:white;}
PRO h1     {font-size:16pt; font-weight:bold; color:#336699; border-bottom:1px solid #cccc99; margin-top:0pt; margin-bottom:0pt; padding:0px 0px 0px 0px;}
PRO h2     {font-size:14pt; font-weight:bold; color:#336699; margin-top:4pt; margin-bottom:0pt;}
PRO h3     {font-size:12pt; font-weight:bold; color:#336699; margin-top:4pt; margin-bottom:0pt;}
PRO pre    {font:8pt Monaco,"Courier New",Courier,monospace;}
PRO a      {color:#663300;}
PRO table  {font-size:8pt; border-collapse:collapse; empty-cells:show; white-space:nowrap; border:1px solid #cccc99;}
PRO li     {font-size:8pt; color:black; padding-left:4px; padding-right:4px; padding-bottom:2px;}
PRO th     {font-weight:bold; color:white; background:#0066CC; padding-left:4px; padding-right:4px; padding-bottom:2px;}
PRO td     {color:black; background:#fcfcf0; vertical-align:top; border:1px solid #cccc99;}
PRO td.c   {text-align:center;}
PRO font.n {font-size:8pt; font-style:italic; color:#336699;}
PRO font.f {font-size:8pt; color:#999999; border-top:1px solid #cccc99; margin-top:30pt;}
PRO </style>
PRO
PRO <script type="text/javascript" src="https://www.google.com/jsapi"></script>
PRO <script type="text/javascript">
PRO google.load("visualization", "1", {packages:["corechart"]})
PRO google.setOnLoadCallback(drawChart)
PRO
PRO function drawChart() {
PRO var data = google.visualization.arrayToDataTable([
/* add below more columns if needed (modify 3 places) */
PRO ['Date Column', 'Number Column 1']
/****************************************************************************************/
WITH
my_query AS (
/* query below selects one date_column and a small set of number_columns */
SELECT --TRUNC(time_id, 'HH24') date_column /* preserve the column name */
       --TRUNC(time_id, 'DD') date_column /* preserve the column name */
       TRUNC(time_id, 'MM') date_column /* preserve the column name */
       , SUM(amount_sold) number_column_1 /* add below more columns if needed (modify 3 places) */
  FROM sh.sales
 GROUP BY
       --TRUNC(time_id, 'HH24') /* aggregate per hour, but it could be any other */
       --TRUNC(time_id, 'DD') /* aggregate per day, but it could be any other */
       TRUNC(time_id, 'MM') /* aggregate per month, but it could be any other */
/* end of query */
)
/****************************************************************************************/
/* no need to modify the date column below, but you may need to add some number columns */
SELECT ', [new Date('||
       TO_CHAR(q.date_column, 'YYYY')|| /* year */
       ','||(TO_NUMBER(TO_CHAR(q.date_column, 'MM')) - 1)|| /* month - 1 */
       --','||TO_CHAR(q.date_column, 'DD')|| /* day */
       --','||TO_CHAR(q.date_column, 'HH24')|| /* hour */
       --','||TO_CHAR(q.date_column, 'MI')|| /* minute */
       --','||TO_CHAR(q.date_column, 'SS')|| /* second */
       ')'||
       ','||q.number_column_1|| /* add below more columns if needed (modify 3 places) */
       ']'
  FROM my_query q
 ORDER BY
       date_column
/
/****************************************************************************************/
PRO ]);
PRO
PRO var options = {
PRO backgroundColor: {fill: '#fcfcf0', stroke: '#336699', strokeWidth: 1},
PRO explorer: {actions: ['dragToZoom', 'rightClickToReset'], maxZoomIn: 0.1},
PRO title: '&&chart_title.',
PRO titleTextStyle: {fontSize: 16, bold: false},
PRO focusTarget: 'category',
PRO legend: {position: 'right', textStyle: {fontSize: 12}},
PRO tooltip: {textStyle: {fontSize: 10}},
PRO hAxis: {title: '&&xaxis_title.', gridlines: {count: -1}},
PRO vAxis: {title: '&&vaxis_title.' &&vaxis_baseline., gridlines: {count: -1}}
PRO }
PRO
PRO var chart = new google.visualization.LineChart(document.getElementById('chart_div'))
PRO chart.draw(data, options)
PRO }
PRO </script>
PRO </head>
PRO <body>
PRO <h1>&&report_title.</h1>
PRO &&report_abstract_1.
PRO &&report_abstract_2.
PRO &&report_abstract_3.
PRO &&report_abstract_4.
PRO <div id="chart_div" style="width: 900px; height: 500px;"></div>
PRO <font class="n">Notes:</font>
PRO <font class="n">&&chart_foot_note_1.</font>
PRO <font class="n">&&chart_foot_note_2.</font>
PRO <font class="n">&&chart_foot_note_3.</font>
PRO <font class="n">&&chart_foot_note_4.</font>
PRO <pre>
L
PRO </pre>
PRO <br>
PRO <font class="f">&&report_foot_note.</font>
PRO </body>
PRO </html>
SPO OFF;
SET HEA ON LIN 80 NEWP 1 PAGES 14 FEED ON ECHO OFF VER ON LONG 80 LONGC 80 WRA ON TRIMS OFF TRIM OFF TI OFF TIMI OFF ARRAY 15 NUM 10 NUMF "" SQLBL OFF BLO ON RECSEP WR;


Written by Carlos Sierra

July 28, 2014 at 2:34 pm

Free script to very quickly and cheaply estimate the size of an index if it were to be rebuilt


A good friend of mine recently asked me if edb360 included a section showing indexes that would benefit from a rebuild. I replied “not yet” and basically committed to add something. The topic of the need for, versus the implications of, rebuilding an index has recently been discussed on Richard Foote’s blog. In my opinion, if you want to know more about indexes, Richard’s blog is one of the first stops. To my surprise and delight, I learned that we can actually use a little trick with the EXPLAIN PLAN FOR command and have the CBO estimate what the size of an index would be if we were to create (or rebuild) it. In this latter blog posting Richard explains and demonstrates how it can be done.

My blog posting today is about encapsulating this cool method to quickly and cheaply estimate the size of an index if it were to be rebuilt, and putting it in a free script for the Oracle community to use. So, feel free to use the script below, but I need to remind you that before you jump to the conclusion that this or that index should be rebuilt, read first about the actual need of doing so. You may want to include in your reading Richard Foote’s numerous postings on this topic, and also a couple of Oracle MOS notes: 989093.1 and 989186.1.
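The core trick fits in two statements. Here is a minimal sketch (the column and index name on the SH sample schema are just for illustration); the CBO’s estimate shows up in the Note section of the plan output:

-- ask the CBO to estimate the index size without actually creating anything
EXPLAIN PLAN FOR CREATE INDEX sh.sales_amt_ix ON sh.sales (amount_sold);

-- the Note section includes a line like: estimated index size: 80M bytes
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);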

----------------------------------------------------------------------------------------
--
-- File name:   estimate_index_size.sql
--
-- Purpose:     Reports indexes whose actual size exceeds their CBO-estimated size by more than 1 MB
--
-- Author:      Carlos Sierra
--
-- Version:     2014/07/18
--
-- Description: Script to very quickly and cheaply estimate the size of an index if it
--              were to be rebuilt. It uses EXPLAIN PLAN FOR CREATE INDEX technique.
--              It can be used on a single index, or all the indexes on a table, or
--              a particular application schema, or all application schemas. It does not
--              lock indexes and only updates the plan_table, which is usually a global
--              temporary table.
--
-- Usage:       Connect to SQL*Plus as SYS or DBA account and execute without parameters.
--              It will ask for optional schema owner, table name and index name. If all
--              3 are given null values then it acts on all application schemas. It
--              generates a simple text report with the indexes having an estimated size
--              of at least 1 MB over their actual size.
--
-- Example:     @estimate_index_size.sql
--
-- Notes:       Developed and tested on 11.2.0.3.
--
--              Inspired on blog posts from Richard Foote and Connor MacDonald:
--              http://richardfoote.wordpress.com/2014/04/24/estimate-index-size-with-explain-plan-i-cant-explain/#comment-116966
--              http://connormcdonald.wordpress.com/2012/05/30/index-size/
--
--              If considering index rebuilds based on the output of this script, read
--              first Richard Foote's numerous blog postings about this topic. Bottom
--              line: there are only a few cases where you actually need to manually
--              rebuild an index.
--
--              This method to estimated size of an index is far from perfect, please
--              scrutinize this script before using it. You may also want to read
--              Oracle MOS notes: 989093.1 and 989186.1 on this topic.
--
---------------------------------------------------------------------------------------
--
SPO estimate_index_size.txt;
UNDEF owner table_name index_name exclusion_list exclusion_list2;
DEF exclusion_list = "('ANONYMOUS','APEX_030200','APEX_040000','APEX_SSO','APPQOSSYS','CTXSYS','DBSNMP','DIP','EXFSYS','FLOWS_FILES','MDSYS','OLAPSYS','ORACLE_OCM','ORDDATA','ORDPLUGINS','ORDSYS','OUTLN','OWBSYS')";
DEF exclusion_list2 = "('SI_INFORMTN_SCHEMA','SQLTXADMIN','SQLTXPLAIN','SYS','SYSMAN','SYSTEM','TRCANLZR','WMSYS','XDB','XS$NULL')";
VAR random1 VARCHAR2(30);
VAR random2 VARCHAR2(30);
EXEC :random1 := DBMS_RANDOM.string('A', 30);
EXEC :random2 := DBMS_RANDOM.string('X', 30);
DELETE plan_table WHERE statement_id IN (:random1, :random2);

SET SERVEROUT ON;
DECLARE
  sql_text CLOB;
BEGIN
  FOR i IN (SELECT idx.owner, idx.index_name
              FROM dba_indexes idx,
                   dba_tables tbl
             WHERE idx.owner = NVL(UPPER(TRIM('&&owner.')), idx.owner) -- optional schema owner name
               AND idx.table_name = NVL(UPPER(TRIM('&&table_name.')), idx.table_name) -- optional table name
               AND idx.index_name = NVL(UPPER(TRIM('&&index_name.')), idx.index_name) -- optional index name
               AND idx.owner NOT IN &&exclusion_list. -- exclude non-application schemas
               AND idx.owner NOT IN &&exclusion_list2. -- exclude more non-application schemas
               AND idx.index_type IN ('NORMAL', 'FUNCTION-BASED NORMAL', 'BITMAP', 'NORMAL/REV') -- exclude domain and lob
               AND idx.status != 'UNUSABLE' -- only valid indexes
               AND idx.temporary = 'N'
               AND tbl.owner = idx.table_owner
               AND tbl.table_name = idx.table_name
               AND tbl.last_analyzed IS NOT NULL -- only tables with statistics
               AND tbl.num_rows > 0 -- only tables with rows as per statistics
               AND tbl.blocks > 128 -- skip small tables
               AND tbl.temporary = 'N')
  LOOP
    BEGIN
      sql_text := 'EXPLAIN PLAN SET STATEMENT_ID = '''||:random1||''' FOR '||REPLACE(DBMS_METADATA.get_ddl('INDEX', i.index_name, i.owner), CHR(10), ' ');
      -- cbo estimates index size based on explain plan for create index ddl
      EXECUTE IMMEDIATE sql_text;
      -- index owner and name do not fit on statement_id, thus using object_owner and object_name, using statement_id as processing state
      DELETE plan_table WHERE statement_id = :random1 AND (other_xml IS NULL OR NVL(DBMS_LOB.instr(other_xml, 'index_size'), 0) = 0);
      UPDATE plan_table SET object_owner = i.owner, object_name = i.index_name, statement_id = :random2 WHERE statement_id = :random1;
    EXCEPTION
      WHEN OTHERS THEN
        DBMS_OUTPUT.PUT_LINE(i.owner||'.'||i.index_name||': '||SQLERRM);
        DBMS_OUTPUT.PUT_LINE(DBMS_LOB.substr(sql_text));
    END;
  END LOOP;
END;
/
SET SERVEROUT OFF;

WITH
indexes AS (
SELECT pt.object_owner,
       pt.object_name,
       TO_NUMBER(EXTRACTVALUE(VALUE(d), '/info')) estimated_bytes
  FROM plan_table pt,
       TABLE(XMLSEQUENCE(EXTRACT(XMLTYPE(pt.other_xml), '/*/info'))) d
 WHERE pt.statement_id = :random2
   AND pt.other_xml IS NOT NULL -- redundant
   AND DBMS_LOB.instr(pt.other_xml, 'index_size') > 0 -- redundant
   AND EXTRACTVALUE(VALUE(d), '/info/@type') = 'index_size' -- grab index_size type
),
segments AS (
SELECT owner, segment_name, SUM(bytes) actual_bytes
  FROM dba_segments
 WHERE owner = NVL(UPPER(TRIM('&&owner.')), owner) -- optional schema owner name
   AND segment_name = NVL(UPPER(TRIM('&&index_name.')), segment_name) -- optional index name
   AND owner NOT IN &&exclusion_list. -- exclude non-application schemas
   AND owner NOT IN &&exclusion_list2. -- exclude more non-application schemas
   AND segment_type LIKE 'INDEX%'
HAVING SUM(bytes) > POWER(2, 20) -- only indexes with actual size > 1 MB
 GROUP BY
       owner,
       segment_name
),
list_bytes AS (
SELECT (s.actual_bytes - i.estimated_bytes) actual_minus_estimated,
       s.actual_bytes,
       i.estimated_bytes,
       i.object_owner,
       i.object_name
  FROM indexes i,
       segments s
 WHERE i.estimated_bytes > POWER(2, 20) -- only indexes with estimated size > 1 MB
   AND s.owner = i.object_owner
   AND s.segment_name = i.object_name
)
SELECT ROUND(actual_minus_estimated / POWER(2, 20)) actual_minus_estimated,
       ROUND(actual_bytes / POWER(2, 20)) actual_mb,
       ROUND(estimated_bytes / POWER(2, 20)) estimated_mb,
       object_owner owner,
       object_name index_name
  FROM list_bytes
 WHERE actual_minus_estimated > POWER(2, 20) -- only differences > 1 MB
 ORDER BY
       1 DESC,
       object_owner,
       object_name
/

DELETE plan_table WHERE statement_id IN (:random1, :random2);
UNDEF owner table_name index_name exclusion_list exclusion_list2;
SPO OFF;

Written by Carlos Sierra

July 18, 2014 at 9:45 am