Carlos Sierra's Tools and Tips

Tools and Tips for Oracle Performance and SQL Tuning

eSP

with 2 comments

Enkitec’s Sizing and Provisioning (eSP) is a new internal tool designed and developed with Oracle Engineered Systems in mind. Thanks to the experience and insights of Randy Johnson, Karl Arao and Frits Hoogland, what began as a pet project for some of us became, over time, a robust APEX/PL/SQL application, developed by Christoph Ruepprich and myself, and ready to debut at Oracle Open World 2014.

This posting is about eSP: what it does, and how it helps with the sizing and provisioning of Oracle Engineered Systems, or rather, any system where Oracle runs.

We used to size Engineered Systems using a complex and very useful spreadsheet developed by Randy Johnson and Karl Arao. Now it is eSP’s turn to take the next step and move this effort forward into a more scalable application that sits on top of one of our Exadata machines.

Sizing an Engineered System

Sizing a system can be quite challenging, especially when the current system is composed of several hosts with multiple databases of diverse use, size, version, workload, etc. The new target system may also bring some complexities; as the number of possible configurations grows, finding the right choice becomes harder. Then we also have the challenges of disk redundancy, recovery areas, and the potential benefits of offloading with smart scans, just to mention some of the added complexities.

At a very high level, sizing a system is about 3 entities: Resources, Capacity and Utilization. Resources define what I call “demand”, which is basically the set of computational resources from your original system, made of one or many databases and instances over some hosts. Capacity, which I also call “supply”, is the set of possible target systems with their multiple Configurations, in other words Engineered Systems, or any other hardware capable of hosting Oracle databases. Utilization, which I may also refer to as “allocation”, is where the magic and the challenge reside. It is a clever and unbiased mapping between databases and configurations, then between instances and nodes. This mapping has to consider, at the very least, CPU footprint, memory for SGA and PGA, database disk space, and throughput in terms of IOPS and MBPS. Additional constraints, as mentioned before, include redundancy and offloading among others. CPU can be a bit tricky since each CPU make and model has its own characteristics, so mapping them requires the use of SPEC.

Another challenge a sizing tool has to consider is the variability of the Resources. The question becomes: do we see the Resources as a worst-case scenario, or shall we rather consider them as time series? In other words, do we compute and use peaks, or do we observe the use of Resources over time, then develop some methods to aggregate them consistently as time series? If we decide to use a reduced set of data points, do we use peaks or percentiles? If the latter, which percentile is well balanced? 99.9, 99, 95 or maybe 90? How conservative are those values? There are so many questions, and the answer to most of them, as you may guess, is: “it all depends”.

How eSP Works

Without getting into the technical details, I can say that eSP is an APEX application with a repository on an Oracle database, which takes as input collected “Requirements” from the databases to be sized, then processes these Requirements and prepares them to be “Allocated” into one or more defined hardware configurations. The process is for the most part “automated”, meaning this: we execute some tool or script on the set of hosts where the databases reside, then upload the output of these collectors into eSP, and we are ready to plan and apply “what-if” scenarios. Having an Exadata system as our work engine allows this eSP application to scale quite well. A “what-if” scenario takes as long as it takes to navigate the APEX pages, while all the computations are done in sub-seconds behind the scenes, thanks to Exadata!

Once we load the Resources from the eSP collector script, or from the eAdam tool, we can start playing with the metadata. Since eSP’s set of known Configurations (Capacity) includes current Engineered Systems (X4), allocating Configurations is a matter of seconds, and mapping databases and instances becomes the next step. eSP contains an auto-“allocate” algorithm for databases and instances, where we can choose between a “balanced” allocation or a “dense” one, with several density factors to choose from (100%, 90%, 80%, 70%, 60% and 50%). With all these automated options, we can try multiple sizing and allocation possibilities in seconds, regardless of whether we are sizing and provisioning one database or a hundred of them.

eSP and OOW

Enkitec’s Sizing and Provisioning (eSP) is an internal application that we created to help our customers size their next system (or systems) in a sensible manner. The methods we implemented are transparent and unbiased. We are bringing eSP to Oracle Open World 2014. I will personally demo eSP at our assigned booth, #111 at Moscone South. I will be in and out of the booth, so if you are interested in a demo please let me know, or contact your Enkitec/Accenture representative. We do prefer appointments, but walk-ins are welcome. Hope to see you at OOW!

Written by Carlos Sierra

September 21, 2014 at 5:40 pm

Posted in eAdam, edb360, Exadata, General, OOW

Adapting and adopting SQL Plan Management (SPM)

with 3 comments

Introduction

This post is about: “Adapting and adopting SQL Plan Management (SPM) to achieve execution plan stability for sub-second queries on a high-rate OLTP mission-critical application”. In our case, such an application is implemented on top of several Oracle 12c multitenant databases, where a consistent average execution time is more valuable than flexible execution plans. We successfully achieved plan stability by implementing a simple algorithm in PL/SQL calling the DBMS_SPM public APIs.
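
At the core of such an approach is the ability to create a SQL Plan Baseline (SPB) programmatically from a plan already in the cursor cache. The snippet below is not the actual package from this post, just a minimal sketch using the documented DBMS_SPM API, where the SQL_ID and plan hash value are placeholders:

DECLARE
  l_plans PLS_INTEGER;
BEGIN
  -- sql_id and plan_hash_value below are placeholders; use your own values
  l_plans := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(
               sql_id          => '0abcd1234ef567',
               plan_hash_value => 1234567890,
               fixed           => 'NO',
               enabled         => 'YES');
  DBMS_OUTPUT.PUT_LINE('plans loaded: '||l_plans);
END;
/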

The chart below depicts a typical case where the average performance of a large set of business-critical SQL statements suddenly degraded from sub-millisecond to 15 or 20ms, then became more stable at around 3ms. Wide spikes are a typical trademark of the execution plan for one or more SQL statements flipping for some time. In order to produce a more consistent latency we needed to improve plan stability, and of course the preferred tool to achieve that on an Oracle database is SQL Plan Management.

Algorithm

We tested and ruled out adaptive SQL Plan Management, which is an excellent 12c new feature. But, due to the dynamics of this application, where transactional data shifts so fast, allowing this “adaptive SPM” feature to evaluate auto-captured plans using bind variable values captured a few hours earlier unfortunately rendered false positives. These false positives “evolved” as execution plans that were numerically optimal for the bind values captured (at the time the candidate plan was captured), but performed poorly when executed on “current” values a few hours later. Nevertheless, this 12c “adaptive SPM” feature is worth exploring for other applications.

We adapted SPM so it would only generate SQL Plan Baselines on SQL that executes often and that is critical for the business. The algorithm has some complexity, such as candidate evaluation and SQL categorization; besides SPB creation it also includes plan demotion and plan promotion. We have successfully implemented it in some PDBs and we are currently rolling it out to entire CDBs. The algorithm is depicted in the diagram on the left, and more details are included in the corresponding presentation slides listed on the right-hand bar. I plan to talk about this topic at an international Oracle Users Group conference in 2018.

This algorithm is scripted into a sample PL/SQL package, which you can find in a subdirectory of my shared scripts. If you consider using this sample script for an application of your own, be sure to make it yours before attempting to use it. In other words: fully understand it first, then customize it accordingly and test it thoroughly.

Results

The chart below shows how the average performance of business-critical SQL became more stable after implementing the algorithm to adapt and adopt SPM on a pilot PDB. Not all went fine though: we had some outliers that required some tuning of the algorithm. Among the challenges we faced: volatile data (creating a SPB when a table was almost empty, then using it when the table was larger); skewed values (creating a SPB for a non-popular value, then using it on a popular value); proper use of multiple optimal plans due to Adaptive Cursor Sharing (ACS); and rejected candidates due to conservative initial restrictions in the algorithm (performance per execution, number of executions, age of cursor, etc.).

Conclusion

If your OLTP application contains business-critical SQL that executes at a high rate, and where a spike in latency risks affecting SLAs, you may want to consider implementing SQL Plan Management. Consider then both options: “adaptive SPM” if it satisfies your requirements; else build a PL/SQL library that can implement more complex logic for candidate evaluation and for SPB maintenance. I do believe SPM works great, especially when you enhance its out-of-the-box functionality to satisfy your specific needs.

 

 

Written by Carlos Sierra

December 20, 2017 at 6:32 pm

Creating a SQL Plan Baseline from Cursor Cache or AWR

with 18 comments

A DBA deals with performance issues often, and having a SQL statement suddenly performing poorly is common. What do we do? We proceed to “pin” an execution plan, then investigate the root cause (the latter only if time before the next fire permits).

DBMS_SPM provides some APIs to create a SQL Plan Baseline (SPB) from the Cursor Cache, or from a SQL Tuning Set (STS), but not from the Automatic Workload Repository (AWR). For the latter, you need a two-step approach: create a STS from AWR, then load a SPB from the STS. Fine, except when your next fire is waiting for you, or when deciding which is the “best” plan is not trivial.
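
For reference, that two-step AWR route looks roughly like the sketch below, using the documented DBMS_SQLTUNE and DBMS_SPM APIs; the STS name, snapshot range, SQL_ID and plan hash value are all placeholders:

DECLARE
  l_cursor DBMS_SQLTUNE.SQLSET_CURSOR;
  l_plans  PLS_INTEGER;
BEGIN
  DBMS_SQLTUNE.CREATE_SQLSET(sqlset_name => 'STS_MY_SQL');
  -- step 1: populate the STS from AWR (snapshot range and sql_id are placeholders)
  OPEN l_cursor FOR
    SELECT VALUE(p)
      FROM TABLE(DBMS_SQLTUNE.SELECT_WORKLOAD_REPOSITORY(
                   begin_snap   => 1234,
                   end_snap     => 1240,
                   basic_filter => 'sql_id = ''0abcd1234ef567''')) p;
  DBMS_SQLTUNE.LOAD_SQLSET(sqlset_name => 'STS_MY_SQL', populate_cursor => l_cursor);
  CLOSE l_cursor;
  -- step 2: load the chosen plan from the STS into a SQL Plan Baseline
  l_plans := DBMS_SPM.LOAD_PLANS_FROM_SQLSET(
               sqlset_name  => 'STS_MY_SQL',
               basic_filter => 'plan_hash_value = 1234567890',
               fixed        => 'NO',
               enabled      => 'YES');
  DBMS_OUTPUT.PUT_LINE('plans loaded: '||l_plans);
END;
/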

Take for example the chart below, which depicts multiple execution plans with different performance for one SQL statement. The SQL statement is actually quite simple, and the data is not significantly skewed. In this particular application, one-size-fits-all (meaning one-and-only-one plan) usually works well for most values passed in the bind variable placeholders. So, which plan would you choose?

Sample chart created by SQLd360

Looking at the summary of known execution plans’ performance below (as reported by planx.sql), we can see the same 6 execution plans.

The 1st plan on the list shows an average execution time of 2.897ms according to AWR, and 0.896ms according to the Cursor Cache; the numbers of recorded executions for this plan are 2,502 and 2,178 respectively. We see this plan contains one Nested Loop, and if we look at historical performance we notice this plan takes less than 109ms 95% of the time, less than 115ms 97% of the time, and less than 134ms 99% of the time. We also see that the worst recorded AWR period had this SQL performing in under 150ms (on average for that one period).

We also notice that the last plan on the list performs one execution in 120.847ms on average (as per AWR) and 181.113ms according to the Cursor Cache (also on average). So, “pinning” the 1st plan on the list seems like a good choice, but not too different from all but the last plan, especially when we consider both average performance and historical performance according to the percentiles reported.

PLANS PERFORMANCE
~~~~~~~~~~~~~~~~~

       Plan ET Avg      ET Avg      CPU Avg     CPU Avg           BG Avg       BG Avg   Executions   Executions                                   ET 100th    ET 99th     ET 97th     ET 95th     CPU 100th   CPU 99th    CPU 97th    CPU 95th
 Hash Value AWR (ms)    MEM (ms)    AWR (ms)    MEM (ms)             AWR          MEM          AWR          MEM   MIN Cost   MAX Cost  NL  HJ  MJ Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)
----------- ----------- ----------- ----------- ----------- ------------ ------------ ------------ ------------ ---------- ---------- --- --- --- ----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
 4113179674       2.897       0.896       2.715       0.714           96            5        2,502        2,178          8        738   1   0   0     149.841     133.135     114.305     108.411     147.809     133.007     113.042     107.390
  578709260      29.576      32.704      28.865      31.685        1,583        1,436        6,150        1,843         67        875   1   0   0     154.560      84.264      65.409      57.311     148.648      75.209      62.957      56.305
 1990606009      74.399      79.054      73.163      77.186        1,117        1,192          172          214        905      1,108   0   1   0     208.648     208.648      95.877      95.351     205.768     205.768      94.117      93.814
 1242077371      77.961                  77.182                    1,772                     8,780                     949      1,040   0   1   0     102.966      98.206      91.163      89.272     100.147      97.239      90.165      88.412
 2214147219      79.650      82.413      78.242      80.817        1,999        2,143       42,360       24,862        906      1,242   0   1   0     122.535     101.293      98.442      95.737     119.240      99.118      95.266      93.156
 1214505235     120.847     181.113     105.485     162.783          506        1,355           48           12        114        718   1   0   0     285.950     285.950     285.950     285.950     193.954     193.954     193.954     193.954

The plans performance summary above is displayed in a matter of seconds by planx.sql, sqlperf.sql and by a new script, spb_create.sql. This output helps to make a quick decision about which execution plan is better for “pinning”, meaning: to create a SPB on it.

Sometimes such a decision is not that trivial, as we can see in the sample below. Which plan is better? I would go with the 2nd on the list. Why? Performance-wise this plan is more stable. It does a Hash Join, so I am expecting to see a plan with full scans, but if I can get consistent executions under 0.4s (according to the percentiles), I would be tempted to “pin” this 2nd plan instead of the 1st one. And I would stay away from the 3rd and 5th. So maybe I would create a SPB with 3 plans instead of just one, and include in this SPB the 1st, 2nd and 4th on the list.

PLANS PERFORMANCE
~~~~~~~~~~~~~~~~~

       Plan ET Avg      ET Avg      CPU Avg     CPU Avg           BG Avg       BG Avg   Executions   Executions                                   ET 100th    ET 99th     ET 97th     ET 95th     CPU 100th   CPU 99th    CPU 97th    CPU 95th
 Hash Value AWR (ms)    MEM (ms)    AWR (ms)    MEM (ms)             AWR          MEM          AWR          MEM   MIN Cost   MAX Cost  NL  HJ  MJ Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)   Pctl (ms)
----------- ----------- ----------- ----------- ----------- ------------ ------------ ------------ ------------ ---------- ---------- --- --- --- ----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
 1917891576       0.467       0.334       0.330       0.172          119           33  554,914,504   57,748,249          6      1,188   2   0   0   6,732.017      10.592       1.628       1.572   1,420.864       1.557       1.482       1.261
   99953997       1.162       2.427       0.655       0.492           83           55   58,890,160    2,225,247         12      2,311   0   1   0     395.819     235.474     108.142      34.909      56.008      22.329      12.926       3.069
 3559532534       1.175   1,741.041       0.858      91.486          359           46   21,739,877          392          4         20   1   0   0  89,523.768   4,014.301     554.740     298.545  21,635.611     216.456      54.050      30.130
 3650324870       2.028      20.788       1.409       2.257          251          199   24,038,404      143,819         11      5,417   0   1   0     726.964     254.245      75.322      20.817     113.259      21.211      13.591       8.486
 3019880278      43.465                  43.029                   20,217                    13,349                   5,693      5,693   0   1   0      43.465      43.465      43.465      43.465      43.029      43.029      43.029      43.029

About new script spb_create.sql

This new script is a life-saver for us, since our response time for an alert is usually measured in minutes, with a resolution (and sometimes a root cause analysis) expected in less than one hour from the time the incident is raised.

This script is quite simple:

  • it provides a list of known execution plans, including current (Cursor Cache) and historical (AWR) performance, as displayed in the two samples above; then
  • it asks on which Plan Hash Values (PHVs) you want to create a SPB; it allows you to enter up to 3 PHVs; lastly
  • it asks if you want these plans to be set as FIXED

After you respond to the ACCEPT parameters, a SPB for your SQL is created and displayed. It does not matter if the plan exists in the Cursor Cache and/or in AWR; the script finds the plan and creates the SPB for you. So: finding known execution plans, deciding which one is a better choice (or maybe more than one), and creating a SPB can all be done very rapidly.
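
Marking a baseline plan as FIXED, by the way, is done through the same public API family. The snippet below is only an illustration (not the internals of spb_create.sql), with a hypothetical SQL handle and plan name taken from DBA_SQL_PLAN_BASELINES:

DECLARE
  l_plans PLS_INTEGER;
BEGIN
  -- sql_handle and plan_name below are placeholders; query dba_sql_plan_baselines for yours
  l_plans := DBMS_SPM.ALTER_SQL_PLAN_BASELINE(
               sql_handle      => 'SQL_abcdef0123456789',
               plan_name       => 'SQL_PLAN_example',
               attribute_name  => 'fixed',
               attribute_value => 'YES');
END;
/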

If you still prefer to use SQL Profiles and not SPBs for whatever reason, script coe_xfr_sql_profile.sql is still around and updated. In these 12c days, and soon 18c and beyond, I’d much rather use SQL Plan Management and create SPBs though!

Anyways, enjoy these free scripts and become a faster hero “pinning” good plans. Then don’t forget to do diligent root cause analysis afterwards. I use SQLd360 by Mauro Pagano for a deep understanding of what is going on with my SQL statements.

Soon, I will post about a cool free tool that automates the implementation of SQL Plan Management on high-rate OLTP applications where stability is more important than flexibility (frequently changing execution plans). Stay tuned!

Written by Carlos Sierra

December 1, 2017 at 6:32 am

Autonomous Online Index Rebuild for Oracle Multitenant

with 7 comments

First things first: Do not rebuild Oracle indexes! … Unless you have to.

If you are even considering rebuilding indexes in an autonomous manner, please stop now, and first spend some time reading some of the many things Richard Foote has to say on his well-recommended blog.

A little side story: many, many years ago, as I was supporting EBS at Oracle, one day I got a call from a guy with an Aussie accent (even today I think it was Richard). He called Support simply to set the record straight (for me): I should not be promoting dropping, recreating or rebuilding indexes with a couple of nice notes/scripts I wrote for the delight of my then EBS customers. This guy educated me back then: such practices were no longer needed, as Oracle automatically reused index blocks after massive deletions (I was being a bad hombre!)… And I agreed with him. This was back before Y2K, and I still agree with him today. But (there is always a but), there are some corner cases where we still have to rebuild indexes ourselves. And that is the reason for my story today.

Why would you even want to automate index rebuild?

Since my background is Oracle performance, and “if all you have is a hammer, everything looks like a nail”, I am a bit biased in that direction, and I would have to answer: because there are some valid performance reasons. There is also a lesser reason about disk space savings, which is not so relevant in these days of cheap storage… Nevertheless, I will briefly justify below these two (performance and space) under my specific corner case. Maybe you have a similar case of your own to consider…

Performance

We have this key custom OLTP application installed on over 30 servers and 700+ PDBs. It executes high-rate simple queries that, due to current application constraints, often produce “optimal” execution plans with persistent fast full index scan operations, or some very long index range scans. The data is a time-based progressive window which contains anything between the last two hours and the last few days of rapidly changing values (intense on inserts and deletes). All indexes are appended with a monotonically increasing numeric column seeded from a sequence.

If you are thinking “why do fast full index scans on OLTP queries in the 1st place?”, I would have to ask you to bear with me on that question for a few months (we do need to make some improvements to the application data model, but that is a story for a day in the far future yet to come). For now, we are stuck with large indexes growing on their far right (like many citizens today) of every value of their leading column(s), and we have to do something about them! One more thing worth mentioning: the number of reads exceeds by far the number of writes or deletes.

Take for example the output of script indexes_2b_shrunk.sql (below). We can see that the 4th index down, WORKXXXXXXXTANCES_PK, has a current size of 114.9 MB (you may need to scroll to the right using a hidden bar at the bottom of the report), but if it were rebuilt, its size would go down to 43 MB, reducing its space by 62.6%. This script basically lists indexes that consume more than 10 MB of disk space and have a potential savings of at least 25% if they were to be rebuilt.

The output of the similar script indexes_2b_shrunk_fast_full_scan.sql (which I executed but omitted from this post) includes the same index because, in addition to the considerations of space and savings, that index is referenced by at least one execution plan as a full scan. So, if I can reduce its size by 62.6%, at least some SQL performing full scans on it would benefit performance-wise. This second script is identical to the first one with one exception: it only includes indexes for which at least one recent execution plan references them with a full scan.

These two scripts take no action on the database; they simply list indexes that could be rebuilt based on their current size and estimated space savings (and whether they are referenced by full scans, in the case of the latter). By the way, the estimated size if rebuilt is pretty accurate. Just saying.

REPORT_DATE
-------------------
2017-10-07T12-55-01
please wait...
PDB: SOMEPDB
---
TABLE_NAME               OWNER.INDEX_NAME                          SAVING %  CURRENT SIZE ESTIMATED SIZE  COMMAND
------------------------ --------------------------------------- ---------- ------------- --------------  --------------------------------------------------------------------
XXXXINSTANCES            XXXXXXXXXXXX.XXXXINSTANCES_PK              50.6 %       647.7 MB       320.0 MB  ALTER INDEX xxxxxxxxxxxx.xxxxinstances_pk REBUILD ONLINE;
XXXXTRANSACTIONKEYS      XXXXXXXXXXXX.XXXXTRANSACTIONKEYS_PK        95.9 %    18,391.6 MB       752.0 MB  ALTER INDEX xxxxxxxxxxxx.xxxxtransactionkeys_pk REBUILD ONLINE;
HISTOXXXXXXXXIGNMENT     XXXXXXXXXXXX.HISTOXXXXXXXXIGNMENT_PK       55.1 %       783.8 MB       352.0 MB  ALTER INDEX xxxxxxxxxxxx.histxxxxxxxignment_pk REBUILD ONLINE;
WORKXXXXXXXTANCES        XXXXXXXXXXXX.WORKXXXXXXXTANCES_PK          62.6 %       114.9 MB        43.0 MB  ALTER INDEX xxxxxxxxxxxx.workxxxxxxxtances_pk REBUILD ONLINE;
XXXXTRANSACTIONKEYS      XXXXXXXXXXXX.XXXXTRANSACTIONKEYS_AK        96.9 %    23,896.1 MB       752.0 MB  ALTER INDEX xxxxxxxxxxxx.xxxxtransactionkeys_ak REBUILD ONLINE;
XXXXGCWORKSPACE          XXXXXXXXXXXX.XXXXGCWORKSPACE_PK            31.3 %       745.1 MB       512.0 MB  ALTER INDEX xxxxxxxxxxxx.xxxxgcworkspace_pk REBUILD ONLINE;
WORKXXXXXXXTANCESINDEX   XXXXXXXXXXXX.WORKXXXXXXXTANCESINDEX_PK     51.8 %        70.5 MB        34.0 MB  ALTER INDEX xxxxxxxxxxxx.workxxxxxxxtancesindex_pk REBUILD ONLINE;

Space

Take the same output above, and look now at the 5th index down, XXXXTRANSACTIONKEYS_AK, currently consuming 23,896.1 MB of space. This is on an X5-2 with NVMe SSD, where space is kind of a premium. If we can reduce the size of this index by 96.9% to only 752 MB, even if there are no execution plans referencing this index with a full scan, we would simply recover close to 23 GB of space! This is good news for operations DBAs who get paged when a tablespace is running short (kind of annoying, especially at night!).

Of course there are two follow-up questions here, which I will just mention and leave for future blog posts: 1) do we really need to append every index with a monotonically increasing numeric column based on a sequence? (most probably a “no”); and 2) considering that most leading columns have moderate cardinality and super wide values, would index compression help to reduce index size? (spoiler alert: definitely a “yes”).
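
On that second question, and only as an illustration (the index and schema names below are made up, and advanced index compression has its own version and licensing requirements), the compression can be applied as part of the same online rebuild:

-- prefix compression on the first leading column (no extra license required)
ALTER INDEX app_owner.some_large_index REBUILD ONLINE COMPRESS 1;

-- or, on 12.1.0.2+ with the Advanced Compression option, let Oracle decide per block
ALTER INDEX app_owner.some_large_index REBUILD ONLINE COMPRESS ADVANCED LOW;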

Autonomous Online Index Rebuild for Oracle Multitenant

Script iod_indexes_rebuild_online.sql is what I use today to automate index rebuilds, through a weekly Oracle Enterprise Manager (OEM) job executed on 30+ 12c Oracle multitenant CDBs (with 700+ PDBs). This script rebuilds indexes that are referenced by full scans, are larger than 10 MB, and have an estimated space reduction of at least 25%. It excludes one particular table that for a very sticky reason needs to be avoided (an explicit lock that produces kind of an ugly concurrency contention)… The script also includes the indexes of a known big table, even if its indexes are not referenced by full scans. Both the one exclusion and the one inclusion are clearly marked in the script. You may need your own exclusions and inclusions (like I do at my Thanksgiving dinner), in case you decide to write your own script based on this one.

This rebuild script is written in such a way that it can be executed at the root container level, either through OEM or SQL*Plus, or at the PDB level, also from OEM or SQL*Plus.
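
The container-iteration pattern behind that flexibility looks roughly like the sketch below. This is not the actual script, just a minimal outline, assuming a 12c common user with the needed privileges (including SET CONTAINER) executing from CDB$ROOT:

BEGIN
  FOR p IN (SELECT name
              FROM v$pdbs
             WHERE open_mode = 'READ WRITE'
               AND name <> 'PDB$SEED')
  LOOP
    EXECUTE IMMEDIATE 'ALTER SESSION SET CONTAINER = '||p.name;
    DBMS_OUTPUT.PUT_LINE('now inside PDB: '||p.name);
    -- per-PDB candidate selection and ALTER INDEX ... REBUILD ONLINE would go here
  END LOOP;
  EXECUTE IMMEDIATE 'ALTER SESSION SET CONTAINER = CDB$ROOT';
END;
/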

The output of the autonomous online index rebuild script is recorded in 3 places: a spool file directed to the tmp directory, a trace file, and the alert log. The content of each varies. The shortest is in the alert log, so we can see we ran something in the event we need to audit that timeframe. And the longest is the trace, which records accurate times and messages. The edited output below is from a sample trace file. In it we can see how an index with a size of 21,973.1 MB was reduced to 809.8 MB (a 96% reduction). That index took 36 seconds to be rebuilt online. The estimated time is rubbish, but the estimated size is very accurate in my opinion.

+++ DBE IOD +++ iod_indexes_rebuild_online.sql begin
+++ DBE IOD +++ host: iod-db-xxxx-99999.node.xxx.xx
+++ DBE IOD +++ trace: /u01/app/oracle/diag/rdbms/xxxx1xx2_c/xxxx1xx2/trace/xxxx1xx2_ora_40444_iod_indexes_rebuild_online.trc
+++ DBE IOD +++ report: /tmp/iod_indexes_rebuild_online_2017-10-06T20-11-36.txt
...
*** MODULE NAME:(IOD_INDEXES_REBUILD_ONLINE (15) XXXXX_XX) 2017-10-06 20:32:32.618
*** ACTION NAME:(SELECTING PDB INDEX CANDIDATES) 2017-10-06 20:32:32.618
*** CONTAINER ID:(15) 2017-10-06 20:32:32.618
...
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CON_NAME:XXXXX_XX CON_ID:15 DB_NAME:xxxx1xx2 DB_UNIQUE_NAME:XXXX1XX2_C SERVER_HOST:iod-db-xxxx-99999 TIME:2017-10-06T20:32:32
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
...
*** 2017-10-06 20:32:45.307
*** ACTION NAME:(XXXXTRANSACTIONKEYS_PK {COST}) 2017-10-06 20:34:46.527
...
TABLE_NAME                     OWNER.INDEX_NAME                                         SIZE BEFORE REBUILD       ESTIMATED SIZE  EST SAVING EST SECONDS   SIZE AFTER REBUILD  ACT SAVING ACT SECONDS
------------------------------ ------------------------------------------------------- -------------------- --------------------  ---------- ----------- --------------------  ---------- -----------
XXXXTRANSACTIONKEYS            XXXXXWFR2XX2.XXXXTRANSACTIONKEYS_PK                              21,973.1 MB             880.0 MB      96.0 %       11.0s
...
*** 2017-10-06 20:37:22.683
*** ACTION NAME:(XXXXTRANSACTIONKEYS_PK {DONE}) 2017-10-06 20:37:22.683

XXXXTRANSACTIONKEYS            XXXXXWFR2XX2.XXXXTRANSACTIONKEYS_PK                              21,973.1 MB             880.0 MB      96.0 %       11.0s             809.8 MB      96.3 %       36.0s
*** ACTION NAME:(HISTORICALASSIGNMENT_PK {COST}) 2017-10-06 20:37:22.688
...
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                          PDB NAME                                                 SIZE BEFORE REBUILD       ESTIMATED SIZE  EST SAVING EST SECONDS   SIZE AFTER REBUILD  ACT SAVING ACT SECONDS
------------------------------ ------------------------------------------------------- -------------------- --------------------  ---------- ----------- --------------------  ---------- -----------
TOTAL                          XXXXX_XX                                                         53,101.1 MB           2,585.0 MB      95.1 %       28.0s           2,505.2 MB      95.3 %      101.0s
...
+++ DBE IOD +++ DB_INDEXES:63 DB_BEFORE:130,111.0MB(127.1GB) DB_AFTER:12,608.1MB(12.3GB) DB_SAVING:90.3%
+++ DBE IOD +++ report: /tmp/iod_indexes_rebuild_online_2017-10-06T20-11-36.txt
+++ DBE IOD +++ trace: /u01/app/oracle/diag/rdbms/xxxx1xx2_c/xxxx1xx2/trace/xxxx1xx2_ora_40444_iod_indexes_rebuild_online.trc
+++ DBE IOD +++ host: iod-db-xxxx-99999.node.xxx.xx
+++ DBE IOD +++ iod_indexes_rebuild_online.sql end
+++ DBE IOD +++ duration: 8301 seconds
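
The MODULE and ACTION entries seen in the trace above are the kind of instrumentation the standard DBMS_APPLICATION_INFO package produces. A minimal sketch of that pattern (not the actual script code, index name made up) would be:

BEGIN
  DBMS_APPLICATION_INFO.SET_MODULE(
    module_name => 'IOD_INDEXES_REBUILD_ONLINE',
    action_name => 'SELECTING PDB INDEX CANDIDATES');
  -- ... candidate selection ...
  DBMS_APPLICATION_INFO.SET_ACTION('SOME_INDEX_PK {COST}');
  -- ... cost estimation and online rebuild of that index ...
  DBMS_APPLICATION_INFO.SET_ACTION('SOME_INDEX_PK {DONE}');
END;
/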

Companion scripts

Besides the two “change-nothing” indexes_2b_shrunk*.sql scripts, and the actual rebuild script, there are two more that are part of the set and can be handy.

The first one is iod_indexes_rebuild_online_mon.sql, and it helps to monitor progress when executed in a separate SQL*Plus session of its own. This monitoring script executes a query; if you simply keep entering the “/” command, it keeps telling you the PDB and index that is getting rebuilt (or just ended, or is next). This is kind of “are we there yet?”, but for some reason sometimes I just need to know which PDB and particular index I am working on, just in case I see something like a suspicious spike in my OEM window. I do recommend this at least during the first couple of cycles.

The second one is iod_indexes_rebuild_online_kill_job.sql, and before you freak out and close this page, let me state that the “kill” script does nothing but interrupt the execution of the rebuild script in a nice manner, meaning between the rebuild of two indexes, or in other words: after it completes the index currently being rebuilt. This is kind of important so we do not end up with half-baked indexes which may need to be recreated. I am planning on never having to use the “kill” script, but if I do, I already have an OEM job ready with it: this is like a “poison pill”. A word of unsolicited advice here: always prepare for the worst, and hope for the best!

Disclaimer

Before you rush to implement these scripts, please review them extensively, justify their use, and fully digest them. Make them yours, customize them as needed, then test, test and test again. Only then consider implementing an “autonomous index rebuild job” in a production environment. Best wishes!

Written by Carlos Sierra

October 7, 2017 at 4:42 pm

Poor man’s script to summarize reasons WHY cursors are not shared

with 5 comments

Having a large number of child cursors can affect parsing performance, as hinted by Abel Macias in his blog post about Diagnosis of a High Version Count (HVC). In his post, Abel also refers to a note on MOS which includes a script that dives into the reasons WHY our cursors are not getting shared. So, for deep dives in this area, I strongly suggest reading his post and using the referenced script provided on MOS.

Besides longer parse times and potential library cache contention, manifested by some waits (such as on mutexes), there is another side effect that may bite us: the CBO may produce a different plan when a SQL statement is hard-parsed while creating a new child cursor. This latter side effect can be critical for transactional applications with SLAs depending on very short latencies for some queries.

This post is about a poor man’s script that, with no installation whatsoever, lists an aggregated summary of the reasons why our cursors are not shared, including child cursor counts and distinct SQL_ID counts for each reason (see sample output below). I had to write such a script since in our environments we cannot simply run diagnostic scripts that create objects in the database, such as the one provided by MOS.

   CURSORS    SQL_IDS REASON_NOT_SHARED
---------- ---------- -----------------------------
    226916       7826 ROLL_INVALID_MISMATCH
     29387        105 BIND_EQUIV_FAILURE
     21794       4027 HASH_MATCH_FAILED
     11588       2134 OPTIMIZER_MISMATCH
     11027        413 BIND_LENGTH_UPGRADEABLE
     11008        384 BIND_MISMATCH
     10125       2697 USE_FEEDBACK_STATS
      4540        109 OPTIMIZER_MODE_MISMATCH
      1652         72 PURGED_CURSOR
      1245         81 BIND_UACS_DIFF
      1062        316 LANGUAGE_MISMATCH
       771        103 LOAD_OPTIMIZER_STATS
       500         52 STATS_ROW_MISMATCH
       238         86 MV_QUERY_GEN_MISMATCH
        94          3 MULTI_PX_MISMATCH
        28          4 AUTH_CHECK_MISMATCH
        23          1 INSUFF_PRIVS

Once I get to see some reasons for not sharing, some responsible for a large number of child cursors (and distinct SQL_IDs), then I can search on MOS as Abel suggested. Ideally, if you are interested in plan stability, you may want to reduce the number of times the CBO is tasked with creating a new child cursor (and potentially a new execution plan).

In the output sample above, top of our list is ROLL_INVALID_MISMATCH, causing 226,916 child cursors across as many as 7,826 SQL statements. This particular reason for not sharing cursors is due to a persistent gathering of schema object statistics with the explicit request to invalidate cursors. Since we want to promote plan stability, we would need to suspend such aggressive gathering of CBO statistics and validate that reason ROLL_INVALID_MISMATCH is reduced.
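
A related knob worth knowing, not necessarily the right fix for every case, is the DBMS_STATS preference NO_INVALIDATE, which controls whether gathering statistics marks dependent cursors for invalidation. The schema and table names below are hypothetical:

-- inspect the effective table-level preference for a given schema/table
SELECT DBMS_STATS.GET_PREFS('NO_INVALIDATE', 'APP_OWNER', 'SOME_TABLE') no_invalidate FROM dual;

-- example of changing it so gathering stats on this table does not invalidate its cursors
EXEC DBMS_STATS.SET_TABLE_PREFS('APP_OWNER', 'SOME_TABLE', 'NO_INVALIDATE', 'TRUE')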

Anyways, the free script I used is below. Enjoy it!

*** edited *** a new version of the script is now available (below). Thanks to stewashton for his input.

-- sql_shared_cursor.sql
SET HEA OFF LIN 300 NEWP NONE PAGES 0 FEED OFF ECHO OFF VER OFF TRIMS ON TRIM ON TI OFF TIMI OFF SQLBL ON BLO . RECSEP OFF;
SPO all_reasons.sql
SELECT CASE WHEN ROWNUM = 1 THEN '( ' ELSE ', ' END||column_name
  FROM dba_tab_columns
 WHERE table_name = 'V_$SQL_SHARED_CURSOR'
   AND owner = 'SYS'
   AND data_type = 'VARCHAR2'
   AND data_length = 1
/
SPO OFF;
GET all_reasons.sql
I )
I )
I WHERE value = 'Y'
I GROUP BY reason_not_shared
I ORDER BY cursors DESC, sql_ids DESC, reason_not_shared
0 ( value FOR reason_not_shared IN 
0 FROM v$sql_shared_cursor UNPIVOT
0 SELECT COUNT(*) cursors, COUNT(DISTINCT sql_id) sql_ids, reason_not_shared
L
SET HEA ON NEWP 1 PAGES 30
PRO please wait
/
!rm all_reasons.sql
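
For reference, the buffer-editing commands above end up assembling a query along these lines, with one UNPIVOT column per single-character VARCHAR2 column of V$SQL_SHARED_CURSOR (the column list here is abbreviated; the script generates the full list dynamically from dba_tab_columns):

SELECT COUNT(*) cursors, COUNT(DISTINCT sql_id) sql_ids, reason_not_shared
  FROM v$sql_shared_cursor UNPIVOT
       ( value FOR reason_not_shared IN
       ( UNBOUND_CURSOR
       , SQL_TYPE_MISMATCH
       , OPTIMIZER_MISMATCH
         -- ... remaining reason columns ...
       ) )
 WHERE value = 'Y'
 GROUP BY reason_not_shared
 ORDER BY cursors DESC, sql_ids DESC, reason_not_shared;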

Written by Carlos Sierra

September 1, 2017 at 1:01 pm

Script to identify index rebuild candidates on 12c

with 12 comments

Some time back I blogged about an easy way to estimate the size of an index. It turns out there is an API that also uses the plan_table under the hood in order to estimate what the size of an index would be if it were rebuilt. Such API is DBMS_SPACE.CREATE_INDEX_COST.

The script below uses DBMS_SPACE.CREATE_INDEX_COST, and when executed on 12c connected to a PDB, it outputs a list of indexes with enlarged space which, if rebuilt, would shrink by at least 25% of their current size. This script depends on the accuracy of the CBO statistics.

Once you are comfortable with the output, you may even consider automating its execution using OEM. I will post in a few days a way to do that using DBMS_SQL. In the meantime, here I share the stand-alone version.

----------------------------------------------------------------------------------------
--
-- File name: indexes_2b_shrunk.sql
--
-- Purpose: List of candidate indexes to be shrunk (rebuild online)
--
-- Author: Carlos Sierra
--
-- Version: 2017/07/12
--
-- Usage: Execute on PDB
--
-- Example: @indexes_2b_shrunk.sql
--
-- Notes: Execute connected into a PDB.
-- Consider then:
-- ALTER INDEX [schema.]index REBUILD ONLINE;
--
---------------------------------------------------------------------------------------

-- select only those indexes with an estimated space saving percent greater than 25%
VAR savings_percent NUMBER;
EXEC :savings_percent := 25;
-- select only indexes with current size (as per cbo stats) greater then 1MB
VAR minimum_size_mb NUMBER;
EXEC :minimum_size_mb := 1;

SET SERVEROUT ON ECHO OFF FEED OFF VER OFF TAB OFF LINES 300;

COL report_date NEW_V report_date;
SELECT TO_CHAR(SYSDATE, 'YYYY-MM-DD"T"HH24:MI:SS') report_date FROM DUAL;
SPO /tmp/indexes_2b_shrunk_&&report_date..txt;

DECLARE
  l_used_bytes NUMBER;
  l_alloc_bytes NUMBER;
  l_percent NUMBER;
BEGIN
  DBMS_OUTPUT.PUT_LINE('PDB: '||SYS_CONTEXT('USERENV', 'CON_NAME'));
  DBMS_OUTPUT.PUT_LINE('---');
  DBMS_OUTPUT.PUT_LINE(
    RPAD('OWNER.INDEX_NAME', 35)||' '||
    LPAD('SAVING %', 10)||' '||
    LPAD('CURRENT SIZE', 20)||' '||
    LPAD('ESTIMATED SIZE', 20));
  DBMS_OUTPUT.PUT_LINE(
    RPAD('-', 35, '-')||' '||
    LPAD('-', 10, '-')||' '||
    LPAD('-', 20, '-')||' '||
    LPAD('-', 20, '-'));
  FOR i IN (SELECT x.owner, x.index_name, SUM(s.leaf_blocks) * TO_NUMBER(p.value) index_size,
                   REPLACE(DBMS_METADATA.GET_DDL('INDEX',x.index_name,x.owner),CHR(10),CHR(32)) ddl
              FROM dba_ind_statistics s, dba_indexes x, dba_users u, v$parameter p
             WHERE u.oracle_maintained = 'N'
               AND x.owner = u.username
               AND x.tablespace_name NOT IN ('SYSTEM','SYSAUX')
               AND x.index_type LIKE '%NORMAL%'
               AND x.table_type = 'TABLE'
               AND x.status = 'VALID'
               AND x.temporary = 'N'
               AND x.dropped = 'NO'
               AND x.visibility = 'VISIBLE'
               AND x.segment_created = 'YES'
               AND x.orphaned_entries = 'NO'
               AND p.name = 'db_block_size'
               AND s.owner = x.owner
               AND s.index_name = x.index_name
             GROUP BY
                   x.owner, x.index_name, p.value
            HAVING
                   SUM(s.leaf_blocks) * TO_NUMBER(p.value) > :minimum_size_mb * POWER(2,20)
             ORDER BY
                   index_size DESC)
  LOOP
    DBMS_SPACE.CREATE_INDEX_COST(i.ddl,l_used_bytes,l_alloc_bytes);
    IF i.index_size * (100 - :savings_percent) / 100 > l_alloc_bytes THEN
      l_percent := 100 * (i.index_size - l_alloc_bytes) / i.index_size;
      DBMS_OUTPUT.PUT_LINE(
        RPAD(i.owner||'.'||i.index_name, 35)||' '||
        LPAD(TO_CHAR(ROUND(l_percent, 1), '990.0')||' % ', 10)||' '||
        LPAD(TO_CHAR(ROUND(i.index_size / POWER(2,20), 1), '999,999,990.0')||' MB', 20)||' '||
        LPAD(TO_CHAR(ROUND(l_alloc_bytes / POWER(2,20), 1), '999,999,990.0')||' MB', 20));
    END IF;
  END LOOP;
END;
/

SPO OFF;

And if you want to try the DBMS_SPACE.CREATE_INDEX_COST API by itself, you can also grab the estimated size of the index after calling this API, using the query below. But the API already returns that value!


SELECT TO_NUMBER(EXTRACTVALUE(VALUE(d), '/info')) index_size
FROM XMLTABLE('/*/info'
PASSING (SELECT XMLTYPE(other_xml)
FROM plan_table
WHERE other_xml LIKE '%index_size%')) d
WHERE EXTRACTVALUE(VALUE(d), '/info/@type') = 'index_size'
/
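
A minimal direct call to the API, with a made-up index DDL just for illustration, looks like this; used_bytes is the space the index data itself needs, and alloc_bytes is the space that would be allocated in the tablespace:

SET SERVEROUT ON
DECLARE
  l_used_bytes  NUMBER;
  l_alloc_bytes NUMBER;
BEGIN
  DBMS_SPACE.CREATE_INDEX_COST(
    ddl         => 'CREATE INDEX app_owner.test_idx ON app_owner.some_table (some_column)',
    used_bytes  => l_used_bytes,
    alloc_bytes => l_alloc_bytes);
  DBMS_OUTPUT.PUT_LINE('used_bytes:  '||l_used_bytes);
  DBMS_OUTPUT.PUT_LINE('alloc_bytes: '||l_alloc_bytes);
END;
/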

Written by Carlos Sierra

July 12, 2017 at 11:03 am

Posted in Index Rebuild

eAdam 3.0

eAdam 3.0 is now available. A little bit of history is needed: eAdam was born in 2014 as version 1.0. The old version 1.0 extracted fewer than 30 views (DBA_HIST, GV$ and V$) as text files in order to facilitate the offline analysis of performance-related metadata of large databases. eAdam 3.0 extracts raw data from over 200 views.

Why the need for eAdam in the first place? Well, in 2014 I was challenged with a requirement to explore some large databases and report on opportunities to improve their performance and scalability. Access to the actual systems was off-limits, and the state of the AWR was suboptimal, to put it nicely. Any query against any AWR table would take very long (hours in some cases), and resources on those production environments were scarce. Back then, I suggested extracting the AWR (and some GV$/V$) views and doing all the analysis on a local test environment for which I would get some restricted access. And this is how eAdam was born: out of the necessity to perform an Oracle database health-check while consuming the computational resources for such analysis on a lower environment, minimizing the amount of resources consumed on the source production database. The solution was: extract from the source database the metadata needed to perform its analysis on a different target database (local or remote).

eAdam 1.0 was a great success, but its scope was maybe too small. With fewer than 30 objects, there were many follow-up questions that would require going back to the source database and executing new queries. Later on, eDB360 was created as the next logical step after eAdam. This 2nd tool, eDB360, expanded the initial functionality of eAdam by executing on the same database being analyzed. This way, every new query to answer more elaborate questions was added to eDB360, and the footprint of metadata queried grew from fewer than 30 to more than 200 views (DBA_HIST, DBA, GV$ and V$). After some maturing time for eDB360, eAdam was used less often and its code got frozen with its fewer than 30 views. But even 3 years later, in 2017, eAdam is still used by some users around the globe to perform its initial task: extract a performance-related repository and restore it on another database for further analysis. Bear in mind that eAdam was created well before AWR Warehouse was made available, and eAdam still has some advantages, such as: the repository on the target system can be of the same, a lower or a higher version than the source database; and eAdam is free software, so it can be easily customized for special needs.

What is new in eAdam 3.0 then? With this version, there are 3 different methods (or versions), and all 3 are available on GitHub under the same master directory.

eAdam1: This is the original version. Fewer than 30 views. Still available for those who have been using eAdam 1.0 for years and are happy with what it provides. No enhancements are expected.

eAdam2: Completely new code. Similar concept to 1.0, but expands from fewer than 30 to over 200 views. Uses text files for the repository, so it installs nothing on the source database. The drawback is that some CLOB columns may get incomplete data due to some limitations of the technology stack. XMLTYPE columns are copied with null data, affecting two SQL-directives 12c views.

eAdam3: Completely new code. It uses compressed external tables. More efficient than 2.0, and it preserves all metadata including CLOB and XMLTYPE columns, which are tricky to transport as text. The only disadvantage compared to 2.0 is that it requires creating some objects on the source database. It does not consume tablespace on the source database though, and this method is highly preferred over eAdam1 and eAdam2.
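
To illustrate the concept only (this is not eAdam’s actual code), unloading a dictionary view into a compressed external table with the ORACLE_DATAPUMP driver looks roughly like this; the directory object name is an assumption, and the COMPRESSION access parameter depends on database version and licensing:

-- assumes a directory object REPO_DIR pointing to an OS directory with write access
CREATE TABLE eadam_dba_hist_sqlstat
  ORGANIZATION EXTERNAL (
    TYPE ORACLE_DATAPUMP
    DEFAULT DIRECTORY repo_dir
    ACCESS PARAMETERS (COMPRESSION ENABLED)
    LOCATION ('eadam_dba_hist_sqlstat.dmp')
  )
  AS SELECT * FROM dba_hist_sqlstat;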

With the eAdam repositories from versions 2.0 and 3.0 there is a big bonus: compatibility with eDB360. What does that mean? Well, if you restore the eAdam 2.0 or 3.0 repositories on a target system, then you can produce the comprehensive eDB360 output on such target system and get a 360-degree view of the source database. Of course, if you can run eDB360 entirely on the source database that is preferred, but if doing so on the source database requires over 24 hours of execution time, then you may want to consider leveraging the power of eAdam and combining these two: eAdam 3.0 plus eDB360.

 

Written by Carlos Sierra

April 16, 2017 at 7:03 pm

eDB360 meets eAdam 3.0 – when two heads are better than one!

with one comment

Version v1711 of eDB360 invites eAdam 3.0 to the party. What does that mean? We recently learned that eDB360 v1706 introduced the eDB360 repository, which materialized the content of 26 core DBA_HIST views into a staging repository made of heap tables, in order to expedite the execution of eDB360 on a database with an enlarged AWR. The new version v1711 expands the list from 26 views to a total of 219. And these 219 views now include DBA_HIST, DBA, GV$ and V$.

Expanding the existing eDB360 repository 8.4x, from 26 to 219 views, is not what is key in this latest version. The real impact of the latest version of eDB360 is that it now benefits from eAdam 3.0, providing the same benefits as the 219 views of the eDB360 heap-tables repository, but using external tables, which are easily transported from a source database to a target database. This simple fact opens many doors.

Using the eAdam 3.0 repository from eDB360, we can now extract the metadata from a production database, then restore it on a staging database where we can produce the eDB360 report for the source database. We could also use this new external repository for finer-granularity data mining; capacity planning; sizing for a potential hardware refresh; provisioning tools; estimating candidate segments for partitioning or for offloading into Hadoop; etc.

With the new external-tables eAdam 3.0 repository, we could easily build a larger permanent heap-table repository for multiple databases (multiple tenants), or for multiple time versions of the same database. Thus, now that eDB360 has met eAdam 3.0, the combination of these two enables multiple innovative future features (custom, or to be packaged and shipped with eDB360).

eDB360 recap

eDB360 is a free tool that gives a 360-degree view of an Oracle database. It installs nothing on the database, and it runs on 10g to 12c Oracle databases. It is designed for Linux and UNIX, but runs well on Windows as well (you may want to first install the UNIX Utilities UnxUtils and a zip program, else a few OS commands may not work properly on Windows). This repository-less version is the default way to use eDB360, and is the right method for most cases. But now, in addition to the default use, eDB360 can also make use of one of two repositories.

eDB360 takes time to execute (several hours). It is designed to time out after 24 hours by default. The first reason for long execution times is the intentional serial-processing method used, with sequential execution of query after query, while consuming little resources. Such serial execution, plus the fact that it is common to have the tool execute on large databases where the state of AWR is suboptimal, causes the execution to take long. We often discover that source AWR repositories lack the expected partitioning, and in many cases they hold years of data instead of the expected 8 to 31 days. Therefore, the nature of serial execution combined with enlarged and suboptimal AWR repositories usually causes eDB360 to execute for many more hours than expected. With 8 to 31 days of AWR data, and when such a reasonable history is well partitioned, eDB360 usually executes in less than 6 hours.

To overcome the undesired extended execution times of eDB360, and besides the obvious (partition and purge AWR), the tool provides the capability to execute in multiple threads, splitting its content by column. And now, in addition to the divide-and-conquer approach of column execution, eDB360 provides 2 repositories with different objectives in mind:

  1. eDB360 repository: Use this method to create a staging repository based on heap tables inside the source database. This method helps to expedite the execution of eDB360 by at least 10x. Repository heap-tables are created and consumed inside the same database.
  2. eAdam3 repository: Use this method to generate a repository based on external tables on the source database. Such an external-tables repository can be moved easily to a remote target system, allowing the eDB360 report to be generated efficiently there. This method helps to reduce computations on the source database, and enables potential data mining on the external repository without any resource impact on the source database. This method also helps to build other functions on top of the 219-table repository.

Views included in both the eDB360 and eAdam3 repositories:

  1. dba_2pc_neighbors
  2. dba_2pc_pending
  3. dba_all_tables
  4. dba_audit_mgmt_config_params
  5. dba_autotask_client
  6. dba_autotask_client_history
  7. dba_cons_columns
  8. dba_constraints
  9. dba_data_files
  10. dba_db_links
  11. dba_extents
  12. dba_external_tables
  13. dba_feature_usage_statistics
  14. dba_free_space
  15. dba_high_water_mark_statistics
  16. dba_hist_active_sess_history
  17. dba_hist_database_instance
  18. dba_hist_event_histogram
  19. dba_hist_ic_client_stats
  20. dba_hist_ic_device_stats
  21. dba_hist_interconnect_pings
  22. dba_hist_memory_resize_ops
  23. dba_hist_memory_target_advice
  24. dba_hist_osstat
  25. dba_hist_parameter
  26. dba_hist_pgastat
  27. dba_hist_resource_limit
  28. dba_hist_seg_stat
  29. dba_hist_service_name
  30. dba_hist_sga
  31. dba_hist_sgastat
  32. dba_hist_snapshot
  33. dba_hist_sql_plan
  34. dba_hist_sqlstat
  35. dba_hist_sqltext
  36. dba_hist_sys_time_model
  37. dba_hist_sysmetric_history
  38. dba_hist_sysmetric_summary
  39. dba_hist_sysstat
  40. dba_hist_system_event
  41. dba_hist_tbspc_space_usage
  42. dba_hist_wr_control
  43. dba_ind_columns
  44. dba_ind_partitions
  45. dba_ind_statistics
  46. dba_ind_subpartitions
  47. dba_indexes
  48. dba_jobs
  49. dba_jobs_running
  50. dba_lob_partitions
  51. dba_lob_subpartitions
  52. dba_lobs
  53. dba_obj_audit_opts
  54. dba_objects
  55. dba_pdbs
  56. dba_priv_audit_opts
  57. dba_procedures
  58. dba_profiles
  59. dba_recyclebin
  60. dba_registry
  61. dba_registry_hierarchy
  62. dba_registry_history
  63. dba_registry_sqlpatch
  64. dba_role_privs
  65. dba_roles
  66. dba_rsrc_consumer_group_privs
  67. dba_rsrc_consumer_groups
  68. dba_rsrc_group_mappings
  69. dba_rsrc_io_calibrate
  70. dba_rsrc_mapping_priority
  71. dba_rsrc_plan_directives
  72. dba_rsrc_plans
  73. dba_scheduler_job_log
  74. dba_scheduler_jobs
  75. dba_scheduler_windows
  76. dba_scheduler_wingroup_members
  77. dba_segments
  78. dba_sequences
  79. dba_source
  80. dba_sql_patches
  81. dba_sql_plan_baselines
  82. dba_sql_plan_dir_objects
  83. dba_sql_plan_directives
  84. dba_sql_profiles
  85. dba_stat_extensions
  86. dba_stmt_audit_opts
  87. dba_synonyms
  88. dba_sys_privs
  89. dba_tab_cols
  90. dba_tab_columns
  91. dba_tab_modifications
  92. dba_tab_partitions
  93. dba_tab_privs
  94. dba_tab_statistics
  95. dba_tab_subpartitions
  96. dba_tables
  97. dba_tablespace_groups
  98. dba_tablespaces
  99. dba_temp_files
  100. dba_triggers
  101. dba_ts_quotas
  102. dba_unused_col_tabs
  103. dba_users
  104. dba_views
  105. gv$active_session_history
  106. gv$archive_dest
  107. gv$archived_log
  108. gv$asm_disk_iostat
  109. gv$database
  110. gv$dataguard_status
  111. gv$event_name
  112. gv$eventmetric
  113. gv$instance
  114. gv$instance_recovery
  115. gv$latch
  116. gv$license
  117. gv$managed_standby
  118. gv$memory_current_resize_ops
  119. gv$memory_dynamic_components
  120. gv$memory_resize_ops
  121. gv$memory_target_advice
  122. gv$open_cursor
  123. gv$osstat
  124. gv$parameter
  125. gv$pga_target_advice
  126. gv$pgastat
  127. gv$pq_slave
  128. gv$pq_sysstat
  129. gv$process
  130. gv$process_memory
  131. gv$px_buffer_advice
  132. gv$px_process
  133. gv$px_process_sysstat
  134. gv$px_session
  135. gv$px_sesstat
  136. gv$resource_limit
  137. gv$result_cache_memory
  138. gv$result_cache_statistics
  139. gv$rsrc_cons_group_history
  140. gv$rsrc_consumer_group
  141. gv$rsrc_plan
  142. gv$rsrc_plan_history
  143. gv$rsrc_session_info
  144. gv$rsrcmgrmetric
  145. gv$rsrcmgrmetric_history
  146. gv$segstat
  147. gv$services
  148. gv$session
  149. gv$session_blockers
  150. gv$session_wait
  151. gv$sga
  152. gv$sga_target_advice
  153. gv$sgainfo
  154. gv$sgastat
  155. gv$sql
  156. gv$sql_monitor
  157. gv$sql_plan
  158. gv$sql_shared_cursor
  159. gv$sql_workarea_histogram
  160. gv$sysmetric
  161. gv$sysmetric_summary
  162. gv$sysstat
  163. gv$system_parameter2
  164. gv$system_wait_class
  165. gv$temp_extent_pool
  166. gv$undostat
  167. gv$waitclassmetric
  168. gv$waitstat
  169. v$archive_dest_status
  170. v$archived_log
  171. v$ash_info
  172. v$asm_attribute
  173. v$asm_client
  174. v$asm_disk
  175. v$asm_disk_stat
  176. v$asm_diskgroup
  177. v$asm_diskgroup_stat
  178. v$asm_file
  179. v$asm_template
  180. v$backup
  181. v$backup_set_details
  182. v$block_change_tracking
  183. v$cell_config
  184. v$cell_state
  185. v$controlfile
  186. v$database
  187. v$database_block_corruption
  188. v$datafile
  189. v$flashback_database_log
  190. v$flashback_database_stat
  191. v$instance
  192. v$io_outlier
  193. v$iostat_file
  194. v$kernel_io_outlier
  195. v$lgwrio_outlier
  196. v$log
  197. v$log_history
  198. v$logfile
  199. v$mystat
  200. v$nonlogged_block
  201. v$option
  202. v$parallel_degree_limit_mth
  203. v$parameter
  204. v$pdbs
  205. v$recovery_area_usage
  206. v$recovery_file_dest
  207. v$restore_point
  208. v$rman_backup_job_details
  209. v$rman_output
  210. v$segstat
  211. v$spparameter
  212. v$standby_log
  213. v$sys_time_model
  214. v$sysaux_occupants
  215. v$system_parameter2
  216. v$tablespace
  217. v$tempfile
  218. v$thread
  219. v$version

Instructions to use eDB360 and eAdam3 repositories

Both repositories are implemented under the edb360-master/repo subdirectory. Each has a readme file, which explains how to create the repository, consume it and drop it. The eAdam3 repository also includes instructions on how to clone an external-table-based eAdam repository into a heap-table-based eDB360 repository.

Executing eDB360 on the eDB360 repository is faster than executing it on the eAdam3 repository, while also avoiding new bug 25802477. This new Oracle database bug afflicts compressed external tables like the ones used by the eAdam3 repository.

If you use eDB360 or eAdam3 repositories and have questions or concerns, do not hesitate to contact the author.

 

Written by Carlos Sierra

April 10, 2017 at 2:41 pm

eDB360 includes now an optional staging repository

with 6 comments

eDB360 has always worked under the premise “no installation required”, and that is still the case today – it is part of its fundamental essence: give me a 360-degree view of my Oracle database with no installation whatsoever. With that in mind, this free tool helps sites that have gone to the cloud, as well as those with “on-premises” databases; and in both cases not installing anything certainly expedites diagnostics collection. With eDB360, you simply connect to SQL*Plus with an account that can select from the catalog, then execute a set of scripts behind eDB360 and bingo!, you get to understand what is going on with your database just by navigating the html output. With such functionality, we can remotely diagnose a database, and even elaborate a full health-check of it. After all, that is how we successfully use it every day, saving us hundreds of hours of metadata gathering and cross-reference analysis.

Starting with release v1706, eDB360 also supports an optional staging repository of the 26 AWR views listed below. Why? The answer is simple: improved performance! This can be quite significant on large databases with hundreds of active sessions, with frequent snapshots, or with a long history in AWR. We have seen cases where years of data are “stuck” in AWR, especially in older releases of the database. Of course, cleaning up the outdated AWR history (and corresponding statistics) is highly recommended, but in the meantime trying to execute edb360 on such databases may lead to long execution hours and frustration, sometimes taking days for what should take only a few hours.

  1. dba_hist_active_sess_history
  2. dba_hist_database_instance
  3. dba_hist_event_histogram
  4. dba_hist_ic_client_stats
  5. dba_hist_ic_device_stats
  6. dba_hist_interconnect_pings
  7. dba_hist_memory_resize_ops
  8. dba_hist_memory_target_advice
  9. dba_hist_osstat
  10. dba_hist_parameter
  11. dba_hist_pgastat
  12. dba_hist_resource_limit
  13. dba_hist_service_name
  14. dba_hist_sga
  15. dba_hist_sgastat
  16. dba_hist_sql_plan
  17. dba_hist_sqlstat
  18. dba_hist_sqltext
  19. dba_hist_sys_time_model
  20. dba_hist_sysmetric_history
  21. dba_hist_sysmetric_summary
  22. dba_hist_sysstat
  23. dba_hist_system_event
  24. dba_hist_tbspc_space_usage
  25. dba_hist_wr_control
  26. dba_hist_snapshot

Thus, if you are contemplating executing eDB360 on a large database, and provided the pre-check script edb360-master/sql/awr_ash_pre_check.sql shows that eDB360 might take over 24 hours, then while you clean up your AWR repository you can use the eDB360 staging repository as a workaround to speed up eDB360 execution. The use of this optional staging repository is very simple; just look inside the edb360-master/repo directory for instructions. And as always, shoot me an email or comment here if there are any questions.

Written by Carlos Sierra

February 19, 2017 at 8:43 pm

eDB360 takes long to execute!

with 2 comments

eDB360 provides a lot of insight about an Oracle database. It executes thousands of SQL statements querying GV$ and DBA views from Oracle’s data dictionary. It was designed to impose very little load on the system where it executes, thus it consumes only one session and avoids parallel execution. On a system where the state of historical metrics is “normal”, eDB360 may take one or two hours to execute. On the other hand, when the volume of historical metrics is abnormally “large”, eDB360 may execute for many hours, up to the default threshold of 24, then quit. No worries here, it can be restarted pretty much where it left off…

If you are considering executing eDB360 on one of your databases, please first execute a lightweight “pre-check”, which is basically a small script that reports on the state of the Active Session History (ASH) stored in the Automatic Workload Repository (AWR). Bear in mind that ASH and AWR are part of the Oracle Diagnostics Pack, so consider such a “pre-check” only if you have a license for that Oracle Pack on the target database.

AWR ASH Pre-Check

You can execute the awr_ash_pre_check.sql script stand-alone; it is available as free software on GitHub within edb360-master or cscripts-master. If downloading eDB360, you can find awr_ash_pre_check.sql under the edb360-master/sql directory, and if you download the cscripts set, you can find it under cscripts-master/sql.

This pre-check script reports on the following potential concerns:

  1. Retention period of AWR, frequency of Snapshots, AWR Baselines, and similar (a quick manual query sketch follows this list).
  2. Age of CBO Statistics on the main table behind DBA_HIST_ACTIVE_SESS_HISTORY (WRH$_ACTIVE_SESSION_HISTORY) and its partitions.
  3. State of CBO Statistics for WRH$_ACTIVE_SESSION_HISTORY segments: Are the statistics Locked? Are they Stale?
  4. Size of WRH$_ACTIVE_SESSION_HISTORY segments, with range of Snapshots per segment (partition). Are they outside the desired retention window?
  5. Creation date and last DDL date for WRH$_ACTIVE_SESSION_HISTORY segments. So expected partition splitting can be verified.
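
For a quick manual look at some of the same areas, without the script, a few simple dictionary queries go a long way (this is just a sketch of the idea, not the pre-check itself):

-- AWR retention and snapshot interval, plus the actual span of snapshots kept
SELECT retention, snap_interval FROM dba_hist_wr_control;
SELECT MIN(begin_interval_time) oldest, MAX(end_interval_time) newest, COUNT(*) snaps
  FROM dba_hist_snapshot;

-- size of the main ASH segment behind DBA_HIST_ACTIVE_SESS_HISTORY
SELECT ROUND(SUM(bytes)/POWER(2,30), 1) ash_gb
  FROM dba_segments
 WHERE segment_name = 'WRH$_ACTIVE_SESSION_HISTORY';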

Take for example the result below. It shows that the current ASH repository contains 320.6 days of history, even when in this case only 30 were requested (displayed in an upper part of the report not shown here). We also see ASH is 16.4 GB in size, where normally 1-2 GB is plenty. We can conclude that the last partition split (or simply the last DDL) is 289.7 days old, where normally only a few days old is expected. This pre-check sample results in a ballpark estimate of 127 hours to execute eDB360. In other words, if we had one month of history (as the current AWR configuration requests) instead of 320.6 days, then the ASH repository would be less than 10% of its current size and eDB360 would be estimated to execute in about 13 hours instead of 127. Keep in mind this “pre-check” provides a ballpark estimate, so an actual execution of eDB360 could take more or less than the estimated time.

(screenshot: sample output of awr_ash_pre_check.sql)

What to do if eDB360 is estimated to run for several days like in the sample above? What I recommend is to fix ASH first. This process requires reading and digesting at least the 3 My Oracle Support (MOS) notes below, but it is not as bad as it sounds. You want to follow this path anyway, since any diagnostics tool that relies on ASH from AWR would benefit from the much-needed cleanup.

  1. WRH$_ACTIVE_SESSION_HISTORY Does Not Get Purged Based Upon the Retention Policy (Doc ID 387914.1)
  2. Bug 14084247 – ORA-1555 or ORA-12571 Failed AWR purge can lead to continued SYSAUX space use (Doc ID 14084247.8)
  3. Manually Purge the Optimizer Statistics and AWR Snaphots to Reduce Space Usage of SYSAUX Tablespace (Doc ID 1965061.1)

Diagnosing eDB360 taking long

Let’s say you executed an older version of eDB360 and it took long (newer versions, starting with v1620, perform an automatic pre-check, and if the execution is estimated to last more than 8 hours, they pause and ask for confirmation before executing). If you executed an older version of eDB360, or simply your execution took longer than expected, review the following files, included in the directory from where you executed eDB360 (edb360-master), or stored within the edb360_*.zip file generated by eDB360 as it executed.

  1. awr_ash_pre_check_<database_name>.txt
  2. verify_stats_wr_sys_<database_name>.txt
  3. 00002_edb360_<NNNNNN>_log.txt
  4. 00003_edb360_<NNNNNN>_log2.txt
  5. 00004_edb360_<NNNNNN>_log3.txt
  6. 00005_edb360_157536_tkprof_sort.txt

If you find one of the SQL statements from eDB360 is taking more than a few minutes to execute, first suspect outdated CBO statistics on the data dictionary objects behind such a query. Or, if the volume of data accessed by such a query from eDB360 is large, suspect AWR data not getting purged. In either case, take action accordingly.
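
For the first suspect, refreshing dictionary and fixed-object statistics is a reasonable starting point (schedule it appropriately, since these calls can take a while on a system with an enlarged AWR):

EXEC DBMS_STATS.GATHER_DICTIONARY_STATS
EXEC DBMS_STATS.GATHER_FIXED_OBJECTS_STATS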

If you cannot make sense of the information provided by these diagnostics files above, then contact eDB360 author at carlos.sierra.usa@gmail.com.

Keep in mind that execution time for eDB360 really depends on the state of several metrics stored in AWR, where ASH is the most relevant piece but not the only one. Think outdated statistics and outdated metrics!

Written by Carlos Sierra

November 23, 2016 at 5:47 pm

Posted in General