Gathering CBO Statistics in Oracle Applications
When we refer to Oracle Applications we usually think E-Business (EBS), but there are other large Oracle Applications that requires some special consideration when it comes to CBO Statistics gathering: Siebel and People Soft (PSFT). There are some others, but I would say EBS, Siebel and PSFT make the bulk of them all.
Gathering Stats on EBS
This is easy, since the only supported method is FND_STATS. That means using plain DBMS_STATS and ANALYZE is simply not supported. FND_STATS is a wrapper on top of DBMS_STATS that includes some extra functionality. The most important addition is controlling which columns get Histograms. The set of columns EBS requires to have Histograms is small and well controlled. FND_STATS can be executed from within the application (Concurrent Request, i.e. batch job) or directly from SQL*Plus.
If you are using EBS, then be sure FND_STATS is the only active method to gather statistics. In other words, automatic jobs to gather statistics using DBMS_STATS or custom scripts must be disabled.
Gathering Stats on Siebel
Siebel documents how CBO statistics must be gathered in white paper titled “Performance Tuning Guidelines for Siebel CRM Applications on Oracle Database“. This paper is authored by “James Qiu, Paul Blokhin, Mehdi Gerami” (Oracle Development) and its latest version is dated December, 2010. It was rewritten back on 2010 with the contribution of Abel Macias, Dave Henriquez and myself. Out of this paper we created the coe_siebel_stats.sql and the coe_siebel_profile.sql scripts. Both are well documented. Please refer to Metalink (MOS) note 781927.1 to download the latest version of the white paper and the actual scripts.
If you are using Siebel on databases 10.2 or higher, you really want to follow the white paper provided in MOS 781927.1 and implement coe_siebel_stats.sql. We have noticed that since its inception a couple of years ago, the number of performance issues reported in Siebel applications has been reduced significantly.
Gathering Stats on PSFT
For PeopleTools 8.49 and higher, on databases 10.2.0.4 and higher, I recommend the use of Metalink (MOS) note 1322888.1 (Improving Statistics in Oracle RDBMS for PeopleSoft Enterprise). The author of this note is John Houghton, who works for the Center of Expertise (CoE) in the PSFT side of the house. In this MOS note you will find a zip file that contains a couple of documents explaining how to use script pscbo_stats.sql and it also contains the actual script. There is some implementation effort, but once pscbo_stats.sql is implemented your CBO statistics are gathered following best practices for PSFT. I had the opportunity to collaborate with John in the creation and testing of this pscbo_stats.sql script. It is currently on its 2nd major release.
If you are on PSFT 8.49 or higher, you want to implement pscbo_stats.sql out of MOS 1322888.1. Feedback on the results of using this tool have been positive in most cases.
Conclusion
The Cost-based Optimizer (CBO) requires to input good statistics in order to produce an optimal plan. In the perfect world we would use DBMS_STATS with no parameters. Still, some specific application design characteristics, like very skewed data or the use of small tables and temporary tables, require a specif setup to gather statistics tailored to the needs of these applications. All FND_STATS, coe_siebel_stats.sql and pscbo_stats.sql are wrappers on top of DBMS_STATS that incorporate fine granularity use of DBMS_STATS parameters to gather statistics better suited to their application. For EBS the use of FND_STATS is a must, for the other two the use of their corresponding scripts are not a must but strongly recommended.
Hello Carlos,
We’ve recently upgraded to Siebel 8.1.1.11 running on 11.2.0.4. We are also running the coe_siebel_stats.sql on a regular basis. My question is regarding the collection of system and dictionary stats.
On the previous version of Oracle (10gR2) we never collected any system level stats. I would periodically see evidence of bad system type stats such as slow response against some v$… views, lots of temp space being chewed up during queries, direct temp access, etc. If I collected system-type stats in a test environment the performance issues would go away. I have been strongly cautioned against collecting any type of system stats but this seems counter to good practice and in our 10g environment, it appeared warranted but never done for fear of disrupting production.
What is your recommendation on system level stats and Siebel running on 11gR2? Is there an easy answer to this question or is it more a matter of waiting to see how performance plays out and make adjustments?
Thank you.
CT.
August 25, 2014 at 2:34 pm
CT,
There are 4 types of CBO Statistics:
1. Schema Object
2. Data Dictionary
3. Fixed Objects
4. System Stats
From the 4, coe_siebel_stats takes care of #1. The others, are like any other database.
For Data Dictionary, you can set the DBMS_STATS AUTO Job to collect Data Dictionary stats, by using SET_GLOBAL_STATS for AUTOSTATS_TARGET = ORACLE.
For Fixed Objects, which affect the performance of queries on top of V$ views, you want to use the API DBMS_STATS provides, just be careful when you do this, since it may cause some transient locking.
For System Statistics, you may want to simple use DELETE_SYSTEM_STATS and not gather them.
Carlos Sierra
August 25, 2014 at 2:53 pm
Hello Carlos,
If a Siebel customer is currently using the automatic database statistics collection job , without following the recommendations in Note 781927.1, and now he wants to implement the recommended procedure by scheduling the coe_siebel_stats.sql script, then:
1- Should he start by deleting all existing statistics?
2- Can the existing statistics be used as base for the statistics collected by the coe_siebel_stats.sql script?
3- Or should he delete the current statistics, then run the coe_siebel_stats.sql in ‘b’ mode then in ‘N’ mode?.
Regards
Ahmed
Ahmed Abdel Fattah
September 25, 2014 at 4:02 am
Ahmed,
There is no need to delete the existing stats. Actually, you don’t want to do that. Execute it one with B (baseline) then use N (normal) execution afterwards.
Cheers
Carlos Sierra
September 25, 2014 at 1:38 pm
Thanks Carlos.
Ahmed Abdel Fattah
September 25, 2014 at 4:45 pm
Hi Carlos,
we are currently using Siebel 8.1.1.11 in development and test, and upgraded to database 12.1.0.2 (Windows x64). I noticed that Siebel performance is very bad now. I tried collecting statistics using default parameters for dbms_stats package (and collecting stats for ALL tables) and it increased performance a lot.
When reverting back to coe_siebel_stats, the application is nearly unusable. It seems like the 12c optimizer does have some serious issues with the statistics collection of this script in our environment.
question 1: Is there an updated script for 12c available? Couldn’t find anything on MyOracleSupport.
question 2: How would your statistics script affect the usage of the In-Memory option of 12c, when we wanted to use it with Siebel?
Thanks Carlos, your help is appreciated!
Ben
Ben
December 9, 2014 at 11:59 am
Ben,
If collecting stats for Siebel on 12c using out of the box APIs gives you better performance, then keep using that method.
1. There is no 12c version of coe_siebel_stats. Keep in mind this script was needed on 10g and 11g to correct some deficiencies of regular DBMS_STATS APIs.
2. Gathering stats with a script or using standard DBMS_STATS APIs has no direct relationship with in-memory database. Stats are needed for the CBO to generate optimal plans.
Carlos Sierra
December 9, 2014 at 2:35 pm
Hi Carlos,
thak you for the quick reply!
We’ll go with the default settings then.
Ben
Ben
December 10, 2014 at 2:34 am
Carlos,
I’d be glad if you could answer some of my questions related to the coe_siebel_stats.sql.
I understand the role of histograms in general but I’d like to understand it more in context of Siebel which is somehow “special” kind of OLTP DB.
In the most recent version (10g, 11g) of the script I see following:
IF i.table_name IN (‘S_POSTN_CON’, ‘S_ORG_BU’, ‘S_ORG_GROUP’) THEN
method_opt := ‘FOR ALL COLUMNS SIZE 254’;
ELSE
method_opt := ‘FOR ALL INDEXED COLUMNS SIZE 254’;
END IF;
My understanding is that due to above ELSE clause, for all tables that will have statistics gathered we will generate histograms with max number of 254 buckets for ALL indexed columns.
My understanding is also that Siebel favors index access paths for all queries (optimizer_index_cost_adj = 1), essentially allows “nestet loops” as the only join mechanism and instructs CBO for the best response time:
alter session set optimizer_mode = first_rows_10;
alter session set “_hash_join_enabled” = false;
alter session set “_optimizer_sortmerge_join_enabled” = false;
alter session set “_optimizer_join_sel_sanity_check” = true;
Also, Siebel uses bind variables for predicates in most of the cursors. Also, (what I’ve learned from you) it will be very rare and rather not desirable if in newer Oracle DBs the “adaptive cursor sharing” would kick in. We would be never (almost) interested in Siebel in having alternate execution plan generated, that would prefer full table access over index access.
So in short, in Siebel we want the first query results being returned as quickly as it gets and that can usually be guaranteed by index access and nested loops joining strategy.
Provided the above understanding of mine is correct, my questions are:
– what is the preliminary reason for having histograms in Siebel – is it ONLY about join cardinality and resulting join order or are there OTHER reasons?
– why the method_opt := ‘FOR ALL INDEXED COLUMNS SIZE 254’ option which creates not only frequency but also high balanced histograms (also for columns with unique constraints like ROW_ID)
– does the Oracle DB version matter i.e. would the above recommendation change for Siebel on 11.2 or 12.1?
Thank you
Michal
Michal
December 17, 2014 at 7:52 am
Oops, sorry for the typos – it should read “nested loops” and “height balanced” in the paragraphs above
Michal
Michal
December 17, 2014 at 8:03 am
no worries
Carlos Sierra
December 17, 2014 at 1:43 pm
Michal,
Your’s are very good questions:
1. Why do we need histograms on Siebel?
Histograms allow the CBO to be more accurate in its determination of Cardinality and Selectivity. If there are multiple filter predicates on your SQL, these histograms help to decide leading Table and access method (important if there are multiple indexes that could be used). The assumption here for Siebel is that subsequent values of bind variables would render similar Selectivities. And if that is the case, then the CBO would had generated a “better” plan not only for 1st execution but also for subsequent executions.
2. Why method_opt “for all indexed columns size 254”?
This parameter assumes that all columns referenced by a predicate have a single column index on it (which is a Siebel rule). Generating a histogram on a column with unique values is a waste, unfortunately DBMS_STATS does not discriminate, so we end up having height balanced histograms on columns for which the histogram does not help (a simply waste).
3. Would the above recommendation change for 11R2 or 12c?
Yes and no. As far as I know, there are no plans to revisit this gathering of stats for Siebel any time soon. Actually, when we evaluated the impact of using default values of DBMS_STATS APIs versus legacy values like the ones you indicated, we encountered no significant difference (we did this on 11g), but Siebel Development decided to keep using the same set of non-default parameters just for the sake of consistency with prior versions (i.e. 10g and even 9i). So, I would expect this recommendation to use non-default parameters on DBMS_STATS for 11g and 12c to fade away, and start using default DBMS_STATS calls. This is not my call, but Siebel Development call.
Cheers — Carlos
Carlos Sierra
December 17, 2014 at 1:43 pm
Carlos,
Many thanks for prompt answers of yours – these confirmed my general understanding of things.
I don’t want to overuse your kindness but let me express some thoughts related to the subject of histograms / statistics in Siebel.
You are saying “If there are multiple filter predicates on your SQL, these histograms help to decide leading Table and access method (important if there are multiple indexes that could be used)” – leading table – got it.
I was wondering, if having histograms affects (positively) CBO when it decides join order even further, after the leading table was designated?
Also, some DBAs use modified version of your script with “for all indexed columns size skewonly” or “for all indexed columns size repeat” for method_opt DBMS_STATS. You’ve already answered my second question previously but would you add some more to the purpose of “for all indexed columns size 254”, why we DO need histograms for indexed columns even if there is no skew in column’s data?
And lastly, I’d like to know your opinion about flushing the shared pool. I ask because in one of our project the DBA colleague executes “alter system flush shared_pool” after Siebel stats routine (coe_siebel_stats.sql) that runs daily finishes. In my opinion (that is what I told him) in the stable Siebel DB with fairly little data fluctuation / operations, flushing the shared pool i.e. invalidating ALL cursors is rather harmful than helpful. Probably 95% (if not more) of newly hard parsed cursors will become the same plan as before. And the first users of the Siebel system in question suffer (due to hard parsing?) bad performance every day for several hours (it gets better during course of the day). In other words of asking – when would you recommend flushing the shared pool of a Siebel DB instance?
Thank you
Michal
Michal
December 18, 2014 at 11:22 am
1. Histograms do influence Join Cardinality (when the joining columns have histograms) thus join order and access methods have a dependency on such histograms
2. We do not need Histograms on a column if the data is not skewed. When I tested using the “skewed only” method I did not see the gathering ONLY on columns with skewed data. Actually, the number of columns which got stats was even larger than “for all indexed columns” and some columns with indexes were skipped. Thus we abandoned the idea of using “skewed only”.
3. I do not see any benefit flushing the shared pool. With new stats, affected cursors will be automatically invalidated but not all at once. it happens over a period of a few hours.
cheers
Carlos Sierra
December 21, 2014 at 7:04 am
Thanks for that, I appreciate your help greatly.
Michal
December 29, 2014 at 3:54 pm
Hi Carlos,
We have a consistent problem with some queries in Siebel giving more preference to “order by” clause index and optimizer tends to lead by that. I read “Siebel: Improving Performance of Queries with ORDER BY on Oracle (Doc ID 475413.1)” and as per it, i tried many options including setting “_SORT_ELIMINATION_COST_RATIO” to higher values and removing histogram on the columns in order by clause. Nothing helped. Query performs better when “ALL_ROWS” hint is used.
Question -> Is _sort_elimination_cost_ratio still works in 11.2.0.2 and higher?
Query:
SELECT …
WHERE …
(T23.BILL_ACCNT_ID = :2)
ORDER BY
T23.ASSET_NUM
TABLE : SIEBEL.S_ASSET
====================================================================================================
NO# NAME DATA_TYPE NULLABLE Default Hidden? AVG_LEN NDV Nulls(%) CARDINALITY HISTOGRAM
— ————————- ——————- ——– ——– ——- ——- ——- ——– ———– —————
90 BILL_ACCNT_ID VARCHAR2(60 CHAR) NO 11 69597 0 53.73 HEIGHT BALANCED
NO# NAME DATA_TYPE NULLABLE Default Hidden? AVG_LEN NDV Nulls(%) CARDINALITY HISTOGRAM
— ————————- ——————- ——– ——– ——- ——- ——- ——– ———– —————
8 ASSET_NUM VARCHAR2(400 CHAR) NOT NULL NO 15 3733645 0 1
Regards,
Anjul
Anjul
August 27, 2015 at 12:24 am
yes, that hidden parameter still works on 11g
Carlos Sierra
August 27, 2015 at 11:56 am