Re: Auto stats gathering is not sufficient - what now?
From: Kellyn Pedersen <kjped1313_at_yahoo.com>
Date: Wed, 9 Feb 2011 12:32:42 -0800 (PST)
Message-ID: <401840.79944.qm_at_web120205.mail.ne1.yahoo.com>
Professionally speaking, I think you have listed a number of my reasons for moving away from the auto stats job for a specific object right here. If the default job cannot collect statistics often enough, or the default choices are too large for the time allotment given, the CBO does not get the information it requires to perform well for your customers.

Since pretty much all my environments are over 1TB, I prefer to disable the default auto stats job and run my own statistics gathering job based off the DBA_TAB_MODIFICATIONS table. This lets you set small sample sizes and method options that make sense for your environment (giving you more definitive control), and lets you treat different objects, schemas, etc. individually - only stats that are stale get updated; nothing else is touched.

The other thing you need to take into consideration is how often you need updated stats. In an environment that contains staging or reporting objects, I am a strong advocate that there are times when the DBA is not the one who needs to be collecting statistics - the developer is, in their code. If you have objects that are loaded once per night and used once per night with >10% changes, then it might be time to start training the developers and working with them to gather statistics when they matter most. Developers want their code to run efficiently, and I have not found a developer yet who did not appreciate a first-class lesson in the CBO and stats collection. If it makes sense to move the stats collection into the procedural code, I have them move it into the code. I cannot help them out if I collect stats AFTER they really need them.

Let me know if you are interested - I'd be happy to send you my shell script to collect stats based on the DBA_TAB_MODIFICATIONS table; you are welcome to it.
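For readers following along, the general shape of a DBA_TAB_MODIFICATIONS-driven gather job is roughly the following. This is a sketch, not Kellyn's actual script; the schema name, the 5% staleness threshold, the sample size, the method_opt, and the degree are all placeholder values you would tune for your own environment:

```sql
-- Sketch: gather stats only for tables whose tracked DML volume
-- exceeds a chosen fraction of their last-known row count.
-- Flush the monitoring info first so DBA_TAB_MODIFICATIONS is current.
BEGIN
  DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;

  FOR t IN (SELECT m.table_owner, m.table_name
              FROM dba_tab_modifications m
              JOIN dba_tables d
                ON d.owner = m.table_owner
               AND d.table_name = m.table_name
             WHERE m.table_owner = 'MY_SCHEMA'        -- placeholder schema
               AND d.num_rows > 0
               AND (m.inserts + m.updates + m.deletes)
                     > d.num_rows * 0.05)             -- 5% threshold, tune to taste
  LOOP
    DBMS_STATS.GATHER_TABLE_STATS(
      ownname          => t.table_owner,
      tabname          => t.table_name,
      estimate_percent => 5,                          -- small fixed sample (1TB+ advice above)
      method_opt       => 'FOR ALL COLUMNS SIZE 1',   -- placeholder; see METHOD_OPT discussion
      cascade          => TRUE,
      degree           => 4);                         -- placeholder DOP
  END LOOP;
END;
/
```

Running this hourly or nightly from a shell wrapper gives you the "only stale objects, on your schedule" behavior described above, independent of the 10g maintenance-window job.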
Kellyn Pedersen
Multi-Platform Database Administrator
www.pythian.com
http://www.linkedin.com/in/kellynpedersen
www.dbakevlar.com

________________________________
From: TJ Kiernan <tkiernan_at_pti-nps.com>
To: oracle-l_at_freelists.org
Cc: TJ Kiernan <tkiernan_at_pti-nps.com>
Sent: Wed, February 9, 2011 12:50:07 PM
Subject: Auto stats gathering is not sufficient - what now?

10.2.0.3 on 32-bit Windows 2003 R2

I have a reporting table that is too large to grow by 10% very often - 66 million records growing by 250k per week = about 26 weeks before stats go stale and are gathered. Meanwhile, queries against relatively recent data (last month, last quarter) get horrible execution plans unless we hint them. For instance, from the example below, we have an index on (GROUP_KEY, DATE_PROCESSED) that would return this query in <1 second. If my predicate values were in range of the statistics, I would expect to get better plans, so the first thing I'm considering is a periodic job (probably weekly) to gather stats on this table.

My question: what sorts of considerations should I make when setting up a non-standard stats gathering job? Particularly METHOD_OPT, but with other parameters as well - what prompts you to step away from defaults?
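For concreteness, a weekly table-level job that steps away from the defaults might look something like this. The values are illustrative only - the point is that estimate_percent, method_opt, and cascade are the knobs you set deliberately rather than inherit; the SIZE 254 histogram on GROUP_KEY is an assumption that the column is skewed:

```sql
-- Sketch of a weekly gather for the table in this post.
-- Percentages and SIZE settings are illustrative, not recommendations.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => USER,
    tabname          => 'REPORTING_TABLE',
    estimate_percent => 1,     -- 66M rows: a small sample still captures min/max well
    method_opt       => 'FOR ALL COLUMNS SIZE 1 ' ||
                        'FOR COLUMNS GROUP_KEY SIZE 254',  -- histogram only where assumed skew
    cascade          => TRUE,  -- refresh the (GROUP_KEY, DATE_PROCESSED) index stats too
    no_invalidate    => FALSE);-- invalidate cursors so new plans take effect promptly
END;
/
```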
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0udsqttt83syw, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ field1, field2, DATE_PROCESSED
FROM REPORTING_TABLE
WHERE GROUP_KEY = 1234
  AND DATE_PROCESSED > to_date('25-DEC-2010', 'DD-MON-YYYY')
ORDER BY GROUP_KEY, DATE_PROCESSED

Plan hash value: 3444608443

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Starts | E-Rows | A-Rows |   A-Time    | Buffers | Reads  |
--------------------------------------------------------------------------------------------------------------------
|*  1 |  TABLE ACCESS BY INDEX ROWID| REPORTING_TABLE |      1 |      1 |     28 | 00:00:05.84 |    617K |   148K |
|*  2 |   INDEX RANGE SCAN          | RT_DATE_IDX     |      1 |      2 |  1599K | 00:00:28.81 |    6065 |   5828 |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("GROUP_KEY"=1234)
   2 - access("DATE_PROCESSED">TO_DATE('2010-12-25 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
       filter("DATE_PROCESSED">TO_DATE('2010-12-25 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))

***** 10053 trace *****

Access Path: index (RangeScan)
  Index: RT_DATE_IDX
  resc_io: 4.00  resc_cpu: 31027
  ix_sel: 2.4406e-008  ix_sel_with_filters: 2.4406e-008
  Cost: 4.01  Resp: 4.01  Degree: 1
Using prorated density: 2.4406e-008 of col #2 as selectivity of out-of-range value pred
Using prorated density: 2.4406e-008 of col #2 as selectivity of out-of-range value pred
Access Path: index (RangeScan)
  Index: RT_GROUP_DP_IDX
  resc_io: 5.00  resc_cpu: 36837
  ix_sel: 3.9615e-010  ix_sel_with_filters: 3.9615e-010
  Cost: 5.01  Resp: 5.01  Degree: 1
Using prorated density: 2.4406e-008 of col #2 as selectivity of out-of-range value pred

T. J.
Kiernan
Database Administrator
Pharmaceutical Technologies, Inc.
(402) 965-8800 ext 1039
tkiernan_at_pti-nps.com
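The "out-of-range value pred" lines in the 10053 trace above are the telltale symptom: the recorded high value of DATE_PROCESSED lags the data, so the optimizer prorates the selectivity of recent-date predicates toward zero and picks the wrong index. Besides gathering more often, one workaround between gathers is to nudge the column's high value forward manually. A rough sketch - the table and column names are from the post, the low-value date is a placeholder, and note this rewrites the column's endpoint record (any existing histogram on the column would be replaced by plain min/max stats):

```sql
-- Sketch: raise DATE_PROCESSED's recorded high value to SYSDATE so
-- recent-date predicates are no longer treated as out of range.
DECLARE
  l_distcnt  NUMBER;
  l_density  NUMBER;
  l_nullcnt  NUMBER;
  l_avgclen  NUMBER;
  l_srec     DBMS_STATS.STATREC;
BEGIN
  DBMS_STATS.GET_COLUMN_STATS(
    ownname => USER, tabname => 'REPORTING_TABLE', colname => 'DATE_PROCESSED',
    distcnt => l_distcnt, density => l_density,
    nullcnt => l_nullcnt, srec => l_srec, avgclen => l_avgclen);

  -- Two endpoints only: keep a low value (placeholder date), push high to now.
  l_srec.epc    := 2;
  l_srec.bkvals := NULL;
  DBMS_STATS.PREPARE_COLUMN_VALUES(
    l_srec, DBMS_STATS.DATEARRAY(DATE '2005-01-01', SYSDATE));

  DBMS_STATS.SET_COLUMN_STATS(
    ownname => USER, tabname => 'REPORTING_TABLE', colname => 'DATE_PROCESSED',
    distcnt => l_distcnt, density => l_density,
    nullcnt => l_nullcnt, srec => l_srec, avgclen => l_avgclen);
END;
/
```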
--
http://www.freelists.org/webpage/oracle-l

Received on Wed Feb 09 2011 - 14:32:42 CST