Re: Attempt to de-mystify DBMS_STATS ( Any Takers ? ) [message #159672 is a reply to message #159623]
Mon, 20 February 2006 20:50
rleishman
Senior Member
Messages: 3728  Registered: October 2005  Location: Melbourne, Australia
I agree with Mahesh, it depends.
If the execution plans of ALL your queries are stable, then there is no need to EVER gather stats again.
Two warnings though:
1. If you submit a query like "...AND txn_dt > 'const val'", the column histograms from your stale old stats might tell the CBO there are no rows in that range, so it could choose an index access. But if that range really contains 10M rows, an index scan would be a bad idea.
If this is a concern, either gather stats after every load or don't gather the column histograms at all. Be careful (see the sketch after warning 2).
2. Oracle 10g is a clever thing. It is configured by default to capture a summary of changes to a table since the last analyze. If it sees that a table has been truncated since the analyze, it is quite possible the CBO will take appropriate action (which may include dynamically sampling the table during the parse). This is pure supposition - I have no hard evidence to back it up.
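To make warning 1 (and the 10g change tracking in warning 2) concrete, here is a minimal sketch, assuming a made-up owner SCOTT and table TXN_FACT. The DBMS_STATS parameters shown are standard; 'FOR ALL COLUMNS SIZE 1' is the usual way to keep base column stats while avoiding histograms.

-- Option A for warning 1: gather after each load, but no histograms
-- ('SIZE 1' = one bucket per column, i.e. min/max/NDV only)
BEGIN
  dbms_stats.gather_table_stats(
    ownname    => 'SCOTT',              -- hypothetical owner
    tabname    => 'TXN_FACT',           -- hypothetical table
    method_opt => 'FOR ALL COLUMNS SIZE 1',
    cascade    => TRUE);
END;
/

-- For warning 2: see what 10g has tracked since the last analyze
SELECT table_name, inserts, updates, deletes, truncated, timestamp
FROM   user_tab_modifications
WHERE  table_name = 'TXN_FACT';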
DBAs are understandably anal-retentive about stats. They cannot control the data or the queries, so the only safe advice is to always analyze.
_____________
Ross Leishman
Re: Attempt to de-mystify DBMS_STATS ( Any Takers ? ) [message #159773 is a reply to message #159613]
Tue, 21 February 2006 07:53
smartin
Senior Member
Messages: 1803  Registered: March 2005  Location: Jacksonville, Florida
Just wanted to add...
It takes over 3 hours to gather the stats on a single table, did I read that right? Are you using the degree parameter to gather in parallel? If not, you should consider that. Is your table partitioned? At 400 million rows, I'd think it should be.
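For the parallel part, a minimal sketch (the degree value and table name are placeholders, pick them to suit your CPUs and I/O):

BEGIN
  dbms_stats.gather_table_stats(
    ownname => USER,
    tabname => 'BIG_TXN_TABLE',   -- hypothetical 400M-row table
    degree  => 8,                 -- number of parallel slaves used for the gather
    cascade => TRUE);             -- gather index stats in the same call
END;
/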
Also, you don't have to compute statistics for every row. On a table that big, you can probably get away with a fairly low estimate_percent. Try tests with various percentages to see if there is any appreciable difference in either
a) the actual statistics and histos gathered
b) plans of your most important / expensive queries
Suggested estimate percentages would be 2, 5, 10, 15, 20, 30, 100. Start with two tests, at 2 percent and 100 percent. If no difference, then stop. Otherwise, increase your percentage slowly and try again. Heck with 400 million, you could probably get away with 1%.
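A rough sketch of that comparison, again with an invented table name (check whichever dictionary columns and query plans matter to you):

-- First pass: cheap 2% sample
BEGIN
  dbms_stats.gather_table_stats(
    ownname          => USER,
    tabname          => 'BIG_TXN_TABLE',
    estimate_percent => 2,
    cascade          => TRUE);
END;
/

-- Note the resulting stats...
SELECT num_rows, blocks, avg_row_len, last_analyzed
FROM   user_tables
WHERE  table_name = 'BIG_TXN_TABLE';

-- ...then repeat with estimate_percent => 100 (or 5, 10, ...) and compare,
-- along with the plans of your most important queries.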
Oh, and if your data change is predictable, you can potentially calculate the statistic changes yourself, and manually modify the dictionary via calls to dbms_stats.set_table_stats (and its set_column_stats / set_index_stats siblings). But no need to bother if the above approaches knock your estimate time down to a negligible amount.
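If you do go that route, the table-level call looks roughly like this; the table name and figures below are made up, and you would substitute the values you have calculated:

BEGIN
  dbms_stats.set_table_stats(
    ownname => USER,
    tabname => 'BIG_TXN_TABLE',
    numrows => 400000000,   -- your calculated row count
    numblks => 6000000,     -- your calculated block count (illustrative figure)
    avgrlen => 120);        -- your calculated average row length in bytes
END;
/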