Home » RDBMS Server » Performance Tuning » Gathering CBO Stats
Gathering CBO Stats [message #172484] Tue, 16 May 2006 16:20 Go to next message
smartin
Messages: 1803
Registered: March 2005
Location: Jacksonville, Florida
Senior Member
I'm at a new place using oracle 9i. They have some sort of vendor COTS applications with lots of tables (2500 or so). Nice box with plenty of cpu power to spare. I'm testing ways to gather stats (since they currently use rule database-wide).

Tables are all in a single schema, so the easiest way to get up and running is to use gather_schema stats, and schedule that however I want. I can even throw a nice degree in there.

But that will still backlog, even in parallel with degree, because there are many small tables to wade through, and actually not that many big ones (or what they consider to be huge, being greater than one million rows).

Anyway, so I'm just pondering other options, and was curious as to what any of you had done before in a similar situation. I haven't compiled or tested this yet or anything, but I'm thinking of breaking the tables up into 8 buckets using ntile, and then submitting 8 jobs to run right away, with each one calling a procedure with a bucket number 1-8. Oh and the schema owner.

This procedure would do a select of all tables for that schema with the ntile function, and since the order by clause would be deterministic (table names for a specific owner), then the buckets would line up the same in each procedure call / job submission (assuming no tables were added/dropped during the operation, which I will see to it is the case).

Basically it is do it yourself parallelism. Anyone have any comments or would like to point out any flaws in the system? As I said I haven't even compiled, much less tested the below code, but I am planning on it, and will compare its runtime to the runtime of the standard gather_schema_stats call.

create or replace procedure gather_bucket_stats(p_owner varchar2, p_bucket number)
is
begin
	for t in (select table_name from (
			select table_name, ntile(8) over (order by table_name) bucket
			from dba_tables where owner = p_owner) where bucket = p_bucket)
	loop
		dbms_stats.gather_table_stats(p_owner, t.table_name, cascade => true,
			method_opt => 'for all columns size 254');
	end loop;
end;
/

show errors

create or replace procedure gather_schema_stats(p_owner varchar2)
is
	jobno number;
	what varchar2(100);
begin
	for i in 1 .. 8 loop

		what := 'gather_bucket_stats(p_owner, ' || i || ');';

		dbms_job.submit(jobno, what);

	end loop;

	commit;
end;
/

show errors

Re: Gathering CBO Stats [message #172491 is a reply to message #172484] Tue, 16 May 2006 19:27 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
What problem are you really trying to solve?
What metrics lead you to conclude that this problem needs solving?
Based upon which metrics at what value(s) would an independent observer conclude that this problem really has been solved?
Yes, I will stipulate that current statics are a Good Thing(TM).
How much business activity must occur before a new collection is JUSTIFIED (based upon which metric(s) at what value(s))?
Re: Gathering CBO Stats [message #172734 is a reply to message #172491] Wed, 17 May 2006 17:41 Go to previous messageGo to next message
andrew again
Messages: 2577
Registered: March 2000
Senior Member
I would definitely not generate histograms unless you know your data is significantly skewed. You'll have a really large amount of histogram data lying around which in many cases will add no value and have tons of overhead to generate. Additionally I'd start by limiting histograms to indexed columns and with much smaller bucket sizes. As far as the parallelizing goes - setting the degree of parallelism in dbms_stats.gather_table_stats may be as good an alternative. If you turn table monitoring on, you can run stats gathering in AUTO or STALE mode to only update state once they've changed significantly...

Re: Gathering CBO Stats [message #172920 is a reply to message #172484] Thu, 18 May 2006 11:06 Go to previous messageGo to next message
smartin
Messages: 1803
Registered: March 2005
Location: Jacksonville, Florida
Senior Member
Thanks for the thoughts...

I wonder what would happen if I set the degree to a higher than needed value. My thought though is that for the tiny tables, having a degree of anything would actually just slow things down, so the higher the degree the more overhead. But that may not be a factor, I'm just guessing on that.

Vast majority of tables are in the small range (less than 100k rows). I'm running some tests now with the above procedure, and will try out different numbers of degree param as well as number of buckets I use to see if there is any improvement.

Trying to get this initial gathering down in duration, as using the gather schema stats took it like 12 hours, and curious if this approach will help any. On the first run it didn't, but I think part of that is that the ordering was done by table name, and it seems that their bigger tables all have similar names, so they were all being lumped into a single bucket.

So I've changed that...anyway will post back if this approach really does help or if it isn't worth it and just stick with the gather schema and be done with it.

Thing is, there is a ton of spare capacity, at least cpu and ram wise. No point in it sitting idle...

[Updated on: Thu, 18 May 2006 11:11]

Report message to a moderator

Re: Gathering CBO Stats [message #173118 is a reply to message #172484] Fri, 19 May 2006 08:34 Go to previous message
smartin
Messages: 1803
Registered: March 2005
Location: Jacksonville, Florida
Senior Member
Just initial test results in, nothing concrete enough to prove anything on a general basis, or even in this specific case, but it "looks like" the multiple job approach is cutting the gather time in half...from 11 or 12 hours to 5 or 6. But, it could be a matter of simply finding the right value for degree in dbms_stats.gather_schema_stats that would get the same result.
Previous Topic: CBO leaves out access path ?
Next Topic: buffer busy waits
Goto Forum:
  


Current Time: Sat Nov 23 15:40:53 CST 2024