RE: Why a rebuild speeds up my queries.

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Wed, 1 Sep 2004 12:24:09 -0400
Message-ID: <KNEIIDHFLNJDHOOCFCDKCENJFEAA.mwf@rsiz.com>

There may be more modern tools I haven't figured out yet, however in lieu of a better solution being proffered (which I'll quickly swipe myself),the ancient technique is this:

insert an "all zeroes" row (I guess in your case with an actual account_id - are they really all 35 chars or at least unchanging?)
As long as all your columns work with the vsize function (all yours do), generate yourself a query of the form:

Select

     vsize(c1) +
     vsize(c2) +
     ...
     vsize(cn) +
     0
     from tab
     where account_id = 'the brand new one';

     Write down that number. Unless your account id length varies, that is a

constant. If you have a column that vsize chokes on (longs, if memory serves and maybe some user defined types and whatnot), you can look up the estimated size for an empty column of that size in a manual someplace and hope that the reference you locate is reasonably accurate, and add it to the total.

3) Now I don't see a DLM or an account creation date, so I really can't tell from this table whether you have a way to identify mature rows. Usually business data does mature (meaning it stops changing in length). If you have a way to identify mature rows, the next thing you do (I usually partially generate from the dictionary and paste pieces together in an editor) is find the max(vsize(c1)+vsize(c2)+...+vsize(c3)), avg, and min for the mature rows. You probably have some descriptive table for each account_id, right? Anyway, if you figure out how many rows of your average size you can put per block (don't forget the block overhead - I usually verify by inserting a filled in char column of the required length and checking how many actually fit before rowid tells me I've spilled into a new block - all this varies for stuff like initrans, etc.) you can do the math and come up with a pretty good setting for pctfree such that it is exceeded by the correct number of
"all zeroes" rows. That gets screwed up a bit if you're likely to put big
numbers in new rows before the block is full and you'll end up with a bit less density than you want. If you want a better number, you have to do quite a bit more work on the query side to figure out the shape of your mature row distribution length. Then you can figure out how many rows will migrate anyway if you use the average to set pctfree (more math than I'm typing required.) At least you can get pretty doggone close, but usually average is about right, anyway. A side issue is that if you use a higher pctfree to accomodate rows over the average length with less eventual migration, then you have to figure in the extra blocks you'll scan due to lower density versus the extra blocks to pick up a migrated row. Only knowledge of your actual use of the data will answer that question, but you probably have a decent idea. Oh - if the number of mature rows that fit per block from this calculation is less than 1, you probably want to try a bigger block size.

4) You didn't have any date fields, but I'd be remiss if I didn't mention how useful having defined dates meaning "I don't have a date yet" can be to minimizing row migration for tables with lots of dates. Since dates are constant length, this reduces the variability in row size, which will tend to allow you to have a higher density with fewer row migrations.

5) Now remember, you're probably trying to minimize your operational cost over time. If you're not 24x7 and your row length is highly dynamic and variable, and cost and inconvenience to rebuild that individual table is modest, then (heresy) rebuilding that particular table periodically might be your optimal solution. Even then, however, there is a good chance you can usefully rebuild less than the whole table. You've got pretty much all number columns, and they only grow a byte per two orders of magnitude. So unless the values vary tremendously, rows that have been around a few months probably don't change much in size any more. Next time you rebuild, you could do something along the lines of: Create new table with tiny pctfree, copy in all the "mature" rows, reset pct free to your make the number of
"all zeroes" rows about right, and then copy in all the "non-mature" rows.
This should give you the benefit of relatively high density on the rows unlikely to change much in length combined with space to grow for young rows. This will tend to prolong the time before you need another rebuild (which you would only do if you measure that increased density or corrected migration would likely reduce costs enough to justify the cost of the rebuild.) Then you next rebuild would be copy out the rows copied in using higher pctfree (and added later), delete them, copy back the now mature rows from the copy out at lower pctfree and copy back the non-mature rows at higher pctfree. Now if all this trouble is actually justified by your query cost reduction, you might seriously consider partitioning by creation date of your accounts, and manage the density partition by partition.

6) You have to figure out how much trouble this is worth for the gains possible. (Which you can measure.) There is a huge variant in the cost and inconvenience to rebuild depending on your operational environment. (The more like 24x7x365.25 it is the more inconvenient it tends to be.)

Good luck, I'm sure you'll gets lots of useful additional suggestions on this list.

mwf

-----Original Message-----
From: oracle-l-bounce_at_freelists.org
[mailto:oracle-l-bounce_at_freelists.org]On Behalf Of David Sharples Sent: Wednesday, September 01, 2004 9:31 AM To: oracle-l_at_freelists.org
Subject: RE: Why a rebuild speeds up my queries.

As an extra too this, here is a table we rebuilt

ACCOUNT_ID                                NOT NULL VARCHAR2(35)
 ACCOUNT_TYPE                              NOT NULL NUMBER(3)
 CALL_DIVERT_ABS_FEAT_TS                            NUMBER(11)
 CALL_DIVERT_ABS_FEAT_1H                            NUMBER(7)
 CALL_DIVERT_ABS_FEAT_4H                            NUMBER(7)
 CALL_DIVERT_ABS_FEAT_8H                            NUMBER(7)
 <snip>
 VOICE_MAIL_ABS_FEAT_1W                             NUMBER(7)

On account creation, everything is populated with 0 apart from account_id and account_type

As time goes on the others will get proper values, up to the column length, so the row can grow massively.

I'm tempted to set pctfree to be like 80,but this just *seems* bad - any suggestions

Thanks

Dave

<snip>

Please see the official ORACLE-L FAQ: http://www.orafaq.com

To unsubscribe send email to: oracle-l-request_at_freelists.org put 'unsubscribe' in the subject line.

--
Archives are at http://www.freelists.org/archives/oracle-l/
FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------

Received on Wed Sep 01 2004 - 14:52:07 CDT