RE: Size estimation
Date: Mon, 22 Feb 2021 10:18:36 -0500
Message-ID: <446601d7092d$fe2eb090$fa8c11b0$_at_rsiz.com>
Not wrong, but remember what JL wrote about HOW you fill rows.
IF, for example, someone decided as a design choice to pin down a primary key first and fill in all the attributes later (which I am NOT advocating, but which I have seen), then unless you do the Hakan factor setting dance you might take up a lot more space.
So if, in your applications, anything less than all of the columns gets a value at least as long as its final value at insert time, you might be subject to nasty row migration and wasted space.
If you take your existing test and look at the way your applications actually insert rows and update columns, you could run it as a series of inserts and updates (filling in the columns that are not “born” with values at insert time in a wave or waves) to simulate the reality of your application.
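Something along these lines (table, column names and the 100K row count are just made up for illustration) is the sort of insert-then-update wave simulation I mean:

create table size_sim (
  id        number(15,0) primary key,
  status    varchar2(10),
  closed_dt date,
  payload   varchar2(50)
);

-- wave 1: rows as the application "births" them, with some columns still null
insert into size_sim (id, payload)
select rownum, rpad('x', 50, 'x')
from dual connect by level <= 100000;
commit;

-- wave 2: the attributes that arrive later, growing every row in place
update size_sim set status = 'COMPLETE', closed_dt = sysdate;
commit;

-- compare block count and migrated/chained row count against the all-columns-at-insert test
analyze table size_sim compute statistics;
select blocks, chain_cnt from user_tables where table_name = 'SIZE_SIM';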
Now that is oriented only toward a before-the-fact size projection. Usually I use that sort of simulation to help urge folks toward having the initial insert be fully filled out, including using dummy plug values when the attribute is logically unknowable until later. This is particularly useful for operational stage or status columns and date-of-occurrence columns, as is using the Tim Gorman “Scaling to Infinity” approach (the fastest update or delete is an insert; read his or someone else’s papers on that approach).
Folks quite often underestimate the value of having a row “born” on insert at its full length. If some columns rarely get a non-born-on value, that’s a slippery slope and you might need histograms if a significant number of initial “plug” values pile up.
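For what it's worth, a rough sketch (made-up names and plug values) of having a row “born” at full length:

create table orders_sim (
  order_id  number(15,0) primary key,
  status    varchar2(10) default 'UNKNOWN'          not null,
  closed_dt date         default date '9999-12-31'  not null,
  note      varchar2(50) default rpad('x', 50, 'x') not null
);

-- the insert supplies only what is knowable; the plug defaults keep the row at full length,
-- and later updates overwrite values of the same (or shorter) length in place
insert into orders_sim (order_id) values (1);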
If nearly all your blocks have 16 or 17 rows (for example), that will be stable information for the CBO, and the lack of row migration means your block scanning can take place as efficiently as possible.
I prefer to stack the deck in favor of the CBO easily getting very good plans unless it puts you on a maintenance treadmill that exceeds the value.
Also, I didn’t see anything about compression in the thread (I may have missed it).
Good luck,
mwf
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Pap
Sent: Monday, February 22, 2021 3:12 AM
To: jack_at_vanzanen.com
Cc: Mark W. Farnham; Jonathan Lewis; Oracle L
Subject: Re: Size estimation
Thank you all. I have created a sample table with all not-null values and all the columns filled to their maximum possible length, and inserted the same row ~100K times, i.e. the total number of rows is 100K. I now see the size of the full table is ~53MB, so for 100 million rows I am estimating ~51GB. For the index on those two columns the size comes to ~4MB, so for 100 million rows it will be ~4GB. So the combined table and index size for a day's worth of data holding ~100 million rows is ~55GB. I hope that is correct.
And I was thinking the actual size may not be this much, so to get the near-real size of the table/index the only way is to have the rows inserted the same way the actual business data will be inserted into the table. Please correct me if I am wrong.
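(To spell out the scaling arithmetic behind those numbers:

select 53 * (100000000 / 100000) / 1024 as table_gb,
       4  * (100000000 / 100000) / 1024 as index_gb
from dual;
-- roughly 51.8 and 3.9, hence ~55GB combined
)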
Regards
Pap
On Mon, Feb 22, 2021 at 6:19 AM Jack van Zanen <jack_at_vanzanen.com> wrote:
Or
Simply create the table and indexes in a test environment and add 100K dummy records, record the size and multiply to scale.
No need for maths, and it also fills the indexes, so you will know the sizes for those as well :-)
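A rough sketch (made-up table and index names); the dummy-row generator plus one query against user_segments is the whole job:

create table my_new_table (
  id   number(15,0) constraint my_new_table_pk primary key,
  col1 varchar2(40),
  col2 varchar2(50)
);

-- 100K dummy rows, padded out the way the real data is expected to look
insert into my_new_table (id, col1, col2)
select rownum, rpad('x', 40, 'x'), rpad('y', 50, 'y')
from dual connect by level <= 100000;
commit;

-- record the sizes, then multiply by (expected rows / 100000)
select segment_name, segment_type, bytes/1024/1024 as mb
from user_segments
where segment_name in ('MY_NEW_TABLE', 'MY_NEW_TABLE_PK');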
Jack van Zanen
On Mon, Feb 22, 2021 at 9:35 AM Mark W. Farnham <mwf_at_rsiz.com> wrote:
What JL wrote, and you did only ask about the size for the table.
BUT, since you marked a primary key, which is almost certainly supported by an index, and you may have additional indexes, you'll need to tack on space for the indexes to get the total storage requirement.
mwf
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Jonathan Lewis
Sent: Sunday, February 21, 2021 1:29 PM
To: Oracle L
Subject: Re: Size estimation
The number(15,0) will take at most 9 bytes
The number(13,0) will take at most 8 bytes each
So your estimate should be 496 - 13 - 14 - 14 = 455
Then you need to add one byte per column (the length byte stored with each column) to get 471.
Then you have to allow for block size, which means 8,066 bytes available from an 8KB block size with pctfree 0, initrans 2 (default) and ASSM
Max rows = trunc(8066 / 471) = 17 rows per block,
At 100M rows that's 5,882,353 data blocks.
If you create the table using a large extent size (8MB minimum) you get 1 bitmap block for every 128 blocks allocated, so your block requirement goes up by a factor of 128/127,
so a total of 5,928,671 blocks. Round that up to the nearest 64MB (assumed extent size) - 5,931,008 blocks = 45.25GB.
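(If you want to sanity-check that arithmetic, a quick query, assuming the 471 bytes per row, 8,066 usable bytes per 8KB block and 64MB extents used above, reproduces it:

select trunc(8066 / 471)                                                          as rows_per_block,
       ceil(100000000 / trunc(8066 / 471))                                        as data_blocks,
       ceil(ceil(100000000 / trunc(8066 / 471)) * 128 / 127)                      as incl_bitmap_blocks,
       ceil(ceil(ceil(100000000 / trunc(8066 / 471)) * 128 / 127) / 8192) * 8192  as blocks_rounded_to_64mb,
       round(ceil(ceil(ceil(100000000 / trunc(8066 / 471)) * 128 / 127) / 8192) * 8192 * 8192 / power(1024, 3), 2) as gb
from dual;
-- 17, 5882353, 5928671, 5931008, 45.25
)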
So even with several errors on the way you got pretty close to the "right" answer.
Realistically, though, you're unlikely to fill all those 40 and 50 character columns, and unless you're very careful with setting pctfree (and maybe playing around with the Hakan factor) you're probably going to get too many rows into each block on the initial insert and then run into problems with row migration.
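A rough sketch of both knobs (made-up names; 17 is just whatever per-block row count you decide to allow for):

create table hakan_demo (id number(15,0), pad varchar2(50)) pctfree 30;
-- pctfree reserves room in each block for rows to grow on later updates

-- the Hakan factor "dance": put only as many rows into a block as you want to allow,
-- then lock that in as the per-block maximum (the command fails on an empty table)
insert into hakan_demo (id) select rownum from dual connect by level <= 17;
commit;
alter table hakan_demo minimize records_per_block;
-- the seed rows can then be deleted before the real load; the limit persists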
There's also the question of multi-byte character sets - are you thinking of your varchar2(N) declarations as N bytes (the default assumption) or N characters (which, depending on the character set, could mean up to 4N bytes)?
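For example (assuming the multi-byte case is AL32UTF8):

create table semantics_demo (
  v_byte varchar2(40 byte),  -- at most 40 bytes: the default when nls_length_semantics = BYTE
  v_char varchar2(40 char)   -- at most 40 characters, up to 4 bytes each in AL32UTF8, i.e. up to 160 bytes
);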
Regards
Jonathan Lewis
On Sun, 21 Feb 2021 at 17:03, Pap <oracle.developer35_at_gmail.com> wrote:
Hi Listers, it's Oracle RDBMS version 11.2.0.4 on Exadata. We have a table with the structure below which is going to be created as part of a new project, and we want to predict the storage/space requirement for it. It may not be the exact size, but at least we want to estimate the AVERAGE and MAXIMUM space requirement for the table, if all the columns are filled with not-null values with the maximum column length being occupied for each of the columns.
So to estimate the maximum space requirement, is it correct to just add up the column lengths in bytes and multiply by the projected number of rows? Something as below.
--
http://www.freelists.org/webpage/oracle-l
Received on Mon Feb 22 2021 - 16:18:36 CET