RE: TEMP tablespace size in a Data Warehouse

From: Gaja Krishna Vaidyanatha <gajav_at_yahoo.com>
Date: Tue, 13 Jun 2000 10:14:24 -0700 (PDT)
Message-Id: <10527.108998@fatcity.com>

Alex,

You raise a valid point on the history and behavior of the "analyze...estimate statistics;" and I think this needs to be clarified. Yes, prior to 7.3.0 the estimate sampled only the first "x%" of the table, where x is the percentage specified in the command or 1064 rows or whatever(when no sample size is given).

I think in and about version 7.3.3, this behavior changed and in my understanding the process of estimation does what I term as "sequential randomized estimation". In this process, the estimation of statistics is done by sampling the x% from a chunk of read worth - db_file_multiblock_read_count.

Which means, effectively the estimate worked just like a full table scan where all the blocks were read, but the sampling and calculation of statistics was done on the sample size, on a chunk-by-chunk basis. In Oracle 8i, we finally have "true randomized estimation". Thanks for bringing up this point.

Cheers,

Gaja.

Alex Hillman <alex_hillman_at_physia.com> wrote:
> Isn't it true that in 8i it if you use estimate Oracle get
> specified
> percentage of data randomly and before 8i it took specified
> percentage of
> data from the beginning of the table and therefore in 8i we
> can use smoller
> percentage with more reliable results?
>
> Alex Hillman
>
> -----Original Message-----
> Sent: Tuesday, June 13, 2000 12:38 AM
> To: Multiple recipients of list ORACLE-L
>
>
> Charlie,
>
> On a 1 TB implementation, we used the "rule of thumb" of (1.5
> *
> size of the largest table) as required temporary tablespace.
> The database release was in 7.3.4 and 8.0.5. This sizing was
> done keeping in mind, not only the need for temp segments
> during
> application sorts and index creations, but also for the
> potential need to perform a "full analyze" on the table with
> the
> "compute statistics" option. You are aware that a table
> analyze
> automatically induces analyze of all the relevant indexes on
> the
> said table.
>
> Having said that, it needs to be mentioned here that after a
> table reaches a certain size threshold (environment-specific
> but
> usually in 10s of Gb), it is almost processor/computation
> prohibitive to do "computes". While 8.0 and above does allow
> "parallel analyzes" by the use of the dbms_utility.
> analyze_part_object procedure, the sheer cost of performing a
> compute is sometimes infeasible.
>
> For most environments "estimates with sample sizes of 16% or
> above have been known to be statistically adequate". The
> statistical confidence interval for a 16% sample-size analyze
> is
> between 83-91%. I have used a sample size of 20% across the
> board for the past 5 years and it worked for me. Depending on
> the degree of skewness in your data, your mileage may vary.
> But
> at least you have a number to start with.
>
> If the usual sort-related parameters have been tuned, it is
> relevant to mention here that the number of tables that are
> analyzed at a given time, is going to have a direct impact on
> the size of the temporary tablespace.
>
> Best Regards,
>
> Gaja.
>
> --- Charlie Mengler <charliem_at_mwh.com> wrote:
> > Is there any "rule of thumb" that can be used to guesstimate
> > the appropriate size
> > of the TEMP tablespace in a data warehouse instance? For
> > example
> > XXX% of total size or
> > TEMP should be 1.YY * the size of the largest table or
> > TEMP should be 1.ZZ * the size of the largest index or
> > ??????
> >
> > Along these lines what can be done to minimize or reduce the
> > amount of TEMP
> > that is needed & what are the trade-offs involved. (This is
> > beside the "obvious"
> > response WRT to sorts & Sort Area Size.)
> >
> > --
> > Charlie Mengler Maintenance
> > Warehouse
> > charliem_at_mwh.com 5505 Morehouse
> > Drive
> > 858-552-6229 San Diego, CA
> > 92121
> > Always be sincere about your enthusiasm, whether you mean it
> > or not.
> > --
> > Author: Charlie Mengler
> > INET: charliem_at_mwh.com
> >
> > Fat City Network Services -- (858) 538-5051 FAX: (858)
> > 538-5051
> > San Diego, California -- Public Internet access /
> > Mailing Lists
> >
>

> > To REMOVE yourself from this mailing list, send an E-Mail
> > message
> > to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru')
> > and in
> > the message BODY, include a line containing: UNSUB ORACLE-L
> > (or the name of mailing list you want to be removed from).
> > You may
> > also send the HELP command for other information (like
> subscribing).
>
>
> =====
> Gaja Krishna Vaidyanatha | 3460 West Bayshore Road,
> Manager - Integration | Palo Alto, CA 94303
> & Consulting Services | gaja_at_brio.com
> Global Alliances | (650)-565-4442
> Brio Technology | www.brio.com
>
> "Opinions and views expressed are my own and not of Brio
> Technology"
>
> __________________________________________________
> Do You Yahoo!?
> Yahoo! Photos -- now, 100 FREE prints!
> http://photos.yahoo.com
> --
> Author: Gaja Krishna Vaidyanatha
> INET: gajav_at_yahoo.com
>
> Fat City Network Services -- (858) 538-5051 FAX: (858)
> 538-5051
> San Diego, California -- Public Internet access /
> Mailing Lists
>

> To REMOVE yourself from this mailing list, send an E-Mail
> message
> to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru')
> and in
> the message BODY, include a line containing: UNSUB ORACLE-L
> (or the name of mailing list you want to be removed from).
> You may
> also send the HELP command for other information (like
> subscribing).
>

Gaja Krishna Vaidyanatha   | 3460 West Bayshore Road,
Manager - Integration      | Palo Alto, CA 94303
& Consulting Services      | gaja_at_brio.com
Global Alliances           | (650)-565-4442
Brio Technology            | www.brio.com

"Opinions and views expressed are my own and not of Brio Technology" Received on Tue Jun 13 2000 - 12:14:24 CDT