Re: Insert / update in big tables

From: Robert Klemme <bob.news_at_gmx.net>
Date: Wed, 22 Mar 2006 11:20:17 +0100
Message-ID: <48cmn1Fiurs0U1@individual.net>

Bertrand Guillaumin wrote:
> Hi,
> I just received the assignement to make a batch 'go faster' on Oracle
> 9.2.0.5 and AIX 5.
>
> To summarize this batch does :
> - take a file on on machine to fill a temporary table(size between 100K
> and 500K).
> - check on a 22M table and a 15M table(product and product/account link)
> if the external ids corresponds to existing internal ids(reject those
> who don't).
> - Check on another 15M table (product line) if the value exists(flag to
> update if it does).
> - if flagged for insert :
> Insert into the 15M table .
> Insert 10 lines into a 150 million table(product line usage) this way
> INSERT INTO product_line_usage product_line_id,usage_value_n,field_n
> 10 times.
> -if flagged for update : Update the 15 M table
> Update 10 times the product_line_usage table.
>
> The product_line and product_line_usage have an unique index on
> product_line_id and product_line_id resp.
> They are noparallel,logging.
>
> It seems from the first trace files i've managed to get that most of the
> elapsed time is spent in the inserts for the time being.
>
> Since i wont be able to test directly the efficiency of my modifications
> i'm going to explain what i intend to do. If you have better ideas,
> remarks please let me know.
>
> Set all involved tables parallel, nologging for the duration of the
> batch(who will pass alone on the machine).
>
> Don't touch the file to database operation.
>
> then merge /*+APPEND PARALLEL*/ INTO product_line USING (SELECT
> product_line_id ,values from temp_table,product,product_account where
> joins) ON (product_line_id=product_line_id)
> INSERT
> UPDATES
>
> merge /*+APPEND PARALLEL*/ INTO product_line_usage USING (SELECT
> product_line_id ,usage_val,values from
> temp_table,product,product_account,usage_values_table where joins(no
> join on the usage_values_table)) ON (product_line_id=product_line_id AND
> USAGE_VAL = USAGE_VAL) INSERT
> UPDATES set field_val = case when usage_val = n then field_n end
>
> Reject into file : SELECT * FROM TEMP_TABLE WHERE EXTERNAL_ID NOT IN
> (SELECT EXTERNAL ID FROM product,product_account where joins)
>
> What do you think ?

Just a bunch of questions that come to mind: How is your batch implemented? Do you have an SP that does the checking / updating? Do you use direct path load of the initial file? Why do you insert 10 lines explicitly - are these additional values or are they drawn from the temp table?

Did you verify that there is no more room left for general optimizations (like indexing and IO throughput)?

Kind regards

robert Received on Wed Mar 22 2006 - 04:20:17 CST