Re: Data load question

From: jaromir nemec <jaromir_at_db-nemec.com>
Date: Mon, 10 Apr 2006 00:52:12 +0200
Message-ID: <0fd801c65c28$3e325110$3c02a8c0@JARAWIN>

Hi Sandeep,

> However, I want b.id and c.id as pk.

I'm not sure what you mean with PK in this context. What result you would expect if the table a will be initially populated as follows?

insert into a values ('A1','B1');

If the problem is that the generated ID start from 1 (and violating so in the next run the PK) you may use the sequence solution proposed by Stephane, alternatively you may use a max subquery to get the last generated value, see example below (the tricky part here is to get the max efficiently if the tables b and c get large)

slightly modified Mike's solution

insert all

when table_num = 1 then

into b (id, col1)

values (a_seq, col1)

when table_num >= 1 then

into c (id, col2, b_id)

values (b_seq, col2, a_seq)

select col1,

col2,

row_number() over (partition by col1 order by col1) as table_num,

(select nvl(max(id),0) from b) +

dense_rank() over (order by col1) as a_seq,

(select nvl(max(id),0) from c) +

row_number() over (order by col1, col2) as b_seq

from a;

I'd prefer the sequence based solution if there are potentially more parallel job performing the load. The MAX solution works well if the load is performed serially one job at a time. The decision is more a matter of taste depending on your preferences of dealing with redundant data.

If you plan to treat the col1 as a natural key and the id plays the surrogate key a more complicated solution with a lookup in the table b (to get the information whether the natural key is new and need a new surrogate key) will be needed.

something like this.

insert all

when table_num = 1 and new_pk_b = 1 then

into b (id, col1)

values (a_seq, col1)

when table_num >= 1 then

into c (id, col2, b_id)

values (b_seq, col2, a_seq)

select a.col1,

a.col2,

row_number() over (partition by a.col1 order by a.col1) as table_num,

nvl(b.id,

(select nvl(max(id),0) from b) +

dense_rank() over (partition by case when b.col1 is not NULL then 0 else 1 end order by a.col1)

) as a_seq,

(select nvl(max(id),0) from c) +

row_number() over (order by a.col1, a.col2) as b_seq,

case when b.col1 is not NULL then 0 else 1 end new_pk_b

from a, b

where a.col1 = b.col1(+);

Depending on the question above you may need to deploy additionally some de-dup processing.

Regards,

Jaromir D.B. Nemec

Original Message ----- From: "Sandeep Dubey" <dubey.sandeep_at_gmail.com> To: "Mike Spalinger" <mike.spalinger_at_gmail.com> Cc: <oracle-l_at_freelists.org> Sent: Friday, April 07, 2006 10:50 PM Subject: Re: Data load question

--
http://www.freelists.org/webpage/oracle-l

Received on Sun Apr 09 2006 - 17:52:12 CDT