HASH cluster building very Slow [message #441699] |
Tue, 02 February 2010 14:26 |
sky_lt
Messages: 28 Registered: February 2009 Location: vilnius
|
Junior Member |
|
|
Hi all,
I have table of 400 million rows, average row length 75 bytes
i am trying to recreate that table using hash clustered table.
Tablespace block size is 8kb.
--Create Hash cluster
create cluster events_hash_cluster (
cr_code number
)
single table hashkeys 1000000
size 8000
--Column cr_code has 500 000 distinct values
--There are 400 000 000 rows in table
--For table creation i excute the following statement
CREATE TABLE cr_events_hash
(
CR_ID NUMBER,
CR_CODE NUMBER,
CR_MNEMO VARCHAR2(30 Byte),
CR_EVT_CODE NUMBER,
CR_EVT_METH VARCHAR2(30 Byte),
CR_EVT_PL_CD NUMBER,
CR_EVT_PL_IR NUMBER(4),
CR_EVT_TRNS DATE,
CR_EVT_TRNS_CODE NUMBER,
CR_EVT_TRNS_AMT NUMBER(13,2),
CR_EVT_AMOUNT NUMBER(17,6),
CR_EVT_PARAM VARCHAR2 (1 Byte),
CR_EVT_DT DATE,
CR_EVT_USER VARCHAR2 (30 Byte),
CR_EVT_SYS_DT DATE)
CLUSTER events_hash_cluster(cr_code);
--In the end i try to insert rows into clustered table
INSERT INTO credit_events_hash
SELECT CR_ID,
CR_CODE,
CR_MNEMO,
CR_EVT_CODE,
CR_EVT_METH,
CR_EVT_PL_CD,
CR_EVT_PL_IR,
CR_EVT_TRNS,
CR_EVT_TRNS_CODE,
CR_EVT_TRNS_AMT,
CR_EVT_AMOUNT,
CR_EVT_PARAM,
CR_EVT_DT,
CR_EVT_USER,
CR_EVT_SYS_DT)
FROM s_cr_evnt;
--It takes 48hours already and still only 8% done
Is there a better way to fill has clustered table?
May be i should use CTAS on s_cr_evnt table before inserting to clustered table?
[Updated on: Tue, 02 February 2010 15:14] Report message to a moderator
|
|
|
|
Re: HASH cluster building very Slow [message #441766 is a reply to message #441722] |
Wed, 03 February 2010 01:00 |
sky_lt
Messages: 28 Registered: February 2009 Location: vilnius
|
Junior Member |
|
|
ramoradba of course it is testing enviroment.
the information on hash clusters i have found is very theoretical.
and the is not much info about them on net.
I had to broke the procces of filling that cluster as it would take me about a month! for 400 million row table.
As you can see i set the size for hash cluster - 8000 which is equal to the block size.
Does it 100% means that in the block there will never be several
different hashed column values in the same block (i assume i set the haskeys number high enough)
[Updated on: Wed, 03 February 2010 01:03] Report message to a moderator
|
|
|
|
Re: HASH cluster building very Slow [message #441800 is a reply to message #441793] |
Wed, 03 February 2010 03:01 |
John Watson
Messages: 8964 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
I've done a fair bit of work with clusters, and would welcome more information, too. I can't simulate your problem yet (ran out of disc space) but I do have a couple of suggestions. First, hash clusters can be very good for concurrent inserts: no buffer busy waits. You can't use parallelism, but you could divide up the insert into several (32? 64? just guessing) smaller transactions and try again. Partitioning the source table accordingly might help. Second, how often are you log switching? I would usually create huge logfile members for such a big operation (or even be naughty, and set _disable_logging=true). Third, you are going to need an undo tablespace the size of Jupiter. So again, would it make sense to break it up into smaller transactions?
Incidentally, what are the wait events for the session? And can you post an AWR report covering the insert?
Please keep this topic updated, it is an important subject.
|
|
|
|
Re: HASH cluster building very Slow [message #442160 is a reply to message #441804] |
Fri, 05 February 2010 01:15 |
sky_lt
Messages: 28 Registered: February 2009 Location: vilnius
|
Junior Member |
|
|
Finally i managed to fill the clustered table in about 9 hours.
--The steps i made:
--Devide table into 100 almost egual pieces
--and esure column's "credit_evt_credit_code" values exists only in one partition
CREATE TABLE s_credit_events_hash_part
NOLOGGING
PARTITION BY HASH (credit_evt_credit_code)
PARTITIONS 100 STORE IN (kredit_events_idx)
PARALLEL (DEGREE 20) AS
SELECT /*+parallel(i 20)*/i.* FROM s_cr_evnt i
--Filling the Clustered table
DECLARE
V_NAME VARCHAR2(30);
BEGIN
FOR i IN 51..150 LOOP
V_NAME := 'SYS_P'||TO_CHAR(i);
EXECUTE IMMEDIATE 'INSERT INTO credit_events_hash '||'(SELECT * FROM s_credit_events_hash_part PARTITION ('||V_NAME||'))';
COMMIT;
END LOOP;
END;
--Executed in 31623.8 seconds. Comparing to IOT building on the same table it took 4 times more time.
Now i am on testing the performance.
Interestingly i could not specify "NOLOGGING" hint nor in Cluster creation nor in Clustered table creation.
Does anyone knows how to switch off redo logs in this situation?
I think it could speed the creation process.
Or maybe by building hash cluster it takes NOLLOGING hint value from the tablespace in which th cluster is built?
|
|
|
Re: HASH cluster building very Slow [message #442178 is a reply to message #441699] |
Fri, 05 February 2010 03:05 |
John Watson
Messages: 8964 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
Hi - an excellent result.
Do you have time to test loading the data concurrently? Instead of looping through your partitions, launch several sessions and do several partitions at once? That is why I have used hash clusters: to do concurrent inserts from many sessions without serializing on buffer busy wait events.
And can you possibly post AWR reports for the the load you've just done? And also if you try doing it in concurrent sessions?
With regard to logging, I would never recommend the underscore parameter I mentioned before, it was actually rather unprofessional of me to suggest it (though it certainly works). I do wonder if a direct load is technically possible for a cluster, it might be that the load must go through the SGA because of the nature of the cluster structure. Though syntactically, these do work:
insert into cr_events_hash select * from S_CR_EVNT nologging;
insert /*+append*/ into cr_events_hash select * from S_CR_EVNT nologging;
I haven't yet checked whether redo is in fact generated.
|
|
|