Home » RDBMS Server » Performance Tuning » Selecting from a big table.
Selecting from a big table. [message #214788] |
Thu, 18 January 2007 01:38 |
a_developer
Messages: 194 Registered: January 2006
|
Senior Member |
|
|
I am not good in performance tuning.. Anyone, please help..
I have to insert into a table T_DEST the values selected by the following query:
select id,
min(trxn_dt),
max(trxn_dt)
from Mytab
where shop_id = 20
and item_id != 0
and trxn_dt >= '01-Jan-2005'
group by id
Note that:
-- Mytab is partitioned by trxn_dt (monthly)
-- the ID is the primary key
-- It has Bitmap indexes on TRXN_DT and SHOP_ID, Btree index on ITEM_ID
-- It has around 400M rows
My insert statement is:
insert into T_DEST
select dest_seq.nextval,
'DONE',
id,
min(trxn_dt) dt1,
max(trxn_dt) dt2
from Mytab
where shop_id = 20
and item_id != 0
and trxn_dt >= '01-Jan-2005'
group by id;
The explain plan for the SELECT is:
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 305157102
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 20 | 92748 (1)| 00:21:39 | | | |
| 1 | PX COORDINATOR | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 1 | 20 | 92748 (1)| 00:21:39 | | | Q1,
| 3 | HASH GROUP BY | | 1 | 20 | 92748 (1)| 00:21:39 | | | Q1,01 | PCWP
| 4 | PX RECEIVE | | 1 | 20 | 92748 (1)| 00:21:39 | | | Q1,01 | PCWP |
| 5 | PX SEND HASH | :TQ10000 | 1 | 20 | 92748 (1)| 00:21:39 | | | Q1,00 |
| 6 | HASH GROUP BY | | 1 | 20 | 92748 (1)| 00:21:39 | | | Q1,00 | PC
| 7 | PX BLOCK ITERATOR | | 61M| 1179M| 92515 (1)| 00:21:36 | 13 | 48 | Q1,00
|* 8 | TABLE ACCESS FULL| MYTAB | 61M| 1179M| 92515 (1)| 00:21:36 | 13 | 48 | Q1,00 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
8 - filter("SHOP_ID"=20 AND "ITEM_ID"<>0)
Question: Should I use a materialized view?? Should I put hints?? Please advise me..
Thanks in advance.
|
|
|
Re: Selecting from a big table. [message #214995 is a reply to message #214788] |
Thu, 18 January 2007 21:00 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
If ID is the primary key, why are you GROUPing on it - it's unique! This will cause Oracle to build a (group by) hash table with 400M entries. This is going to do a lot of paging to disk.
Ross Leishman
|
|
|
|
|
|
Re: Selecting from a big table. [message #215150 is a reply to message #215055] |
Fri, 19 January 2007 08:17 |
cbruhn2
Messages: 41 Registered: January 2007
|
Member |
|
|
ok here is my test table :
drop table mytab;
CREATE TABLE mytab (
shop_id number,
item_id number,
trxn_id date)
PARTITION BY RANGE(trxn_id)
(
PARTITION p1 VALUES LESS THAN (TO_DATE('01-JAN-2004','DD-MON-YYYY')),
PARTITION p2 VALUES LESS THAN (TO_DATE('01-FEB-2004','DD-MON-YYYY')),
PARTITION p3 VALUES LESS THAN (TO_DATE('01-MAR-2004','DD-MON-YYYY')),
PARTITION p4 VALUES LESS THAN (TO_DATE('01-APR-2004','DD-MON-YYYY')),
PARTITION p5 VALUES LESS THAN (TO_DATE('01-MAY-2004','DD-MON-YYYY')),
PARTITION p6 VALUES LESS THAN (TO_DATE('01-JUN-2004','DD-MON-YYYY')),
PARTITION p7 VALUES LESS THAN (TO_DATE('01-JUL-2004','DD-MON-YYYY')),
PARTITION p8 VALUES LESS THAN (TO_DATE('01-AUG-2004','DD-MON-YYYY')),
PARTITION p9 VALUES LESS THAN (TO_DATE('01-SEP-2004','DD-MON-YYYY')),
PARTITION p10 VALUES less THAN (maxvalue)
)
parallel 5 ;
begin
for i in 1..1000
loop
insert into mytab values (i,i,'05-JAN-2004');
insert into mytab values (i,i,'05-FEB-2004');
insert into mytab values (i,i,'05-MAR-2004');
insert into mytab values (i,i,'05-APR-2004');
insert into mytab values (i,i,'05-MAY-2004');
insert into mytab values (i,i,'05-JUN-2004');
insert into mytab values (i,i,'05-JUL-2004');
end loop;
end;
/
I now run different test with bitmap indexing as you have not stated exactly how your indexes are defined.
create bitmap index b_mytab on mytab(trxn_id,shop_id) local nologging;
create index i_mytab on mytab(item_id) nologging;
execute dbms_stats.gather_table_stats('CRB','MYTAB',cascade => true);
by running the following select i get results very similar to your's in the explain plan
select ITEM_ID,
min(trxn_id),
max(trxn_id)
from Mytab
where shop_id = 20
and item_id != 0
and trxn_id >= '01-APR-2004'
group by ITEM_ID
;
Execution Plan
----------------------------------------------------------
Plan hash value: 3999219564
---------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 75 | 3 (34)| 00:00:01 | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 5 | 75 | 3 (34)| 00:00:01 | | | Q1,01 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | 5 | 75 | 3 (34)| 00:00:01 | | | Q1,01 | PCWP | |
| 4 | PX RECEIVE | | 5 | 75 | 2 (0)| 00:00:01 | | | Q1,01 | PCWP | |
| 5 | PX SEND HASH | :TQ10000 | 5 | 75 | 2 (0)| 00:00:01 | | | Q1,00 | P->P | HASH |
| 6 | PX BLOCK ITERATOR | | 5 | 75 | 2 (0)| 00:00:01 | 5 | 10 | Q1,00 | PCWC | |
|* 7 | TABLE ACCESS FULL| MYTAB | 5 | 75 | 2 (0)| 00:00:01 | 5 | 10 | Q1,00 | PCWP | |
---------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
7 - filter("SHOP_ID"=20 AND "TRXN_ID" IS NOT NULL AND "ITEM_ID"<>0)
Statistics
----------------------------------------------------------
61 recursive calls
3 db block gets
73 consistent gets
0 physical reads
832 redo size
550 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
1 rows processed
If i try to force the statement to use index on trxn_id (B_MYTAB) I get the following result.
select /*+ index(Mytab,b_mytab) */ ITEM_ID,
min(trxn_id),
max(trxn_id)
from Mytab
where shop_id = 20
and item_id != 0
and trxn_id >= '01-APR-2004'
group by ITEM_ID
;
Execution Plan
----------------------------------------------------------
Plan hash value: 2241380932
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 75 | 27 (4)| 00:00:01 | | |
| 1 | HASH GROUP BY | | 5 | 75 | 27 (4)| 00:00:01 | | |
| 2 | PARTITION RANGE ITERATOR | | 5 | 75 | 26 (0)| 00:00:01 | 5 | 10 |
|* 3 | TABLE ACCESS BY LOCAL INDEX ROWID| MYTAB | 5 | 75 | 26 (0)| 00:00:01 | 5 | 10 |
| 4 | BITMAP CONVERSION TO ROWIDS | | | | | | | |
|* 5 | BITMAP INDEX RANGE SCAN | B_MYTAB | | | | | 5 | 10 |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("ITEM_ID"<>0)
5 - access("TRXN_ID">=TO_DATE('2004-04-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND "SHOP_ID"=20)
filter("TRXN_ID">=TO_DATE('2004-04-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND "SHOP_ID"=20)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
30 consistent gets
0 physical reads
0 redo size
550 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
As you can see I now got a different explain plan (big deal ), but what's more important I can se that my consistent gets is down by half.
As far as I can see you didn't provide any statistics result, probably because the load would be too big, bur now I'm guessing.
Hope this is helpful.
|
|
|
Re: Selecting from a big table. [message #215222 is a reply to message #215150] |
Fri, 19 January 2007 21:50 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
You can try this. It may not be faster, but maybe it's worth a go.
You will need a global b-tree index on ITEM_ID, SHOP_ID, TRXN_DT. Don't bother trying it without this: it won't work. And this is important - you MUST gather statistics - this technique only works with the CBO. No stats means rule-based.
select item_id
, (select min(trxn_dt)
from Mytab
where item_id = a.item_id
and shop_id = 20
and trxn_dt >= '01-Jan-2005') AS dt1
, (select max(trxn_dt)
from Mytab
where item_id = a.item_id
and shop_id = 20
and trxn_dt >= '01-Jan-2005') AS dt2
from (select distinct item_id
from Mytab
where trxn_dt >= '01-Jan-2005'
and item_id != 0
) a
What I am hoping to achieve here is a single parallel Fast Full Scan to get the distinct ITEM_IDs, and then single row MAX/MIN probes on the index to get the dates.
Honestly, I don't think it will help, the DISTINCT will kill it. But it is the only materially different plan I can come up with. You could also try your original query with a locally partitioned b-tree index on those three columns.
Ross Leishman
|
|
|
|
Goto Forum:
Current Time: Wed Jan 08 04:24:14 CST 2025
|