Selecting from a big table. [message #214788] Thu, 18 January 2007 01:38
a_developer
Messages: 194
Registered: January 2006
Senior Member
I am not good at performance tuning. Can anyone please help?

I have to insert into a table T_DEST the values selected by the following query:

select id,
       min(trxn_dt),
       max(trxn_dt)
  from Mytab
 where shop_id = 20
   and item_id != 0
   and trxn_dt >= '01-Jan-2005'
 group by id


Note that:
-- Mytab is partitioned by trxn_dt (monthly)
-- the ID is the primary key
-- It has Bitmap indexes on TRXN_DT and SHOP_ID, Btree index on ITEM_ID
-- It has around 400M rows

My insert statement is:

insert into T_DEST 
select dest_seq.nextval,
       'DONE',
       id,
       min(trxn_dt) dt1,
       max(trxn_dt) dt2
  from Mytab
 where shop_id = 20
   and item_id != 0
   and trxn_dt >= '01-Jan-2005'
 group by id;
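
By the way, the '01-Jan-2005' literal relies on an implicit string-to-date conversion via my session's NLS_DATE_FORMAT. The same insert with an explicit format mask would be:

insert into T_DEST
select dest_seq.nextval,
       'DONE',
       id,
       min(trxn_dt) dt1,
       max(trxn_dt) dt2
  from Mytab
 where shop_id = 20
   and item_id != 0
   and trxn_dt >= TO_DATE('01-Jan-2005', 'DD-Mon-YYYY')  -- explicit conversion instead of relying on NLS settings
 group by id;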

The explain plan for the SELECT is:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------
Plan hash value: 305157102

------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |           |     1 |    20 | 92748   (1)| 00:21:39 |       |       |
|   1 |  PX COORDINATOR          |           |       |       |            |          |       |       |
|   2 |   PX SEND QC (RANDOM)    | :TQ10001  |     1 |    20 | 92748   (1)| 00:21:39 |       |       |
|   3 |    HASH GROUP BY         |           |     1 |    20 | 92748   (1)| 00:21:39 |       |       |
|   4 |     PX RECEIVE           |           |     1 |    20 | 92748   (1)| 00:21:39 |       |       |
|   5 |      PX SEND HASH        | :TQ10000  |     1 |    20 | 92748   (1)| 00:21:39 |       |       |
|   6 |       HASH GROUP BY      |           |     1 |    20 | 92748   (1)| 00:21:39 |       |       |
|   7 |        PX BLOCK ITERATOR |           |    61M|  1179M| 92515   (1)| 00:21:36 |    13 |    48 |
|*  8 |         TABLE ACCESS FULL| MYTAB     |    61M|  1179M| 92515   (1)| 00:21:36 |    13 |    48 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   8 - filter("SHOP_ID"=20 AND "ITEM_ID"<>0)

Question: Should I use a materialized view? Should I add hints? Please advise.

Thanks in advance.
Re: Selecting from a big table. [message #214995 is a reply to message #214788] Thu, 18 January 2007 21:00
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
If ID is the primary key, why are you GROUPing on it - it's unique! This will cause Oracle to build a (group by) hash table with 400M entries. This is going to do a lot of paging to disk.
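
To illustrate: if ID really were unique, every group would contain exactly one row, so the GROUP BY and the aggregates buy you nothing and the query collapses to something like this (a sketch based on the query as posted):

select id,
       trxn_dt dt1,   -- one row per ID, so MIN(trxn_dt) = MAX(trxn_dt) = the row's own date
       trxn_dt dt2
  from Mytab
 where shop_id = 20
   and item_id != 0
   and trxn_dt >= '01-Jan-2005';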

Ross Leishman
Re: Selecting from a big table. [message #215018 is a reply to message #214995] Fri, 19 January 2007 00:12
a_developer
Messages: 194
Registered: January 2006
Senior Member
Oops, I am very sorry, I was wrong: ID is not the primary key. Please change the query to:

select ITEM_ID,
       min(trxn_dt),
       max(trxn_dt)
  from Mytab
 where shop_id = 20
   and item_id != 0
   and trxn_dt >= '01-Jan-2005'
 group by ITEM_ID
Re: Selecting from a big table. [message #215024 is a reply to message #215018] Fri, 19 January 2007 00:26
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
How many rows are generated in the table?
Re: Selecting from a big table. [message #215055 is a reply to message #215024] Fri, 19 January 2007 03:02
a_developer
Messages: 194
Registered: January 2006
Senior Member
Around 300,000..
Re: Selecting from a big table. [message #215150 is a reply to message #215055] Fri, 19 January 2007 08:17
cbruhn2
Messages: 41
Registered: January 2007
Member
OK, here is my test table:
drop table mytab;
CREATE TABLE mytab (
  shop_id number,
  item_id number,
  trxn_id date)
PARTITION BY RANGE(trxn_id)
(
PARTITION p1 VALUES LESS THAN (TO_DATE('01-JAN-2004','DD-MON-YYYY')),
PARTITION p2 VALUES LESS THAN (TO_DATE('01-FEB-2004','DD-MON-YYYY')),
PARTITION p3 VALUES LESS THAN (TO_DATE('01-MAR-2004','DD-MON-YYYY')),
PARTITION p4 VALUES LESS THAN (TO_DATE('01-APR-2004','DD-MON-YYYY')),
PARTITION p5 VALUES LESS THAN (TO_DATE('01-MAY-2004','DD-MON-YYYY')),
PARTITION p6 VALUES LESS THAN (TO_DATE('01-JUN-2004','DD-MON-YYYY')),
PARTITION p7 VALUES LESS THAN (TO_DATE('01-JUL-2004','DD-MON-YYYY')),
PARTITION p8 VALUES LESS THAN (TO_DATE('01-AUG-2004','DD-MON-YYYY')),
PARTITION p9 VALUES LESS THAN (TO_DATE('01-SEP-2004','DD-MON-YYYY')),
PARTITION p10 VALUES less THAN (maxvalue)
)
parallel 5 ;
begin
 for i in 1..1000
 loop
 insert into mytab values (i,i,'05-JAN-2004');
 insert into mytab values (i,i,'05-FEB-2004');
 insert into mytab values (i,i,'05-MAR-2004');
 insert into mytab values (i,i,'05-APR-2004');
 insert into mytab values (i,i,'05-MAY-2004');
 insert into mytab values (i,i,'05-JUN-2004');
 insert into mytab values (i,i,'05-JUL-2004');
 end loop;
end;
/


I now run a different test with bitmap indexing, as you have not stated exactly how your indexes are defined.
create bitmap index b_mytab on mytab(trxn_id,shop_id) local nologging;
create index i_mytab on mytab(item_id) nologging;

execute dbms_stats.gather_table_stats('CRB','MYTAB',cascade => true);

By running the following select, I get results very similar to yours in the explain plan:
select ITEM_ID,
       min(trxn_id),
       max(trxn_id)
  from Mytab
 where shop_id = 20
   and item_id != 0
   and trxn_id >= '01-APR-2004'
 group by ITEM_ID
;

Execution Plan
----------------------------------------------------------
Plan hash value: 3999219564
 
---------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |          |     5 |    75 |     3  (34)| 00:00:01 |       |       |        |      |            |
|   1 |  PX COORDINATOR         |          |       |       |            |          |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)   | :TQ10001 |     5 |    75 |     3  (34)| 00:00:01 |       |       |  Q1,01 | P->S | QC (RAND)  |
|   3 |    HASH GROUP BY        |          |     5 |    75 |     3  (34)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
|   4 |     PX RECEIVE          |          |     5 |    75 |     2   (0)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
|   5 |      PX SEND HASH       | :TQ10000 |     5 |    75 |     2   (0)| 00:00:01 |       |       |  Q1,00 | P->P | HASH       |
|   6 |       PX BLOCK ITERATOR |          |     5 |    75 |     2   (0)| 00:00:01 |     5 |    10 |  Q1,00 | PCWC |            |
|*  7 |        TABLE ACCESS FULL| MYTAB    |     5 |    75 |     2   (0)| 00:00:01 |     5 |    10 |  Q1,00 | PCWP |            |
---------------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   7 - filter("SHOP_ID"=20 AND "TRXN_ID" IS NOT NULL AND "ITEM_ID"<>0)
 
 
Statistics
----------------------------------------------------------
         61  recursive calls
          3  db block gets
         73  consistent gets
          0  physical reads
        832  redo size
        550  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          4  sorts (memory)
          0  sorts (disk)
          1  rows processed

If I try to force the statement to use the index on trxn_id (B_MYTAB), I get the following result:
select /*+ index(Mytab,b_mytab) */ ITEM_ID,
       min(trxn_id),
       max(trxn_id)
  from Mytab
 where shop_id = 20
   and item_id != 0
   and trxn_id >= '01-APR-2004'
 group by ITEM_ID
;

Execution Plan
----------------------------------------------------------
Plan hash value: 2241380932
 
---------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |     5 |    75 |    27   (4)| 00:00:01 |       |       |
|   1 |  HASH GROUP BY                      |         |     5 |    75 |    27   (4)| 00:00:01 |       |       |
|   2 |   PARTITION RANGE ITERATOR          |         |     5 |    75 |    26   (0)| 00:00:01 |     5 |    10 |
|*  3 |    TABLE ACCESS BY LOCAL INDEX ROWID| MYTAB   |     5 |    75 |    26   (0)| 00:00:01 |     5 |    10 |
|   4 |     BITMAP CONVERSION TO ROWIDS     |         |       |       |            |          |       |       |
|*  5 |      BITMAP INDEX RANGE SCAN        | B_MYTAB |       |       |            |          |     5 |    10 |
---------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter("ITEM_ID"<>0)
   5 - access("TRXN_ID">=TO_DATE('2004-04-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND "SHOP_ID"=20)
       filter("TRXN_ID">=TO_DATE('2004-04-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND "SHOP_ID"=20)
 
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         30  consistent gets
          0  physical reads
          0  redo size
        550  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
 

As you can see, I now got a different explain plan (big deal :)), but what's more important, my consistent gets are down by more than half (from 73 to 30).
As far as I can see, you didn't provide any statistics output for your query, probably because the load would be too big, so for now I'm guessing.
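
For reference, the plans and statistics above come from SQL*Plus autotrace. To capture the same output on your side, something like this should work (it assumes a PLAN_TABLE exists and you have the PLUSTRACE role, or SELECT on the underlying V$ views):

set autotrace traceonly explain statistics
-- run the SELECT here
set autotrace off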

Hope this is helpful.


Re: Selecting from a big table. [message #215222 is a reply to message #215150] Fri, 19 January 2007 21:50
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
You can try this. It may not be faster, but maybe it's worth a go.

You will need a global b-tree index on ITEM_ID, SHOP_ID, TRXN_DT. Don't bother trying it without this: it won't work. And this is important - you MUST gather statistics - this technique only works with the CBO. No stats means rule-based.
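
As a sketch, the index and the stats gathering could look like this (the index name is just an example; pick whatever fits your naming standards):

create index mytab_item_shop_dt_ix on Mytab (item_id, shop_id, trxn_dt) nologging;
execute dbms_stats.gather_table_stats(user, 'MYTAB', cascade => true);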

select item_id
,      (select min(trxn_dt) 
        from   Mytab 
        where  item_id = a.item_id
        and    shop_id = 20
        and    trxn_dt >= '01-Jan-2005') AS dt1
,      (select max(trxn_dt) 
        from   Mytab 
        where  item_id = a.item_id
        and    shop_id = 20
        and    trxn_dt >= '01-Jan-2005') AS dt2
from   (select distinct item_id 
        from Mytab
        where trxn_dt >= '01-Jan-2005'
        and item_id != 0
       ) a


What I am hoping to achieve here is a single parallel Fast Full Scan to get the distinct ITEM_IDs, and then single row MAX/MIN probes on the index to get the dates.
Honestly, I don't think it will help; the DISTINCT will kill it. But it is the only materially different plan I can come up with. You could also try your original query with a locally partitioned b-tree index on those three columns.
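
If you want to try that locally partitioned alternative, the DDL would be along these lines (again, the index name is just an example):

create index mytab_item_shop_dt_lix on Mytab (item_id, shop_id, trxn_dt) local nologging;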

Ross Leishman
Re: Selecting from a big table. [message #215241 is a reply to message #215222] Sat, 20 January 2007 01:46
a_developer
Messages: 194
Registered: January 2006
Senior Member
Thanks guys! I'll give them all a try.