Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> [SOLVED] Re: performance tuning questions: replace IN (values) by JOIN
For those interested, the complete solution was to:
select /*+ HINT BAG_LOG_IDX_DTM_TIMESTAMP */ BAGS.ID as BAGS__ID, [...],
BAG_LOG.* FROM BAGS, BAG_LOG WHERE (BAGS.ID = BAG_LOG.BAG_ID AND BAG_LOG.DTM_TIMESTAMP BETWEEN (TO_DATE('2003-19-08 00:00:00', 'yyyy-dd-mm HH24:MI:SS')) AND (TO_DATE('2003-19-08 12:00:00', 'yyyy-dd-mm HH24:MI:SS')) ) ORDER BY BAGS.ID DESC, BAG_LOG.DTM_TIMESTAMP DESC;
query went from 18 sec. to 0.02 seconds.
Thanks all, I owe you one
Jerome
PS: for those who might wonder I will rename this index as its name his very bad...
Jérôme Lacoste - CoffeeBreaks wrote:
> Hi,
>
> I am trying to improve an SQL query design while still keeping its
> performance.
>
>
> DB Model
> --------
> I have 2 tables, one called BAGS the other called BAG_LOG. Each has a
> primary key called ID. The BAG_LOG table has a foreign key called BAG_ID
> pointing to BAGS.ID.
>
> The relation is: a BAG has 1 or more BAG_LOG(s).
>
> The BAG_LOG table has a DTM_TIMESTAMP column.
>
> There are 40000 rows in BAGS and 170000 in BAG_LOG.
>
> There's no specific INDEX appart the ones generated automatically on the
> primary and foreign key.
>
>
>
> Query to improve
> ----------------
>
> Let's say I want to find all BAGS who have had a TIMESTAMP within a
> specific period. Additional criteria will add later on.
>
>
> One solution is to use 3 queries:
>
> select DISTINCT(BAG_ID) FROM BAG_LOG
> WHERE DTM_TIMESTAMP >= (TO_DATE('2003-19-08 0:00:00',
> 'yyyy-dd-mm HH24:MI:SS'))
> AND DTM_TIMESTAMP <= (TO_DATE('2003-19-08 2:00:00',
> 'yyyy-dd-mm HH24:MI:SS'));
>
> SELECT * FROM BAGS where ID IN
> (346564,346565,346566,346567,346568,346569);
>
> SELECT * FROM BAG_LOG where BAG_ID IN
> (346564,346565,346566,346567,346568,346569);
>
> The first query find the BAG IDs who satisfy the condition, the second
> and third queries find the rows of the BAG and BAG_LOG table that match
> the result of the first query. The second and third query are built
> dynamically by the calling program, based on the results returned by the
> first query.
>
>
> This has two drawbacks:
> - it doesn't exploit SQL capabilities
> - by using IN (list of values) one can hit a limitation if the first
> query returns more than a certain number of rows. (Oracle 9 has a
> default limit of 1000 possible values).
>
> It has one advantage
> - it is very simple SQL and will probably work on all DB I know.
>
>
>
> Possible improvement?
> ---------------------
>
> The 3 queries are joined into one. The columns are renamed so to be able
> to retrieve the values by column name (otherwise identical BAGS and
> BAG_LOG column names conflict).
>
> select BAGS.ID as BAGS__ID, [...],
> BAG_LOG.* FROM BAGS, BAG_LOG
> WHERE (BAGS.ID = BAG_LOG.BAG_ID AND
> BAGS.ID IN (select DISTINCT(BAG_ID) FROM BAG_LOG
> WHERE DTM_TIMESTAMP >= (TO_DATE('2003-19-08
> 00:00:00', 'yyyy-dd-mm HH24:MI:SS'))
> AND DTM_TIMESTAMP <= (TO_DATE('2003-19-08
> 12:00:00', 'yyyy-dd-mm HH24:MI:SS'))
> )
> ) ORDER BY BAGS.ID DESC, BAG_LOG.DTM_TIMESTAMP DESC;
>
> replace [...] by the full list of fields from the BAGS table one wants
> to retrieve. In my case all.
>
> [Note that I also added sorting in that query, but it doesn't seem to
> affect the performance in my case].
>
>
>
> Performance results & analysis:
> -------------------------------
>
> - (1) set of queries takes about 14 sec to perform
> - (2) takes 18 secs to perform, a 25% performance loss
>
>
> Here are some info using the autotrace function:
>
>
> ------ (1) --------
> 6 rows selected.
>
> Elapsed: 00:00:07.01
>
> Execution Plan
> ----------------------------------------------------------
> 0 SELECT STATEMENT Optimizer=CHOOSE
> 1 0 SORT (UNIQUE)
> 2 1 TABLE ACCESS (FULL) OF 'BAG_LOG'
>
>
>
>
> Statistics
> ----------------------------------------------------------
> 0 recursive calls
> 0 db block gets
> 34199 consistent gets
> 34184 physical reads
> 0 redo size
> 456 bytes sent via SQL*Net to client
> 499 bytes received via SQL*Net from client
> 2 SQL*Net roundtrips to/from client
> 1 sorts (memory)
> 0 sorts (disk)
> 6 rows processed
>
>
> 6 rows selected.
>
> Elapsed: 00:00:00.03
>
> Execution Plan
> ----------------------------------------------------------
> 0 SELECT STATEMENT Optimizer=CHOOSE
> 1 0 CONCATENATION
> 2 1 TABLE ACCESS (BY INDEX ROWID) OF 'BAGS'
> 3 2 INDEX (UNIQUE SCAN) OF 'PK_BAGS1' (UNIQUE)
> 4 1 TABLE ACCESS (BY INDEX ROWID) OF 'BAGS'
> 5 4 INDEX (UNIQUE SCAN) OF 'PK_BAGS1' (UNIQUE)
> 6 1 TABLE ACCESS (BY INDEX ROWID) OF 'BAGS'
> 7 6 INDEX (UNIQUE SCAN) OF 'PK_BAGS1' (UNIQUE)
> 8 1 TABLE ACCESS (BY INDEX ROWID) OF 'BAGS'
> 9 8 INDEX (UNIQUE SCAN) OF 'PK_BAGS1' (UNIQUE)
> 10 1 TABLE ACCESS (BY INDEX ROWID) OF 'BAGS'
> 11 10 INDEX (UNIQUE SCAN) OF 'PK_BAGS1' (UNIQUE)
> 12 1 TABLE ACCESS (BY INDEX ROWID) OF 'BAGS'
> 13 12 INDEX (UNIQUE SCAN) OF 'PK_BAGS1' (UNIQUE)
>
>
> Statistics
> ----------------------------------------------------------
> 0 recursive calls
> 0 db block gets
> 18 consistent gets
> 0 physical reads
> 0 redo size
> 915 bytes sent via SQL*Net to client
> 499 bytes received via SQL*Net from client
> 2 SQL*Net roundtrips to/from client
> 0 sorts (memory)
> 0 sorts (disk)
> 6 rows processed
>
>
> 12 rows selected.
>
> Elapsed: 00:00:07.07
>
> Execution Plan
> ----------------------------------------------------------
> 0 SELECT STATEMENT Optimizer=CHOOSE
> 1 0 TABLE ACCESS (FULL) OF 'BAG_LOG'
>
>
>
>
> Statistics
> ----------------------------------------------------------
> 0 recursive calls
> 0 db block gets
> 34200 consistent gets
> 34185 physical reads
> 0 redo size
> 1774 bytes sent via SQL*Net to client
> 499 bytes received via SQL*Net from client
> 2 SQL*Net roundtrips to/from client
> 0 sorts (memory)
> 0 sorts (disk)
> 12 rows processed
>
>
>
> ------ (2) ------
>
> 12 rows selected.
>
> Elapsed: 00:00:18.06
>
> Execution Plan
> ----------------------------------------------------------
> 0 SELECT STATEMENT Optimizer=CHOOSE
> 1 0 SORT (ORDER BY)
> 2 1 MERGE JOIN
> 3 2 SORT (JOIN)
> 4 3 NESTED LOOPS
> 5 4 TABLE ACCESS (FULL) OF 'BAG_LOG'
> 6 4 TABLE ACCESS (BY INDEX ROWID) OF 'BAGS'
> 7 6 INDEX (UNIQUE SCAN) OF 'PK_BAGS1' (UNIQUE)
> 8 2 SORT (JOIN)
> 9 8 VIEW OF 'VW_NSO_1'
> 10 9 SORT (UNIQUE)
> 11 10 TABLE ACCESS (FULL) OF 'BAG_LOG'
>
>
>
>
> Statistics
> ----------------------------------------------------------
> 0 recursive calls
> 26 db block gets
> 345418 consistent gets
> 71578 physical reads
> 0 redo size
> 2422 bytes sent via SQL*Net to client
> 499 bytes received via SQL*Net from client
> 2 SQL*Net roundtrips to/from client
> 3 sorts (memory)
> 1 sorts (disk)
> 12 rows processed
>
>
>
>
>
> I clearly see that my second solution uses much more memory (consistent
> gets is about 3 times the total used by the first 3 queries), and also
> do 2 more sorts in memory and one in the disk.
>
> But I do not see any obvious design flaw in my query....
>
>
>
> -- Questions --
> ---------------
>
> Q: Is the performance loss due to the NESTED LOOPS? or is it due to the
> JOIN in itself ?
>
> Q: I was wondering what kind of query should I use in order to manage to
> improve the design of my query (1) especially removing its limitations
> without introducing the performance loss as with my second query. Should
> I use a temporary table?
>
> Q: [slightly unrelated] How one can enable the autotrace to also display
> (Cost, Card and Byte) as shown in most of Oracle examples?
>
> Any help appreciated...
>
>
> Cheers,
>
> Jerome
>
Received on Sun Aug 31 2003 - 03:36:21 CDT