Slow connect by prior ... start with subquery in 9i

From: Andrew Stewart <cdos_at_claudius.demon.co.uk>
Date: Wed, 31 Mar 2004 13:32:07 +0100
Message-ID: <83el60huvct7r8h3erupibar8cpql1soep@4ax.com>

Has anyone come across a performance problem (compared to 8i) when using hierarchical queries where the START WITH list is generated by a subquery? The culprit seems to be an extra visit to the subquery block as part of the CONNECT BY WITH FILTERING operation.

For example, take a simple tree table (id NUMBER, parentid NUMBER) and a subquery - here just a table called sample (id NUMBER) with a subset of the ids from the tree table - with which to drive the start points of the treewalk:

SELECT parentid, id, label
FROM tree
CONNECT BY PRIOR parentid = id
START WITH id IN
(

SELECT id FROM SAMPLE
)

With the tables populated and analyzed, I get this from 8i:

Execution Plan

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=19) 1 0 CONNECT BY

 2    1     NESTED LOOPS (Cost=1 Card=1280 Bytes=10240)
 3    2       INDEX (FAST FULL SCAN) OF 'ID_PK' (UNIQUE) (Cost=1
Card=1280 Bytes=5120)
 4    2       INDEX (UNIQUE SCAN) OF 'TREE_PK' (UNIQUE)
 5    1     TABLE ACCESS (BY USER ROWID) OF 'TREE'
 6    1     TABLE ACCESS (BY INDEX ROWID) OF 'TREE' (Cost=2 Card=1
Bytes=19)
 7    6       INDEX (UNIQUE SCAN) OF 'TREE_PK' (UNIQUE) (Cost=1

Card=1)

Statistics

     0  recursive calls
     4  db block gets

15687 consistent gets

    59 physical reads
     0 redo size
223313 bytes sent via SQL*Net to client 38276 bytes received via SQL*Net from client    343 SQL*Net roundtrips to/from client

     3  sorts (memory)
     0  sorts (disk)

5120 rows processed

and this is 9i:

Execution Plan

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=19) 1 0 CONNECT BY (WITH FILTERING)

 2    1     NESTED LOOPS
 3    2       NESTED LOOPS (Cost=2 Card=1280 Bytes=10240)
 4    3         INDEX (FAST FULL SCAN) OF 'ID_PK' (UNIQUE) (Cost=2
Card=1280 Bytes=5120)
 5    3         INDEX (UNIQUE SCAN) OF 'TREE_PK' (UNIQUE)
 6    2       TABLE ACCESS (BY USER ROWID) OF 'TREE'
 7    1     NESTED LOOPS
 8    7       BUFFER (SORT)
 9    8         CONNECT BY PUMP
10    7       TABLE ACCESS (BY INDEX ROWID) OF 'TREE' (Cost=2 Card=1
Bytes=19)
11   10         INDEX (UNIQUE SCAN) OF 'TREE_PK' (UNIQUE) (Cost=1
Card=20480)
12    1     INDEX (UNIQUE SCAN) OF 'SAMPLE_PK' (UNIQUE) (Cost=1 Card=1

Bytes=4)

Statistics

     1  recursive calls
     1  db block gets

20525 consistent gets

    72 physical reads
   120 redo size
224681 bytes sent via SQL*Net to client 38281 bytes received via SQL*Net from client    343 SQL*Net roundtrips to/from client

     9  sorts (memory)
     0  sorts (disk)

5120 rows processed

..so, about another 5000 logical reads, corresponding to the extra access of the sample table at the bottom of the query plan. So instead of just visiting the START WITH subquery once, to kick off the treewalk, I seem to be revisiting it for every row returned. Not too bad if that happens to be a unique index scan as here but that's not always the case.

I know I've got new options for re-writing this as a join under 9i, I'm just curious about those extra lookups and why they're necessary.

Cheers - Andrew.

Full test case if you're curious:

DROP TABLE tree;

DROP TABLE sample;

CREATE TABLE tree
(

    id NUMBER,
    parentid NUMBER,
    label VARCHAR2(30),
    CONSTRAINT tree_pk PRIMARY KEY (id)
);

CREATE TABLE sample
(

    id          NUMBER,
    filler      VARCHAR2(2000),

CONSTRAINT sample_pk PRIMARY KEY (id) )
PCTFREE 95
PCTUSED 1; INSERT /*+ APPEND */ INTO tree
(

    id,
    parentid,
    label
)
SELECT ROWNUM,

       DECODE(MOD(ROWNUM-1, 4), 0, NULL, ROWNUM-1),
       TO_CHAR(CEIL(ROWNUM/4)) || ' level ' || TO_CHAR(MOD(ROWNUM-1,

4)+1)
FROM ALL_TAB_COLUMNS, ALL_TAB_COLUMNS
WHERE ROWNUM <= 20480;

COMMIT; INSERT /*+ APPEND */ INTO sample (id, filler) SELECT id, RPAD(id, 2000, '.'||id)
FROM tree
WHERE MOD(id,16) = 0;

COMMIT; ANALYZE TABLE tree COMPUTE STATISTICS;
ANALYZE TABLE sample COMPUTE STATISTICS;

SET AUTOT TRACEONLY SELECT parentid, id, label
FROM tree
CONNECT BY PRIOR parentid = id
START WITH id IN
(

SELECT id FROM SAMPLE
);

SET AUTOT OFF Received on Wed Mar 31 2004 - 06:32:07 CST