Re: "order by" performance behavior

From: Sybrand Bakker <postbus_at_sybrandb.demon.nl>
Date: Sun, 26 Mar 2006 07:20:42 +0200
Message-ID: <1u8c2217apoijr55umjhq6sfb1rjr5brqq@4ax.com>

On 25 Mar 2006 19:24:59 -0800, "ewong" <ewong_at_mail.com> wrote:

>I have a table "result" with 5M records. I am doing this query:
>
>select distinct project
>from result order by project;
>
>It returns 50 distinct projects in order. The problem is this query
>takes over 30 seconds to complete. The project column is indexed but
>explain plan shows that it's not picking up the index. 4M out of 5M of
>the project field has NULL values. I heard that NULL value are not
>indexed so I guess that's the reason oracle isn't picking up the index?
>
>So I tried this different query:
>
>select project from (
>select distinct project from result)
>order by project;
>
>And this query returns instanteously! According to explain plan oracle
>is using the index on the project column. So why is the difference?
>The query is generated by an app so I can't force it to use the 2nd
>query. Below are the two explain plans. I am using 9.2 EE on Solaris
>10. Cost based. Table has full statistics. Thanks.
>
>-------------------------------
>
>select distinct project
>from result order by project; -- take 30 seconds --
>
> 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2128 Card=43
>Bytes=129)
> 1 0 SORT* (UNIQUE) (Cost=1715 Card=43 Bytes=129)
>:Q292625001
> 2 1 TABLE ACCESS* (FULL) OF 'RESULT' (Cost=504 Card=64752
>Bytes=19425675)
>:Q292625000
> 1 PARALLEL_TO_SERIAL SELECT DISTINCT C0 C0 FROM :Q292625000 ORDER BY
>C0
> 2 PARALLEL_TO_PARALLEL SELECT /*+ NO_EXPAND ROWID(A1) */
>A1."PROJECT" C0 FROM "RESULT" PX_GR
>
>--------------------------------
>
>select project from (
>select distinct project from result)
>order by project; -- returns instanteously --
>
> 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=3098 Card=43
>Bytes=2236)
> 1 0 VIEW (Cost=3098 Card=43 Bytes=2236)
> 2 1 SORT (UNIQUE) (Cost=3098 Card=43 Bytes=129)
> 3 2 BITMAP INDEX (FAST FULL SCAN) OF 'IDX_RESULT_PROJECT'
>
>------------------------------

Nulls are not being indexed (which you don't need to 'hear', because it is documented).
Consequently those two queries won't return identical results. You can't change the query, but you can disable parallelel query. Using parallel query doesn't help if your number of processors is low (you seem to have 2 processors) and you didn't stripe your data over multiple disks.
Tom Kyte states in one his of books , parallel query is only useful when you have 16 processors or better and you are a datawarehouse.

--
Sybrand Bakker, Senior Oracle DBA

Received on Sat Mar 25 2006 - 23:20:42 CST