Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Re: Why DISTINCT is costly?
Robert Klemme wrote:
> So we can leave that out for the moment. Now I'm still wondering where
> exactly the difference is in processing these two selects. It seems there
> is a cost difference attached to SORT GROUP BY vs. SORT UNIQUE but why?
> Is it just that the latter benefits filled caches? Or is there some kind
> of algorithmic difference between SORT GROUP BY and SORT UNIQUE?
There must be some algorithmic difference. Without access to source code, I can't say for sure. But I can test indirectly.
Consider this, for 9.2.0.6:
SQL> create table t1 as select * from all_indexes;
Table created.
Elapsed: 00:00:00.25
(now create a few duplicates so we can test this)
SQL> insert into t1 select * from t1;
746 rows created.
Elapsed: 00:00:00.01
SQL> /
1492 rows created.
Elapsed: 00:00:00.02
etcetc...
SQL> /
47744 rows created.
Elapsed: 00:00:00.81
(now put an index there)
SQL> create index i_t1 on
t1(owner,index_name,index_type,table_owner,
table_name,table_type,uniqueness,
compression,prefix_length,tablespace_name);
Index created.
Elapsed: 00:00:09.96
SQL> analyze table t1 compute statistics;
Table analyzed.
Elapsed: 00:00:03.37
SQL> truncate table plan_table;
Table truncated.
Elapsed: 00:00:00.03
SQL> explain plan for
2 select
3
owner,index_name,index_type,table_owner,table_name,table_type,uniqueness,
4 compression,prefix_length,tablespace_name
5 from t1
6 group by
7
owner,index_name,index_type,table_owner,table_name,table_type,uniqueness,
8 compression,prefix_length,tablespace_name
9 ;
Explained.
Elapsed: 00:00:00.03
SQL> select * from table(dbms_xplan.display);
Press Return to continue...
PLAN_TABLE_OUTPUT
| 0 | SELECT STATEMENT | | 95488 | 6807K| 1209 | | 1 | SORT GROUP BY NOSORT| | 95488 | 6807K| 1209 | | 2 | INDEX FULL SCAN | I_T1 | 95488 | 6807K| 1209 | --------------------------------------------------------------------Note: cpu costing is off
There must be some different algorithm at play here. My guess is DISTINCT considers each column at a time whereas group by concatenates all the columns into the equivalent of a raw column and then acts on that. But I have no way of proving that.
-- Cheers Nuno Souto in sunny Sydney, Australia wizofaus_at_gmaildotcomdotnospamReceived on Wed Sep 14 2005 - 21:35:08 CDT