Re: Optimization question: Unrolling subquery into IN clause
From: Ind-dba <oraclearora_at_googlemail.com>
Date: Tue, 20 Oct 2009 03:51:32 -0700 (PDT)
Message-ID: <0a668e04-8222-4746-aeb3-f4d878fcfe32_at_b3g2000pre.googlegroups.com>
On Oct 20, 12:14�pm, HansP <Hans-Peter.Sl..._at_atosorigin.com> wrote:
> On Oct 19, 3:04�pm, Wolfram Roesler <w..._at_spam.la> wrote:
>
>
>
>
>
> > Steve Howard <stevedhow..._at_gmail.com> wrote innews:c365a644-242b-4226-92df-1a16bb1fd53c_at_a32g2000yqm.googlegroups.com:
>
> > > What plan does it show when you pass in the list rather than use the
> > > subquery?
>
> > With the expanded list (without subquery) the explain plan looks
> > like this (index1 is the index on tab1.col1):
>
> > -------------------------------------------------------------------------
> > -----------------
> > | Id �| Operation � � � � � � � � � �| Name � � �| Rows �| Bytes | Cost
> > (%CPU)| Time � � |
> > -------------------------------------------------------------------------
> > -----------------
> > | � 0 | SELECT STATEMENT � � � � � � | � � � � � | � 855 | 62415 | � 161 �
> > (0)| 00:00:02 |
> > | � 1 | �INLIST ITERATOR � � � � � � | � � � � � | � � � | � � � | � � � � � �
> > | � � � � �|
> > | � 2 | � TABLE ACCESS BY INDEX ROWID| tab1 � � �| � 855 | 62415 | � 161 �
> > (0)| 00:00:02 |
> > |* �3 | � �INDEX RANGE SCAN � � � � �| index1 � �| � 855 | � � � | � �44 �
> > (0)| 00:00:01 |
> > -------------------------------------------------------------------------
> > -----------------
>
> > Outline Data
> > -------------
>
> > � /*+
> > � � � BEGIN_OUTLINE_DATA
> > � � � INDEX_RS_ASC(_at_"SEL$1" "tab1"@"SEL$1" ("tab1"."col2"))
> > � � � OUTLINE_LEAF(_at_"SEL$1")
> > � � � ALL_ROWS
> > � � � OPTIMIZER_FEATURES_ENABLE('10.2.0.3')
> > � � � IGNORE_OPTIM_EMBEDDED_HINTS
> > � � � END_OUTLINE_DATA
> > � */
>
> > Predicate Information (identified by operation id):
> > ---------------------------------------------------
>
> > � �3 - access(("col2"='Value1' OR "col2"='Value2'
> > � � � � � � � ...
> > � � � � � � � "col2"='Value20') AND "col1"='Value')
>
> > > Also, are these bind variables or literals you are using in
> > > the actual query? �I ask because you said it should only return 300
> > > rows total, so this looks like the problem...
>
> > I'm using bind variables in my application, which makes it even slower.
> > The plan with bind variables is something like this (this is from tkprof
> > and from the slightly more complex version of the query issued by
> > the application):
>
> > Rows � � Execution Plan
> > ------- �---------------------------------------------------
> > � � � 0 �SELECT STATEMENT � MODE: ALL_ROWS
> > � � 260 � SORT (ORDER BY)
> > � � 260 � �HASH JOIN (RIGHT OUTER)
> > �651633 � � TABLE ACCESS � MODE: ANALYZED (FULL) OF 'tab2' (TABLE)
> > �651633 � � HASH JOIN
> > �651633 � � �TABLE ACCESS � MODE: ANALYZED (FULL) OF 'tab2' (TABLE)
> > �651633 � � �TABLE ACCESS � MODE: ANALYZED (FULL) OF 'tab1' (TABLE)
>
> > Thanks for your help
> > W. R�sler
>
> Hi Wolfram,
>
> If the sub query yields 20 distinct values and no null values, I think
> you could try to rewrite the query as:
>
> SELECT tab1.several_columns
> FROM tab1, tab2
> WHERE tab.col1='Value'
> and tab2.col3='Something'
> and tab1.col2=tab2.key
>
> Regards Hansp- Hide quoted text -
>
> - Show quoted text -
);
Date: Tue, 20 Oct 2009 03:51:32 -0700 (PDT)
Message-ID: <0a668e04-8222-4746-aeb3-f4d878fcfe32_at_b3g2000pre.googlegroups.com>
On Oct 20, 12:14�pm, HansP <Hans-Peter.Sl..._at_atosorigin.com> wrote:
> On Oct 19, 3:04�pm, Wolfram Roesler <w..._at_spam.la> wrote:
>
>
>
>
>
> > Steve Howard <stevedhow..._at_gmail.com> wrote innews:c365a644-242b-4226-92df-1a16bb1fd53c_at_a32g2000yqm.googlegroups.com:
>
> > > What plan does it show when you pass in the list rather than use the
> > > subquery?
>
> > With the expanded list (without subquery) the explain plan looks
> > like this (index1 is the index on tab1.col1):
>
> > -------------------------------------------------------------------------
> > -----------------
> > | Id �| Operation � � � � � � � � � �| Name � � �| Rows �| Bytes | Cost
> > (%CPU)| Time � � |
> > -------------------------------------------------------------------------
> > -----------------
> > | � 0 | SELECT STATEMENT � � � � � � | � � � � � | � 855 | 62415 | � 161 �
> > (0)| 00:00:02 |
> > | � 1 | �INLIST ITERATOR � � � � � � | � � � � � | � � � | � � � | � � � � � �
> > | � � � � �|
> > | � 2 | � TABLE ACCESS BY INDEX ROWID| tab1 � � �| � 855 | 62415 | � 161 �
> > (0)| 00:00:02 |
> > |* �3 | � �INDEX RANGE SCAN � � � � �| index1 � �| � 855 | � � � | � �44 �
> > (0)| 00:00:01 |
> > -------------------------------------------------------------------------
> > -----------------
>
> > Outline Data
> > -------------
>
> > � /*+
> > � � � BEGIN_OUTLINE_DATA
> > � � � INDEX_RS_ASC(_at_"SEL$1" "tab1"@"SEL$1" ("tab1"."col2"))
> > � � � OUTLINE_LEAF(_at_"SEL$1")
> > � � � ALL_ROWS
> > � � � OPTIMIZER_FEATURES_ENABLE('10.2.0.3')
> > � � � IGNORE_OPTIM_EMBEDDED_HINTS
> > � � � END_OUTLINE_DATA
> > � */
>
> > Predicate Information (identified by operation id):
> > ---------------------------------------------------
>
> > � �3 - access(("col2"='Value1' OR "col2"='Value2'
> > � � � � � � � ...
> > � � � � � � � "col2"='Value20') AND "col1"='Value')
>
> > > Also, are these bind variables or literals you are using in
> > > the actual query? �I ask because you said it should only return 300
> > > rows total, so this looks like the problem...
>
> > I'm using bind variables in my application, which makes it even slower.
> > The plan with bind variables is something like this (this is from tkprof
> > and from the slightly more complex version of the query issued by
> > the application):
>
> > Rows � � Execution Plan
> > ------- �---------------------------------------------------
> > � � � 0 �SELECT STATEMENT � MODE: ALL_ROWS
> > � � 260 � SORT (ORDER BY)
> > � � 260 � �HASH JOIN (RIGHT OUTER)
> > �651633 � � TABLE ACCESS � MODE: ANALYZED (FULL) OF 'tab2' (TABLE)
> > �651633 � � HASH JOIN
> > �651633 � � �TABLE ACCESS � MODE: ANALYZED (FULL) OF 'tab2' (TABLE)
> > �651633 � � �TABLE ACCESS � MODE: ANALYZED (FULL) OF 'tab1' (TABLE)
>
> > Thanks for your help
> > W. R�sler
>
> Hi Wolfram,
>
> If the sub query yields 20 distinct values and no null values, I think
> you could try to rewrite the query as:
>
> SELECT tab1.several_columns
> FROM tab1, tab2
> WHERE tab.col1='Value'
> and tab2.col3='Something'
> and tab1.col2=tab2.key
>
> Regards Hansp- Hide quoted text -
>
> - Show quoted text -
As you said in your first post:
"However, only 20 rows of tab2 match the subquery, and only 300
rows of tab1 match the entire query."
Could you try the following
SELECT several_columns
FROM tab1
WHERE col1='Value'
AND col2 IN
(
SELECT /*+ cardinality(20) */ key FROM tab2 WHERE tab2.col3='Something'
);
This way the hint will be telling optimizer on the no of rows that should be returned by subquery.
If parameters are passed as binds - could you rewrite the query with a dummy comment to confirm the sql gets parsed again - this is rule of any bind peeking/histogram effect. Do you have histograms on the table/ columns?
Regards,
Sachin
Received on Tue Oct 20 2009 - 05:51:32 CDT