Re: Using bind variables for multiple values
Date: Mon, 24 Nov 2008 22:15:24 +0100
Message-ID: <6p0jreF5rm09U1@mid.individual.net>
On 24.11.2008 17:26, Jonathan Lewis wrote:
> "Robert Klemme" <shortcutter_at_googlemail.com> wrote in message
> news:6c7ef223-0934-4cf9-a44b-ad268937b86f_at_j32g2000yqn.googlegroups.com...
>> I wondered: why is it that the join version does a full table scan but >> the IN ( subselect ) version does not? I did a 10053 trace and found >> out that the same access strategy which leads to the INDEX UNIQUE SCAN >> in the case of IN ( subselect ) is costed much higher (~ factor of 30) >> in the JOIN case. BASE STATISTICS are identical in both cases and it >> seems the culprit is the join cardinality (255 vs. 8168) which seems >> derived from the outer table cardinality. The big myth to me is: where >> does the CBO get the cardinality of 255 from all of a sudden since it >> originally assumed 8168?
>
> If you check the cardinality, you will see that it is 255.25, which
> is EXACTLY 8168 / 32.
Yes that's what the report for the join analysis said. I didn't bother to quote the fractional digits in the text because I did not notice this relationship. Thanks for the heads up!
> When Oracle doesn't know what the real statistics are, it uses some
> arbitrary estimate. In this case, because the IN approach transforms
> into an inline 'select distinct column_value' and Oracle doesn't know
> how many distinct values there are, I guess is simply assumes 32.
>
> Just as 8168 is a side effect of the default block size, this 32 may also
> be based on the block size. So if you've got a database with a
> different block size handy, you might like to test what happens there.
Hm... I'll see whether I find the time to retest with a different non standard sized tablespace.
Interestingly an explicit DISTINCT does not save the join and we now get a hash join - concluding from the rows the CBO does not assume any duplicate rows - which is in line with the 10053 trace of that run, which assumes 8162 rows for the subquery:
SQL> timing start "Join and distinct"
SQL> select length(dat) from t1 join ( select distinct column_value from
table(iset(-1,2,3)) ) tt on tt.column_value = t1.id
2 /
LENGTH(DAT)
90 90
Execution Plan (truncated bytes and other cols for readability)
Plan hash value: 731532569
| Id | Operation | Name | Rows | ------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 8168 | |* 1 | HASH JOIN | | 8168 | | 2 | VIEW | | 8168 | | 3 | HASH UNIQUE | | 8168 | | 4 | COLLECTION ITERATOR CONSTRUCTOR FETCH| | | | 5 | TABLE ACCESS FULL | T1 | 1000K| ------------------------------------------------------------------
Predicate Information (identified by operation id):
1 - access("TT"."COLUMN_VALUE"="T1"."ID")
Statistics
1 recursive calls 0 db block gets 14192 consistent gets 0 physical reads 0 redo size 363 bytes sent via SQL*Net to client 377 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2 rows processed
SQL> timing stop
timing for: Join and distinct
Elapsed: 00:00:00.23
Basically since the IN with subselect is the version that suits my needs best I am quite happy that it is blessed with a good plan. Although I have to say that it makes me a bit wary that I do not exactly understand what goes on in the other cases. I guess it's time for "CBO Fundamentals" - if only I had enough time...
Again, thanks for your help!
Kind regards
robert Received on Mon Nov 24 2008 - 15:15:24 CST