Re: Dynamic Sampling: some questions about the guts

From: David Aldridge <david_at_david-aldridge.com>
Date: Sat, 1 Dec 2007 14:05:38 -0800 (PST)
Message-ID: <209004.44232.qm@web807.biz.mail.mud.yahoo.com>

I wouold guess that this might be to do with the optimizer looking at different access paths to the table. For example, it might be not be much more selective to access the table via

"RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1

AND "RCRAPP1"."RCRAPP1_INFC_CODE"=:bind2 AND "RCRAPP1"."RCRAPP1_CURR_REC_IND"=:bind3 AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4

than it is to access it with:

"RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1

AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4

If there is one index on all four columns and one index on just the two columns then accessing via the two column index might be more efficient than using the four-column one.

Just speculation though.

In one of our reporting environments (OSEE 10.2.0.2) we have optimizer_dynamic_sampling cranked up to 5. We have a case where dynamic sampling is pushing the CBO to choose a poor plan. Our situation highlights Level 4, specifically "tables that have single-table predicates that reference 2 or more columns". (from the documentation)

Looking at a 10046 trace, I see a OPT_DYN_SAMP query that counts a sample of rows that match variations on the predicate. I have found that a 10046 can always be good grounds for finding more questions than it answers. *grin*

To avoid the bandwidth, I will show just the relevant parts of the sampling query: SELECT /* OPT_DYN_SAMP */
...
FROM
(SELECT /* ... */
1 AS C1,
CASE

    WHEN     "RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1 
         AND "RCRAPP1"."RCRAPP1_INFC_CODE"=:bind2 
         AND "RCRAPP1"."RCRAPP1_CURR_REC_IND"=:bind3 
         AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4

    THEN 1
    ELSE 0
   END AS C2,
   CASE

    WHEN     "RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1
         AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4

    THEN 1
    ELSE 0
   END AS C3,
   CASE

    WHEN     "RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1
         AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4

    THEN 1
    ELSE 0
   END AS C4
FROM ...
SAMPLE BLOCK (.037704 , 1) SEED (1) "RCRAPP1" ) SAMPLESUB
/

First question: Why the predicate variations? I am assuming that sometimes C4 is different than C3, so I am going to ignore C4 for now. 2) How does the Optimizer use these sample counts to influence the costs and subsequent access/join methods?

I believe what is happening in our case is that sampling is "just unlucky" and it happens to come up with a inaccurate picture of how many rows match (an order of magnitude too many). I can prove this by using this query directly with different sample sizes.

I am hoping that by understanding dynamic sampling a little better, I can understand how to resolve this issue. I realize that one can add the dynamic_sampling query hint, and perhaps that is the "best" solution for this. Aside from adjusting dynamic sampling (whether it be at the system level or via a hint), is there any other way to address this situation?

TIA,

-- 
Charles Schultz 

--
http://www.freelists.org/webpage/oracle-l

Received on Sat Dec 01 2007 - 16:05:38 CST