Re: A question about huge difference in cardinality of a query with 3 predicates

From: Andy Sayer <andysayer_at_gmail.com>
Date: Sat, 28 Oct 2017 18:20:06 +0100
Message-ID: <CACj1VR4UV_U31Fffc_3+pusbWvTUjkQJMnKLtB0Yk2RZR25pSQ_at_mail.gmail.com>



Hi Nirav,

Height balanced histograms are rarely a good idea, and histograms are not needed at all unless your data is skewed (e.g. status columns are never 50% active, 50% inactive so a histogram would tell the optimizer 99% are active; although your status seems to be more detailed). A height balanced histogram on your primary key sounds like a silly idea, it's unique! However, you're not touching it here, so it's irrelevant but leads me to believe you think histograms are always a good thing. You should probably read Neil Chandler's paper/presentation 'Histograms are Evil like Chocolate is Evil'
https://www.doag.org/formes/servlet/DocNavi?action=getFile&did=9483234&key= or google 'Oracle Histograms are Chocolate', it covers the new histograms in 12c but remember that you are on 11.2.0.4 still so all you have is frequency / height balanced / none.

You could do queries like
select column_name, count(*) from table_name group by column_name order by count(*);
Is there lots of skew? Are they all pretty much the same counts? Do you have a range of values that pretty much have the same counts and then outside that there's a difference? Think about how likely it is for new inserts to contain new values, they're more likely to be covered by just low and high values than a frequency histogram.

I would do a similar query for each column as the filter (using the same equality condition) and compare each expected rows against your count. If they're all okay then the problem could be that your columns are related, you can identify which ones by doing similar queries but with two column filters. If that's the case then a column group could be necessary in the statistics. If you do regather statistics then think about what really benefits from a histogram.

What we can do with the information given would be to make sure your stats actually cover the values you're looking for, you've only shared low and high values so we can peek at those using scripts similar to

ANDY_at_pdb1>get raw_string
  1 var raw varchar2(100 char);
  2 exec :raw := '&1'
  3 declare
  4 sString varchar2(100);
  5 begin
  6 dbms_stats.convert_raw_value(:raw, sString);   7 dbms_output.put_line(sString );
  8* end;

You could have a look at dba_tab_histograms to get the values used in each bucket.

The one that sticks out is event_type_id, you're looking for 5014, and the stats say the low and high values are:
ANDY_at_pdb1>_at_raw_number C102
1
ANDY_at_pdb1>_at_raw_number C21F07
3006

There's a problem. Your 5014 is outside the low and high value range, maybe that's because it's a new value and you hadn't gathered stats then. Since it's 0.67* the range outside the range, expect the CBO to assume that filtering on event_Type_id = 5014 is going to reduce the result set by a lot. Once you've addressed this (by gathering statistics at the right time, maybe with just size 1 used in your method_opt clause), you should take a look at your new cardinality, it may have gotten better but it also could have gotten worse.

You should also bear in mind that because you're dealing with bind variables, unless the ones you are using were peeked the actual execution plan might be assuming very different values. If your bind variables are on the columns that you need histograms on then you probably want a plan per value used, you could either let your fate be decided by the ACS Gods or you could just decide to use a literal instead - personally I would drop the histogram, keep the binds and just accept that there is probably one plan that is good enough for all reasonable values.

--I was just about to hit send and realised that the peeked value for your
event_type_id filter was 3004, that's within your low and high value. Is this the actual value you used when you ran the query? Had you run the same query before with different binds? Was the cardinality correct for that time?

Regards,
Andrew

--

http://www.freelists.org/webpage/oracle-l Received on Sat Oct 28 2017 - 19:20:06 CEST

Original text of this message