Re: Better cardinality estimate when dialing optimizer_features_enable back

From: Neil Kodner <nkodner_at_gmail.com>
Date: Tue, 24 Nov 2009 15:12:01 -0500
Message-ID: <ae2c46ba0911241212p54ec7a9fw798d701554fcc129_at_mail.gmail.com>



Sorry, I fixed permissions on the two google spreadsheets.

Here's the histogram
http://bit.ly/8wCq5k

and the table/index/column stats
http://bit.ly/7aWLOK

To answer your initial question, I think index access does in fact make sense. I'm retrieving anywhere from one to a thousand rows from letter_bin per call. This query is executed thousands of times as part of a batch job.

As a frame of reference, the table represents letters being mailed to an employer(prty_id), for every employee. There are many small businesses  each having a few rows. There are medium and large businesses that have many rows. And finally there are employment agencies which have a LOT of rows. Hopefully this explains the distribution of data.

As far as the transitive closure issue, there are also other queries which query letter_bin by party_id without the join to parties - those are affected as well. The cardinality is always incorrect for any query by letter_bin.party_id.

On Tue, Nov 24, 2009 at 2:56 PM, Randolf Geist <info_at_sqltools-plusplus.org>wrote:

> > Here is the histogram on letter_bin.prty_id
> >
> > http://spreadsheets.google.com/ccc?key=0ArUTHdqN0J--dDVXZ1ZPTGlROUxfU0F
> > XV3RNdFVINlE&hl=en
> >
> > I've never manually crafted a histogram before.
>
> Neil,
>
> I can't access that Google spreadsheet - it tells me I lack privileges to
> do so.
>
> How many distinct values are in the column - how is the actual
> distribution?
>
> Are there values that are so rare that the index access makes actually
> sense? If not consider simply removing the histogram - or consider removing
> the index.
>
> Instead of fiddling with the histogram itself you can simply try to set the
> DENSITY manually of the column using DBMS_STATS.SET_COLUMN_STATS to the
> value currently shown in DENSITY - according to Alberto's post this should
> revert to the old density calculation using the DENSITY value (the new
> calculation ignores the value from the dictionary), but I haven't tested
> this recently myself. That should modify the cardinality estimate of
> unpopular column values.
>
> The plans that you've posted: There is transitive closure taking place on
> PRTY_ID between LETTER_BIN and PARTIES - which leads to the side effect that
> the optimizer removed the join condition between LETTER_BIN and PARTIES.
> With that in place the cardinality estimate might be lower and closer to
> reality. It's just a workaround and does not address the root cause but what
> do you get if you add manually the "PARTIES.PRTY_ID = <val>" to the queries?
>
> This workaround should retain the join predicate and thus lower the
> cardinality estimate.
>
> Regards,
> Randolf
>
> Oracle related stuff blog:
> http://oracle-randolf.blogspot.com/
>
> Co-author of the forthcoming "OakTable Expert Oracle Practices" book:
> http://www.apress.com/book/view/1430226684
>
> http://www.amazon.de/Expert-Oracle-Practices-Database-Administration/dp/1430226684
> ______________________________________________________
> GRATIS für alle WEB.DE-Nutzer: Die maxdome Movie-FLAT!
> Jetzt freischalten unter http://movieflat.web.de
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Nov 24 2009 - 14:12:01 CST

Original text of this message