RE: Optimizer estimated cardinality very low

From: Noveljic Nenad <nenad.noveljic_at_vontobel.com>
Date: Wed, 14 Oct 2020 08:48:47 +0000
Message-ID: <5807_1602665348_5F86BB84_5807_3672_1_b3747b77b05047e48566fdf0f7db8b76_at_vontobel.com>



Could you try with

alter session set "_fix_control"='21802552:off'; Best regards,

Nenad

https://nenadnoveljic.com/blog

From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> On Behalf Of Jonathan Lewis Sent: Mittwoch, 14. Oktober 2020 10:44
To: Daniel Coello <coello.daniel_at_gmail.com> Cc: Oracle L <Oracle-L_at_freelists.org> Subject: Re: Optimizer estimated cardinality very low

Daniel,

I've lost count of the number of times I've said: "you hardly ever need to read the 10053 (CBO) trace file"; but in this case I think I need to see the 10053 trace file. I suspect the issue is something to do with the hybrid histogram and the fact that the July data is the top partition of the table but I can't think of anything that would make the cardinality estimate drop to 1.

If you could enable event 10053 ('alter session set events '10053 trace name context forever'; and execute the june and july statements and send me the trace files (zip'ed would be good). Please start a fresh session for each query so that they are in different files, and to make sure that the optimizer does actually optimize them afresh (which is when the 10053 trace is generated) add small comment to each statement.

Regards
Jonathan Lewis

On Tue, 13 Oct 2020 at 22:30, Daniel Coello <coello.daniel_at_gmail.com<mailto:coello.daniel_at_gmail.com>> wrote: Thank you for the feedback,

The points brought up about the data are accurate and it is what would be expected with exception of the high_value for the july dates (july/21). I checked on the date values to ensure there were no inconsistencies with the statistics information; I found them to be accurate regarding low/high values for july partitions:

DBA_PART_COL_STATISTICS

TABLE_NAME    COLUMN_NAME    NUM_DISTINCT   LOW_VALUE         HIGH_VALUE        HISTOGRAM
ORDER_REQST   ORDR_RQST_DATE    3           78780707010101    78780715010101    FREQUENCY
ORDER_DETL    ORDR_RQST_DATE    3           78780707010101    78780715010101    FREQUENCY

DBA_PART_HISTOGRAMS

TABLE_NAME    PARTITION_NAME    COLUMN_NAME    ENDPOINT_ACTUAL_VALUE    ENDPOINT_ACTUAL_VALUE_RAW
ORDER_REQST    SYS_P971481     ORDR_RQST_DATE    07-JUL-20    78780707010101
ORDER_REQST    SYS_P971481     ORDR_RQST_DATE    14-JUL-20    7878070E010101
ORDER_REQST    SYS_P971481     ORDR_RQST_DATE    21-JUL-20    78780715010101
ORDER_DETL     SYS_P971761     ORDR_RQST_DATE    07-JUL-20    78780707010101
ORDER_DETL     SYS_P971761     ORDR_RQST_DATE    14-JUL-20    7878070E010101
ORDER_DETL     SYS_P971761     ORDR_RQST_DATE    21-JUL-20    78780715010101

DBA_TAB_COL_STATISTICS
TABLE_NAME    COLUMN_NAME       NUM_DISTINCT    LOW_VALUE         HIGH_VALUE    HISTOGRAM
ORDER_REQST   ORDR_RQST_DATE    2696         64650101010101    78780715010101    HYBRID
ORDER_DETL    ORDR_RQST_DATE    2696         64650101010101    78780715010101    HYBRID

DBA_TAB_HISTOGRAMS (highest value)
OWNER      TABLE_NAME    COLUMN_NAME    ENDPOINT_NUMBER   ENDPOINT_VALUE    ENDPOINT_ACTUAL_VALUE    ENDPOINT_ACTUAL_VALUE_RAW
DW_OWNER   ORDER_REQST   ORDR_RQST_DATE    84005          2459052           21-JUL-20                78780715010101
DW_OWNER   ORDER_DETL    ORDR_RQST_DATE    40208          2459052           21-JUL-20                78780715010101

Both June and July partitions have frequency histograms on ordr_rqst_date, which would mean column groups are ignored in both cases and still join estimates for june partitions are good but for july are not.

I am puzzled as individual statistics values and single table statistics are accurate on its own but when it comes down to calculating the estimate for the semi join cardinality just for July the value is lesser than 1.

Based on the feedback I think if this leaning towards an statistics issue I would like to understand what is the calculation and what factors (histograms, col groups, partition vs global stats) play a role on the estimation of the cardinality and more importantly how I can efficiently take steps to help the optimizer have the best information to an accurate result.

Thanks again,

On Fri, Oct 9, 2020 at 7:03 AM Jonathan Lewis <jlewisoracle_at_gmail.com<mailto:jlewisoracle_at_gmail.com>> wrote:

Daniel,

I point I'd completely overlooked when I made the comment about matching column groups: column group stats are not used if any of the underlying columns has a histogram and the column group itself does not. In your July stats ordr_rqst_date has a histogram and the column groups don't, so their stats won't be used.

A few other points about the stats:
The table has '01-01-0001' recorded as the low value for the ordr_rqst_date - that extreme value MIGHT have an impact on some queries. The high value is 15th July 2020 for table and July partition.

The ordr_rqst_number is stored as a varchar2 (or char), not as a number, although the values stored do look like numbers. This can have a serious impact on the optimizer's arithmetic, but since you're only using equality on this column in this query it's probably irrelevant to this case.

It looks a little odd that the high value ordr_rqst_number for the table ( 9936510407132880 ) is much bigger than the value for July ( 2020000000740002 ) - again that may be irrelevant in this case, and it may be something that you expect to see anyway (i.e. no correlation between data and number).

Finally, you've said that there's one date per week in each partition - are the loads/dates supposed to be on the same day of each week or can they move around. I ask this because the July data has a low of 7th July and a high of 15th July, and 3 distinct values. This doesn't look consistent with 3 weeks on the same day of the week, so I was wondering if this was normal, or whether it was an indication that something a little unusual had happened during the data loads.


It would be useful to have the june stats to compare with the July stats in case any significant difference shows up.


Sorry about the error in the previous post which basically said: if they match then is probably X but if they match then it's probably Y - one of those should have had a negative.

My thinking was as follows:

if specifying the partitions (with no date predicate) gives you the same estimates as supplying date predicates then the optimizer has probably recognized perfectly correctly that your dates are all about one specific partition and will be doing the same arithmetic in both cases -- and that would mean there's something about the statistics that causing a problem.

if the two variations give you different estimates that would suggest that the optimizer has NOT (emphasised just to make sure one of my statements had the appropriate negative) used a code path that is 100% about just one known partition, and therefore there may be an error in the optimizer code.

As it is you got the same estimates in both cases - which means it's probably something about the stats.

Having said that there is a quirky little difference between the two sets of predicate information you posted when you compared "<= 31st July" with "< 1st Aug". They both have the same ACCESS and FILTER predicates (apart from the change in date value, of course) at exactly the same operation - but the query using 1st Aug didn't offload the filter on the date on the ordr_rqst_detail table (viz: the plan didn't report a STORAGE predicate.)

Regards
Jonathan Lewis

--
Daniel Coello Villacis



Please consider the environment before printing this e-mail. Bitte denken Sie an die Umwelt, bevor Sie dieses E-Mail drucken.

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css">p { font-family: Arial;font-size:9pt }</style>
</head>
<body>
<p>
<br>Important Notice</br>
<br />

This message is intended only for the individual named. It may contain confidential or privileged information. If you are not the named addressee you should in particular not disseminate, distribute, modify or copy this e-mail. Please notify the sender immediately by e-mail, if you have received this message by mistake and delete it from your system.<br /> Without prejudice to any contractual agreements between you and us which shall prevail in any case, we take it as your authorization to correspond with you by e-mail if you send us messages by e-mail. However, we reserve the right not to execute orders and instructions transmitted by e-mail at any time and without further explanation.<br /> E-mail transmission may not be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete. Also processing of incoming e-mails cannot be guaranteed. All liability of Vontobel Holding Ltd. and any of its affiliates (hereinafter collectively referred to as "Vontobel Group") for any damages resulting from e-mail use is excluded. You are advised that urgent and time sensitive messages should not be sent by e-mail and if verification is required please request a printed version.</br> Please note that all e-mail communications to and from the Vontobel Group are subject to electronic storage and review by Vontobel Group. Unless stated to the contrary and without prejudice to any contractual agreements between you and Vontobel Group which shall prevail in any case, e-mail-communication is for informational purposes only and is not intended as an offer or solicitation for the purchase or sale of any financial instrument or as an official confirmation of any transaction.<br /> The legal basis for the processing of your personal data is the legitimate interest to develop a commercial relationship with you, as well as your consent to forward you commercial communications. You can exercise, at any time and under the terms established under current regulation, your rights. If you prefer not to receive any further communications, please contact your client relationship manager if you are a client of Vontobel Group or notify the sender. Please note for an exact reference to the affected group entity the corporate e-mail signature. For further information about data privacy at Vontobel Group please consult <a href="https://www.vontobel.com">www.vontobel.com</a>.<br />
</p>
</body>
</html>

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Oct 14 2020 - 10:48:47 CEST

Original text of this message