Re: Optimizer estimated cardinality very low
Date: Wed, 7 Oct 2020 10:12:58 +0200
Message-ID: <fa575243-3d6a-e176-0d9c-7026fbd2518f_at_bluewin.ch>
Hi,
for an estimate like this, the low_value and high_values of the column
status of the join key should overlap for a high join estimate.
Your join is
"ORDER_DETL"."ORDR_RQST_NUMBER"="ORDER_REQST"."ORDR_RQST_NUMBER" AND
"ORDER_DETL"."ORDR_RQST_DATE"="ORDER_REQST"."ORDR_RQST_DATE"
My bet is on an outlier in the date column of one of tables.
Check for the min an max values of
"ORDER_DETL"."ORDR_RQST_DATE", "ORDER_REQST"."ORDR_RQST_DATE".
Then check
ORDER_DETL"."ORDR_RQST_NUMBER","ORDER_REQST"."ORDR_RQST_NUMBER".
Regards
Lothar
Am 07.10.2020 um 00:51 schrieb Daniel Coello:
> Hello,
> I am running into a performance issue where queries using EXISTS
> condition (throug a vpd policy). The resulting estimated cardinality
> is a very low number ( 1 in the example).
>
> | 3 | PX PARTITION HASH ALL | | 1 | 52
> | | 15194 (13)| 00:00:01 | 1 | 8 | Q1,00 | PCWC
> | |
> |* 4 | HASH JOIN SEMI | | 1 | 52 |
> 500M| 15194 (13)| 00:00:01 | | | Q1,00 | PCWP | |
>
> */Execution plan details below/*
>
>
> - How is the join cardinality estimation defined?
>
> - What can I do to get a more accurate estimate?
>
> Details
>
> * Database version is 12.1.0.2 Enterprise Edition on an Exadata
> environment, 4 nodes RAC.
> * The tables are partitioned by ORDR_RQST_DATE by monthly interval
> and subpartitioned by hash in ORDR_RQST_NUMBER. We have seen this
> issue mostly in latest months.
> * Every combination of ORDR_RQST_DATE,ORDR_RQST_NUMBER in the
> ORDER_REQST exists in ORDER_DETL (1-to-many)
> * Statistics are up to date, histograms are the same between months
> where the cardinality estimate is fine. Statistics have been
> gathered at subpartition, partition, global levels and for all.
> * I have tested with different dynamic sampling levels, the single
> table cardinality estimations change but the hash join semi
> estimated cardinality remains 1.
> * The EXISTS clause is created by the VPD policy which I can't
> modify the logic at this time. The issue is still present with
> simple queries with EXISTS clause and no VPD, like the example
> posted here.
> * I have tried using the CARDINALITY in the subquery " AND EXISTS (
> SELECT /*+ CARDINALITY(ORDER_DETL 30000000) */ 1 ..." the estimate
> cardinality changes for the table but has no effect to the EXISTS
> estimated cardinality.
> * SQL profiles and hints (depending on queries) can help on one by
> one cases but having users doing adhoc queries this approach is
> not manageable .
>
>
> *Table count info*
>
> ORDER_REQST: 2810794270
> ORDER_DETL: 4243601381
>
> *July/2020 partition count:*
> **
> ORDER_REQST: 26895392
> ORDER_DETL: 32314200
>
> Column stats info:
>
> *TABLE_NAME COLUMN_NAME NUM_DISTINCT*
> ORDER_REQST ORDR_RQST_DATE 2696
> ORDER_REQST ORDR_RQST_NUMBER 2719713280
> ORDER_DETL ORDR_RQST_DATE 2696
> ORDER_DETL ORDR_RQST_NUMBER 2719662080
>
>
> Column group stats:
>
> *TABLE_NAME COL_GROUP NUM_DISTINCT HISTOGRAM*
> ORDER_REQST ("ORDR_RQST_NUMBER","ORDR_RQST_DATE")
> 2717362176 NONE
> ORDER_DETL ("ORDR_RQST_LINE","ORDR_RQST_DATE","ORDR_RQST_NUMBER")
> 4243601381 NONE
>
>
> Query as it is does not represent an issue alone but when joins are
> added to other tables they are joined using Nested Loops and they
> practically don't finish.
> We can create sql profiles but in an adhoc environment there are
> thousands of versions when joins to other tables are added.
>
>
> Below query plan and extracts from 100053 traces I have collected:
>
> Query:
>
> Actual count: 26895392
> SELECT
> "ORDER_REQST"."ORDR_RQST_NUMBER" "ORDR_RQST_NUMBER",
> "ORDER_REQST"."ORDR_RQST_DATE" "ORDR_RQST_DATE",
> "ORDER_REQST"."ORDR_STATUS" "ORDR_STATUS"
> FROM "ORDER_REQST" "ORDER_REQST"
> WHERE "ORDER_REQST"."ORDR_RQST_DATE" >=
> to_date('01JUL2020','DDMONYYYY')
> AND "ORDER_REQST"."ORDR_RQST_DATE" <= to_date('31JUL2020','DDMONYYYY')
> AND EXISTS (
> SELECT 1
> FROM "ORDER_DETL" "ORDER_DETL"
> WHERE "ORDER_DETL"."ORDR_RQST_NUMBER" =
> "ORDER_REQST"."ORDR_RQST_NUMBER"
> AND "ORDER_DETL"."ORDR_RQST_DATE" =
> "ORDER_REQST"."ORDR_RQST_DATE"
> );
> Execution Plan
> -----------------------------------------------------------------------------------------------------------------------------------------------------
> | Id | Operation | Name | Rows | Bytes
> |TempSpc| Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ
> Distrib |
> -----------------------------------------------------------------------------------------------------------------------------------------------------
> | 0 | SELECT STATEMENT | | 1 | 52 | |
> 15194 (13)| 00:00:01 | | | | | |
> | 1 | PX COORDINATOR | | | |
> | | | | | | | |
> | 2 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 52
> | | 15194 (13)| 00:00:01 | | | Q1,00 | P->S | QC (RAND) |
> | 3 | PX PARTITION HASH ALL | | *1* | 52
> | | 15194 (13)| 00:00:01 | 1 | 8 | Q1,00 | PCWC
> | |
> |* 4 | HASH JOIN SEMI | | *1* | 52 |
> 500M| 15194 (13)| 00:00:01 | | | Q1,00 | PCWP | |
> | 5 | PX PARTITION RANGE SINGLE | | 26M| 692M| |
> 3376 (43)| 00:00:01 | 248 | 248 | Q1,00 | PCWC | |
> |* 6 | TABLE ACCESS STORAGE FULL| ORDER_REQST | 26M|
> 692M| | 3376 (43)| 00:00:01 | 1977 | 1984 | Q1,00 | PCWP
> | |
> | 7 | PX PARTITION RANGE SINGLE | | 31M| 751M| |
> 1937 (10)| 00:00:01 | 248 | 248 | Q1,00 | PCWC | |
> |* 8 | INDEX FAST FULL SCAN | ORDER_DETL_PK | 31M|
> 751M| | 1937 (10)| 00:00:01 | 1977 | 1984 | Q1,00 | PCWP
> | |
> -----------------------------------------------------------------------------------------------------------------------------------------------------
> Predicate Information (identified by operation id):
> ---------------------------------------------------
> 4 -
> access("ORDER_DETL"."ORDR_RQST_NUMBER"="ORDER_REQST"."ORDR_RQST_NUMBER"
> AND
> "ORDER_DETL"."ORDR_RQST_DATE"="ORDER_REQST"."ORDR_RQST_DATE")
> 6 - storage("ORDER_REQST"."ORDR_RQST_DATE">=TO_DATE(' 2020-07-01
> 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
> "ORDER_REQST"."ORDR_RQST_DATE"<=TO_DATE(' 2020-07-31
> 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
> filter("ORDER_REQST"."ORDR_RQST_DATE">=TO_DATE(' 2020-07-01
> 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
> "ORDER_REQST"."ORDR_RQST_DATE"<=TO_DATE(' 2020-07-31
> 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
> 8 - storage("ORDER_DETL"."ORDR_RQST_DATE">=TO_DATE(' 2020-07-01
> 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
> "ORDER_DETL"."ORDR_RQST_DATE"<=TO_DATE(' 2020-07-31
> 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
> filter("ORDER_DETL"."ORDR_RQST_DATE">=TO_DATE(' 2020-07-01
> 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
> "ORDER_DETL"."ORDR_RQST_DATE"<=TO_DATE(' 2020-07-31
> 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
>
> Note
> -----
> - dynamic statistics used: dynamic sampling (level=9)
> - automatic DOP: Computed Degree of Parallelism is 2
> - parallel scans affinitized for buffer cache
>
> When getting the 10053 for the semi join calculation with extended
> stats I see it is using the extended stats created on both columns for
> each table :
> ...
> Join ColGroups for ORDER_REQST[ORDER_REQST] and
> ORDER_DETL[ORDER_DETL] : (#5, #3) -- Column groups listed above
> ...
> Join selectivity using 1 ColGroups: 6.0249e-08 (sel1 = 0.910080,
> sel2 = 0.000000)
> Semi Join Card: 1.000000 = outer (16597656.000000) * sel (6.0249e-08)
> ...
> When running using "_optimizer_enable_extended_stats"=FALSE to avoid
> using extended stats for the same interval then:
> ...
> Revised join sel: 3.0946e-08 = 1.000000 * (1/32314200.00) *
> (1/1.000000)
> Capping Semi Join Card of HJ/SMJ by inner = 0.473558
> Capping Semi Join Card of HJ/SMJ (no post filters) by inner = 0.473558
> Capping Semi Join Card of NL by inner = 0.473558
> Capping Semi Join Card of without POSQ by inner = 0.473558
> Capping Semi Join Card of non adjusted NSQ by inner = 0.473558
> ...
>
> From a 10053 trace of a month interval, june/2020, where estimate
> cardinality is good:
>
> ...
> Join ColGroups for ORDER_REQST[ORDER_REQST] and
> ORDER_DETL[ORDER_DETL] : Using cdn sanity check
> ...
> ColGroup cardinality sanity check: ndv for
> ORDER_REQST[ORDER_REQST] = 17506471.000000 ORDER_DETL[ORDER_DETL] =
> 21814573.000000
> Join selectivity using 1 ColGroups: 1.000000 (sel1 = 0.000000,
> sel2 = 1.9154e-08)
> Semi Join Card: 14482345.790323 = outer (14482345.790323) * sel
> (1.000000)
> Join Card - Rounded: 14482346 Computed: 14482345.790323
> ...
>
> Thank you in advance for your help
>
> --
> Daniel Coello Villacis
>
-- http://www.freelists.org/webpage/oracle-lReceived on Wed Oct 07 2020 - 10:12:58 CEST