Re: A question about huge difference in cardinality of a query with 3 predicates
Date: Sat, 28 Oct 2017 15:52:23 +0000
Message-ID: <SYXPR01MB095986DF55CCBD4433659A4EC15B0_at_SYXPR01MB0959.ausprd01.prod.outlook.com>
Hi All,
I am having an issue of a query getting a bad plan -which I have narrowed down to major mismatch between estimated and actual cardinality of one of the tables involved in the query. I need
help from experts on understanding why the cardinality estimate is wrong and what can be done to rectify. For this purpose I have created a test table with required columns and tried to
create same situation (of cardinality mismatch) and following are the details on it. (Oracle version is : 11.2.0.4 on Linux.)
Table structure: CREATE TABLE processes
(
process_id NUMBER (12) NOT NULL PRIMARY KEY,
clientid VARCHAR2 (20) NOT NULL,
evt_type_id NUMBER (12) NOT NULL,
status VARCHAR2 (20) NOT NULL
)
Following are the stats about the table (Thanks to the print_table utility of Tom Kyte Sir - the below is output from "select * from all_tables where table_name='PROCESSES'" from
which several default stuff is remove to keep this a little brief)
OWNER : TEST TABLE_NAME : PROCESSES PCT_FREE : 10 NUM_ROWS : 2231714 BLOCKS : 12137 AVG_ROW_LEN : 31 SAMPLE_SIZE : 2231714
Here is the query and test data :
variable v_coid varchar2(16);
variable v_status varchar2(16);
variable v_evt_type_id number;
exec :v_coid := 'G31MQQNM99ABCA9V';
exec :v_status := 'INP';
exec :v_event_type_id :=5014;
SQL> SELECT COUNT (*) FROM processes wpi WHERE WPI.CLIENTID = :v_coid AND wpi.status = :v_status AND WPI.evt_type_id = :v_event_type_id;
COUNT(*)
10404
1 row selected.
Following is the Execution plsn:
Plan hash value: 3304333532
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.08 | 11406 | | 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.08 | 11406 | |* 2 | TABLE ACCESS FULL| PROCESSES | 1 | 62 | 10404 |00:00:00.08 | 11406 |
Peeked Binds (identified by position):
1 - (VARCHAR2(30), CSID=873): 'G31MQQNM99ABCA9V' 2 - (VARCHAR2(30), CSID=873): 'INP' PLAN_TABLE_OUTPUT
3 - (NUMBER): 3004 Predicate Information (identified by operation id):
2 - filter(("WPI"."CLIENTID"=:V_COID AND "WPI"."STATUS"=:V_STATUS AND
"WPI"."EVT_TYPE_ID"=:V_EVT_TYPE_ID)) 29 rows selected.
Here we can see the Huge difference - the estimated cardinality of just 62 versus the acutal cardinality of 10404. Note that all these 3 columns involved in the query have frequency histograms
on them and following are details on it...and this is the issue - the difference between estimated cardinality of 62 versus actual value of 10404 that I need help to sort out.
exec print_table('select * from all_tab_col_statistics where owner=''TEST'' AND TABLE_NAME=''PROCESSES''');
OWNER : TEST -- same for all the columns below TABLE_NAME : PROCESSES -- same for all the columns below COLUMN_NAME : PROCESS_ID NUM_DISTINCT : 2231714 LOW_VALUE : C50D01015C04 HIGH_VALUE : C5170B1A5218 DENSITY : .000000444209506971868 NUM_NULLS : 0 NUM_BUCKETS : 254 LAST_ANALYZED : 20-oct-2017 04:20:20 --for all columns below too...the stats are current. SAMPLE_SIZE : 5548 AVG_COL_LEN : 7 HISTOGRAM : HEIGHT BALANCED
-----------------
OWNER : TEST TABLE_NAME : PROCESSES COLUMN_NAME : CLIENTID NUM_DISTINCT : 104 LOW_VALUE : 30324652425132433157563030303936 HIGH_VALUE : 47354D384A4D54394545484A51475859 DENSITY : .000000222104676853238 NUM_NULLS : 0 NUM_BUCKETS : 87 SAMPLE_SIZE : 5548 AVG_COL_LEN : 17 HISTOGRAM : FREQUENCY
-----------------
COLUMN_NAME : EVT_TYPE_ID NUM_DISTINCT : 50 LOW_VALUE : C102 HIGH_VALUE : C21F07 DENSITY : .000000222104676853238 NUM_NULLS : 0 NUM_BUCKETS : 38 SAMPLE_SIZE : 5548 AVG_COL_LEN : 4 HISTOGRAM : FREQUENCY
-----------------
COLUMN_NAME : STATUS NUM_DISTINCT : 3 LOW_VALUE : 434F4D HIGH_VALUE : 494E50 DENSITY : .000000222104676853238 NUM_NULLS : 0 NUM_BUCKETS : 3 SAMPLE_SIZE : 5548 AVG_COL_LEN : 4 HISTOGRAM : FREQUENCY
-----------------
Please let me know if I should provide any other details. One thing I found was that if I do dynamic sampling level 8 then the cardinality estimate improved to nearly 7000 (don't recall that exact figure),but I would not like to depend on it. The other thing is if I drop the histograms then the cardinality improved slightly - to 143 as shown in the plan below but that is still way off the actual value which is more than 10K (10404 to be exact).
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.08 | 11406 | | 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.08 | 11406 | |* 2 | TABLE ACCESS FULL| PROCESSES | 1 | 143 | 10404 |00:00:00.08 | 11406 |
Peeked Binds (identified by position):
1 - (VARCHAR2(30), CSID=873): 'G31MQQNMH3TW7A9V' 2 - (VARCHAR2(30), CSID=873): 'INP' PLAN_TABLE_OUTPUT
3 - (NUMBER): 3004 Predicate Information (identified by operation id):
2 - filter(("WPI"."CLIENTID"=:V_COID AND "WPI"."EVT_TYPE_ID"=:V_EVT_TYPE_ID
AND "WPI"."STATUS"=:V_STATUS)) Regards, Nirav
-- http://www.freelists.org/webpage/oracle-lReceived on Sat Oct 28 2017 - 17:52:23 CEST