Dominic Brooks
Client-side database connection timeout insanity
Attributed to Einstein but considered a dubious association, the quote is something like “Insanity is doing the same thing over and over and expecting different results”.
Probably … hopefully… it’s a familiar story.
I can’t be the only one, I’ve seen it so many places.
Connecting applications raise incidents against your database saying their application is erroring repeatedly whilst running a query and receiving an ORA-01013: user requested cancel of current operation.
Investigation shows that yes, sql monitoring can show a bunch of sequential executions of a particular query erroring with ORA-01013 – a really nice feature of sql monitoring that it captures these more often than not.
Nearly all of these errors occur around the 20 minute mark.
Do you have a 20 minute timeout?
Yes.
Why?
I’m not sure whether the mindset is inherited from other areas of the architecture. In the past, I happened to see this much more with applications involved in distributed transactions involving queues etc and which bring with them a whole lot more complexity.
Of course it all depends on what “normal” is for any query but an awful lot of these tend to have “normal” as not a million miles away from timeout and it doesn’t take much to nudge them over the limit – a different data distribution, some other load on the database, etc, etc
But I never understand what value that timeout brings.
No, it’s not “stuck”.
No, it’s not “hanging”.
It’s executing. It’s just not executing as fast as your expectations, possibly baseless, thought.
Of course sometimes we get a plan “flip” and something which was relatively fast becomes relatively slow.
But even then I think the proper application-level alerting and notification of that whilst allowing it to complete the execution where possible is infinitely better than spending hours getting hold of a DBA, raising the right sort of ticket, often telling them what you want profiled/baslined/patched all potentially done under a high priority because the timeout won’t let the relevant process run to completion.
Madness.
ALTER INDEX REBUILD PARTITION ONLINE COMPRESS PARALLEL
After a frustrating period of quite a few minutes but less than an hour of trying various combinations, it seems to be that rebuilding an index partition online with compression is actually a two stage affair although if it explicitly states this in the documentation then I missed it. And I did an online search for similar and neither was the answer stated obviously in the first couple of pages of results or in the documents I was directed to.
I am doing some tests with enabling index key compression – a great feature by the way if it somehow passed you by over the many years it’s been around – and am expecting significant space savings.
In my particular case, it’s about time I played with some “newer” features and whilst the ADVANCED COMPRESSION HIGH scares me off a bit – not least because of some of the associated bug notes on Oracle Support for versions < 23cai – what I’ve read about ADVANCED COMPRESSION LOW holds for me fewer reservations as it seems pretty much like old style key compression but with some intelligence (heuristic not artificial) around optimal # of keys and whether that is suitable for every block.
So the two stage syntax would appear to be:
alter index <index_name> modify partition <partition_name> compress advanced low;
alter index <index_name> rebuild partition <partition_name> online parallel n;
Now as I write this, I have a horrible sense of deja vu that I’ve already written about this in the past but I had a look and I can’t find that either.
Lateral thinking about pushing predicates
Recently I’ve found myself working on a set of data which requires a fair amount of crunching and “de-duplication” of data sent from the system of record (SOR) to identify what was the last event sent for a particular set of keys that meets certain unindexed filter criteria.
Which means I’m wanting to write code which drives from a set of data, and then joins to an inline aggregation view for each row from the driving rowsource.
The aggregation dataset is too large and open-ended to aggregate independently and hash join to the driving data set.
So by instinct I find myself wanting this to operate with a nested loop from the driving rowsource and for each row to push the join predicates into aggregation view.
However, I have found that specifying the correct push predicate hint syntax is a bit awkward and I try to be mindful of those who inevitably will come to the code at a later date to make a change.
There’s a good chance they won’t have heard of predicate pushing unfortunately and I strongly dislike leaving behind a legacy of heavily hinted code, although I make some exceptions.
Having sought out a high level 2nd opinion on a comment on Jonathan Lewis’s blog, I am now having a tendency to write such code using a LATERAL join, or at least experiment with it as an alternative approach.
You might have come across the lateral join not least from when it first started appearing in internal transformations for some ANSI SQL. But here is some documentation on this as a fully fledged SQL feature from 12c onwards:
- https://oracle-base.com/articles/12c/lateral-inline-views-cross-apply-and-outer-apply-joins-12cr1
- https://dfitzjarrell.wordpress.com/2022/05/11/getting-lateral/
Now I’m aware of course that there will be other optimizer transformations which can eliminate the lateral view, not least decorrelation:
But it should never be unexpected that any new feature might experience some degree of bugs and/or wrong result issues.
However I have been encouraged by the fact that in my circumstances, under my conditions, the results are good and the query is executing how I imagined it to with no further direction from hints, etc
With respect to the join conditions, you can specify the lateral view syntax in a couple of ways – either in the standard ON () join conditions, or the slightly strange scope of referencing the outer join conditions within the view itself, example here of the latter.
Rather than giving an emp-dept example of the lateral view which you can find elsewhere, not least in the links already provided and elsewhere, I’m going to provide a cutdown, sanitised snippet of the code I’m using in the real world although I’m sure this doesn’t really help with knowing when you might use this feature because I can’t give you all the background, data, objects from the real world either.
The snippet is from a INSERT SELECT. I’ve cut a whole bunch of information from the plan not least so you don’t see the horrible numbers that the optimizer throws up in terms of estimated rows and cost! (Which might/should be a warning sign that you wish to dig deeper into)
Really I want to just leave an idea of the scenario where I’ve been considering and experimenting with lateral joins.
with subq_active_positions as
(select /*+ */
rrs.f_id
, rrs.f_source
, rrs.reg
, max(rrs.c_id) keep (dense_rank first order by rrs.eventtime desc) c_id
, max(rrs.c_source) keep (dense_rank first order by rrs.eventtime desc) c_source
from regreport_event_state rrs
where 1 = 1
and reg = p_regulation
--
... some additional date filters
--
group by
rrs.f_id
, rrs.f_source
, rrs.reg
having ... some early filtering...
)
, subq_ap_with_eligible_events as
(select /*+ */ p.*, aud.*
from subq_active_positions p
inner join lateral
(select *
from
(select /*+ qb_name(sq1) */
max(rra.rowid) keep (dense_rank first order by rra.eventtime desc) rd
, max(rra.eventtype) keep (dense_rank first order by rra.eventtime desc) last_qualifying_event
, rra.f_id
, rra.f_source
, rra.c_id
, rra.c_source
, rra.reg
from regreport_event_report rra
where 1 = 1
--
and rra.f_id = p.f_id
and rra.f_source = p.f_source
and rra.c_id = p.c_id
and rra.c_source = p.c_source
and rra.reg = p.reg
--
group by
rra.f_id
, rra.f_source
, rra.c_id
, rra.c_source
, rra.reg
) x
where last_qualifying_event = 'Something_or_other') aud
on 1 = 1)
select *
from subq_ap_with_eligible_events;
===========================================================================================================
| Id | Operation | Name |
===========================================================================================================
| 0 | INSERT STATEMENT | |
| 1 | TEMP TABLE TRANSFORMATION | |
| 2 | PX COORDINATOR | |
| 3 | PX SEND QC (RANDOM) | :TQ10001 |
| 4 | LOAD AS SELECT (TEMP SEGMENT MERGE) | SYS_TEMP_0FD9E0321_A450058D |
| 5 | FILTER | |
| 6 | SORT GROUP BY | |
| 7 | PX RECEIVE | |
| 8 | PX SEND HASH | :TQ10000 |
| 9 | SORT GROUP BY | |
| 10 | PX BLOCK ITERATOR | |
| 11 | TABLE ACCESS STORAGE FULL | REGREPORT_EVENT_STATE |
| 12 | PX COORDINATOR | |
| 13 | PX SEND QC (RANDOM) | :TQ40001 |
| 14 | LOAD AS SELECT (HIGH WATER MARK BROKERED) | CANNED_REPORT_DATA |
| 15 | SORT AGGREGATE | |
| 16 | PARTITION HASH SINGLE | |
| 17 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | REGREPORT_EVENT_REPORT |
| 18 | INDEX RANGE SCAN | REGREPORT_EVENT_REPORT_PK |
| 19 | OPTIMIZER STATISTICS GATHERING | |
| 20 | BUFFER SORT | |
| 21 | PX RECEIVE | |
| 22 | PX SEND ROUND-ROBIN | :TQ40000 |
| 23 | SORT ORDER BY | |
| 24 | NESTED LOOPS | |
| 25 | VIEW | |
| 26 | NESTED LOOPS | |
| 27 | PX COORDINATOR | |
| 28 | PX SEND QC (RANDOM) | :TQ30000 |
| 29 | VIEW | |
| 30 | PX BLOCK ITERATOR | |
| 31 | TABLE ACCESS STORAGE FULL | SYS_TEMP_0FD9E0321_A450058D |
| 32 | PX COORDINATOR | |
| 33 | PX SEND QC (RANDOM) | :TQ20001 |
| 34 | VIEW | VW_LAT_75FFB6E0 |
| 35 | FILTER | |
| 36 | SORT GROUP BY | |
| 37 | PX RECEIVE | |
| 38 | PX SEND HASH | :TQ20000 |
| 39 | SORT GROUP BY | |
| 40 | FILTER | |
| 41 | PX PARTITION HASH ITERATOR | |
| 42 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | REGREPORT_EVENT_REPORT |
| 43 | INDEX RANGE SCAN | REGREPORT_EVENT_REPORT_PK |
| 44 | TABLE ACCESS BY USER ROWID | REGREPORT_EVENT_REPORT |
===========================================================================================================
Possibly this sounds a bit esoteric at beast and highly situation specific and yet without a useful standalone data setup to adequately explain the whys and wherefores. It is what it is.
The summary is if you want to join to an inline view and you think you want to push predicates, maybe do some testing around a lateral join to that inline view.
Is data distribution part of your performance checklist?
I’ve been meaning to blog about this for a while but it’s really hard these days to find interesting things to blog about which have the right balance of words vs technical content. Apart from anything else, it’s hard to get the real world evidence anonymised and out of the workplace and I just don’t have the time to model problems from scratch for blog purposes – and few people seem to read blogs these days…
So there’s some performance problem that has been reported and you start digging. Probably like me you have a few usual suspects you will immediately turn to.
But let’s start with what I definitely do NOT start with:
- Lists of objects with stale statistics
- Lists of indexes which some script off the internet has determined need rebuilding
- AWR reports
We need EVIDENCE. Evidence of the problem. Appropriately scoped evidence of the problem. Maybe it’s far too early to look for the WHYs but certainly we’re on the lookout for WHATs. If something took some time, it was actively doing something or actively waiting on something. So let’s look for that/those something/s.
So what do I start with?
- Is it still running? If not when was it running?
- Sources – GV$SESSION, GV$ACTIVE_SESSION_HISTORY, DBA_HIST_ACTIVE_SESS_HISTORY
- At least 8 times out of 10, we’re probably talking about a specific SQL statement eventually identifiable so hopefully you have a SQL ID (and f it’s not that well that could be the start of another set of blog posts)
- Assuming that is what it is, what do you go to next?
For me, it’s probably DBA_HIST_SQLSTAT.
I want to compare other executions, if there are any. Depending on how often it executes and how long it executes for, I might aggregate this by AWR snapshot or by day.
What am I looking for ?
- 9 times out of 10 I expect to find a change in execution plan
- Is there a large variety of very similarly performing plans?
- Does a particular plan standout as particularly “bad”?
- Maybe the optimizer arithmetic makes the mathematical choice quite close but in reality it makes a big difference
- Maybe there’s a set of parameters which suit one way and not another
- Hopefully though you can see evidence in some of the metrics which show you, not necessarily why something happened, but what was the contribution to the performance compared to previous executions.
- For example, you might see that logical IO went through the roof but there can be a myriad of reasons for that, some of which might be caused by other processes
- Note that the SQL_ID might not be constant – when it is it makes this easier
- But maybe your code uses literals so every combination of literals is different but at least we can use FORCE_MATCHING_SIGNATURE to identify other executions
- Or maybe the SQL is completely dynamic which different subqueries, so maybe we’re mining DBA_HIST_SQLTEXT for certain patterns of features of a particular piece of code
Then?
- For me, it’s probably the real time sql monitoring report (RTSM) for that SQL – DBMS_SQLTUNE.REPORT_SQL_MONITOR
- What might jump out there?
- How long has/was it running?
- Was it parallel? How parallel? Was all the parallelism it wanted to run with granted?
- Which operations took the longest?
- Did the optimizer get some of the estimates badly wrong?
- If it did get some badly wrong, would it have made a difference? e.g.
- Might it have had an a NESTED LOOP vs HASH JOIN decision
- Might it have caused undersized workareas which spilled to TEMP?
- If you’re familiar with the application, even the code, does the execution plan make sense to you?
- Would you have done it differently? Maybe driven from a different table?
- Where did the broad categories of time go?
- CPU vs IO vs Application Waits for example
- How much data did you read both logically and physically?
- From the activity detail, if available, are there any wait events which stand out as taking all the time?
- Note that if there are executions in memory, it can be very useful to compare metrics across specific individual executions by looking at GV$SQL_MONITOR
- Maybe nothing is still in memory and you have to resort to historical information – see DBA_HIST_REPORTS and DBMS_AUTO_REPORT.REPORT_REPOSITORY_DETAIL?
So we start to build up a picture of contributing factors, elements which might have been cause, might have been effect, but we start joining up the dots.
As mentioned, very often it’s a simple plan change and you can fix it through your tool of choice – sql profile? Sql plan baseline? Sql patch? Change the code?
Often I find that changing the code is the best long term choice particularly if you look at it and think “well I wouldn’t have written it like that.” So often I find a badly performing piece of SQL is not structured in a natural way.
Anyway, what happens when we find that the execution plan is stable?
We can see that one or some executions definitely took a lot longer.
Perhaps it’s time to start looking at some of the data?
Another great thing about the RTSM reports is it tells you about the binds which were used for those executions.
Maybe you’re using binds and you’re comparing apples to oranges with small dates ranges vs large date ranges, etc, etc
Because most of the time, the problem is already identified using one or both of the sources above, I’m usually pretty slow at starting to dig into the actual parameters passed into the procedures.
At this point, well it depends. I might start looking at the tables involved and start slicing and dicing data to see whether there is some ridiculous data skew.
Twice recently I’ve had very notable issues caused not by an issue with the plan, not by anything happening on the database, but by an abuse of data. Normally by other systems sending data in a way they are not meant to but not necessarily in a way that we can control. Maybe not in a way that can be protected by some contraint or other. But in a way that causes a serious execution issue.
Here is the RTSM report
==========================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows |
| | | | (Estim) | | Active(s) | Active | | (Actual) |
==========================================================================================================================================================
| 0 | SELECT STATEMENT | | | | 1175 | +26 | 1 | 0 |
| 1 | SORT ORDER BY | | 17 | 764K | 1175 | +26 | 1 | 0 |
| 2 | NESTED LOOPS OUTER | | 17 | 764K | 1175 | +26 | 1 | 202K |
| 3 | NESTED LOOPS OUTER | | 17 | 764K | 1175 | +26 | 1 | 202K |
| 4 | VIEW | | 17 | 764K | 1175 | +26 | 1 | 202K |
| 5 | VIEW | | 17 | 764K | 1175 | +26 | 1 | 202K |
| 6 | WINDOW BUFFER | | 17 | 764K | 1178 | +23 | 1 | 202K |
| 7 | COUNT STOPKEY | | | | 3 | +24 | 1 | 1M |
| 8 | PX COORDINATOR | | | | 27 | +0 | 17 | 1M |
| 9 | PX SEND QC (ORDER) | :TQ10001 | 17 | 764K | 3 | +22 | 8 | 1M |
| 10 | VIEW | | 17 | 764K | 3 | +22 | 8 | 1M |
| 11 | SORT ORDER BY STOPKEY | | 17 | 764K | 4 | +21 | 8 | 1M |
| 12 | PX RECEIVE | | 17 | 764K | 3 | +22 | 8 | 2M |
| 13 | PX SEND RANGE | :TQ10000 | 17 | 764K | 3 | +20 | 8 | 2M |
| 14 | SORT ORDER BY STOPKEY | | 17 | 764K | 21 | +2 | 8 | 2M |
| 15 | FILTER | | | | 21 | +2 | 8 | 2M |
| 16 | NESTED LOOPS | | 17 | 764K | 21 | +2 | 8 | 2M |
| 17 | PX BLOCK ITERATOR | | 17 | 764K | 21 | +2 | 8 | 2M |
| 18 | TABLE ACCESS STORAGE FULL | ER_A | 17 | 764K | 21 | +2 | 235 | 2M |
| 19 | PARTITION HASH ITERATOR | | 1 | | 21 | +2 | 2M | 2M |
| 20 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | ER_R | 1 | | 22 | +1 | 2M | 2M |
| 21 | INDEX RANGE SCAN | ER_RPK | 1 | | 22 | +1 | 2M | 2M |
| 22 | PARTITION HASH ITERATOR | | 1 | | 1175 | +26 | 202K | 202K |
| 23 | TABLE ACCESS BY GLOBAL INDEX ROWID | DP_S | 1 | | 1175 | +26 | 202K | 202K |
| 24 | INDEX UNIQUE SCAN | DP_SPK | 1 | | 1175 | +26 | 202K | 202K |
| 25 | VIEW PUSHED PREDICATE | | 1 | 2 | 1175 | +26 | 202K | 202K |
| 26 | SORT GROUP BY | | 1 | 2 | 1175 | +26 | 202K | 202K |
| 27 | PARTITION HASH SINGLE | | 1 | 2 | 1175 | +26 | 202K | 273M |
| 28 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | DP_A | 1 | 2 | 1176 | +25 | 202K | 273M |
| 29 | INDEX RANGE SCAN | DP_APK | 1 | 1 | 1175 | +26 | 202K | 273M |
==========================================================================================================================================================
This comes from an event sourcing system, the long term repository for a variety of events being sent between systems triggering workflows, processing and reporting. And used to pass downstream to other controls and consolidation systems.
See line 25, what we’ve got here is a join to an inline view which is trying to offer some protection against “duplicate” events*.
And we can see that in those metrics.
We do an index range scan on line 29 fed by 202K input rows. But the output of those 202K lookups is 273 million rows which then are deduplicated by the SORT GROUP BY on line 26 back to 202K.
Ouch. That is where the pain is.
If we delve into the data for the bind parameters in question, we start to find examples of 64k “duplicates”. The SQL deduplication protection mechanism is expecting a handful of “duplicates”* not many thousands.
*For technical reasons, these are not duplicates. The upstream SORs have transmitted a very similar event many times in a very short space of time (due to a bug probably). The reporting systems will have generated the reports many times because of those events and so we the long term event sourcing system need to keep them. But the downstream systems try and extract additional meaning from these events and therein lies the problem for this code.
ALTER TABLE NOCOMPRESS
What a difference three words can make.
No, not those ones, these ones MODIFY DEFAULT ATTRIBUTES.
create table domtest
(col1 number
,col2 varchar2(200))
partition by range (col1) interval (1)
(partition p0 values less than (1))
compress for oltp;
> Table created.
insert into domtest
with x as (select level lvl from dual connect by level <= 10)
select lvl, rpad(lvl,200,'X') from x;
> 10 row(s) inserted.
commit;
> Statement processed.
select table_name, def_compression, def_compress_for
from user_part_tables
where table_name = 'DOMTEST';
TABLE_NAME DEF_COMPRESSION DEF_COMPRESS_FOR
DOMTEST ENABLED ADVANCED
select table_name, partition_name, high_value, compression, compress_for
from user_tab_partitions
where table_name = 'DOMTEST';
TABLE_NAME PARTITION_NAME HIGH_VALUE COMPRESSION COMPRESS_FOR
DOMTEST P0 1 ENABLED ADVANCED
DOMTEST SYS_P487584 2 ENABLED ADVANCED
DOMTEST SYS_P487585 3 ENABLED ADVANCED
DOMTEST SYS_P487586 4 ENABLED ADVANCED
DOMTEST SYS_P487587 5 ENABLED ADVANCED
DOMTEST SYS_P487588 6 ENABLED ADVANCED
DOMTEST SYS_P487589 7 ENABLED ADVANCED
DOMTEST SYS_P487590 8 ENABLED ADVANCED
DOMTEST SYS_P487591 9 ENABLED ADVANCED
DOMTEST SYS_P487592 10 ENABLED ADVANCED
DOMTEST SYS_P487593 11 ENABLED ADVANCED
If I want to change the default compression at table level so that new partitions are no longer compressed, then those three words make a big difference.
alter table domtest nocompress;
>Table altered.
This does what we want:
select table_name, def_compression, def_compress_for
from user_part_tables
where table_name = 'DOMTEST';
TABLE_NAME DEF_COMPRESSION DEF_COMPRESS_FOR
DOMTEST DISABLED -
But it does a whole lot we don’t. It doesn’t move the existing data but it does change the partition attributes for new data.
select table_name, partition_name, high_value, compression, compress_for
from user_tab_partitions
where table_name = 'DOMTEST';
TABLE_NAME PARTITION_NAME HIGH_VALUE COMPRESSION COMPRESS_FOR
DOMTEST P0 1 DISABLED -
DOMTEST SYS_P487584 2 DISABLED -
DOMTEST SYS_P487585 3 DISABLED -
DOMTEST SYS_P487586 4 DISABLED -
DOMTEST SYS_P487587 5 DISABLED -
DOMTEST SYS_P487588 6 DISABLED -
DOMTEST SYS_P487589 7 DISABLED -
DOMTEST SYS_P487590 8 DISABLED -
DOMTEST SYS_P487591 9 DISABLED -
DOMTEST SYS_P487592 10 DISABLED -
DOMTEST SYS_P487593 11 DISABLED -
Whereas if we reset and then use the correct syntax from the get-go:
alter table domtest modify default attributes nocompress;
>Table altered.
select table_name, def_compression, def_compress_for
from user_part_tables
where table_name = 'DOMTEST';
TABLE_NAME DEF_COMPRESSION DEF_COMPRESS_FOR
DOMTEST DISABLED -
select table_name, partition_name, high_value, compression, compress_for
from user_tab_partitions
where table_name = 'DOMTEST';
TABLE_NAME PARTITION_NAME HIGH_VALUE COMPRESSION COMPRESS_FOR
DOMTEST P0 1 ENABLED ADVANCED
DOMTEST SYS_P487584 2 ENABLED ADVANCED
DOMTEST SYS_P487585 3 ENABLED ADVANCED
DOMTEST SYS_P487586 4 ENABLED ADVANCED
DOMTEST SYS_P487587 5 ENABLED ADVANCED
DOMTEST SYS_P487588 6 ENABLED ADVANCED
DOMTEST SYS_P487589 7 ENABLED ADVANCED
DOMTEST SYS_P487590 8 ENABLED ADVANCED
DOMTEST SYS_P487591 9 ENABLED ADVANCED
DOMTEST SYS_P487592 10 ENABLED ADVANCED
DOMTEST SYS_P487593 11 ENABLED ADVANCED
Row Cache Object Clones – ORA-00600 [kkdccui2]
We hit a serious problem about 5 months ago when we upgraded one of the development databases from 19.12 to 19.13. Going back to 19.12 solved the problem so definitely for us something introduced in 19.13
Nothing out there at the time on Oracle Support or from other reliable sources.
Our problem happened under row cache load which in particular effected our commit builds when there was heavy dropping and creating of objects across a number of schemas.
Best indication so far is that this is related to the cloning feature of row cache objects, present in 19.13, new feature with aims to automatically create clones of row cache objects if it detects that the object is accessed frequently.
Apparently many bugs related to this feature have already been addressed but suggestion is that disabling the feature might prevent the issue, via the snappily named parameter _bug33046179_kqr_hot_copy_sleep_limit = 0.
Will follow-up with comment on effectiveness.
Partition List Iterator occasionally reading way too many partitions still
I’m triple posting this on oracle-l, my blog and on Oracle community to get some more help.
Apologies if you’ve tried to help before and it didn’t come to anything.
This is a redux of an old issue which I looked at 18 months ago but with a new example of the problem.
https://orastory.wordpress.com/2021/02/22/oracle-19c-upgrade-query-reading-way-more-partitions-than-it-should/
https://community.oracle.com/tech/developers/discussion/4480886/partition-list-iterator-reading-too-many-partitions/
When the problem originally surfaced it was after an upgrade from 11.2.0.4 to 19.6. Now version is 19.12.
Now I have one example which gets executed four or five times a day
The example has two sql ids involved in the process but both are almost identical and have identical sql plans.
Queries in question get executed after specific data loads.
For each data load where it gets executed, it will be doing pretty much the same workload day-on-day.
The query gets supplied a collection which tends to have one or a couple of rows row and each row in the collection provides the values to prune to a specific subpartition.
The issue is that about once a week, an execution will go slow.
Exactly the same sql plan each execution.
When it works, it runs in a couple of seconds max.
When it goes wrong, the partition list iterator decides to read every partition/subpartition.
This is the snippet of the plan in question:
| 4 | NESTED LOOPS | | 39 | 4017 | 334 (2)| 00:00:01 | | |
| 5 | COLLECTION ITERATOR PICKLER FETCH | | 1 | 2 | 29 (0)| 00:00:01 | | |
| 6 | PARTITION LIST ITERATOR | | 3853 | 380K| 305 (2)| 00:00:01 | KEY | KEY |
| 7 | PARTITION LIST ITERATOR | | 3853 | 380K| 305 (2)| 00:00:01 | KEY | KEY |
| 8 | TABLE ACCESS STORAGE FULL | POSITION_BALANCE | 3853 | 380K| 305 (2)| 00:00:01 | KEY | KEY |
There was some speculation about possible causes in the OTN thread.
The best guess was that there is some timeout or invalidation or event or contention which means Oracle decides that it cannot reliably prune.
There might be other data loads ongoing.
There could be a partition operation (add new or exchange) which coincides with the degraded iterator performance.
There are > 70k partitions / subpartitions.
select count(*) from dba_tab_partitions where table_name = 'POSITION_BALANCE';
COUNT(*)
----------
60941
select count(*), sum(case when segment_created = 'YES' then 1 else 0 end) from dba_tab_subpartitions where table_name = 'POSITION_BALANCE';
COUNT(*) SUM(CASEWHENSEGMENT_CREATED='YES'THEN1ELSE0END)
---------- -----------------------------------------------
78154 33705
I’m reasonably comfortable that what I’ve described is roughly what’s happening.
ASH confirms that it’s reading lots of subpartitions
Just got no idea why and how to figure that out.
The average-ish once-a-week occurence makes it tricker.
Even though when it runs in runs in a couple of seconds, I do have some of the ASH samples that show this.
select /*+ parallel(4) */ instance_number, sql_id, sql_plan_hash_value, sql_full_plan_hash_value, sql_exec_id, sql_exec_start, min(sample_time), max(sample_time), max(sample_time) - min(sample_time) duration, count(*) cnt_samples
, sum(case when in_sql_execution = 'Y' then 1 else 0 end) cnt_in_sql_execution
, sum(case when in_parse = 'Y' then 1 else 0 end) cnt_in_parse
, sum(case when in_hard_parse = 'Y' then 1 else 0 end) cnt_in_hard_parse
from dba_hist_active_sess_history h
where 1=1
and h.user_id IN (select user_id from dba_users where username like '%TODS%')
and sql_id in ('fw985yayy1mfz','0gk5hj9wnp2vs')
--and sql_id = 'ft2prc1xx2hmd'
and sql_exec_id is not null
group by instance_number, sql_id, sql_plan_hash_value, sql_full_plan_hash_value, sql_exec_id, sql_exec_start
order by h.sql_exec_start;
INSTANCE_NUMBER SQL_ID SQL_PLAN_HASH_VALUE SQL_FULL_PLAN_HASH_VALUE SQL_EXEC_ID SQL_EXEC_START MIN(SAMPLE_TIME) MAX(SAMPLE_TIME) DURATION CNT_SAMPLES CNT_IN_SQL_EXECUTION CNT_IN_PARSE CNT_IN_HARD_PARSE
--------------- ------------- ------------------- ------------------------ ----------- -------------------- ------------------------------- ------------------------------- ------------------- ----------- -------------------- ------------ -----------------
4 0gk5hj9wnp2vs 2221895109 2867134094 67108864 23-MAY-2022 09:18:24 23-MAY-22 09.18.27.378000000 AM 23-MAY-22 09.18.27.378000000 AM +00 00:00:00.000000 1 1 0 0
3 0gk5hj9wnp2vs 2221895109 2867134094 50331648 24-MAY-2022 07:59:49 24-MAY-22 07.59.50.659000000 AM 24-MAY-22 07.59.50.659000000 AM +00 00:00:00.000000 1 1 0 0
2 0gk5hj9wnp2vs 2221895109 2867134094 33554432 25-MAY-2022 07:09:07 25-MAY-22 07.09.08.709000000 AM 25-MAY-22 07.09.08.709000000 AM +00 00:00:00.000000 1 1 0 0
2 0gk5hj9wnp2vs 2221895109 2867134094 33554434 25-MAY-2022 09:41:53 25-MAY-22 09.41.59.718000000 AM 25-MAY-22 12.41.49.605000000 PM +00 02:59:49.887000 1054 1054 0 0
2 fw985yayy1mfz 2221895109 2867134094 33554434 25-MAY-2022 10:57:41 25-MAY-22 10.57.49.221000000 AM 25-MAY-22 01.56.37.861000000 PM +00 02:58:48.640000 1048 1048 0 0
3 0gk5hj9wnp2vs 2221895109 2867134094 50331648 25-MAY-2022 11:41:04 25-MAY-22 11.41.05.539000000 AM 25-MAY-22 11.41.05.539000000 AM +00 00:00:00.000000 1 1 0 0
3 0gk5hj9wnp2vs 2221895109 2867134094 50331648 27-MAY-2022 08:01:28 27-MAY-22 08.01.30.371000000 AM 27-MAY-22 08.01.30.371000000 AM +00 00:00:00.000000 1 1 0 0
3 fw985yayy1mfz 2221895109 2867134094 50331648 27-MAY-2022 08:01:38 27-MAY-22 08.01.40.611000000 AM 27-MAY-22 08.01.40.611000000 AM +00 00:00:00.000000 1 1 0 0
1 0gk5hj9wnp2vs 2221895109 2867134094 16777218 01-JUN-2022 06:48:24 01-JUN-22 06.48.27.979000000 AM 01-JUN-22 09.48.17.547000000 AM +00 02:59:49.568000 1054 1054 0 0
3 0gk5hj9wnp2vs 2221895109 2867134094 50331649 01-JUN-2022 09:27:03 01-JUN-22 09.27.10.915000000 AM 01-JUN-22 09.27.10.915000000 AM +00 00:00:00.000000 1 1 0 0
3 fw985yayy1mfz 2221895109 2867134094 50331648 02-JUN-2022 07:11:55 02-JUN-22 07.11.57.315000000 AM 02-JUN-22 07.11.57.315000000 AM +00 00:00:00.000000 1 1 0 0
3 0gk5hj9wnp2vs 2221895109 2867134094 50331648 07-JUN-2022 07:04:13 07-JUN-22 07.04.17.155000000 AM 07-JUN-22 07.04.17.155000000 AM +00 00:00:00.000000 1 1 0 0
2 0gk5hj9wnp2vs 2221895109 2867134094 33554434 07-JUN-2022 08:17:56 07-JUN-22 08.17.58.693000000 AM 07-JUN-22 11.17.49.285000000 AM +00 02:59:50.592000 1054 1054 0 0
1 0gk5hj9wnp2vs 2221895109 2867134094 16777216 08-JUN-2022 07:09:22 08-JUN-22 07.09.24.427000000 AM 08-JUN-22 07.09.24.427000000 AM +00 00:00:00.000000 1 1 0 0
2 0gk5hj9wnp2vs 2221895109 2867134094 33554432 08-JUN-2022 08:14:25 08-JUN-22 08.14.26.278000000 AM 08-JUN-22 08.14.26.278000000 AM +00 00:00:00.000000 1 1 0 0
3 fw985yayy1mfz 2221895109 2867134094 50331649 09-JUN-2022 06:51:27 09-JUN-22 06.51.29.219000000 AM 09-JUN-22 06.51.29.219000000 AM +00 00:00:00.000000 1 1 0 0
3 fw985yayy1mfz 2221895109 2867134094 50331648 10-JUN-2022 11:10:13 10-JUN-22 11.10.14.595000000 AM 10-JUN-22 11.10.14.595000000 AM +00 00:00:00.000000 1 1 0 0
2 0gk5hj9wnp2vs 2221895109 2867134094 33554433 13-JUN-2022 08:45:43 13-JUN-22 08.45.45.509000000 AM 13-JUN-22 08.45.45.509000000 AM +00 00:00:00.000000 1 1 0 0
2 fw985yayy1mfz 2221895109 2867134094 33554433 13-JUN-2022 08:45:55 13-JUN-22 08.45.55.749000000 AM 13-JUN-22 08.45.55.749000000 AM +00 00:00:00.000000 1 1 0 0
3 0gk5hj9wnp2vs 2221895109 2867134094 50331648 15-JUN-2022 07:08:19 15-JUN-22 07.08.22.275000000 AM 15-JUN-22 07.08.22.275000000 AM +00 00:00:00.000000 1 1 0 0
3 0gk5hj9wnp2vs 2221895109 2867134094 50331649 15-JUN-2022 09:08:54 15-JUN-22 09.08.56.387000000 AM 15-JUN-22 12.08.46.083000000 PM +00 02:59:49.696000 1054 1054 0 0
1 fw985yayy1mfz 2221895109 2867134094 16777216 16-JUN-2022 06:54:04 16-JUN-22 06.54.05.259000000 AM 16-JUN-22 06.54.05.259000000 AM +00 00:00:00.000000 1 1 0 0
1 0gk5hj9wnp2vs 2221895109 2867134094 16777217 16-JUN-2022 08:54:13 16-JUN-22 08.54.18.891000000 AM 16-JUN-22 11.54.08.844000000 AM +00 02:59:49.953000 1054 1054 0 0
3 fw985yayy1mfz 2221895109 2867134094 50331649 16-JUN-2022 11:16:29 16-JUN-22 11.16.31.491000000 AM 16-JUN-22 11.16.31.491000000 AM +00 00:00:00.000000 1 1 0 0
4 0gk5hj9wnp2vs 2221895109 2867134094 67108865 17-JUN-2022 07:33:10 17-JUN-22 07.33.11.282000000 AM 17-JUN-22 07.33.11.282000000 AM +00 00:00:00.000000 1 1 0 0
4 fw985yayy1mfz 2221895109 2867134094 67108865 17-JUN-2022 07:33:20 17-JUN-22 07.33.21.522000000 AM 17-JUN-22 07.33.21.522000000 AM +00 00:00:00.000000 1 1 0 0
2 0gk5hj9wnp2vs 2221895109 2867134094 33554432 21-JUN-2022 06:59:24 21-JUN-22 06.59.25.728000000 AM 21-JUN-22 06.59.25.728000000 AM +00 00:00:00.000000 1 1 0 0
2 0gk5hj9wnp2vs 2221895109 2867134094 33554433 21-JUN-2022 09:14:05 21-JUN-22 09.14.10.464000000 AM 21-JUN-22 12.14.00.096000000 PM +00 02:59:49.632000 1054 1054 0 0
2 0gk5hj9wnp2vs 2221895109 2867134094 33554434 21-JUN-2022 11:07:36 21-JUN-22 11.07.44.480000000 AM 21-JUN-22 02.07.23.680000000 PM +00 02:59:39.200000 1053 1053 0 0
4 fw985yayy1mfz 2221895109 2867134094 67108865 21-JUN-2022 11:43:04 21-JUN-22 11.43.13.010000000 AM 21-JUN-22 02.42.33.651000000 PM +00 02:59:20.641000 1051 1051 0 0
4 fw985yayy1mfz 2221895109 2867134094 67108866 21-JUN-2022 12:06:49 21-JUN-22 12.06.57.586000000 PM 21-JUN-22 03.06.38.514000000 PM +00 02:59:40.928000 1053 1053 0 0
3 fw985yayy1mfz 2221895109 2867134094 50331649 23-JUN-2022 10:05:02 23-JUN-22 10.05.05.667000000 AM 23-JUN-22 01.04.45.251000000 PM +00 02:59:39.584000 1053 1053 0 0
1 0gk5hj9wnp2vs 2221895109 2867134094 16777216 23-JUN-2022 10:57:04 23-JUN-22 10.57.04.857000000 AM 23-JUN-22 10.57.04.857000000 AM +00 00:00:00.000000 1 1 0 0
This is a section of a RTSM report when it was ok:
Status : DONE (ALL ROWS)
SQL ID : 0gk5hj9wnp2vs
SQL Execution ID : 16777216
Execution Started : 06/15/2022 10:45:41
First Refresh Time : 06/15/2022 10:45:44
Last Refresh Time : 06/15/2022 10:45:44
SQL Plan Monitoring Details (Plan Hash Value=2221895109)
===============================================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Mem | Activity | Activity Detail |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | (Max) | (%) | (# samples) |
===============================================================================================================================================================================================
| 0 | SELECT STATEMENT | | | | 1 | +3 | 1 | 2255 | | | . | | |
| 1 | TEMP TABLE TRANSFORMATION | | | | 1 | +3 | 1 | 2255 | | | . | | |
| 2 | LOAD AS SELECT (CURSOR DURATION MEMORY) | SYS_TEMP_5FD9DBC0F_97EDDA7B | | | 1 | +3 | 1 | 1 | | | 1MB | | |
| 3 | FILTER | | | | 1 | +3 | 1 | 2255 | | | . | | |
| 4 | NESTED LOOPS | | 39 | 713 | 1 | +3 | 1 | 243K | | | . | | |
| 5 | COLLECTION ITERATOR PICKLER FETCH | | 1 | 29 | 1 | +3 | 1 | 4 | | | . | | |
| 6 | PARTITION LIST ITERATOR | | 3853 | 684 | 1 | +3 | 4 | 243K | | | . | | |
| 7 | PARTITION LIST ITERATOR | | 3853 | 684 | 1 | +3 | 4 | 243K | | | . | | |
| 8 | TABLE ACCESS STORAGE FULL | POSITION_BALANCE | 3853 | 684 | 1 | +3 | 4 | 243K | 71 | 33MB | . | | |
| 9 | NESTED LOOPS | | 1 | 9 | 1 | +3 | 67382 | 140 | | | . | | |
| 10 | NESTED LOOPS | | 1 | 7 | 1 | +3 | 67382 | 140 | | | . | | |
| 11 | NESTED LOOPS | | 1 | 3 | 1 | +3 | 67382 | 67382 | | | . | | |
| 12 | TABLE ACCESS BY INDEX ROWID | CLASS_SCHEME | 1 | 1 | 2 | +2 | 67382 | 67382 | | | . | | |
| 13 | INDEX UNIQUE SCAN | UK_CLASS_SCHEME_CODE | 1 | | 1 | +3 | 67382 | 67382 | | | . | | |
| 14 | TABLE ACCESS BY INDEX ROWID BATCHED | CLASS | 1 | 2 | 1 | +3 | 67382 | 67382 | | | . | | |
| 15 | INDEX RANGE SCAN | UK_CLASS_STATUS2 | 1 | 1 | 1 | +3 | 67382 | 67382 | | | . | | |
| 16 | TABLE ACCESS BY INDEX ROWID BATCHED | BOOK_CLASS | 1 | 4 | 1 | +3 | 67382 | 140 | | | . | | |
| 17 | INDEX RANGE SCAN | UK_BOOK_CLASS_STATUS2 | 2 | 2 | 1 | +3 | 67382 | 336K | 230 | 2MB | . | | |
| 18 | INDEX RANGE SCAN | IDX_BOOK_STATUS_01 | 1 | 2 | 1 | +3 | 140 | 140 | | | . | | |
| 19 | VIEW | | 1 | 2 | 1 | +3 | 1 | 2255 | | | . | | |
| 20 | TABLE ACCESS STORAGE FULL | SYS_TEMP_5FD9DBC0F_97EDDA7B | 1 | 2 | | | 1 | | | | . | | |
===============================================================================================================================================================================================
This is a section taken whilst running in the middle of the problem:
Status : EXECUTING
SQL ID : 0gk5hj9wnp2vs
SQL Execution ID : 33554434
Execution Started : 06/21/2022 11:07:36
First Refresh Time : 06/21/2022 11:07:41
Last Refresh Time : 06/21/2022 13:55:15
SQL Plan Monitoring Details (Plan Hash Value=2221895109)
================================================================================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Mem | Activity | Activity Detail | Progress |
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | | (%) | (# samples) | |
================================================================================================================================================================================================================================
| 0 | SELECT STATEMENT | | | | | | | | | | . | | | |
| 1 | TEMP TABLE TRANSFORMATION | | | | | | | | | | . | | | |
| 2 | LOAD AS SELECT (CURSOR DURATION MEMORY) | SYS_TEMP_0FD9E59B6_25BEB636 | | | | | 1 | | | | . | | | |
| 3 | FILTER | | | | | | 1 | | | | . | | | |
| 4 | NESTED LOOPS | | 39 | 710 | 1 | +5 | 1 | 0 | | | . | | | |
| -> 5 | COLLECTION ITERATOR PICKLER FETCH | | 1 | 29 | 10061 | +5 | 1 | 1 | | | . | | | |
| 6 | PARTITION LIST ITERATOR | | 3853 | 681 | | | 1 | | | | . | | | |
| 7 | PARTITION LIST ITERATOR | | 3853 | 681 | | | 38175 | | | | . | | | |
| -> 8 | TABLE ACCESS STORAGE FULL | POSITION_BALANCE | 3853 | 681 | 10065 | +1 | 44021 | 0 | 625K | 162GB | . | 100.00 | gc cr block remote read (1) | 100% |
| | | | | | | | | | | | | | gc cr grant 2-way (26) | |
| | | | | | | | | | | | | | gc cr multi block request (85) | |
| | | | | | | | | | | | | | gc current grant 2-way (1) | |
| | | | | | | | | | | | | | Cpu (7530) | |
| | | | | | | | | | | | | | cell multiblock physical read (1946) | |
| | | | | | | | | | | | | | cell single block physical read (215) | |
| | | | | | | | | | | | | | cell smart table scan (15) | |
| 9 | NESTED LOOPS | | 1 | 9 | | | | | | | . | | | |
| 10 | NESTED LOOPS | | 1 | 7 | | | | | | | . | | | |
| 11 | NESTED LOOPS | | 1 | 3 | | | | | | | . | | | |
| 12 | TABLE ACCESS BY INDEX ROWID | CLASS_SCHEME | 1 | 1 | | | | | | | . | | | |
| 13 | INDEX UNIQUE SCAN | UK_CLASS_SCHEME_CODE | 1 | | | | | | | | . | | | |
| 14 | TABLE ACCESS BY INDEX ROWID BATCHED | CLASS | 1 | 2 | | | | | | | . | | | |
| 15 | INDEX RANGE SCAN | UK_CLASS_STATUS2 | 1 | 1 | | | | | | | . | | | |
| 16 | TABLE ACCESS BY INDEX ROWID BATCHED | BOOK_CLASS | 1 | 4 | | | | | | | . | | | |
| 17 | INDEX RANGE SCAN | UK_BOOK_CLASS_STATUS2 | 2 | 2 | | | | | | | . | | | |
| 18 | INDEX RANGE SCAN | IDX_BOOK_STATUS_01 | 1 | 2 | | | | | | | . | | | |
| 19 | VIEW | | 1 | 2 | | | | | | | . | | | |
| 20 | TABLE ACCESS STORAGE FULL | SYS_TEMP_0FD9E59B6_25BEB636 | 1 | 2 | | | | | | | . | | | |
================================================================================================================================================================================================================================
Potentially I could set system level trace for this sql id and wait for it to reoccur but what events would I be setting to try to capture what causes this runtime decision?
I presume it wouldn’t be an optimizer trace?
Any thoughts or ideas?
Reluctant Local Index
I have having a couple of not-completely-related performance issues at the moment but which link back to a couple of queries against large tables choosing to do a FULL segment scan rather than using a non-prefixed local index.
The problems right now are not currently reproducible in or outside of production.
But when I look at a specific point in time at the reproduction of a problem, the system state is not necessarily as it was when the problem was occurring.
My main theory is that somewhere on these large partitioned tables there is intermittently an unusable partition of the index perhaps due some aspect of a data load, data movement or other maintenance operation on a different subpartition (where each partition represents a different day of a specific feed and each subpartition represents a version of that daily load).
However I have no evidence currently to support that theory.
If this theory was in the right ballpark, what would it mean?
That a query using bind variables (tick) would not use the index by default as the optimizer would not be able to guarantee that that shareable plan would not be accessing the unusable.
A query using literals would not be subject to the same restrictions as the optimizer would know, if an unusable subpartition was relevant, that the unusable subpartition was not relevant to this specific query (and any DDL which might make a relevant partition unusable would invalidate the shared cursor).
Whilst I was musing over this yesterday, I had a vague sense of familiarity and I found these previous posts to job my memory:
But just to illustrate my theory…
Let’s start with a roughly representative table:
create table t1
(pkey varchar2(24) not null
,spkey number not null
,id varchar2(24) not null
,version number not null
,status varchar2(24) not null
,filler varchar2(255))
partition by list (pkey)
subpartition by list (spkey)
subpartition template
(subpartition sp_1 values (1))
(partition p_0 values ('X'));
alter table t1 add partition p_abc values ('ABC');
alter table t1 add partition p_def values ('DEF');
alter table t1 add partition p_ghi values ('GHI');
create index i1 on t1 (id) local;
create sequence s1 start with 10000000;
create type o1 is object
(pkey varchar2(24)
,spkey number
,id varchar2(24));
/
create type c1 is table of o1;
/
select object_name, subobject_name, object_type from user_objects where object_name = 'T1' order by object_name, subobject_name nulls first;
OBJECT_NAM SUBOBJECT_ OBJECT_TYPE
---------- ---------- -----------------------
T1 TABLE
T1 P_0 TABLE PARTITION
T1 P_0_SP_1 TABLE SUBPARTITION
T1 P_ABC TABLE PARTITION
T1 P_ABC_SP_1 TABLE SUBPARTITION
T1 P_DEF TABLE PARTITION
T1 P_DEF_SP_1 TABLE SUBPARTITION
T1 P_GHI TABLE PARTITION
T1 P_GHI_SP_1 TABLE SUBPARTITION
select object_name, subobject_name, object_type from user_objects where object_name = 'I1' order by object_name, subobject_name nulls first;
OBJECT_NAM SUBOBJECT_ OBJECT_TYPE
---------- ---------- -----------------------
I1 INDEX
I1 P_0 INDEX PARTITION
I1 P_0_SP_1 INDEX SUBPARTITION
I1 P_ABC INDEX PARTITION
I1 P_ABC_SP_1 INDEX SUBPARTITION
I1 P_DEF INDEX PARTITION
I1 P_DEF_SP_1 INDEX SUBPARTITION
I1 P_GHI INDEX PARTITION
I1 P_GHI_SP_1 INDEX SUBPARTITION
And I’m going to seed some data simply:
declare
p sys.odcivarchar2list := sys.odcivarchar2list('ABC','DEF','GHI');
begin
for i in 1 .. p.count
loop
for j in 1 .. 10000
loop
insert into t1
values (p(i), 1, 'PB:'||s1.nextval, 1, 'LATEST', rpad('X',255,'X'));
end loop;
end loop;
commit;
end;
/
select pkey, spkey, count(*), min(id), max(id) from t1 group by pkey, spkey;
PKEY SPKEY COUNT(*) MIN(ID) MAX(ID)
---- ---------- ---------- ------------------------ ------------------------
ABC 1 10000 PB:10000000 PB:10009999
DEF 1 10000 PB:10010000 PB:10019999
GHI 1 10000 PB:10020000 PB:10029999
And then just to keep it representative to my real world problem, I’m going to run a bulk update to set the status of some of the rows to SUPERSEDED (and which is then in the real world followed by the INSERT of some LATEST versions of those rows supplied by the client but no need to do that here):
declare
v1 c1 := c1(o1('DEF',1,'PB:10010001'),
o1('DEF',1,'PB:10010002'),
o1('DEF',1,'PB:10010003'),
o1('DEF',1,'PB:10010004'),
o1('DEF',1,'PB:10010005'));
begin
forall i in 1 .. v1.count
update /*+ find_me_dom */
t1
set status = 'SUPERSEDED'
where pkey = v1(i).pkey
and spkey = v1(i).spkey
and id = v1(i).id;
end;
/
And I’m going to lookup my sql id from v$sql and plug it into DBMS_XPLAN to see how my UPDATE performed:
select * from table(dbms_xplan.display_cursor('bjddz8c4jrk3y',0));
-------------------------------------------------------------
| Id | Operation | Name |
-------------------------------------------------------------
| 0 | UPDATE STATEMENT | |
| 1 | UPDATE | T1 |
| 2 | PARTITION LIST SINGLE | |
| 3 | PARTITION LIST SINGLE | |
| 4 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T1 |
| 5 | INDEX RANGE SCAN | I1 |
-------------------------------------------------------------
Range Scan of I1 as desired
Now I will mark one subpartition of the index unusable (different subpartition from one affected by the update) and repeat the update, the getting of the SQL and the fetching of the plan (different child number):
alter index i1 modify subpartition P_ABC_SP_1 unusable;
--Repeat dml
--lookup sql
select * from table(dbms_xplan.display_cursor('bjddz8c4jrk3y',1));
-- ^ different child cursor number because marking the index subpartition as unusable
-- has invalidated the previous child cursor 0
---------------------------------------------
| Id | Operation | Name |
---------------------------------------------
| 0 | UPDATE STATEMENT | |
| 1 | UPDATE | T1 |
| 2 | PARTITION LIST SINGLE | |
| 3 | PARTITION LIST SINGLE | |
| 4 | TABLE ACCESS STORAGE FULL| T1 |
---------------------------------------------
The access method is no longer an index range scan.
As I covered initially, as part of the parse process the execution plan has to be good for all possible ranges of supplied values in the event that that SQL child cursor is shared for executions with different binds than those initially parsed with. Because an update (or select etc, etc) against rows for table subpartition (ABC,1) cannot use the index because that index subpartition is unusable, the optimizer has to discount it.
We can force the index:
declare
v1 c1 := c1(o1('DEF',1,'PB:10010001'),
o1('DEF',1,'PB:10010002'),
o1('DEF',1,'PB:10010003'),
o1('DEF',1,'PB:10010004'),
o1('DEF',1,'PB:10010005'));
begin
forall i in 1 .. v1.count
update /*+ find_me_dom index(t1 (id))*/
t1
set status = 'SUPERSEDED'
where pkey = v1(i).pkey
and spkey = v1(i).spkey
and id = v1(i).id;
end;
/
select * from table(dbms_xplan.display_cursor('3m2xyxgruxkpr',0));
-------------------------------------------------------------
| Id | Operation | Name |
-------------------------------------------------------------
| 0 | UPDATE STATEMENT | |
| 1 | UPDATE | T1 |
| 2 | PARTITION LIST SINGLE | |
| 3 | PARTITION LIST SINGLE | |
| 4 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T1 |
| 5 | INDEX RANGE SCAN | I1 |
-------------------------------------------------------------
But note, as covered in my earlier link, if we force the index in a situation where it can’t be used, it will error in current versions:
alter index i1 modify subpartition P_DEF_SP_1 unusable;
declare
v1 c1 := c1(o1('DEF',1,'PB:10010001'),
o1('DEF',1,'PB:10010002'),
o1('DEF',1,'PB:10010003'),
o1('DEF',1,'PB:10010004'),
o1('DEF',1,'PB:10010005'));
begin
forall i in 1 .. v1.count
update /*+ find_me_dom index(t1 (id))*/
t1
set status = 'SUPERSEDED'
where pkey = v1(i).pkey
and spkey = v1(i).spkey
and id = v1(i).id;
end;
/
ORA-01502: index 'DOM.I1' or partition of such index is in unusable state
ORA-06512: at line 8
01502. 00000 - "index '%s.%s' or partition of such index is in unusable state"
*Cause: An attempt has been made to access an index or index partition
that has been marked unusable by a direct load or by a DDL
operation
*Action: DROP the specified index, or REBUILD the specified index, or
REBUILD the unusable index partition
If I rebuild that second partition again, then I can look at my second option – using literals not binds, at least for the partition keys, but really we then lose the option of FORALL.
alter index i1 rebuild subpartition P_DEF_SP_1;
declare
v1 c1 := c1(o1('DEF',1,'PB:10010001'),
o1('DEF',1,'PB:10010002'),
o1('DEF',1,'PB:10010003'),
o1('DEF',1,'PB:10010004'),
o1('DEF',1,'PB:10010005'));
begin
for i in 1 .. v1.count
loop
execute immediate
'update /*+ find_me_dom */
t1
set status = ''SUPERSEDED''
where pkey = '''||v1(i).pkey||''''||'
and spkey = '||v1(i).spkey||'
and id = :1' using v1(i).id;
end loop;
end;
/
select * from table(dbms_xplan.display_cursor('5a3vcac58x32q',0));
----------------------------------------
| Id | Operation | Name |
----------------------------------------
| 0 | UPDATE STATEMENT | |
| 1 | UPDATE | T1 |
| 2 | PARTITION LIST SINGLE | |
| 3 | PARTITION LIST SINGLE| |
| 4 | INDEX RANGE SCAN | I1 |
----------------------------------------
At least we have options whilst playing a waiting game to see if we can observe a problem state which might cause such an issue…. or wait for other possibilities to make themselves known…
Need an Ace Up Your Sleeve?
I’m open to new opportunities.
There’s a bit about me here https://orastory.wordpress.com/about-2/ and there are links to my linkedin profile both there and in the side nav
I’m currently an Oracle Ace (for a few more weeks at least!), I love writing code, digging out significant performance gains in existing processes and troubleshooting problems. Simples.
If you’re interested, please drop me a mail, add a comment, send me a linkedin message, anything.
Performance Issue with inline view, UNION ALL and Virtual Column
This is the distillation of a performance problem in some regulatory reporting where a mistake in the optimizer cardinality estimates causes all sorts of knock-on performance issues. This post does not look at those knock-on effects but examines what is happening with the optimizer in this case. Most of the digging in the issue belongs to Valerii Satybaldyev and thanks to Sayan Malakshinov, notably for the contributions on the oracle-l thread.
Script to reproduce (currently on 19.6):
drop table t1;
create table t1
(id number not null
,version number not null
,create_ts timestamp not null
,modify_ts timestamp
,status varchar2(24) generated always as (NVL2("MODIFY_TS",'SUPERSEDED','LATEST'))
,id2 number not null
,yn varchar2(1) not null
,business_date date not null);
insert into t1
(id, version, create_ts, id2, yn, business_date)
select rownum
, 1
, systimestamp
, rownum
, case when mod(rownum,2) = 1 then 'Y' else 'N' end
, trunc(sysdate,'MON') + mod(rownum,10)
from dual
connect by level <= 1000;
exec dbms_stats.gather_table_stats(USER,'T1');
explain plan for
with x as
(select * from t1
union all
select * from t1)
select *
from x
where yn = 'Y';
select * from table(dbms_xplan.display);
The problem comes with the estimate of 1 row in the outermost SELECT, particularly for subsequent joins this cardinality estimate causes significant performance problems.
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 42 | 547 (0)| 00:00:01 |
| 1 | VIEW | | 1000 | 40000 | 548 (0)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
|* 3 | TABLE ACCESS STORAGE FULL| T1 | 500 | 20000 | 274 (0)| 00:00:01 |
|* 4 | TABLE ACCESS STORAGE FULL| T1 | 500 | 20000 | 274 (0)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - storage("T1"."YN"='Y')
filter("T1"."YN"='Y')
4 - storage("T1"."YN"='Y')
filter("T1"."YN"='Y')
If we look at the 10053 trace, we can see the root cause of the problem.
Compare the BASE STATISTICAL INFORMATION snippets for T1 and for the inline view X.
Note in particular the numbers for column YN which is the predicate as well as for ID2 and BUSINESS_DATE.
Note that these are columns which appear after the virtual column STATUS.
(in the original trace file, the entry under T1 for column #8 comes before column #7, I have reordered below just to make the correlation between T1 and X more obvious)
Table Stats::
Table: T1 Alias: T1
#Rows: 1000 SSZ: 0 LGR: 0 #Blks: 1006 AvgRowLen: 40.00 NEB: 0 ChainCnt: 0.00 ScanRate: 0.00 SPC: 0 RFL: 0 RNF: 0 CBK: 0 CHR: 0 KQDFLG: 1
#IMCUs: 0 IMCRowCnt: 0 IMCJournalRowCnt: 0 #IMCBlocks: 0 IMCQuotient: 0.000000
Column (#8): BUSINESS_DATE(DATE)
AvgLen: 8 NDV: 10 Nulls: 0 Density: 0.100000 Min: 2459336.000000 Max: 2459345.000000
Column (#7): YN(VARCHAR2)
AvgLen: 2 NDV: 2 Nulls: 0 Density: 0.500000
Column (#6): ID2(NUMBER)
AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 1.000000 Max: 1000.000000
Column (#4): MODIFY_TS(TIMESTAMP)
AvgLen: 1 NDV: 0 Nulls: 1000 Density: 0.000000
Column (#3): CREATE_TS(TIMESTAMP)
AvgLen: 11 NDV: 1 Nulls: 0 Density: 1.000000 Min: 2459347.407141 Max: 2459347.407141
Column (#2): VERSION(NUMBER)
AvgLen: 3 NDV: 1 Nulls: 0 Density: 1.000000 Min: 1.000000 Max: 1.000000
Column (#1): ID(NUMBER)
AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 1.000000 Max: 1000.000000
Table Stats::
Table: X Alias: X (NOT ANALYZED)
#Rows: 2000 SSZ: 0 LGR: 0 #Blks: 2012 AvgRowLen: 40.00 NEB: 0 ChainCnt: 0.00 ScanRate: 0.00 SPC: 0 RFL: 0 RNF: 0 CBK: 0 CHR: 0 KQDFLG: 1
#IMCUs: 0 IMCRowCnt: 0 IMCJournalRowCnt: 0 #IMCBlocks: 0 IMCQuotient: 0.000000
Column (#8): BUSINESS_DATE(DATE) NO STATISTICS (using defaults)
AvgLen: 7 NDV: 0 Nulls: 0 Density: 0.000000
Column (#7): YN(VARCHAR2)
AvgLen: 8 NDV: 10 Nulls: 0 Density: 0.100000 Min: 2459336.000000 Max: 2459345.000000
Column (#6): ID2(NUMBER)
AvgLen: 2 NDV: 2 Nulls: 0 Density: 0.500000
Column (#5): STATUS(VARCHAR2)
AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 1.000000 Max: 1000.000000
Column (#4): MODIFY_TS(TIMESTAMP)
AvgLen: 1 NDV: 0 Nulls: 2000 Density: 0.000000 Min: 0.000000 Max: 0.000000
Column (#3): CREATE_TS(TIMESTAMP)
AvgLen: 11 NDV: 1 Nulls: 0 Density: 1.000000 Min: 2459347.407141 Max: 2459347.407141
Column (#2): VERSION(NUMBER)
AvgLen: 3 NDV: 1 Nulls: 0 Density: 1.000000 Min: 1.000000 Max: 1.000000
Column (#1): ID(NUMBER)
AvgLen: 4 NDV: 1000 Nulls: 0 Density: 0.001000 Min: 1.000000 Max: 1000.000000
For inline view X, column STATUS inherits the stats for ID2, ID2 gets those of YN, YN those of BUSINESS_DATE and BUSINESS_DATE gets NO STATISTICS.
So for our predicate on YN we get completely the wrong base statistics.
Then in addition, because the supplied value is out of range from the wrong statistics, estimates get further pro-rated.
Potential workaround is not risk-free as it moves the implicit order of the selected columns which could affect certain code.
alter table t1 modify status invisible;
alter table t1 modify status visible;
explain plan for
with x as
(select * from t1
union all
select * from t1)
select *
from x
where yn = 'Y';
select * from table(dbms_xplan.display);
Plan hash value: 3505968351
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 90000 | 548 (0)| 00:00:01 |
| 1 | VIEW | | 1000 | 90000 | 548 (0)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
|* 3 | TABLE ACCESS STORAGE FULL| T1 | 500 | 20000 | 274 (0)| 00:00:01 |
|* 4 | TABLE ACCESS STORAGE FULL| T1 | 500 | 20000 | 274 (0)| 00:00:01 |
------------------------------------------------------------------------------------
Reason it works is that the virtual column is now logically at the end of the column definitions.
If we compare the trace once “fixed”, then we see the following in the X section:
Table: X Alias: X (NOT ANALYZED)
...
Column (#8): STATUS(VARCHAR2) NO STATISTICS (using defaults)
Online Statistics Gathering
Online Statistics Gathering, https://sqlmaria.com/2017/01/03/online-statistics-gathering/, is a pretty cool feature for the ages-old version that is Oracle 12.1, especially data warehouses.
As a recap, when you do direct path load into an empty segment, Oracle will take advantage of gathering statistics on the data whilst it is loading it, i.e no need to go back and re-read in an explicit stats gathering.
Note that, per Maria’s note, only base table and column statistics are gathered, not histograms but those can be gathered without re-gathering base column statistics via GATHER AUTO.
Old news.
We have only recently upgraded a complex application from 11.2.0.4 and hit an issue which is really obvious in hindsight but just worth pointing out, particular in combination with another feature?
I’m sure the behaviour below will come as no surprise in 11.2.0.4
drop table t1;
create global temporary table t1
on commit preserve rows
as
select * from user_objects where rownum <= 100;
explain plan for
select * from t1;
select * from table(dbms_xplan.display);
select num_rows from user_tables where table_name = 'T1';
select column_name, num_nulls, num_distinct from user_tab_columns where table_name = 'T1';
The output of note:
Plan hash value: 3617692013
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 19000 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS STORAGE FULL | T1 | 100 | 19000 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
NUM_ROWS
----------
COLUMN_NAME NUM_NULLS NUM_DISTINCT
-------------------- ---------- ------------
GENERATED
SECONDARY
NAMESPACE
EDITION_NAME
OBJECT_NAME
SUBOBJECT_NAME
OBJECT_ID
DATA_OBJECT_ID
OBJECT_TYPE
CREATED
LAST_DDL_TIME
TIMESTAMP
STATUS
TEMPORARY
We load 100 rows, there are no stats, dynamic sampling kicks in, bingo the optimizer expects 100 rows.
Then for > 11.2.0.4 ( in this case 19.6), with the addition of SCOPE column in the script for 19c, we see that we now have table and column statistics after the load despite no explicit stats gathering call. Which, given the feature we’re discussing, should be completely obvious and expected.
I’ve used a temporary table here deliberately purely to facilitate a comment at the end. Clearly this could be a normal heap table – but the SCOPE in the output is specific to the new feature of temporary table private statistics.
Plan hash value: 3617692013
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 11300 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS STORAGE FULL | T1 | 100 | 11300 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------
Note
-----
- Global temporary table session private statistics used
NUM_ROWS SCOPE
---------- -------
SHARED
100 SESSION
COLUMN_NAME NUM_NULLS NUM_DISTINCT SCOPE
--------------------- ---------- ------------ -------
OBJECT_NAME 0 7 SESSION
SUBOBJECT_NAME 7 93 SESSION
OBJECT_ID 0 100 SESSION
DATA_OBJECT_ID 3 97 SESSION
OBJECT_TYPE 0 3 SESSION
CREATED 0 95 SESSION
LAST_DDL_TIME 0 97 SESSION
TIMESTAMP 0 96 SESSION
STATUS 0 1 SESSION
TEMPORARY 0 1 SESSION
GENERATED 0 2 SESSION
SECONDARY 0 1 SESSION
NAMESPACE 0 1 SESSION
EDITION_NAME 100 0 SESSION
SHARING 0 1 SESSION
EDITIONABLE 100 0 SESSION
ORACLE_MAINTAINED 0 1 SESSION
APPLICATION 0 1 SESSION
DEFAULT_COLLATION 94 1 SESSION
DUPLICATED 0 1 SESSION
SHARDED 0 1 SESSION
CREATED_APPID 100 0 SESSION
CREATED_VSNID 100 0 SESSION
MODIFIED_APPID 100 0 SESSION
MODIFIED_VSNID 100 0 SESSION
Again, we’ve loaded 100 rows, as part of the direct path load, we have some base & column statstics, the optimizer expects 100 rows. No dynamic sampling necessary
But what if our data is loaded in multiple steps, might this new online statistics gathering cause us a bit of a problem?
It only gathers the stats when the segment is empty.
I’m not going to repeat the output above for the table & column statistics, just the execution plan detail.
drop table t1;
create global temporary table t1
on commit preserve rows
as
select * from user_objects where rownum <= 100;
insert /*+ append */ into t1
select * from user_objects
commit;
insert /*+ append */ into t1
select * from user_objects
commit;
explain plan for
select * from t1;
Which in 11.2.0.4 gives:
Plan hash value: 3617692013
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 32376 | 6007K | 93 (0)| 00:00:01 |
| 1 | TABLE ACCESS STORAGE FULL | T1 | 32376 | 6007K | 93 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
select count(*) from t1;
COUNT(*)
----------
34392
And in 19.6
Plan hash value: 3617692013
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 11300 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS STORAGE FULL | T1 | 100 | 11300 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------
Note
-----
- Global temporary table session private statistics used
select count(*) from t1;
COUNT(*)
----------
59712*
*With an obvious difference in actual rows, because these are different environments on different versions.
The problem now is that our subsequent processing code which drives off a table populated with different volumes of data in multiple stages used to work fine in 11.2.0.4 but is now at risk of some serious consequences from having cardinality mis-estimates thanks to inaccurate statistics from our new feature which is behaving as it should.
In our real-world example, it was not always the case that there was a big skew between the first load and the subsequent loads and actually this consequence, obvious in hindsight, took a few weeks to reveal itself until that first load into an empty segment happened to be very small followed by some other loads which were much larger.
Furthermore, this also involved the private temporary table aspect and so when you’re looking at solutions, you might consider whether some explicit stats gathering at the end of the load is the solution – in which case should it be private stats or shared stats or whether it even matters when quite possibly the easiest solution is to just use dynamic sampling in the subsequent select/s.
I was reminded of this issue when writing this other article earlier on a more serious problem that online statistics gathering was causing:
Real-time Statistics ORA-00600s / Integer Overflow
Unfortunately this is just an anecdotal long after-the-fact note on issues observed on upgrade from 11.2.0.4 to 19.6 with new feature real-time statistics. I thought I had joted this down previously, but doesn’t appear to be another article published or in draft. Issue was not reproducible outside of the application unfortunately and was also intermittent.
Feature was causing several issues in one particular area of the application.
First off was ORA-00600s in the processing with immediate disconnect on the processing session (apart from trace file generation)
Second was sometimes leaving evidence of integer overflow issues in column statistics.
Third, “corrupt” statistics then leading to subsequent bad execution plans for complicated queries evidenced by MERGE JOIN / MERGE JOIN CARTESIAN and row estimates of max 18E.
Examples of the latter complex queries with cardinality issues leading to bad plans and severe performance problems, when distilled to simplest form, could be seen as follows:
explain plan for
SELECT *
FROM XXXXX
WHERE business_date = :busines_date
AND id = :id
AND processed_flag = 'Y';
select * from table(dbms_xplan.display);
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 18E| 15E| 647K (1)| 00:00:26 |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| XXXXXXXXXXXX | 18E| 15E| 647K (1)| 00:00:26 |
|* 2 | INDEX RANGE SCAN | XXXXXXXXXXXX | 733K| | 1538 (1)| 00:00:01 |
----------------------------------------------------------------------------------------------------
Note
-----
- dynamic statistics used: statistics for conventional DML
explain plan for
SELECT /*+ opt_param('optimizer_features_enable','18.1.0') */ *
FROM XXXXX
WHERE business_date = :busines_date
AND id = :id
AND processed_flag = 'Y';
select * from table(dbms_xplan.display);
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 1405 | 13 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| XXXXXXXXXXXX | 5 | 1405 | 13 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | XXXXXXXXXXXX | 10 | | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
The root cause corruption evidenced as follows:
select num_nulls, notes from dba_tab_col_statistics where....
NUM_NULLS SAMPLE_SIZE NOTES
---------------------- ----------- ------------------
-917567866115141580119 267 STATS_ON_CONVENTIONAL_DML
500 267 STATS_ON_CONVENTIONAL_DML
500 267 STATS_ON_CONVENTIONAL_DML
-10558214104677900 267 STATS_ON_CONVENTIONAL_DML
-917567866114408000000 267 STATS_ON_CONVENTIONAL_DML
-917567866114408000000 267 STATS_ON_CONVENTIONAL_DML
-115153305100 267 STATS_ON_CONVENTIONAL_DML
It should be obvious that NUM_NULLs cannot be < 1 and also some strange, unsuitable values in some parts on LOW_VALUE which I’ve omitted from output above.
Traction with Oracle Support was disappointing and, unfortunately, once they knew we didn’t have a problem if we just turned the feature off, that was good enough for them and to be honest we just needed to move on in our testing – this was back in August 2020.
So, feature real-time statistics is turned off across the board via the parameters below which being underscore parameters we had Oracle Support’s blessing to do.
_optimizer_use_stats_on_conventionaml_dml
_optimizer_gather_stats_on conventional_dml
Oracle 19c Upgrade: Query hanging on PX Deq: Join Ack Part 2
Previously I had blogged about an intermittent problem which was affecting one of our feeds where a parallel query would just hang as the QC would wait forever on a parallel slave which never joined the party.
The theory was the parallel slave was affected by another issue – possibly the slave process died and until PMON cleans it up, the QC assumes it is still alive (ref doc id 250960.1)
Intermittent problems can be a right pain but this issue mainly affected one particular process most days kicking off between 1 and 2 am – same process runs 30-40 times per day but only tended to have the issue once so it seemed somewhat data dependent – but also would affect some other processes in different schemas very occasionally.
Over time, evidence built from trace files suggest that the issue was mostly related to:
ORA-00600: internal error code, arguments: [17182]
or
ORA-00600: internal error code, arguments: [17114]
or both in the same error stack
If the QC received the error, connection would terminate instantly (apart from trace file generation), if the slave received the error then QC would hang.
One patch – 31045929 – suggested by Oracle Support was applied but was not effective.
Second patch – 28907129 – similar.
Ultimately, a workaround, which was tested to validate that the second patch was in the right ballpark, was effective and that was to disable _optimizer_gather_stats_on_load (https://sqlmaria.com/2017/01/03/online-statistics-gathering/)
It’s disappointing to have to turn off yet another “new” feature – this one related to stats gathering for direct path into an empty segment – when we have already had to turn off real-time statistics for conventional dml previously ( link to future post – I thought I had blogged about this but seems not) but hopefully we will still be able to pursue patch fix to success.
Count Rows in Partition Where…
Just an illustration of a SQL / XML utility to count rows in a partition.
In my case, my requirement was to count rows in all subpartitions where there is data which might not belong in that partition (due to NOVALIDATE partition exchange). My thoughts were that I wanted the equivalent of a NESTED LOOP – for each partition… do XYZ and was happy for this to run for a while in the background without consuming significant parallel resources.
Partition keys here are not nullable.
with s as (select /*+ */ sp.table_owner, sp.table_name, sp.subpartition_name , sn.snapshot_id, sn.version from dba_tab_subpartitions sp join my_snaphot_table sn on (sn.subpartition_name = sp.subpartition_name) where sp.table_owner = :owner and sp.table_name = :table_name) select s.*, x.cnt from s , xmltable('for $i in /ROWSET/ROW/CNT return $i' passing xmltype( dbms_xmlgen.getxml ('select /*+ parallel(2) */ count(*) cnt ' ||'from '||s.table_owner||'.'||s.table_name||' SUBPARTITION ('||s.subpartition_name||') ' ||'where (snapshot_id, snapshot_version) != (('''||s.snapshot_id||''', '||s.version||'))' )) columns cnt number path '/') x where x.cnt > 0;
Oracle 19c Upgrade: Bug 31602782 – Same Instance Slave Parse Failure Flood Control
It’s almost a universal truth that for any non-trivial application you can expect to hit one or two serious incidents in production post-upgrade, most often due to some combination of workload and/or timing which just didn’t happen in testing beforehand.
In this instance (/these instances), it was the bug 31602782, the patch for which has the above title, the doc id description being “Contention on CURSOR: Pin S wait on X when PQ slave’s execution plan does not match with QC.
This has been around since 12.2 and is first included in 20.1. The doc describes it as
When a parallel query is executed, QC sends parse message to all slaves in all instances…. if the slave could not build the cursor with the same plan as QC and could not join, another slave tries the same and so on until all slaves get exhausted
In practice, what this meant for us was avery large and significant contention on library cache lock and cursor: pin s wait for x affecting all data loads, large parsing backlogs, issues with anything then wanting library cache lock on the same underlying objects – e.g.partition maintenance operations, etc
There were two specific processes involved and had we not got a bug diagnosis when we did, I would have started to rewrite those processes. It is not uncommon that the flows which end up hitting bugs and performance issues have mileage left in them in terms of optimisation. In hindsight, I believe we did hit this problem in UAT but not in this area and in UAT I had rewritten the process in question as it was problematic anyway,
Certainly once this patch was applied, our most urgent post-upgrade issues were largely resolved.
As an aside, I would urge anyone looking at an upgrade to review beforehand the list of Important Recommended One-off Patches which for 19c is Doc Id: 2720807.1
Oracle 19c Upgrade: Query hanging on PX Deq: Join Ack
Next installment of issues observed post Oracle Upgrade (11.2.0.4 to 19.6)
Intermittent query affecting a number of SQL statement, but one particular SQL more than any other.
Event PX Deq: Join Ack is usually seen briefly when the QC (Query Co-ordinator) has to build the slave sets in order to execute a SQL statement in parallel. A message is sent to each slave and then the QC waits for acknowledgement (reference Doc Id: 250960.1)
On my application, when this problem occurs, the QC will hang forever waiting for the acknowledgement from one or a slow sequence of slaves. Longest I saw before intervention was 3 days.
Per that note, possible reason for prolonged wait times is that something happened to the PQ slave process. Unless/until PMON cleans up the internal structures, any new QC will assume the process is alive. Another possible cause is depletion of SGA memory such that the slave cannot establish its response channel.
With us, this issue is currently under investigation with Oracle Support but is believed to correlate to memory corruption errors:
ORA-00600: internal error code, arguments: [17182]
which ties back to bug 31045929: PGA Memory Corruption Caused By Cursor Frame Overrun
Oracle 19c Upgrade: Query reading way more partitions than it should
Still a work in progress… but despite a query plan which details that partition pruning should be happening, evidence (both physical reads and obvious timing) suggest that an OLTP-style query is reading all table partitions.
There is no evidence to suggest that this was happening before the upgrade (11.2.0.4 to 19.6) and plenty of evidence of it afterwards. Not consistent suggesting some significant triggering event.
Oracle: Groundbreakers Developer Community https://community.oracle.com/tech/developers/discussion/4480886/partition-list-iterator-reading-too-many-partitions/ currently has all the latest information. If I ever get to a conclusion, then I might collate & summarise here.
Exchange Partition Validation
Testing an upgrade from 11.2.0.4 to 19.6, we observed a very significant degradation in partition exchange performance.
Is it a case of expected behaviour or not?
The documentation is quite clear in both 11.2 and 19c:
https://docs.oracle.com/cd/E11882_01/server.112/e25523/part_admin002.htm#i1107555
When you specify WITHOUT
VALIDATION
for the exchange partition operation, this is normally a fast operation because it involves only data dictionary updates. However, if the table or partitioned table involved in the exchange operation has a primary key or unique constraint enabled, then the exchange operation is performed as if WITH
VALIDATION
were specified to maintain the integrity of the constraints.
Personally, in a data warehouse with exchange partition I would tend to have defined such a constraint as ENABLE NOVALIDATE if it was even required.
The CONSTRAINT is a PK constraint backed up a local index, partition keys as leading columns.
The trace file points to a clear culprit:
select /*+ first_rows(1) ordered */ 1
from "X"."FB_P_161261_SP_9" A? ,"X"."FACILITY_BALANCE" B
where "A"."SNAPSHOT_ID" = "B"."SNAPSHOT_ID" and "A"."SNAPSHOT_VERSION" = "B"."SNAPSHOT_VERSION" and "A"."FACILITY_BALANCE_ID" = "B"."FACILITY_BALANCE_ID" and "A"."VERSION" = "B"."VERSION"
and ( tbl$or$idx$part$num("X"."FACILITY_BALANCE",0,0,0 ,B? .rowid ) < 62358 or tbl$or$idx$part$num("X"."FACILITY_BALANCE",0,0,0 ,B? .rowid ) > 62358 )
and tbl$or$idx$part$num("X"."FACILITY_BALANCE",0,0,0 ,A? ."SNAPSHOT_ID" ,A? ."SNAPSHOT_VERSION" ) <> 62358 and rownum < 2
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 8.40 8.46 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 4229.27 5278.73 27688674 27783294 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 4237.67 5287.20 27688674 27783294 0 0
We know from Jonathan Lewis’s work on http://www.nocoug.org/download/2008-11/partitioning_NOCOUG.pdf that this is expected internal SQL from a ENABLE VALIDATE constraint in an EXCHANGE WITHOUT VALIDATION scenario
So why the big change in performance?
Probably this is an indication of just a simple execution plan issue. I would need to trace the same in 11g to confirm which is easier said than done under the circumstances.
This is the info from the 19c tkprof at least:
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
0 0 0 COUNT STOPKEY (cr=27783294 pr=27688674 pw=0 time=287636041 us starts=1)
0 0 0 NESTED LOOPS (cr=27783294 pr=27688674 pw=0 time=287636035 us starts=1 cost=4 size=1061 card=1)
238651024 238651024 238651024 VIEW VW_JF_SET$91EDBF25 (cr=27783288 pr=27688673 pw=0 time=3245932610 us starts=1 cost=4 size=62 card=2)
238651024 238651024 238651024 UNION-ALL (cr=27783288 pr=27688673 pw=0 time=2708356160 us starts=1)
11351302 11351302 11351302 PARTITION COMBINED ITERATOR PARTITION: KEY KEY (cr=69570 pr=69469 pw=0 time=5491712 us starts=1 cost=2 size=43 card=1)
11351302 11351302 11351302 INDEX STORAGE FAST FULL SCAN FIRST ROWS PK_FACILITY_BALANCE PARTITION: KEY KEY (cr=69570 pr=69469 pw=0 time=4380693 us starts=20 cost=2 size=43 card=1)(object id 42709299)
227299722 227299722 227299722 PARTITION COMBINED ITERATOR PARTITION: KEY KEY (cr=27713718 pr=27619204 pw=0 time=2376271490 us starts=1 cost=2 size=43 card=1)
227299722 227299722 227299722 INDEX STORAGE FAST FULL SCAN FIRST ROWS PK_FACILITY_BALANCE PARTITION: KEY KEY (cr=27713718 pr=27619204 pw=0 time=1916674628 us starts=62357 cost=2 size=43 card=1)(object id 42709299)
0 0 0 INDEX UNIQUE SCAN FB_P_161261_SP_9$C3 (cr=6 pr=1 pw=0 time=1734545604 us starts=238651024 cost=0 size=1030 card=1)(object id 72993779)
What is clear is that if the constraint definition is changed to ENABLE NOVALIDATE then the exchange partition takes about 1 minute. Which still feels a bit slow but a heck of a lot better than 90 minutes.
If you see Mohamed Houri’s comment below, this experience is not isolated and in his case, they went for EXCHANGE PARTITION (WITH VALIDATION). After the Christmas break, I will re-evaluate the choice I made.
Parallel Window Consolidator Calls The Stops
Sharing observations of performance issue arising out of testing 19.6 upgrade from 11.2.0.4 (bug possibly from 12+ looking at the fixes which work).
Sharing this one in particular as it doesn’t seem that common from anecdotal evidence (forums, blogs, support), it doesn’t appear fixed yet, progress with support has been disappointing, however circumstances don’t seem particularly niche.
We’ve had a few issues with a number of parallel queries just “hanging” forever.
Issue seems to be related to WINDOW CONSOLIDATOR and the parallel distribution method.
Haven’t been able to find a good matching bug via Oracle Support (nor happy with the progress of SR with them) but found a good match on a Lothar Flatz blog post which led me to the PQ_DISTRIBUTE_WINDOW as cause.
SQL contains parallel + join to view, view contains analytic
SQL Executes in under a second “normally”.
Without a fix, the SQL will just hang for a long time. The longest I’ve left it is 3 days but given the wait states of the sessions involved, there’s no reason to think it would stop of its own accord.
Jumping ahead to the workarounds which work, any of the following:
- Turn off fix_control 13345888 via alter session or opt_param
- Turn off “_adaptive_window_consolidator_enabled” via alter session/system or opt_param
- Hint a PQ distribution method other than 2. 3 doesn’t work here so 1 is the other option – PQ_DISTRIBUTE_WINDOW(@query_block 1)
Here we see some evidence from just now, running for nearly 30 minutes:
select s.inst_id i,s.sid,s.serial#,s.module
, s.blocking_session blks,s.blocking_instance bi,s.final_blocking_session f_blk_s, s.final_blocking_instance fbi
, CASE WHEN state != 'WAITING' THEN 'WORKING' ELSE 'WAITING' END AS state
, CASE WHEN state != 'WAITING' THEN 'On CPU / runqueue' ELSE event END AS sw_event, s.seconds_in_wait wait
, s.sql_id,s.sql_exec_id,to_char(s.sql_exec_start,'DD-MON-YYYY HH24:MI:SS') exec_start
from gv$session s
where sql_id = '5xyrp7v4djagv';
I SID SERIAL# MODULE BLKS BI F_BLK_S FBI STATE SW_EVENT WAIT SQL_ID SQL_EXEC_ID EXEC_START
---------- ---------- ---------- -------------------- ---------- ---------- ---------- ---------- ------- ------------------------- ---------- ------------- ----------- -----------------------------
2 2805 15947 SQL Developer WAITING PX Deq: Execution Msg 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 3095 3608 SQL Developer WAITING PX Deq: Execution Msg 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 3367 28066 SQL Developer WAITING PX Deq: Execution Msg 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 5610 8452 SQL Developer WAITING PX Deq: Table Q Normal 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 5885 54481 SQL Developer 9672 2 9672 2 WAITING PX Deq: Execute Reply 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 8828 57832 SQL Developer WAITING PX Deq: Execution Msg 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 9111 37143 SQL Developer WAITING PX Deq: Execution Msg 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 9383 50792 SQL Developer WAITING PX Deq: Execution Msg 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 9672 29993 SQL Developer WAITING PX Deq: Execution Msg 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
Real-time SQL Monitoring report looks per below (note Duration compared to any other progress/metrics)… note that in later versions of Oracle, RTSM has a habit of timing out and reporting DONE(ERROR) even though the SQL is still going:
Global Information
------------------------------
Status : EXECUTING
Instance ID : 2
SQL ID : 5xyrp7v4djagv
SQL Execution ID : 33554434
Execution Started : 10/26/2020 12:18:59
First Refresh Time : 10/26/2020 12:18:59
Last Refresh Time : 10/26/2020 12:18:59
Duration : 1592s
Module/Action : SQL Developer/-
Service : LNTDH8U.UK.DB.COM
Program : SQL Developer
Global Stats
=========================================
| Elapsed | Cpu | Other | Buffer |
| Time(s) | Time(s) | Waits(s) | Gets |
=========================================
| 0.00 | 0.00 | 0.00 | 3 |
=========================================
Parallel Execution Details (DOP=4 , Servers Allocated=8)
==========================================================================================
| Name | Type | Server# | Elapsed | Cpu | Other | Buffer | Wait Events |
| | | | Time(s) | Time(s) | Waits(s) | Gets | (sample #) |
==========================================================================================
| PX Coordinator | QC | | 0.00 | 0.00 | 0.00 | 3 | |
| p000 | Set 1 | 1 | | | | | |
| p001 | Set 1 | 2 | | | | | |
| p002 | Set 1 | 3 | | | | | |
| p003 | Set 1 | 4 | | | | | |
| p004 | Set 2 | 1 | | | | | |
| p005 | Set 2 | 2 | | | | | |
| p006 | Set 2 | 3 | | | | | |
| p007 | Set 2 | 4 | | | | | |
==========================================================================================
SQL Plan Monitoring Details (Plan Hash Value=3844894891)
===============================================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Activity | Activity Detail |
| | | | (Estim) | | Active(s) | Active | | (Actual) | (%) | (# samples) |
===============================================================================================================================================================================================
| 0 | SELECT STATEMENT | | | | | | 1 | | | |
| -> 1 | PX COORDINATOR | | | | 1 | +0 | 1 | 0 | 100.00 | Cpu (1) |
| 2 | PX SEND QC (RANDOM) | :TQ20006 | 2M | 445 | | | | | | |
| 3 | HASH JOIN | | 2M | 445 | | | | | | |
| 4 | PART JOIN FILTER CREATE | :BF0000 | 232 | 441 | | | | | | |
| 5 | PX RECEIVE | | 232 | 441 | | | | | | |
| 6 | PX SEND BROADCAST | :TQ20005 | 232 | 441 | | | | | | |
| 7 | VIEW | | 232 | 441 | | | | | | |
| 8 | WINDOW CONSOLIDATOR BUFFER | | 232 | 441 | | | | | | |
| 9 | PX RECEIVE | | 232 | 441 | | | | | | |
| 10 | PX SEND HASH | :TQ20004 | 232 | 441 | | | | | | |
| 11 | WINDOW BUFFER | | 232 | 441 | | | | | | |
| 12 | HASH JOIN | | 232 | 441 | | | | | | |
| 13 | NESTED LOOPS | | 505 | 4 | | | | | | |
| 14 | TABLE ACCESS BY GLOBAL INDEX ROWID | D_B | 1 | 3 | | | | | | |
| 15 | PX RECEIVE | | 1 | 2 | | | | | | |
| 16 | PX SEND HASH (BLOCK ADDRESS) | :TQ20002 | 1 | 2 | | | | | | |
| 17 | PX SELECTOR | | | | | | | | | |
| 18 | INDEX UNIQUE SCAN | D_B_PK | 1 | 2 | | | | | | |
| 19 | INDEX FULL SCAN | D_R_UK01 | 505 | 1 | | | | | | |
| 20 | PX RECEIVE | | 108 | 437 | | | | | | |
| 21 | PX SEND BROADCAST | :TQ20003 | 108 | 437 | | | | | | |
| 22 | VIEW | V_V2 | 108 | 437 | | | | | | |
| 23 | WINDOW CONSOLIDATOR BUFFER | | 108 | 437 | | | | | | |
| 24 | PX RECEIVE | | 108 | 437 | | | | | | |
| 25 | PX SEND HASH | :TQ20001 | 108 | 437 | | | | | | |
| 26 | WINDOW SORT | | 108 | 437 | | | | | | |
| 27 | NESTED LOOPS | | 108 | 436 | | | | | | |
| 28 | NESTED LOOPS | | 108 | 4 | | | | | | |
| 29 | TABLE ACCESS BY GLOBAL INDEX ROWID | D_B | 1 | 3 | | | | | | |
| 30 | PX RECEIVE | | 1 | 2 | | | | | | |
| 31 | PX SEND HASH (BLOCK ADDRESS) | :TQ20000 | 1 | 2 | | | | | | |
| 32 | PX SELECTOR | | | | | | | | | |
| 33 | INDEX UNIQUE SCAN | D_B_PK | 1 | 2 | | | | | | |
| 34 | INDEX RANGE SCAN | D_BS_UK01 | 108 | 1 | | | | | | |
| 35 | PX COORDINATOR | | | | | | | | | |
| 36 | PX SEND QC (RANDOM) | :TQ10002 | 1 | 4 | | | | | | |
| 37 | BUFFER SORT | | 2M | | | | | | | |
| 38 | VIEW | V_V1 | 1 | 4 | | | | | | |
| 39 | UNION ALL PUSHED PREDICATE | | | | | | | | | |
| 40 | TABLE ACCESS BY GLOBAL INDEX ROWID | D_S | 1 | 2 | | | | | | |
| 41 | BUFFER SORT | | | | | | | | | |
| 42 | PX RECEIVE | | 1 | 1 | | | | | | |
| 43 | PX SEND HASH (BLOCK ADDRESS) | :TQ10000 | 1 | 1 | | | | | | |
| 44 | PX SELECTOR | | | | | | | | | |
| 45 | INDEX UNIQUE SCAN | D_S_PK | 1 | 1 | | | | | | |
| 46 | TABLE ACCESS BY INDEX ROWID | D_S | 1 | 2 | | | | | | |
| 47 | BUFFER SORT | | | | | | | | | |
| 48 | PX RECEIVE | | 1 | 1 | | | | | | |
| 49 | PX SEND HASH (BLOCK ADDRESS) | :TQ10001 | 1 | 1 | | | | | | |
| 50 | PX SELECTOR | | | | | | | | | |
| 51 | INDEX UNIQUE SCAN | D_S | 1 | 1 | | | | | | |
| 52 | PX BLOCK ITERATOR ADAPTIVE | | 8255 | 3 | | | | | | |
| 53 | TABLE ACCESS STORAGE FULL | F_FACT | 8255 | 3 | | | | | | |
===============================================================================================================================================================================================
ASH says it’s not doing anything (as we’d expect it to, being consistent with RTSM report above).
select count(*), min(sample_time), max(sample_time), sysdate from gv$active_session_history where sql_id = '5xyrp7v4djagv';
COUNT(*) MIN(SAMPLE_TIME) MAX(SAMPLE_TIME) SYSDATE
---------- ---------------------------- ---------------------------- --------------------
1 26-OCT-20 12.18.59.828000000 26-OCT-20 12.18.59.828000000 26-OCT-2020 12:50:29
Miscellaneous Slave Info
select
decode(px.qcinst_id,NULL,'QC',
' - '||lower(substr(pp.SERVER_NAME,
length(pp.SERVER_NAME)-4,4) ) )"Username",
decode(px.qcinst_id,NULL, 'QC', '(Slave)') "QC/Slave" ,
to_char( px.server_set) "SlaveSet",
to_char(s.sid) "SID",
to_char(px.inst_id) "Slave INST",
decode(px.qcinst_id, NULL ,to_char(s.sid) ,px.qcsid) "QC SID",
to_char(px.qcinst_id) "QC INST",
px.req_degree "Req. DOP",
px.degree "Actual DOP"
from gv$px_session px,
gv$session s ,
gv$px_process pp
where px.sid=s.sid (+)
and px.serial#=s.serial#(+)
and px.inst_id = s.inst_id(+)
and px.sid = pp.sid (+)
and px.serial#=pp.serial#(+)
and s.sql_id = '5xyrp7v4djagv'
order by 6 , 1 desc
/
Usernam QC/Slav SlaveSet SID Slave INST QC SID QC INST Req. DOP Actual DOP
------- ------- ---------- --------- ---------- ---------- -------- ------------- -----------
QC QC 5885 2 5885
- p007 (Slave) 2 9672 2 5885 2 4 4
- p006 (Slave) 2 3367 2 5885 2 4 4
- p005 (Slave) 2 9395 2 5885 2 4 4
- p004 (Slave) 2 3095 2 5885 2 4 4
- p003 (Slave) 1 9111 2 5885 2 4 4
- p002 (Slave) 1 2805 2 5885 2 4 4
- p001 (Slave) 1 8828 2 5885 2 4 4
- p000 (Slave) 1 5610 2 5885 2 4 4
Detail! And don’t forget the notes…
…and how sparse tables can mess up your dynamic sampling.
A little lesson to self about paying attention to the detail, the obvious frickin’ detail. Getting older and slower.
We had one of those problems where a multitude of things were slow and yet there were no significant system-level problems (e.g. no general slow IO, no setting of the fast=false parameter, etc)
In such situations, if you check the high level system level metrics – AWR etc – and nothing is obvious then it makes sense to narrow the scope & drill into a specific problem, and distill that problem/statement as small as you can. If you can find the issue there then you might hope to find that that is the same as the wider problem and if not well you can just take that approach to every individual area,
I had a problem across a number of reporting modules, I picked on one module and from there I picked on the statement which was currently running and which was running for much longer than expected, I picked out one of the obvious problem joins within the problem statement and started looking at a simple two table join where the cardinality estimates were going wrong.
Most of the specific details of this plan and SQL can be ignored. It’s not important.
What is important is the cardinality difference.
And in fact, in this particular cut down example, the cardinality difference does not have the plan impact it does in the original setting.
set pages 999
explain plan for
SELECT /*+ parallel(4)*/
count(*)
FROM psbal
LEFT OUTER JOIN
prdct
ON (prdct.d_product_id = psbal.d_product_id
AND prdct.create_ts TO_TIMESTAMP('14.10.2020 08:16:34.822981', 'DD.MM.YYYY HH24:MI:SS.FF6'));
select * from table(dbms_xplan.display);
Plan hash value: 3885851338
--------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 72 | 3430K (3)| 01:54:22 | | | | | |
| 1 | SORT AGGREGATE | | 1 | 72 | | | | | | | |
| 2 | PX COORDINATOR | | | | | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10003 | 1 | 72 | | | | | Q1,03 | P->S | QC (RAND) |
| 4 | SORT AGGREGATE | | 1 | 72 | | | | | Q1,03 | PCWP | |
|* 5 | HASH JOIN | | 7466M| 500G| 3430K (3)| 01:54:22 | | | Q1,03 | PCWP | |
| 6 | PART JOIN FILTER CREATE | :BF0000 | 286K| 1680K| 56 (0)| 00:00:01 | | | Q1,03 | PCWP | |
| 7 | PX RECEIVE | | 286K| 1680K| 56 (0)| 00:00:01 | | | Q1,03 | PCWP | |
| 8 | PX SEND BROADCAST | :TQ10000 | 286K| 1680K| 56 (0)| 00:00:01 | | | Q1,00 | P->P | BROADCAST |
| 9 | PX BLOCK ITERATOR | | 286K| 1680K| 56 (0)| 00:00:01 | | | Q1,00 | PCWC | |
| 10 | INDEX STORAGE FAST FULL SCAN| S_PK | 286K| 1680K| 56 (0)| 00:00:01 | | | Q1,00 | PCWP | |
|* 11 | HASH JOIN RIGHT OUTER | | 7466M| 458G| 3425K (3)| 01:54:12 | | | Q1,03 | PCWP | |
| 12 | PX RECEIVE | | 1 | 30 | 306 (0)| 00:00:01 | | | Q1,03 | PCWP | |
| 13 | PX SEND BROADCAST | :TQ10001 | 1 | 30 | 306 (0)| 00:00:01 | | | Q1,01 | P->P | BROADCAST |
| 14 | PX BLOCK ITERATOR | | 1 | 30 | 306 (0)| 00:00:01 | 1 | 3 | Q1,01 | PCWC | |
|* 15 | TABLE ACCESS STORAGE FULL | PRODUCT | 1 | 30 | 306 (0)| 00:00:01 | 1 | 3 | Q1,01 | PCWP | |
|* 16 | HASH JOIN | | 7466M| 250G| 3420K (3)| 01:54:01 | | | Q1,03 | PCWP | |
| 17 | PX RECEIVE | | 6903K| 118M| 16379 (1)| 00:00:33 | | | Q1,03 | PCWP | |
| 18 | PX SEND BROADCAST | :TQ10002 | 6903K| 118M| 16379 (1)| 00:00:33 | | | Q1,02 | P->P | BROADCAST |
| 19 | PX BLOCK ITERATOR | | 6903K| 118M| 16379 (1)| 00:00:33 | 1 | 3 | Q1,02 | PCWC | |
|* 20 | TABLE ACCESS STORAGE FULL | BOOK | 6903K| 118M| 16379 (1)| 00:00:33 | 1 | 3 | Q1,02 | PCWP | |
| 21 | PX PARTITION LIST JOIN-FILTER | | 8064M| 135G| 3398K (3)| 01:53:17 |:BF0000|:BF0000| Q1,03 | PCWC | |
| 22 | TABLE ACCESS STORAGE FULL | BALANCE | 8064M| 135G| 3398K (3)| 01:53:17 |:BF0000|:BF0000| Q1,03 | PCWP | |
--------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("FCT"."S_VID"="DNSP"."S_VID")
11 - access("PRDCT"."PRODUCT_ID"(+)="FCT"."PRODUCT_ID")
15 - storage(NVL("PRDCT"."MODIFY_TS"(+),TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14 08:16:34.822981000' AND
"PRDCT"."CREATE_TS"(+)TIMESTAMP' 2020-10-14 08:16:34.822981000' AND
"PRDCT"."CREATE_TS"(+)<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
16 - access("BOOK_ID"="FCT"."BOOK_ID" AND "VERSION"="FCT"."BOOK_VERSION")
20 - storage("COLUMN1" IS NULL OR "D_BOOK_VID"=(-2) OR "D_BOOK_VID"=(-1))
filter("COLUMN1" IS NULL OR "D_BOOK_VID"=(-2) OR "D_BOOK_VID"=(-1))
Note
-----
- dynamic sampling used for this statement (level=7)
- Degree of Parallelism is 4 because of hint
and this:
set pages 999
explain plan for
select count(*)
FROM prdct
WHERE prdct.create_ts <= TO_TIMESTAMP('14.10.2020 08:16:34.822981', 'DD.MM.YYYY HH24:MI:SS.FF6')
AND NVL(prdct.modify_ts, DATE '9999-12-31') > TO_TIMESTAMP('14.10.2020 08:16:34.822981', 'DD.MM.YYYY HH24:MI:SS.FF6');
select * from table(dbms_xplan.display);
Plan hash value: 2639949168
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 14 | 1103 (1)| 00:00:14 | | |
| 1 | SORT AGGREGATE | | 1 | 14 | | | | |
| 2 | PARTITION LIST ALL | | 399 | 5586 | 1103 (1)| 00:00:14 | 1 | 3 |
|* 3 | TABLE ACCESS STORAGE FULL| PRODUCT | 399 | 5586 | 1103 (1)| 00:00:14 | 1 | 3 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - storage(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14
08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
filter(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14
08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
Once you see the problem, you can’t unsee it but for whatever reason, for ages, I was focusing on why there would be such a cardinality difference between A and B for what are basically the single table cardinality estimates for the same table without seeing the explanation right there in front of me.
Note
-----
- dynamic sampling used for this statement (level=7)
I then got distracted and spent some time investigating why dynamic sampling was kicking in. Parallel query has different rules about when it can kick in. And I found reasons why it might.
But then I put that thought to one side.
Why is dynamic sampling saying this will return 1 row?
This is a small table and the estimate should be accurate and we like dynamic sampling so we want to show the feature to its full capability whenever and wherever it kicks in
Sure enough, as we might expect, if I hint dynamic sampling on the single table, estimate is one.
set pages 999
explain plan for
select /*+ dynamic_sampling(4) */ count(*)
FROM prdct
WHERE prdct.create_ts <= TO_TIMESTAMP('14.10.2020 08:16:34.822981', 'DD.MM.YYYY HH24:MI:SS.FF6')
AND NVL(prdct.modify_ts, DATE '9999-12-31') > TO_TIMESTAMP('14.10.2020 08:16:34.822981', 'DD.MM.YYYY HH24:MI:SS.FF6');
select * from table(dbms_xplan.display);
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 14 | 1103 (1)| 00:00:14 | | |
| 1 | SORT AGGREGATE | | 1 | 14 | | | | |
| 2 | PARTITION LIST ALL | | 1 | 14 | 1103 (1)| 00:00:14 | 1 | 3 |
|* 3 | TABLE ACCESS STORAGE FULL| PRODUCT | 1 | 14 | 1103 (1)| 00:00:14 | 1 | 3 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - storage(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14
08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
filter(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14
08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
Note
-----
- dynamic sampling used for this statement (level=4)
Without access to the sample statements, what suggestive evidence can we find?
select table_name, num_rows, blocks, avg_row_len
from user_tables
where table_name = 'PRODUCT';
TABLE_NAME NUM_ROWS BLOCKS AVG_ROW_LEN
------------------------------ ---------- ---------- -----------
PRODUCT 462 4064 193
So that’s not a lot of rows in a lot more blocks than should be necessary.
At this point, we should go off and find out what the reasons might be.
For what it’s worth, for this application, my suspicions are historic direct path maintenance code coupled with table partitioning on status (LATEST/SUPERSEDED) and therefore row movement means an update is a delete + insert, coupled with periods of very low rate of change and small row sizes.
So we end up with old rows moving partition – using a new block for the updated version and leaving behind an empty block where the version was before the update and then relatively few new version rows per run which are using new blocks, and so lots of empty blocks.
i.e. sparsely populated table.
And then dynamic sampling kicks in and samples a bunch of blocks, and finds few actual rows from those samples and comes up with a low estimate and then the whole problem starts. And these were shared tables across the reporting modules so it had a reasonable wide impact.
It would be nice to back this up with some trace files detailing any of the dynamic sampling numbers but I don’t have access by default and mocking this all up outside of work is always a worthwhile exercise but just takes up so much time.
Anyway… move some partitions, rebuild some indexes and 4000 blocks becomes just over 40.
Now our dynamic sampling is looking much better:
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 14 | 1103 (1)| 00:00:14 | | |
| 1 | SORT AGGREGATE | | 1 | 14 | | | | |
| 2 | PARTITION LIST ALL | | 297 | 14 | 1103 (1)| 00:00:14 | 1 | 3 |
|* 3 | TABLE ACCESS STORAGE FULL| PRODUCT | 297 | 14 | 1103 (1)| 00:00:14 | 1 | 3 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - storage(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14
08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
filter(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14
08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
Note
-----
- dynamic sampling used for this statement (level=4)
Exercise repeated for a whole bunch of tables in a similar situation and everything across the board is a lot healthier.