Dominic Brooks
ORs, IN lists and LNNVL
I’ve previously written about manually rewriting an OR condition into a UNION ALL using LNNVL.
This is a description of a performance issue observed in the real world from the optimizer coming up with a CONCATENATION operation against many child operations including an INLIST operator and other children which then have to use LNNVL as a filter.
Really this is nothing more than an illustration of our old performance killing friend the function call from within SQL, particularly from within predicates, and the problem of context switching.
Ok. So, the real world example is a crazy query.
It contains 1091 OR clauses each stipulating a unique key lookup.
Yes, “1091″ OR clauses.
I know, I know.
Probably itself a workaround to avoid the 1000 limit on an IN list, who knows?
Anyway.
I’m going to use a table just to illustrate the execution plan and then use some metrics from the execution of the real world example.
drop table t1;
create table t1
(col1 number
,col2 number
,col3 varchar2(10)
,constraint pk_t1 primary key (col1, col2));
insert into t1
select mod(rownum,10000)+1 col1
, ceil(rownum/10000) col2
, rpad('X',10,'X') col3
from dual
connect by rownum <= 100000
order by col1, col2;
commit;
If we use a simple multi-column OR condition, we don’t get the desired LNNVLs.
explain plan for select * from t1 where (col1 = 1 and col2 = 1) or (col1 = 2 and col2 = 2) or (col1 = 3 and col2 = 3) or (col1 = 3 and col2 = 4); select * from table(dbms_xplan.display);
The optimizer just uses an INLIST iterator:
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1129 | 37257 | 5 (0)| 00:00:01 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 1129 | 37257 | 5 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | PK_T1 | 7 | | 4 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access(("COL1"=1 AND "COL2"=1 OR "COL1"=2 AND "COL2"=2 OR "COL1"=3 AND
"COL2"=3 OR "COL1"=3 AND "COL2"=4))
However, if we combine an IN list with some of those ORs:
explain plan for select * from t1 where (col1 = 1 and col2 = 1) or (col1 = 2 and col2 = 2) or (col1 = 3 and col2 IN (3,4)); select * from table(dbms_xplan.display);
Which gives
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1140 | 37620 | 2 (0)| 00:00:01 |
| 1 | CONCATENATION | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID | T1 | 11 | 363 | 1 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | PK_T1 | 1 | | 2 (0)| 00:00:01 |
| 4 | INLIST ITERATOR | | | | | |
| 5 | TABLE ACCESS BY INDEX ROWID| T1 | 1129 | 37257 | 1 (0)| 00:00:01 |
|* 6 | INDEX UNIQUE SCAN | PK_T1 | 1 | | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("COL1"=3)
filter("COL2"=3 OR "COL2"=4)
6 - access(("COL1"=1 AND "COL2"=1 OR "COL1"=2 AND "COL2"=2))
filter(LNNVL("COL1"=3) OR LNNVL("COL2"=3) AND LNNVL("COL2"=4))
We get the CONCATENATION of the simpler OR conditions which are combined into a single INLIST ITERATOR operations with another child operation which applies LNNVLs in the filter predicate.
Now imagine we have a much larger number of the simpler OR predicates combined with liberal scatterings of such IN clauses throughout our 1091 OR’d predicates.
Perhaps we’d be a bit concerned about the context switching and increased cpu usage?
Looking at my real world example right here… there’s no point me pasting in the thousands of lines.
If I run the real world behemoth:
Client elapsed time 108 seconds
DBMS_XPLAN.DISPLAY_CURSOR tells me it ran in 1.26 seconds:
plan hash value: 4262066066
--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1628 |00:00:01.26 | 3352 | 418 |
| 1 | CONCATENATION | | 1 | | 1628 |00:00:01.26 | 3352 | 418 |
| 2 | INLIST ITERATOR | | 1 | | 966 |00:00:00.94 | 2735 | 389 |
| 3 | TABLE ACCESS BY INDEX ROWID| XXXXXXXXXX | 966 | 624 | 966 |00:00:00.94 | 2735 | 389 |
|* 4 | INDEX RANGE SCAN | XXXXXXXXXX_UK01 | 966 | 624 | 966 |00:00:00.86 | 1915 | 380 |
| 5 | TABLE ACCESS BY INDEX ROWID | XXXXXXXXXX | 1 | 1 | 2 |00:00:00.01 | 4 | 0 |
|* 6 | INDEX RANGE SCAN | XXXXXXXXXX_UK01 | 1 | 1 | 2 |00:00:00.01 | 3 | 0 |
....
| 253 | TABLE ACCESS BY INDEX ROWID | XXXXXXXXXX | 1 | 2 | 47 |00:00:00.01 | 14 | 1 |
|*254 | INDEX RANGE SCAN | XXXXXXXXXX_UK01 | 1 | 2 | 47 |00:00:00.01 | 3 | 1 |
--------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access(((("XXXXXXXXXX"."VERSION"=1 AND "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611722) OR
("XXXXXXXXXX"."VERSION"=1 AND "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611795) OR ("XXXXXXXXXX"."VERSION"=1 AND
"XXXXXXXXXX"."ID_XXXXXXXXXX"=1611863) OR ("XXXXXXXXXX"."VERSION"=1 AND
"XXXXXXXXXX"."ID_XXXXXXXXXX"=1612023) OR ("XXXXXXXXXX"."VERSION"=1 AND .....
6 - access "XXXXXXXXXX"."ID_XXXXXXXXXX"=2046939)
filter(((LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611722)) AND
(LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611795)) AND
(LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611863)) AND
.....
DBMS_MONITOR tells me the query ran in 1.68 seconds:
Global Information Status : DONE (ALL ROWS) Instance ID : 2 Session ID : 1220 SQL ID : 5fvt3tfbgmqp3 SQL Execution ID : 33554432 Plan Hash Value : 4262066066 Execution Started : 04/05/2013 14:53:42 First Refresh Time : 04/05/2013 14:53:42 Last Refresh Time : 04/05/2013 14:53:45 -------------------------------------------------------------------- | Elapsed | Cpu | IO | Cluster | Fetch | Buffer | Reads | | Time(s) | Time(s) | Waits(s) | Waits(s) | Calls | Gets | | -------------------------------------------------------------------- | 1.68 | 0.50 | 1.13 | 0.05 | 4 | 3352 | 418 | --------------------------------------------------------------------
Fortunately V$SQL tells me the real story:
select sql_id, executions, rows_processed, elapsed_time/1000/1000, cpu_time/1000/1000 from v$sql where sql_id = '5fvt3tfbgmqp3';
Gives
SQL_ID EXECUTIONS ROWS_PROCESSED ELAPSED_TIME/1000/1000 CPU_TIME/1000/1000 ------------- ---------- -------------- ---------------------- ------------------ 5fvt3tfbgmqp3 1 1628 105.81511 104.581102
Then again, if you write nasty code, nasty things can happen.
Which of my sql statements are using dynamic sampling?
From a reply I gave on an OTN forum thread, how to see all queries currently in memory which use dynamic sampling?
Dynamic sampling is an attribute in V$SQL_PLAN.OTHER_XML.
In 11gR1, it says “yes”.
In 11gR2, it gives the level.
Using EXTRACTVALUE which is deprecated in 11.2:
select p.sql_id, extractvalue(h.column_value,'/info') lvl from v$sql_plan p , table(xmlsequence(extract(xmltype(p.other_xml),'/other_xml/info'))) h where p.other_xml is not null and extractvalue(h.column_value,'/info/@type') = 'dynamic_sampling';
Another approach using one of the prefered alternatives – XMLTABLE:
select p.sql_id, t.val
from v$sql_plan p
, xmltable('for $i in /other_xml/info
where $i/@type eq "dynamic_sampling"
return $i'
passing xmltype(p.other_xml)
columns attr varchar2(50) path '@type',
val varchar2(50) path '/') t
where p.other_xml is not null;
Warning: I tried a number of approaches with XMLTABLE and whilst they all worked in 11.2.0.3, in 11.1.0.7 they all ran into the error below. It might well just be a local issue but I did not investigate:
ORA-00600: internal error code, arguments: [qctVCO:csform], [0], [0], [0], [0], [112], [2], [224], [], [], [], []
Fun with Distributed Transactions II
Short on examples, long on words…
Last week I did a quick post about a couple of the more obscure implications of using distributed transactions, in particular:
- The current impact on subquery materialisations
- The incompatibility with MVs
- The unusual situation of transactions with locks but no sessions
Anyway, earlier today I was having a closer look at some “DFS lock handle” wait events in an 11gR1 RAC database (not because there was an obvious problem but because I saw a few of them an wanted to have a closer look).
On a DFS lock handle wait, you have to decode P1 to find out what it’s all about as discussed by Riyaj Shamsudeen.
In my case, these were mostly DX and BB enqueues which are related to my old friend the distributed transaction.
Why am I still banging on about distributed transactions?
Well, you might think that they’re not very common but in the JDBC world they seem to be everywhere.
XA transactions are used (overused) throughout the JDBC world and so they might well be very relevant to any database that you’ve got with a JDBC app sitting atop.
A common pattern seen is to take some message off a queue, do something related to that message in the database, and use an XA transaction so that both stand or fail together.
In my previous post, I mentioned an AskTom thread discussing Materialized Views and distributed transactions.
I added some thoughts to that thread and Tom’s observation was
that the java/jdbc world for some reasons wishes to use an external resource manager so they have to do two phase commits against a single database isn’t what I was talking about. when I’ve seen the XA stuff – there is typically *one* database involved and it makes everything really complex,
hard to understand and slower than it needs to be.
Which might well be a fair point but still doesn’t change the fact that if your insert/update/delete comes in on an XA, then being unable to use a fast refresh on commit MV is really quite a restriction.
Anyway, back to my observations on these enqueues.
There’s not much information related to the BB enqueue but I believe that it’s related to the coordination of global transactions on a RAC cluster.
From 11gR1 distributed transactions can be processed on any instance in the cluster and these BB enqueues seem to be part of that picture. Prior to this change, any branches of a distributed transaction had to execute on the same node.
Whilst we’re on the subject, inevitably there are some bugs around these, in particular the BB enqueue and GTX processes (processes introduced to manage these enhanced cross cluster distributed transaction features).
The DX enqueue is perhaps more interesting.
It looks after “tightly coupled distributed transactions”.
So, WTF are tightly coupled distributed transactions and how do they differ from loosely coupled distributed transactions?
Tightly coupled transactions enable other branches of the transaction – and from our database perspective we’re only talking about multiple sessions in the same database which are part of the same distributed transaction – to:
- share each others locks and
- see each others changes.
The DX enqueue helps manage this.
As part of this, what the DX enqueue does is make sure that only one transaction branch is actively executing SQL at any one time.
So that’s potentially a pretty hefty point of serialisation then.
For more information, the Oracle whitepaper “XA and Oracle controlled Distributed Transactions” linked to below is a really good resource particularly the section “Distributed Transactions and Database Locking”.
However… just to emphasise the point …
In a tightly coupled tranasction, the DX enqueue is obtained before executing any statement.
By contrast, loosely coupled tranasctions do not need to get this DX lock before executing a statement, i.e. no serialisation between different transaction branches.
As the developer’s guide below says “loosely coupled transaction branches result in greater concurrency.”
So, not only is there an overhead to XA transactions but there is an additional overhead to tightly coupled transactions.
And, how many applications really use multiple transaction branches in a single database.
Very, very few, I wager!
And if they do, how many of those applications appreciate the serialisation involved anyway?
Now depending on your version, you might not see the DX and BB enqueues in your enqueue statistics.
In 11.1.0.7 I don’t seem to see anything in V$ENQUEUE_STAT for DX or BB.
There are however some relevant session/system statistic buckets:
- DX/BB enqueue lock foreground requests
- DX/BB enqueue lock foreground wait time
- DX/BB enqueue lock background gets
- DX/BB enqueue lock background get time
For more information see:
- Oracle whitepaper “XA and Oracle controlled Distributed Transactions”
- Oracle Database Advanced Application Developer’s Guide: Developing Application with Oracle XA


