
Jonathan Lewis

Oracle Scratchpad
Just another Oracle weblog

Securefiles

Thu, 2014-05-29 11:15

A few weeks ago someone emailed me about a problem they had importing securefiles – it was very slow. Such things are never easy to address by email, of course, but there were three features to consider: (a) it was securefiles, (b) it was impdp, and (c) it was across a database link. If you read my blog regularly you’ll have seen me comment a few times that the easiest way to break Oracle is to mix a few features – so

  • securefiles and impdp (I know why LOBs generally appear to be slow to import – but was it “LOBs” in general or specifically securefile LOBs?)
  • securefiles and database links (db links are always slower than local actions – easy to do a comparative test)
  • impdp and database links (probably not, they’re supposed to work very well together in general – could do a local/remote comparison)
  • impdp with securefiles across a database link (easy enough to factor out the database link)

It was just a brief email, and I didn’t have an answer offhand, so I pointed out that there were a few bugs on MoS about impdp and LOBs and left it at that. A couple of days ago I got a follow-up email telling me that the problem was Bug 13609098 : IMPORTING SMALL SECUREFILE LOBS USING DATA PUMP IS SLOW.

There are two reasons for writing this note – the first, of course, is just to publicise the bug because I’ve seen three or four complaints over the Internet about slow imports with LOBs, and maybe a couple of those were actually “small securefile LOBs”; and then it’s possible that there are other people who haven’t even realised that their imports could be running faster.

The second reason, though, is to highlight a viewpoint that leaves me approaching Oracle features with extreme caution: this looks like the sort of bug that many people should have noticed, but the first reference is Jan 2012, and the earliest patch seems to be dated Oct 2013 – 22 months later! There could be various reasons for the long gap – but the one that always comes to my mind first in cases like this is: “are there so few people using ‘feature X’ that this bug stayed near the bottom of the to-do list for a long time ?” – followed by the slightly less alarmist “maybe there are quite a lot of people, but very few have noticed” and “but the specific combination is, perhaps, just a little unlikely”. If there really are very few people using the feature then I’m not going to be keen to advise a client to take it on without doing an extremely careful set of tests – at scale – of everything they’re likely to do with the feature. I don’t want something to break after go-live and then find that it takes weeks to identify the root cause and months to fix.

In this particular case I’ll believe that the combination of Securefile LOBs (“large” objects) that were actually small and in large numbers is significant. I’m prepared to assume that the customer base using Securefiles is a reasonable size but the subset who hit this combination is a small fraction of the whole; and that means I won’t be quite so paranoid about suggesting Securefiles as an option to a client – though I’d still insist on modelling any special cases that their requirements might highlight.

Footnote

The bug is fixed in 12.2 with several backports to 11.2.0.x for different platforms.


10053 trace

Fri, 2014-05-23 06:37

I published a note yesterday about enabling SQL trace system-wide for a single statement – and got a response on twitter from Bertrand Drouvot referencing a blog post he’d done a few months ago about using a similar method to dump the optimizer trace (10053) for a statement whenever it was optimized. He also highlighted the dbms_sqldiag package with its dump_trace() procedure – which is something I’d wanted to use a couple of weeks ago but couldn’t remember the package name - and supplied a reference to MoS note 225598.1 which then led on to Greg Rahn’s blog note about dbms_sqldiag.
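For reference, a typical call looks something like the following – just a sketch based on the usage described in the MoS note and Greg Rahn’s post, so treat the values as placeholders and check the exact parameter names against your version:

-- Sketch only: regenerate the optimizer trace for a cursor that is still in the
-- library cache; the sql_id, child number and file identifier are placeholders.
begin
	dbms_sqldiag.dump_trace(
		p_sql_id	=> '9tz4qu4rj9rdp',
		p_child_number	=> 0,
		p_component	=> 'Optimizer',		-- or 'Compiler' for the full compilation trace
		p_file_id	=> 'my_10053_dump'	-- appended to the generated trace file name
	);
end;
/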

The package is so important that I decided I’d create a reference note on my blog about it (I finally found my own reference notes on my laptop after getting a clue from Bertrand’s blog about the filename); but, at the same time, I think I have to report that it might not have helped me when I was looking for it two weeks ago. Here’s a critical paragraph from the MoS note:

How to Obtain Tracing of Optimizer Computations (EVENT 10053) (Doc ID 225598.1)

NOTE: The parse environment uses information captured in V$SQL_OPTIMIZER_ENV which does not record all information about the environment that parsed the query such as NLS settings. The result of this is that trace extracted from this may not always generate a trace that is truly representative of what happens when parsed from an application client. For example, if you parse from a client with NLS_SORT set differently to the Database then the application plan may be different to the database and so a trace generated from V$SQL_OPTIMIZER_ENV may cause confusion when the plan for a given cursor in the application is different to the one extracted.

The particular task I wanted to do was to re-optimize a query that had been run by another user (I was logged on as SYS at the time) because the critical user would have added a security predicate based on their current context to the table definitions in the query – and I didn’t know what that predicate would look like, but I knew it would have affected the execution path.  Judging from the note, though, it seems likely that the call to the package wouldn’t have re-acquired the user’s context and might, therefore, not be able to regenerate the necessary predicates. On the other hand, the actual security predicates used during the original optimisation could still be in-memory (in v$vpd_policy) so maybe dump_trace() would be able to do something about them.
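For what it’s worth, the check I have in mind for those predicates is just a query against v$vpd_policy while the cursor is still in memory – a quick sketch (the sql_id is a placeholder):

select
	sql_id, child_number, object_name, policy, predicate
from
	v$vpd_policy
where
	sql_id = '9tz4qu4rj9rdp'	-- placeholder: the sql_id of the cursor in question
;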

Status: to be tested when I next have a few minutes free.



sql_trace

Thu, 2014-05-22 06:24

Here’s a convenient aid to trouble-shooting that appeared in 11g with its enhancements to setting events. It’s a feature that may help you to work out (among other things) why a given statement seems to have a highly variable performance profile. If you can find the SQL_ID for a parent cursor you can enable tracing for just that cursor whenever it executes, whoever executes it.
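(If you don’t already know the SQL_ID, a quick query against v$sql is usually enough to find it – a sketch, assuming you can recognise the statement by its text:)

select	sql_id, child_number, executions, sql_text
from	v$sql
where	sql_text like 'select count(*) from all_objects%'
;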


--	Last tested:  11.2.0.4

define m_sql_id = '&1'
define m_sql_id = '9tz4qu4rj9rdp'

alter system set events
	'
	sql_trace[SQL: &m_sql_id ]
/*
	plan_stat=all_executions,
	wait=false,
	bind=false
*/
	'
;

pause Press return to stop tracing

alter system set events
	'
	sql_trace[SQL: &m_sql_id ]
	off
	'
;

This is the whole of a little script I’ve got – the generic &1 is how I normally use it; I’ve just included a specific value (which is the sql_id for “select count(*) from all_objects;”) as an example. The first statement starts tracing across the entire system, but only for the given SQL_ID. On a production system, if I think I really need to do this, I would check the expected frequency of execution for the statement and wait long enough to capture a few occurrences before disabling the trace. Each session generates its own trace file, but if you’ve been sufficiently patient you get a reasonable sampling of the different workloads – and if you’ve captured the bind variables as well, this may give you some clues about why different workloads appear.

A nice detail about this feature is that if the SQL_ID is for a PL/SQL block, all the SQL executing inside the block is traced as part of the trace on the block (and that was particularly helpful the last time I had to make use of the feature); in the sample I’ve given I also found that some recursive SQL – relating to the XMLSCHEMA object types – executed within the view was traced along with the main statement … so it’s easy to see the effects of SQL statements calling PL/SQL functions that contain SQL statements.



Ignoring Hints

Mon, 2014-05-19 11:21

Does Oracle ignore hints? Not if you use them correctly – and sometimes it doesn’t ignore them even when you use them incorrectly!

Here’s an example that I’ve run on 11.2.0.4 and 12.1.0.1


create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	rownum			id,
	rownum			n1,
	rpad('x',100)		padding
from
	generator	v1
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'
	);
end;
/

create index t1_i1 on t1(id);
alter index t1_i1 unusable;

select n1 from t1 where id = 15;
select /*+ index(t1 (id)) */ n1 from t1 where id = 15;

Any guesses about the output from the last 4 statements ?

Index created.

Index altered.

        N1
----------
        15

1 row selected.

select /*+ index(t1 (id)) */ n1 from t1 where id = 15
*
ERROR at line 1:
ORA-01502: index 'TEST_USER.T1_I1' or partition of such index is in unusable state

That’s a pretty convincing display of Oracle not ignoring hints.

Update:

Technically, of course, I haven’t demonstrated that Oracle is not ignoring the hint (i.e. that it’s obeying the hint – if you want to avoid the double negative) until I demonstrate that in the absence of the hint the error would not occur – but that task is left as an exercise to the reader.
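If you want a quick way of doing that check, a minimal sketch (using autotrace simply to confirm that the unhinted statement succeeds – with the default skip_unusable_indexes = true it should fall back to a full tablescan) would be:

set autotrace on explain
-- no hint: should complete without error
select n1 from t1 where id = 15;
set autotrace off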



Subquery with OR

Thu, 2014-05-15 11:23

Prompted by a pingback on this post, followed in very short order by a related question (with a most gratifying result) on Oracle-L, I decided to write up a note about another little optimizer enhancement that appeared in 12c. Here’s a query that differs slightly from the query in the original article:


select
	id, modded, mod_15
from
	t1
where
	t1.mod_15 = 1                     -- originally t1.mod_15 > 0
and	(   t1.modded is null             -- originally t1.modded = 0
	 or exists (
		select	null
		from	t2
		where	t2.id = t1.modded
	    )
	)
;

As a general principle, the “OR EXISTS” stops the optimizer from unnesting the subquery, so my original article suggested a workaround that required you to rewrite the query with a UNION ALL, using the lnnvl() function (where possible) as the easy way to eliminate accidental duplication. Take a look at the plans for my new query, though – first in 11.2.0.4, then in 12.1.0.1:


Execution Plan for 11.2.0.4
----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |    34 |   374 |    50   (0)| 00:00:01 |
|*  1 |  FILTER            |       |       |       |            |          |
|*  2 |   TABLE ACCESS FULL| T1    |   667 |  7337 |    50   (0)| 00:00:01 |
|*  3 |   INDEX UNIQUE SCAN| T2_PK |     1 |     3 |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("T1"."MODDED" IS NULL OR  EXISTS (SELECT 0 FROM "T2" "T2"
              WHERE "T2"."ID"=:B1))
   2 - filter("T1"."MOD_15"=1)
   3 - access("T2"."ID"=:B1)

Execution Plan for 12.1.0.1
------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |    27 |   378 |    50   (0)| 00:00:01 |
|   1 |  NESTED LOOPS SEMI NA|       |    27 |   378 |    50   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL  | T1    |   667 |  7337 |    50   (0)| 00:00:01 |
|*  3 |   INDEX UNIQUE SCAN  | T2_PK |     1 |     3 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T1"."MOD_15"=1)
   3 - access("T2"."ID"="T1"."MODDED")

As expected, 11.2.0.4 has had to use a filter subquery approach – but 12.1.0.1 has found a different path. For this special “is null” case the optimizer has unnested the subquery and used a “null aware (NA) semi-join”. In this very small example there is no change in the reported cost, and the mechanics of the execution plan will be quite similar at run time – but in real systems there are bound to be cases where the new strategy is more efficient.

Unfortunately …

Bug 18650065 (fixed in 12.2) rears its ugly head: WRONG RESULTS ON QUERY WITH SUBQUERY USING OR EXISTS.
I can demonstrate this with the following code:


update t1 set modded = null
where id <= 30;
commit;

select
	id, modded, mod_15
from
	t1
where
	t1.id = 1                     -- previously mod_15 = 1
and	(   t1.modded is null
	 or exists (
		select	null
		from	t2
		where	t2.id = t1.modded
	    )
	)
;

alter table t1 add constraint t1_pk primary key(id);

select
	id, modded, mod_15
from
	t1
where
	t1.id = 1                     -- previously mod_15 = 1
and	(   t1.modded is null
	 or exists (
		select	null
		from	t2
		where	t2.id = t1.modded
	    )
	)
;

And here’s the output from the above script:


30 rows updated.

Commit complete.

        ID     MODDED     MOD_15
---------- ---------- ----------
         1                     1

1 row selected.

Table altered.

no rows selected

I’ve modified a few rows so that the “null-aware” bit of the new transformation matters, but I’ve now got a data set and transformation where I get the wrong results because I’ve defined a primary key (unique would have done) on a critical column in the query. If you check the execution plan you’ll find that the optimizer has switched from a null aware semi-join to a simple nested loop join.

There is a workaround for this problem – disable the relevant feature:

alter session set "_optimizer_null_accepting_semijoin"=false;
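As a quick sanity check (a sketch, in the same session and with the same data as above) you can re-run the problem query immediately after disabling the feature and confirm that the missing row reappears:

select
	id, modded, mod_15
from
	t1
where
	t1.id = 1
and	(   t1.modded is null
	 or exists (
		select	null
		from	t2
		where	t2.id = t1.modded
	    )
	)
;

-- expected: the row for id = 1 is returned again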
For Reference:

Here’s the SQL to generate the data for the above demonstration:

create table t1
as
with generator as (
	select	--+ materialize
		rownum 	id
	from	all_objects
	where	rownum <= 5000
)
select
	rownum			id,
	mod(rownum,999)		modded,
	mod(rownum,15)		mod_15,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 10000
;

update t1 set modded = null where modded = 26;

create index t1_i1 on t1(id);
create index t1_i2 on t1(modded);

create table t2
as
select
	2 * rownum		id,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	all_objects
where
	rownum <= 20
;	

alter table t2 add constraint t2_pk primary key(id);


Feature Bypass

Wed, 2014-05-14 06:23

Here’s a little tip that might be helpful occasionally when you’re trying to work out why the optimizer transformation you were expecting isn’t appearing.

If you’ve ever checked the 10053 trace (and who wants to do that for a complex query?) you may have noticed lines like:

SU: SU bypassed: Remote table referenced.

So now you know that SU – Subquery Unnesting – has limitations in distributed queries.

When I first saw a line like this, it crossed my mind that it would be useful to keep a reference list of features that could be reported as bypassed, which I do through a simple unix line:

strings -a oracle | grep -i bypassed > bypassed.txt
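If you want to see which transformations were bypassed for one particular statement, a minimal sketch (the query below is just a stand-in for the one you’re investigating) is to dump its 10053 trace and search that file instead of the binary:

alter session set tracefile_identifier = 'bypass_check';
alter session set events '10053 trace name context forever, level 1';

-- the trace is produced only when the statement is actually (re)optimised
select count(*) from all_objects;

alter session set events '10053 trace name context off';

-- then, at the operating system prompt:  grep -i bypassed {your trace file}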

If you need a reference for the various short codes for transformations you can find it near the top of the 10053 trace, looking like this:

CBQT – cost-based query transformation
JPPD – join predicate push-down
OJPPD – old-style (non-cost-based) JPPD
FPD – filter push-down
PM – predicate move-around
CVM – complex view merging
SPJ – select-project-join
SJC – set join conversion
SU – subquery unnesting
OBYE – order by elimination
OST – old style star transformation
ST – new (cbqt) star transformation
CNT – count(col) to count(*) transformation
JE – Join Elimination
JF – join factorization
SLP – select list pruning
DP – distinct placement
… big gap here
AP – adaptive plans


Compression Units – 6

Mon, 2014-05-12 06:34

I received an email recently asking me if I knew how Oracle found specific rows and columns in a compression unit. This is a topic that I’ve spoken about a couple of times, and I’ve published several notes on the blog about it, including an image of a critical slide from one of my presentations, and I was expecting to find some notes somewhere about how Oracle catalogues all the bits and pieces. Part 4 of this series lists some of the detail, but I was slightly surprised to discover that it made the comment: “and somewhere in the CU there has to be a set of pointers to the first byte for each compressed column” – and this was the bit that my interrogator wanted to know. This posting is the answer – and I’ll start with a dump of the first block of a CU:


block_row_dump:
tab 0, row 0, @0x30
tl: 8016 fb: --H-F--N lb: 0x0  cc: 1
nrid:  0x01c06484.0
col  0: [8004]
Compression level: 03 (Archive Low)
 Length of CU row: 8004
kdzhrh: ------PC CBLK: 11 Start Slot: 00
 NUMP: 11
 PNUM: 00 POFF: 7884 PRID: 0x01c06484.0
 PNUM: 01 POFF: 15900 PRID: 0x01c06485.0
 PNUM: 02 POFF: 23916 PRID: 0x01c06486.0
 PNUM: 03 POFF: 31932 PRID: 0x01c06487.0
 PNUM: 04 POFF: 39948 PRID: 0x01c06488.0
 PNUM: 05 POFF: 47964 PRID: 0x01c06489.0
 PNUM: 06 POFF: 55980 PRID: 0x01c0648a.0
 PNUM: 07 POFF: 63996 PRID: 0x01c0648b.0
 PNUM: 08 POFF: 72012 PRID: 0x01c0648c.0
 PNUM: 09 POFF: 80028 PRID: 0x01c0648d.0
 PNUM: 10 POFF: 88044 PRID: 0x01c0648e.0
CU header:
CU version: 0   CU magic number: 0x4b445a30
CU checksum: 0x5c173142
CU total length: 93755
CU flags: NC-U-CRD-OP
ncols: 7
nrows: 4361
algo: 0
CU decomp length: 93159   len/value length: 309310
row pieces per row: 1
num deleted rows: 0
START_CU:
 00 00 1f 44 1f 0b 00 00 00 0b 00 00 1e cc 01 c0 64 84 00 00 00 00 3e 1c 01
 c0 64 85 00 00 00 00 5d 6c 01 c0 64 86 00 00 00 00 7c bc 01 c0 64 87 00 00
 00 00 9c 0c 01 c0 64 88 00 00 00 00 bb 5c 01 c0 64 89 00 00 00 00 da ac 01
 c0 64 8a 00 00 00 00 f9 fc 01 c0 64 8b 00 00 00 01 19 4c 01 c0 64 8c 00 00
 00 01 38 9c 01 c0 64 8d 00 00 00 01 57 ec 01 c0 64 8e 00 00 00 4b 44 5a 30
 42 31 17 5c 00 01 6e 3b eb 06 00 07 11 09 00 04 b8 3e 01 00 00 00 00 00 00
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
...
...  20 repetitions of the previous line of zeros deleted
...  This is mostly the bitmap identifying deleted rows.
...  nrows=4361, so this will be about 4361/8 = 545 bytes of zeros.
...
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 02 54 00 00 30
 bc 00 00 4a 82 00 00 64 37 00 00 a6 b8 00 00 e9 39 00 01 2b ba e6 c2 eb 8b
 00 00 2e 68 01 40 01 02 00 00 53 f0 78 9c a5 dc 07 57 23 5b 96 25 e0 ee 24

If you look at the symbolic dump there are lots of numbers that you could try to locate in the binary dump labelled START_CU: – for example the CU magic number 0x4b445a30 is very obvious at the end of line 37 above (with the bytes in the right order), and the CU checksum of 0x5c173142 is visible at the start of line 38 (with the bytes in reverse order – no, don’t ask me why there’s a difference).

Bearing in mind how Oracle frequently uses “self-describing” strategies for its data, we might search for the value that gives the length of the CU, as this could well mark the start of the compressed data. But there are two CU lengths – the CU total length (93755) and the CU decomp length (93159). To my mind the latter is the “amount of CU requiring decompression”, but let’s check for both: 93,755 (dec) = 0x16e3b, and 93,159 (dec) = 0x16be7.

Take another look at line 38 – the second 4 bytes are: 00 01 6e 3b, the CU total length. Unfortunately we can’t see the CU decomp length anywhere, so let’s change strategy slightly – the difference between the two lengths is 93755 – 93159 = 596 = 0x254, and by a convenient coincidence if you look at line 46, five bytes from the end, you find 02 54. Is this really a coincidence ? Let’s count 596 bytes on from the CU total length and see where we get to – the answer is: the last 4 bytes of line 46 (e6 c2 eb 8b). This doesn’t seem to help at first sight, but take a look at where we are and what precedes those 4 bytes. There are 7 groups of 4 bytes that look like increasing values – and since I know more about the data than you do (it’s a table with 7 columns, and the last 4 columns are identical across all rows), I’ve extracted those 28 bytes and reformatted them:


                    	Offset		Delta
                    	------		-----
        00 00 02 54 	  596
        00 00 30 bc 	12476		11880
        00 00 4a 82 	19074		 6598
        00 00 64 37 	25655		 6581
        00 00 a6 b8 	42680		17025
        00 00 e9 39 	59705		17025
        00 01 2b ba 	76730		17025
                    	93755	<----


The “Offset” column is what you get by converting from hex to decimal, and the “Delta” column is the difference between one value and the next. Notice how the last three deltas are the same, and how, if you add the same delta to the last Offset you get a number which is the CU total length. The “02 54” I found as the difference between the CU total length and the CU decomp length is actually the starting point for the list of pointers to each of the compressed columns, and the repeated 17,025 is the length of each of my four duplicated final columns. The e6 c2 eb 8b is the first four bytes of the first (compressed) column.
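If you want to reproduce the arithmetic, here’s a little SQL sketch (assuming 11.2 or later – sys.odcivarchar2list is just a convenient pre-declared collection type) that converts the seven 4-byte pointers copied from the dump to decimal and works out the deltas:

with pointers as (
	select	column_value	hex
	from	table(sys.odcivarchar2list(
			'00000254','000030bc','00004a82','00006437',
			'0000a6b8','0000e939','00012bba'
		))
)
select
	hex,
	to_number(hex,'xxxxxxxx')	byte_offset,
	to_number(hex,'xxxxxxxx') -
		lag(to_number(hex,'xxxxxxxx')) over (order by to_number(hex,'xxxxxxxx'))	delta
from
	pointers
order by
	byte_offset
;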

Having worked out that it’s possible to find the list of pointers quite easily from the dump, the next step (left as an exercise to the interested reader) is to work out the algorithm that Oracle uses to locate the list – since the CU decomp length is clearly derived and not actually stored. As a little clue, you might start by taking another look at line 38: bytes 11/12 are 00 07 – the number of columns in the table, and bytes 13/14 are 11 09 – the number of rows in the CU (and that’s the most significant number at this point because it tells you how long the bitmap for deleted rows will be).