Vlad Sadilovskiy
BLOB write size and CPU
Someone asked for help in identifying a strange problem. The problem turned out to be a combination of two issues, ASSM and the BLOB loading software, that manifested as excessive CPU utilization. In this post we will see how the write buffer size can affect write efficiency.
Unfortunately, Tkprof and Statspack often fail to point in the proper direction when it comes to profiling activity performed by a call-less cursor. In this case such a cursor was responsible for writing and reading BLOBs. The issue was not specific to the JDBC API; it was later reproduced with the DBMS_LOB API.
As an additional illustration of the topic mentioned above, here is an example of Tkprof and Statspack reports from a system where the issue was reproduced with a single-threaded application in an otherwise idle environment.
Tkprof
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 16 0.00 0.02 0 0 0 0
Execute 23 3.09 4.57 6 13894 18366 17
Fetch 3 0.00 0.00 0 6 0 3
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 42 3.10 4.59 6 13900 18366 20
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 230212 5.39 5.14 0 0 0 0
Execute 230215 176.87 173.89 47 692034 175390 176321
Fetch 153422 4.22 3.96 5 230821 3 77048
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 613849 186.49 183.01 52 922855 175393 253369
Statspack
Load Profile
Per Second Per Transaction
~~~~~~~~~~~~ --------------- ---------------
Redo size: 1,553,489.88 668,736,513.89
Logical reads: 630,504.58 271,415,631.42
Block changes: 782.47 336,833.37
Physical reads: 171.33 73,752.42
Physical writes: 181.92 78,311.00
User calls: 42.80 18,424.58
Parses: 44.40 19,114.63
Hard parses: 0.00 1.00
Sorts: 0.39 169.89
Logons: 0.00 0.84
Executes: 44.94 19,344.89
Transactions: 0.00
Top 5 Timed Events Avg %Total
~~~~~~~~~~~~~~~~~~ wait Call
Event Waits Time (s) (ms) Time
----------------------------------------- ------------ ----------- ------ ------
CPU time 7,764 95.9
log file parallel write 85,190 147 2 1.8
log file switch (checkpoint incomplete) 246 88 357 1.1
log file switch completion 502 76 151 .9
control file parallel write 7,973 9 1 .1
-------------------------------------------------------------
Instance Activity Stats
Statistic Total per Second per Trans
--------------------------------- ------------------ -------------- ------------
...
db block gets 5,116,321,961 625,543.7 ############
db block gets direct 1,442,703 176.4 75,931.7
db block gets from cache 5,114,879,258 625,367.3 ############
...
physical writes 1,487,909 181.9 78,311.0
physical writes direct 1,444,083 176.6 76,004.4
physical writes direct (lob) 1,442,684 176.4 75,930.7
session logical reads 5,156,896,997 630,504.6 ############
...
Segments by Logical Reads
Subobject Obj. Logical Pct
Owner Tablespace Object Name Name Type Reads Total
---------- ---------- -------------------- ------------ ----- ------------ -----
SYS SYSTEM SEG$ TABLE 434,992 23.6
SYS SYSTEM TSQ$ TABLE 432,816 23.5
TEST TEST SYS_LOB0000056448C00 LOB 289,808 15.7
SYS SYSTEM I_FILE#_BLOCK# INDEX 288,448 15.7
SYS SYSTEM FILE$ TABLE 220,416 12.0
-------------------------------------------------------------
As can be seen, the number of session logical reads in the test doesn’t align well with the figures in the Segments by Logical Reads section, and Tkprof doesn’t even have a clue about the 7K CPU seconds.
But let’s get back to business. What could be happening that triggered such a vast amount of reads when writing BLOBs? A call to the vendor of the software revealed that the BLOB data is written in chunks, and for each chunk the BLOB is closed and reset with an updated position. So, we duplicated this behavior and gathered the following statistics using “runstats” and DbmsOutput.java shared by Tom Kyte and the Java API shared by R. M. Menon on AskTom.
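Here is a minimal JDBC sketch of the write pattern we duplicated; the table and column names are hypothetical, and re-fetching the locator for every chunk stands in for the vendor’s close-and-reset logic:

import java.io.InputStream;
import java.sql.Blob;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Writes the source stream into a BLOB one buffer at a time, re-acquiring
// the locator before every write, the way the vendor's loader did.
static void writeInChunks(Connection conn, InputStream in, int bufferSize) throws Exception {
    byte[] buf = new byte[bufferSize];
    long pos = 1; // LOB positions are 1-based
    int n;
    while ((n = in.read(buf)) > 0) {
        // re-select the locator for every chunk: this is the pattern that
        // leaves partially filled LOB chunks to be re-read and re-written
        try (PreparedStatement ps = conn.prepareStatement(
                "select data from blobs where id = ? for update")) {
            ps.setInt(1, 1); // hypothetical row id
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                Blob blob = rs.getBlob(1);
                blob.setBytes(pos, buf, 0, n); // write at the updated position
            }
        }
        pos += n;
    }
    conn.commit();
}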
BLOB...chunk size                     32,768  32,768  32,768  32,768  32,768  32,768  32,768  32,768  32,768
BLOB...write buffer size               2,048   4,096   8,132   8,192  16,264  16,384  24,396  32,528  32,768
STAT...consistent gets                   971     375     119     319      51     234      27     190     222
STAT...consistent gets direct            252     159      52      79      16      32       9       4      20
STAT...db block gets                   6,146     884     423     479     209     268     109     100     208
STAT...db block gets direct              268     140      64      76      32      44      32      16      36
STAT...physical reads direct (lob)       252     124      48      60      16      28      16       0      20
STAT...physical writes direct (lob)      268     140      64      76      32      44      32      16      36
STAT...session logical reads           7,117   1,259     542     798     260     502     136     290     430

BLOB...chunk size                     16,384  16,384  16,384  16,384  16,384  16,384  16,384  16,384  16,384
BLOB...write buffer size               2,048   4,096   8,132   8,192  16,264  16,384  24,396  32,528  32,768
STAT...consistent gets                   449     222      99     213      47      74      25      35      63
STAT...consistent gets direct            126      80      24      53       8      22       3       8      19
STAT...db block gets                   3,904     733     397     813     182     295     113     143     250
STAT...db block gets direct              142      78      32      46      16      30      22      16      26
STAT...physical reads direct (lob)       126      62      16      30       0      14       6       0      10
STAT...physical writes direct (lob)      142      78      32      46      16      30      22      16      26
STAT...session logical reads           4,353     955     496   1,026     229     369     138     178     313

BLOB...chunk size                      8,192   8,192   8,192   8,192   8,192   8,192   8,192   8,192   8,192
BLOB...write buffer size               2,048   4,096   8,132   8,192  16,264  16,384  24,396  32,528  32,768
STAT...consistent gets                   347     264      95     128      71      91      23      17      39
STAT...consistent gets direct             63      78      16      35      16      27       0       0       7
STAT...db block gets                   2,657     996     349     468     273     369     103      85     161
STAT...db block gets direct               79      47      16      31      16      23      16      16      21
STAT...physical reads direct (lob)        63      31       0      15       0       7       0       0       5
STAT...physical writes direct (lob)       79      47      16      31      16      23      16      16      21
STAT...session logical reads           3,004   1,260     444     596     344     460     126     102     200
Looking at the physical reads direct (lob) statistic, we can see that for each BLOB chunk size there are certain write buffer sizes that do not produce any additional reading. For our tests with chunk sizes 8K, 16K and 32K, it appears that these local minimum points can be derived from the BLOB chunk size and a certain chunk overhead as N*(chunk size – K), where N is any natural number and K is an overhead that depends on the chunk size: 60 bytes for 8K, 120 bytes for 16K and 240 bytes for 32K. At these points the amount of physical writes direct (lob) is minimal and, interestingly, depends only on the size of the incoming data and the block size, as in BLOB size/block size. From this we can see that each BLOB chunk is written only once.
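For illustration, here is a small sketch that computes these local-minimum buffer sizes from the formula above; the overhead constants are taken from our observations, not from any documentation:

// Buffer sizes N*(chunkSize - K) that avoid re-reading partially filled chunks.
static int[] goodBufferSizes(int chunkSize, int overhead, int count) {
    int[] sizes = new int[count];
    for (int n = 1; n <= count; n++) {
        sizes[n - 1] = n * (chunkSize - overhead);
    }
    return sizes;
}

// goodBufferSizes(8192, 60, 4)   -> 8132, 16264, 24396, 32528
// goodBufferSizes(16384, 120, 2) -> 16264, 32528
// goodBufferSizes(32768, 240, 1) -> 32528
// These are exactly the buffer-size columns above that show zero
// physical reads direct (lob) for the corresponding chunk size.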
Here is one more interesting thing to mention, although it is not confirmed. The number of additional operations in the other cases exactly matches the following scenario, which can be described as rewriting BLOB chunks: the ongoing operation reads the entire BLOB chunk left incomplete by the previous write, appends the data from the current buffer and stores all of the chunk’s blocks back into the DB. This also triggers a significant degradation of the LIO per effective amount of written data ratio, which improves as the write buffer size increases. This behavior was one of the culprits of the original issue. Of course, not closing the BLOB stream on each write would be a better way. However, sometimes it isn’t possible to rewrite the code. If an application can be configured to use a buffer of a certain length, the issue can be alleviated.
ASSM played no small role in this situation. It is known for causing process “spinning” behavior. In our case the session was sitting in the following stack.
#0 0x000000000159fa03 in kcbgcur ()
#1 0x0000000001007f3b in ktugusc ()
#2 0x0000000001019032 in ktugti () - KTU: Kernel Transaction Undo Get Transaction table entry Information
#3 0x0000000001047e66 in ktbIsItlEntryCommitted ()
#4 0x000000000104db76 in ktrIsItlEntryCommitted ()
#5 0x00000000027cfa2e in kdlxgs_init () - reclaim space from transaction freelist in index
#6 0x0000000000bfab81 in ktsplbfmb ()
#7 0x0000000000bfbd80 in ktsplbrecl ()
#8 0x0000000000bd36e9 in ktspgsp_cbk1 ()
#9 0x00000000027e740f in kdlgsp_init () - Space management batching
#10 0x00000000027e45cb in kdl_write1 ()
#11 0x0000000001d2ece3 in koklwrite () - KOK Lob WRITE
#12 0x00000000022f2094 in kpolob () - KPP Lob operations
#13 0x0000000000711ef8 in opiodr ()
#14 0x00000000039a973b in ttcpip ()
#15 0x000000000070df90 in opitsk ()
#16 0x0000000000710e36 in opiino ()
#17 0x0000000000711ef8 in opiodr ()
#18 0x000000000070bc23 in opidrv ()
#19 0x000000000070a0ce in sou2o ()
#20 0x00000000006d008b in opimai_real ()
#21 0x00000000006cffdc in main ()
Moving hot BLOBs to tablespaces with manual segment space management further improved the situation.
Here are a few papers on Metalink that could be helpful in resolving similar BLOB issues.
Note:162345.1 “LOBS – Storage, Read-consistency and Rollback”
Note:66431.1 “LOBS – Storage, Redo and Performance Issues”
Note:268476.1 “LOB Performance Guideline”
Bug: 5253061 “LOB INSERT PERFORMANCE DIFFERS AFTER INSERT DEPENDING ON COMMIT”
Bug: 6128525 “ELAPSED TIME IS TOO HIGHER THAN “WAITED TIME + CPU TIME” IN ASSM LOB ACCESS”
Bug: 5131464 “10.1.0.4 RDBMS 10.1.0.4 SPACE PRODID-5 PORTID-226”
Going to OOW 2007
I took an entire week of PTO and I’m going to California for the first time. I’m also going to Oracle Open World for the first time. I’ve heard some good words about it all. And I’m going there with my family too.
The tricky part is finding enough time to enjoy both – the event and the place. So, next week I will probably run fewer tests and help fewer customers. But I’m sure I’ll catch up later with all that new energy.
Recursive Calls
Here is one interesting subject – recursive calls and performance monitoring instrumentation. These are just a few observations of how the session tracing facility, Statspack, AWR and Tkprof view recursive calls. We want to observe and clarify how Tkprof, Statspack and AWR calculate top-level and recursive call statistics.
The definition of recursive call as per Metalink Note:41634.1 and 10g Tkprof Manual:
“Recursive Calls
—————
Sometimes to execute a SQL statement issued by a user, Oracle must issue
additional statements. Such statements are called ‘recursive calls’ or
‘recursive SQL statements’. For example, if you insert a row into a table
that does not have enough space to hold that row, Oracle makes recursive
calls to allocate the space dynamically. Recursive calls are also generated
when data dictionary information is not available in the data dictionary
cache and must be retrieved from disk.
If recursive calls occur while the SQL trace facility is enabled, TKPROF
produces statistics for the recursive SQL statements and clearly marks them
as recursive SQL statements in the output file. Note that the statistics for
a recursive SQL statement are included in the listing for that statement,
not in the listing for the SQL statement that caused the recursive call. So
when you are calculating the total resources required to process a SQL
statement, you should consider the statistics for that statement as well as
those for recursive calls caused by that statement.”
Recursive calls are triggered by top-level calls that govern the processing of statements directly issued by users (or, more precisely, issued by a program that communicates directly with Oracle on behalf of its end-users or scripts). Tkprof is supposed to report the base statistics of a call at any level (excluding the statistics of its recursive calls).
For the purpose of the following discussion let me add the following statistics definitions taken from Statistics Descriptions that can be viewed either in Statspack or AWR report or directly using v$sysstat view:
“CPU used by this session – Amount of CPU time (in 10s of milliseconds) used by a session from the time a user call starts until it ends. If a user call completes within 10 milliseconds, the start and end user-call time are the same for purposes of this statistics, and 0 milliseconds are added.
recursive cpu usage – Total CPU time used by non-user calls (recursive calls). Subtract this value from “CPU used by this session” to determine how much CPU time was used by the user calls.”
Documentation says that the “CPU used by this session” statistic includes CPU used by recursive calls.
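As a side note, here is a quick sketch of how these two definitions can be applied to derive user-call CPU; it assumes the values are read from v$sysstat and are in centiseconds, per the definitions above:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

// Derives user-call CPU as "CPU used by this session" - "recursive cpu usage".
static void printUserCallCpu(Connection conn) throws Exception {
    long total = 0, recursive = 0;
    try (Statement st = conn.createStatement();
         ResultSet rs = st.executeQuery(
             "select name, value from v$sysstat " +
             "where name in ('CPU used by this session', 'recursive cpu usage')")) {
        while (rs.next()) {
            if (rs.getString(1).startsWith("CPU")) total = rs.getLong(2);
            else recursive = rs.getLong(2);
        }
    }
    // values are reported in 10s of milliseconds (centiseconds)
    System.out.printf("user-call CPU: %.2f s%n", (total - recursive) / 100.0);
}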
Let’s run a test, take a look at this from a practical perspective and see how well a 10.2.0.3 DB conforms to its documentation. I’ve chosen the following code for this testing:
create or replace function recursive_function(leveln number, maxlevel number) return number
as
l_level number := leveln;
c number := 0;
begin
loop
exit when c > 4000000000;
c := c + 1;
end loop;
if l_level+1 <= maxlevel then
execute immediate 'select /*+ query at level '||(l_level+1)||' */ recursive_function('||(l_level+1)||','||maxlevel||') from dual' into c;
end if;
return c;
end;
/
begin
perfstat.statspack.snap(i_ucomment=>'start at '||to_char(sysdate,'yy/mm/dd hh24:mi:ss'));
DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT('ALL');
end;
/
alter session set events '10046 trace name context forever, level 8';
select /*+ top */ recursive_function(0,3) from dual;
begin
perfstat.statspack.snap(i_ucomment=>'end at '||to_char(sysdate,'yy/mm/dd hh24:mi:ss'));
DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT('ALL');
end;
/
alter session set events '10046 trace name context off';
It triggers a chain of recursive calls with a maximum level of 3. The code generates a quasi-equal amount of pure CPU load at each level and should allow identifying resource consumption at any level quite easily.
Raw Trace:
PARSING IN CURSOR #53 len=51 dep=0 uid=57 oct=3 lid=57 tim=1166023881658166 hv=1409094977 ad='6cf1aba8'
select /*+ top */ recursive_function(0,3) from dual
END OF STMT
PARSE #53:c=0,e=163,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1166023881658162
EXEC #53:c=0,e=37,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1166023881658276
WAIT #53: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=56115 tim=1166023881658319
WAIT #53: nam='SQL*Net message from client' ela= 44597 driver id=1413697536 #bytes=1 p3=0 obj#=56115 tim=1166023881702983
*** 2007-11-02 09:05:46.334
=====================
PARSING IN CURSOR #13 len=64 dep=1 uid=57 oct=3 lid=57 tim=1166024166342227 hv=1557986045 ad='72a95828'
select /*+ query at level 1 */ recursive_function(1,3) from dual
END OF STMT
PARSE #13:c=2999,e=2789,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,tim=1166024166342221
EXEC #13:c=0,e=37,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1166024166342394
*** 2007-11-02 09:10:59.564
=====================
PARSING IN CURSOR #55 len=64 dep=2 uid=57 oct=3 lid=57 tim=1166024472231315 hv=4273745374 ad='6e06c2e8'
select /*+ query at level 2 */ recursive_function(2,3) from dual
END OF STMT
PARSE #55:c=3000,e=2617,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=1,tim=1166024472231309
EXEC #55:c=0,e=38,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=1,tim=1166024472231491
*** 2007-11-02 09:15:50.950
=====================
PARSING IN CURSOR #14 len=64 dep=3 uid=57 oct=3 lid=57 tim=1166024756787955 hv=1817767248 ad='6e2b2ec8'
select /*+ query at level 3 */ recursive_function(3,3) from dual
END OF STMT
PARSE #14:c=2000,e=2633,p=0,cr=0,cu=0,mis=1,r=0,dep=3,og=1,tim=1166024756787948
EXEC #14:c=0,e=65,p=0,cr=0,cu=0,mis=0,r=0,dep=3,og=1,tim=1166024756788155
*** 2007-11-02 09:20:42.250
FETCH #14:c=291241724,e=284472442,p=0,cr=0,cu=0,mis=0,r=1,dep=3,og=1,tim=1166025041260627
STAT #14 id=1 cnt=1 pid=0 pos=1 obj=0 op='FAST DUAL (cr=0 pr=0 pw=0 time=5 us)'
FETCH #55:c=582567435,e=569029386,p=0,cr=0,cu=0,mis=0,r=1,dep=2,og=1,tim=1166025041260907
STAT #55 id=1 cnt=1 pid=0 pos=1 obj=0 op='FAST DUAL (cr=0 pr=0 pw=0 time=3 us)'
FETCH #13:c=895731828,e=874918635,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=1,tim=1166025041261058
WAIT #53: nam='SQL*Net message to client' ela= 4 driver id=1413697536 #bytes=1 p3=0 obj#=56115 tim=1166025041261123
FETCH #53:c=1187120530,e=1159558108,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=1,tim=1166025041261160
WAIT #53: nam='SQL*Net message from client' ela= 54812 driver id=1413697536 #bytes=1 p3=0 obj#=56115 tim=1166025041316063
STAT #53 id=1 cnt=1 pid=0 pos=1 obj=0 op='FAST DUAL (cr=0 pr=0 pw=0 time=5 us)'
In the 10g raw trace, time is given in microseconds. So, you can see how the CPU consumption and elapsed time are reported in the trace. At each level the time is reported as the level’s own CPU load plus the CPU load of the call sub-tree (all descendants):
CPUL3 (291241724) = CPUL3.base + CPUL4(0: there are no further descendants)
CPUL2 (582567435) = CPUL2.base + CPUL3(291241724)
CPUL1 (895731828) = CPUL1.base + CPUL2(582567435)
CPUL0 (1187120530) = CPUL0.base + CPUL1(895731828)
or,
CPUL3.base = CPUL3 = 291241724 = 291 s
CPUL2.base = CPUL2 (582567435) - CPUL3(291241724) = 291325711 = 291 s
CPUL1.base = CPUL1 (895731828) - CPUL2(582567435) = 313164393 = 313 s
CPUL0.base = CPUL0 (1187120530) - CPUL1(895731828) = 291388702 = 291 s
Tkprof:
select /*+ top */ recursive_function(0,3)
from
dual
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 1187.12 1159.55 0 0 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 1187.12 1159.55 0 0 0 1
Parsing user id: 57
================================================================================
select /*+ query at level 1 */ recursive_function(1,3)
from
dual
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 313.16 305.88 0 0 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 313.16 305.88 0 0 0 1
Parsing user id: 57 (recursive depth: 1)
================================================================================
select /*+ query at level 2 */ recursive_function(2,3)
from
dual
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 291.32 284.55 0 0 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 291.32 284.55 0 0 0 1
Parsing user id: 57 (recursive depth: 2)
================================================================================
select /*+ query at level 3 */ recursive_function(3,3)
from
dual
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 291.24 284.47 0 0 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 291.24 284.47 0 0 0 1
Parsing user id: 57 (recursive depth: 3)
I always thought Tkprof doesn’t report net statistics for calls, but now I see how wrong I was. From this test it is obvious that Tkprof reports base times for recursive calls at all levels, and only for top-level calls are the CPU and elapsed time defined as the gross load including all levels: CPUL0 = CPUL0.base + CPUL1.base + CPUL2.base + CPUL3.base = 291 + 313 + 291 + 291 ≈ 1187 s. My memory might be playing tricks on me, but I recall seeing a discussion on Jonathan Lewis’s blog about this Tkprof deficiency. Maybe someone will remind me where it was.
AWR:
SQL ordered by Elapsed Time
Elapsed CPU Elap per % Total
Time (s) Time (s) Executions Exec (s) DB Time SQL Id
---------- ---------- ------------ ---------- ------- -------------
1,160 1,160 1 1159.6 99.8 1rcptud9zu5a1
Module: SQL Developer
select /*+ top */ recursive_function(0,3) from dual
875 875 1 874.9 75.3 9pgc4u1fdtyrx
Module: SQL Developer
select /*+ query at level 1 */ recursive_function(1,3) from dual
569 569 1 569.0 49.0 0v0qmk7zbsbfy
Module: SQL Developer
select /*+ query at level 2 */ recursive_function(2,3) from dual
284 284 1 284.5 24.5 2zfr66tq5jvah
Module: SQL Developer
select /*+ query at level 3 */ recursive_function(3,3) from dual
Top 5 Timed Events Avg %Total
~~~~~~~~~~~~~~~~~~ wait Call
Event Waits Time (s) (ms) Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
CPU time 1,162 100.0
control file parallel write 396 1 1 0.0 System I/O
...
Time Model Statistics
Statistic Name Time (s) % of DB Time
------------------------------------------ ------------------ ------------
sql execute elapsed time 1,162.0 100.0
DB CPU 1,161.9 100.0
PL/SQL execution elapsed time 1,159.6 99.8
...
DB time 1,162.1 N/A
background elapsed time 2.9 N/A
background cpu time 1.5 N/A
Instance Activity Stats
Statistic Total per Second per Trans
-------------------------------- ------------------ -------------- -------------
CPU used by this session 118,971 100.0 13,219.0
CPU used when call started 118,946 99.9 13,216.2
recursive cpu usage 89,804 75.5 9,978.2
I rearranged it a little for a better view.
First of all, the top section shows that each of the times reported at each level contains the times of its descendants (i.e., as in the raw trace). Well, no wonder: there is probably not enough information to track call dependency and subtract the statistics of children from a parent. Anyhow, it reports the gross load for each recursive level and for the top-level calls. Anyone analyzing AWR reports should note the statement “Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code.” that precedes this section. Apparently it is true not only for PL/SQL code, but for any type of code. This can be seen in the time model statistics, where the sql execute elapsed time roughly equals the PL/SQL execution elapsed time.
Also, comparing this section to the raw trace or Tkprof output shows that in the AWR report the CPU time exactly matches the elapsed time, although it was not the same in the raw trace. This would seem just a reporting “feature”, if not for the information from the sections further down. CPU time in Top 5 Timed Events and DB CPU in Time Model Statistics are both roughly 1,160 s. Ignoring a small difference, the same time is spent by the top-level statement. Only the Instance Activity Stats section shows 1,189.7 s, which almost matches the raw trace CPU time consumed by all subject statements, with a small offset of 2.58 s. Further, the recursive cpu usage of 898 s matches the CPU time of the recursive calls starting at level 1, with a difference of 2.3 s. Both deltas are for the most part time spent in Statspack subroutines that were captured by AWR.
Statspack:
Top 5 Timed Events Avg %Total
~~~~~~~~~~~~~~~~~~ wait Call
Event Waits Time (s) (ms) Time
----------------------------------------- ------------ ----------- ------ ------
CPU time 1,190 99.9
control file parallel write 397 1 1 .0
...
In Statspack the Time Model Statistics and Instance Activity Stats are precisely equal to the corresponding sections in the AWR report. The SQL statistics weren’t captured for our statements because they didn’t generate logical or physical reads, nor did they generate other activity that Statspack considers worth capturing. However, in the Top 5 Timed Events you can see that the CPU time is the figure of the CPU used by this session statistic in the Instance Activity Stats, so it comes from the v$sysstat view.
This behavior differs from the AWR algorithm, which seems to take the CPU time in Top 5 Timed Events from v$sys_time_model or directly from their common source. If v$sys_time_model and v$sysstat have different sources of information, AWR is not bound to the figures of the latter when reporting Top 5 Timed Events. This perhaps holds an explanation for the AWR oddity seen by Jonathan Lewis. It would be interesting to find the relevant clues in the reports or in that particular system.
Call-Less Cursor
Many people know that Tkprof does not report base statistics and times for calls that have recursive calls. This can be handled by other, alternative tools. But there is one more unfortunate effect to watch out for when tuning a DB that does a fair amount of LOB operations. It seems to be more fundamental and much harder to cope with.
Someone was trying to capture a performance problem in a BLOB loading application. No method they tried gave a good clue to the problem, and they could have wandered around it for a long time had they not been tipped off about certain LOB handling patterns known for buggy behavior.
But this was just the issue that prompted me to post this other, related observation. Here are two excerpts from Tkprof and Statspack reports from the database where the case was reproduced.
Tkprof
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 7 0.02 0.02 4 86 0 0
Execute 9 0.66 1.14 82 2956 5618 8
Fetch 1 0.00 0.00 0 2 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 17 0.68 1.16 86 3044 5618 9
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message from client 100014 54.07 112.26
db file sequential read 1 0.00 0.00
SQL*Net message to client 100013 0.00 0.19
log file sync 10 0.00 0.02
latch: shared pool 11 0.02 0.02
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 4644 0.94 0.93 0 0 107 0
Execute 11059 6.06 9.85 570 45295 5516 8933
Fetch 23131 0.70 2.23 305 38484 3 28673
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 38834 7.71 13.03 875 83779 5626 37606
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 113211 0.30 324.45
control file sequential read 855 0.03 0.06
direct path write 154743 0.00 0.70
direct path read 99973 0.00 0.64
log file switch completion 111 0.97 65.21
log file switch (checkpoint incomplete) 255 0.97 119.11
latch: shared pool 4 0.00 0.00
latch: checkpoint queue latch 1 0.00 0.00
latch: library cache 1 0.00 0.00
latch: cache buffers chains 1 0.00 0.00
db file scattered read 1 0.00 0.00
Statspack
Top 5 Timed Events Avg %Total
~~~~~~~~~~~~~~~~~~ wait Call
Event Waits Time (s) (ms) Time
----------------------------------------- ------------ ----------- ------ ------
db file sequential read 121,368 373 3 49.4
CPU time 141 18.7
log file switch (checkpoint incomplete) 263 122 463 16.1
log file switch completion 113 66 586 8.8
log file parallel write 3,837 47 12 6.3
-------------------------------------------------------------
Time Model System Stats DB/Inst: TST10203/tst10203 Snaps: 2625-2631
-> Ordered by % of DB time desc, Statistic name
Statistic Time (s) % of DB time
----------------------------------- -------------------- ------------
DB CPU 292.3 40.1
sql execute elapsed time 45.8 6.3
PL/SQL execution elapsed time 32.8 4.5
parse time elapsed 5.4 .7
hard parse elapsed time 5.3 .7
PL/SQL compilation elapsed time 0.1 .0
hard parse (sharing criteria) elaps 0.1 .0
connection management call elapsed 0.0 .0
repeated bind elapsed time 0.0 .0
hard parse (bind mismatch) elapsed 0.0 .0
sequence load elapsed time 0.0 .0
DB time 728.5
background elapsed time 130.7
background cpu time 27.5
-------------------------------------------------------------
Both parts indicate that there was serious activity – waits on sequential disk reads and redo log synchronization, CPU utilization – that was not reported in the categories “sql execute elapsed time” and “PL/SQL execution elapsed time”. No statement reported in Tkprof or Statspack generated this activity, but there was a particular cursor seen all over the 10046 trace that was responsible for some of these missing components.
I was kind of lucky to see it, because the cursor has no associated calls whatsoever, only wait records. Unless there are waits associated with such a cursor, there will be no records about it in the Oracle trace. Another “funny” thing is that this call-less cursor can have recursive calls too. This cursor is created solely to write and read LOB content.
So, here are a few words of caution for those who do performance troubleshooting using raw trace files, Statspack or ASH/AWR reports. Tkprof and other tools that try to calculate CPU, elapsed time or wait statistics and build a hierarchy of calls for sessions containing LOB operations will likely produce incorrect results, ranging from slightly wrong to misleading. This is because calls carry the key information that allows traversing the call hierarchy and calculating base statistics and timing figures. The “missing” calls might cause the whole call hierarchy tree to break. For the same reason, Statspack reports may show resource consumption without any visible consumer.
One more thing is worth mentioning. If LOB operations are wrapped into code that uses calls to communicate with the database (i.e. PL/SQL anonymous blocks or procedures), then the resources consumed by the LOB operations inside that code will be assigned to those calls.
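For illustration, here is a hypothetical JDBC sketch of such wrapping; the table and column names are made up. Because the anonymous block executes as a regular call, the DBMS_LOB work inside it shows up under that call’s EXEC in the trace:

// The LOB write goes through an anonymous PL/SQL block, so its resource
// consumption is attributed to the EXEC of this call in the 10046 trace.
String plsql =
    "declare " +
    "  l_lob blob; " +
    "begin " +
    "  select payload into l_lob from lob_table where id = ? for update; " +
    "  dbms_lob.writeappend(l_lob, ?, ?); " +
    "end;";
try (java.sql.CallableStatement cs = conn.prepareCall(plsql)) {
    cs.setInt(1, 1);            // hypothetical row id
    cs.setInt(2, chunk.length); // amount in bytes
    cs.setBytes(3, chunk);      // buffer bound as RAW
    cs.execute();
}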
Tuning Collections in Queries [1]
I will continue the series about tuning collections in queries, starting with the simplest case – a collection of values joined to one table. Our subject SQL statements look similar to the following two forms.
1)
select table1.*
from table1
where table1.f1 in
(select column_value from table(test_statement.getList1) nlist1);
and
2)
select table1.*
from table1, table(test_statement.getList1) nlist1
where table1.f1 = nlist1.column_value;
We will run them in several ways by adding hints and CBO parameters. The first test run is dedicated to showing the default CBO computations. Note the cryptic name KOKBF$ in the predicate section of the explain plan. KOKBF$ is a generic alias given to table expressions. COLLECTION ITERATOR PICKLER FETCH is the operation that retrieves values from collections (if the collection is instantiated in the query, the operation reads COLLECTION ITERATOR CONSTRUCTOR FETCH).
Test 1.1:
select /*+ gather_plan_statistics */ t1.object_name,t1.object_type from t1
where t1.object_id in (select * from table(cast(test_statement.getList1 as table_of_number)) nlist1)
Plan:
----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes |
----------------------------------------------------------------------------------------------------------------------------
| 1 | NESTED LOOPS | | 1 | 255 | 1252K|00:02:07.32 | 5091K| 177K| 2959 |
| 2 | SORT UNIQUE | | 1 | 255 | 1878K|00:00:31.30 | 31 | 2959 | 2959 |
| 3 | COLLECTION ITERATOR PICKLER FETCH| GETLIST1 | 1 | | 2000K|00:00:04.28 | 0 | 0 | 0 |
| 4 | TABLE ACCESS BY INDEX ROWID | T1 | 1878K| 1 | 1252K|00:01:31.73 | 5091K| 174K| 0 |
|* 5 | INDEX UNIQUE SCAN | T1_I | 1878K| 1 | 1252K|00:00:37.04 | 3839K| 28328 | 0 |
----------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("T1"."OBJECT_ID"=VALUE(KOKBF$))
A roughly two-minute runtime is not completely bad. But let’s see what we are driving into here. Nested Loops with index access isn’t a very good choice for getting more than 10% of the data of the base table. Clearly, the estimated cardinality of 255 is the culprit here. The value in the row with Id 2 is not part of the output; it was taken from the CBO trace figures. The original 10053 trace file doesn’t have a definite clue where this figure comes from. For the reason described below, it could be a function of the default block size.
BASE STATISTICAL INFORMATION
***********************
Table Stats::
Table: KOKBF$ Alias: KOKBF$ (NOT ANALYZED)
#Rows: 8168 #Blks: 100 AvgRowLen: 100.00
...
Join order[2]: KOKBF$[KOKBF$]#1 T1[T1]#0
SORT resource      Sort statistics
Sort width: 459 Area size: 402432 Max Area size: 80530432
Degree: 1
Blocks to Sort: 2865 Row size: 13 Total Rows: 1800000
Initial runs: 2 Merge passes: 1 IO Cost / pass: 1554
Total IO sort cost: 4419 Total CPU sort cost: 1764307007
Total Temp space used: 28894000
***************
Now joining: T1[T1]#0
***************
NL Join
Outer table: Card: 255.25 Cost: 4653.33 Resp: 4653.33 Degree: 1 Bytes: 2
The outer table KOKBF$ was assigned a default cardinality of num_of_blocks * (block_size – cache_layer) / avg_row_len. In all tests cache_layer is 24. Something tells me it is similar to the 24 bytes per INITRANS, but I still cannot get my head around where the block fixed header size has gone. It might be something else though. Anyway, given the figures, our default cardinality is 100 * (8192 – 24) / 100 = 8168. With a 16K block this would equal 16,360, etc. When the collection is uniquely sorted for the “IN” clause, its computed cardinality is reduced to 255.25 and rounded to 255. With a 16K block this becomes 511.5. This persuasively looks like default_cardinality/32. One more reason to think the value comes from the default statistics assumptions shown in the CBO trace below, which is generated by this simple query.
select distinct column_value from table(test_statement.getList1);
The CBO trace in this case has the following number of distinct values taken from the default statistics of the collection. If this is so, it explains how the CBO arrives at the cardinality of the uniquely sorted collection values in the above calculation.
BASE STATISTICAL INFORMATION
***********************
Table Stats::
Table: KOKBF$ Alias: KOKBF$ (NOT ANALYZED)
#Rows: 8168 #Blks: 100 AvgRowLen: 100.00
Column (#1): KOKBC$(VARCHAR2) NO STATISTICS (using defaults)
AvgLen: 1.00 NDV: 255 Nulls: 0 Density: 0.0039177
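To make the arithmetic concrete, here is a small sketch of the assumed formulas; the cache_layer constant and the /32 reduction are this post’s conjectures, not documented behavior:

// Default cardinality assumed for a non-analyzed table expression (KOKBF$).
static int defaultCardinality(int blocks, int blockSize, int avgRowLen) {
    int cacheLayer = 24; // conjectured per-block overhead
    return blocks * (blockSize - cacheLayer) / avgRowLen;
}

// 8K block: defaultCardinality(100, 8192, 100) = 8168
// IN-clause (unique sort) estimate: 8168 / 32.0 = 255.25, rounded to 255,
// which matches the default NDV of 255 shown in the trace above.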
But I will leave this alone for now, until I find out for certain. Now let’s see what the second form of the testing queries produces.
Test 1.2:
select /*+ gather_plan_statistics */
t1.object_name,t1.object_type from t1, table(cast(test_statement.getList1 as table_of_number)) nlist1
where t1.object_id = nlist1.column_value
Plan:
------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
------------------------------------------------------------------------------------------------------------------
| 1 | NESTED LOOPS | | 1 | 8168 | 1333K|01:27:04.70 | 5422K| 2692K|
| 2 | COLLECTION ITERATOR PICKLER FETCH| GETLIST1 | 1 | 8168 | 2000K|00:00:06.30 | 0 | 0 |
| 3 | TABLE ACCESS BY INDEX ROWID | T1 | 2000K| 1 | 1333K|01:26:51.90 | 5422K| 2692K|
|* 4 | INDEX UNIQUE SCAN | T1_I | 2000K| 1 | 1333K|00:05:39.05 | 4088K| 1382K|
------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T1"."OBJECT_ID"=VALUE(KOKBF$))
In this case the default cardinality of 8168 rows for the collection was left intact. The plan didn’t change from Nested Loops, for obvious reasons. The time and number of reads skyrocketed. This happened because the collection values were randomly chosen and not sorted as they were in the previous example. Access to the base table became more scattered, and consequently more reads took place. If the key value distribution is known and clustered, it might be worth using co-located values in collections.
Now it’s time to move on to different settings. To change the CBO default cardinality estimation, this query uses the officially unsupported CARDINALITY hint. You should ask Oracle for permission before using it in production.
Test 2.1 (adjusted cardinality):
select /*+ gather_plan_statistics */ t1.object_name,t1.object_type from t1 where
t1.object_id in (select /*+ cardinality(nlist1 1800000) */ * from table(cast(test_statement.getList1 as table_of_number)) nlist1)
Plan:
----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes |
----------------------------------------------------------------------------------------------------------------------------
| 1 | NESTED LOOPS | | 1 | 255 | 1252K|00:01:37.31 | 5092K| 172K| 2959 |
| 2 | SORT UNIQUE | | 1 | 255 | 1878K|00:00:27.63 | 160 | 2970 | 2959 |
| 3 | COLLECTION ITERATOR PICKLER FETCH| GETLIST1 | 1 | | 2000K|00:00:04.32 | 129 | 11 | 0 |
| 4 | TABLE ACCESS BY INDEX ROWID | T1 | 1878K| 1 | 1252K|00:01:04.31 | 5091K| 169K| 0 |
|* 5 | INDEX UNIQUE SCAN | T1_I | 1878K| 1 | 1252K|00:00:23.96 | 3839K| 23944 | 0 |
----------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("T1"."OBJECT_ID"=VALUE(KOKBF$))
No changes in the plan and statistics. What happened?! Did the CBO even consider the change? Let’s take a look at the CBO trace.
***********************
Table Stats::
Table: KOKBF$ Alias: KOKBF$ (NOT ANALYZED)
#Rows: 8168 #Blks: 100 AvgRowLen: 100.00
***************************************
SINGLE TABLE ACCESS PATH
No statistics type defined for function TEST_STATEMENT
Table: KOKBF$ Alias: KOKBF$
Card: Original: 8168 >> Single Tab Card adjusted from: 8168.00 to: 1800000.00
Rounded: 1800000 Computed: 1800000.00 Non Adjusted: 8168.00
It has obviously noticed the hint value. But the final cardinality got the same value as before, for the same reason: the collection is processed as part of the “IN” sub-query. I’ll get back to this a little later.
Alright, let’s see what the second form would be like.
Test 2.2 (adjusted cardinality):
select /*+ gather_plan_statistics cardinality(nlist1 1800000) */
t1.object_name,t1.object_type from t1, table(cast(test_statement.getList1 as table_of_number)) nlist1 where t1.object_id = nlist1.column_value
Plan:
---------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes |
---------------------------------------------------------------------------------------------------------------------------
|* 1 | HASH JOIN | | 1 | 1800K| 1333K|00:01:37.68 | 145K| 189K| 45724 |
| 2 | COLLECTION ITERATOR PICKLER FETCH| GETLIST1 | 1 | 1800K| 2000K|00:00:04.22 | 0 | 0 | 0 |
| 3 | TABLE ACCESS FULL | T1 | 1 | 10M| 10M|00:00:31.42 | 145K| 143K| 0 |
---------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T1"."OBJECT_ID"=VALUE(KOKBF$))
Now the hinted cardinality is in the final execution plan, and no wonder the CBO switched the plan from Nested Loops to Hash Join. Compare the logical and physical reads of this test with tests 1.1 and 2.1. If I had allowed a bigger hash area for this query, it could have run in-memory and shown even better results. The last test for this post is supposed to fix the query with the sub-query. For this we need an additional modification – this time of the CBO parameter “_always_semi_join”.
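For reference, a sketch of the session-level change assumed for Test 3.1; hidden parameters require quoting, and changing them should be cleared with Oracle Support:

// Disable the semi-join transformation for the current session
// before running the test query.
try (java.sql.Statement st = conn.createStatement()) {
    st.execute("alter session set \"_always_semi_join\" = off");
}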
Test 3.1: (adjusted cardinality, “_always_semi_join”=off):
select /*+ gather_plan_statistics */ t1.object_name,t1.object_type from t1 where
t1.object_id in (select /*+ cardinality(nlist1 1800000) */ * from table(cast(test_statement.getList1 as table_of_number)) nlist1)
Plan:
-----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes |
-----------------------------------------------------------------------------------------------------------------------------
|* 1 | HASH JOIN | | 1 | 1800K| 1252K|00:01:26.75 | 145K| 195K| 50757 |
| 2 | VIEW | VW_NSO_1 | 1 | 1800K| 1878K|00:00:11.52 | 0 | 6825 | 6825 |
| 3 | HASH UNIQUE | | 1 | 1800K| 1878K|00:00:09.64 | 0 | 6825 | 6825 |
| 4 | COLLECTION ITERATOR PICKLER FETCH| GETLIST1 | 1 | 1800K| 2000K|00:00:04.24 | 0 | 0 | 0 |
| 5 | TABLE ACCESS FULL | T1 | 1 | 10M| 10M|00:00:31.40 | 145K| 144K| 0 |
-----------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T1"."OBJECT_ID"="COLUMN_VALUE")
Why, you might ask, does the Semi Join parameter have any effect on this? In the pre-evaluation phase the CBO tries to apply different query rewrite techniques in order to optimize the query. One of them is sub-query unnesting. In this case it can either do a Semi Join or unnest the sub-query as a set (a collection of distinct values). Since the Semi Join is disabled, it used the latter. The sub-query took the following form, with the cardinality hint propagated to the outer query.
SELECT "T1"."OBJECT_NAME" "OBJECT_NAME","T1"."OBJECT_TYPE" "OBJECT_TYPE" FROM (SELECT /*+ OPT_ESTIMATE (TABLE "KOKBF$" ROWS=1800000.000000 ) */ DISTINCT VALUE(KOKBF$) "COLUMN_VALUE" FROM TABLE(CAST("TEST"."TEST_STATEMENT"."GETLIST1"() AS "TABLE_OF_NUMBER") ) "KOKBF$") "VW_NSO_1","TEST"."T1" "T1" WHERE "T1"."OBJECT_ID"="VW_NSO_1"."COLUMN_VALUE"
Basically, the original query was rewritten into the following form. It is similar to the second original form, with the exception of the distinct clause. The cardinality hint was not ignored as it was in the original query, and the CBO generated the desired plan.
select /*+ cardinality(nlist1 1800000) */
t1.object_name,t1.object_type
from t1
, (select distinct column_value from table(cast(test_statement.getList1 as table_of_number))) nlist1
where t1.object_id = nlist1.column_value;
A few words in conclusion. When tuning these queries, make sure the CBO doesn’t rely on default statistics; in most cases it will otherwise generate a sub-optimal plan, i.e. NL instead of the desired HJ or vice versa. If you cannot disable Semi Join, unnest the “IN” sub-query yourself. When tuning an “IN” sub-query, use a cardinality value close to the number of distinct elements, because that is what the CBO will use in its calculations. If the distribution of table data is deterministic, co-locate the values of the joined collection whenever possible to increase the clustering of the access operation.
ORA-01461: can bind a LONG value only for insert into a LONG column
Just another issue with JDBC that I think is worth mentioning. You may wonder how such an obvious error message could be troublesome to comprehend. Unfortunately, it speaks of things a developer never meant to happen. It doesn’t have to be a LONG value. This error can appear when ASCII strings are bound for VARCHAR2 columns. In this case it happens when JDBC 10.1/10.2 is used with an Oracle Server 9i database configured with a multi-byte character set. Before going any further, please note that according to the support matrix, Oracle JDBC 10g and 9i are cross-supported with Oracle Server 9i and 10g.
For the testing I used two databases, Oracle 10g and Oracle 9i, with database character set UTF8, and JDBC 10.2.0.3.
Some Asian-language characters can consume up to 3 bytes in UTF8 encoding. But would you expect a string composed of 1334 ASCII characters to fail to bind for a VARCHAR2(4000)? It is possible. Let’s dig up a few more details.
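The arithmetic behind the 1334-character figure, assuming the driver sizes the bind for the UTF8 worst case of 3 bytes per character (consistent with the mxl=4001(12000) buffer in the trace below):

// The driver reserves the worst case of 3 bytes per character for UTF8,
// so the bind buffer is sized chars * 3 regardless of the actual content.
int chars = 1334;
int worstCaseBytes = chars * 3;        // 4002
boolean fits = worstCaseBytes <= 4000; // false -> ORA-01461 on a 9i server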
Here is how a string of 4000 “a” characters appears to Oracle 9i when bound from JDBC 10g with default properties. This bind is not spooled into the 10046 trace file; instead, it can be captured by enabling a stack trace dump on event 1461.
bind 19: dty=1 mxl=4001(12000) mal=00 scl=00 pre=00 oacflg=03 oacfl2=10 size=4000 offset=0
bfp=ffffffff7cd7f060 bln=4000 avl=00 flg=05
Oracle Server 10g: note that when you test this scenario there are no errors – Oracle Server 10g is perfectly fine with the 12000 bytes in the private buffer.
Bind#19
oacdty=01 mxl=4001(12000) mxlc=00 mal=00 scl=00 pre=00
oacflg=03 fl2=1000010 frm=01 csi=871 siz=4000 off=0
kxsbbbfp=ffffffff7b55a0c0 bln=4000 avl=4000 flg=05
value="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa....
According to Metalink Note 3756847.8, this problem was introduced in some 10.1 versions of JDBC and fixed in 10.2. However, the fix is not enabled by default. The following JDBC connection property enables it.
// ods is an instance of oracle.jdbc.pool.OracleDataSource
java.util.Properties props = new java.util.Properties();
props.put("oracle.jdbc.RetainV9LongBindBehavior","true");
ods.setConnectionProperties(props);
ods.setURL(url);
And here is how the bind looks like after the fix is enabled.
Oracle Server 9i:
bind 19: dty=1 mxl=4000(4000) mal=00 scl=00 pre=00 oacflg=03 oacfl2=10 size=4000 offset=0
bfp=ffffffff7cd56088 bln=4000 avl=4000 flg=05
value="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...
Oracle Server 10g:
Bind#19
oacdty=01 mxl=4000(4000) mxlc=00 mal=00 scl=00 pre=00
oacflg=03 fl2=1000010 frm=01 csi=871 siz=4000 off=0
kxsbbbfp=ffffffff7b56f060 bln=4000 avl=4000 flg=05
value="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...
There is no need to enable this fix for a 10g/10g setup; the 10g version of the trace is shown just for comparison purposes.
In conclusion, I want to clarify that this article describes the unexpected behavior with legitimate length values. However, there are other cases when apparently correct values fail to insert with the same message, e.g. the already mentioned strings that, with UTF8 as the database character set, can expand from 4000 Asian-language characters to values of up to 12000 bytes. For these cases, Notes 241358.1 and 370438.1 recommend different workarounds. Note 445072.1 has a decent example of getting the length of an encoded string and suggests converting VARCHAR2 columns into CLOBs if the encoded string length exceeds the maximum length of the VARCHAR2 type.
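A minimal sketch of that kind of check, assuming UTF8 and the 4000-byte VARCHAR2 limit:

import java.nio.charset.StandardCharsets;

// Returns true if the string still fits a VARCHAR2(4000 BYTE) column
// after encoding in the database character set (UTF8 assumed here).
static boolean fitsVarchar2(String s) {
    return s.getBytes(StandardCharsets.UTF_8).length <= 4000;
}
// If false, bind the value as a CLOB instead, as Note 445072.1 suggests.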
Tuning Collections in Queries [Introduction]
[To Next Article [1]] [To Performance Study]
If you have never heard of using collections instead of regular IN-LIST expressions, you might want to read the “varying elements in IN list” article on AskTom. Unfortunately, it has little information about the way the CBO treats collections. So, if you have problems tuning queries that employ nested tables, or are just evaluating what is in the way, please read on. Here is one of the basic forms of SQL queries that employ collections.
select *
from t1
where t1.object_id in
(
select *
from table(cast(:1 as table_of_number))
);
There are several optimization issues related to tuning queries that use collections. Many of them can be attributed to the CBO statistical assumptions. The CBO calculations and figures are a little different in 9i and 10g; the issues may have changed in appearance, but not in essence. There will be a few more posts on some basic forms of queries with collections.
Over the past three years I’ve engaged Oracle TS several times to find a good resolution for this nuisance. That generated a couple of bugs: 4484645 and 4235962. One of them is closed as “NOT A BUG”.
Blog theme
I know, I know… The theme looks exactly like the one Jonathan Lewis has for his Scratchpad. Sorry about that. I’ve tried every possibility, and this is the only one that is comfortable for posting Oracle traces or code snippets. The closest thing I found that wouldn’t wrap the code is the Sandbox theme. Others had very small or otherwise irritating fonts.
“Undocumented?!” – Use it or not
Rainy day… Thoughts aren’t well aligned… At the end of the day I was wandering the Internet searching for inspiration. For a moment I thought I had found something interesting. It was on Jonathan Lewis’s blog – something about Oracle traces. But a second after I finished reading, I grasped what the message was about. In a few words, there was confusion, doubt and something else, or so it seemed. The title was Trace Files. And here you go, inspiration came, and as usual from a direction you’d never expect.
Should we use undocumented or unofficial methods? Yes, no doubt – provided, of course, that they are legitimate. I just want to point out a few key elements of the documented versus undocumented approach as I understand and interpret it. In a different scope this might become a discussion about supported versus unsupported, and so on.
Documentation is the right place to go if it exists. If it doesn’t, “hacking” in the right meaning of the word is the way to know the system better than it can possibly be documented. Consider the Oracle tracing facility a free-of-charge, supported debugger. By using it you’d learn about Oracle and the application that uses it, about the specifics of different Oracle versions, the improvements in newer releases and the deficiencies in older ones.
What you lack when you use Oracle tracing is official evidence. If you see a problem in the trace but struggle to reproduce it using documented forms of proof, you might have some trouble building a case for an SR. You will also have certain difficulties explaining to Oracle TS what it is you are trying to solve. And the final point against it, when filing an SR, is that Oracle R&D is staffed by people just like you. Reverse engineering an issue from an attached trace file is the last thing they want to do – the same as with any other trace, whether a stack trace from a JVM or a core dump. It is always better for them to have a test case reproducible at will. However, many times by looking at traces I have found things that let me build a simple test case I wouldn’t have gotten that easily otherwise.
To summarize, as far as I’m concerned, as long as the Oracle traces – given that their generation can be done using supported APIs – allow me to find true problems and make the right decisions, I’ll keep using them even if their format is neither documented nor supported.
Statement Cache causes OutOfMemoryError (JDBC)
How efficient can parse once – execute many get in JDBC? The seemingly obvious recommendation to keep a PreparedStatement open as long as possible is not so easy to follow. Testing a custom-built statement cache solution proved it to be not very scalable: increasing the number of concurrent connections or the size of the cache of open statements led to inevitable OutOfMemoryError exceptions.
I noticed during my testing, by fetching from tables of different “width” and playing with the prefetch row count, that the wider the row or the higher the prefetch count, the faster my cache grew. I debugged my code for many hours, looking for unclosed result sets or something else very obvious. Nope, no luck – everything was correct. And yet the OOME was consistent.
In the end it appeared that the JDBC OracleStatement kept a reference to so-called accessor arrays (like a cache of prefetched row data) for the last executed result set, even when that result set was already closed. The size of that data was proportional to the product of the fetched data (in my tests, table) width and the prefetch row count. I then tried JDBC implicit and explicit caching, and both showed precisely the same memory footprint.
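For reference, here is a sketch of how JDBC implicit statement caching is switched on; the cache size is an arbitrary example:

import java.sql.DriverManager;
import oracle.jdbc.OracleConnection;

// Enable implicit statement caching: closed PreparedStatements with
// identical SQL are returned from the cache instead of being re-created.
OracleConnection oconn =
    (OracleConnection) DriverManager.getConnection(url, user, password);
oconn.setImplicitCachingEnabled(true);
oconn.setStatementCacheSize(50); // arbitrary size for illustration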
I opened an SR, and after a few weeks a nice Oracle TS lady advised me that there was one workaround. It was to use the property FreeMemoryOnEnterImplicitCache, so instead of this:
conn = (OracleConnection) DriverManager.getConnection(url, user, password);
use this
OracleDataSource ods = new OracleDataSource();
Properties props = new Properties();
props.put("user", user);
props.put("password", password);
props.put("oracle.jdbc.FreeMemoryOnEnterImplicitCache", true);
ods.setConnectionProperties(props);
ods.setURL(url);
conn = ods.getConnection();
During testing I didn’t find a way that a custom or explicit cache with a key could be used effectively, especially if it is an LRU cache (the latter appeared to be an LRU type of cache) – that residue data is going to get into the Old Generation before the statement is reused and the arrays are repopulated. I really hope the “bug” is going to be fixed in JDBC 10.2.0.4.
Metalink bug reference: 6082980