
Bobby Durrett's DBA Blog

Oracle database performance

Commit Puzzle

Fri, 2014-08-22 11:54

This graph compares commit time to CPU utilization and redo log write time.  I've included only the hourly intervals with more than 1,000,000 commits.  At these peaks the number of commits ranges from 1 to 1.6 million per hour, so each point on the graph represents roughly the same commit rate.  I'm puzzled by why the commit time bounces around, peaking above 5 milliseconds, when I can't see any corresponding peaks in I/O or CPU.

[Chart: commit time vs. CPU% and redo log write time (commitvscpuio)]

I derived CPU% from DBA_HIST_OSSTAT.  I got the other values by getting wait events from DBA_HIST_SYSTEM_EVENT.  Commit time is log file sync wait time.  Redo write time is log file parallel write wait time.  I converted the wait times to milliseconds so they fit nicely on the chart with CPU%.
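For anyone who wants to try the same kind of calculation, here is a sketch of how CPU% can be derived from DBA_HIST_OSSTAT.  It is the general idea rather than the exact script in the zip below: BUSY_TIME and IDLE_TIME are cumulative, so you take deltas between snapshots.

-- Sketch: OS CPU% per AWR snapshot interval (single instance assumed).
select sn.end_interval_time,
       round(100 *
         (busy.value - lag(busy.value) over (order by sn.snap_id)) /
         ((busy.value - lag(busy.value) over (order by sn.snap_id)) +
          (idle.value - lag(idle.value) over (order by sn.snap_id))),1) cpu_pct
from dba_hist_osstat busy, dba_hist_osstat idle, dba_hist_snapshot sn
where busy.snap_id = sn.snap_id
and busy.dbid = sn.dbid
and busy.instance_number = sn.instance_number
and idle.snap_id = sn.snap_id
and idle.dbid = sn.dbid
and idle.instance_number = sn.instance_number
and busy.stat_name = 'BUSY_TIME'
and idle.stat_name = 'IDLE_TIME'
order by sn.snap_id;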

I thought I would pass this along as a puzzle that I haven’t figured out.

Here is a zip of the script I used to get the data, its raw output, and the spreadsheet I used to make the chart: zip

- Bobby

P.S.  This is on HP-UX 11.31, Itanium, Oracle 11.2.0.3

P.P.S.  Did some more work on this today.  It looks like the high commit time periods have short spikes of long redo log writes even though the average over the hour is still low.  I'm looking at DBA_HIST_EVENT_HISTOGRAM to get a histogram of the log file parallel write waits, and there are a number of waits in the 1024 millisecond bucket when the log file sync time is high on average.
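Here is a rough sketch of the kind of query that pulls those histogram buckets out of AWR.  It is not the exact script I used; the output below also includes the hourly average commit and redo write times, which come from DBA_HIST_SYSTEM_EVENT deltas.

-- Sketch: per-interval counts of log file parallel write waits by
-- millisecond bucket.  DBA_HIST_EVENT_HISTOGRAM stores cumulative
-- counts, so subtract the prior snapshot's count.
select sn.end_interval_time,
       h.wait_time_milli lfpw_milli,
       h.wait_count -
         lag(h.wait_count) over
           (partition by h.instance_number, h.wait_time_milli
            order by h.snap_id) lfpw_count
from dba_hist_event_histogram h, dba_hist_snapshot sn
where h.snap_id = sn.snap_id
and h.dbid = sn.dbid
and h.instance_number = sn.instance_number
and h.event_name = 'log file parallel write'
order by sn.end_interval_time, h.wait_time_milli;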

END_INTERVAL_TIME   LFPW_MILLI LFPW_COUNT AVG_COMMIT_MS AVG_WRITE_MS
------------------- ---------- ---------- ------------- ------------
21-AUG-14 11.00 AM           1     268136    9.14914833   2.45438987
21-AUG-14 11.00 AM           2     453913    9.14914833   2.45438987
21-AUG-14 11.00 AM           4     168370    9.14914833   2.45438987
21-AUG-14 11.00 AM           8      24436    9.14914833   2.45438987
21-AUG-14 11.00 AM          16       5675    9.14914833   2.45438987
21-AUG-14 11.00 AM          32       6122    9.14914833   2.45438987
21-AUG-14 11.00 AM          64       3369    9.14914833   2.45438987
21-AUG-14 11.00 AM         128       2198    9.14914833   2.45438987
21-AUG-14 11.00 AM         256       1009    9.14914833   2.45438987
21-AUG-14 11.00 AM         512        236    9.14914833   2.45438987
21-AUG-14 11.00 AM        1024         19    9.14914833   2.45438987
21-AUG-14 11.00 AM        2048          0    9.14914833   2.45438987
21-AUG-14 02.00 PM           1     522165    2.97787777   1.64840599
21-AUG-14 02.00 PM           2     462917    2.97787777   1.64840599
21-AUG-14 02.00 PM           4     142612    2.97787777   1.64840599
21-AUG-14 02.00 PM           8      17014    2.97787777   1.64840599
21-AUG-14 02.00 PM          16       4656    2.97787777   1.64840599
21-AUG-14 02.00 PM          32       5241    2.97787777   1.64840599
21-AUG-14 02.00 PM          64       1882    2.97787777   1.64840599
21-AUG-14 02.00 PM         128        820    2.97787777   1.64840599
21-AUG-14 02.00 PM         256        149    2.97787777   1.64840599
21-AUG-14 02.00 PM         512         10    2.97787777   1.64840599
21-AUG-14 02.00 PM        1024          2    2.97787777   1.64840599
21-AUG-14 02.00 PM        2048          0    2.97787777   1.64840599

There were 19 waits over half a second in the first hour and only 2 in the second hour.  Maybe all the log file sync waits pile up waiting for those long writes.  Here is a graph that compares the number of waits over half a second – the 1024 ms bucket – to the average log file sync and log file parallel write times for the hour:

[Chart: waits over half a second (1024 ms bucket) vs. average log file sync and log file parallel write times (longwrites)]

You can see that the average redo write time goes up a little but the commit time goes up more.  Maybe commit time is more affected by a few long spikes than by a lot of slightly longer write times.

Found a cool blog post that seems to explain exactly what we are seeing: blog post

 

Categories: DBA Blogs

Commit scalability

Wed, 2014-08-20 13:57

I am learning about how well commits scale on Oracle 11.2 and so far they seem to scale surprisingly well.

I've looked at two waits – log file parallel write and log file sync.  Based on documents I've read on Oracle's support site, log file parallel write represents the time it takes to do one write to the redo logs.  For mirrored redo logs the log file parallel write time includes the time to write to both of the copies.  Log file sync represents the time it takes for a session to complete a commit and should include all the time measured for the write to the redo logs plus additional CPU time to process the commit.  So, the log file sync time should equal or exceed the log file parallel write time.

Looking at AWR data I found that at peak times one of our databases had 1 millisecond log file parallel write waits and about 1.2 million waits per hour.  Since there are 3.6 million milliseconds in an hour it seemed to me that during this peak hour the redo logs were about 33% utilized because writes to the redo logs were occurring during 1.2 million of the available 3.6 million milliseconds.  I decided to look at a simple queuing theory model that I had read about in Craig Shallahamer’s Forecasting Oracle Performance book to get a basic idea of how queuing might impact redo log write time as the utilization of the redo log grew closer to 100%.

[Chart: queuing model of predicted redo write time vs. writes per hour (scalability)]

This model predicts that the redo log write time will go through the roof as the number of writes per hour approaches 3.6 million, assuming a constant 1 millisecond write time.
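To make the math concrete, the classic single-server queuing formula is response time = service time / (1 - utilization).  This little query is just my illustration of that formula with a fixed 1 millisecond service time; it is not from the zip, and the model behind the chart may differ in detail:

-- Predicted redo write time at various write rates, assuming a constant
-- 1 ms service time and 3,600,000 available milliseconds per hour.
select writes_per_hour,
       round(writes_per_hour / 3600000, 3) utilization,
       round(1 / (1 - writes_per_hour / 3600000), 1) predicted_write_ms
from (select column_value writes_per_hour
      from table(sys.odcinumberlist(1200000,2400000,3000000,3400000,3590000)));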

To attempt to confirm the predictions made by this graph I decided to build some scripts that run a bunch of commits on a test database and attempt to max out the writes to the redo log, so I could graph the results and compare them to the theoretical model.  In the test I had twenty tables named TEST1, TEST2,…, TEST20.  Each table has one row and one column.  I ran tests with 1, 2, 5, 10, and 20 updating processes designed to generate a bunch of commits quickly.  Each process ran 100,000 updates and commits like these:

update test1 set a=a+1;
commit;
update test1 set a=a+1;
commit;
update test1 set a=a+1;
commit;

Each process had its own table – i.e. process 15 acted on table TEST15.
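For reference, a flat script of update/commit pairs like that is easy to generate; one way (not necessarily how the scripts in the zip below were built) is to spool the output of a query like this to commits1.sql:

-- Sketch: generate 100,000 update/commit pairs for table TEST1.
-- Spool to commits1.sql with heading, feedback, etc. turned off.
select 'update test1 set a=a+1;' || chr(10) || 'commit;'
from dual connect by level <= 100000;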

My hope was that as I ran tests with increasing numbers of processes running in parallel eventually I would max out the writes to the redo log and see increasing log file parallel write wait times.  But, surprisingly, as the commit rate increased the redo log write rate actually went down.

[Chart: commit rate vs. redo log write rate (commitswrites)]

This is just one test, but it makes me wonder if I can max out the writes to the redo log.  I believe that as the commit rate increases the database batches the commits together in some efficient way which makes commits more scalable than I realized.

I think that an Oracle database must have some limit to commit scalability that relates more to the CPU used to process the commits than to the writes to the redo logs.  In these same tests the log file sync or commit time did increase slowly as the number of commits ramped up.

[Chart: average commit (log file sync) time vs. commit rate (committime)]

It started around half a millisecond at 3 million commits per hour and slowly grew to almost 2 milliseconds at 10 million commits per hour.  So, commit time grew, but nothing like the original response time graph which went through the roof at around 3.6 million writes to the redo logs per hour.

Here is a zip of the scripts I used to generate the data, the spreadsheets I used to make the charts, and the results from the tests: zip

To run the tests yourself enter your own username and password at the top of setup.sql, commits.sql, and report.sql.  Then run one of the five test scripts – run1.sh, run2.sh, run5.sh, run10.sh, or run20.sh – like this:

./run5.sh

Wait for the last process to finish writing its output, then run the report:

./report.sh

Results look like this:

WAIT_EVENT              ELAPSED_SECS WAITS_PER_HOUR WAITS_PER_SECOND AVG_WAIT_MS AVG_WAIT_SEC
----------------------- ------------ -------------- ---------------- ----------- ------------
log file parallel write          542     3008922.51       835.811808  .648076577   .000648077
log file sync                    542     5306207.38       1473.94649   1.1727226   .001172723

I'm really just getting started understanding how commits scale, but it was surprising to me how hard it was to get the rate of redo log writes high enough to cause the write time to increase due to queuing.  I assume this is because the database batches commits together more efficiently than I expected, which makes commits more scalable than I realized.

- Bobby

P.S.  This is on HP-UX 11.31, Itanium, Oracle 11.2.0.3.0

Categories: DBA Blogs

Passed OCP 12c test

Tue, 2014-08-05 15:27

Well, I finally passed the Oracle Certified Professional (OCP) 12c upgrade exam, 1Z0-060.  I got 86% right on the new features section and 78% correct on the general DBA section.  So, that means I missed roughly 7 questions on each section, which was more than I expected because I felt good about all of my answers.

I’m happy with the three resources I used which are:

  1. Self Test software – $99
  2. Matthew Morris book – $10
  3. Sam Alapati book – $44

I spent a lot of time in the 12c manuals.  Be careful, though: the 12.1.0.2 manuals just came out and some things in 12.1.0.2 differ from the test, because the test designers based the test on 12.1.0.1.

I built two Linux virtual machines for test preparation.  On one I installed the 12.1.0.1 database software and created a CDB and a non-CDB for testing.  On the other I installed the grid software and database software and created a CDB on ASM disks.  I spent many hours testing 12c features on these two environments and three databases.

I learned a ton both about new 12c features and features that exist in earlier versions, but it was a long and tedious process and I’m glad to finally be done.

- Bobby

Categories: DBA Blogs

Query with new plan

Mon, 2014-07-28 18:24

I came up with a simple query that shows a running SQL executing a different plan than what it had in the past.  Here is the query:

-- show currently executing sqls that have history
-- in AWR but that have never run with the current plan.
-- joins v$session to v$sql to get plan_hash_value of
-- the executing sql.
-- queries dba_hist_sqlstat for previously used values
-- of plan_hash_value.
-- only reports queries that have an older plan that is
-- different from the new one.

select
  vs.sid,
  vs.sql_id,
  vs.last_call_et,
  sq.plan_hash_value
from
  v$session vs,
  v$sql sq
where
  vs.sql_id = sq.sql_id and
  vs.sql_child_number = sq.child_number and
  sq.plan_hash_value not in
    (select ss.plan_hash_value
     from dba_hist_sqlstat ss
     where ss.sql_id = sq.sql_id) and
  0 <
    (select count(ss.plan_hash_value)
     from dba_hist_sqlstat ss
     where ss.sql_id = sq.sql_id);

Example output:

       SID SQL_ID        LAST_CALL_ET PLAN_HASH_VALUE
---------- ------------- ------------ ---------------
       229 cq8bhsxbbf9k7            0      3467505462

This was a test query.  I ran it a bunch of times with an index and then dropped the index after creating an AWR snapshot.  The query executed with a different plan when I ran it without the index.  The same type of plan change could happen in production if an index were accidentally dropped.

I'm hoping to use this query to show production queries that have run in the past but whose current plan differs from any that they have used before.  Of course, a new plan doesn't necessarily mean you have a problem, but it might be helpful to be able to recognize plans that are new and differ from any used in the past.

- Bobby

 

Categories: DBA Blogs

12c Threaded Execution Test

Tue, 2014-07-22 17:39

I did a quick check of some facts I’m studying about Oracle 12c and its new threaded execution mode.  I set this parameter:

alter system set THREADED_EXECUTION=true scope=spfile;

I had to connect as SYS AS SYSDBA with a password to bounce the instance, because operating system authentication is not available with threaded execution.

Then it had these processes only:

oracle    1854     1  0 09:17 ?        00:00:00 ora_pmon_orcl
oracle    1856     1  0 09:17 ?        00:00:00 ora_psp0_orcl
oracle    1858     1  2 09:17 ?        00:00:00 ora_vktm_orcl
oracle    1862     1  3 09:17 ?        00:00:00 ora_u004_orcl
oracle    1868     1 99 09:17 ?        00:00:17 ora_u005_orcl
oracle    1874     1  0 09:17 ?        00:00:00 ora_dbw0_orcl

This differs from some of my 12c OCP study material but agrees with the manuals.  Only pmon, psp, vktm, and dbw have dedicated processes.

Also, I found that I needed this value in the listener.ora:

dedicated_through_broker_listener=on

I needed that value to connect using a thread.  Before I put it in, the listener spawned a dedicated server process when I connected over the network.
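A quick way to confirm that a connection really came in on a thread is the EXECUTION_TYPE column that 12c added to V$PROCESS; something like this from the new session (my own quick check, not from the study material):

-- Shows whether this session's server is running as a THREAD or a PROCESS.
select p.spid, p.stid, p.execution_type
from v$process p, v$session s
where s.paddr = p.addr
and s.sid = sys_context('USERENV','SID');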

Lastly, contrary to what I had read I didn’t need to set the local_listener parameter to get the new connections to use a thread:

SQL> show parameter local_listener

NAME                                 TYPE        VALUE
------------------------------------ ----------- ---------------------
local_listener                       string

- Bobby

Categories: DBA Blogs

Finished first pass through Alapati 12c OCP upgrade book

Fri, 2014-07-11 17:29

I just finished reading Sam Alapati's 12c OCP upgrade book for the first time and I really like it because it covers content that I hadn't discovered through my study of the Oracle manuals.  It also does a good job explaining some things that Oracle's manuals leave unclear.

After reading each chapter I took the end of chapter test and got between 60% and 75% of the questions right.  Next I plan to take the computer-based test on the CD that came with the book, which covers both parts of the upgrade exam.

I did find minor errors throughout the book, but I still found it very useful, especially after having already studied the same topics on my own without a study guide like this one to direct me.  The author's insights into the test and the material it covers add value because they guide me to the areas that I need to focus on.

- Bobby

Categories: DBA Blogs

Used Delphix to quickly recover ten production tables

Tue, 2014-07-08 10:24

Yesterday I used Delphix to quickly recover ten production tables that had accidentally been emptied over the weekend.  We knew that at a certain time on Saturday the tables were fully populated and that some batch processing wrecked them after that, so we created a new virtual database that was a clone of production as of the date and time just before the problem occurred.  We could have accomplished the same task using RMAN to clone production, but Delphix spun up the new copy more quickly than RMAN would have.

The source database is 5.4 terabytes and there were about 50 gigabytes of archive logs that we needed to apply to recover to the needed date and time.  It took about 15 minutes to complete the clone including applying all the redo.  The resulting database occupies only 10 gigabytes of disk storage.

If we had used RMAN we would first have had to add more disk storage because we don't have a system with enough free space to hold a copy of the needed tablespaces.  Then, after waiting for our storage and Unix teams to add the needed storage, we would have had to do the restore and recovery.  All these manual steps take time and are prone to human error, but the Delphix system is point and click and done through a graphical user interface (GUI).

Lastly, during the recovery we ran into Oracle bug 7373196, which caused our first attempted recovery to fail with an ORA-00600 [krr_init_lbufs_1] error.  After researching this bug I had to rerun the restore and recovery with the parameter _max_io_size set to 33554432, which is the workaround for the bug.  Had we been using RMAN we probably would have had to run the recovery at least twice to resolve this bug.  Maybe we could have restarted at the point it failed, but I'm not sure.  With Delphix it was just a matter of setting the _max_io_size parameter and starting from scratch, since I knew the process only took 15 minutes.  Actually it took me two or three attempts to figure out how to set the parameter, but once I figured it out it was so simple I'm not sure why I didn't do it right the first time.  So, at the end of the day it was just under 3 hours from my first contact about this issue until the database was up and the application team was able to funnel off the data they needed to resolve the production issue.  Had I been doing an RMAN recovery I don't doubt that I would have worked late into the night yesterday accomplishing the same thing.
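For reference, the workaround itself is just an underscore parameter change before rerunning the recovery; roughly like this, though the exact steps depend on how the clone's parameters are managed:

-- Workaround for bug 7373196 (underscore parameters need the double quotes),
-- followed by a restart before retrying the recovery.
alter system set "_max_io_size"=33554432 scope=spfile;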

- Bobby

P.S. These databases are on HP-UX 11.31 on IA64, Oracle version 11.1.0.7.0.

 

 

Categories: DBA Blogs

Sam Alapati’s 12c OCP upgrade book includes test software with key DBA skills

Thu, 2014-06-26 11:42

Sweet!  I just installed the software from the CD that came with Sam Alapati’s book related to the OCP 12c upgrade exam and found that it has questions related to the key DBA skills section of the test.  Now I feel good about my test preparation materials.  The first time I took the test I didn’t have this degree of guidance on what to study, especially on the key DBA skills section.

Now I have questions on key DBA skills from the testing software included with Sam Alapati’s book, questions on key DBA skills in the Kaplan Selftest software,  and a large section on key DBA skills in Matthew Morris’ book.

I've read Matthew's book already, and I've read the key DBA skills part twice.  Now I'm going to start reading Sam's book and then move on to the software tests.  All this to buy me one more correct answer on the test, but I'd rather over-prepare this time than leave it to chance.

- Bobby

Categories: DBA Blogs

Bought Matthew Morris’ OCP 12c upgrade book for Key DBA Skills section

Wed, 2014-06-18 17:22

I haven't even started on Sam Alapati's OCP book, but I realized today that Matthew Morris's OCP 12c upgrade book has an outline of the Key DBA Skills section of the OCP exam, and that is what I need most.  I was two questions away on that section when I took the test the first time, and I need to make sure I'm studying the right things.  After I got back to the office after failing my first attempt (a $245 lesson) I made notes about what I remembered being unsure of, and that gave me some ideas of what to study.  But, it is nice to get some advice from someone else who has passed the exam.  I saved a couple of bucks by getting the Kindle version for $9.99, so it seemed like a safe bet.  Hopefully I can hit it hard with both books and my practice test software, which also covers the key DBA skills, and be able to take the test again by the end of August.

Categories: DBA Blogs

Limits of SQL Optimization Toastmasters Talk

Thu, 2014-06-12 16:48

I think I'm done with the script for my Toastmasters talk: pdf

It's Toastmasters talk 3, "Get to the Point," so I'm trying to do a 5-7 minute talk that focuses on how Oracle's SQL optimizer sometimes chooses a slow plan.  It's tough to say anything meaningful in 7 minutes, and I'm not using any kind of slides, so I have to describe everything verbally.

But, it’s a topic I’m passionate about so I will enjoy giving it.  I kind of got bogged down thinking about ways to relate this information to my non-technical audience’s life and I decided to give up on that.  I would probably need another 7 minutes or more to explain why they should care that there are limits to SQL optimization so I decided to focus on convincing them that the limits exist.

- Bobby

Categories: DBA Blogs

Adding an outline hint to a PS/Query

Wed, 2014-06-11 17:52

I just finished working with a coworker on speeding up a PS/Query.  It was running well in a development environment but not in production.  I tried to find a simple hint or two to make production run like development but didn’t have any luck.  Then I remembered that my script to get a query’s plan prints out a full set of outline hints like this:

  /*+
      BEGIN_OUTLINE_DATA
      USE_HASH_AGGREGATION(@"SEL$E903463E")
      USE_NL(@"SEL$E903463E" "CLS"@"SEL$4")
      USE_NL(@"SEL$E903463E" "SEC"@"SEL$3")
      USE_NL(@"SEL$E903463E" "SOC"@"SEL$4")

... lines removed to make this post shorter ...

      OUTLINE_LEAF(@"SEL$F5BB74E1")
      PUSH_PRED(@"SEL$F5BB74E1" "A1"@"SEL$1" 1)
      OUTLINE_LEAF(@"SEL$E903463E")
      ALL_ROWS
      OPT_PARAM('_unnest_subquery' 'false')
      DB_VERSION('11.2.0.3')
      OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

A coworker edited the PS/Query and tried to add this hint but got errors because of the quotation marks.  So, I simplified the hint by removing all the double quotes and taking out the lines with single quotes because I knew they weren’t needed.  They only related to parameters that I knew were already set in production.

Here is the new quote-less hint:

  /*+
      BEGIN_OUTLINE_DATA
      USE_HASH_AGGREGATION(@SEL$E903463E)
      USE_NL(@SEL$E903463E CLS@SEL$4)
      USE_NL(@SEL$E903463E SEC@SEL$3)
      USE_NL(@SEL$E903463E SOC@SEL$4)

... lines removed to make this post shorter ...

      OUTLINE_LEAF(@SEL$F5BB74E1)
      PUSH_PRED(@SEL$F5BB74E1 A1@SEL$1 1)
      OUTLINE_LEAF(@SEL$E903463E)
      ALL_ROWS
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

We ran the modified PS/Query in production and I verified that the query was running the correct plan.  The only weird thing was that, because the query has a group by, PS/Query stuck the hint in both the group by clause and the select clause.

SELECT /*+ BEGIN_OUTLINE_DATA USE_HASH_AGGREGATION(@SEL$E90346...

... middle of select statement ...

GROUP BY /*+ BEGIN_OUTLINE_DATA USE_HASH_AGGREGATION(@SEL$E90346...

... end of select statement ...

At first this put us off, but really the hint in the group by clause is just a comment and is syntactically insignificant, even though it is ugly.

By the way, here are the options I used to output the outline hint:

select * from 
table(dbms_xplan.display('PLAN_TABLE',NULL,'ADVANCED'));

I get all my plans this way now.  It generates a lot of output that I don’t usually use.  But, seeing it reminds you that it is there if you need it.
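For context, DBMS_XPLAN.DISPLAY with those options reads whatever EXPLAIN PLAN last wrote into the default PLAN_TABLE, so the minimal sequence looks like this (trivial statement just for illustration):

-- Populate PLAN_TABLE, then dump the plan with the ADVANCED format,
-- which includes the outline hint section shown above.
explain plan for
select * from dual;

select * from
table(dbms_xplan.display('PLAN_TABLE',NULL,'ADVANCED'));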

- Bobby

Categories: DBA Blogs

v$sql_cs_statistics bug in 12c?

Wed, 2014-06-04 17:27

I think I've run across a bug where v$sql_cs_statistics is not populated in 12c the way it was in 11.2.0.3.

I’ve messed with adaptive cursor sharing using an example on Tim Hall’s web site: web example

I've tried to break it and change it in various ways to understand adaptive cursor sharing.  I was using some 11g databases, and on those systems the example results in output from v$sql_cs_statistics.  But in 12c the view returns nothing.  Maybe this is normal, but it seems bug-like to me.

Here is a zip of my version of Tim’s example and it’s output on an 11.2.0.3 and 12.1 database: zip

In the working 11.2.0.3 example you have this output from v$sql_cs_statistics:

ORCL::SYSTEM>SELECT * FROM v$sql_cs_statistics WHERE sql_id = '9bmm6cmwa8saf';

ADDRESS          HASH_VALUE SQL_ID        CHILD_NUMBER BIND_SET_HASH_VALUE P EXECUTIONS ROWS_PROCESSED BUFFER_GETS   CPU_TIME
---------------- ---------- ------------- ------------ ------------------- - ---------- -------------- ----------- ----------
0000000089035390 4171522382 9bmm6cmwa8saf            1          2064090006 Y          1          50001         501          0
0000000089035390 4171522382 9bmm6cmwa8saf            0          2342552567 Y          1              3           3          0

On 12c you get nothing:

ORCL:CDB$ROOT:SYSTEM>SELECT * FROM v$sql_cs_statistics WHERE sql_id = '9bmm6cmwa8saf';

no rows selected

The other queries in the example all return results so I guess it is a bug.
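For reference, the sort of queries that do return rows on 12c look like these; this is a sketch of the kind of checks in the example rather than Tim's exact script:

-- Bind sensitivity flags for each child cursor.
select sql_id, child_number, is_bind_sensitive, is_bind_aware, is_shareable
from v$sql
where sql_id = '9bmm6cmwa8saf';

-- Execution count buckets per child cursor.
select sql_id, child_number, bucket_id, count
from v$sql_cs_histogram
where sql_id = '9bmm6cmwa8saf';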

- Bobby

Categories: DBA Blogs

Senior Oracle DBA Position in Tempe, Arizona

Fri, 2014-05-30 17:37

We have a Senior Oracle DBA position open on my team here in Tempe, Arizona.

Here is the link: url.  This link is for non-recruiters only.

We have a good team and would love to have someone join us.

- Bobby

Categories: DBA Blogs

Updated 12c OCP upgrade Self Test software has more questions

Fri, 2014-05-30 17:17

I guess I should have waited around a bit longer before taking my 12c OCP upgrade exam.  I purchased Kaplan’s Self Test software to prepare for the 12c OCP upgrade exam but it only included 50 questions and they only covered the first part of the OCP exam – the part I passed!

I noticed on the Self Test website that the test now includes 105 questions.  So, I got the bright idea of seeing if there was some way to refresh my software with an update and, voilà, now I have 105 questions and 128 flash cards.  Probably I should have waited for this practice software to mature before attempting the 12c OCP upgrade test.  One thing of particular interest to me is that it does include practice questions related to the second part of the exam, which I failed.

So, hopefully I'll be in good shape for the retest.  I am doing a lot of hacking around with different features, but I need to figure out a more methodical way to study.  Still, having taken the test once I know some of the features that I need to study.  Combine that with the new questions in the Self Test software and that will get me closer.  Lastly, I pre-ordered Sam Alapati's book on Amazon, so that should put me over the top.  I was really only two questions away on my first attempt, but I definitely should have waited and done more thorough preparation.

- Bobby

Categories: DBA Blogs

First draft Sudden SQL Slowness Toastmasters talk

Thu, 2014-05-29 15:23

I'm doing Toastmasters to try to improve my speaking, and my third talk will be about why a function on a web site can suddenly slow down due to a change in plan.  It has to be 5 to 7 minutes long and the audience is non-technical.

So, just for fun, here is my first draft.  If you feel like giving me any suggestions that would be great.

First draft of Toastmasters Sudden SQL Slowness talk

- Bobby

Categories: DBA Blogs

SQL Profile appears to not work but really does

Wed, 2014-05-28 16:46

Over the weekend a coworker applied a SQL Profile to a long-running SQL statement, but it looked like the profile was not applied.  For one thing, the SQL statement ran just as long after applying the profile.  Also, the hash value for the plan that the optimizer chose after applying the SQL Profile was not the same as the desired plan's.  But, after looking at it today I've realized that the SQL Profile did apply the desired plan.  It was just that the desired plan wasn't really any more efficient than the one picked by the optimizer.

Here are earlier posts related to our use of SQLT to force a given SQL statement to run with a good plan:

Using coe_xfr_sql_profile.sql to force a good plan

SQLT Emergency Plan Change

In this weekend’s case we were looking at sql_id fpa2sb9pt9d6s.  It looked like plan 1394068771 was faster than the most recently used plan because it had a lower average elapsed time:

SQL_ID        PLAN_HASH_VALUE END_INTERVAL_TIME         EXECUTIONS_DELTA Elapsed Average ms CPU Average ms IO Average ms Cluster Average ms Application Average ms Concurrency Average ms Average buffer gets Average disk reads Average rows processed
------------- --------------- ------------------------- ---------------- ------------------ -------------- ------------- ------------------ ---------------------- ---------------------- ------------------- ------------------ ----------------------
fpa2sb9pt9d6s      1615990741 11-MAY-14 11.00.06.474 AM                1         21550951.2        9720630    467066.276                  0                  27.24              82825.793           126238536            1913606                      0
fpa2sb9pt9d6s      1394068771 18-MAY-14 12.00.52.752 PM                1         1687493.09         657760    402607.781                  0                 39.814               36561.61             4701985             566205                      0
fpa2sb9pt9d6s      1611789738 25-MAY-14 10.00.59.900 AM                1         11390629.8        6835390    462073.968                  0                 32.818              39010.945            86448768             993957                      0

So we used SQLT to force fpa2sb9pt9d6s to use plan 1394068771 but instead it appeared to use plan 420390762 and it didn’t run any faster:

SQL_ID        PLAN_HASH_VALUE END_INTERVAL_TIME         EXECUTIONS_DELTA Elapsed Average ms CPU Average ms IO Average ms Cluster Average ms Application Average ms Concurrency Average ms Average buffer gets Average disk reads Average rows processed
------------- --------------- ------------------------- ---------------- ------------------ -------------- ------------- ------------------ ---------------------- ---------------------- ------------------- ------------------ ----------------------
fpa2sb9pt9d6s      1615990741 11-MAY-14 11.00.06.474 AM                1         21550951.2        9720630    467066.276                  0                  27.24              82825.793           126238536            1913606                      0
fpa2sb9pt9d6s      1394068771 18-MAY-14 12.00.52.752 PM                1         1687493.09         657760    402607.781                  0                 39.814               36561.61             4701985             566205                      0
fpa2sb9pt9d6s      1611789738 25-MAY-14 10.00.59.900 AM                1         11390629.8        6835390    462073.968                  0                 32.818              39010.945            86448768             993957                      0
fpa2sb9pt9d6s       420390762 25-MAY-14 11.00.27.175 PM                1         19055402.5        8975130    492406.429                  0                  24.93              64279.331           120027479            1943256                      0

But if you look at the plan for fpa2sb9pt9d6s that has hash value 420390762, it says it is using the SQL Profile we created, which should result in a plan with hash value 1394068771:

Note
-----
   - SQL profile "coe_fpa2sb9pt9d6s_1394068771" used for this statement
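One way to pull both plans for a comparison like the diff below is DBMS_XPLAN.DISPLAY_AWR; a sketch, assuming both plan hash values are still retained in AWR for this sql_id:

select * from table(dbms_xplan.display_awr('fpa2sb9pt9d6s', 1394068771));
select * from table(dbms_xplan.display_awr('fpa2sb9pt9d6s', 420390762));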

So, I dumped out both plans, 420390762 and 1394068771, edited out the irrelevant columns from the plan output, and did a diff, which came up with this result:

45c45
< |  41 |                          TABLE ACCESS FULL      | SYS_TEMP_0FD9D67B9_9DB13F0D |       |       |  Q2,01 | PCWP |            |
---
> |  41 |                          TABLE ACCESS FULL      | SYS_TEMP_0FD9D6897_9DB13F0D |       |       |  Q2,01 | PCWP |            |

420390762 and 1394068771 were identical except that the system-generated temporary segment name was slightly different.  The temp segment name affects the hash value for the plan, so this is a gotcha when using plan hash values to compare plans.  Two distinct plan hash values could really represent the same plan if the plan includes system-generated temporary segments.

So, SQLT did successfully apply a given plan to a given sql_id using a SQL Profile but it didn’t improve performance and the resulting plan hash value was not what we expected.

- Bobby

Categories: DBA Blogs

Standalone 12c grid install example

Wed, 2014-05-21 14:33

Here are my rough screenshots from an install of 12c grid on a standalone Linux vm: pdf

I'm running version 6 of Oracle's Linux in a 4 gigabyte VMware Player virtual machine on my Windows 7 laptop.  I ran into a couple of key points that I wanted to document.

[Screenshot: adding two new disks to the VM (newdisks)]

I added two new disks to my VM, but after booting the machine they disappeared.  I ended up adding them back in and rebooting again, and this time they stayed.  Not sure what to say except: after you add them, reboot and make sure they are still there.

[Screenshot: new disk devices (diskdevices)]

The new disk devices showed up as /dev/sdb and /dev/sdc.  Then I had to run fdisk on each one to create a partition that took up the entire disk.  That led to devices being out there called /dev/sdb1 and /dev/sdc1.

Next I had to do some steps related to the oracleasm utility, but I found out later that the step I should have done first was to disable SELinux:

[Screenshot: disabling SELinux (disableselinux)]

I had to edit the file /etc/selinux/config so that SELINUX=disabled and reboot.  Then I ran these commands to configure oracleasm:

yum install oracleasm-support
/usr/sbin/oracleasm configure -i
/usr/sbin/oracleasm init
/usr/sbin/oracleasm createdisk DISK1 /dev/sdb1
/usr/sbin/oracleasm createdisk DISK2 /dev/sdc1
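At this point you can double-check that the labels took; this step isn't in my screenshots, but it is a simple sanity check:

/usr/sbin/oracleasm listdisks

It should list DISK1 and DISK2.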

The next tricky thing was figuring out the path to these two disks during the creation of the disk group as part of the grid install.  Here is what the install screen looked like:

[Screenshot: no ASM disks listed on the disk group creation screen (noasmdisks)]

It didn’t have any disks to choose from.  I tried various discovery paths such as /dev/sd* and ORCL:* but finally found one that worked:

[Screenshot: the ASM disk discovery path that worked (oracleasmdiscoverypath)]

Now the disks showed up and I was able to continue:

[Screenshot: ASM disks now visible (asmdisksvisible)]

So, that’s all I wanted to point out, just a few key things I had to understand.  You can check out my link for all the screenshots I took.

- Bobby

Categories: DBA Blogs

12c Online Table Redefinition Examples

Fri, 2014-05-16 16:22

I’ve been learning about online table redefinition in Oracle 12c.  Here is a zip of some test scripts I built for my own understanding: zip

I spent some time trying out DBMS_REDEFINITION on an 11.2 Exadata test system and then started messing with it on 12c in a generic Linux VM.  The 12c version of DBMS_REDEFINITION includes a new procedure called REDEF_TABLE, which lets you do in one step certain things that took multiple calls to the package in 11.2.  This is an online table compress on 12c:

BEGIN
  DBMS_REDEFINITION.REDEF_TABLE(
    uname                      => user,
    tname                      => 'TEST',
    table_compression_type     => 'COMPRESS');
END;
/

Things that can't be done in one step like this require calls to procedures such as CAN_REDEF_TABLE, START_REDEF_TABLE, REGISTER_DEPENDENT_OBJECT, COPY_TABLE_DEPENDENTS, and FINISH_REDEF_TABLE.  Example online12c5.sql uses all of these; a rough sketch of the multi-step flow follows the file list below.  Here is a summary of each included file's test:

online12c1.sql – compress table in one step

online12c2.sql – compress table in multiple steps and show that it creates a hidden column without a primary key

online12c3.sql – same as previous test but with primary key

online12c4.sql – copy constraints and indexes

online12c5.sql – change the columns for the non-unique index

online12c6.sql – change order and type of columns
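Here is a rough sketch of the multi-step flow; the table and interim table names are made up, and I've left out REGISTER_DEPENDENT_OBJECT, so see the zip for the real scripts:

declare
  num_errors pls_integer;
begin
  -- make sure the table qualifies for online redefinition
  dbms_redefinition.can_redef_table(user, 'TEST',
    dbms_redefinition.cons_use_pk);

  -- start copying rows into a pre-created interim table
  dbms_redefinition.start_redef_table(user, 'TEST', 'TEST_INTERIM');

  -- copy indexes, constraints, triggers, and grants
  dbms_redefinition.copy_table_dependents(user, 'TEST', 'TEST_INTERIM',
    dbms_redefinition.cons_orig_params, true, true, true, false,
    num_errors);

  -- swap in the new definition
  dbms_redefinition.finish_redef_table(user, 'TEST', 'TEST_INTERIM');
end;
/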

- Bobby

Categories: DBA Blogs

Create new cdb with two pdbs

Tue, 2014-05-13 17:38

Quick post.  I'm messing with 12c again.  I finished recreating my CDB, but this time I found the option to create multiple PDBs during the initial creation.  I'm using DBCA.
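For the record, DBCA's silent mode has equivalent options; this is a sketch with placeholder names and passwords (flags may vary slightly by 12c version), not something taken from the PDF below:

dbca -silent -createDatabase \
  -templateName General_Purpose.dbc \
  -gdbname cdb1 -sid cdb1 \
  -createAsContainerDatabase true \
  -numberOfPDBs 2 -pdbName pdb \
  -pdbAdminPassword ChangeMe_1 \
  -sysPassword ChangeMe_1 -systemPassword ChangeMe_1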

PDF with screenshots of me creating a cdb with two pdbs

- Bobby

Categories: DBA Blogs

Simple tuning example but uncertain test results

Tue, 2014-05-06 16:58

Last week I came across what seemed like a simple query tuning problem.  A PeopleSoft batch job ran for many hours, and when I did an AWR report I found that the top query was doing a full scan when an index should help.

Here is the query and its bad plan:

SQL_ID 1jx5w9ybgb51g
--------------------
UPDATE PS_JGEN_ACCT_ENTRY 
SET 
JOURNAL_ID = :1, 
JOURNAL_DATE = TO_DATE(:2,'YYYY-MM-DD'), 
FISCAL_YEAR = :3,
ACCOUNTING_PERIOD = :4,
GL_DISTRIB_STATUS = 'D', 
JOURNAL_LINE = :5 
WHERE 
PROCESS_INSTANCE = 6692638 AND
GL_DISTRIB_STATUS = 'J'  AND 
ACCOUNT=:6 AND 
DEPTID=:7 AND 
CURRENCY_CD=:8 AND
FOREIGN_CURRENCY=:9

Plan hash value: 1919611120

-----------------------------------------------------------------------------------------
| Id  | Operation          | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |                    |       |       | 21649 (100)|          |
|   1 |  UPDATE            | PS_JGEN_ACCT_ENTRY |       |       |            |          |
|   2 |   TABLE ACCESS FULL| PS_JGEN_ACCT_ENTRY |     1 |    58 | 21649   (5)| 00:01:27 |
-----------------------------------------------------------------------------------------

The problematic batch job ran variations of this query with different literal values for PROCESS_INSTANCE corresponding to each flat file being loaded.  Three updates of this type were in the AWR report for the 16-hour period that covered the run of the batch job:

  Elapsed      CPU                  Elap per  % Total
  Time (s)   Time (s)  Executions   Exec (s)  DB Time    SQL Id
---------- ---------- ------------ ---------- ------- -------------
    16,899      5,836        3,811        4.4     3.5 4h54qqmbkynaj

UPDATE PS_JGEN_ACCT_ENTRY SET JOURNAL_ID = :1, JOURNAL_DATE = TO_DATE(:2,'YYYY-M
M-DD'), FISCAL_YEAR = :3, ACCOUNTING_PERIOD = :4, GL_DISTRIB_STATUS = 'D', JOURN
AL_LINE = :5 WHERE PROCESS_INSTANCE = 6692549 AND GL_DISTRIB_STATUS = 'J' AND A
CCOUNT=:6 AND DEPTID=:7 AND CURRENCY_CD=:8 AND FOREIGN_CURRENCY=:9

     6,170      2,190        1,480        4.2     1.3 a5rd6vx6sm8p3

UPDATE PS_JGEN_ACCT_ENTRY SET JOURNAL_ID = :1, JOURNAL_DATE = TO_DATE(:2,'YYYY-M
M-DD'), FISCAL_YEAR = :3, ACCOUNTING_PERIOD = :4, GL_DISTRIB_STATUS = 'D', JOURN
AL_LINE = :5 WHERE PROCESS_INSTANCE = 6692572 AND GL_DISTRIB_STATUS = 'J' AND A
CCOUNT=:6 AND DEPTID=:7 AND CURRENCY_CD=:8 AND FOREIGN_CURRENCY=:9

     6,141      1,983        1,288        4.8     1.3 1jx5w9ybgb51g

UPDATE PS_JGEN_ACCT_ENTRY SET JOURNAL_ID = :1, JOURNAL_DATE = TO_DATE(:2,'YYYY-M
M-DD'), FISCAL_YEAR = :3, ACCOUNTING_PERIOD = :4, GL_DISTRIB_STATUS = 'D', JOURN
AL_LINE = :5 WHERE PROCESS_INSTANCE = 6692638 AND GL_DISTRIB_STATUS = 'J' AND A
CCOUNT=:6 AND DEPTID=:7 AND CURRENCY_CD=:8 AND FOREIGN_CURRENCY=:9

The batch job ran about 15 and a half hours, so these three plus others like them surely combined to make up the bulk of the run time.

It made sense to me to just add an index on all the columns in the where clause: PROCESS_INSTANCE, GL_DISTRIB_STATUS, ACCOUNT, DEPTID, CURRENCY_CD, FOREIGN_CURRENCY.
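Here is roughly the index I had in mind; the PSAJGEN_ACCT_ENTRY name matches the index that shows up in the test plans below, but treat the exact name and column order as my assumption:

create index sysadm.psajgen_acct_entry on sysadm.ps_jgen_acct_entry
  (process_instance, gl_distrib_status, account, deptid,
   currency_cd, foreign_currency);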

Just to check how selective this combination of columns might be, I did a count on each grouping of these columns and came up with about 50 rows per combination:

>select max(cnt),avg(cnt),min(cnt)
  2  from
  3  (select
  4  PROCESS_INSTANCE,
  5  GL_DISTRIB_STATUS,
  6  ACCOUNT,
  7  DEPTID,
  8  CURRENCY_CD,
  9  FOREIGN_CURRENCY,
 10  count(*) cnt
 11  from sysadm.PS_JGEN_ACCT_ENTRY
 12  group by
 13  PROCESS_INSTANCE,
 14  GL_DISTRIB_STATUS,
 15  ACCOUNT,
 16  DEPTID,
 17  CURRENCY_CD,
 18  FOREIGN_CURRENCY);

  MAX(CNT)   AVG(CNT)   MIN(CNT)
---------- ---------- ----------
      9404  50.167041          1

The table itself has 3 million rows so this is pretty selective:

OWNER                TABLE_NAME                       NUM_ROWS     BLOCKS AVG_ROW_LEN SAMPLE_SIZE LAST_ANALYZED       DEGREE     INSTANCES
-------------------- ------------------------------ ---------- ---------- ----------- ----------- ------------------- ---------- ----------
SYSADM               PS_JGEN_ACCT_ENTRY                3145253      82857         204     3145253 2014-04-21 21:07:02          1          1

But, the strange thing was that when we added the index on our test system we didn't see any performance improvement!  We ran the largest production file on test and it ran in ten minutes with or without the index.  Yack!

So, I tried my own test in sqlplus with the select equivalent of the update and hardcoded values instead of bind variables – quick and dirty.  I thought I had extracted some valid values, although I later realized they weren't valid.  Here is what I ran; notice that the full scan ran just as fast as the indexed access:

>select * from
  2  sysadm.PS_JGEN_ACCT_ENTRY
  3  WHERE PROCESS_INSTANCE = 6138803 AND
  4  GL_DISTRIB_STATUS = 'J'  AND ACCOUNT=1234567 AND DEPTID=567 AND CURRENCY_CD='USD' AND
  5  FOREIGN_CURRENCY = NULL;

no rows selected

Elapsed: 00:00:00.30

Execution Plan
----------------------------------------------------------
Plan hash value: 1762298626

---------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                    |     1 |   203 |     0   (0)|          |
|*  1 |  FILTER                      |                    |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| PS_JGEN_ACCT_ENTRY |     1 |   203 |     5   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | PSAJGEN_ACCT_ENTRY |     1 |       |     4   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(NULL IS NOT NULL)
   3 - access("PROCESS_INSTANCE"=6138803 AND "GL_DISTRIB_STATUS"='J' AND
              "CURRENCY_CD"='USD')
       filter(TO_NUMBER("ACCOUNT")=1234567 AND TO_NUMBER("DEPTID")=567 AND
              "CURRENCY_CD"='USD')


Statistics
----------------------------------------------------------
       1761  recursive calls
          0  db block gets
        557  consistent gets
         14  physical reads
          0  redo size
       1866  bytes sent via SQL*Net to client
        239  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
          0  rows processed

>
>select /*+full(PS_JGEN_ACCT_ENTRY) */ * from
  2  sysadm.PS_JGEN_ACCT_ENTRY
  3  WHERE PROCESS_INSTANCE = 6138803 AND
  4  GL_DISTRIB_STATUS = 'J'  AND ACCOUNT=1234567 AND DEPTID=567 AND CURRENCY_CD='USD' AND
  5  FOREIGN_CURRENCY = NULL;

no rows selected

Elapsed: 00:00:00.17

Execution Plan
----------------------------------------------------------
Plan hash value: 3728573827

-----------------------------------------------------------------------------------------
| Id  | Operation          | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                    |     1 |   203 |     0   (0)|          |
|*  1 |  FILTER            |                    |       |       |            |          |
|*  2 |   TABLE ACCESS FULL| PS_JGEN_ACCT_ENTRY |     1 |   203 | 12185   (2)| 00:02:27 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(NULL IS NOT NULL)
   2 - filter("PROCESS_INSTANCE"=6138803 AND "GL_DISTRIB_STATUS"='J' AND
              TO_NUMBER("ACCOUNT")=1234567 AND TO_NUMBER("DEPTID")=567 AND "CURRENCY_CD"='USD')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
       1873  bytes sent via SQL*Net to client
        239  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

It looks like the problem was comparing FOREIGN_CURRENCY to NULL with an equals sign.  An equality comparison to NULL is never true, so the optimizer added the always-false FILTER condition (NULL IS NOT NULL) to both plans, and with both plans the database realized immediately that no rows could match this bogus collection of constants.  So, then I replaced the NULL with a zero and finally we had proof of the performance improvement of the index:

>select * from
  2  sysadm.PS_JGEN_ACCT_ENTRY
  3  WHERE PROCESS_INSTANCE = 6138803 AND
  4  GL_DISTRIB_STATUS = 'J'  AND ACCOUNT=1234567 AND DEPTID=567 AND CURRENCY_CD='USD' AND
  5  FOREIGN_CURRENCY = 0;

no rows selected

Elapsed: 00:00:00.02

Execution Plan
----------------------------------------------------------
Plan hash value: 2047014499

--------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |     1 |   203 |     5   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| PS_JGEN_ACCT_ENTRY |     1 |   203 |     5   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | PSAJGEN_ACCT_ENTRY |     1 |       |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("PROCESS_INSTANCE"=6138803 AND "GL_DISTRIB_STATUS"='J' AND
              "CURRENCY_CD"='USD')
       filter(TO_NUMBER("ACCOUNT")=1234567 AND TO_NUMBER("DEPTID")=567 AND
              TO_NUMBER("FOREIGN_CURRENCY")=0 AND "CURRENCY_CD"='USD')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
       1866  bytes sent via SQL*Net to client
        239  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

>
>select /*+full(PS_JGEN_ACCT_ENTRY) */ * from
  2  sysadm.PS_JGEN_ACCT_ENTRY
  3  WHERE PROCESS_INSTANCE = 6138803 AND
  4  GL_DISTRIB_STATUS = 'J'  AND ACCOUNT=1234567 AND DEPTID=567 AND CURRENCY_CD='USD' AND
  5  FOREIGN_CURRENCY = 0;

no rows selected

Elapsed: 00:00:37.11

Execution Plan
----------------------------------------------------------
Plan hash value: 1758291200

----------------------------------------------------------------------------------------
| Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                    |     1 |   203 | 12185   (2)| 00:02:27 |
|*  1 |  TABLE ACCESS FULL| PS_JGEN_ACCT_ENTRY |     1 |   203 | 12185   (2)| 00:02:27 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROCESS_INSTANCE"=6138803 AND "GL_DISTRIB_STATUS"='J' AND
              TO_NUMBER("ACCOUNT")=1234567 AND TO_NUMBER("DEPTID")=567 AND
              TO_NUMBER("FOREIGN_CURRENCY")=0 AND "CURRENCY_CD"='USD')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      56110  consistent gets
      55409  physical reads
          0  redo size
       1873  bytes sent via SQL*Net to client
        239  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

So, I can't tell you exactly what happened in test, but I suspect that we were passing a null into one of the bind variables, because some data was missing when we ran a production file against an out-of-date test system, and that we got a similarly efficient filter.  But, once I forced the equivalent of the production full scan by supplying non-null values for all the constants, the value of the index became clear.  It went into production last week and this weekend's run took a few minutes instead of 15 hours.

- Bobby

 

 

Categories: DBA Blogs