Re: SQL performance in prod

From: Cee Pee <carlospena999_at_gmail.com>
Date: Thu, 19 Jul 2018 15:49:47 -0500
Message-ID: <CAPTPB12Wvno3ESQNxZhuaC8GoFRaF4X7LwrfsqPf0Wui8-0Ybg_at_mail.gmail.com>



Nice thread, we get same kind of problems too. Is there a book or paper on how to read the 10053 trace file, like "basics of 10053"?

Reading Tim's response, I think it is a good time someone wrote a book on that topic.

CP

On Thu, Jul 19, 2018 at 9:38 AM, Larry Elkins <elkinsl_at_verizon.net> wrote:

> I’ve normally used DBMS_SHARED_POOL.PURGE with the C option to purge
> specific SQL’s and get a reparse and desired plan. It continued to work ok,
> for me anyway, in 12c, but Carlos Sierra, and maybe some others, had
> mentioned potential issues in 12x. Like the link mentions, I’d normally do
> this to get a re-parse it to pick up a baseline we just implemented, or
> simply to get a re-parse away from an “odd” value such as is already being
> discussed.
>
>
>
> https://carlos-sierra.net/2017/11/22/purging-a-cursor-in-oracle-revisited/
>
>
>
>
>
> Regards,
>
>
>
> Larry G. Elkins
>
> elkinsl_at_verizon.net
>
> 214.695.8605
>
>
>
> *From:* oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_
> freelists.org] *On Behalf Of *Mark W. Farnham
> *Sent:* Thursday, July 19, 2018 9:08 AM
> *To:* tim.evdbt_at_gmail.com; 'Stefan Knecht' <knecht.stefan_at_gmail.com>
> *Cc:* vraman4list_at_gmail.com; 'oracle-l-freelists' <oracle-l_at_freelists.org>
> *Subject:* RE: SQL performance in prod
>
>
>
> well… if you change the number of cpus, then you can get a different
> number of parallel servers and a different hash distribution, so something
> prone to skew of row assignments could in theory be pretty flat with one
> number of cpus getting one number of parallel servers and slow with a
> different number of cpus getting a different number of parallel servers
> with a small number of them getting the load.
>
>
>
> So you were initially right (but for a pretty rare yet interesting case.)
>
>
>
> I don’t know whether JL’s laundry list of “nothing changed, but the
> performance is different” laundry list has been mentioned in this thread,
> but as laundry lists go it is better than most.
>
>
>
> Of course something like Method-R’s tool can pin down what is taking time
> in the pair. That removes all guessing about WHAT is consuming the time,
> which sometimes helps diagnose what triggers the different classes of
> response.
>
> But most likely you are getting entirely different plan, so you want to
> focus on that to rule it out first. It is often the case that non-prod has
> a less robust parameter choice in the collection of regression tests for
> performance. As in the story of the rainy day, if in prod an unusual
> parameter choice produces a plan that is sub-optimal for the majority of
> re-uses of the plan for that query, then you would like to be able to flush
> just that one query from the shared pool and let it get reparsed. I don’t
> know how to do that, despite asking for that rifle shot in place of the
> shot gun flush shared pool that can cause a parse storm decades ago.
>
> Still, if you have the sql text you can modify it with a comment and see
> if a fresh parse is likely to get the good plan and more importantly,
> eliminate the prescience required to ask for a Wolfgang trace.
>
>
>
> In addition to difference of parameter (in the sense of bind variable
> predicate choice as opposed to init stuff), timing of stats collection
> versus running of the query such that some predicates exceed the recorded
> high value is a classic way to get a completely different plan.
>
>
>
> Your mileage may vary. In addition to this oracle-l thread, I would
> certainly also review the Oracle Scratchpad laundry list.
>
>
>
> mwf
>
>
>
> *From:* oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_
> freelists.org <oracle-l-bounce_at_freelists.org>] *On Behalf Of *Tim Gorman
> *Sent:* Thursday, July 19, 2018 9:36 AM
> *To:* Stefan Knecht
> *Cc:* vraman4list_at_gmail.com; oracle-l-freelists
> *Subject:* Re: SQL performance in prod
>
>
>
> Stefan,
>
> I had one too many: hardware changes could not affect execution plans. I
> added that to pad the list without much thought just before pressing SEND,
> and of course regretted it two seconds later.
>
> There are certainly more items to add, but I started with the
> easily-verifiable stuff.
>
> Clearly a 10053 trace is the ultimate, but as the OP noted, it requires
> prescience to be set before it is needed, and prescience isn't always
> available.
>
> 10053 trace output is also not easy to read. Reading two such traces,
> comprehending both, and then comparing and contrasting usually requires
> intelligence and attention to detail approaching the level of Wolfgang
> Breitling.
>
> Thanks!
>
> -Tim
>
> On 7/18/18 21:03, Stefan Knecht wrote:
>
> Tim, you forgot one:
>
>
>
> 7. The fact whether it rains Monday morning or not
>
>
>
> The original anecdote referred to the fact that if it rained, a certain
> employee that normally arrives first on a sunny day, would get to the
> office later - which caused a different employee to first trigger execution
> plan creation, with different bind variables, leading to a different plan.
>
>
>
> So the query would run fast all day on a sunny day, but slow all day when
> it rained.
>
>
>
> Venky - try looking at the values of the bind variables of a good run vs a
> bad run.
>
>
>
>
>
>
>
>
>
> On Thu, Jul 19, 2018 at 5:58 AM, Tim Gorman <tim.evdbt_at_gmail.com> wrote:
>
> Venky,
>
> "Assuming there is not much change in the DB"
>
> Let's narrow down the things that can change an execution plan...
>
> 1. hardware change (i.e. #-cpus, # GB of RAM, storage, etc)
> 2. application software change (i.e. change to the SQL text)
> 3. Oracle software change (i.e. patch, upgrade, etc)
> 4. initialization parameter change
> 5. gathering system statistics
> 6. gathering table, index, column statistics
>
>
> When you state the assumption about "no much change in the DB", I am
> assuming that you're discussing items #1-4.
>
> How about item #5? Can you query the SYS.AUX_STATS$ table and display the
> column PVAL1 where PNAME has the value "DSTART" or "DSTOP"?
>
> How about item #6? Can you display the contents of DBA_TAB_STATS_HISTORY
> for the tables involved in the query? Please refer to the useful blog
> posts by Uwe Hesse HERE
> <https://uhesse.com/2012/04/23/diff_table_stats_in_history-example/> and
> by Marcel-Jan Krijgsman HERE
> <https://mjsoracleblog.wordpress.com/2013/02/19/more-statistics-history/>
> for more information, if necessary?
>
> Hope this helps?
>
> Thanks!
>
> -Tim
>
>
>
> On 7/18/18 15:30, V Raman wrote:
>
> List
>
>
>
> We have a SQL that is performing intermittently bad in our prod env. The
> good ones take 2 to 5 mins, the bad ones run for hours we kill them. They
> run fine in the non prod env. I ran an awsqrpt and based on that I see that
> there are a few executions with the bad ones taking hours. Looking at the
> differences in the execution plan, the good ones have lots of nested loops
> in them, with the bad ones having lots of hash joins.
>
>
>
> I am trying to figure out the cause(s). Assuming there is not much change
> in the DB, the first thing that comes to mind is statistics. Can the
> listers help with ideas? Thanks.
>
>
>
> If anyone is interested is seeing the report, i can provide a link to them
> by email.
>
>
>
> Venky
>
>
>
>
>
>
>
> --
>
> //
>
> zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
>
> Visit us at zztat.net | _at_zztat_oracle | fb.me/zztat | zztat.net/blog/
>
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Jul 19 2018 - 22:49:47 CEST

Original text of this message