Re: DBA Job Functions

From: Sayan Malakshinov <xt.and.r_at_gmail.com>
Date: Tue, 6 Mar 2018 05:52:07 +0300
Message-ID: <CAOVevU7Df7mBJCqBezEstLReLFiotbfosUr2uZqrbba6TcvG-g_at_mail.gmail.com>



Mladen,

During the tuning process, you are not looking for suboptimal plans, you
> are looking where in the application the time is spent.

  • What if client says that slowdown was a couple of hours ago? What sessions and when would you trace? Would you say to client, that you need at least one more slow execution just to analyze it?
  • Some PL/SQL procedure(or query) has been started 2 hours ago and it's still running, though usually it takes just about 10 minutes: you can enable 10046, but how can you be sure that slow part caused a problem hasn't finished before you started trace? What would you get from short trace without a plan and row-source statistics while query is still running?
  • from 10046 trace you can't find:
  • historical data
  • why CBO chooses suboptimal plan
  • information about existing other child cursors
  • who holds locks/latch/mutex/etc that your session is trying to get
  • slowest parts of PL/SQL code (w/o SQL)
  • session statistics
  • details about "On CPU"/runqueue part
  • slowest oracle internal functions

10046 is just wait events + their p1,p2,p3 + row-source statistics(only if cursor was open and closed during tracing). And nothing more... We can easily find where the most of the time is spent using ASH(S-ASH), session snapper or real-time sql monitor. And of course, sometimes, we need more specific details about the problem: in such cases, depending on the type of issue, we can choose what and where to get it: trace events (10046, 10032, 10200, 10053, 100938, etc...), dtrace or diagnostic views.
You agreed that lots of problems caused by plan changes. Why do you want to trace with 10046 in such cases, though we can easily find and fix this problem with more convenient tools?
We can investigate and implement quick fix(SQL profiles/patches/baselines) for such issues usually in 1-5 minutes and permanent fix in 5-30 minutes. How can help 10046 with this?

Sometimes the problem is in a row-by-row fetch

You can easily find this problem using snapper(statistics+sql*net wait events). You can get fetchsize from RTSM. Even v$sql can help to troubleshoot this problem. (
http://orasql.org/2017/11/16/easy-way-to-check-fetchsize/)

ps. I'm too tired now and have no enough time last days, so I'll try to add more details and cases later.
pps. It seems like you forgot or skipped some parts of Cary's book, especially about fixed views, statistics, CPU and unaccounted time... And do not forget that this book was written even before Oracle 10g where ASH has appeared.

On Tue, Mar 6, 2018 at 2:26 AM, Mladen Gogala <gogala.mladen_at_gmail.com> wrote:

> When you say so. You haven't provided any arguments or examples, so I
> cannot discuss your statement. On my part, I am not at all convinced. I
> have good experience with the method-R and will stick to it. I know that
> trace based method-R works and see no reasons to stop using it.
>
>
> On 03/05/2018 09:36 AM, Lothar Flatz wrote:
>
>> Nonsense. You are looking in ASH where the time is spent. If the issue is
>> an useful index not used, you don't need a trace to figure that.
>>
>
> --
> Mladen Gogala
> Database Consultant
> Tel: (347) 321-1217
>
> --
> http://www.freelists.org/webpage/oracle-l
>
>
>

-- 
Best regards,
Sayan Malakshinov
Oracle performance tuning engineer
Oracle ACE Associate
http://orasql.org

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Mar 06 2018 - 03:52:07 CET

Original text of this message