Re: question on table access by index rowid batched

From: Lok P <loknath.73_at_gmail.com>
Date: Sun, 10 Oct 2021 23:29:03 +0530
Message-ID: <CAKna9VaGzWx9-Tx-wr2M8LWX5b-wgDqh-37dDsOhGFE+o-ie0w_at_mail.gmail.com>

I think you are correct. As Jonathan explained, the issue here is just because of the extra work added by the statistics collector operation which does the blocking/buffering operation to decide future Joins. But yes ~131 seconds of blocking is too much for any first_rows/oltp/search query to wait for. And as you pointed out in one of the cases the same table access by index rowid operation took 5GB and other was less . And to which Jonathan highlighted, the difference in data pattern made that thing happen(which inturn was because of the statistics collector), looks interesting. I have never thought of that.

We also have few of our OLTP environments on 19C but there we have optimizer_adaptive_plans itself set as false and we were planning to enable it based on certain inputs from other team members. Curious to know if others experienced such issues with first_rows type of queries with adaptive plans enabled and we should stop enabling it?

On Sat, Oct 9, 2021 at 9:50 PM Pap <oracle.developer35_at_gmail.com> wrote:

> Thank you Jonathan.
>
> There is a difference of ~350+ seconds between the run time of both the
> plans and out of this, ~131 seconds is contributed by the statistics
> collector which does the additional mathematics/counting to opt for the
> best path at that moment (must be based on the runtime/actual cardinality
> information). This query was actually a search query and running with
> first_rows hint and customer expecting to see the result as first ~50 odd
> rows asap on the screen but with the optimizer_adaptive_reporting_only
> 'false' it was using another set of mathematics(or as you mentioned
> multipasses) in this query adding significant overhead for this quick
> query. So is it recommended to have this parameter set as 'true' mainly in
> the oltp kind of environment ? Or should totally turn the key driver i.e.
> optimizer_adpative_plan as false?
>
> Note -- Just to note, after the upgrade , I think the infra team just kept
> it all default i.e. optimizer_adaptive_plan- true,
> optimizer_adaptive_reporting_only- false, optimizer_adaptive_statistics-
> false. So wondering if it's advisable/ and safe to change few to non
> default?
>
> As you rightly explained, there seems to be no role of rowid batching in
> this performance issue. Again, not much experience in reading
> complex execution paths, but from the path initially I was interpreting, it
> as both plans are exactly the same , with just the difference in 'rowid
> batching operation. But from your explanation, I see I was completely
> wrong. Now, I can see in the good plan case it has read ~28million at step
> ~46 for index TP_PK but in the bad plan case it is not noted in the sql
> monitor. but conversely , ~5GB data being read in the bad plan, at
> step-59 with ~661k read requests and it resulted in ~770k rows there. But
> in case of a good plan those bytes read and number of read requests are not
> noted. So does this point to the fact that the statistics collector is
> really playing a role in the order in which data results out from one
> operation/step to another?
>
>
> On Sat, Oct 9, 2021 at 4:49 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>>
>> Remember I was talking POSSIBILITIES in my previous post, not certainties.
>>
>> With the extra information you have now supplied I have a better
>> hypothesis, based on the fact that you have adaptive_plans_enabled (which I
>> should have realised in the first place), and that the plans you've
>> supplied are different in exactly the TYPE of reason I suggested, but not
>> for the exact reason. My basic hypothesis now is that when you enabled
>> adaptive plans but set it to reporting only Oracle will always include the
>> statistics collector operations, but it will not use them (i.e. not
>> counting etc,) to make a decision about whether to take a hash join or a
>> nested loop join. It will simply generate a path based on its first pass
>> calculations.
>>
>> (You could check this with a suitable set of examples and testing with
>> (a) adaptive plans enable (b) adaptive plans enabled but reporting only (c)
>> adaptive plans disabled. You'd need to set up 4 possibilities: nlj
>> changing to hj due to adaptive plans, hj changing to nlj due to adaptive
>> plans, nlj not changing, hj not changing).
>>
>> So Looking at your SQL Monitor reports - lines 41 and 42 as a starting
>> point:
>>
>> WIth reporting only = TRUE
>>
>> ==============================================================================================================================================================================
>> | Id | Operation |
>> Name | Rows | Cost | Time | Start | Execs | Rows
>> | Read | Read | Mem |
>> | | |
>> | (Estim) | | Active(s) | Active | | (Actual)
>> | Reqs | Bytes | (Max) |
>>
>> ===============================================================================================================================================================================
>> | 41 | INDEX RANGE SCAN | TCX_PK
>> | 1 | 2 | 279 | +2 | 2M |
>> 2M | | | . |
>> | 42 | INDEX RANGE SCAN | TCX_PK
>> | 1 | 2 | | | |
>> | | | . |
>>
>> With reporting only = FALSE
>>
>> ===============================================================================================================================================================================================
>> | Id | Operation |
>> Name | Rows | Cost | Time | Start | Execs | Rows
>> | Read | Read | Write | Write | Mem |
>> | | |
>> | (Estim) | | Active(s) | Active | | (Actual)
>> | Reqs | Bytes | Reqs | Bytes | (Max) |
>>
>> ===============================================================================================================================================================================================
>> | 41 | INDEX RANGE SCAN | TCX_PK
>> | 1 | 2 | | | |
>> | | | | | . |
>> | 42 | INDEX RANGE SCAN | TCX_PK
>> | 1 | 2 | 1 | +131 | 1 |
>> 976K | | | | | . |
>>
>> These two lines are the options available as a result of a statistics
>> collector further up the plan, and one plan takes operation 41, the other
>> takes operation 42
>>
>> Putting them into context (adding in some parent and sibling rows)
>>
>> Reporting only = true ... plan based on estimates
>>
>> ==============================================================================================================================================================================
>> | Id | Operation |
>> Name | Rows | Cost | Time | Start | Execs | Rows
>> | Read | Read | Mem |
>> | | |
>> | (Estim) | | Active(s) | Active | | (Actual)
>> | Reqs | Bytes | (Max) |
>>
>> ===============================================================================================================================================================================
>> | 7 | NESTED LOOPS OUTER |
>> | 1 | 3M | 279 | +2 | 1 | 2M
>> | | | . | | |
>> | 8 | HASH JOIN OUTER |
>> | 1 | 3M | 279 | +2 | 1 | 2M
>> | | | . |
>> | 9 | NESTED LOOPS OUTER |
>> | 1 | 3M | 279 | +2 | 1 | 2M
>> | | | . |
>> | 10 | STATISTICS COLLECTOR |
>> | | | 279 | +2 | 1 | 2M
>> | | | . |
>> | 11 | NESTED LOOPS OUTER |
>> | 1 | 3M | 279 | +2 | 1 | 2M
>> | | | . |
>>
>> ...
>> | 41 | INDEX RANGE SCAN | TCX_PK
>> | 1 | 2 | 279 | +2 | 2M |
>> 2M | | | . |
>> | 42 | INDEX RANGE SCAN | TCX_PK
>> | 1 | 2 | | | |
>> | | | . |
>> | 43 | TABLE ACCESS BY INDEX ROWID | TC
>> | 1 | 2 | 279 | +2 | 2M |
>> 2M | 16037 | 125MB | . | | |
>>
>>
>> Reporting only = false ... plan adapts to counting
>> (OPTERATION 10 dictates a HASH JOIN where the estimated stats suggested
>> NLJ)
>>
>> ===============================================================================================================================================================================================
>> | Id | Operation |
>> Name | Rows | Cost | Time | Start | Execs | Rows
>> | Read | Read | Write | Write | Mem |
>> | | |
>> | (Estim) | | Active(s) | Active | | (Actual)
>> | Reqs | Bytes | Reqs | Bytes | (Max) |
>>
>> ===============================================================================================================================================================================================
>> | 7 | NESTED LOOPS OUTER |
>> | 1 | 3M | 27 | +131 | 1 | 2M
>> | | | | | . |
>> | 8 | HASH JOIN OUTER |
>> | 1 | 3M | 155 | +3 | 1 | 2M
>> | 3035 | 1GB | 3035 | 1GB | 309MB |
>> | 9 | NESTED LOOPS OUTER |
>> | 1 | 3M | 129 | +3 | 1 | 2M
>> | | | | | . |
>> | 10 | STATISTICS COLLECTOR |
>> | | | 129 | +3 | 1 | 2M
>> | | | | | . |
>> | 11 | NESTED LOOPS OUTER |
>> | 1 | 3M | 129 | +3 | 1 | 2M
>> | | | | | . |
>> ...
>> | 41 | INDEX RANGE SCAN | TCX_PK
>> | 1 | 2 | | | |
>> | | | | | . |
>> | 42 | INDEX RANGE SCAN | TCX_PK
>> | 1 | 2 | 1 | +131 | 1 |
>> 976K | | | | | . |
>> | 43 | TABLE ACCESS BY INDEX ROWID | TC
>> | 1 | 2 | 27 | +131 | 2M |
>> 2M | 21549 | 168MB | | | . |
>>
>>
>> As you can see, the ability to switch plans makes a big difference to the
>> amount of time Oracle spends working before it starts executing step 7
>> (start time = +2 seconds vs. +131 seconds).
>>
>> This one extract from the plans doesn't answer the question about why
>> "table access by index rowid batched" has appeared; nor does it explain the
>> total difference between the two executions because you also need to look
>> at the statistics collector at operation 6 and the impact that has on which
>> of operation 45/46 and 47/48 get chosen. The other point that then comes up
>> is that the order in which data appears from a hash join is (almost
>> certainly) different from the order it appears from a nested loop join -
>> which means that when you use the result of the join to probe other tables
>> by index the order of the driving data from one join may be roughly in line
>> with the indexes and data from the next table(s) in the join and benefit
>> from a lot of "self-induced" caching, while the other join might produce
>> data in what is effectively a randomised order resulting in "self-flushing"
>> and more random reads - and that might explain what happens at operations
>> 60/61
>>
>> What my observations suggest is a good argument for why you had such a
>> difference in performance, and that it had nothing to do with the batching
>> or not of the table access by rowid.
>>
>> I may write this up in more detail as a blog some day, but I don't have
>> time right now.
>>
>> Regards
>> Jonathan Lewis
>>
>>
>>
>>
>>
>> On Fri, 8 Oct 2021 at 18:34, Pap <oracle.developer35_at_gmail.com> wrote:
>>
>>> It's fluctuating. I am now not able to reproduce the scenario for that
>>> same small query for which I had posted here just before. And you are
>>> correct , I was luckily having sql monitors saved. I am attaching those
>>> here. If you see that, the main query was showing those rowid batched
>>> operations when we have the optimizer_adaptive_reporting_only set as
>>> default/false and was taking a long time to finish and also the first few
>>> rows were also taking longer to get produced out of the query. But the
>>> sample small query which i had posted a cursor plan for was just showing
>>> the opposite behaviour. That is going for a rowid batched path when
>>> optimizer_adaptive_reporting_only sets as true.
>>>
>>> However, now I am seeing that same small query in both the cases
>>> (irrespective of value of optimizer_adaptive_reporting_only) going for
>>> 'rowid batched' execution path. Not sure if it's just stats or
>>> anything else influencing and I am seeing different things behaviour. Just
>>> to note we have 'optimizer_adaptive_plans' set to true ,
>>> 'optimizer_adaptive_statistics' set to false. The only change we made was
>>> moving ' optimizer_adaptive_reporting_only' from false to true.
>>>
>>> And Jonathan when you said the optimizer_adaptive_reporting_only = true
>>> will introduce 'statistics collector' operation, but if you see the
>>> attached sql monitor for the main query, i am seeing 'statistics collector'
>>> even when optimizer_adaptive_reporting_only is = false. Is that expected
>>> behaviour?
>>>
>>>
>>>
>>
>>
>>

--
http://www.freelists.org/webpage/oracle-l

Received on Sun Oct 10 2021 - 19:59:03 CEST

This message: [ Message body ]
Next message: Jonathan Lewis: "Re: question on table access by index rowid batched"
Previous message: Pap: "Re: question on table access by index rowid batched"
In reply to: Pap: "Re: question on table access by index rowid batched"
In reply to Pap: "Re: question on table access by index rowid batched"
Next in thread: Jonathan Lewis: "Re: question on table access by index rowid batched"
Reply: Jonathan Lewis: "Re: question on table access by index rowid batched"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message