Re: Improve Query sqlm_9csc97utkgjub

From: Jonathan Lewis <jlewisoracle_at_gmail.com>
Date: Fri, 25 Oct 2024 15:48:48 +0100
Message-ID: <CAGtsp8mWOO+rJsu0pKLUpyG5L9N19mBnknXpq4aJbpYWEE5GtQ_at_mail.gmail.com>



If you have 360 columns in the table, any row that uses more than 255 of them will be split into two row pieces, with 255 columns in the second row piece and the remainder in the first row piece. This is sometimes known as intra-block chaining, although it is possible for the 2nd and subsequent row pieces to migrate to another block, in which case you end up with inter-block chaining.

My comments in the blog note about direct path tablescans resulting in non-cached db file sequential reads as Oracle follows chained rows are now out of date - the mechanism was fixed some time in 12c so that those db file sequential reads are cached. (It's still possible that many of your single block reads come from following chained rows - depending on how large your cache is, and how many row pieces are no longer in their original block.)
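
A quick way to check whether a scan is following chained rows is the session statistic "table fetch continued row" - a sketch, assuming you can identify the session running the scan (:scan_sid is a bind you'd supply):

select sn.name, ss.value
from   v$statname sn
join   v$sesstat  ss on ss.statistic# = sn.statistic#
where  sn.name = 'table fetch continued row'
and    ss.sid  = :scan_sid;

Sample it before and after the tablescan - a large increase means Oracle is following row pieces into other blocks.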

Regards
Jonathan Lewis

On Fri, 25 Oct 2024 at 15:21, Amit Saroha <eramitsaroha_at_gmail.com> wrote:

> Thank you for your detailed email and feedback. I will carefully read both
> the blog entries but one thing I have checked is that the last update date
> is not in the middle of the table, the table has 360 plus columns and the
> rows are not chained.
>
> Best Regards,
> AMIT
>
>
> On Thu, Oct 24, 2024 at 6:10 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>>
>> In your case an index on last_update_date will almost certainly be a bad
>> idea. I wrote a blog about exactly that (type of) index on a fairly large
>> E-business system back in 2008, the notes are still relevant:
>> https://jonathanlewis.wordpress.com/2008/09/26/index-analysis/. In your
>> case you're also likely to see a lot of buffer busy waits and ITL waits for
>> index node splits because you have 100's of transactions per minute and the
>> "last update date" is always going to be in the last few seconds making the
>> high end of the index a very hot spot. You might be able to reduce the
>> contention by hash partitioning the index - since you've got the time
>> component with the date, and you'd probably need 64 partitions.
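>>
>> If you did go the hash-partitioned route, a minimal sketch (the index
>> name is hypothetical, and 64 is only a starting point to test):
>>
>> create index oel_lud_ix
>> on oe_order_lines_all (last_update_date)
>> global partition by hash (last_update_date)
>> partitions 64;
>>
>> Hash partitioning by the indexed column spreads the "current time"
>> insertion point across 64 hot spots instead of one - the trade-off is
>> that a range scan on the date has to probe all 64 partitions.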
>>
>> HOWEVER - the work done in that tablescan is mostly single block reads
>> (with a small fraction of the time spent on direct path reads). This is
>> either undo blocks being read to make table blocks read-consistent, or it's
>> Oracle following chained (not migrated) rows, and you need to find out
>> which (or if it's both). A column like "last_update_date" is often stuck at
>> the end of the row - so if your problem is chained rows (i.e. too long to
>> fit in a block and split into pieces, or rows with more than 255 columns whose pieces are spread
>> across blocks) then you may need to rebuild the table with the
>> last_update_date at the start. If the problem is read-consistency you
>> might want to talk to Oracle support about doing tablescans "backwards" as
>> this may help. (I have another blog note on the topic -
>> https://jonathanlewis.wordpress.com/2016/03/14/quiz-2/)
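>>
>> One way to tell the two apart - a sketch, assuming you're licensed to
>> query ASH and know the sql_id - is to group the single block reads by
>> file number and see whether they land in the undo files or in the
>> table's files:
>>
>> select p1 as file#, count(*)
>> from   v$active_session_history
>> where  event  = 'db file sequential read'
>> and    sql_id = :your_sql_id
>> group  by p1
>> order  by count(*) desc;
>>
>> Compare the file numbers against dba_data_files: reads on the undo
>> tablespace point at read-consistency work, reads on the table's
>> tablespace point at chained row pieces.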
>>
>> You may still decide that accessing the rows by index, and bearing the
>> cost of the index maintenance and contention, IS STILL better than the
>> overheads of the tablescan - it might reduce the number of single block
>> reads you do quite dramatically, and that might be the overriding factor.
>> To get some idea of the potential you could run a query to select the
>> rowids for the date range you're interested in, then use the dbms_rowid
>> package to extract the file_id and block_id and count the number of
>> distinct blocks you would access by rowid. If you copied the rowids into
>> a temporary table you could then use it to drive a nested loop joining by
>> rowid (you'd have to hint use_nl() and rowid()) to see how much work you
>> would then have to do visiting undo and following chained rows.
>>
>>
>> That's only one part of your problem, though. The other part is where you
>> do another tablescan of the same table to find 61M rows where
>> chared_periodicity_code is null, and then pass these through several hash
>> joins (including one where the build table is 8.9M rows and the join burns
>> a huge amount of CPU and also spills to disc) before going through a hash
>> right semijoin that drops the row count to 113K. If you can re-order the
>> joins to eliminate most of the rows as early as possible that would help -
>> even better, perhaps, would be to make the code gather the 113K rows from
>> the set$1 union all and use them to drive into the big table by nested
>> loop. Again this tablescan spends a little time on direct path reads and a
>> lot of time on single block reads - so the same checks for undo vs.
>> chaining apply.
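>>
>> As a rough illustration of that shape (the driving table and join column
>> are hypothetical stand-ins for the 113K-row set$1 result):
>>
>> select /*+ leading(s) use_nl(t) */
>>        t.last_update_date
>> from   driving_set s
>> join   oe_order_lines_all t
>> on     t.line_id = s.line_id;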
>>
>> Regards
>> Jonathan Lewis
>>
>> On Wed, 23 Oct 2024 at 18:07, Amit Saroha <eramitsaroha_at_gmail.com> wrote:
>>
>>> Hi Tim,
>>>
>>> I completely agree, but our system processes hundreds of transactions per
>>> minute across many business verticals, including sales orders, shipping,
>>> and billing, making it difficult to reproduce in a test environment.
>>>
>>> Mr. Richard Foote's blog also shows that it can have an impact, so do you
>>> have any suggestions for a restricted set of inputs to test the impact of
>>> updates done by different business processes?
>>>
>>> https://richardfoote.wordpress.com/category/update-indexes/ *"Now with
>>> the non-unique index, this resulted in the index doubling in size as we
>>> created an additional index entry for each and every row. After the
>>> rollback, we were effectively left with an index that not only was twice
>>> the size but had only 1/2 empty leaf blocks."*
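>>>
>>> A rough sketch of reproducing that effect (a hypothetical scratch table;
>>> index_stats is populated by the most recent "validate structure"):
>>>
>>> create table t as
>>> select rownum id, sysdate d
>>> from   dual connect by level <= 100000;
>>>
>>> create index t_d_ix on t(d);
>>>
>>> analyze index t_d_ix validate structure;
>>> select lf_rows, lf_blks from index_stats;        -- baseline
>>>
>>> update t set d = d + 1;
>>> rollback;
>>>
>>> analyze index t_d_ix validate structure;
>>> select lf_rows, lf_blks, del_lf_rows
>>> from   index_stats;                              -- roughly doubled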
>>>
>>>
>>> Best Regards,
>>> AMIT
>>>
>>>
>>> On Wed, Oct 23, 2024 at 12:14 PM Tim Gorman <tim.evdbt_at_gmail.com> wrote:
>>>
>>>> Amit,
>>>>
>>>> <rant>
>>>>
>>>> The statement "*I'm not sure it's okay to establish an index on a
>>>> column that's often changed*" reveals a gap which must first be
>>>> resolved to the best of our ability. I'm sure a simple test case can be
>>>> devised to provide factual evidence, one way or the other?
>>>>
>>>> The only way to determine benefits and costs is to test it out. Nobody
>>>> should simply guess whether something is OK or not, regardless of how many
>>>> years they've been doing this stuff. Nothing stays the same in this field,
>>>> so each time an unknown is encountered, it should be tested empirically.
>>>>
>>>> I hate to call out names, but Jonathan Lewis (who is supposed to be
>>>> retired but is still suspiciously active!) has been working this way for
>>>> decades, yet I have never seen him sit back and make a technical assertion
>>>> based on those decades of experience. Instead, in each case, he creates a
>>>> viable and reproducible test case, then blogs to state the problem, explain
>>>> the test case, describe the results, to arrive at a hypothesis
>>>> transparently. This invites review by peers, and validation (or
>>>> refutation) as necessary. My apologies to Mr Lewis for using his name (and
>>>> reputation) in this manner, but as I began to describe the method without
>>>> mentioning his name, I felt that I was just dancing around the point. I
>>>> certainly do not intend to speak for him, and I certainly do not wish to
>>>> constrain any response he might have.
>>>>
>>>> Anyway, it's a lot of work, but that's why they pay us, and why we're
>>>> on this list asking (and answering) questions.
>>>>
>>>> We're not here to keep applying the same old solutions, day after day,
>>>> year after year, decade after decade, googled from StackOverflow, or
>>>> bloviated from someone with more years in the field who should know
>>>> better. We're here on this list to understand.
>>>>
>>>> My apologies for the rant, and my sincere apologies to Jonathan for
>>>> stolen valor.
>>>>
>>>> </rant>
>>>>
>>>> Hope this helps?
>>>>
>>>> -Tim
>>>>
>>>> P.S. Long story short, add the index and find a way to test the impact
>>>> on modifications as well as queries, then share the testing and results
>>>> with your peers to decide whether adding the index is beneficial enough
>>>> for queries to justify the impact on modifications.
>>>>
>>>>
>>>>
>>>> On 10/23/2024 8:30 AM, Amit Saroha wrote:
>>>>
>>>> Hi Petr,
>>>> Thank you for your feedback.
>>>> The table is regularly updated: the last update date column in a row
>>>> changes every few minutes, so I'm not sure whether it's okay to create
>>>> an index on a column that changes so often.
>>>>
>>>> Best Regards,
>>>> AMIT
>>>>
>>>>
>>>> On Wed, Oct 23, 2024 at 11:22 AM Petr Novak <novak_petr_at_yahoo.de>
>>>> wrote:
>>>>
>>>>> Hello Amit,
>>>>>
>>>>> Table OE_ORDER_LINES_ALL - can you create an index on LAST_UPDATE_DATE?
>>>>> Also check the percentage of chained rows in this table.
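>>>>>
>>>>> (A quick check, assuming the CHAINED_ROWS table has been created from
>>>>> $ORACLE_HOME/rdbms/admin/utlchain.sql:
>>>>>
>>>>> analyze table oe_order_lines_all list chained rows into chained_rows;
>>>>>
>>>>> select count(*)
>>>>> from   chained_rows
>>>>> where  table_name = 'OE_ORDER_LINES_ALL';
>>>>>
>>>>> Divide by num_rows from dba_tables for the percentage.)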
>>>>>
>>>>> Best Regards,
>>>>> Petr
>>>>>
>>>>>
>>>>> On Wednesday, 23 October 2024 at 17:02:56 CEST, Amit Saroha <
>>>>> eramitsaroha_at_gmail.com> wrote:
>>>>>
>>>>>
>>>>> Hi All,
>>>>> I am requesting that you check the included SQL monitoring data and
>>>>> let me know if there is anything we can do to optimize the run time.
>>>>>
>>>>> Thank you for your assistance and advice in advance.
>>>>>
>>>>> Best Regards,
>>>>> AMIT
>>>>>
>>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Oct 25 2024 - 16:48:48 CEST
