Re: Parallel Query change from smart scan to cell single block access
From: Tanel Poder <tanel_at_tanelpoder.com>
Date: Sun, 19 Jun 2022 01:47:23 -0400
Message-ID: <CAMHX9JLBUyZ=gGqEpApjuKQWh0Zaj+eR9pWQV7gsppqgxe=P0Q_at_mail.gmail.com>
This is somewhat related to the topic (long-running transactions causing some optimizations not to be usable):
- EXADATA and SuperCluster : Check if long running transactions are preventing min active scn from progressing, resulting in Storage Indexes not being used (Doc ID 2081483.1)
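
A minimal sketch of checking for that condition (assuming only the standard
gv$transaction / gv$session views, nothing Exadata-specific):

  -- open transactions, oldest first: very old ones can hold back the
  -- minimum active SCN that storage indexes depend on
  SELECT s.inst_id, s.sid, s.serial#, s.username, t.start_time, t.used_urec
    FROM gv$transaction t
    JOIN gv$session s
      ON s.taddr = t.addr AND s.inst_id = t.inst_id
   ORDER BY TO_DATE(t.start_time, 'MM/DD/YY HH24:MI:SS');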

--
Tanel Poder
https://learn.tanelpoder.com

On Sat, Jun 18, 2022 at 8:10 PM Ls Cheng <exriscer_at_gmail.com> wrote:
> Hi
>
> Sorry for the late reply.
>
> I did take snaps of gv$sesstat for the PQ slave that was doing cell
> single block reads, but saw nothing strange.
>
> The following is the sesstat delta between two snapshots taken 60 seconds
> apart; only statistics where value - prev_value > 0 are shown (a sketch of
> the delta query follows the listing):
>
> SNAP_DATE       NAME                                                         VALUE-PREV_VALUE INST_ID   SID
> --------------- ------------------------------------------------------------ ---------------- ------- -----
> 20220614 221231 messages sent                                                               2       7  1221
> 20220614 221231 cluster wait time                                                         110       7  1221
> 20220614 221231 blocks decrypted                                                          364       7  1221
> 20220614 221231 gc hash slot removed                                                      374       7  1221
> 20220614 221231 prefetched blocks aged out before use                                     461       7  1221
> 20220614 221231 pinned buffers inspected                                                  536       7  1221
> 20220614 221231 dirty buffers inspected                                                  2156       7  1221
> 20220614 221231 shared hash latch upgrades - no wait                                     4236       7  1221
> 20220614 221231 user I/O wait time                                                       5439       7  1221
> 20220614 221231 non-idle wait time                                                       5549       7  1221
> 20220614 221231 hot buffers moved to head of LRU                                         6684       7  1221
> 20220614 221231 calls to kcmgas                                                         40158       7  1221
> 20220614 221231 free buffer inspected                                                   66597       7  1221
> 20220614 221231 cell flash cache read hits                                              66681       7  1221
> 20220614 221231 physical read requests optimized                                        66681       7  1221
> 20220614 221231 physical read total IO requests                                         66681       7  1221
> 20220614 221231 physical reads                                                          66681       7  1221
> 20220614 221231 physical read IO requests                                               66681       7  1221
> 20220614 221231 gc local grants                                                         66681       7  1221
> 20220614 221231 physical reads cache                                                    66681       7  1221
> 20220614 221231 free buffer requested                                                   66681       7  1221
> 20220614 221231 gc remote disk read                                                     66681       7  1221
> 20220614 221231 consistent gets from cache                                              68471       7  1221
> 20220614 221231 consistent gets examination                                             68471       7  1221
> 20220614 221231 consistent gets                                                         68471       7  1221
> 20220614 221231 consistent changes                                                      68471       7  1221
> 20220614 221231 transaction tables consistent reads - undo records applied              68471       7  1221
> 20220614 221231 session logical reads                                                   68471       7  1221
> 20220614 221231 non-idle wait count                                                    161928       7  1221
> 20220614 221231 file io wait time                                                    54386966       7  1221
> 20220614 221231 physical read bytes                                                 546250752       7  1221
> 20220614 221231 cell physical IO interconnect bytes                                 546250752       7  1221
> 20220614 221231 physical read total bytes                                           546250752       7  1221
> 20220614 221231 physical read total bytes optimized                                 546250752       7  1221
> 20220614 221231 logical read bytes from cache                                       560914432       7  1221
>
> 35 rows selected.
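>
> A minimal sketch of the delta technique used above (the snapshot table
> sesstat_snap is hypothetical, populated once per interval from gv$sesstat
> joined to v$statname):
>
>   -- take one snapshot for the PX slave of interest
>   INSERT INTO sesstat_snap (snap_date, inst_id, sid, name, value)
>   SELECT SYSDATE, s.inst_id, s.sid, n.name, s.value
>     FROM gv$sesstat s
>     JOIN v$statname n ON n.statistic# = s.statistic#
>    WHERE s.inst_id = 7 AND s.sid = 1221;
>
>   -- repeat after 60 seconds, then diff the latest two snapshots
>   SELECT curr.name, curr.value - prev.value AS delta
>     FROM sesstat_snap curr
>     JOIN sesstat_snap prev ON prev.name = curr.name
>                           AND prev.snap_date < curr.snap_date
>    WHERE curr.value - prev.value > 0
>    ORDER BY delta;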
>
> Thanks
>
>
> On Wed, Jun 15, 2022 at 10:10 PM Tanel Poder <tanel_at_tanelpoder.com> wrote:
>
>> Check the v$sesstat metrics; nowadays there's plenty of feedback from the
>> storage cells telling you how "deep" the offload processing actually went,
>> or whether any pushback/passthrough was triggered.
>>
>> Common causes for smart scan "hiccups" with single block reads:
>>
>>    1. chained rows (a quick check is sketched after this list)
>>    2. migrated rows for some operations (update, maybe select for update,
>>    etc.)
>>    3. uncommitted transactions in blocks (or blocks that are not trivially
>>    cleanable in storage)
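>>
>> A sketch for checking cause 1 (the CHAINED_ROWS table comes from Oracle's
>> utlchain.sql script; the table name t is a placeholder):
>>
>>   -- list chained/migrated rows into the standard CHAINED_ROWS table
>>   ANALYZE TABLE t LIST CHAINED ROWS INTO chained_rows;
>>   SELECT COUNT(*) FROM chained_rows;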
>>
>>
>> But v$sesstat metrics will tell you more.
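>>
>> A minimal sketch of pulling those metrics for one session (gv$sesstat and
>> v$statname are standard views; the inst_id/sid values are placeholders for
>> the slave in question):
>>
>>   SELECT n.name, s.value
>>     FROM gv$sesstat s
>>     JOIN v$statname n ON n.statistic# = s.statistic#
>>    WHERE s.inst_id = 7 AND s.sid = 1221
>>      AND (n.name LIKE 'cell blocks processed%'
>>        OR n.name LIKE 'cell num smart IO sessions%'
>>        OR n.name = 'table fetch continued row')
>>      AND s.value > 0;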
>>
>> Here's an article from 12 years ago, where chained rows caused problems
>> for a smart scan.
>>
>> -
>> http://tech.e2sn.com/oracle/exadata/performance-troubleshooting/exadata-smart-scan-performance
>>
>> Things have somewhat improved since then, but back then the "cell chained
>> rows%" metrics didn't even exist, so I had to reason by looking at other
>> metrics, like "table fetch continued row" and the difference in "processing
>> depth" at the cache & txn layers vs the data layer in the storage cells:
>>
>> 869, DWH , STAT, table fetch continued row             ,  3660,   732,
>> 869, DWH , STAT, cell blocks processed by cache layer  ,  5428, 1.09k,
>> 869, DWH , STAT, cell blocks processed by txn layer    ,  5428, 1.09k,
>> 869, DWH , STAT, cell blocks processed by data layer   ,  3625,   725,
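>>
>> (Spelling out the arithmetic: the cache and txn layers processed 5428
>> blocks each, but the data layer completed only 3625, so 5428 - 3625 =
>> 1803 blocks made it through the first two layers without being fully
>> processed at the data layer; that mismatch was the hint that chained
>> rows were forcing block-by-block processing.)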
>>
>> --
>> Tanel Poder
>> https://learn.tanelpoder.com
>>
>>
>> On Tue, Jun 14, 2022 at 6:26 PM Ls Cheng <exriscer_at_gmail.com> wrote:
>>
>>> Hi all
>>>
>>> I have a strange situation where a very simple query (no join, a single
>>> table in the FROM clause) is sometimes fast (seconds) and sometimes slow
>>> (hours). After digging a bit, it seems that one of the PQ slaves, instead
>>> of accessing the table with cell smart table scan, is accessing it via
>>> cell single block reads.
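>>>
>>> One way to spot the odd slave out, as a sketch (gv$active_session_history
>>> needs the Diagnostics Pack license; :sql_id is a placeholder bind for the
>>> query in question):
>>>
>>>   SELECT inst_id, session_id, event, COUNT(*)
>>>     FROM gv$active_session_history
>>>    WHERE sql_id = :sql_id
>>>    GROUP BY inst_id, session_id, event
>>>    ORDER BY COUNT(*) DESC;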
>>>
>>> This is 19.10; has anyone observed or faced such a problem?
>>>
>>> Thanks
>>>
>>>
-- http://www.freelists.org/webpage/oracle-l