Re: Direct read stopped suddenly

From: Sayan Malakshinov <xt.and.r_at_gmail.com>
Date: Wed, 25 May 2022 02:26:18 +0300
Message-ID: <CAOVevU582Jm_0j=f5BmC5tTDbspVRy8jZj1SYtY9=RbjyU08HA_at_mail.gmail.com>



Hi Pap,

Do you mean to say on DB node-1 the object was fully cached whereas on
> node-2 and 3 and 4 it wasn't, and thus when the query was executed from
> node-2/3/4 it was going for smart scan whereas when it was executed from
> node-1 it was going for buffered read?

yes, absolutely correct

 And should we then flush it from the buffer cache but it can then happen
> again , so how should we handle this odd behaviour?

Yes, you can flush buffer cache, but also you can use "table_stats" hint to force serial direct path reads: we know that Oracle choses serial direct path reads when our segment is bigger than "_very_large_object_threshold" percent of buffer cache. By default, it's equal to 500, i.e. 500%, so if that table segment is 5 times bigger than buffer cache, Oracle should choose direct reads.

Simple example:
https://gist.github.com/xtender/3c2d608f0d8de6b63b78f814903c70ea

SQL> _at_vparam_ _very_large_object_threshold

NAME                                     VALUE        DEFLT        TYPE

    DESCRIPTION

---------------------------------------- ------------ ------------
------------ ------------------------------------------------------------
_very_large_object_threshold             500          TRUE         number

    upper threshold level of object size for direct reads

SQL> select name,block_size,buffers from v$buffer_pool;

NAME                 BLOCK_SIZE    BUFFERS
-------------------- ---------- ----------
DEFAULT                    8192      41118

SQL> col vlot new_value vlot;
SQL> select buffers*5+1 as vlot from v$buffer_pool;

      VLOT


    205591

SQL> create table t_direct as select 1 x from dual;

Table created.

SQL> select name,value from v$statname n,v$sesstat s where s.sid=userenv('sid') and s.statistic#=n.statistic# and name='physical reads direct';

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical reads direct                                                     0

SQL> select count(*) from t_direct;

  COUNT(*)


         1

SQL> select name,value from v$statname n,v$sesstat s where s.sid=userenv('sid') and s.statistic#=n.statistic# and name='physical reads direct';

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical reads direct                                                     0

SQL> select/*+ table_stats(t_direct set rows=1000000 blocks=&vlot) */ count(*) from t_direct;

  COUNT(*)


         1

SQL> select name,value from v$statname n,v$sesstat s where s.sid=userenv('sid') and s.statistic#=n.statistic# and name='physical reads direct';

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical reads direct
*1*

More info:
http://orasql.org/2019/04/16/correct-syntax-for-the-table_stats-hint/ http://orasql.org/2013/03/07/just-test-of-adaptive-direct-path-reads-with-index_stats/

<http://orasql.org/2013/03/07/just-test-of-adaptive-direct-path-reads-with-index_stats/> Full example:
https://gist.github.com/xtender/3c2d608f0d8de6b63b78f814903c70ea

On Tue, May 24, 2022 at 11:13 PM Pap <oracle.developer35_at_gmail.com> wrote:

> Thank you Sayan. This query has been running fine since the last 5-6months
> and always using cell smart scan , as i can see from 'offload returned
> bytes' column value >0 in dba_hist_sqlstat for this query and thus the
> elapsed time was alway <1second and never encountered such behaviour. Do
> you mean to say on DB node-1 the object was fully cached whereas on node-2
> and 3 and 4 it wasn't, and thus when the query was executed from node-2/3/4
> it was going for smart scan whereas when it was executed from node-1 it was
> going for buffered read? And should we then flush it from the buffer cache
> but it can then happen again , so how should we handle this odd behaviour?

>

> On Wed, May 25, 2022 at 1:28 AM Sayan Malakshinov <xt.and.r_at_gmail.com>
> wrote:
>
>> Hi Pap,
>>
>> ntcache is a number of blocks of that object in the buffer cache. Since
>> there are almost all blocks in the cache, oracle decides to do not use
>> direct reads.
>>
>>
>> Best regards,
>> Sayan Malakshinov
>> Oracle performance tuning expert
>> Oracle Database Developer Choice Award winner
>> Oracle ACE
>> http://orasql.org
>>
>> вт, 24 мая 2022 г., 22:29 Pap <oracle.developer35_at_gmail.com>:
>>
>>> Hello Listers, We have one legacy system on version 11.2.0.4 and it is
>>> planned to be migrated to 19C. It's a four node database. We have a select
>>> query suddenly stopped doing smartscan from one node and the response time
>>> goes from <1 seconds to >5seconds. And the surprising thing is , when we
>>> manually trigger the sql , it's doing a smart scan when triggered from
>>> other nodes but from one node only it's not doing that and the response
>>> time going high for that specific node. It seems that one query showing
>>> this behaviour and other running fine.Ideally my understanding was that it
>>> should not vary based on DB node as because the storage nodes are common
>>> across all the four DB nodes. So I'm wondering , why is it happening?
>>>
>>> Collected the 10358 trace and below is the trace. It does show a
>>> difference in the value of "ntcache" and "ntdist" for both the nodes and
>>> that is possibly driving the direct read/smartscan decision. So I wanted to
>>> understand what is that and why that value is different in both DB nodes
>>> and how to fix this odd behaviour?
>>>
>>> query:-
>>> SELECT c1, c8, c2, c3, c4, c5, c6, c7 from TAB1 where c1 = :1
>>>
>>> Plan hash value: 2108848690
>>>
>>> --------------------------------------------------------------------------------------------------------
>>> | Id | Operation                | Name                      | Rows |
>>> Bytes | Cost (%CPU)| Time    |
>>>
>>> --------------------------------------------------------------------------------------------------------
>>> |  0 | SELECT STATEMENT         |                           |      |
>>>  | 1583 (100)|         |
>>> |  1 | TABLE ACCESS STORAGE FULL| TAB1                      |    9 |
>>>  711 | 1583  (1)| 00:00:19 |
>>>
>>> --------------------------------------------------------------------------------------------------------
>>> Query Block Name / Object Alias (identified by operation id):
>>> -------------------------------------------------------------
>>>   1 - SEL$1 / TAB1_at_SEL$1
>>>
>>>
>>> Trace from DB node which is doing smartscan:-
>>>
>>> NSMTIO: qertbFetch:[MTT < OBJECT_SIZE < VLOT]: Checking cost to read
>>> from caches(local/remote) and checking storage reduction factors (OLTP/EHCC
>>> Comp)
>>> NSMTIO: kcbdpc:DirectRead: tsn: 29, objd: 1607047, objn: 1607047
>>> ckpt: 1, nblks: 757301, ntcache: 10, ntdist:30
>>> Direct Path for tsn 29 objd 1607047 objn 1607047
>>> Direct Path 1 ckpt 1, nblks 757301 ntcache 10 ntdist 30
>>> Direct Path mndb 0 tdiob 12 txiob 0 tciob 66
>>> Direct path diomrc 128 dios 2
>>> NSMTIO: Additional Info: VLOT=65089150
>>> Object# = 1607047, Object_Size = 757301 blocks
>>> SqlId = 64d0vkmtjgq2x, plan_hash_value = 2108848690, Partition# = 0
>>> KCBO: prepared ksr msg for cid=131228, tsn=29, obj=1607047, 0
>>>
>>>
>>> Trace from DB node which is not doing smartscan:-
>>>
>>> NSMTIO: qertbFetch:[MTT < OBJECT_SIZE < VLOT]: Checking cost to read
>>> from caches(local/remote) and checking storage reduction factors (OLTP/EHCC
>>> Comp)
>>> NSMTIO: kcbdpc:NoDirectRead:[CACHE_READ]: tsn: 29, objd: 1607047, objn:
>>> 1607047
>>> ckpt: 0, nblks: 757301, ntcache: 756116, ntdist:1185
>>> Direct Path for tsn 29 objd 1607047 objn 1607047
>>> Direct Path 0 ckpt 0, nblks 757301 ntcache 756116 ntdist 1185
>>> Direct Path mndb 18 tdiob 10 txiob 0 tciob 70
>>> Direct path diomrc 128 dios 2
>>> NSMTIO: Additional Info: VLOT=65089150
>>> Object# = 1607047, Object_Size = 757301 blocks
>>> SqlId = 64d0vkmtjgq2x, plan_hash_value = 2108848690, Partition# = 0
>>>
>>

-- 
Best regards,
Sayan Malakshinov
Oracle performance tuning engineer
Oracle ACE
http://orasql.org

--
http://www.freelists.org/webpage/oracle-l
Received on Wed May 25 2022 - 01:26:18 CEST

Original text of this message