Re: Question on concurrency wait time

From: Pap <oracle.developer35_at_gmail.com>
Date: Mon, 24 Oct 2022 15:37:03 +0530
Message-ID: <CAEjw_fjKuD4v-G3ok4PgaupcrwOgP_MV7EjMC1Ba4+JwXruGng_at_mail.gmail.com>



Trying to send my reply to Sayan, Mark and Mohamed again, as it bounced back from the list earlier..

@Sayan, I believe it's not possible to hash-subpartition just the local PK index independently without subpartitioning the main table on the same keys. So what you are suggesting is to hash-subpartition the table itself by (unix_time_id, status), which means all the local indexes, including this primary key index, will automatically be hash-partitioned the same way. Please correct me if my understanding is wrong. We have ~6 other local indexes on this table, and I was wondering whether this will increase the number of segment scans for the queries using those local indexes (as each index will now be broken into 8 or 16 pieces) and thus add I/O overhead to those selects. Also, as I mentioned, we have another index (on the create_date column) which appears second in the concurrency/buffer busy wait list. Will this approach benefit that index too, or will it need a different partitioning strategy altogether?
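
Just to make sure I'm reading the suggestion right, here is a minimal sketch of what such DDL could look like (illustrative object names, boundary dates and subpartition count - our actual DDL differs):

    CREATE TABLE txn_tab (
      unix_time_id    VARCHAR2(40) NOT NULL,
      status          NUMBER(10)   NOT NULL,
      part_date_time  DATE         NOT NULL,
      create_date     DATE
    )
    PARTITION BY RANGE (part_date_time)
    SUBPARTITION BY HASH (unix_time_id, status) SUBPARTITIONS 16
    ( PARTITION p_w1 VALUES LESS THAN (DATE '2022-10-31'),
      PARTITION p_w2 VALUES LESS THAN (DATE '2022-11-07') );

    -- A LOCAL index is equipartitioned with the table, so each weekly
    -- partition of the PK index gets 16 hash subpartitions (and the same
    -- would apply to the ~6 other local indexes):
    CREATE UNIQUE INDEX txn_pk_ix
      ON txn_tab (unix_time_id, status, part_date_time) LOCAL;
    ALTER TABLE txn_tab ADD CONSTRAINT txn_pk
      PRIMARY KEY (unix_time_id, status, part_date_time) USING INDEX txn_pk_ix;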

@Mark, the status column has a NUMBER(10) data type with ~2000 distinct values. Yes, the column "part_date_time" is the partition key, with a DATE data type storing both the date and time components. As mentioned, it's a weekly range-partitioned table, so all the indexes are local only. Regarding a reverse-key index, I fear physical reads will increase after making the PK index reverse-key, and I also see that a range scan is not possible on a reverse-key index. We have scenarios in which we use the first two columns of the index (unix_time_id, status) as equi-join/filter predicates with a range predicate on the third column (part_date_time), and in those cases the range scan is currently helping us.
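
To be concrete, the pattern I'm referring to looks roughly like this (hypothetical bind names, against the sketch table above) - a regular B-tree index can serve it with a single index range scan, while a reverse-key index cannot, because the reversed column bytes destroy the ordering the range predicate needs:

    SELECT *
    FROM   txn_tab
    WHERE  unix_time_id   = :b1       -- equality on leading PK column
    AND    status         = :b2       -- equality on second PK column
    AND    part_date_time >= :from_dt
    AND    part_date_time <  :to_dt;  -- range on third PK column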

It's a row-by-row insert query, and yes, there are multiple concurrent inserts executed simultaneously from different sessions. Each session does a maximum of two inserts, so I'm not sure making it bulk would help at all.
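
For reference, by "bulk" I mean a FORALL-style array insert along these lines (hypothetical values, reusing the sketch table above); with a maximum of two rows per session the saving over two single-row inserts would be marginal:

    DECLARE
      TYPE t_ids IS TABLE OF VARCHAR2(40);
      l_ids t_ids := t_ids('1666592220.001', '1666592220.002');  -- two rows max
    BEGIN
      -- one bulk-bound statement instead of two single-row INSERTs
      FORALL i IN 1 .. l_ids.COUNT
        INSERT INTO txn_tab (unix_time_id, status, part_date_time, create_date)
        VALUES (l_ids(i), 100, SYSDATE, SYSDATE);
      COMMIT;
    END;
    /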

@Mohamed,
Actually, we are using ZDLRA for backup, so all backups are incremental (L1) only. Daily we have one L1, but it finishes in 7-8 minutes, whereas this concurrency wait also happens at other times of the day. We are not seeing Ora-0431 recently, though. One thing we do notice is that the number of inserts into this table is the same or even higher at other times of the day, yet we don't see this concurrency throughout the day. And the period of the concurrency wait matches exactly one of the heavy-activity periods of another app. So could it be that the concurrency wait on these two indexes appears just because the buffer cache is too small for the heavy activity from the other jobs in that specific period, and would increasing the size of the buffer cache help here?
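
One sanity check I'm thinking of running (a sketch against ASH; both event names are the standard ones, and the one-hour window is just an example) is to see whether the BCT waits and the buffer busy waits line up minute by minute during that period:

    SELECT TRUNC(sample_time, 'MI') AS minute,
           SUM(CASE WHEN event = 'block change tracking buffer space'
                    THEN 1 END) AS bct_samples,
           SUM(CASE WHEN event = 'buffer busy waits'
                    THEN 1 END) AS bbw_samples
    FROM   v$active_session_history
    WHERE  sample_time > SYSDATE - 1/24
    GROUP  BY TRUNC(sample_time, 'MI')
    ORDER  BY 1;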

On Mon, Oct 24, 2022 at 3:29 PM Pap <oracle.developer35_at_gmail.com> wrote:

> Thank you so much, Stefan.
>
> I am not sure why all the responses I sent earlier bounced back from the
> list. I had replied earlier suspecting that maybe the small size of the
> buffer cache is causing this, because the rate of inserts is almost the
> same throughout the day, yet we see the concurrency during the time when
> there is heavy activity on the database from another app. But now, as you
> rightly pointed out and as the ASH stats also show, the BCT buffer space
> wait seems to be the starting point, and we may have to increase the size
> of the BCT buffer. Currently the database size is ~27TB and the BCT file
> size in v$block_change_tracking shows as ~2GB.
>
> But again, how can this be related? I understand that higher activity
> (say, heavy DML) from another application, even if unrelated to this
> object/index, can slow down the response of this INSERT query (along with
> other DML); but then the top wait event logged against that insert should
> be "BCT buffer space". However, we are seeing the top wait event against
> this INSERT query as concurrency/buffer busy waits, with current_obj#
> pointing to the index partition. My understanding was that
> concurrency/buffer busy waits on an index block can only happen in a race
> scenario, i.e. when the inserts are heavy/concurrent and all going after
> one specific block. Please correct me if my understanding is wrong.
>
> And yes, we initially suspected that the size of the buffer cache matters
> here and has to be bigger to cater for the load. We then checked the SGA
> advisory to see if it points to the same conclusion. This database has a
> db_cache_size of 6GB with sga_max_size ~35GB and sga_target set to 0. I am
> not sure why, but I don't see any rows populated in dba_hist_sga_advice.
> The rows in dba_hist_db_cache_advice show that doubling the db cache, i.e.
> a size factor of 2, would only reduce physical reads by ~15 to ~17%. Not
> sure how reliable these figures are.
>
> On Mon, Oct 24, 2022 at 1:44 PM Stefan Koehler <contact_at_soocs.de> wrote:
>
>> Hello Pap,
>> based on the wait chain output, the root cause of your "buffer busy
>> waits" problem is not the index(es) or the application behavior but
>> "block change tracking buffer space".
>>
>> This is related to the block change tracking feature (e.g. for RMAN
>> incremental backups) - it seems like your BCT buffer is way too small to
>> keep up with the change rate (and/or I/O to your block change tracking
>> file is very slow) at this time. I have troubleshot several issues with
>> block change tracking - even in the most recent versions - so this is not
>> a very uncommon case.
>>
>> You can try to increase the BCT buffer (e.g. MOS ID #2094946.1), but
>> please double/triple check that the increase works, as there are several
>> known bugs (e.g. #32428097 and many more - even if it says fixed in
>> 19.x).
>>
>> Best Regards
>> Stefan Koehler
>>
>> Independent Oracle performance consultant and researcher
>> Website: http://www.soocs.de
>> Twitter: @OracleSK
>>
>> > Pap <oracle.developer35_at_gmail.com> wrote on 23.10.2022 14:32 CEST:
>> >
>> > Hello Listers,
>> > We have a customer database on version 19.15. We are experiencing high
>> concurrency waits (buffer busy waits) for one of the INSERT queries, and
>> the object ASH points to is the primary key composite index on three
>> columns (unix_time_id, status, part_date_time), followed by another index
>> on a single column, the create_date column. Both of these indexes are
>> local indexes, and the table is weekly range-partitioned on the date
>> column (part_date_time, which is populated with the SYSDATE value by the
>> application).
>> >
>> > Below is the output from Tanel's DASH_wait_chain query from the issue
>> period. This spike in concurrency lasts 2-3 minutes (even less on many
>> occasions) but impacts one of our critical latency-sensitive jobs. Our
>> understanding was that, since the first column of the primary key index
>> is generated by application code as a unix timestamp (defined as
>> VARCHAR2(40)) and is mostly unique, the contention should be minimal. For
>> a specific time period, the values of the first column, unix_time_id,
>> look like below, i.e. different, but with the same first 7 to 8
>> characters. So can this be the cause of the concurrency here, and if yes,
>> how can we avoid it?
>>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Mon Oct 24 2022 - 12:07:03 CEST
