Re: Insert Query issue

From: Pap <oracle.developer35_at_gmail.com>
Date: Fri, 20 Aug 2021 00:50:26 +0530
Message-ID: <CAEjw_fgPdQ34qjHRPGUUpTdS_SA_eGHxnvomDgKwSV-z+n5CmQ_at_mail.gmail.com>



No Mark, this is a convention load happening in a batch from informatica and its happening in one node only and no parallelism used here.

On Wed, Aug 4, 2021 at 9:17 PM Mark W. Farnham <mwf_at_rsiz.com> wrote:

> Are you always running this on the same node with parallel local? Is there
> anything else running that might populate this on a different node?
>
>
>
> Depending on your answers the aggregate load for all uses on the
> interconnect can introduce differences in the timings of your runs.
>
>
>
> Keep following JL’s next steps, that is just a random aside.
>
>
>
> mwf
>
>
>
> *From:* oracle-l-bounce_at_freelists.org [mailto:
> oracle-l-bounce_at_freelists.org] *On Behalf Of *Jonathan Lewis
> *Sent:* Wednesday, August 04, 2021 5:16 AM
> *To:* Pap
> *Cc:* Oracle L
> *Subject:* Re: Insert Query issue
>
>
>
> 10 out of 10 for capturing the stats, but minus several million for
> presentation.
>
>
>
> It's up to you to line up the start and end figures in the spreadsheet and
> find the differences; repeat for the fast run and the slow run, then create
> a result set which give the start diff, the end diff, and the difference
> between the two - deleting rows where the start dif and end diff are both
> zero. This gives us the smallest visual field with the maximum information
> content.
>
>
>
> For the system and session activity stats keep the same order as the
> output order - don't sort by value or alphetic order of name.
>
>
>
> You missed one set of stats, the SYSTEM activity stats (v$sysstat) so I
> wouldn't have been able to give you any clues about what the slow and fast
> runs were doing that introduced the extra workload.
>
>
>
> >> I am able to get hold of two different runs for 10minutes duration and
> I captured the stats from the v$views.
>
> >> The slow run as stated by the application team was not extremely slow
> but it was slower as compared to fast run.
>
> >> I was not able to figure out much from the session stats though. But
> comparison if the 'cell single block physical read'
>
> >> wait time from the v$event histogram for both the fast and slow run
> looks as below. Does it mean that we are simply
>
> >> doing more work during the slow run period which may be because of
> something similar to delayed block cleanout
>
> >> but not exactly that(as we don't see that stats in v$sessstat) as
> highlighted by lothar earlier. So it's doing some tidying up
>
> >> of the block post delete and thus doing more work during INSERT? I do
> see few 'backup' related stats in v$
>
> >> system_event , but that is visible in both the slow and fast run
> scenario.
>
>
>
> You posted the difference in the two runs for the event histogram of cell
> single block physical reads; and during the slow run the SYSTEM was doing
> nearly twice as many physical reads - and even though most of them were
> clearly flash cache reads enough of them were showing "real" and slower
> disc read times than during the fast run. Something is happening in the
> rest of the system that is introducing an I/O load that is slowing this
> session down. We don't know what that might be, but that's where the SYSTEM
> stats would have been helpful. Forget any ideas you may have had about
> this session doing any delayed block cleanout or delayed longging block
> cleanout.
>
>
>
> I've not looked at any of the spreadsheet data - only at the
> event_histogram summary you produced in the other post.
>
>
>
> Regards
>
> Jonathan Lewis
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Tue, 3 Aug 2021 at 20:55, Pap <oracle.developer35_at_gmail.com> wrote:
>
> I am able to get hold of two different runs for 10minutes duration and I
> captured the stats from the v$views. The slow run as stated by
> the application team was not extremely slow but it was slower as compared
> to fast run. I was not able to figure out much from the session stats
> though. But comparison if the 'cell single block physical read' wait time
> from the v$event histogram for both the fast and slow run looks as below.
> Does it mean that we are simply doing more work during the slow run period
> which may be because of something similar to delayed block cleanout but not
> exactly that(as we don't see that stats in v$sessstat) as highlighted by
> lothar earlier. So it's doing some tidying up of the block post delete and
> thus doing more work during INSERT? I do see few 'backup' related stats in
> v$system_event , but that is visible in both the slow and fast run
> scenario.
>
>
>
> Don't have the rights to create a function so I have captured the stats
> from v$ views manually when the insert started and then after ~10minutes
> past and added to the attached excel sheet for both fast and slow run. Did
> the same for both fast and slow runs. Below results from v$event_histogram
> is a difference/delta of those numbers. Anything else I should look at from
> these captured v$ view stats to get some more clue?
>
>
>
> EVENT
>
> WAIT_TIME_MILLI
>
> Slow run(Delta Wait_count)
>
> Fast run (Delta Wait count)
>
> cell single block physical read
>
> 1
>
> 2655942
>
> 1414931
>
> cell single block physical read
>
> 2
>
> 56441
>
> 5459
>
> cell single block physical read
>
> 4
>
> 37837
>
> 1374
>
> cell single block physical read
>
> 8
>
> 28521
>
> 393
>
> cell single block physical read
>
> 16
>
> 11066
>
> 291
>
> cell single block physical read
>
> 32
>
> 1579
>
> 90
>
> cell single block physical read
>
> 64
>
> 88
>
> 5
>
> cell single block physical read
>
> 128
>
> 2
>
> 0
>
> cell single block physical read
>
> 256
>
> 1
>
> 0
>
> cell single block physical read
>
> 512
>
> 0
>
> 0
>
> cell single block physical read
>
> 1024
>
> 0
>
> 0
>
> cell single block physical read
>
> 2048
>
> 0
>
> 0
>
> cell single block physical read
>
> 4096
>
> 0
>
> 0
>
> cell single block physical read
>
> 8192
>
> 0
>
> 0
>
>
>
>
>
>
>
> On Sat, Jul 31, 2021 at 10:18 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
> It's good to see the session activity stats for a timed period.
>
>
>
> In this case your figures are showing us that the two runs are basically
> doing the same work in the same way, but the slower session is suffering
> from slow (real) disk I/O, and getting a slighlty smaller fraction of its
> "disk" reads from cell flash cache. (Note that physical reads = physical
> read I/O requests, so it's all single block, which is consistent with
> seeking and updating the index; note also that the "cell flash cache read
> hits" is only a small fraction of the physical read I/O requests. so the
> drop from 2.07% to 1.69% is not particularly significant.
>
>
>
>
>
> There is one interesting pair of numbers - though it's not telling us
> anything about performance. You've inserted (in the good run) roughly 38M
> rows in this interval; but the "Heap Segment Array Inserts" is 1.2M, which
> means (internally) Oracle is inserting about 30 rows per array (compared to
> your (external) array size of 352). But your figures for the size of the
> table (215GB, 1.7B rows) suggest a block should hold about 65 rows, and
> your delete "where cr_date < sysdate - 3" with no supporting index suggests
> a tablescan path which we might expect to empty blocks completely. This
> combination suggests there's a pattern created by past activity that could
> be producing interesting side-effects. (NOTE, though that whatever is
> happening it's happening in both the good and bad here, so this oddity
> isn't an explanation for the difference).
>
>
>
> Although the session activity stats haven't provided an answer, they have
> eliminated some possibilities, so it is a step forward. The thing to do now
> is to extend the information you are collecting. Most of the time is I/O
> wait time, but there is some other time to account, and the exact breakdown
> of I/O time would be interesting; so next steps would be:
>
>
>
> 1) repeat this capture, but gather the session's event wait time
> (v$session_event) at the same time, and the SYSTEM stats and event times so
> that we can see if there's anything happening outside this session that is
> causing this session's I/O time. If possible, add to that capture the
> capture of the event histogram - v$event_histogram which will give you some
> idea of how even (or not) the slowness is by giving you a numeric summary
> histogram of the event times. There's an example at this URL:
> https://jonathanlewis.wordpress.com/2008/06/15/event-snapshots/ showing
> how I've set up some tests in the past, but it's easier in the short term
> to just take two snapshots and find the difference.
>
>
>
> 2) If all else fails to supply any clues enable sql_trace remotely for the
> session for a limited period with wait state tracing enabled to see if you
> can spot any pattern in the disk reads and times taken.
>
>
>
> Regards
>
> Jonathan Lewis
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Fri, 30 Jul 2021 at 12:13, Pap <oracle.developer35_at_gmail.com> wrote:
>
> Thank you Mark.
>
>
>
> I was wondering if it's a good idea to partition this table just to help
> it purge in a cleaner fashion. As because, if we just partition it to make
> our purging better we also have to see the impact on the read queries. To
> have the table partitioned with the delete key(i.e. create_timestamp), the
> primary key also has to include the delete key in it or else it will not
> allow it to make it a local index. And in case of global index we will have
> to go through index invalidation/rebuild stuff both in case of drop or
> partition exchange etc. And also the read queries now have to go through
> multiple local index segments scan vs the current one segment scan.
>
>
>
> I was trying to see/prove if it's really delayed block cleanout impacting
> INSERT, So i tried to collect the stats from v$sesstat for two different
> executions/data loads for close to ~10minutes duration though it's not
> exact(i think i monitored the bad execution for a bit longer duration). And
> as v$sesstat accumulates the statistics so I collected the delta value for
> each run. As per application, the first execution was running slow and the
> next one was running with better response. Attached are the stats for the
> same. But surprisingly I see the delta value of statistics 'transaction
> table consistent read - Undo records applied' is zero in both the cases. So
> I'm wondering if there are some other phenomena(other than delayed
> block cleanout) which are happening here and how can we fix? In both the
> cases I see from ASH the wait event (cell single block physical read) was
> highest logged against the table followed by the PK index.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Fri, Jul 30, 2021 at 3:10 AM Mark W. Farnham <mwf_at_rsiz.com> wrote:
>
> That’s right.
>
>
>
> But what you probably **should** do is partition the table by your delete
> key. And if you can change the insert into append, then you just partition
> exchange out and drop the unwanted rows with nary a delete seen. That also
> gives you a cheap ability to archive the swapped out partition to a table
> to back up as a stand alone period of time.
>
>
>
> Good luck. IF Lothar’s suggestion of delayed block cleanout is correct,
> either of these things should fix it. IF the deleted space being used
> dramatically increasing the number of blocks you need to read via an
> indexed access path by fubaring the actual cluster factor, the partitioning
> fixes that.
>
>
>
> *From:* Pap [mailto:oracle.developer35_at_gmail.com]
> *Sent:* Thursday, July 29, 2021 3:49 PM
> *To:* Mark W. Farnham
> *Cc:* Lothar Flatz; Oracle L
> *Subject:* Re: Insert Query issue
>
>
>
> Thanks Mark. If I get it correct , you are suggesting a method to scan
> the blocks in the buffer cache which are getting deleted. But considering
> we are going to hold ~3days worth of data at any point in time, I was
> thinking if it's easy one to just scan or gather stats with auto sample
> size after delete to get the blocks cleaned. but before that is it the same
> effect i.e. delayed block cleanout which we are suffering from here? I
> thought Lothar might be pointing to some other similar effect but not
> exactly delayed block cleanout. Please correct me if I'm wrong.
>
>
>
> On Fri, Jul 30, 2021 at 12:41 AM Mark W. Farnham <mwf_at_rsiz.com> wrote:
>
> If you stash (probably insert append, then after the subsequent read
> truncate) into a table the rowid of all the rows you are about to delete
> (using the same predicate), then do the delete, then commit, that will give
> you the list of the candidate blocks. If you read from the table you
> deleted from where rowid in (select rowid from the new table) from a
> different node on your Exadata, you’re probably going to get that in sga.
> IF not, you can pl/sql loop them one block at a time.
>
>
>
> *From:* oracle-l-bounce_at_freelists.org [mailto:
> oracle-l-bounce_at_freelists.org] *On Behalf Of *Pap
> *Sent:* Thursday, July 29, 2021 2:34 PM
> *To:* Lothar Flatz
> *Cc:* Oracle L
> *Subject:* Re: Insert Query issue
>
>
>
> Thank you Lothar.
>
>
>
> It appears to be something matching to delayed block cleanout effect in
> which post DML sometimes we have to perform a segment scan so that the
> commit scn can be updated for all the blocks and for that sometimes we have
> to do it using manual stats gathering or forced segment full scan to pass
> all the data blocks through buffer cache. But for that we used to see one
> stats i.e. 'transaction table consistent read undo records applied' on the
> SELECT query execution.
>
>
>
> So is this tidying up of block post delete something similar as above and
> we can get that verified and confirmed from some session statistics from
> v$sesstat while the insert is running at run time? Also I am thinking what
> can be done so as to tidying up the blocks after delete, if a force segment
> full scan or stats gather post deletion can anyway help, so as to not
> impact the insert performance?
>
>
>
>
>
>
>
>
>
> *Betreff: *
>
> Re: Insert Query issue
>
> *Datum: *
>
> Thu, 29 Jul 2021 19:59:25 +0200
>
> *Von: *
>
> Lothar Flatz <l.flatz_at_bluewin.ch> <l.flatz_at_bluewin.ch>
>
> *An: *
>
> oracle-l_at_freelists.org
>
>
>
> Hi,
>
> there reason should be that after a delete rows are marked as technical
> deleted, but the block is not yet tidied up. (I suppose that would mean
> e.g. that other rows are not yet shifted for contiguous space etc.)
> The tidying is done on the next insert, which therefore will be
> substantially slower for once.
> I heard this explanation long ago , but suppose it still holds.
> My whole understanding of this is a bit vague.
> I guess Jonathan could fill in some details here.
>
> Regards
>
> Lothar
>
> Am 29.07.2021 um 18:10 schrieb Pap:
>
> This legacy database is on version 11.2.0.4 of oracle Exadata.
>
> We are having below conventional insert query introduced and its a bulk
> insert happening with an arraysize of ~352 from informatica. And the
> response time was ~3millisec/execution i.e. for one chunk or ~352 rows. But
> we noticed suddenly the response time becomes
> tripled(~8-9millisec/execution) for the same set of rows/array size insert.
> No changes done to the underlying table structure.
>
> During digging into the AWR views we found that the increase in response
> time for this insert query started after we introduced the delete script to
> purge data from this table. The delete is also happening in bulk fashion
> with an arraysize of 100K through plsql procedure and its deleting data
> with CRE_TS<sysdate-3. And it's clear from dba_hist_sqlstat that it's the
> cpu_time_delta and IOWAIT_DELTA which increased thus making the
> ELAPSED_TIME_delta go up by three times for the INSERT query. No such
> increase in BUFFER_GETS_delta observed. But one thing we notice is that the
> DELETE is not running at exactly the same time as the insert/data load
> script runs. So howcome this delete is impacting the INSERT query? Is this
> that as delete is making some blocks empty so that now the INSERT has to
> now see through all to find the empty blocks before insert and thus
> spending more time?
>
> The table is ~215GB in size and is not partitioned holding ~1.7billion
> rows. There exists only a primary key composite index on two number data
> type columns. With Blevel-3 , and having index size ~65GB with ~7.3million
> leaf blocks.
>
>
> Below is the AWR plan for the INSERT and DELETE queries.
>
> INSERT INTO TAB1(TAID,TID,AID,APP_NM,CRE_TS) VALUES ( :1, :2, :3,:4, :5)
>
>
> -------------------------------------------------
> | Id | Operation | Name | Cost |
> -------------------------------------------------
> | 0 | INSERT STATEMENT | | 1 |
> | 1 | LOAD TABLE CONVENTIONAL | | |
> -------------------------------------------------
>
> Query Block Name / Object Alias (identified by operation id):
> -------------------------------------------------------------
>
> 1 - INS$1
>
> Note
> -----
> - cpu costing is off (consider enabling it)
>
>
> DELETE FROM USER1.TAB1 WHERE ROWID = :1
>
> Plan hash value: 2488984540
>
>
> ----------------------------------------------------------------------------------------------------
> | Id | Operation | Name | Rows | Bytes
> | Cost (%CPU)| Time |
>
> ----------------------------------------------------------------------------------------------------
> | 0 | DELETE STATEMENT | | | |
> 1 (100)| |
> | 1 | DELETE | TAB1 | | |
> | |
> | 2 | TABLE ACCESS BY USER ROWID| TAB1 | 1 | 25 |
> 1 (0)| 00:00:01 |
>
> ----------------------------------------------------------------------------------------------------
>
> Query Block Name / Object Alias (identified by operation id):
> -------------------------------------------------------------
>
> 1 - DEL$1
> 2 - DEL$1 / TAB1_at_DEL$1
>
> Outline Data
> -------------
>
> /*+
> BEGIN_OUTLINE_DATA
> IGNORE_OPTIM_EMBEDDED_HINTS
> OPTIMIZER_FEATURES_ENABLE('11.2.0.4')
> DB_VERSION('11.2.0.4')
> ALL_ROWS
> OUTLINE_LEAF(_at_"DEL$1")
> ROWID(_at_"DEL$1" "TAB1"_at_"DEL$1")
> END_OUTLINE_DATA
> */
>
>
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Aug 19 2021 - 21:20:26 CEST

Original text of this message