Re: Big Update/DML
Date: Fri, 28 Aug 2020 07:26:59 +0100
Message-ID: <CAGtsp8kCq8047ToBuU33PEUo0OpbEj8f0g8-QHmEk04crtq1bw_at_mail.gmail.com>
Sanjay,
You've mentioned sharing an SQL Monitor report, but I seem to have missed
it.
Thoughts on compression:
If you want to "create compressed table as select *+1 columns from compressed table" then you have to pull blocks from storage, decompress them, generate new rows, compress them, and write back to storage. That's going to take a lot of CPU. Add to that the fact that your initial compressed table may have a lot of migrated rows which lead to lots of cell single block reads.
Two checks:
a) Is most of your time CPU
b) On the full tablescan do you record lots of cell single block reads
The former is about compression, the latter is about migration,
You could also test compression effects separately.
Create a copy of the 1st 10M rows of a table, then copy it with the change
you want.
Do this 4 times covering:
compressed -> compressed
compressed -> not compressed
not compressed -> not compressed
not compressed -> compressed
Check for any significant time differences - check also the time difference between making a copy from the original and making a copy from the copy.
Regards
Jonathan Lewis
On Fri, Aug 28, 2020 at 3:22 AM Sanjay Mishra <dmarc-noreply_at_freelists.org> wrote:
> Tim
>
> Thanks for the update. Based on the Sqlmonitor data shared earlier , it
> was using Parallel for DDL operation but I will give another try again as
> working to load another table in Nocompress as existing one is Compress for
> OLTP. Tablespace was given 1.5T space as it was initially extending it
> heavily and that helped to increase the throughput. Contention was only
> seen for Interconnect or some gc wait and CPU Contention or any high IO
> wait was also not seen in the AWR report.
>
> Do we think Advance compression can be major bottleneck as I earlier
> thought that it might help in utilizing buffer space but doesn't look like
> the case.
>
> Will share details soon
>
> Tx
> Sanjay
>
> On Thursday, August 27, 2020, 05:08:41 PM EDT, Tim Gorman <
> tim.evdbt_at_gmail.com> wrote:
>
>
> Agreed! Pardon my unwarranted guess, but perhaps parallel DDL (which is
> enabled by default) was not enabled in the session for some reason before
> starting CTAS? To verify, consider running the following query after ALTER
> SESSION ENABLE PARALLEL DDL and before CTAS?
>
> select pddl_status from v$session where sid in (select distinct sid from
> v$mystat);
>
>
> ...or better yet, check PDDL_STATUS from another session?
>
> Another possible issue is high-water mark management. If the datafiles in
> the tablespace in which you're creating the new table in the CTAS statement
> has to be autoextended to accommodate the volume of data you're loading,
> then things might take longer. If the INCREMENT_BY (i.e. NEXT) for the
> datafile(s) is really small, then datafile autoextension might take a
> *REALLY* long time; look for waits on "enq: HV - contention" or "enq: HW -
> contention"? At any rate, the solution is to either make INCREMENT_BY
> really large, or just resize the datafile to accommodate what you're
> building.
>
> There are other possible issues to be considered. The key is to monitor
> the operation and use the information that the Oracle RDBMS provides to
> understand why you're bottlenecked and resolve it.
>
> If you need suggestions for monitoring specific actions, please ask? It
> is frustrating when someone announces "it doesn't work" with no attempt to
> understand why.
>
>
>
> On 8/27/2020 3:41 AM, Lothar Flatz wrote:
>
> Hi,
>
> with regards to CTAS it is very hard to believe it takes that long. I am
> pretty sure that there is something wrong.
> A sql monitor would be extremly helpfull.
>
> Regards
>
> Lothar
>
> Am 27.08.2020 um 02:43 schrieb Sanjay Mishra (Redacted sender smishra_97
> for DMARC):
>
> Sayan
>
> Update statement is
>
> Update snows.stamp_detail set set stamp_process_calc=processed_calc_amt;
>
> Tried to use
> 1. Parallel DML with 100 --> Taking 20+hrs
> 2. CTAS was tried using half a billion as well as 1 billion rows with
> parallel 50, 75,100 - Almost same result
> 3. CTAS with nologging using same step 2 but still not much improvement
>
> We have 5-10 such big table and so running each with this much time-frame
> need high downtime
> Tx
> Sanjay
>
> On Wednesday, August 26, 2020, 08:23:18 PM EDT, Sayan Malakshinov
> <xt.and.r_at_gmail.com> <xt.and.r_at_gmail.com> wrote:
>
>
> Hi Sanjay,
>
> It would be better if you provide more details about your update. Exact
> update statement would be helpful. is this column nullable/not null?
>
>
> On Thu, Aug 27, 2020 at 1:12 AM Sanjay Mishra <dmarc-noreply_at_freelists.org>
> wrote:
>
> Andy
>
> Yes look like is an option if we are doing work online and despite take
> more time but need not require downtime. In our case multiple DDL are
> running to existing environment due to Application upgrade and so all work
> has to be done with downtime. So challenge is reduce time of DML operations
> on big tables containing few billions rows.
>
> Tx
> Sanjay
>
> On Wednesday, August 26, 2020, 11:20:55 AM EDT, Andy Sayer <
> andysayer_at_gmail.com> wrote:
>
>
> It does sound like a virtual column could be the ideal solution. But if
> data needs to be physically stored or cannot be calculated
> deterministically at any point in time then Connor has a great demo of
> using dbms_redefinition to create a new table online with a function to map
> the new column. There’s obviously some overhead with context switching but
> it may be far better than some of the obstacles you might be facing at the
> moment:
>
> https://connor-mcdonald.com/2016/11/16/performing-a-large-correlated-update/
> (and you might be able to help it with pragma udf in the right
> circumstances).
>
> Obviously, how helpful this is depends where the work is currently going
> and how online this needs to be.
>
>
> Thanks,
> Andrew
>
> On Wed, 26 Aug 2020 at 16:00, Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>
> Is that 3-4 billion rows each, or total ?
>
> I would be a little suspicious of an update which populates a new column
> with a value derived from existing columns. What options might you have for
> declaring a virtual column instead - which you could index if needed.
>
> Be extremely cautious about calculating space requirements - if you're
> updating every row on old data might you find that you're causing a
> significant fraction of the rows in each block to migrate, and there's a
> peculiarity of bulk row migration that can effectively "waste" 25% of the
> space in every block that becomes the target of a migrated row.
>
> This effects can be MUCH work when the table is compress (even for OLTP)
> since the update has to decompress the row before updating and then only
> "re-compresses" intermittently as the block becomes full. The CPU cost can
> be horrendous and you still have the problem of migration if the addition
> means the original rows can no longer fit in the block.
>
> If it is necessary to add the column you may want to review "alter table
> move online" can do in the latest versions (in case you can make it add the
> column as you move) or review the options for dbms_redefinition - maybe
> running several redefinitions concurrently rather than trying to do any
> parallel update to any single table.
>
> Regards
> Jonathan Lewis
>
>
>
>
>
>
>
>
>
> --
> Best regards,
> Sayan Malakshinov
> Oracle performance tuning engineer
> Oracle ACE Associate
> http://orasql.org
>
>
>
-- http://www.freelists.org/webpage/oracle-lReceived on Fri Aug 28 2020 - 08:26:59 CEST