Re: Better Delete method
Date: Mon, 12 Jul 2021 23:30:12 +0530
Message-ID: <CAKna9VbP1pneheCqdMvO=c395htC5CeoR+5dV6cOR8PYqoNgVA_at_mail.gmail.com>
Thank you so much for the detailed explanation and the oddities associated with this.
On Mon, Jul 12, 2021 at 4:45 AM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:
>
> *"If I get it correct, it looks like the CTAS way will be the fastest
> way in our case"*
>
> Almost certainly true.
>
> The CTAS is the most efficient for undo/redo as a way of getting rid of
> the 50% of rows you don't want; using the exchange partition trick avoids
> the need to recreate grants and synonyms.
>
> There are a couple of traps with the method, though, that the initial
> script won't address (and, personally, I wouldn't want to use a script like
> that anyway). The first relates to the primary key - if you do CTAS (create
> table t2 as select * from t1) and have declared a primary key on t1 then
> the PK column on t1 will be reported by DESCRIBE as NOT NULL; depending on
> how you created the primary key the same may not be true for t2. If it is
> not true then adding a primary key to t2 will require 2 tablescans of t2 -
> one to check and impose a NOT NULL constraint, then the second to create
> the required supporting index. See
> https://jonathanlewis.wordpress.com/2016/01/27/add-primary-key/
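>
> For illustration, a minimal sketch of the trap (hypothetical table and
> column names):
>
>     create table t1 (id number constraint t1_pk primary key, padding varchar2(100));
>     create table t2 as select * from t1;
>     -- t1.id's NOT NULL comes only from the PK constraint, so t2.id is
>     -- nullable, and the next statement needs two tablescans: one to
>     -- validate NOT NULL, one to build the supporting index
>     alter table t2 add constraint t2_pk primary key (id);
>
> One way round it is to declare the NOT NULL in the CTAS column list, so
> the check happens while the rows are being copied:
>
>     create table t2 (id not null, padding) as select * from t1;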
>
> The second trap is that if you have any function-based indexes on the
> table (which you don't) or any virtual columns, or if you have marked
> unused (but not dropped) any columns then the extract from user_tab_cols
> will include those columns in the CTAS statement and they will have to be
> edited out. Worse, though, as a follow-up on unused columns - if you have
> any unused columns in the table you need to create the copy with the same
> columns in the right positions and mark them in the copy as unused
> otherwise the exchange will raise error: ORA-14097: column type or size
> mismatch in ALTER TABLE EXCHANGE PARTITION. (Oracle 12c allows you to
> "create table for exchange" which means it will cater for all the odd,
> invisible, things you can't see in the in the normal use of the table but
> which exist in the data dictionary.
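>
> A sketch of that syntax (12.2 onwards; hypothetical names - the table
> named in the clause is the partitioned one, and the statement creates an
> empty table that you then populate separately):
>
>     create table tab1_copy for exchange with table part_tab1;
>     insert /*+ append */ into tab1_copy select * from part_tab1 where ...;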
>
> In passing - if you have any unused columns in your table you probably
> don't want to drop them in order to avoid the mismatch problem as this will
> result in Oracle updating every row once for every column dropped: and that
> will generate a huge amount of undo and redo.
> https://www.red-gate.com/simple-talk/sql/oracle/dropping-columns/
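>
> A minimal sketch of the difference (col_x and col_y are hypothetical
> columns; the three statements are alternatives, not a sequence):
>
>     -- marking unused is a dictionary-only change, effectively instant:
>     alter table tab1 set unused (col_x);
>     -- physically dropping rewrites every row and generates huge undo/redo:
>     alter table tab1 drop column col_y;
>     -- if you must reclaim the space, checkpointing limits the undo held at once:
>     alter table tab1 drop unused columns checkpoint 10000;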
>
>
> Regards
> Jonathan Lewis
>
> On Sun, 11 Jul 2021 at 20:25, Lok P <loknath.73_at_gmail.com> wrote:
>
>> Thank you.
>>
>> If I get it correct, it looks like the CTAS way will be the fastest way
>> in our case and should be followed if we can afford a couple of hours of
>> downtime. And as you mentioned, the ROWID-based delete approach with bulk
>> collect will generate gobs of UNDO/REDO, and we may also need to reorg the
>> table/indexes at the end considering 50% of the data is getting deleted.
>>
>> And yes, both the primary key constraint/index for data integrity and the
>> composite index for SELECT queries are needed. But as you said, we will
>> try to create these two indexes in parallel from two sessions to make the
>> process faster. The doubt I had was: if I create a unique index with
>> parallel 32 from one session and later create the PK constraint using
>> the previously created unique index, will that happen in seconds? Can the
>> constraint be created in VALIDATE status, or will the PK constraint
>> validation happen separately and take time? (See the sketch below.)
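>>
>> What I have in mind is something like this (hypothetical names):
>>
>>     create unique index tab1_bkp_pk_ix on tab1_bkp (id) parallel 32;
>>     alter index tab1_bkp_pk_ix noparallel;  -- don't leave DOP 32 on the index
>>     alter table tab1_bkp add constraint tab1_bkp_pk
>>       primary key (id) using index tab1_bkp_pk_ix;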
>>
>>
>> On Sun, Jul 11, 2021 at 10:37 PM ahmed.fikri_at_t-online.de <
>> ahmed.fikri_at_t-online.de> wrote:
>>
>>> Given the size of the table you mentioned, I'm assuming that clearing
>>> the data via partition exchange will require a maintenance window of less
>>> than 60 minutes.
>>>
>>>
>>>
>>> Are the PK and the other index critical to the application (you
>>> mentioned no FKs are pointing to this table)? Why not create them afterwards
>>> in two sessions, each with DOP 32? The whole operation should take less
>>> than 60 minutes.
>>>
>>>
>>>
>>> If no downtime is tolerated, you can hide your table behind a view and
>>> use a trigger to implement some sort of redo mechanism analogous to the
>>> one Oracle itself uses, as sketched below.
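>>>
>>> A rough sketch of the idea, assuming a hypothetical two-column table
>>> (the application would address the view, and the log would be replayed
>>> against the copy before the final swap):
>>>
>>>     create or replace view tab1_v as select id, val from tab1;
>>>
>>>     create table tab1_dml_log (
>>>       op  varchar2(1),
>>>       id  number,
>>>       val varchar2(100),
>>>       ts  timestamp default systimestamp
>>>     );
>>>
>>>     create or replace trigger tab1_v_dml
>>>       instead of insert or update or delete on tab1_v
>>>       for each row
>>>     begin
>>>       if inserting then
>>>         insert into tab1 (id, val) values (:new.id, :new.val);
>>>         insert into tab1_dml_log (op, id, val) values ('I', :new.id, :new.val);
>>>       elsif updating then
>>>         update tab1 set val = :new.val where id = :old.id;
>>>         insert into tab1_dml_log (op, id, val) values ('U', :old.id, :new.val);
>>>       else
>>>         delete from tab1 where id = :old.id;
>>>         insert into tab1_dml_log (op, id) values ('D', :old.id);
>>>       end if;
>>>     end;
>>>     /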
>>>
>>>
>>>
>>> To remove more than 400 GB of data, UNDO/REDO should be avoided, as it
>>> only pollutes the entire database. It is just boilerplate. Why do we
>>> need to generate all of this data when in some situations we are 300% sure
>>> that we don't need it? (Your first method is not efficient; it also requires
>>> a single-block read for each row -- at this scale that is huge.)
>>>
>>>
>>>
>>> The problem is that Oracle is not ready to change concepts from the
>>> 1970s. So why not introduce (implement) a hint such as
>>> /*+ dml_no_consistency_read */? If this hint were used in a delete
>>> statement, nothing would be written to the UNDO, and if the user chose to
>>> roll back, the appropriate redo records would be identified and skipped.
>>>
>>>
>>>
>>> Best regards
>>>
>>> Ahmed
>>>
>>> -----Original Message-----
>>> Subject: Re: Better Delete method
>>> Date: 2021-07-11T15:07:14+0200
>>> From: "Lok P" <loknath.73_at_gmail.com>
>>> To: "ahmed.fikri_at_t-online.de" <ahmed.fikri_at_t-online.de>
>>>
>>> Thank you Ahmed. So this code does the data purge by creating a
>>> temp table of similar structure (the indexes and constraints need
>>> to be exactly the same) to the main table but partitioned, so as to take
>>> advantage of the partition exchange approach. But yes, it seems like the DB
>>> resource/time consumption of this method is the same as method-2, which I
>>> mentioned in my initial post. And it does need downtime, because if any
>>> DML operation happens on the base table between the CTAS and the final
>>> partition exchange, that data will be missed.
>>>
>>> And is it correct that either way (using CTAS with or without
>>> partition exchange), the primary key constraint can be created in
>>> VALIDATE state without much time and resource, if we first create the
>>> UNIQUE index and then create the PK constraint in VALIDATE state using
>>> that same unique index? Please correct me if I'm wrong.
>>>
>>> On Sun, Jul 11, 2021 at 1:44 AM ahmed.fikri_at_t-online.de <
>>> ahmed.fikri_at_t-online.de> wrote:
>>>
>>>> Hi,
>>>>
>>>>
>>>>
>>>> in the attached file is a method to delete data from a big table using
>>>> partition exchange (you have to enhance the method to handle the indexes).
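>>>>
>>>> The core of the pattern, as a rough sketch (hypothetical names and
>>>> filter; indexes still have to be dealt with, as noted):
>>>>
>>>>     -- 1. build a single-partition table holding only the rows to keep
>>>>     create table tab1_stage
>>>>       nologging
>>>>       partition by range (id) (partition p_keep values less than (maxvalue))
>>>>       parallel 32
>>>>       as select * from tab1
>>>>       where created_dt >= date '2020-01-01';   -- hypothetical keep criteria
>>>>
>>>>     -- 2. swap the segments: tab1 ends up with only the kept rows
>>>>     alter table tab1_stage
>>>>       exchange partition p_keep with table tab1
>>>>       without validation;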
>>>>
>>>>
>>>>
>>>> Best regards
>>>>
>>>> Ahmed
>>>>
>>>> -----Original Message-----
>>>> Subject: Better Delete method
>>>> Date: 2021-07-10T21:47:55+0200
>>>> From: "Lok P" <loknath.73_at_gmail.com>
>>>> To: "Oracle L" <oracle-l_at_freelists.org>
>>>>
>>>> Hello, this database is Oracle version 11.2.0.4 on Exadata. A
>>>> table (say TAB1) holds ~900 million rows with size ~222GB and
>>>> is not partitioned. It has two indexes: a three-column
>>>> composite index with size ~98GB, and the primary key on one column
>>>> with size ~23GB. As part of the requirement we need to delete/purge 50%
>>>> of the data from this table. No referential constraints exist here. So I
>>>> wanted to understand: out of the two methods below, which is the best to
>>>> opt for? Or is there any other, better option?
>>>>
>>>> I understand method-1 is achievable ONLINE but is slower, while
>>>> method-2 will be faster. So if we can afford ~1-2hrs of downtime, is it
>>>> good to go for method-2 as the delete approach? After all, deleting 50% of
>>>> the rows even with method-1 may also need a table move + index rebuild (which
>>>> will again need downtime on 11.2) to lower the high water mark and make the
>>>> indexes compact and back to normal. Please advise.
>>>>
>>>>
>>>> Method-1:-
>>>>
>>>> Step-1:-
>>>> In a cursor, pick the ~450 million rowids of the rows to be deleted
>>>> based on the filter criteria.
>>>> Step-2:-
>>>> Delete based on ROWIDs in bulk collect fashion with a LIMIT of 50K
>>>> rowids at one shot, and commit within the loop. (A sketch follows below.)
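>>>>
>>>> A sketch of method-1, assuming a hypothetical purge predicate:
>>>>
>>>>     declare
>>>>       cursor c is
>>>>         select rowid from tab1
>>>>         where  created_dt < date '2020-01-01';   -- hypothetical purge criteria
>>>>       type rid_tab_t is table of rowid index by pls_integer;
>>>>       rids rid_tab_t;
>>>>     begin
>>>>       open c;
>>>>       loop
>>>>         fetch c bulk collect into rids limit 50000;
>>>>         exit when rids.count = 0;
>>>>         forall i in 1 .. rids.count
>>>>           delete from tab1 where rowid = rids(i);
>>>>         commit;   -- commit per batch, within the loop
>>>>       end loop;
>>>>       close c;
>>>>     end;
>>>>     /
>>>>
>>>> (Note that fetching across commits from a cursor on the same table
>>>> risks ORA-01555 if undo is undersized.)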
>>>>
>>>> Method-2:-
>>>>
>>>> Step-1:-
>>>> Create a new table TAB1_BKP using CTAS AS SELECT *
>>>> FROM TAB1 WHERE (required filter criteria, which will pick the ~450
>>>> million required rows).
>>>> Step-2:-
>>>> Create the composite index in parallel, to make it as fast as
>>>> possible.
>>>> Create the unique index on the same column as in the PK.
>>>> Create the primary key constraint with NOVALIDATE (because creating
>>>> it with VALIDATE may take a lot of time to validate existing data) using
>>>> the above unique index. (This operation should happen in seconds as the
>>>> index is already created in the above step.)
>>>> Rename TAB1_BKP to TAB1 and TAB1 to TAB1_BKP (which can serve
>>>> as a backup for a few days and later be dropped). A sketch follows below.
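>>>>
>>>> A sketch of method-2 (hypothetical column names and filter):
>>>>
>>>>     create table tab1_bkp nologging parallel 32 as
>>>>       select * from tab1
>>>>       where  created_dt >= date '2020-01-01';   -- hypothetical keep criteria
>>>>
>>>>     create index tab1_bkp_ix on tab1_bkp (c1, c2, c3) parallel 32 nologging;
>>>>     create unique index tab1_bkp_pk_ix on tab1_bkp (id) parallel 32 nologging;
>>>>     alter table tab1_bkp add constraint tab1_bkp_pk
>>>>       primary key (id) using index tab1_bkp_pk_ix enable novalidate;
>>>>
>>>>     alter table tab1 rename to tab1_old;
>>>>     alter table tab1_bkp rename to tab1;
>>>>     -- note: grants, synonyms and statistics do not follow the renames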
>>>>
>>>
>>>
>>
-- http://www.freelists.org/webpage/oracle-l
Received on Mon Jul 12 2021 - 20:00:12 CEST