"Re-sequencing" a pseudo-key
Date: Wed, 8 Oct 2014 12:08:24 +0200
Message-ID: <CA+S=qd3x3=rYrd2__zNS+GW6RXYNNqhjmk3XU-_FbxGJw=-Wew_at_mail.gmail.com>
Hi, list
The short description of my question is:
· I’ve got 600 million rows in 650 tables and need to update all rows to “re-sequence” a 32-bit integer pseudo key in all tables. How to approach this update task with reasonable efficiency?
I can’t think of a way to describe the task very briefly without getting a lot of answers where I’ll have to say “well, there’s just this little special issue that makes your good idea unable to work here”, so this’ll be a longish mail. If you haven’t got the time or inclination to read through this bunch of details, please feel free to stop reading now :-)
For those reading on, I’ll try to describe the setup and the challenge in a bunch of points. That way you can refer to “point 5.c.2 is wrong, young man” or something like that ;-)
- Total about 650 tables totaling about 600 million rows
- 1 table about 300 million rows
- 7 tables each in 10-40 million rows range totaling about 100 million rows
- 37 tables each in 1-9 million rows range totaling about 100 million rows
- The rest 600+ tables each less than 1 million rows totaling about 100 million rows
- Each table has numeric pseudokey called RECID
- Legacy application code mandates this to be 32-bit integer
-2,147,483,648 to 2,147,483,647
- Populated by one common sequence created with INCREMENT BY 10
- Legacy kernel (unchangeable) code works like:
- When needing to insert a row in any table, it selects NEXTVAL FROM DUAL
ii. Itinserts row using the selected sequence value
iii. Thenext 9 inserts from this session to any table will use the next 9 values
iv. The 10thinsert will again select NEXTVAL FROM DUAL
d. Sequence originally started at 1 going up towards 2,147,483,647
e. Few years ago max was reached and sequence reset to start at
-2,147,483,648
f. We are now at about -340,000,000 and will reach zero late 2015 or early 2016
g. “Re-sequencing” pseudokeys will “use” 600 million of the 4200 million
h. Then we would have 3600 million available or about 10-12 years before next problem
i. Time to research solution is now
i. Ifthis cannot be done in one night, Easter 2015 is the only big window
3 Key constraints are only partially defined in Oracle
- Legacy application treats Oracle as black box and handles keys in application
- So no primary or foreign keys in Oracle – only defined in application metadata
- Except some tables are replicated with materialized views
- These tables have materialized view logs
ii. Thereforethey have primary key in Oracle defined as RECID column
d. Many places in application have “multi-parent foreign key”
i. Onecolumn designates which parent table
ii. Othercolumn contains parent RECID
e. Some “generic” tables exist in the application
i. Forexample if COL2=17, then COL5 contains parent RECID of one table
ii. IfCOL2=18, then COL7 contains parent RECID of another table
4. Legacy application is index happy
- Average almost 4 indexes per table
- Up to 22 indexes on a few tables
- The one big table of 300 million rows has 16 indexes
- Every table has a unique index on RECID
- Indexes are all created as UNIQUE
i.Application
makes non-unique indexes UNIQUE by adding RECID as last element
ii. If itis truly unique in the application, RECID is not in the index
5. IDEA 1 – DML
- Massive DML for each table
- CREATE TABLE tab1$newrecid AS SELECT recid as oldrecid, rownum as newrecid FROM tab1 ORDER BY recid
ii. UPDATEtab1 SET recid = newrecid
iii. UPDATEtables with FKs to tab1 SET parentrecid = newrecid
b. Pros
i. Simpleand easy
c. Cons
i. SlowDML of all rows
ii. Indexmaintenance
iii.Replication
- All rows go in the mview logs
- FAST REFRESH will be very slow based on the bloated mview logs
- IDEA 2 – DDL
- A lot of CTAS
- For all tables create translate table CREATE TABLE tab1$newrecid AS SELECT recid as oldrecid, rownum as newrecid FROM tab1 ORDER BY recid
ii. Foreach table do CTAS joining PK and any FKs to translate tables
- CREATE TABLE tab1$copy AS SELECT t2.newrecid as recid, t1.othercolumns, t15.newrecid as t15parentrecid, t17… FROM tab1 t1 JOIN tab1$newrecid t2 ON t2.oldrecid = t1.recid JOIN tab15$newrecid t15 ON t15.oldrecid = t1.t15parentrecid JOIN tab17$newrecid …
- DROP tab1
- RENAME tab1$copy TO tab1
- Rebuild indexes
- Create mview log if table is replicated
- Recreate / refresh complete replicated mview on target database
- Pros
- Probably a good deal faster than DML
- Cons
- Extra storage needed
ii. Extrawork rebuilding indexes
iii. Extrawork rebuilding replication
iv. Morecomplex code to write
v. Downtimeon replication target database
7. Some of my thoughts
- Normally I can have a night service window maybe 3 hours
- For this I can arrange up to 10-12 hours if need be
- If that won’t be sufficient, Easter can provide 60 hour window
- My priority is “safety” – as little chance of something going wrong as possible
- I can arrange the storage needed for the copy tables of the DDL method
- I think the DDL method is most likely “best”, but can it be “finetuned”?
- Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
- Running on Oracle Linux Server x86_64 release 6.5 with 4 cores and 512 GB memory
- I can test this full scale on a dataguard copy that I can put in snapshot standby mode
- My questions
- Have I missed something obvious?
- Ideas for a different method?
- Ideas for improving some of the steps in my ideas?
- Would it help if the “translate” tables were IOTs?
ii. Orgive the “translate” tables a tablespace with bigger blocks?
d. Things to consider that can give considerable speed improvements?
i. (Imean, adding complexity to gain a few minutes isn’t interesting ;-)
e. Any database parameters that should be increased while doing this?
f. Parallelism? DIY with 4 sessions each handling a different set of tables?
So, to any that got this far – thank you for your patience in reading this :-)
I hope I can either get confirmation that my ideas seem reasonably OK or some ideas for improvement I can try out.
Thanks in advance.
Regards
Kim Berg Hansen
http://dspsd.blogspot.com
kibeha_at_gmail.com
_at_kibeha
--
http://www.freelists.org/webpage/oracle-l
Received on Wed Oct 08 2014 - 12:08:24 CEST