Re: "Re-sequencing" a pseudo-key
Date: Mon, 13 Oct 2014 10:46:04 +1000
Message-ID: <543B20CC.4030203_at_tpg.com.au>
Hi Kim,
Am I right in surmising that your legacy system is a sales processing system? If so, given that you restarted with negative numbers several years ago, and expect to reach zero 1-1 1/2 year from now, did you consider archiving all rows with positive RECIDs?
To do this, you could simply clone the database and remove the positive RECIDs from the production. When the production hits the maximum positive RECID several years from now, you could copy the remaining negatives from production to the archival clone and remove the negatives from production. When production hits 0 again in a decade's time or so, you store the now complete archival clone to some permanent storage and start with a new clone.. You would of course need two incarnations of your legacy app, one connecting to the archive, with only reporting functionality enabled, and the live one.
This way historical data from the past decade is still online, if immutable, and new data can continue to be added. "Recent" historical data (i.e. the negative RECIDs) is in both the live and the archival database, providing a good overlap for reporting. Really old data will be available through a second interface for long-term trend studies etc and ancient data on request...
Just a thought..
Cheers,
Tony
On 12/10/14 06:02, Kim Berg Hansen wrote:
> Hi, Serqey
>
> Thanks for the ideas - I'm copying this to the list for the benefit of others.
>
> Unfortunately the idea won't work for us, I cannot change how the application use sequences.
>
> You see, the application is written in an old legacy programming language. The compiler translates into legacy "executable" with embedded SQL (which we cannot touch.) Then the runtime engine executes the legacy compiled executable code with the SQL in it.
>
> Part of that is that the runtime engine whenever a RECID is needed (no matter which table) the engine will get the nextval from the /one/ master sequence (which is defined as INCREMENT BY 10) and it will use that value for the INSERT it executes. Then the engine will use the next 9 values for the next 9 INSERTS (no matter which table) and only then will it get sequence.nextval again.
>
> And the engine uses 32-bit signed integers for the RECIDs and there will never be a 64-bit updated version, thus our deadline in about a year to 1½. The numbers are "running out" faster than the amount of rows we have, partly because (as Iggy pointed out) numbers will be "lost" when the engine calls nextval but then only inserts a couple rows before closing the session (similar to how a sequence cache in Oracle may "lose" numbers when the cache is aged out of shared pool or at database restart.)
>
> But mostly because the application inserts rows to tables like SalesHeader and SalesLine, and then when the order is invoiced the data is inserted to tables like InvoiceHeader and InvoiceLine, after which SalesHeader and SalesLine is deleted. So the application as such "wastes" numbers that is deleted and reinserted in new tables. Again it is unfortunately not possible to change this - only option to avoid it would be to replace the system completely...
>
> I definitely wish I could give each table a separate sequence, but I cannot change how this runtime engine works when it executes our compiled legacy code :-(
>
>
>
> Regards
>
>
> Kim Berg Hansen
>
> http://dspsd.blogspot.com
> kibeha_at_gmail.com <mailto:kibeha_at_gmail.com>
> _at_kibeha
>
>
> On Sat, Oct 11, 2014 at 3:41 AM, Sergey Kosourikhin <kosour_at_gmail.com <mailto:kosour_at_gmail.com>> wrote:
>
> Hi Kim,
>
> If we look at this task from slightly different angle, I would say this task is not about re-sequencing, but rather it’s de-sequencing.
>
> I mean, your main issue is that application uses the only sequence which hit the limit too fast. And the solution would be to have separate sequences for every table. And it’s not a one-off operation. Actually, it should be like nightly job during outage period. (well, I don’t know your application - but it may be even possible to run it without outage under some conditions )
>
> So, if we want to provide solution for de-sequencing, one of the implementatios would be:
>
> 1. Create individual sequences for every table.
> 2. Create mapping table (every time you when you will reset main sequence, we can call it “incarnation”. Currently you have incarnation 1. every time you reset main sequence, incarnation will be incremented by 1). In every incarnation all your parent recid’s are unique across database - as they come from the same sequence. That means to re-sequence child rec_id you don’t need to know from which table it came from - just understand if it’s current incarnation or past incarnations). So, in simplest way you can use structure of mapping table :
> - table_name
> - incarnation_id
> - old_rec_id
> - new_rec_id -> populated from table related sequence
> - is_processed (default 0)
>
> Once outage started, you
> 1. scan your tables and take all recid’s newer than maximum from previous incarnation.
> 2. populate mapping table with this rec_id’s and tables related sequences
> 3. You don’t need to worry about FK - as rec_id is unique across incarnation - just update them correctly (if it’s bigger than latest from previous incarnation - update it, if less - skip it) using mapping table.
> 4. Once you populated mapping table, you can start updating (re-sequencing) rec_id in your tables by using new_rec_id column
> 5. I would suggest do it in batches by, say, 1000 rec_ids in 1 transaction. before commit, you update is_processed = 1 for these old_rec_id’s. You can create stored proc metadata driven which will update all tables for given rec_id.
> 6. to speed up process you can create mapping tables as list partitioned by table name and sub partitioned by is_processed columns and enable row movement. So, you will have partition with non-processed rows which will decrease in size as long as you re-sequencing source tables.
> 7. once you processed all rows, you can reset your main sequence to the greatest of all table related sequences + 1.
>
> You can run this procedure during the outage every day and during weekends/holidays. It’s not a big problem if you couldn’t de-sequenced all rows for particular outage - that means you will just skip 1 sequence reset and you can continue next outage.
>
>
> Think yourself if you need all processed rows in mapping table (it looks to me at this moment that only the last row you need to start next de-sequencing run).
>
> If I didn’t explain something clear enough - feel free to ask me for details.
>
> P.S. Sorry that I can’t reply to oracle-list - I have some issues with writing there - I am more like reader :)
>
> Regards,
> Sergey.
>
>
-- http://www.freelists.org/webpage/oracle-lReceived on Mon Oct 13 2014 - 02:46:04 CEST