How to Control concurrent Inserts of a record by parallel sessions [message #562928] |
Tue, 07 August 2012 00:40 |
|
orapratap
Messages: 134 Registered: November 2011 Location: Canada
|
Senior Member |
|
|
Hello
There is a detail table
There is a summary table
During batch process record is entered in detail table as well as summary table
The process first checks if record exists in summary table for same group_no and if 'yes' then "updates" the record with the newly added amount (sums it) else inserts a new record
Whereas in the detail table it inserts the record directly
detail table :
group_no doc_no amount
101 doc1 100
101 doc2 200
102 doc3 300
102 doc4 400
summary table :
group_no amount
101 300
102 700
Now if the batch process runs in parallel, (out of many) two different sessions insert same group_no; This is because while sesond session inserts a record, first session inserting the same record (group_no) has not yet committed ; So second session Not knowing that already there is same Group_no (101) inserted, again inserts another record with same group_no rather than summing it
Can it be solved without using temp table, select for update?
Thanks and Regards
Orapratap
|
|
|
|
Re: How to Control concurrent Inserts of a record by parallel sessions [message #562941 is a reply to message #562928] |
Tue, 07 August 2012 01:36 |
John Watson
Messages: 8962 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
Quote:Now if the batch process runs in parallel, (out of many) two different sessions insert same group_no; This is because while sesond session inserts a record, first session inserting the same record (group_no) has not yet committed ; So second session Not knowing that already there is same Group_no (101) inserted, again inserts another record with same group_no rather than summing it You need to constrain the group_no column as unique or primary key, then this will not happen.
|
|
|
Re: How to Control concurrent Inserts of a record by parallel sessions [message #564571 is a reply to message #562941] |
Sat, 25 August 2012 21:50 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
I agree that locking - either explicit (as suggested by @michel) or implicit (as suggested by @John) will solve this problem by serialising updates against the same group_no, but since this is in the Performance Tuning forum, we should consider another alternative: make sure that all rows with the same group_no are processed by the same parallel thread.
The way you do this will depend on the technology you use to achieve parallelism, but most of tools out there with inbuilt parallelism provide some means of *directing* rows to particular threads.
That method could be a simple "round-robin", which we would use if every row was independent of each other. But in your case they are dependent, so round-robin is a bad choice.
One of the most common methods used when rows are inter-related is a "hash" distribution. The tool will use an input of your choice (eg. group_no), calculate a hash value (X) between 1 and N (where N is the number of parallel threads) and then send the row to thread X. In this way, two rows with the same group_no will get the same hash value, and will be sent to the same thread. As a result, you can guarantee that no two parallel threads will attempt to update the same summary row.
Ross Leishman
|
|
|
Re: How to Control concurrent Inserts of a record by parallel sessions [message #565234 is a reply to message #564571] |
Fri, 31 August 2012 21:38 |
|
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
|
Senior Member |
|
|
maybe another alternative would be to not do the summing at all. Just drop the summary table altogether. You can replace it with a view and anytime you want the summed value, just select it. If there are not hundreds of rows to sum each time then there would be little performance impact on the select side, and you would speed up the process on the update side considerably. A proper index on the detail table will speed up the summing as well.
Or, since this is a batch job, if you don't actually need to read the summary table while you are doing inserts, don't update it, just rebuild it after all inserts are done.
Good luck, Kevin
|
|
|