RE: Hash cluster vs partition by hash
Date: Wed, 29 Jan 2014 16:21:47 +0000
Message-ID: <D7864FA3E7830B428CB2A5A5301B63EE7D296D87_at_S7041VA005.soa.soaad.com>
>> ...including all the columns in the select list...
That's the killer right there. We do take advantage of that technique in other places, but it's not reasonable here. We have too many columns we need to get.
>> if the 17 rows is representative
It's actually not -- most items have between 0 and 1000+ rows; and the distribution changes (always increasing) as the year progresses.
Anyway, thank you very, very much for all of your expert guidance! I am not sure the DBAs will want to create a non-ASSM tablespace in production for this, but at least I have something to think about.
Thanks again,
Matt
-----Original Message-----
From: Jonathan Lewis [mailto:jonathan_at_jlcomp.demon.co.uk]
Sent: Wednesday, January 29, 2014 3:17 AM
To: McPeak, Matt; Oracle Mailinglist
Subject: RE: Hash cluster vs partition by hash
Matt,
If you haven't already paid for the partitioning option it would be an expensive way to eliminate buffer busy waits, but as it is it looks like a reasonable way to take advantage of the technology. I wonder if you could do a little better by creating a local index on the table keyed on the "where clause", including all the columns in the select list, compressed on the where-clause columns (if the 17 rows is representative). If the partitions are small enough this index might have blevel 1 or 2, and therefore reduce your buffer visits.
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle
From: McPeak, Matt [vxsmimmcp_at_subaru.com] Sent: 28 January 2014 22:26
To: Jonathan Lewis; Oracle Mailinglist
Subject: RE: Hash cluster vs partition by hash
Yes, thank you! I forgot about ASSM!
Trying my test in a non-ASSM tablespace (and adding a condition for product-line, which I boneheadly omitted), I'm down to:
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 17 |00:00:00.01 | 3 | | 1 | PARTITION LIST SINGLE | | 1 | 17 | 17 |00:00:00.01 | 3 | | 2 | PARTITION HASH SINGLE| | 1 | 17 | 17 |00:00:00.01 | 3 | |* 3 | TABLE ACCESS FULL | MATT_VPS_DEMANDED_UNITS_PART | 1 | 17 | 17 |00:00:00.01 | 3 |
-----------------------------------------------------------------------------------------------------------------
So, now, in a test of 15,000 random item/ports, the partitioned table is marginally faster (19.88 seconds vs 20.48 seconds) -- we'll call that a wash... but the table is also smaller: 0.23Gb vs 0.28Gb (both are in the same non-ASSM tablespace, both are compressed for OLTP).
This seems to me that, as long as the number of distinct item/ports is significantly less than the number of hash subpartitions I'm allowed, that the LIST-HASH partitioned table is better than the hash cluster in every way AND solves my ITL-deadlock problem to boot.
Any thoughts on what I am not thinking of?
Thanks,
Matt
P.S. "1024 subpartitions seems quite a lot - which is why, perhaps, the access path was a full tablescan (very small segments)"... that's the idea. When I need to access this table, I need all of the data for a single item-port and nothing else. I want to co-locate it and also, if possible, avoid index-access to get to the table blocks.
-----Original Message-----
From: Jonathan Lewis [mailto:jonathan_at_jlcomp.demon.co.uk]
Sent: Monday, January 27, 2014 5:23 PM
To: McPeak, Matt; Oracle Mailinglist
Subject: RE: Hash cluster vs partition by hash
Look on your partitioned object as a rectangular array of 8 rows and 1024 columns. Your query is looking at all rows (partition list all) and one column (partition hash single) (It's slightly misleading / puzzling that those lines have started 9 times rather than 8)
The access method to each physical segment (subpartiion) is a full "table"scan - and I'd guess that on 11.2 you're using ASSM, perhaps with system extent allocation, so it's not surprising if you have about 32 block already allocated per segment (with a few spared out for space management).
1024 subpartitions seems quite a lot - which is why, perhaps, the access path was a full tablescan (very small segments), but if you want to do less work a locally partitioned index would help. It would certainly be one way to reduce contention on inserts - but if you're going to rebuild the object anyway is there any good reason for not rebuilding it as a hash cluster - conversely are there any good business reasons for building it as a partitioned table ?
To answer your first question, you can't partition clusters - although one of Oracle's tpc benchmarks had a version that did, but it never went into general release -(11.2.0.3 on Solaris, only , I think).
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle
From: oracle-l-bounce_at_freelists.org [oracle-l-bounce_at_freelists.org] on behalf of McPeak, Matt [vxsmimmcp_at_subaru.com] Sent: 27 January 2014 21:47
To: Oracle Mailinglist
Subject: RE: Hash cluster vs partition by hash
(I’m having some confusion, so sorry if this message gets posted 3 times… )
Thank you everyone for responding to my recent question about ITL deadlocks.
As an alternative to re-creating my hash clusters with INITRANS 8, I thought that if I could somehow partition by product line, I wouldn’t have a problem. (INITRANS 8 is a reasonable solution, but I’m not happy about the performance hit. Higher INITRANS means less rows per block means more I/Os per row on average).
As far as I know, you cannot partition a hash cluster in 11gR2. (Please, please tell me I’m wrong!! ☺)
So I started thinking along these lines (see below). The goal is to co-locate all the data by item/port (for performance), with no more than one product line in a given block (to avoid ITL deadlocks).
CREATE TABLE my_table
…
SEGMENT CREATION DEFERRED --IMMEDIATE
STORAGE (INITIAL 8192 NEXT 8192) – block size is 8k
-- CLUSTER MY_TABLE_CLUS(item, port)
COMPRESS FOR OLTP
PARTITION BY LIST (product_line)
SUBPARTITION BY HASH (item, port)
SUBPARTITIONS 1024
(PARTITION
VALUES ('Product Line 1'),
PARTITION
VALUES ('Product Line 2'),
PARTITION
VALUES ('Product Line 3'),
PARTITION
VALUES ('Product Line 4'),
PARTITION
VALUES ('Product Line 5'),
PARTITION
VALUES ('Product Line 6'),
PARTITION
VALUES ('Product Line 7'),
PARTITION
VALUES ('Product Line 8'),
PARTITION part_default
VALUES (DEFAULT)
);
I tried this, and then did a comparison of accessing the table for one item/port in the new structure vs the old (hash cluster) structure.
The partitioned table access plan was:
| Id | Operation | Name | Starts | E-Rows | A-Rows |Â Â A-Time | Buffers ||* 3 |Â Â Â TABLE ACCESS FULL | MATT_VPS_DEMANDED_UNITS_PART |Â Â Â Â Â 9 |Â Â Â Â 79 |Â Â Â Â 17 |00:00:00.01 |Â Â Â Â 224 |
-----------------------------------------------------------------------------------------------------------------
|Â Â 0 | SELECT STATEMENT |Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â |Â Â Â Â Â 1 |Â Â Â Â Â Â Â |Â Â Â Â 17 |00:00:00.01 |Â Â Â Â 224 | |Â Â 1 |Â PARTITION LIST ALL |Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â |Â Â Â Â Â 1 |Â Â Â Â 79 |Â Â Â Â 17 |00:00:00.01 |Â Â Â Â 224 | |Â Â 2 |Â Â PARTITION HASH SINGLE|Â |Â Â Â Â Â 9 |Â Â Â Â 79 |Â Â Â Â 17 |00:00:00.01 |Â Â Â Â 224 |
Predicate Information (identified by operation id):
3 - filter(("MODEL_ITEM_ID"=7698083 AND "PORT_CODE"='020'))
And the hash cluster access plan was:
| Id | Operation | Name | Starts | E-Rows | A-Rows |Â Â A-Time | Buffers |
-------------------------------------------------------------------------------------------------------
|Â Â 0 | SELECT STATEMENT |Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â |Â Â Â Â Â 1 |Â Â Â Â Â Â Â |Â Â Â Â |17 |00:00:00.01 |Â Â Â Â Â Â 1 |
|* 1 |Â TABLE ACCESS HASH| VERP_VPS_DEMANDED_UNITS |Â Â Â Â Â 1 |Â Â Â Â 53 |Â Â Â Â |17 |00:00:00.01 |Â Â Â Â Â Â 1 |
Predicate Information (identified by operation id):
1 - access("MODEL_ITEM_ID"=7698083 AND "PORT_CODE"='020')
But here is what is confusing me (and, finally, my question…): where are the 224 buffer reads in the first result coming from? I confirmed (via rowids) that all the data in my result was coming from just two database blocks. As the extra buffer gets overhead by Oracle to read the list of partitions, etc? And, if so, does that basically mean this alternative is unworkable?
Thanks, everyone, for any thoughts!
Matt
From: Jonathan Lewis [mailto:jonathan_at_jlcomp.demon.co.uk]
Sent: Friday, January 24, 2014 5:21 PM
To: McPeak, Matt; Oracle Mailinglist
Subject: RE: ITL deadlocks question
You might look at the dbms_redefinition package. I won't be surprised if it excluded clustered tables - but it's worth checking. (If it does allow them it will generate a lot of redo)
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle
From: McPeak, Matt [vxsmimmcp_at_subaru.com] Sent: 24 January 2014 20:43
To: Jonathan Lewis; Oracle Mailinglist
Subject: RE: ITL deadlocks question
That’s what I was afraid of… thank you for the quick and definitive answer!
One follow-up, if I may…
>> You need to recreate the table with initrans set to at least 8 to be safe.
Since we’re talking about a single-table hash cluster here, am I correct in believing that there is no online way to do this in 11gR2?
From: Jonathan Lewis [mailto:jonathan_at_jlcomp.demon.co.uk]
Sent: Friday, January 24, 2014 3:01 PM
To: McPeak, Matt; Oracle Mailinglist
Subject: RE: ITL deadlocks question
No.
Unless it's changed recently, a session will try each ITL slot in turn for a few seconds and then stop on one of them once it's gone through the entire list. If you're lucky the time it takes to cover the list means it will find a slot even if there were none when it started looking. You need to recreate the table with initrans set to at least 8 to be safe.
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle
From: oracle-l-bounce_at_freelists.org [oracle-l-bounce_at_freelists.org] on behalf of McPeak, Matt [vxsmimmcp_at_subaru.com] Sent: 24 January 2014 19:24
To: Oracle Mailinglist
Subject: ITL deadlocks question
I have a situation where the users like to submit eight (8) copies of an I/O intensive job simultaneously – one job for each of our product lines.
The job operates on tables that are not partitioned (or otherwise physically separated) by product line, so that one database block may contain rows for many different product lines.
Occasionally, some of the processes are failing due to ITL deadlocks.
My question is: suppose you have:
Block 1 => ITL: txn A, txn B with txn C waiting. Block 2 => ITL: txn B, txn C with txn A waiting… Block 3 => ITL: txn C, txn *R* with txn B waiting…
Is Oracle’s ITL deadlock detection smart enough to realize that, in block #1 for example, txn C is waiting for *either* txn A *or* txn B to end, but that it need not wait for both?
In other words, is it smart enough to know that the situation above is *not* a deadlock? (Because txn R can still end, then txn B, then both txn C and txn A can continue.)
The two tables involved are each in their own single table hash cluster, if that matters.
Thanks in advance for any help!
Matt
?m ? 祊 l ? j?
†Ûiÿü0ÁúÞzX¬¶Ê+ƒün– {ú+iÉ^
Received on Wed Jan 29 2014 - 17:21:47 CET