Home » RDBMS Server » Performance Tuning » Performance Pruning on fact table (10.2.0.4)
Performance Pruning on fact table [message #442725] |
Wed, 10 February 2010 04:29 |
summoner
Messages: 44 Registered: March 2009
|
Member |
|
|
Dear all,
Our data warehouse has a Range-hash partition table (My_Fact) that stored 1M records. The data is partitioned by the year and then the event ID. We have master tables Event_Master and Year
We tried to use a query (generated from software "Business Objects") to retreive the data My_Fact table that the event ID is 999999 or 1111111.
And the query is similiar as follow
Quote:
select *
from My_Fact, Event_Master, Year
where My_Fact.event_id = Event_Master.event_id
and My_Fact.year_id = Year.year_id
and (Event_Master.event_id = 999999 or Event_Master.event_id = 1111111);
When we check on the execution plan, the plan simply tried to retrieve all rows from the My_Fact table as we find the following statement in the plan
Quote:
PARTITION RANGE ALL
PARTITION HASH ALL
TABLE ACCESS FULL
What we think is that the system will calculate the hash value of the event ID and the check the correct partition. But this is not correct. We are not sure why pruning does not work. We also check the parameter "star_transformation_enabled" and it is set as true
Thanks for help
|
|
|
Re: Partition Pruning for a [message #442731 is a reply to message #442725] |
Wed, 10 February 2010 04:55 |
JRowbottom
Messages: 5933 Registered: June 2006 Location: Sunny North Yorkshire, ho...
|
Senior Member |
|
|
Well, the quick response would be that your query isn't specifying a single year or a range of years that it wants to retrieve data from.
You're specifying that the Event_Id has to be 111111 or 999999, but there is no restriction on the year_id
|
|
|
|
Re: Performance Pruning on fact table [message #442734 is a reply to message #442732] |
Wed, 10 February 2010 05:03 |
John Watson
Messages: 8960 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
Hi - firstly, with regard to star transformation, do you have the correct indexes? B*tree indexes will not work for this. Secondly, it is possible that the optimizer is making the correct decision: you do not have very many rows. Thirdly, are you sure that your partitioning strategy is appropriate? You aren't using the range predicate at all in your query. Lastly, do you, in fact, have a problem at all? How long does the query take?
|
|
|
Re: Partition Pruning for a [message #442737 is a reply to message #442725] |
Wed, 10 February 2010 05:42 |
|
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
|
Senior Member |
|
|
I am no expert with partitioning. However I am with the OP. I would have expected Oracle to prune but this all depends upon if I understand how partitioning works or not. Let us walk through an example.
Assume we have years 2008,2009 and events 1,2,3. If we partition by year/event then I would have expected oracle to create six physical partitions as follows:
2008,1
2008,2
2008,3
2009,1
2009,2
2009,3
The data above (the six pairs of year,event) is metadata. It tells us the partitions that we expect to be created given our partitioning scheme and the data we load. In my understanding of partitioning, Oracle does indeed create and use this kind of metadata. Here now is the expectation...
If I want event 1, then I would have expected Oracle to scan the partition metadata and figure out that it only needs to look at partitions ((2008,1),(2009,1)). This requires a scan of the metadata because we are not supplying years, but this is trivial and the result is the pruning away of two thirds of the table before we start our query and so such a metadata scan is well worth it.
Maybe the issue is with the fact that you have used HASH partitioning for your secondary partitioning scheme. HASHING msy have changed the set of partitions created and/or how Oracle uses the partition metadata to figure things out. Try using list partitioning as your secondary partitioning strategy (assuming this is allowed in your version or Oracle).
Good luck, Kevin
|
|
|
Re: Partition Pruning for a [message #442746 is a reply to message #442737] |
Wed, 10 February 2010 06:11 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
If I understand you correctly, you have MY_FACT range partitioned on YEAR_ID and hash subpartitioned on EVENT_ID.
Theoretically, Oracle should be able to apply transitivity to the predicates:
where My_Fact.event_id = Event_Master.event_id
and (Event_Master.event_id = 999999 or Event_Master.event_id = 1111111);
to come up with:
where My_Fact.event_id = 999999 or My_Fact.event_id = 1111111);
Now once that transitive rule is applied, I would expect a HASH PARTITION ITERATOR to pick up the max 2 subpartitions per range partition.
But it's not doing that. Maybe it is the transitivity. Maybe it is the OR.
You need to run some tests. But first make sure of one thing: Are the data types of MY_FACT.EVENT_ID and EVENT_MASTER.EVENT_ID both NUMERIC? If not, your integer comparison or the join will cause casting, which could mess up the transitivity.
Lets try a few things. Tell us if these Prune the subpartitions.
select *
from My_Fact
where My_Fact.event_id = 999999;
select *
from My_Fact
where My_Fact.event_id = 999999 or My_Fact.event_id = 1111111);
select *
from My_Fact, Event_Master
where My_Fact.event_id = Event_Master.event_id
and My_Fact.event_id = 999999;
select *
from My_Fact, Event_Master
where My_Fact.event_id = Event_Master.event_id
and My_Fact.event_id = 999999 or My_Fact.event_id = 1111111;
select *
from My_Fact, Event_Master
where My_Fact.event_id = Event_Master.event_id
and Event_Master.event_id = 999999;
Ross Leishman
|
|
|
Re: Partition Pruning for fact table [message #442833 is a reply to message #442737] |
Wed, 10 February 2010 21:41 |
summoner
Messages: 44 Registered: March 2009
|
Member |
|
|
Thank you for everyone.
We rebuild the Event Master table and correct data type
Now the query can use correct partition pruning for all the queries shown above
Now our system needs to face the actual environment. Our DW is used for the tool "Business Objects" which allow user to drag and drop any fields from the DW.
The Event_Master table has two important fields "event_id" and "previous_event_id" and user always want to compare the revenue between current event and previous event. Somehow the query may be shown as follows
SELECT *
FROM my_fact,
event_master,
YEAR
WHERE my_fact.event_id = event_master.event_id
AND my_fact.year_id = YEAR.year_id
AND (event_master.event_id = 999999
OR event_master.event_id IN (SELECT event_master.previous_event_id
FROM event_master
WHERE ((event_master.event_id) = 999999)));
The system failed to do partition pruning in this case. However, we could solve it by rewriting the query as follows
SELECT *
FROM my_fact,
event_master,
YEAR
WHERE my_fact.event_id = event_master.event_id
AND my_fact.year_id = YEAR.year_id
AND (event_master.event_id = 999999)
UNION ALL
SELECT *
FROM my_fact,
event_master,
YEAR
WHERE my_fact.event_id = event_master.event_id
AND my_fact.year_id = YEAR.year_id
AND (event_master.event_id IN (SELECT event_master.previous_event_id
FROM event_master
WHERE ((event_master.event_id) = 999999)));
Edited by BlackSwan to provide formatted "code" with correct tags
But the fact is that user could create any query they want. So rewriting a query is not a good solution in this case. Could anyone help us to solve the problem in more generic way? Thanks all
===========================
Edited
===========================
Thank you BlackSwan for formatting the code.
Now I am checking whether the "Query Rewrite" feature provided by Oracle can help me to solve the case. As I check some documents, they suggest us to create materialize view with "enable query rewrite" option. Therefore, I tried the following materialized view
CREATE MATERIALIZED VIEW "EVENT_PREV_MV"
ENABLE QUERY REWRITE
AS SELECT event_id AS src_evt_id,
event_id AS evt_id
FROM event_master
UNION ALL
SELECT event_id,
previous_event_id
FROM event_master
WHERE previous_event_id IS NOT NULL;
When I tried the above query again, the query failed to rewrite and I think it is because the query for the MV are different.
[Updated on: Thu, 11 February 2010 02:14] Report message to a moderator
|
|
|
|
Re: Performance Pruning on fact table [message #442982 is a reply to message #442914] |
Thu, 11 February 2010 19:17 |
summoner
Messages: 44 Registered: March 2009
|
Member |
|
|
Kevin Meade wrote on Thu, 11 February 2010 05:14I could hazard a guess or two but there are just too many variables not accounted for.
For example, is event_id unique on the event_master table?
If so then can't you just code:
SELECT *
FROM my_fact,
event_master,
YEAR
WHERE my_fact.event_id = event_master.event_id
AND my_fact.year_id = YEAR.year_id
AND (event_master.event_id = 999999 or event_master.previous_event_id = 999999)
and not bother with the nested select?
Kevin
The event_id is unique on the Event_Master table.
We cannot retrieve correct result from your query. Let's say the previous event of event #999999 is 77777.
Then what we need to do is to create a subquery to retrieve the previous event id (i.e 77777) for event 999999 from the event_master table. In this case, the query should be as follows
SELECT *
FROM my_fact,
event_master,
YEAR
WHERE my_fact.event_id = event_master.event_id
AND my_fact.year_id = YEAR.year_id
AND (event_master.event_id = 999999 or event_master.event_id = 77777)
|
|
|
|
Re: Performance Pruning on fact table [message #443074 is a reply to message #442725] |
Fri, 12 February 2010 07:48 |
|
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
|
Senior Member |
|
|
So you are basically doing a hierarchical query one level deep.
Previous_event_id is on the table too. Thus the row where event_id = 999999 has the previous_event_id on it already.
Maybe you should try a join to a second copy of the table.
SELECT *
FROM my_fact,
event_master,
event_master previous_event_master,
YEAR
WHERE my_fact.event_id = event_master.event_id
AND my_fact.year_id = YEAR.year_id
AND event_master.event_id = 999999
AND event_master.previous_event_id = previous_event_master.event_id(+)
/
You will have both current and previous events on the same result row. If you must have them on two rows, you can do a pivot query.
select my_fact.*
, decode(rowduper.column_value,1,a[event_master].event_id,2,a.[previous_event_master].event_id) event_id
, ...
, year.*
from (
SELECT *
FROM my_fact,
event_master,
event_master previous_event_master,
YEAR
WHERE my_fact.event_id = event_master.event_id
AND my_fact.year_id = YEAR.year_id
AND event_master.event_id = 999999
AND event_master.previous_event_id = previous_event_master.event_id(+)
) a
,(
select level column_value
from dual
connect by level <= 2
) rowduper
/
The above code won't work as is because of obvious syntax errors. You will have to pay attention to your columns names etc. in your inner query by naming them well so you can seperate them in the outer query. But you can work those details out.
Maybe I am not understanding your need but this looks workable to me. You of course will also have to figure out if it will still do the partition pruning you need.
Kevin
|
|
|
Goto Forum:
Current Time: Fri Nov 22 07:50:51 CST 2024
|