Home » RDBMS Server » Performance Tuning » Query Optimization Help - Date-Range Matching results in MERGE JOIN CARTESIAN (Oracle 10gR2 (10.2.0.4))
Query Optimization Help - Date-Range Matching results in MERGE JOIN CARTESIAN [message #515435] Mon, 11 July 2011 09:37 Go to next message
rasa
Messages: 45
Registered: February 2006
Member
Assume there are 3 tables (in reality, I have to deal with about 8 tables, but the concept is same):

CREATE TABLE T1EDC (NAME VARCHAR2(16), BEGIN_DATE DATE, END_DATE DATE);
CREATE TABLE T2CAL (NAME VARCHAR2(16), BEGIN_DATE DATE, END_DATE DATE);
CREATE TABLE T3SG (NAME VARCHAR2(16), BEGIN_DATE DATE, END_DATE DATE);


Now, populate the following data, for each of the 3 tables:

insert into T1EDC (NAME, BEGIN_DATE, END_DATE)
values ('A', to_date('01-01-2011', 'dd-mm-yyyy'), to_date('31-01-2011', 'dd-mm-yyyy'));

insert into T1EDC (NAME, BEGIN_DATE, END_DATE)
values ('B', to_date('01-02-2011', 'dd-mm-yyyy'), to_date('31-03-2011', 'dd-mm-yyyy'));

insert into T1EDC (NAME, BEGIN_DATE, END_DATE)
values ('C', to_date('01-04-2011', 'dd-mm-yyyy'), to_date('01-05-2011', 'dd-mm-yyyy'));

insert into T2CAL (NAME, BEGIN_DATE, END_DATE)
values ('D', to_date('01-01-2011', 'dd-mm-yyyy'), to_date('19-01-2011', 'dd-mm-yyyy'));

insert into T2CAL (NAME, BEGIN_DATE, END_DATE)
values ('E', to_date('20-01-2011', 'dd-mm-yyyy'), to_date('09-03-2011', 'dd-mm-yyyy'));

insert into T2CAL (NAME, BEGIN_DATE, END_DATE)
values ('F', to_date('10-03-2011', 'dd-mm-yyyy'), to_date('01-05-2011', 'dd-mm-yyyy'));

insert into T3SG (NAME, BEGIN_DATE, END_DATE)
values ('G', to_date('01-01-2011', 'dd-mm-yyyy'), to_date('19-02-2011', 'dd-mm-yyyy'));

insert into T3SG (NAME, BEGIN_DATE, END_DATE)
values ('H', to_date('20-02-2011', 'dd-mm-yyyy'), to_date('01-05-2011', 'dd-mm-yyyy'));


So, the data will now line-up as shown below:

T1EDC:
NAME	BEGIN_DATE	END_DATE
A	1/1/2011	1/31/2011
B	2/1/2011	3/31/2011
C	4/1/2011	5/1/2011


T2CAL:
NAME	BEGIN_DATE	END_DATE
D	1/1/2011	1/19/2011
E	1/20/2011	3/9/2011
F	3/10/2011	5/1/2011


T3SG:
NAME	BEGIN_DATE	END_DATE
G	1/1/2011	2/19/2011
H	2/20/2011	5/1/2011


The business requirement is that, we need to derive the relationships in such a way that the dates overlap/coincide for the ranges shown. For example, when Tables 1 and 2 are joined, we get the following relationships:
A	D	1/1/2011	1/19/2011
A	E	1/20/2011	1/31/2011
B	E	2/1/2011	3/9/2011
B	F	3/10/2011	3/31/2011
C	F	4/1/2011	5/1/2011


That is because when "A" in T1EDC is compared against "D" in T2CAL, the intersecting date-range is 1/1/2011 to 1/19/2011. The reason why you don't see an entry for "A" and "F" is because, F's date-range is outside of A's date-range and hence there is no overlap.

So, the query I use is like this:
WITH SUBQ AS
 (SELECT A.NAME "EDC_NAME",
		 B.NAME "CAL_NAME",
		 GREATEST(A.BEGIN_DATE, B.BEGIN_DATE) BEGIN_DATE,
		 LEAST(A.END_DATE, B.END_DATE) END_DATE
  FROM T1EDC A, T2CAL B
  WHERE A.BEGIN_DATE <= B.END_DATE
		AND A.END_DATE >= B.BEGIN_DATE)
SELECT A.EDC_NAME,
	   A.CAL_NAME,
	   B.NAME SG_NAME,
	   GREATEST(A.BEGIN_DATE, B.BEGIN_DATE) BEGIN_DATE,
	   LEAST(A.END_DATE, B.END_DATE) END_DATE
FROM SUBQ A, T3SG B
WHERE A.BEGIN_DATE <= B.END_DATE
	  AND A.END_DATE >= B.BEGIN_DATE;


Unfortunately, when I run this prototype, it looks good, but, when I actually start to take this approach for the 8 tables/relationships, it results in numerous MERGE JOIN CARTESIAN that takes a lot of time to complete.

If you think about it, the business-requirement results in 1 row from the main table being compared to all other rows in the next table in order to find the date-range overlap. So, there is an element of "Cartesian" join involved here.

How can I tackle this issue other than the query I had shown above? This does not work even for small datasets.

Any help will be ** greatly ** appreciated.



Re: Query Optimization Help - Date-Range Matching results in MERGE JOIN CARTESIAN [message #515436 is a reply to message #515435] Mon, 11 July 2011 09:43 Go to previous messageGo to next message
rasa
Messages: 45
Registered: February 2006
Member
So, when I actually run the query I showed above, it brings the right results as follows, but this approach does not work well for the 8 tables I need to keep nesting/joining in this fashion. Hence, seeking some ideas. Thanks.

EDC_NAME	CAL_NAME	SG_NAME	BEGIN_DATE	END_DATE
A	D	G	1/1/2011	1/19/2011
A	E	G	1/20/2011	1/31/2011
B	E	G	2/1/2011	2/19/2011
B	E	H	2/20/2011	3/9/2011
B	F	H	3/10/2011	3/31/2011
C	F	H	4/1/2011	5/1/2011
Re: Query Optimization Help - Date-Range Matching results in MERGE JOIN CARTESIAN [message #515437 is a reply to message #515436] Mon, 11 July 2011 09:46 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
>this approach does not work well for the 8 tables I need to keep nesting/joining in this fashion

My reaction to this observation is that this application has a SERIOUS design flaw.
Re: Query Optimization Help - Date-Range Matching results in MERGE JOIN CARTESIAN [message #515438 is a reply to message #515436] Mon, 11 July 2011 09:48 Go to previous messageGo to next message
cookiemonster
Messages: 13958
Registered: September 2008
Location: Rainy Manchester
Senior Member
The obvious question is why do you have 8 tables at all?
Assuming you have to join every table to every other table thats 28 joins. The only way I can see to avoid horrible performance is to rationalise the data.
Re: Query Optimization Help - Date-Range Matching results in MERGE JOIN CARTESIAN [message #515439 is a reply to message #515438] Mon, 11 July 2011 09:52 Go to previous messageGo to next message
rasa
Messages: 45
Registered: February 2006
Member
cookiemonster wrote on Mon, 11 July 2011 10:48
The obvious question is why do you have 8 tables at all?
Assuming you have to join every table to every other table thats 28 joins. The only way I can see to avoid horrible performance is to rationalise the data.


Thanks. Actually, it is not going to be 28 joins. As I demonstrated the tables "1 and 2" are joined as shown in the SUBQ subquery-refactor. And then the resultset from "1 and 2" are used to join "3". And assuming there is a 4th table, we will use the combined resultset from "1, 2, and 3" to be joined against the "4". So on and so forth.

The reason why it is this way is that the relationships are stored in those 8 different tables with differing date-ranges and we need to find the "overlapping" date-range for each relationship. Thanks.
Re: Query Optimization Help - Date-Range Matching results in MERGE JOIN CARTESIAN [message #515440 is a reply to message #515439] Mon, 11 July 2011 09:55 Go to previous messageGo to next message
cookiemonster
Messages: 13958
Registered: September 2008
Location: Rainy Manchester
Senior Member
That doesn't explain why there's 8 tables.
Re: Query Optimization Help - Date-Range Matching results in MERGE JOIN CARTESIAN [message #515445 is a reply to message #515440] Mon, 11 July 2011 10:39 Go to previous messageGo to next message
rasa
Messages: 45
Registered: February 2006
Member
cookiemonster wrote on Mon, 11 July 2011 10:55
That doesn't explain why there's 8 tables.


The business-requirement is such that each of those 8 tables have "relationships" and "date-range" for EACH relationship. All said and done, when all those 8 tables are queried, you have a dataset like this:
REL 1	REL 2	REL 3	REL 4 	REL 5	REL 6	REL 7 	REL 8	BEGIN_DATE 1	BEGIN_DATE 2	BEGIN_DATE 3	BEGIN_DATE 4	BEGIN_DATE 5	BEGIN_DATE 6	BEGIN_DATE 7	BEGIN_DATE 8	END_DATE 1	END_DATE 2	END_DATE 3	END_DATE 4	END_DATE 5	END_DATE 6	END_DATE 7	END_DATE 8
A	B	C	D	E	F	G	H	1/1/2011	1/15/2011	1/13/2011	1/20/2011	1/17/2011	1/5/2011	1/9/2011	1/23/2011	1/20/2011	1/25/2011	1/31/2011	1/30/2011	1/29/2011	1/25/2011	1/28/2011	1/31/2011


From here, you will use GREATEST(BEGIN_DATE_1, BEGIN_DATE_2....BEGIN_DATE_8) for the BEGIN_DATE of the actual relationship and

you will use LEAST(END_DATE_1, END_DATE_2, END_DATE_3...END_DATE_8)

in order to arrive at what should be the effective date-range that has an overlap for the entire set of relationships from A to H.

That is the requirement and unfortunately, we have the relationships in 8 different tables. If the dates had lined-up to a T, then a straight-forward join is possible instead of having to do this range join
WHERE A.BEGIN_DATE <= B.END_DATE AND A.END_DATE >= B.BEGIN_DATE




Re: Query Optimization Help - Date-Range Matching results in MERGE JOIN CARTESIAN [message #515448 is a reply to message #515445] Mon, 11 July 2011 10:52 Go to previous messageGo to next message
cookiemonster
Messages: 13958
Registered: September 2008
Location: Rainy Manchester
Senior Member
Given the lack of information the only suggestion I can give is a series of materiliazed views
- and that'll only work if you don't need the data to be constantly up to date as the refresh time is probably going to be painful.

Failing that you are actually going to have to explain the business requirement because I really don't understand what you are doing here.
And without that knowledge I really can't suggest alternatives.
Re: Query Optimization Help - Date-Range Matching results in MERGE JOIN CARTESIAN [message #515451 is a reply to message #515448] Mon, 11 July 2011 11:10 Go to previous messageGo to next message
rasa
Messages: 45
Registered: February 2006
Member
@cookiemonster - I understand the constraint. Thanks.
Re: Query Optimization Help - Date-Range Matching results in MERGE JOIN CARTESIAN [message #515495 is a reply to message #515451] Mon, 11 July 2011 22:18 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Cartesian products are avoidable - after all you do have join criteria there. It's just that Oracle thinks a Cartesian product would be best. You should be able to override it with /*+ORDERED USE_MERGE(a b)*/ style hints, but it may not be faster.

Here is an article devoted to tuning range joins - you may find some of these tips helpful.

Ross Leishman
Re: Query Optimization Help - Date-Range Matching results in MERGE JOIN CARTESIAN [message #516018 is a reply to message #515495] Thu, 14 July 2011 09:39 Go to previous message
rasa
Messages: 45
Registered: February 2006
Member
rleishman wrote on Mon, 11 July 2011 23:18
Cartesian products are avoidable - after all you do have join criteria there. It's just that Oracle thinks a Cartesian product would be best. You should be able to override it with /*+ORDERED USE_MERGE(a b)*/ style hints, but it may not be faster.

Here is an article devoted to tuning range joins - you may find some of these tips helpful.

Ross Leishman


Ross -- Thanks for that wonderful pointer. Am reading to learn more.Thanks.
Previous Topic: alter table.. shrink space command hanging..
Next Topic: SYSAUX Tablespace
Goto Forum:
  


Current Time: Sun Nov 24 21:45:22 CST 2024