How to reduce the optimizer cost for the below said query [message #385742]
Tue, 10 February 2009 22:49
dhanamukesh
Messages: 51 Registered: January 2009
Member
Please look into this query and help me find how its cost and response time can be reduced.
SELECT cnsmr_id
FROM consumer
WHERE edir_gu_id IS NULL
MINUS -- consumer has no transactions
SELECT snd_cnsmr_id
FROM consumer_tran
MINUS -- consumer has no loyalty cards
SELECT cnsmr_id
FROM loyalty_program_mbshp
MINUS
SELECT cp.prm_cnsmr_id
FROM consumer cp
WHERE cp.prm_cnsmr_id != cp.cnsmr_id;
Re: How to reduce the optimizer cost for the below said query [message #385839 is a reply to message #385762]
Wed, 11 February 2009 06:02 |
JRowbottom
Messages: 5933 Registered: June 2006 Location: Sunny North Yorkshire, ho...
Senior Member
I'm pretty sure that your rewritten query is wrong.
SELECT cnsmr_id
FROM consumer
WHERE edir_gu_id IS NULL
MINUS
SELECT cp.prm_cnsmr_id
FROM consumer cp
WHERE cp.prm_cnsmr_id != cp.cnsmr_id;

is not the same as
SELECT cnsmr_id,prm_cnsmr_id
FROM consumer
WHERE (edir_gu_id IS NULL or prm_cnsmr_id != cnsmr_id)
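A tiny made-up example shows the difference. Suppose a two-row copy of the table held these values (hypothetical data, real column names):

-- Hypothetical two-row table to show the two forms disagree
CREATE TABLE consumer_demo (
  cnsmr_id     NUMBER,
  edir_gu_id   NUMBER,
  prm_cnsmr_id NUMBER
);
INSERT INTO consumer_demo VALUES (1, NULL, 1);
INSERT INTO consumer_demo VALUES (2, NULL, 1);

-- MINUS form: the first branch returns 1 and 2, the second branch
-- returns 1 (row 2 names consumer 1 as its primary), so only 2 survives.
SELECT cnsmr_id FROM consumer_demo WHERE edir_gu_id IS NULL
MINUS
SELECT prm_cnsmr_id FROM consumer_demo WHERE prm_cnsmr_id != cnsmr_id;

-- OR form: both rows pass the WHERE clause, so consumer 1 comes back
-- as well, which is a different result.
SELECT cnsmr_id, prm_cnsmr_id
FROM consumer_demo
WHERE (edir_gu_id IS NULL OR prm_cnsmr_id != cnsmr_id);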
If there are a relatively small number of rows in Consumer where edir_gu_id is null, you could create a function-based index on nvl(edir_gu_id,-999) (assuming that edir_gu_id will never be -999) and then rewrite the first clause of the query as
SELECT cnsmr_id
FROM consumer
WHERE nvl(edir_gu_id,-999) = -999
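A sketch of the matching index DDL (the index name here is made up, and this assumes edir_gu_id is numeric and -999 never occurs as a real value):

-- Function-based index on the NVL expression
CREATE INDEX consumer_edir_fbi
ON consumer (NVL(edir_gu_id, -999));

-- The rewritten predicate matches the indexed expression exactly,
-- so the optimizer can range-scan the index instead of the table:
SELECT cnsmr_id
FROM consumer
WHERE NVL(edir_gu_id, -999) = -999;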
Re: How to reduce the optimizer cost for the below said query [message #385912 is a reply to message #385839] |
Wed, 11 February 2009 20:00 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
Senior Member
It's a bit difficult to tell without any information.
- How many rows in consumer
- How many of those rows have edir_gu_id IS NULL
- How many of those remaining rows will be filtered out by each of the MINUS conditions? Treat them separately - i.e. the first condition might filter out 100 rows and the second condition might filter out 100 rows. They might be the same 100 rows, or separate (200 in total). Just tell us how many are filtered by each MINUS condition one at a time - not combined.
- Finally, how many rows are returned by the query.
There are a number of ways to go about this, and the BEST way will depend on your accurate responses to ALL of the above questions.
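One way to gather those numbers, one condition at a time (a sketch using the table names from the query; each IN subquery measures a single MINUS branch in isolation):

-- Base population
SELECT COUNT(*) FROM consumer;
SELECT COUNT(*) FROM consumer WHERE edir_gu_id IS NULL;

-- Rows that each MINUS branch would remove, measured independently
SELECT COUNT(*)
FROM consumer
WHERE edir_gu_id IS NULL
AND cnsmr_id IN (SELECT snd_cnsmr_id FROM consumer_tran);

SELECT COUNT(*)
FROM consumer
WHERE edir_gu_id IS NULL
AND cnsmr_id IN (SELECT cnsmr_id FROM loyalty_program_mbshp);

SELECT COUNT(*)
FROM consumer
WHERE edir_gu_id IS NULL
AND cnsmr_id IN (SELECT cp.prm_cnsmr_id
                 FROM consumer cp
                 WHERE cp.prm_cnsmr_id != cp.cnsmr_id);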
Ross Leishman
Re: How to reduce the optimizer cost for the below said query [message #385948 is a reply to message #385912] |
Wed, 11 February 2009 23:57 |
dhanamukesh
Messages: 51 Registered: January 2009
Member
Thanks for your reply.
I have given all the details below.
How many rows in consumer
Count -- 16164761
How many of those rows have edir_gu_id IS NULL
Count -- 16163799
How many of those remaining rows will be filtered out by each of the MINUS conditions? Treat them separately - i.e. the first condition might filter out 100 rows and the second condition might filter out 100 rows. They might be the same 100 rows, or separate (200 in total). Just tell us how many are filtered by each MINUS condition one at a time - not combined
After the first MINUS condition the count is 3523456.
After the second MINUS condition the count is 3261775, and after the last MINUS condition it is 3261772.
Finally, how many rows are returned by the query
count -- 3261772
Please help me tune this query as soon as possible.
Regards,
Dhanalakshmi.P
Re: How to reduce the optimizer cost for the below said query [message #386152 is a reply to message #386032] |
Thu, 12 February 2009 21:46 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
Senior Member
That's great information. 3 million rows from 16 million is a LOT. There is no way you want this to be indexed; when you are reading that much data, the best way to go is Full Table Scans.
Your current method is pretty good. MINUS is quite a scalable way to filter large volumes of data. The only problem is that it requires both sides of the MINUS to be sorted in memory. For large volumes, memory will run out and it will page to disk.
You can rewrite it using NOT IN sub-queries. These can be made to use HASH ANTI JOINS instead of sorting. This can be faster, but you are still dealing with some large volumes, so it may still page to disk. Generally speaking, hash anti joins will out-perform MINUS.
SELECT cnsmr_id
FROM consumer
WHERE edir_gu_id IS NULL
AND sys_op_map_nonnull(cnsmr_id) NOT IN (
SELECT sys_op_map_nonnull(snd_cnsmr_id)
FROM consumer_tran
UNION ALL -- consumer has no loyalty cards
SELECT sys_op_map_nonnull(cnsmr_id)
FROM loyalty_program_mbshp
UNION ALL
SELECT sys_op_map_nonnull(cp.prm_cnsmr_id)
FROM consumer cp
WHERE cp.prm_cnsmr_id != cp.cnsmr_id
);
For an anti-join to work, you CANNOT have nulls in the comparison. SYS_OP_MAP_NONNULL takes care of nulls. You could do the same with NVL(), mapping NULL to another character/number; or you could use AND col IS NOT NULL clauses to remove NULLS altogether.
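As a sketch of that last variant, here is the same query with the NULLs excluded in the subquery rather than mapped (the HASH_AJ hint is the traditional way to ask for the hash anti-join if the optimizer does not choose it on its own):

SELECT cnsmr_id
FROM consumer
WHERE edir_gu_id IS NULL
AND cnsmr_id NOT IN (
    SELECT /*+ HASH_AJ */ snd_cnsmr_id
    FROM consumer_tran
    WHERE snd_cnsmr_id IS NOT NULL      -- keep NULLs out of the anti-join
    UNION ALL
    SELECT cnsmr_id
    FROM loyalty_program_mbshp
    WHERE cnsmr_id IS NOT NULL
    UNION ALL
    SELECT cp.prm_cnsmr_id
    FROM consumer cp
    WHERE cp.prm_cnsmr_id != cp.cnsmr_id  -- != already rejects NULLs
);

Remember that a single NULL returned by the subquery would make NOT IN return no rows at all, which is why the IS NOT NULL filters matter here.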
Ross Leishman