Tuning analytical function query [message #388873]
Thu, 26 February 2009 08:08
ajitpal.s
Messages: 204 Registered: November 2006
Senior Member
Hi Guys,
I have a query that, for some reason, takes more than 5 minutes to run, which is not reasonable.
explain plan for
select distinct
a.line
,a.item_id
,first_value (b.std) over (partition by b.item_id
order by case when b.r_date <= (select add_months(start_date,-4) from rel_tim where time_id = 5) then b.r_date
else null end desc nulls last
,case when b.r_date > (select add_months(start_date,-4) from rel_tim where time_id = 5) then b.r_date
else null end asc nulls last) val
from
dm_v a, std_c b
where a.item_id = b.item_id
and b.std <> 0
and b.id = 105
and a.TRX >= '1-Jul-2008'
-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4584K| 144M| | 106K (2)| 00:21:14 |
| 1 | TABLE ACCESS BY INDEX ROWID | rel_tim | 1 | 12 | | 1 (0)| 00:00:01 |
|* 2 | INDEX UNIQUE SCAN | rel_tim_PK | 1 | | | 0 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID | rel_tim | 1 | 12 | | 1 (0)| 00:00:01 |
|* 4 | INDEX UNIQUE SCAN | rel_tim_PK | 1 | | | 0 (0)| 00:00:01 |
| 5 | HASH UNIQUE | | 4584K| 144M| 421M| 106K (2)| 00:21:14 |
| 6 | WINDOW SORT | | 4584K| 144M| 421M| 106K (2)| 00:21:14 |
|* 7 | HASH JOIN | | 4584K| 144M| 6560K| 23300 (2)| 00:04:40 |
| 8 | TABLE ACCESS BY INDEX ROWID| dm_v | 239K| 3744K| | 7622 (1)| 00:01:32 |
|* 9 | INDEX RANGE SCAN | dm_v_N11 | 239K| | | 649 (2)| 00:00:08 |
|* 10 | TABLE ACCESS FULL | std_c | 406K| 6745K| | 14725 (3)| 00:02:57 |
-------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("TIME_ID"=5)
4 - access("TIME_ID"=5)
7 - access("A"."ITEM_ID"="B"."ITEM_ID")
9 - access("A"."TRX">=TO_DATE('2008-07-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
10 - filter("B"."ID"=105 AND "B"."std"<>0)
Note:
1) Both tables, dm_v and std_c, have an index created on item_id.
2) I have analyzed the tables:
ANALYZE TABLE dm_v ESTIMATE STATISTICS SAMPLE 30 PERCENT;
ANALYZE TABLE std_c ESTIMATE STATISTICS SAMPLE 30 PERCENT;
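(As an aside: on recent Oracle releases, ANALYZE is deprecated for gathering optimizer statistics in favour of DBMS_STATS. A rough equivalent of the two statements above might look like this; the schema name 'SCOTT' is a placeholder:)

```sql
-- Supported alternative to ANALYZE ... ESTIMATE STATISTICS.
-- ownname is a placeholder schema; cascade => TRUE also gathers
-- statistics on each table's indexes in the same call.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'SCOTT',
    tabname          => 'DM_V',
    estimate_percent => 30,
    cascade          => TRUE);
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'SCOTT',
    tabname          => 'STD_C',
    estimate_percent => 30,
    cascade          => TRUE);
END;
/
```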
I would appreciate suggestions on how to go about tuning this. I see a full table scan, but I'm not sure how to get rid of it.
Re: Tuning analytical function query [message #388889 is a reply to message #388884]
Thu, 26 February 2009 09:08
ajitpal.s
Thanks for the reply.
I noticed that b.std <> 0 has no index, but it's a number column, and I'm not sure it's a good idea to create an index on it.
After analyzing the rest of the indexes with the syntax below, I noticed the explain plan is still the same. Can I tell the optimizer to use an index for this column?
ANALYZE INDEX <name> ESTIMATE STATISTICS SAMPLE 30 PERCENT;
Is there any guideline as to when to analyze an index or table, what sample percent to estimate, and whether to analyze the table or the index first?
Re: Tuning analytical function query [message #388893 is a reply to message #388890]
Thu, 26 February 2009 09:32
ajitpal.s
I did the counts; the figures are below. I did not apply the filter on TRX (and a.TRX >= '1-Jul-2008').
How many total rows in std_c b?
select count(*) from std_c -- = 4848914
How many rows a.item_id = b.item_id?
select count(*) from dm_v a, std_c b where a.item_id = b.item_id -- = 3709761000
How many rows b.std <> 0?
select count(*) from std_c where std <> 0 -- 3280043
How many rows b.id = 105?
select count(*) from std_c where id = 105 -- 728215
Re: Tuning analytical function query [message #388902 is a reply to message #388896]
Thu, 26 February 2009 09:53
JRowbottom
Messages: 5933 Registered: June 2006 Location: Sunny North Yorkshire, ho...
Try this: instead of querying rel_tim twice per row returned, query it once for the whole statement.
select distinct
a.line
,a.item_id
,first_value (b.std) over (partition by b.item_id
order by case when b.r_date <= add_months(t.start_date,-4)
then b.r_date end desc nulls last
,case when b.r_date > add_months(t.start_date,-4)
then b.r_date end asc nulls last) val
from dm_v a
,std_c b
,rel_tim t
where a.item_id = b.item_id
and b.std <> 0
and b.id = 105
and a.TRX >= to_date('1-Jul-2008','dd-mon-yyyy')
and t.time_id = 5
Re: Tuning analytical function query [message #388906 is a reply to message #388902]
Thu, 26 February 2009 10:10
ajitpal.s
Thanks for building up the query and tuning it; it now seems to take 4 min 21 s, so some improvement there.
The explain plan has one step fewer.
Just curious: is it possible to avoid the full table scan on the std_c table (step 10 in the explain plan)? I'm not sure whether access via the existing index would be faster.
Are there any specific hints I can use to get a faster join?
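(For what it's worth, one way to experiment is an INDEX hint; the index name below is hypothetical, and forcing it is only worthwhile if the index filters out most of the rows. Here id = 105 matches roughly 728K of 4.8M rows, about 15%, so the optimizer's full scan is likely already the cheaper choice:)

```sql
-- std_c_id_idx is a hypothetical index on std_c(id).
-- An index hint only pays off when the indexed predicate is
-- selective; at ~15% of the table, a full scan is usually cheaper.
SELECT /*+ index(b std_c_id_idx) */
       b.item_id, b.std, b.r_date
FROM   std_c b
WHERE  b.id  = 105
AND    b.std <> 0;
```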
Re: Tuning analytical function query [message #388952 is a reply to message #388906]
Thu, 26 February 2009 14:13
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
Seems to me like the nasty analytic function can be converted into a slightly less nasty aggregate function. If we do this and aggregate down to the join key before the mini-Cartesian join to A, we can sort fewer rows and reduce the size of the joins.
SELECT a.line
     , a.item_id
     , MAX(d.val)
FROM   dm_v a
     ,(
       SELECT b.item_id,
              MAX(b.std) KEEP (DENSE_RANK FIRST ORDER BY
                  CASE
                      WHEN b.r_date <= add_months(t.start_date,-4)
                      THEN b.r_date
                  END DESC NULLS LAST
                 ,CASE WHEN b.r_date > add_months(t.start_date,-4)
                      THEN b.r_date
                  END ASC NULLS LAST
              ) AS val
       FROM   std_c b
            , rel_tim t
       WHERE  b.std <> 0
       AND    b.id = 105
       AND    t.time_id = 5
       GROUP BY b.item_id
      ) d
WHERE  a.item_id = d.item_id
AND    a.TRX >= to_date('1-Jul-2008','dd-mon-yyyy')
GROUP BY a.line
       , a.item_id
Ross Leishman
Re: Tuning analytical function query [message #388972 is a reply to message #388952]
Thu, 26 February 2009 17:11
ajitpal.s
I was trying to understand the changes you made.
DENSE_RANK assigns ranks; if a value is the same as the previous one it gets the same rank, and the next value gets the subsequent rank.
But can you explain what the aggregate is for, i.e. the MAX(b.std) before DENSE_RANK?
I'm just trying to understand how the SQL in the 'd' inline view affects the join.
[Updated on: Thu, 26 February 2009 17:14]
Re: Tuning analytical function query [message #389129 is a reply to message #388972]
Fri, 27 February 2009 06:11
Alessandro Rossi
Messages: 166 Registered: September 2008 Location: Rome
This one should probably do the same job.
select a.line
      ,a.item_id
      ,d.std as val
from dm_v a
left outer join (
    select b.item_id,
           max(b.std) keep (
               dense_rank first
               order by abs(b.r_date - add_months(c.start_date,-4))
           ) as std
    from std_c b
    join rel_tim c on ( c.time_id = 5 )
    where b.r_date is not null
    and   b.std <> 0
    and   b.id = 105
    group by b.item_id
) d on (a.item_id = d.item_id)
where a.TRX >= date '2008-07-01'
As Ross Leishman said, it can be converted into a direct aggregate function (instead of DISTINCT over an analytic).
Here I tried to reduce everything:
1. Instead of using ORDER BY ... NULLS LAST, I used an outer join against non-null first values.
2. Instead of aggregating the final (and bigger) data set, I applied the aggregate function to the smallest one (the table std_c).
3. The ORDER BY clause using CASE can probably be reduced to a mathematical expression that is a bit easier to evaluate.
PS:
The MAX before DENSE_RANK is there because, in general, multiple rows may tie for first (or last) in a particular sort order. On those (tied) rows you need an aggregate function, because the result must fit on a single row.
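(A tiny illustration with made-up values: two rows tie for the earliest r_date, so DENSE_RANK FIRST keeps both, and MAX collapses the tied stds onto one row per group.)

```sql
-- Both 2009-01-01 rows rank first; MAX picks 9 out of the tie {7, 9}.
with t (item_id, std, r_date) as (
  select 1, 7, date '2009-01-01' from dual union all
  select 1, 9, date '2009-01-01' from dual union all
  select 1, 5, date '2009-02-01' from dual
)
select item_id,
       max(std) keep (dense_rank first order by r_date) as val
from   t
group by item_id;
-- -> item_id = 1, val = 9
```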
Bye Alessandro
[Updated on: Fri, 27 February 2009 06:18]