Home » RDBMS Server » Performance Tuning » Problem with SQL (Oracle 10 g, Windows XP)
Problem with SQL [message #282811] |
Fri, 23 November 2007 13:34 |
kpremsagar
Messages: 26 Registered: June 2007
|
Junior Member |
|
|
Hi,
I have a SQL. The query takes too much time when run for a longer period say sysdate -365. And I need your inputs in optimizing it.
SELECT ID, salesperson, consultantid,
(SELECT MAX (employee_hist_key)
FROM edw.employee_vw
WHERE operator_id = t.consultantid) sckey, activityid,
(SELECT MAX (activityflag)
FROM edw_stage.psm_activitylookup p
WHERE p.activityid = t.activityid) aflag, activitydate,
(SELECT MAX (date_key)
FROM edw.dates
WHERE dates = t.activitydate) dkey, DURATION, customer,
(SELECT MAX (company_address_source_key)
FROM edw.company_address_source_dim
WHERE source_system = 'SFDC'
AND UPPER (company_name) = t.customer) co_key,
comments, deleteflag, lastupdated, opportunityidnumber
FROM (SELECT ROW_NUMBER () OVER (PARTITION BY ID, consultantid, activityid, activitydate ORDER BY lastupdated DESC) distinct_ct,
ID, salesperson, consultantid, activityid, activitydate,
DURATION, customer, comments, deleteflag, lastupdated,
opportunityidnumber
FROM edw_stage.psm_activity psm_activity) t
WHERE distinct_ct = 1 AND lastupdated > SYSDATE - 4
The explain plan for this is below. Somehow I cant get it to display properly. So am attaching the excel sheet too.
Plan Object_Name Obj_type Bytes Cost Cardinality CPU_COST IO_COST ACCESS_PRED FILTER_PRED
Select Statement 261905994 6893 692873 1126464456 6808
1.1 Sort (Aggregate) 60 1
2.1 View EMPLOYEE_VW VIEW 120 6 2 13312300 5
3.1 Sort (Group By) 788 6 2 13312300 5
4.1 Table Access (By Index Rowid) EMPLOYEE_HIST TABLE 788 5 2 92364 5 "EFF_END_DATE">SYSDATE@! OR "EFF_END_DATE" IS NULL
5.1 Index (Range Scan) EMP_HIST_OPERATOR_ID INDEX 1 9 17836 1 "OPERATOR_ID"=:B1
1.2 Sort (Aggregate) 16 1
2.1 Table Access (Full) PSM_ACTIVITYLOOKUP TABLE 16 3 1 53417 3 "P"."ACTIVITYID"=:B1
1.3 Sort (Aggregate) 13 1
2.1 Table Access (By Index Rowid) DATES TABLE 13 2 1 31562 2
3.1 Index (Range Scan) DATES_IDX02 INDEX 1 1 16236 1 "DATES"=:B1
1.4 Sort (Aggregate) 34 1
2.1 Table Access (Full) COMPANY_ADDRESS_SOURCE_DIM TABLE 140930 8506 4145 696898590 8453 "SOURCE_SYSTEM"='SFDC' AND UPPER("COMPANY_NAME")=:B1
1.5 View 261905994 6893 692873 1126464456 6808 "DISTINCT_CT"=1 AND "LASTUPDATED">SYSDATE@!-4
2.1 Window (Sort Pushed Rank) 62358570 6893 692873 1126464456 6808 ROW_NUMBER() OVER ( PARTITION BY "ID","CONSULTANTID","ACTIVITYID","ACTIVITYDATE" ORDER BY INTERNAL_FUNCTION("LASTUPDATED") DESC )<=1
3.1 Table Access (Full) PSM_ACTIVITY TABLE 62358570 1188 692873 280314622 1167
Is there any way I can optimize this?
Thanks a million
Prem
|
|
|
Re: Problem with SQL [message #282817 is a reply to message #282811] |
Fri, 23 November 2007 16:49 |
kpremsagar
Messages: 26 Registered: June 2007
|
Junior Member |
|
|
Hi,
The inline view is used to get the distinct records based on only certain columns. I can certainly think of taking the date filter inside the inline query. Need to test it out.
Also I created a function index on UPPER(COMPANY_NAME) and a bitmap index on SOURCE_SYSTEM. Bitmap because for almost 700K records there are only 6 distinct values for this column.
But I am not sure of the impact of using a bitmap index with regards to index create / update timing. will it impact the job running for that table?
Also are there any other ideas?
Thanks a million
Prem
|
|
|
|
Re: Problem with SQL [message #282918 is a reply to message #282910] |
Sun, 25 November 2007 01:59 |
kpremsagar
Messages: 26 Registered: June 2007
|
Junior Member |
|
|
Hi Ross,
I wish I could remove the scalar subqueries...but I cannot because my other option is to re-write them as outer joins...
any other options?
Thanks a lot for the reply
Prem
|
|
|
Re: Problem with SQL [message #282920 is a reply to message #282811] |
Sun, 25 November 2007 02:23 |
orausern
Messages: 826 Registered: December 2005
|
Senior Member |
|
|
If I understand correctly, the biggest resouce consumption takes place at:
(SELECT ROW_NUMBER () OVER (PARTITION BY ID, consultantid, activityid, activitydate ORDER BY lastupdated DESC) distinct_ct,
ID, salesperson, consultantid, activityid, activitydate,
DURATION, customer, comments, deleteflag, lastupdated,
opportunityidnumber
FROM edw_stage.psm_activity psm_activity) t
WHERE distinct_ct = 1 AND lastupdated > SYSDATE - 4
This is because , in the explain plan, the cost of this operation is shown as : 692873. And the access path is FULL TABLE access of PSM_ACTIVITY.
So this is the part of query that needs to be reviewed. If you can rewrite the query to avoid the full table scan then you can reduce the cost of the query. perhaps by adding some function base index - for ROW_NUMBER () OVER (PARTITION BY ID, consultantid, activityid, activitydate ORDER BY lastupdated DESC part etc.
Not sure but this is what I thought.
|
|
|
|
Re: Problem with SQL [message #283054 is a reply to message #282981] |
Mon, 26 November 2007 00:42 |
kpremsagar
Messages: 26 Registered: June 2007
|
Junior Member |
|
|
Hi Ross,
Thanks a million for the reply.
But isnt a inner query a better way of re-writing an outer join? I thought I read somewhere that such way of writing is better.
Is this way a more intensive process? Pls let me know your thoughts. This is in contradiction with something that I thought.
Thanks again
Prem
|
|
|
Re: Problem with SQL [message #283065 is a reply to message #283054] |
Mon, 26 November 2007 01:07 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
I'm in the early stages of an article on this very topic - watch the main board. So I don't want to spoil the surprise.
Long story short:
- A nested operation is one that is performed for each row in an outer result set
- Nesting is bad for large result sets.
- Correlated scalar sub-queries in the SELECT clause must be nested
- Joins (of any type) and even most sub-queries can be optionally nested or unnested as the optimiser sees fit.
- Joins are more versatile than correlated scalar sub-queries in the SELECT clause.
Ross Leishman
|
|
|
Re: Problem with SQL [message #284495 is a reply to message #282811] |
Thu, 29 November 2007 21:58 |
|
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
|
Senior Member |
|
|
try running the pieces one step at a time. This will give you some idea of how long each takes. It is a crude method, but if there is anything obvioiusly long, it will show up.
so, do your analytic query first as a create table as...
The write several versions of a second query, where each verions does on of the scalar subselect, and execute and time these.
This will give you a good idea of the incremental cost of each piece. Then it will be clear where to go.
Seems to me, the Oracle Optimizer should have recongnized your query as a top-n query and done the appropriate optimization for it. I would start there.
Good luck, Kevin
|
|
|
Goto Forum:
Current Time: Tue Nov 26 19:35:19 CST 2024
|