Home » RDBMS Server » Performance Tuning » Problem with SQL (Oracle 10 g, Windows XP)
Problem with SQL [message #282811] Fri, 23 November 2007 13:34 Go to next message
kpremsagar
Messages: 26
Registered: June 2007
Junior Member
Hi,
I have a SQL. The query takes too much time when run for a longer period say sysdate -365. And I need your inputs in optimizing it.

SELECT ID, salesperson, consultantid,
       (SELECT MAX (employee_hist_key)
          FROM edw.employee_vw
         WHERE operator_id = t.consultantid) sckey, activityid,
       (SELECT MAX (activityflag)
          FROM edw_stage.psm_activitylookup p
         WHERE p.activityid = t.activityid) aflag, activitydate,
       (SELECT MAX (date_key)
          FROM edw.dates
         WHERE dates = t.activitydate) dkey, DURATION, customer,
       (SELECT MAX (company_address_source_key)
          FROM edw.company_address_source_dim
         WHERE source_system = 'SFDC'
           AND UPPER (company_name) = t.customer) co_key,
       comments, deleteflag, lastupdated, opportunityidnumber
  FROM (SELECT ROW_NUMBER () OVER (PARTITION BY ID, consultantid, activityid, activitydate ORDER BY lastupdated DESC) distinct_ct,
               ID, salesperson, consultantid, activityid, activitydate,
               DURATION, customer, comments, deleteflag, lastupdated,
               opportunityidnumber
          FROM edw_stage.psm_activity psm_activity) t
 WHERE distinct_ct = 1 AND lastupdated > SYSDATE - 4


The explain plan for this is below. Somehow I cant get it to display properly. So am attaching the excel sheet too.

 Plan	Object_Name	Obj_type	Bytes	Cost	Cardinality	CPU_COST	IO_COST	ACCESS_PRED	FILTER_PRED
Select Statement			261905994	6893	692873	1126464456	6808		
  1.1 Sort (Aggregate)			60		1				
    2.1 View	EMPLOYEE_VW	VIEW	120	6	2	13312300	5		
      3.1 Sort (Group By)			788	6	2	13312300	5		
        4.1 Table Access (By Index Rowid)	EMPLOYEE_HIST	TABLE	788	5	2	92364	5		"EFF_END_DATE">SYSDATE@! OR "EFF_END_DATE" IS NULL
          5.1 Index (Range Scan)	EMP_HIST_OPERATOR_ID	INDEX		1	9	17836	1	"OPERATOR_ID"=:B1	
  1.2 Sort (Aggregate)			16		1				
    2.1 Table Access (Full)	PSM_ACTIVITYLOOKUP	TABLE	16	3	1	53417	3		"P"."ACTIVITYID"=:B1
  1.3 Sort (Aggregate)			13		1				
    2.1 Table Access (By Index Rowid)	DATES	TABLE	13	2	1	31562	2		
      3.1 Index (Range Scan)	DATES_IDX02	INDEX		1	1	16236	1	"DATES"=:B1	
  1.4 Sort (Aggregate)			34		1				
    2.1 Table Access (Full)	COMPANY_ADDRESS_SOURCE_DIM	TABLE	140930	8506	4145	696898590	8453		"SOURCE_SYSTEM"='SFDC' AND UPPER("COMPANY_NAME")=:B1
  1.5 View			261905994	6893	692873	1126464456	6808		"DISTINCT_CT"=1 AND "LASTUPDATED">SYSDATE@!-4
    2.1 Window (Sort Pushed Rank)			62358570	6893	692873	1126464456	6808		ROW_NUMBER() OVER ( PARTITION BY "ID","CONSULTANTID","ACTIVITYID","ACTIVITYDATE" ORDER BY INTERNAL_FUNCTION("LASTUPDATED") DESC )<=1
      3.1 Table Access (Full)	PSM_ACTIVITY	TABLE	62358570	1188	692873	280314622	1167		



Is there any way I can optimize this?

Thanks a million
Prem
Re: Problem with SQL [message #282817 is a reply to message #282811] Fri, 23 November 2007 16:49 Go to previous messageGo to next message
kpremsagar
Messages: 26
Registered: June 2007
Junior Member
Hi,
The inline view is used to get the distinct records based on only certain columns. I can certainly think of taking the date filter inside the inline query. Need to test it out.

Also I created a function index on UPPER(COMPANY_NAME) and a bitmap index on SOURCE_SYSTEM. Bitmap because for almost 700K records there are only 6 distinct values for this column.

But I am not sure of the impact of using a bitmap index with regards to index create / update timing. will it impact the job running for that table?

Also are there any other ideas?

Thanks a million
Prem
Re: Problem with SQL [message #282910 is a reply to message #282817] Sat, 24 November 2007 23:20 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Get rid of the scalar-subqueries in the SELECT clause.

Ross Leishman
Re: Problem with SQL [message #282918 is a reply to message #282910] Sun, 25 November 2007 01:59 Go to previous messageGo to next message
kpremsagar
Messages: 26
Registered: June 2007
Junior Member
Hi Ross,
I wish I could remove the scalar subqueries...but I cannot because my other option is to re-write them as outer joins...

any other options?

Thanks a lot for the reply
Prem
Re: Problem with SQL [message #282920 is a reply to message #282811] Sun, 25 November 2007 02:23 Go to previous messageGo to next message
orausern
Messages: 826
Registered: December 2005
Senior Member
If I understand correctly, the biggest resouce consumption takes place at:

(SELECT ROW_NUMBER () OVER (PARTITION BY ID, consultantid, activityid, activitydate ORDER BY lastupdated DESC) distinct_ct,
               ID, salesperson, consultantid, activityid, activitydate,
               DURATION, customer, comments, deleteflag, lastupdated,
               opportunityidnumber
          FROM edw_stage.psm_activity psm_activity) t
 WHERE distinct_ct = 1 AND lastupdated > SYSDATE - 4


This is because , in the explain plan, the cost of this operation is shown as : 692873. And the access path is FULL TABLE access of PSM_ACTIVITY.

So this is the part of query that needs to be reviewed. If you can rewrite the query to avoid the full table scan then you can reduce the cost of the query. perhaps by adding some function base index - for ROW_NUMBER () OVER (PARTITION BY ID, consultantid, activityid, activitydate ORDER BY lastupdated DESC part etc.

Not sure but this is what I thought.



Re: Problem with SQL [message #282981 is a reply to message #282920] Sun, 25 November 2007 19:53 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
And what's wrong with an outer-join?

Ross Leishman
Re: Problem with SQL [message #283054 is a reply to message #282981] Mon, 26 November 2007 00:42 Go to previous messageGo to next message
kpremsagar
Messages: 26
Registered: June 2007
Junior Member
Hi Ross,
Thanks a million for the reply.

But isnt a inner query a better way of re-writing an outer join? I thought I read somewhere that such way of writing is better.

Is this way a more intensive process? Pls let me know your thoughts. This is in contradiction with something that I thought.

Thanks again
Prem
Re: Problem with SQL [message #283065 is a reply to message #283054] Mon, 26 November 2007 01:07 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
I'm in the early stages of an article on this very topic - watch the main board. So I don't want to spoil the surprise.

Long story short:
- A nested operation is one that is performed for each row in an outer result set
- Nesting is bad for large result sets.
- Correlated scalar sub-queries in the SELECT clause must be nested
- Joins (of any type) and even most sub-queries can be optionally nested or unnested as the optimiser sees fit.
- Joins are more versatile than correlated scalar sub-queries in the SELECT clause.

Ross Leishman
Re: Problem with SQL [message #284495 is a reply to message #282811] Thu, 29 November 2007 21:58 Go to previous message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
try running the pieces one step at a time. This will give you some idea of how long each takes. It is a crude method, but if there is anything obvioiusly long, it will show up.

so, do your analytic query first as a create table as...

The write several versions of a second query, where each verions does on of the scalar subselect, and execute and time these.

This will give you a good idea of the incremental cost of each piece. Then it will be clear where to go.

Seems to me, the Oracle Optimizer should have recongnized your query as a top-n query and done the appropriate optimization for it. I would start there.

Good luck, Kevin
Previous Topic: order of drop,create and create index during data load
Next Topic: Removing Duplicates Query
Goto Forum:
  


Current Time: Tue Nov 26 19:35:19 CST 2024