Home » RDBMS Server » Performance Tuning » Performance issue with long running query (Oracle 11i)
Performance issue with long running query [message #500825] Wed, 23 March 2011 14:52 Go to next message
prasadpsk
Messages: 4
Registered: March 2011
Junior Member
Hi,
I have a long running query like below which is taking around 1 hour to pull around 650k records for 1 day data.

SELECT DISTINCT customer_trx_line_id, customer_trx_id
FROM ra_cust_trx_line_gl_dist_all
WHERE last_update_date BETWEEN TO_DATE ('2011-01-28','yyyy-mm-dd HH24:MI:SS') AND TO_DATE ('2011-01-29','yyyy-mm-dd HH24:MI:SS')

On one particular day of 28th jan there are around 650K records created. There is an index exists on last_update_date column and the query is utilizing the index but still taking long time to execute. I need to pull the DISTINCT records. Without DISTINCT clause the query is executing fast but when trying to pull the DISTINCT records then the query is taking long time.
Any inputs/ideas to improve the performance of this query will be highly appreciated.

Thanks,
Prasad.
Re: Performance issue with long running query [message #500826 is a reply to message #500825] Wed, 23 March 2011 14:58 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
Read http://www.orafaq.com/forum/mv/msg/84315/433888/102589/#msg_433888 and post the required information.

Regards
Michel
Re: Performance issue with long running query [message #500827 is a reply to message #500826] Wed, 23 March 2011 15:03 Go to previous messageGo to next message
prasadpsk
Messages: 4
Registered: March 2011
Junior Member
hi,
Here is the explain plan
Plan

SELECT STATEMENT ALL_ROWSCost: 13,661 Bytes: 25,476 Cardinality: 772
4 HASH UNIQUE Cost: 13,661 Bytes: 25,476 Cardinality: 772
3 FILTER
2 TABLE ACCESS BY INDEX ROWID TABLE AR.RA_CUST_TRX_LINE_GL_DIST_ALL Cost: 13,660 Bytes: 25,476 Cardinality: 772
1 INDEX RANGE SCAN INDEX AR.RA_LINE_GL_DIST_N1 Cost: 174 Cardinality: 46,892
Re: Performance issue with long running query [message #500834 is a reply to message #500825] Wed, 23 March 2011 16:31 Go to previous messageGo to next message
ma_appsdba
Messages: 4
Registered: March 2011
Junior Member
Try creating function based index on TO_DATE(last_update_date). Also, try creating concatenated index on customer_trx_line_id, customer_trx_id - if that helps (make sure stats are in place).
What is the DB version.
Re: Performance issue with long running query [message #500835 is a reply to message #500834] Wed, 23 March 2011 17:25 Go to previous messageGo to next message
ThomasG
Messages: 3212
Registered: April 2005
Location: Heilbronn, Germany
Senior Member
ma_appsdba wrote on Wed, 23 March 2011 22:31
Try creating function based index on TO_DATE(last_update_date)


The to_date would make no sense if last_update_date is a date column. An index on last_update_date might help, though. Depending on the way the data is selected normally perhaps even an function based index on TRUNC(last_update_date)
Re: Performance issue with long running query [message #500922 is a reply to message #500827] Thu, 24 March 2011 09:00 Go to previous messageGo to next message
LKBrwn_DBA
Messages: 487
Registered: July 2003
Location: WPB, FL
Senior Member
Create a function index on TRUNC(last_update_date) as ThomasG suggest, and then you could try analytical query to see if it's faster than "DISTINCT":
SELECT * FROM (
SELECT customer_trx_line_id,
       customer_trx_id,
       DENSE_RANK ()
          OVER (PARTITION BY customer_trx_line_id, customer_trx_id
                ORDER BY ROWID)   rk
  FROM ra_cust_trx_line_gl_dist_all
 WHERE TRUNC (last_update_date) = TO_DATE ('2011-01-28', 'yyyy-mm-dd'))
 WHERE rk = 1

[Updated on: Thu, 24 March 2011 09:32] by Moderator

Report message to a moderator

Re: Performance issue with long running query [message #501144 is a reply to message #500922] Sat, 26 March 2011 06:46 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Are you people all mad? This query is already using an index to pull the EXACT 650K rows it needs, and then de-duplicating them. It really couldn't be simpler. Creating a function based index will mean he now has TWO indexes that do essentially the same thing (find the exact same rows), except now there's a lot less free space in the database because it's been chewed up by a useless index.

An analytic function won't be faster. It will do a sort - slower than the hash-distinct it is currently doing.

I'd be interested to know whether the problem is in the row retrieval or the sort. How long does this take:
SELECT MAX(customer_trx_line_id)
FROM ra_cust_trx_line_gl_dist_all
WHERE last_update_date BETWEEN TO_DATE ('2011-01-28','yyyy-mm-dd HH24:MI:SS') AND TO_DATE ('2011-01-29','yyyy-mm-dd HH24:MI:SS')

This SQL will retrieve all of the same data, but will not do the hash-unique (DISTINCT).

If it is still comparably slow, then the problem is probably disk latency. All those indexed lookups are single-block reads - lots of back-and-forth traffic to the disk, and lots of waiting time. If you add customer_trx_line_id and customer_trx_id to the last_update_date index, that means you won't need to lookup all those table blocks. Much less IO, should be at least 2x faster - probably more. If you cannot muck about with the indexes, Parallel Query is a very good way of dealing with disk latency. If the disk is going to take so darned long to serve block requests, at least you can get Parallel Query to have it retrieve them 4 or 8 at a time. However, before you go parallelizing your query, you need to speak to your DBA about it. There is an excellent chance that you will destroy the performance of the database if you do it inappropriately.

If the above SQL is quick, then the HASH-distinct is taking up the time. That would probably mean your PGA_AGGREGATE_TARGET is set too small, and you would need to ask your DBA to increase it.

Ross Leishman

Re: Performance issue with long running query [message #501204 is a reply to message #500825] Sun, 27 March 2011 11:36 Go to previous messageGo to next message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
Ross is correct on many counts. The ideas of TRUNC and FUCTION BASED INDEX are way off. They either change the result, or are useless in terms of helping.

The fastest way to get this query done is to create the following index.

SELECT DISTINCT customer_trx_line_id, customer_trx_id
FROM ra_cust_trx_line_gl_dist_all
WHERE last_update_date BETWEEN TO_DATE ('2011-01-28','yyyy-mm-dd HH24:MI:SS') AND TO_DATE ('2011-01-29','yyyy-mm-dd HH24:MI:SS') 

create index xi1 on ra_cust_trx_line_gl_dist_all
(last_update_date , customer_trx_line_id, customer_trx_id
);

This index will give you three benefits:

1) it will do a range scan on the index for only the rows that match the date expression.

2) it will skip table access altogether thus never visiting the table because all columns required to answer the query are in the index.

3) distinct might take less time since the rows are bunching in semi-sorted order already **though I would not count it too much.

However, if your explanation is right, that the distinct is the cause of your grief, then you need to check on how the HASH UNIQUE is performing. Not sure how to speed that up. There must be a way to figure out if it is hashing in memory or if the hash spills to disk. Maybe ROSS has a comment on how to figure this out.

For 650k rows, this query should finish in under a minute. On my systems, I would expect it to get done between a max of 20 and as little as 5 seconds.

Kevin

[Updated on: Sun, 27 March 2011 11:38]

Report message to a moderator

Re: Performance issue with long running query [message #501215 is a reply to message #501204] Sun, 27 March 2011 17:01 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Kevin,

I agree on that index as "ideal", however I was encouraging the OP to do some investigating to avoid the possibility of solving problems that don't exist.

http://www.orafaq.com/node/1446This article gives a couple of tips on looking at memory usage, however there are certainly better sources of information out there.

Ross Leishman
Re: Performance issue with long running query [message #501289 is a reply to message #500825] Mon, 28 March 2011 07:28 Go to previous messageGo to next message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
Thanks Ross. I had assumed that the OP was going to do the query you provided in order to get additional information about their situation. I sure hope that is their plan.

Kevin
Re: Performance issue with long running query [message #501352 is a reply to message #500825] Mon, 28 March 2011 12:43 Go to previous messageGo to next message
prasadpsk
Messages: 4
Registered: March 2011
Junior Member
Hi,
Thanks all for your suggesstions.
I tried with parallel hint and the performance is good but the problem is there are lot of other jobs also running at the same time and this could slow down the performance of the other jobs.

Have to see any other alternative should resolve this problem.

Thanks
Prasad
Re: Performance issue with long running query [message #501369 is a reply to message #501352] Mon, 28 March 2011 13:07 Go to previous messageGo to next message
Kevin Meade
Messages: 2103
Registered: December 1999
Location: Connecticut USA
Senior Member
OK, but did you do what Ross asked for?
Re: Performance issue with long running query [message #501395 is a reply to message #501369] Mon, 28 March 2011 15:00 Go to previous messageGo to next message
prasadpsk
Messages: 4
Registered: March 2011
Junior Member

The below query is taking 20 mins

SELECT MAX(customer_trx_line_id)
FROM ra_cust_trx_line_gl_dist_all
WHERE last_update_date BETWEEN TO_DATE ('2011-01-28','yyyy-mm-dd HH24:MI:SS') AND TO_DATE ('2011-01-29','yyyy-mm-dd HH24:MI:SS')
Re: Performance issue with long running query [message #501398 is a reply to message #501395] Mon, 28 March 2011 15:06 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
SQL> SET TIME ON
SQL> SELECT COUNT(*) FROM ra_cust_trx_line_gl_dist_all;

COPY SQL above & results then PASTE all back here
Re: Performance issue with long running query [message #501431 is a reply to message #501398] Tue, 29 March 2011 03:16 Go to previous message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Looks like a disk latency issue. Is the database on network disk (SAN/NAS)or local?

You can combat disk latency issues in a few different ways:
- Use faster disk
- Use less disk
- Increase disk utilisation

Faster disk is probably outside of your control.

You can use less disk by decreasing the wastage in blocks where the data YOU WANT is kept.
- Create the index Kevin suggested (best bet) to obviate table row retrieval
- Use table compression
- Reorganise the table to ensure rows for a single day are clustered together in common blocks.

Parallelism will increase disk utilisation.

Ross Leishman
Previous Topic: How to assure partition table is better than non-partition table (2 MERGED)
Next Topic: nested select / instead-of-trigger and views=no index used
Goto Forum:
  


Current Time: Sun Nov 24 20:30:29 CST 2024