Home » SQL & PL/SQL » SQL & PL/SQL » SQL Query re-write assistance requested (Oracle 9i)
SQL Query re-write assistance requested [message #326498] |
Wed, 11 June 2008 15:34 |
svguerin3
Messages: 44 Registered: November 2005 Location: TX
|
Member |
|
|
I am needing to know the best way to go about re-writing a bad query at my workplace that needs to run at TOP performance due to very large amounts of data that will be run through it.
I have simplified the data and table to the following mock example:
ID SEQ_N IND
1 1 Y
1 2 N
2 1 N
2 2 N
3 1 N
I need a query that returns each ID, as well as the maximum SEQ_N for each ID. The tricky part is that if there is a "Y" in the indicator (IND) column, then use the maximum SEQ_N for those. If there is no "Y" indicator, then use the maximum SEQ_N for the "N" indicator column. So given the above example data, the query executed would return the following results:
ID MAX(SEQ_N)
1 1
2 2
3 1
I hope that makes sense.. The query we have running currently creates a string and gets the maximum from that. The inner part of the query looks like the following:
select id, max(decode(IND,'Y','1',0') || to_char(seq_n,'0000000009')) from table
And the outer query selects from this (using a substring and to_number) to get the sequence number back.. Very expensive; there must be a better way!
Thanks in advance!
|
|
|
|
|
Re: SQL Query re-write assistance requested [message #326589 is a reply to message #326513] |
Thu, 12 June 2008 01:30 |
sarwagya
Messages: 87 Registered: February 2008 Location: Republic of Nepal
|
Member |
|
|
Please try with this query. I tried it with few rows.
Let's hope it works for your data too.
CREATE TABLE test_table(id number(2),seq_n number(2),ind char(1));
INSERT INTO test_table values(1,1,'Y');
INSERT INTO test_table values(1,3,'Y');
INSERT INTO test_table values(1,2,'N');
INSERT INTO test_table values(2,1,'N');
INSERT INTO test_table values(2,2,'N');
INSERT INTO test_table values(3,1,'N');
INSERT INTO test_table values(3,2,'Y');
INSERT INTO test_table values(3,3,'Y');
INSERT INTO test_table values(3,4,'N');
QUERY:
WITH DATA AS (
SELECT ID,MAX(SEQ_N) M_SEQ_Y, NULL M_SEQ_N
FROM TEST_TABLE
WHERE IND = 'Y'
AND SEQ_N IS NOT NULL
GROUP BY ID
UNION
SELECT ID,NULL M_SEQ_Y,MAX(SEQ_N) M_SEQ_N
FROM TEST_TABLE
WHERE IND = 'N'
AND SEQ_N IS NOT NULL
GROUP BY ID
)SELECT ID,M_SEQ_Y MAX_SEQ
FROM DATA
WHERE M_SEQ_Y IS NOT NULL
UNION
SELECT ID,M_SEQ_N MAX_SEQ
FROM DATA OUTER
WHERE M_SEQ_Y IS NULL
AND ID NOT IN (SELECT ID FROM DATA WHERE M_SEQ_Y IS NOT NULL AND DATA.ID = OUTER.ID);
OUTPUT:
ID MAX_SEQ
----- -------
1 3
2 2
3 3
[Updated on: Thu, 12 June 2008 01:47] Report message to a moderator
|
|
|
|
Re: SQL Query re-write assistance requested [message #326652 is a reply to message #326602] |
Thu, 12 June 2008 02:59 |
JRowbottom
Messages: 5933 Registered: June 2006 Location: Sunny North Yorkshire, ho...
|
Senior Member |
|
|
Using the previous posts test setup, would this help:select id,max_seq
from (select id
,last_value(seq_n) over (partition by id order by ind,seq_n rows between unbounded preceding and unbounded following) max_seq
,row_number() over (partition by id order by null) rnum
from test_table)
where rnum = 1;
|
|
|
|
Re: SQL Query re-write assistance requested [message #326666 is a reply to message #326662] |
Thu, 12 June 2008 03:47 |
JRowbottom
Messages: 5933 Registered: June 2006 Location: Sunny North Yorkshire, ho...
|
Senior Member |
|
|
They are caled
Analytic Functions
Put simply, they work out a value for a function based on a subset of the rows returned by the query.
There are 4 main parts to them
1) The function. This is the function that operates on the set of rows that you define. These are listed at the bottom of the link I gave above. LAST_VALUE returns the last value in an ordered set of rows. Row_Number takes an ordered set of rows, and gives them a number from 1..n
2) Partition Clause. This allows you to break down the total set of rows returned by the query into discrete subsets.
The function ROW_NUMBER() OVER (PARTION BY id ...) simply takes each set of rows that share the same ID, and gives each of them a unique row number.
3) Order By clause. This determines which order the rows are to be passed to the function in.
The function LAST_VALUE(seq_n) OVER (ORDER BY ind,seq_n) orders the set of rows so that all the rows with IND='N' come before the rows with IND='Y', and the rows with the highest Seq_n wll come last. The LAST_VALUE functino then picks the last value from this list.
4) Rows Between clause. This determines which set of rows the whole analytic function will operate on. The default is Unbounded Preceeding to Current Row, which means that the function will operate on all the current row and all the rows that were retrieved earlier.
[Mod-edit: Frank added missing bracket in URL-definition.]
[Updated on: Thu, 12 June 2008 04:44] by Moderator Report message to a moderator
|
|
|
|
Re: SQL Query re-write assistance requested [message #326680 is a reply to message #326498] |
Thu, 12 June 2008 04:33 |
S.Rajaram
Messages: 1027 Registered: October 2006 Location: United Kingdom
|
Senior Member |
|
|
Another way of doing it.
SQL> l
1 with t
2 as
3 (
4 select 1 id, 1 seq_n, 'Y' indicator from dual union all
5 select 1, 2, 'N' from dual union all
6 select 2, 1, 'N' from dual union all
7 select 2, 2, 'N' from dual union all
8 select 3, 1, 'N' from dual
9 )
10 select id, max(seq_n)
11 keep(dense_rank first order by decode(indicator,'Y',1,2)) max_seq
12* from t group by id
SQL> /
ID MAX_SEQ
---------- ----------
1 1
2 2
3 1
SQL> CREATE TABLE test_table(id number(2),seq_n number(2),ind char(1));
Table created.
SQL> INSERT INTO test_table values(1,1,'Y');
1 row created.
SQL> INSERT INTO test_table values(1,3,'Y');
1 row created.
SQL> INSERT INTO test_table values(1,2,'N');
1 row created.
SQL> INSERT INTO test_table values(2,1,'N');
1 row created.
SQL> INSERT INTO test_table values(2,2,'N');
1 row created.
SQL> INSERT INTO test_table values(3,1,'N');
1 row created.
SQL> INSERT INTO test_table values(3,2,'Y');
1 row created.
SQL> INSERT INTO test_table values(3,3,'Y');
1 row created.
SQL> INSERT INTO test_table values(3,4,'N');
1 row created.
SQL> commit;
Commit complete.
SQL> select * from test_table;
ID SEQ_N I
---------- ---------- -
1 1 Y
1 3 Y
1 2 N
2 1 N
2 2 N
3 1 N
3 2 Y
3 3 Y
3 4 N
9 rows selected.
SQL> l
1 select id, max(seq_n)
2 keep(dense_rank first order by decode(ind,'Y',1,2)) max_seq
3* from test_table group by id
SQL> /
ID MAX_SEQ
---------- ----------
1 3
2 2
3 3
Regards
Raj
|
|
|
Re: SQL Query re-write assistance requested [message #326809 is a reply to message #326680] |
Thu, 12 June 2008 12:06 |
svguerin3
Messages: 44 Registered: November 2005 Location: TX
|
Member |
|
|
Thank you all for the great responses!! I have gone through them, as well as several others from Oracle forums, and formed 3 options for this query (below is what they actually look like). I ran tests of about 500k rows through each one, and the timings were about the same (~40-50s per query). I'm not very good at reading explain plans, so I'm a bit lost at which one to choose. Please let me know what you guys think the best choice should be (regarding performance above all else). If more info about indexes is needed, I can gladly supply it.
I will engage the DBA's on our side to analyze and apply hints to the resulting query to help speed it up even further. Thank you all for your help, this seemingly small change is huge for us!
/* OPTION 1 - 39s */
select co.ord_id,max(co.seq_n) keep(dense_rank first order by co.actv_chan_i desc,co.seq_n desc) seq_n
from t_fbsco_chan_ord co, t_fbsos_offr_stg os
where co.ord_id = os.ord_id
and os.run_id = 98889
group by co.ord_id
/* OPTION 2 - 38.5s */
select
co.ord_id, max(co.seq_n) keep(dense_rank first order by decode(co.actv_chan_i,'Y',1,2)) max_seq
from t_fbsco_chan_ord co, t_fbsos_offr_stg os
where co.ord_id = os.ord_id
and os.run_id = 98889
group by co.ord_id
/* OPTION 3 - */
select ord_id,max_seq
from (select co.ord_id
,last_value(co.seq_n) over (partition by co.ord_id
order by co.actv_chan_i,co.seq_n rows between unbounded preceding and unbounded following) max_seq
,row_number() over (partition by co.ord_id order by null) rnum
from t_fbsco_chan_ord co, t_fbsos_offr_stg os
where co.ord_id = os.ord_id
and os.run_id = 98889 )
where rnum = 1;
|
|
|
Re: SQL Query re-write assistance requested [message #327178 is a reply to message #326809] |
Sat, 14 June 2008 09:13 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
All of these queries do the same thing:- A full scan of the table
- A sort
...which is exactly what the ORIGINAL query was doing.
I would be surprised if ANY of these was a material improvement.
The only way you are going to get a real improvement - short of restructuring your data model or hardware - is to eliminate either the full table scan or the sort.
How many rows per ID are there typically?
If it is more than 10, then you can probably avoid both the full table scan AND the sort.
Create an index on (ID, IND, SEQ_N) in that order and then use DBMS_STATS.GATHER_TABLE_STATS to gather statistics on the table AND the index.
Then create a BITMAP INDEX on just (ID). Gather stats for that one too.
Then try this query:
SELECT DISTINCT id
, NVL(
( SELECT max(seq_n)
FROM tbl
WHERE id = t.id
AND ind = 'Y')
,( SELECT max(seq_n)
FROM tbl
WHERE id = t.id
AND ind = 'N')
)
FROM tbl t
CBO should go to the BITMAP index for the distinct list of IDs - that should be lightning fast. Then for each ID, it will pick the MAX(seq_n) for either Y or N as required from the index without scanning all rows for that ID.
If you have lots more than 10 rows per ID, this should work fine - otherwise it will fail miserably. Post the explain plan if you have lots of rows per ID and it is still not working.
If you have only a few rows per ID, then you cannot (and should not) avoid the full scan. But perhaps you can avoid the sort.
If your table is very large (say, millions of rows) then Oracle will not be able to perform the sort in memory - this generates disk IO. You can help it out by storing a sorted copy as an index.
There's a few way of doing that, but probably the simplest is to keep your ORIGINAL query, and create a function-based index on:
( ID, decode(IND,'Y','1',0') || to_char(SEQ_N,'0000000009') )
Oracle should then use the index to sort the GROUP BY. Look for a GROUP BY (NOSORT) in the explain plan.
Ross Leishman
|
|
|
|
Re: SQL Query re-write assistance requested [message #327565 is a reply to message #327197] |
Mon, 16 June 2008 22:53 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
Yes, it's worth a try. Remember, you are looking for a NOSORT keyword in your explain plan. I haven't tried exploiting function-based indexes for sorting, but it should work. There was a thread here a while ago where they were having difficulty using a function-based index to support an ORDER BY DESC, so I don't know how you'll go.
Ross Leishman
|
|
|
Goto Forum:
Current Time: Thu Dec 26 18:24:45 CST 2024
|