Help: use index [message #111314]
Tue, 15 March 2005 10:33
Jadie
We have a big transactions table that is queried daily, based on the transactions of the day.
The query we currently use in production is:
SELECT *
FROM POS_TRANSACTIONS
WHERE TRUNC (transaction_date) = TRUNC (v_s_process_date)
AND NVL (settle_trans_id, 0) = 0
AND void_flag = trans_void
AND transaction_type IN (pur_trans, refl_trans)
AND test_flag = 'N'
ORDER BY merchant_store_id;
No index is used in this query, so I got a "full table scan" in the explain plan.
There are only 4 transaction types in our system, so I created an index on the field transaction_type and changed the query to:
SELECT *
FROM POS_TRANSACTIONS
WHERE transaction_type IN (pur_trans, refl_trans)
AND TRUNC (transaction_date) = TRUNC (v_s_process_date)
AND NVL (settle_trans_id, 0) = 0
AND void_flag = trans_void
AND test_flag = 'N'
ORDER BY merchant_store_id;
I still got the "full table scan" for it. I don't understand why the index I created is not being used, and according to the statistics it even takes more time to process.
Can anyone tell me why? I appreciate it. Thanks
Jadie
Re: Help: use index [message #111517 is a reply to message #111314]
Thu, 17 March 2005 01:21
Frank
You say there are only 4 transaction types. If those are (more or less) evenly spread (each of them makes up about a quarter of the rows), then an index on this column is useless, especially since you want 2 of these types.
This means you want about half of the rows returned (based on transaction type alone). Generally speaking, indexes are useful when they select at most around 20% of the rows.
If there is an index on transaction_date, you disabled it by using TRUNC.
To avoid this, you could rewrite that line to
where transaction_date between trunc(v_s_process_date) and trunc(v_s_process_date + 1) - 1/(24*60*60)
hth
Re: Help: use index [message #111671 is a reply to message #111314]
Fri, 18 March 2005 09:42
smartin
Is this just one table? How many total rows are in the table, and how many rows does the query return?
What is the void_flag = trans_void all about? If it is very selective, then it could be a good candidate for an index.
Same thing with the test_flag.
Do you really need the nvl() on settle_trans_id?
And how did you analyze the table, and did you do it on a regular basis if the data changes? Do you have histograms?
Re: Help: use index [message #111688 is a reply to message #111671]
Fri, 18 March 2005 10:50
Jadie
Thanks for the reply, smartin.
Yes, it is only one table, without partitions at this moment since the table is not that huge yet, but it will grow quickly based on the transaction volume.
void_flag is not a very selective field either; there are two values in this field (0, 1). We want to select the transactions for the day that are not voided (void_flag = 0).
test_flag is the same.
nvl() is not really necessary as long as the app does not make the mistake of populating a NULL value into settle_trans_id, which is supposed to be 0.
I have a job to analyze the tables/indexes every day. No histograms though.
jadie
Re: Help: use index [message #111693 is a reply to message #111517]
Fri, 18 March 2005 11:03
Jadie
Thanks Frank,
There is no index on transaction_date field.
There are 4 transaction types, but about 90% of the rows would be returned for the two types, and then the ones that are not on v_s_process_date are filtered out. A search on transaction_date would return fewer rows for sure (only one day of transactions). Is that the reason the CBO would use transaction_date instead of transaction_type with an index?
I thought a bitmap index on transaction_type would help, since currently it is doing a full table scan!
(half a million rows in the table at this moment)
Jadie
Re: Help: use index [message #111779 is a reply to message #111314]
Sun, 20 March 2005 00:06
smartin
Half a million rows is really pretty trivial. You probably don't want a bitmap index on it, though, because that may kill your transaction performance when putting data into the table in the first place. Plus bitmap indexes bloat very quickly and need rebuilding, and since no column is selective it won't help anyway.
You have a couple of options. One is to tune your system to better support the full table scan, if the query really is run very often. You could mark the table as CACHE to put it at the MRU end of the LRU buffers. Also check your multiblock read count.
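Just as a sketch, assuming the table really is called POS_TRANSACTIONS:

-- keep the table's blocks at the MRU end of the buffer cache on full scans
ALTER TABLE pos_transactions CACHE;

-- check the current multiblock read count (SQL*Plus)
SHOW PARAMETER db_file_multiblock_read_count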
Another option is to try to get better statistics. Make your nightly collection do something along the lines of:
dbms_stats.gather_table_stats(<owner>,<table>,cascade=>true,method_opt=>'for all columns size 250');
Especially if your statistics gathering time isn't limited. And heck, run that in parallel.
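For instance, the nightly job could be something like this (the degree of 4 is just a guess; pick whatever your hardware can handle):

BEGIN
  -- cascade collects index stats too; method_opt builds histograms on all columns
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,               -- or the schema that owns the table
    tabname    => 'POS_TRANSACTIONS',
    cascade    => TRUE,
    method_opt => 'FOR ALL COLUMNS SIZE 250',
    degree     => 4);                 -- gather in parallel
END;
/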
Another option is to partition the table by month on that date field, especially if you expect the table to grow actively. That will cause your daily full-table-scan query to scan only one month instead of all of the data in the table.
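A rough sketch of the DDL (column list and datatypes are guesses, partition names made up; add one partition per month you need):

CREATE TABLE pos_transactions (
  transaction_id    NUMBER,
  transaction_date  DATE,
  transaction_type  VARCHAR2(10),
  merchant_store_id NUMBER
  -- ... rest of your columns
)
PARTITION BY RANGE (transaction_date) (
  PARTITION p_2005_03 VALUES LESS THAN (TO_DATE('2005-04-01', 'YYYY-MM-DD')),
  PARTITION p_2005_04 VALUES LESS THAN (TO_DATE('2005-05-01', 'YYYY-MM-DD')),
  PARTITION p_max     VALUES LESS THAN (MAXVALUE)
);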
And another option is to build a function-based index on trunc(transaction_date). This will slow down inserts slightly, but if there are relatively few rows you are getting each day compared to the total in your table, then it will probably help a lot.
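Roughly (the index name is just a placeholder):

-- lets the TRUNC(transaction_date) predicate use an index;
-- depending on your version, QUERY_REWRITE_ENABLED = TRUE (and the QUERY REWRITE privilege) may be needed
CREATE INDEX ptr_trunc_trandate_ind ON pos_transactions (TRUNC(transaction_date));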
I'm curious how this goes, so post back and let us know. Also, how long does the query take now? And in sqlplus, when you enter:
set autotrace traceonly explain statistics
Copy and paste all of the results of that. Do it before and after the function-based index on the date, and take off any other indexes.
Re: Help: use index [message #111780 is a reply to message #111314]
Sun, 20 March 2005 00:08
smartin
Oh, and get rid of the nvl and make settle_trans_id NOT NULL so that the app CAN'T accidentally put a null value in it.
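Something like this (only works once there are no NULLs left in the column):

-- default new rows to 0 and forbid NULLs so nvl() is no longer needed in the query
UPDATE pos_transactions SET settle_trans_id = 0 WHERE settle_trans_id IS NULL;
ALTER TABLE pos_transactions MODIFY (settle_trans_id DEFAULT 0 NOT NULL);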
Re: Help: use index [message #111833 is a reply to message #111780]
Mon, 21 March 2005 04:32
_simma_dba
Try using the optimizer hint /*+ INDEX (table_name the_name_of_the_index) */ just after the SELECT. You have a column with low cardinality (only 4 values), so the CBO thinks it is better to use a table scan rather than an index scan. Mahesh is right that an index scan is not always the best solution. Try to create an index on other columns in your where clause.
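For example (PTR_TRANTYPE_IND is just a placeholder for whatever you named the index on transaction_type):

SELECT /*+ INDEX (POS_TRANSACTIONS PTR_TRANTYPE_IND) */ *
FROM POS_TRANSACTIONS
WHERE transaction_type IN (pur_trans, refl_trans)
AND TRUNC (transaction_date) = TRUNC (v_s_process_date)
AND NVL (settle_trans_id, 0) = 0
AND void_flag = trans_void
AND test_flag = 'N'
ORDER BY merchant_store_id;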
Re: Help: use index [message #115590 is a reply to message #111779]
Tue, 12 April 2005 16:03
Jadie
Hi Smartin,
Can you take a look at the following to see which one is the better choice? I don't really understand it well.
Option 1:
I created the partitions without the function-based index on trunc(transaction_date). The explain plan is:
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=912 Card=1 Bytes=138)
   1    0   SORT (ORDER BY) (Cost=912 Card=1 Bytes=138)
   2    1     PARTITION RANGE (ALL)
   3    2       TABLE ACCESS (FULL) OF 'POS_TRANSACTIONS' (Cost=910 Card=1 Bytes=138)
Option 2:
I partitioned the table and created the function-based index on trunc(transaction_date). The explain plan is:
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=138)
   1    0   SORT (ORDER BY) (Cost=4 Card=1 Bytes=138)
   2    1     TABLE ACCESS (BY GLOBAL INDEX ROWID) OF 'POS_TRANSACTIONS' (Cost=2 Card=1 Bytes=138)
   3    2       INDEX (RANGE SCAN) OF 'PTR_TRANDATE_IND' (NON-UNIQUE) (Cost=1 Card=1232)
This is a testing environment with partial prod data. I am not able to change the application for performance tuning at this moment.
Many thanks.
Jadie
Re: Help: use index [message #115593 is a reply to message #111314]
Tue, 12 April 2005 16:39
smartin
As long as the index doesn't slow your inserts and any other DML down too much, and you have the space for it, then you might as well have it. "Too much" is defined by your specific requirements and system response time. Indexes are a trade-off: you are attempting to speed up queries at the cost of a performance hit on the DML that maintains the index.
Also if you run that again, include the statistics option in there to show the actual stats of how many logical reads and such happened to run the query.
With half a million rows, partitioning is probably overkill, but if that table will grow quickly to be big then it may be a good idea. And if your queries always include the field that you partition by, then the partitions act somewhat like indexes.
All the partitions do is break the table into chunks so that operations have a chance to be done on a single chunk instead of the whole table at once.
Also to really get a feel for how partition elimination works, run your same query and do an explain plan for it. That will tell you exactly which partitions each operation in the plan is operating on.
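For instance, assuming you are on 9iR2 or later so dbms_xplan is available (on older versions, utlxpls.sql does the same job), something like:

EXPLAIN PLAN FOR
SELECT *
FROM POS_TRANSACTIONS
WHERE transaction_date >= TRUNC(SYSDATE)        -- SYSDATE only as an example; substitute your process date
AND transaction_date < TRUNC(SYSDATE) + 1;

-- the Pstart/Pstop columns show which partitions each step actually touches
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);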
The only way to know for sure what is the best approach is to test it in full scale, with representative data load. You can "sand box" it with a small sample of data, but one way or the other you will be testing it against a full set of data (either in a QA/TEST db that you have copied directly from production, or production itself when you implement).
Re: Help: use index [message #115706 is a reply to message #111314]
Wed, 13 April 2005 15:00
Jadie
Eventually I finished the testing for this performance tuning.
I chose range partitioning for this table, with no function-based index; the reason is that there are lots of inserts/updates against this table.
And I changed the query from
SELECT *
FROM POS_TRANSACTIONS
WHERE TRUNC(transaction_date) = TRUNC(v_s_process_date)
AND NVL (settle_trans_id, 0) = 0
AND void_flag IN (trans_void, trans2_void)
AND transaction_type IN (pur_trans, refl_trans)
AND test_flag = 'N'
ORDER BY merchant_store_id;
To
SELECT *
FROM POS_TRANSACTIONS
WHERE transaction_date >= TRUNC(v_s_process_date)
AND transaction_date < TRUNC(v_s_process_date+1)
AND NVL (settle_trans_id, 0) = 0
AND void_flag IN (trans_void, trans2_void)
AND transaction_type IN (pur_trans, refl_trans)
AND test_flag = 'N'
ORDER BY merchant_store_id;
The response time improved greatly!
Thanks.
It is wonderful!
J
Re: Help: use index [message #115712 is a reply to message #111314]
Wed, 13 April 2005 16:24
Jadie
I found a problem when I did that.
I tried to use the query:
SELECT *
FROM POS_TRANSACTIONS
WHERE transaction_date BETWEEN TRUNC(v_s_process_date) AND (TRUNC(v_s_process_date+1)-1/24/60/60)
AND NVL (settle_trans_id, 0) = 0
AND void_flag IN (trans_void, trans2_void)
AND transaction_type IN (pur_trans, refl_trans)
AND test_flag = 'N'
ORDER BY merchant_store_id;
It is terribly slow. I thought this query was the same as
SELECT *
FROM POS_TRANSACTIONS
WHERE transaction_date >= TRUNC(v_s_process_date)
AND transaction_date < TRUNC(v_s_process_date+1)
AND NVL (settle_trans_id, 0) = 0
AND void_flag IN (trans_void, trans2_void)
AND transaction_type IN (pur_trans, refl_trans)
AND test_flag = 'N'
ORDER BY merchant_store_id;
Can you tell me what makes the difference?
J
Re: Help: use index [message #115815 is a reply to message #111314]
Thu, 14 April 2005 09:55
smartin
What exactly are you trying to query, in English, with regard to the transaction date and the process date? Why are you subtracting a second?
When exactly is your query run? How often?
Would it make sense for you to capture the transaction date as just the date without the time portion? Do you need the time portion in the transaction date in the first place? If you don't, you could avoid the trunc altogether.
Even if you do, you could set up two fields, one for the date and one for the date and time. Just something to consider.
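If you go that route, something along these lines would keep the extra column in sync without touching the app (the column and trigger names are just made up):

ALTER TABLE pos_transactions ADD (transaction_day DATE);

CREATE OR REPLACE TRIGGER pos_trans_day_trg
  BEFORE INSERT OR UPDATE OF transaction_date ON pos_transactions
  FOR EACH ROW
BEGIN
  -- keep a time-stripped copy of the date so queries can use a plain equality
  :NEW.transaction_day := TRUNC(:NEW.transaction_date);
END;
/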
Also, for each query you run and want to compare, do the following, at a minimum, in sqlplus and capture the output, and post it here (and examine it yourself):
set timing on
set autotrace traceonly explain statistics
query version 1
query version 2
set autotrace off
set timing off
That will show you what is really happening for each query. There are some other things you can do to learn more, like the explain plan statement to show partition elimination, and tkprof, but this would be a good start.
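If you do try tkprof later, the rough steps are (the trace file name depends on your instance and user_dump_dest):

ALTER SESSION SET sql_trace = TRUE;
-- run the query here
ALTER SESSION SET sql_trace = FALSE;
-- then, on the server: tkprof <trace file from user_dump_dest> report.txt sys=no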
And even if you are doing lots of DML on the table, an index may still be a good idea; it just depends on what exactly it does to the response time of that DML and whether that difference matters to your users. If the time goes from .00001 to .00002 seconds, and nobody notices and your concurrent users are fine, then there is no harm in the index. It is all very specific to your situation.