Re: Improving query performance further
Date: Thu, 13 Oct 2022 01:04:59 +0530
Message-ID: <CAEzWdqc0C-Bfq4yCeqOCzV9uLHe+sA1vBurxVCnpDs1165XXVA_at_mail.gmail.com>
I tried on dev by copying some sample data. Compressing the index with COMPRESS 1 (i.e. on part_col) only reduces the consistent gets from ~82 to ~79, and the same effect is observed if I move the column part_col to the end of the primary key, i.e. index on (min_val, max_val, part_col), or even just index (min_val, max_val). So it seems compression will not help to a great extent in minimizing the IO or response time.
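For reference, this is roughly how the dev test was set up (a sketch; the copy-table and index names are hypothetical):

create table t_test as select * from table1;           -- dev copy of the sample data
create index t_test_ix on t_test (part_col, min_val, max_value)
  compress 1;                                          -- prefix-compress the repeating part_col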
Also, looking at the data pattern, the input bind value varies a lot. A few of the incoming binds are closer to the MAX_VAL column value, and in those cases an index on (MAX_VAL, MIN_VAL) will help; however, many input binds are closer to the MIN_VAL column value, so they will suffer with that index order, and for those the reverse order, i.e. (MIN_VAL, MAX_VAL), is suitable. There seems to be no easy choice here.
Also, for the majority of the values, the first 6 digits are the same for columns max_val and min_val, but I do see a few cases where they are not.
Below are the statistics of two executions I ran manually for two sample bind variables. As I mentioned, the query yields a single row only ~10% of the time and zero rows the rest of the time, so both executions below returned zero rows for the selected input binds.
Plan hash value: 692467662
| Id | Operation                                  | Name      | Starts | E-Rows | A-Rows | A-Time      | Buffers |
|  0 | SELECT STATEMENT                           |           |      1 |        |      0 | 00:00:00.01 |     169 |
|  1 |  PARTITION RANGE SINGLE                    |           |      1 |  20548 |      0 | 00:00:00.01 |     169 |
|  2 |   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| TABLE1    |      1 |  20548 |      0 | 00:00:00.01 |     169 |
|* 3 |    INDEX RANGE SCAN                        | PK_TABLE1 |      1 |  20548 |      0 | 00:00:00.01 |     169 |
Predicate Information (identified by operation id):
3 - access("PART_COL"=:B2 AND "MAX_VAL">=:B1 AND "MIN_VAL"<=:B1) filter("MAX_VAL">=:B1)
Plan hash value: 692467662
| Id | Operation                                  | Name      | Starts | E-Rows | A-Rows | A-Time      | Buffers | Reads |
|  0 | SELECT STATEMENT                           |           |      1 |        |      0 | 00:00:01.99 |      37 |    37 |
|  1 |  PARTITION RANGE SINGLE                    |           |      1 |  20548 |      0 | 00:00:01.99 |      37 |    37 |
|  2 |   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| TABLE1    |      1 |  20548 |      0 | 00:00:01.99 |      37 |    37 |
|* 3 |    INDEX RANGE SCAN                        | PK_TABLE1 |      1 |  20548 |      0 | 00:00:01.99 |      37 |    37 |
Predicate Information (identified by operation id):
3 - access("PART_COL"=:B2 AND "MAX_VAL">=:B1 AND "MIN_VAL"<=:B1) filter("MAX_VAL">=:B1)
On Wed, Oct 12, 2022 at 6:18 PM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:
>
> The table has three partitions, so it would be better to run the queries
> for each partition in turn. Because of the bind variables the optimizer
> might still use the global stats to estimate cost and/or cardinality, but
> there might be something about the partition-level stats that affect the
> choice of strategy (e.g. the max(max_val - min_val) might be very large
> only in one partition out of three, and that partition may have only a very
> small number of rows).
>
> You appear to be storing a 19 digit number as a varchar2(4000) - it's a
> bad idea (generally) to store numbers in character columns, and it's a bad
> idea to define character columns with lengths that are vastly longer than
> needed. The latter may not matter very much in this case with a tiny
> table, but ask Sayan to point you to his blog notes on the effects of row
> definitions being larger than the blocksize.
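>
> For example (a sketch; the column names come from the thread, and Oracle
> will only allow shrinking a VARCHAR2 if every existing value fits the new
> length):
>
>   alter table table1 modify (min_val varchar2(19), max_value varchar2(19));
>   -- or, if the values really are numbers, a NUMBER(19) column would be the
>   -- more natural datatype for a rebuilt table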
>
> As Sayan said - can you give us some idea of how the data is treated - for
> example, does a row appear in one partition and then move from one partition
> to the next over time; is there something about the data that makes the
> first 6 digits of the values significant; are rows constantly updated on
> these columns, or is there a constant trickle of inserts with a balancing
> number of deletes etc. The right strategy is often dictated by the data
> patterns and activity.
>
>
> Which version of Oracle, by the way?
>
> Regards
> Jonathan Lewis
>
>
>
> On Wed, 12 Oct 2022 at 05:30, yudhi s <learnerdatabase99_at_gmail.com> wrote:
>
>> Thank you so much.
>>
>> There are only three distinct values for the partition key
>> column PART_COL, and thus a total of three partitions exist on this table,
>> each partition holding an equal number of rows, say ~150K each. On
>> average, this query returns zero rows in almost 9 out of 10 executions.
>>
>> I will try to compare the behaviour of an IOT vs the KEEP pool vs domain
>> indexes in one of the lower environments. I have heard domain indexes can
>> be helpful for text-column searches with the LIKE operator etc., but I
>> can't immediately see how a domain index will help with the range/between
>> predicate here.
>>
>> _at_Sayan, Below is the output of the query :-
>>
>> M0  M2          M1               M3                   M7                   M5                   M8
>> 19  9999999999  109999999999999  2226470000000000000  5599860000000000000  2226470099999999999  5599869999999999999
>>
>> The statistics on this table are not up to date, but below is how they
>> look, so pattern-wise it's as below.
>>
>> COLUMN_NAME  NUM_DISTINCT  LOW_VALUE                               HIGH_VALUE                              DENSITY                 NUM_NULLS
>> PART_COL     3             80                                      C103                                    0.0000014697539337964   0
>> LOW_VAL      114992        32323236343730303030303030303030303030  35353939383630303030303030303030303030  0.00000869625713093085  0
>> HIGH_VALUE   115792        32323236343730303939393939393939393939  35353939383639393939393939393939393939  0.00000863617521072268  0
>>
>> NUM_ROWS  BLOCKS  AVG_ROW_LEN
>> 248942    3776    96
>>
>>
>>
>>
>> On Wed, Oct 12, 2022 at 6:41 AM Sayan Malakshinov <xt.and.r_at_gmail.com>
>> wrote:
>>
>>> Hi Andy
>>>
>>> Oh, yes, I also had a couple of examples of my own ODCI indexes for such
>>> cases, but it would be great to see your variant!
>>>
>>>
>>> Best regards,
>>> Sayan Malakshinov
>>> Oracle performance tuning expert
>>> Oracle Database Developer Choice Award winner
>>> Oracle ACE
>>> http://orasql.org
>>>
>>> On Wed, 12 Oct 2022, 02:05 Andy Sayer, <andysayer_at_gmail.com> wrote:
>>>
>>>> I have a demo for a domain index which allows this sort of query to be
>>>> very fast. Essentially it creates an index entry for each value (with your
>>>> desired granularity) between your min and max columns and does lookups
>>>> based on that. The domain index knows what the granularity is, so as well as
>>>> :bind between x and y it queries new_domain_column between :bind -
>>>> granularity and :bind + granularity.
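>>>>
>>>> The same idea can be approximated without a custom ODCI index (a rough
>>>> sketch with hypothetical names, numeric min/max values, and a granularity
>>>> of 100; a real domain index would maintain this transparently):
>>>>
>>>> create table t1_granules as
>>>> select t.rowid as src_rowid,   -- note: rowids go stale if the table is moved
>>>>        trunc(t.min_val/100)*100 + (g.n - 1)*100 as granule
>>>> from   table1 t,
>>>>        lateral (select level as n from dual
>>>>                 connect by level <= trunc(t.max_val/100) - trunc(t.min_val/100) + 1) g;
>>>>
>>>> create index t1_granules_ix on t1_granules (granule);
>>>>
>>>> -- probe a single granule, then re-check the exact range:
>>>> select t.*
>>>> from   t1_granules g
>>>> join   table1 t on t.rowid = g.src_rowid
>>>> where  g.granule = trunc(:b1/100)*100
>>>> and    :b1 between t.min_val and t.max_val;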
>>>>
>>>> Unfortunately, the demo is on the pc that I left in the UK when I moved
>>>> to Canada. This might be enough information for you to get going with,
>>>> otherwise I was planning on publishing my work when I visit the UK in
>>>> December.
>>>>
>>>> The other, easier but less guaranteed, way of combining two range
>>>> predicates on different columns is to partition by one of the
>>>> columns. E.g. your index would be on (some_key, start_date) and your
>>>> partitioning would be on end_date. Oracle would then need to do a range
>>>> scan using your some_key and start_date filters as access predicates once
>>>> per partition which matches your end_date filter. You'd have to decide
>>>> which column is the partition key and how wide to make the partitions, and
>>>> you might not gain a lot.
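>>>>
>>>> One way to realize that (a sketch with hypothetical names; since a global
>>>> index can only be partitioned on a prefix of its own columns, the usual
>>>> form is a local index on a table partitioned by end_date):
>>>>
>>>> create table t (
>>>>   some_key   number,
>>>>   start_date date,
>>>>   end_date   date
>>>> )
>>>> partition by range (end_date)
>>>> interval (numtoyminterval(1, 'MONTH'))
>>>> (partition p0 values less than (date '2020-01-01'));
>>>>
>>>> create index t_ix on t (some_key, start_date) local;
>>>>
>>>> -- end_date >= :x prunes partitions; the local index is then range-scanned
>>>> -- once per surviving partition using (some_key, start_date):
>>>> select * from t
>>>> where  some_key = :k
>>>> and    start_date <= :x
>>>> and    end_date >= :x;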
>>>>
>>>> Thanks,
>>>> Andy
>>>>
>>>> On Tue, Oct 11, 2022 at 5:38 PM, Sayan Malakshinov <xt.and.r_at_gmail.com>
>>>> wrote:
>>>>
>>>>> I tuned a lot of similar queries: tried different methods (even
>>>>> spatial RTree indexes) and made a lot of tests, so I'll try to describe and
>>>>> summarize them all later, when I have more time, but for now just in short:
>>>>>
>>>>> Let's simplify it just to:
>>>>> - table T(A,B)
>>>>> - index (A,B)
>>>>> - select * from T where :X between A and B
>>>>> - ":X between A and B" is very selective
>>>>>
>>>>> The main problem in this case is that you need to check all index
>>>>> entries with A<=:X, for example you have:
>>>>>
>>>>> A | B
>>>>> --------
>>>>> 1 1
>>>>> 1 2
>>>>> 1 3
>>>>> .....
>>>>> 1 100
>>>>> 2 107
>>>>> 2 111
>>>>> ....
>>>>> 2 215
>>>>> 3 204
>>>>> 3 206
>>>>> ....
>>>>> 3 299
>>>>> ...
>>>>> 998 99799
>>>>> 999 99801
>>>>> ...
>>>>> 999 99900
>>>>>
>>>>> where each A has about 100 different B values,
>>>>> and your :X = 700, so, though there are just about 100 rows satisfying
>>>>> both conditions, an INDEX RANGE SCAN with access predicate A<=:X will scan
>>>>> 700*100 index entries, and filter out 699*100 of them by the filter
>>>>> predicate (B>=:X).
>>>>>
>>>>> So we have just a few different situations and key points:
>>>>>
>>>>> ===================================================================================
>>>>>
>>>>> *0. There is a limitation on max(B-A) and it's quite small, ie
>>>>> max(B[x]-A[x]) << max(B) - min(A)*
>>>>> In this case, you can simply add a couple of additional predicates:
>>>>> select *
>>>>> from T
>>>>> where A between :X-:MAX_DIFF and :X
>>>>> and B between :X and :X+:MAX_DIFF
>>>>> ;
>>>>> So you'll get a short search range for A (ACCESS predicates for the
>>>>> first column of the index).
>>>>> //For example, it may be short intervals or start/end dates of short
>>>>> time events, with check constraints like (b-a<N)
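>>>>>
>>>>> For instance (a sketch, with N = 100), such a constraint both documents
>>>>> and enforces the assumption behind :MAX_DIFF:
>>>>>
>>>>> alter table T add constraint t_width_chk check (B - A <= 100);
>>>>> -- then it is safe to bind :MAX_DIFF := 100 in the query above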
>>>>>
>>>>> ===================================================================================
>>>>>
>>>>>
>>>>> *1. There is a perfect positive correlation between A and B, ie if
>>>>> a2>a1, must be b2>b1*
>>>>> -----------------------------------------------------------------------------------
>>>>>
>>>>> *1.1 There is no intersection of any 2 intervals (Ax - Bx), (Ay - By)*
>>>>> This means that for any :X you need to find at most 1 row, so you can
>>>>> stop scanning the index after the first row found.
>>>>> We can rewrite our query for that like this:
>>>>>
>>>>> select *
>>>>> from
>>>>> (select * from T where A<=:X order by A desc fetch first 1 row only)
>>>>> -- to force IRS DESCENDING (IRS - index range scan) and stop after 1 row
>>>>> where
>>>>> B>=:X -- our second predicate is one level up, so we check just 1
>>>>> row maximum
>>>>>
>>>>> -- IIRC there were some problems with IRS DESC in case of "fetch
>>>>> first", so I'll rewrite it to make it more reliable:
>>>>> select *
>>>>> from
>>>>> (select *
>>>>> from
>>>>> (select * from T where A<=:X order by A desc) -- to force IRS
>>>>> DESCENDING
>>>>> where rownum=1) -- and stop after 1 row
>>>>> where
>>>>> B>=:X -- our second predicate
>>>>>
>>>>>
>>>>> -----------------------------------------------------------------------------------
>>>>>
>>>>> *1.2 There may be intersections:*
>>>>> This means that you can't stop after
>>>>> the first found row satisfying our conditions. We need to find all of them.
>>>>> Unfortunately, there is no documented method to make a dynamic SQL
>>>>> condition like "please stop, I don't need other rows", but we can simply
>>>>> use PL/SQL for that:
>>>>>
>>>>> declare
>>>>>   X number := 34;
>>>>>
>>>>>   cursor c_1(p number) is
>>>>>     select * from T where A<=p order by A desc;
>>>>>   R c_1%rowtype;
>>>>> begin
>>>>>   open c_1(X);
>>>>>   <<LL>>
>>>>>   loop
>>>>>     fetch c_1 into R;
>>>>>     exit LL when c_1%notfound; -- otherwise the last fetched row loops forever
>>>>>     if R.B>=X then
>>>>>       dbms_output.put_line(R.A ||' '||R.B); -- or pipe row, or collect
>>>>>       -- ROWIDs and return them as a collection, etc
>>>>>     else
>>>>>       exit LL;
>>>>>     end if;
>>>>>   end loop;
>>>>>   close c_1;
>>>>> end;
>>>>> /
>>>>>
>>>>> As you can see, we stop fetching cursor C_1 when R.B becomes lower
>>>>> than X.
>>>>> Of course, to make it more convenient you can wrap it in a pipelined
>>>>> pl/sql table function (a sketch below) or even use it with INLINE pl/sql functions.
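>>>>>
>>>>> A minimal sketch of such a pipelined wrapper (hypothetical names,
>>>>> assuming T(A number, B number) as above):
>>>>>
>>>>> create or replace type t_ab as object (a number, b number);
>>>>> /
>>>>> create or replace type t_ab_tab as table of t_ab;
>>>>> /
>>>>> create or replace function find_covering(p_x number)
>>>>>   return t_ab_tab pipelined
>>>>> is
>>>>> begin
>>>>>   for r in (select a, b from T where a <= p_x order by a desc) loop
>>>>>     exit when r.b < p_x; -- the correlation guarantees no more matches
>>>>>     pipe row (t_ab(r.a, r.b));
>>>>>   end loop;
>>>>>   return;
>>>>> end;
>>>>> /
>>>>>
>>>>> select * from table(find_covering(:X));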
>>>>>
>>>>>
>>>>> ===================================================================================
>>>>>
>>>>> *2. There is no correlation between A and B, ie if a2>a1, it doesn't
>>>>> mean that b2>b1:*
>>>>> This one is the most difficult problem, because with an unknown X
>>>>> we need to check all rows that have A<=X (or all rows with B>=X).
>>>>> Obviously, it would be great if we know some common patterns and
>>>>> distribution of A/B/X, for example:
>>>>> - A and B are dates -- like table DT (start_date date, end_date
>>>>> date...)
>>>>> - min(A) and min(B) ~ 10-20 years ago
>>>>> - max(A) and max(B) ~ sysdate
>>>>> - and X usually is in the range of [sysdate-10; sysdate]
>>>>> - number of rows satisfying our conditions are pretty small
>>>>> In this case we could create the index (B,A), simply because fewer
>>>>> rows satisfy B>=X than A<=X.
>>>>>
>>>>> Or sometimes it may be useful to create 2 indexes (A) and (B) and use
>>>>> a bitmap_and operation like this:
>>>>>
>>>>> select/*+ index_combine(t) */ * from t where :X between a and b;
>>>>> ------------------------------------------------------------
>>>>> | Id | Operation | Name |
>>>>> ------------------------------------------------------------
>>>>> | 0 | SELECT STATEMENT | |
>>>>> | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T |
>>>>> | 2 | BITMAP CONVERSION TO ROWIDS | |
>>>>> | 3 | BITMAP AND | |
>>>>> | 4 | BITMAP CONVERSION FROM ROWIDS | |
>>>>> | 5 | SORT ORDER BY | |
>>>>> |* 6 | INDEX RANGE SCAN | IX_B |
>>>>> | 7 | BITMAP CONVERSION FROM ROWIDS | |
>>>>> | 8 | SORT ORDER BY | |
>>>>> |* 9 | INDEX RANGE SCAN | IX_A |
>>>>> ------------------------------------------------------------
>>>>>
>>>>> Predicate Information (identified by operation id):
>>>>> ---------------------------------------------------
>>>>>
>>>>> 6 - access("B">=TO_NUMBER(:X))
>>>>> filter("B">=TO_NUMBER(:X))
>>>>> 9 - access("A"<=TO_NUMBER(:X))
>>>>> filter("A"<=TO_NUMBER(:X))
>>>>>
>>>>> ===================================================================================
>>>>> But in general, there is no silver bullet for such cases...
>>>>>
>>>>> On Tue, Oct 11, 2022 at 10:58 PM Tim Gorman <tim.evdbt_at_gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Have you considered making TABLE1 an IOT? No, not an "internet of
>>>>>> things" but an "index-organized table"...
>>>>>>
>>>>>> If the primary method of access is the primary key, and TABLE1 has
>>>>>> "narrow" rows (i.e. AVG_ROW_LEN less than 20-25 bytes or so), then an IOT
>>>>>> could save on the number of logical reads. There's a lot of "if"s, but the
>>>>>> best approach is not to think about it, but just go ahead and test it,
>>>>>> side-by-side with the existing table. After all, it's only about ~100MB in
>>>>>> size, right?
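>>>>>>
>>>>>> A sketch of what that test might look like (the column names and types
>>>>>> are guesses based on the thread; the list-partition values are taken from
>>>>>> the reported column stats and may differ):
>>>>>>
>>>>>> create table table1_iot (
>>>>>>   part_col  number,
>>>>>>   min_val   varchar2(19),
>>>>>>   max_value varchar2(19),
>>>>>>   column1   varchar2(30),
>>>>>>   constraint table1_iot_pk primary key (part_col, min_val, max_value)
>>>>>> )
>>>>>> organization index
>>>>>> partition by list (part_col)
>>>>>> (partition p0 values (0),
>>>>>>  partition p1 values (1),
>>>>>>  partition p2 values (2));
>>>>>>
>>>>>> insert /*+ append */ into table1_iot
>>>>>> select part_col, min_val, max_value, column1 from table1;
>>>>>> commit;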
>>>>>>
>>>>>> But, at the very least, it shouldn't be difficult to just put the
>>>>>> table and the PK index together into the KEEP pool of the Buffer Cache?
>>>>>> After all, although the ideal is to size the KEEP pool to accommodate the
>>>>>> objects assigned to it in their entirety, you certainly aren't required to size it
>>>>>> that way. You just want to size it so that buffers flush out far more
>>>>>> slowly than they do in the DEFAULT pool.
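>>>>>>
>>>>>> For example (a sketch; the pool size is illustrative, and the partition
>>>>>> names are hypothetical - each existing partition needs its own MODIFY):
>>>>>>
>>>>>> alter system set db_keep_cache_size = 256M scope=both;
>>>>>>
>>>>>> alter table table1 modify default attributes storage (buffer_pool keep);
>>>>>> alter table table1 modify partition p1 storage (buffer_pool keep);
>>>>>>
>>>>>> alter index pk_table1 modify default attributes storage (buffer_pool keep);
>>>>>> alter index pk_table1 modify partition p1 storage (buffer_pool keep);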
>>>>>>
>>>>>> <rant>Too many DBAs think that the SGA should only use about 50% of
>>>>>> the available physical memory on a host, which is nonsense. The Linux/UNIX
>>>>>> operating systems only need a few GB of memory, and AWR can tell you
>>>>>> unequivocally how much space is needed for PGA, so the SGA should be sized
>>>>>> closer to the Oracle-imposed maximum of 90% of host physical memory. It's
>>>>>> there. It's not being used. Use it. If I had a nickel for every unused
>>>>>> GB of RAM on Oracle database servers, I could buy my own Hawaiian
>>>>>> island.</rant>
>>>>>>
>>>>>> Hope this helps!
>>>>>>
>>>>>> Enjoy!
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 10/11/2022 2:04 PM, yudhi s wrote:
>>>>>>
>>>>>> Hello Listers, We have a customer database on Oracle version 19c. We
>>>>>> have a simple query, shown below, which as per the current design executes
>>>>>> ~200 to 300 times per second; it is part of a bigger process and thus is one
>>>>>> of the top consumers in that process. We are working to change the design so
>>>>>> that this query executes fewer times, but that needs much more impact
>>>>>> analysis, so we were wondering: is there any easy way to make each individual
>>>>>> execution of this query faster? Or any structural change (a new index etc.)
>>>>>> which could further drop the IO/CPU requirement for an individual execution
>>>>>> of this query?
>>>>>>
>>>>>> Currently this query accesses table TABLE1 through the primary key,
>>>>>> which is on three columns (PART_COL, MIN_VAL, MAX_VALUE). The table is
>>>>>> partitioned on column PART_COL. This table contains ~400K rows and is
>>>>>> ~100MB in size. It's a master-data kind of table.
>>>>>>
>>>>>> SELECT column1 FROM TABLE1 WHERE PART_COL = :B2 AND :B1 BETWEEN
>>>>>> MIN_VAL AND MAX_VALUE
>>>>>>
>>>>>> Global Information
>>>>>> ------------------------------
>>>>>> Status : DONE (ALL ROWS)
>>>>>> Instance ID : 1
>>>>>> SQL Execution ID : 16777216
>>>>>> Execution Started : 10/11/2022 09:36:48
>>>>>> First Refresh Time : 10/11/2022 09:36:48
>>>>>> Last Refresh Time : 10/11/2022 09:36:48
>>>>>> Duration : .06173s
>>>>>> Module/Action : SQL*Plus/-
>>>>>> Program : sqlplus.exe
>>>>>> Fetch Calls : 1
>>>>>>
>>>>>> Binds
>>>>>>
>>>>>> ==================================================
>>>>>> | Name | Position |      Type      |    Value    |
>>>>>> ==================================================
>>>>>> | :B2  |        1 | NUMBER         | 2           |
>>>>>> | :B1  |        2 | VARCHAR2(4000) | XXXXXXXXXXX |
>>>>>> ==================================================
>>>>>>
>>>>>> Global Stats
>>>>>>
>>>>>> =========================================================================================
>>>>>> | Elapsed |   Cpu   |    IO    | Concurrency | Cluster  | Fetch | Buffer | Read | Read  |
>>>>>> | Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) | Calls |  Gets  | Reqs | Bytes |
>>>>>> =========================================================================================
>>>>>> |    0.06 |    0.04 |     0.02 |        0.00 |     0.00 |     1 |    911 |  778 |   6MB |
>>>>>> =========================================================================================
>>>>>>
>>>>>> SQL Plan Monitoring Details (Plan Hash Value=692467662)
>>>>>>
>>>>>> =========================================================================================================================================================
>>>>>> | Id | Operation                                  | Name      |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read | Read  | Activity | Activity Detail |
>>>>>> |    |                                            |           | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs | Bytes |   (%)    |  (# samples)    |
>>>>>> =========================================================================================================================================================
>>>>>> |  0 | SELECT STATEMENT                           |           |         |      |           |        |     1 |          |      |       |          |                 |
>>>>>> |  1 |  PARTITION RANGE SINGLE                    |           |   10610 |  928 |           |        |     1 |          |      |       |          |                 |
>>>>>> |  2 |   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| TABLE1    |   10610 |  928 |           |        |     1 |          |      |       |          |                 |
>>>>>> |  3 |    INDEX RANGE SCAN                        | PK_TABLE1 |   10610 |  771 |           |        |     1 |          |  770 |   6MB |          |                 |
>>>>>> =========================================================================================================================================================
>>>>>>
>>>>>> Predicate Information (identified by operation id):
>>>>>> ---------------------------------------------------
>>>>>> 3 - access("PART_COL"=TO_NUMBER(:B2) AND "MAX_VALUE">=:B1 AND
>>>>>> "MIN_VAL"<=:B1)
>>>>>> filter("MAX_VALUE">=:B1)
>>>>>>
>>>>>> Statistics
>>>>>> ----------------------------------------------------------
>>>>>> 37 recursive calls
>>>>>> 0 db block gets
>>>>>> 911 consistent gets
>>>>>> 778 physical reads
>>>>>> 41076 redo size
>>>>>> 260 bytes sent via SQL*Net to client
>>>>>> 489 bytes received via SQL*Net from client
>>>>>> 1 SQL*Net roundtrips to/from client
>>>>>> 28 sorts (memory)
>>>>>> 0 sorts (disk)
>>>>>> 0 rows processed
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Sayan Malakshinov
>>>>> Oracle performance tuning engineer
>>>>> Oracle ACE
>>>>> http://orasql.org
>>>>>
>>>>
-- http://www.freelists.org/webpage/oracle-l

Received on Wed Oct 12 2022 - 21:34:59 CEST