Re: Improving query performance further

From: yudhi s <learnerdatabase99_at_gmail.com>
Date: Wed, 12 Oct 2022 09:59:52 +0530
Message-ID: <CAEzWdqcBg4UMQ82-_eSBL6jTfPX99b4caFdFN55u4w86Sp6caA_at_mail.gmail.com>



Thank you so much.

There are only three distinct values for the partition key column PART_COL, so a total of three partitions exist on this table, each holding roughly an equal number of rows (~150K each). On average, this query returns zero rows in about 9 out of 10 executions. I will try to compare the behaviour of IOT vs KEEP pool vs domain indexes in one of the lower environments. I have heard domain indexes can help with text-column searches (LIKE operator etc.), but I can't immediately see how a domain index will help the range/BETWEEN predicate here.

@Sayan, below is the output of the query:

M0  19
M2  9999999999
M1  109999999999999
M3  2226470000000000000
M7  5599860000000000000
M5  2226470099999999999
M8  5599869999999999999

The statistics on this table are not up to date, but below is how they look, so you can see the pattern.

COLUMN_NAME  NUM_DISTINCT  DENSITY                 NUM_NULLS
PART_COL     3             0.0000014697539337964   0
  LOW_VALUE:  80
  HIGH_VALUE: C103

LOW_VAL      114992        0.00000869625713093085  0
  LOW_VALUE:  32323236343730303030303030303030303030
  HIGH_VALUE: 35353939383630303030303030303030303030

HIGH_VALUE   115792        0.00000863617521072268  0
  LOW_VALUE:  32323236343730303939393939393939393939
  HIGH_VALUE: 35353939383639393939393939393939393939


NUM_ROWS BLOCKS AVG_ROW_LEN
248942 3776 96

On Wed, Oct 12, 2022 at 6:41 AM Sayan Malakshinov <xt.and.r_at_gmail.com> wrote:

> Hi Andy
>
> Oh, yes, I also had a couple of examples of my own ODCI indexes for such
> cases, but it would be great to see your variant!
>
>
> Best regards,
> Sayan Malakshinov
> Oracle performance tuning expert
> Oracle Database Developer Choice Award winner
> Oracle ACE
> http://orasql.org
>
> On Wed, 12 Oct 2022, 02:05 Andy Sayer, <andysayer_at_gmail.com> wrote:
>
>> I have a demo for a domain index which allows this sort of query to be
>> very fast. Essentially it creates an index entry for each value (at your
>> desired granularity) between your min and max columns and does lookups
>> based on that. The domain index knows what the granularity is, so as well
>> as :bind between x and y it queries new_domain_column between :bind -
>> granularity and :bind + granularity.
>>
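This is not Andy's actual demo, but the underlying idea can be mocked up without ODCI using an ordinary expansion table. A sketch, assuming a table T(A, B) and a granularity of 1 (all object names here are made up):

```sql
-- One row per value covered by each (A, B) interval; for real data you
-- would expand at a coarser granularity to keep this table small.
create table t_expanded as
select e.covered_value, t.rowid as src_rowid
from t,
     lateral (select t.a + level - 1 as covered_value
              from dual
              connect by level <= t.b - t.a + 1) e;

create index ix_t_expanded on t_expanded (covered_value);

-- The range predicate now becomes a plain equality lookup:
select t.*
from t_expanded e
join t on t.rowid = e.src_rowid
where e.covered_value = :X;
```

The trade-off is the storage and maintenance of the expansion rows, which is exactly what a domain index would manage for you behind the scenes.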
>> Unfortunately, the demo is on the pc that I left in the UK when I moved
>> to Canada. This might be enough information for you to get going with,
>> otherwise I was planning on publishing my work when I visit the UK in
>> December.
>>
>> The other, easier but less guaranteed, way of combining two range
>> predicates on different columns is to partition the index by one of the
>> columns. Eg your index would be on some_key, start_date and your
>> partitioning would be on end_date. Oracle would then need to do a range
>> scan using your some_key and start_date filters as access predicates once
>> per partition which matches your end_date filter. You’d have to decide
>> which column is the partition key and how wide to make the partitions, and
>> you might not gain a lot.
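In DDL terms, that second idea might look roughly like this (a sketch only; the table, columns, and partition boundaries are illustrative):

```sql
create table events (
  some_key   number,
  start_date date,
  end_date   date
)
partition by range (end_date) (
  partition p2021 values less than (date '2022-01-01'),
  partition p2022 values less than (date '2023-01-01'),
  partition pmax  values less than (maxvalue)
);

-- local index, partitioned the same way as the table
create index ix_events on events (some_key, start_date) local;

-- pruning on end_date limits which local index partitions are scanned,
-- and some_key/start_date are access predicates within each of them
select *
from events
where some_key = :k
  and :d between start_date and end_date;
```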
>>
>> Thanks,
>> Andy
>>
>> On Tue, Oct 11, 2022 at 5:38 PM, Sayan Malakshinov <xt.and.r_at_gmail.com>
>> wrote:
>>
>>> I tuned a lot of similar queries: tried different methods (even spatial
>>> RTree indexes) and made a lot of tests, so I'll try to describe and
>>> summarize them all later, when I have more time, but for now just in short:
>>>
>>> Let's simplify it just to:
>>> - table T(A,B)
>>> - index (A,B)
>>> - select * from T where :X between A and B
>>> - ":X between A and B" is very selective
>>>
>>> The main problem in this case is that you need to check all index
>>> entries A<:X, for example you have:
>>>
>>> A | B
>>> --------
>>> 1 1
>>> 1 2
>>> 1 3
>>> .....
>>> 1 100
>>> 2 107
>>> 2 111
>>> ....
>>> 2 215
>>> 3 204
>>> 3 206
>>> ....
>>> 3 299
>>> ...
>>> 998 99799
>>> 999 99801
>>> ...
>>> 999 99900
>>>
>>> where each A has about 100 different B,
>>> and your :X = 700, so, though there are just about 100 rows satisfying
>>> both conditions, INDEX RANGE SCAN with access predicate A<:X will scan
>>> 700*100 index entries, and filter out 699*100 of them by filter predicate
>>> (B>:X).
>>>
>>> So we have just a few different situations and key points:
>>>
>>> ===================================================================================
>>>
>>> *0. There is a limitation of max(B-A) and it's quite small, ie
>>> max(B[x]-A[x]) << max(B) - min(A)*
>>> In this case, you can simply add a couple of additional predicates:
>>> select *
>>> from T
>>> where :X between A and B
>>> and A between :X-:MAX_DIFF and :X
>>> and B between :X and :X+:MAX_DIFF
>>> ;
>>> So you'll get a short search range for A (ACCESS predicates for the
>>> first column of the index).
>>> //For example, it may be short intervals or start/end dates of short
>>> time events, with check constraints like (b-a<N)
>>>
>>> ===================================================================================
>>>
>>>
>>> *1. There is a perfect positive correlation between A and B, ie if
>>> a2>a1, then b2>b1 as well*
>>> -----------------------------------------------------------------------------------
>>>
>>> *1.1 There is no intersection of any 2 intervals (Ax - Bx), (Ay - By)*
>>> This means that for any :X you need to find at most 1 row, so you can
>>> stop scanning the index after the first row found.
>>> We can rewrite our query for that like this:
>>>
>>> select *
>>> from
>>>   (select * from T where A<=:X order by A desc fetch first 1 row only)
>>>   -- to force IRS DESCENDING (IRS - index range scan) and stop after 1 row
>>> where
>>>   B>:X -- our second predicate is one level up, so we check at most 1 row
>>>
>>> -- IIRC there were some problems with IRS DESC in case of "fetch first",
>>> -- so I'll rewrite it to make it more reliable:
>>> select *
>>> from
>>>   (select *
>>>    from
>>>      (select * from T where A<=:X order by A desc) -- to force IRS DESCENDING
>>>    where rownum=1) -- and stop after 1 row
>>> where
>>>   B>:X -- our second predicate
>>>
>>>
>>> -----------------------------------------------------------------------------------
>>>
>>> *1.2 There may be intersections:*
>>> This means that you can't stop after the first row found satisfying our
>>> conditions; we need to find all of them. Unfortunately, there is no
>>> documented method to make a dynamic SQL condition like "please stop, I
>>> don't need other rows", but we can simply use PL/SQL for that:
>>>
>>> declare
>>>   X number := 34;
>>>
>>>   cursor c_1(p number) is
>>>     select * from T where A<=p order by A desc;
>>>   R c_1%rowtype;
>>> begin
>>>   open c_1(X);
>>>   <<LL>>
>>>   loop
>>>     fetch c_1 into R;
>>>     exit LL when c_1%notfound; -- guard against running past the last row
>>>     if R.B>=X then
>>>       dbms_output.put_line(R.A||' '||R.B); -- or pipe row, or collect
>>>       -- ROWIDs and return them as a collection, etc.
>>>     else
>>>       exit LL;
>>>     end if;
>>>   end loop;
>>>   close c_1;
>>> end;
>>> /
>>>
>>> As you can see, we stop fetching cursor C_1 when R.B becomes lower than
>>> X.
>>> Of course, to make it more convenient, you can wrap it into a pipelined
>>> PL/SQL table function or even use it with inline PL/SQL functions.
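A sketch of such a pipelined wrapper (the object type and function names are made up; it relies on the same positive-correlation assumption to stop scanning early):

```sql
create type t_ab as object (a number, b number);
/
create type t_ab_tab as table of t_ab;
/
create or replace function find_covering(p_x number)
  return t_ab_tab pipelined
is
begin
  -- descending scan on A; with positive correlation B descends too,
  -- so the first row with B < p_x ends the search
  for r in (select a, b from T where a <= p_x order by a desc) loop
    exit when r.b < p_x;
    pipe row (t_ab(r.a, r.b));
  end loop;
  return;
end;
/

-- usage:
-- select * from table(find_covering(:X));
```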
>>>
>>>
>>> ===================================================================================
>>>
>>> *2. There is no correlation between A and B, ie if a2>a1, it doesn't
>>> mean that b2>b1:*
>>> This one is the most difficult problem, because in case of unknown X we
>>> need to check all rows that have A<=X (or all rows with B>=X).
>>> Obviously, it would be great if we know some common patterns and
>>> distribution of A/B/X, for example:
>>> - A and B are dates -- like table DT (start_date date, end_date
>>> date...)
>>> - min(A) and min(B) ~ 10-20 years ago
>>> - max(A) and max(B) ~ sysdate
>>> - and X usually is in the range of [sysdate-10; sysdate]
>>> - the number of rows satisfying our conditions is pretty small
>>> In this case we could create index (B,A), simply because fewer rows
>>> satisfy B>=X than A<=X.
>>>
>>> Or sometimes it may be useful to create 2 indexes (A) and (B) and use a
>>> BITMAP AND operation like this:
>>>
>>> select/*+ index_combine(t) */ * from t where :X between a and b;
>>> ------------------------------------------------------------
>>> | Id | Operation | Name |
>>> ------------------------------------------------------------
>>> | 0 | SELECT STATEMENT | |
>>> | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T |
>>> | 2 | BITMAP CONVERSION TO ROWIDS | |
>>> | 3 | BITMAP AND | |
>>> | 4 | BITMAP CONVERSION FROM ROWIDS | |
>>> | 5 | SORT ORDER BY | |
>>> |* 6 | INDEX RANGE SCAN | IX_B |
>>> | 7 | BITMAP CONVERSION FROM ROWIDS | |
>>> | 8 | SORT ORDER BY | |
>>> |* 9 | INDEX RANGE SCAN | IX_A |
>>> ------------------------------------------------------------
>>>
>>> Predicate Information (identified by operation id):
>>> ---------------------------------------------------
>>>
>>> 6 - access("B">=TO_NUMBER(:X))
>>> filter("B">=TO_NUMBER(:X))
>>> 9 - access("A"<=TO_NUMBER(:X))
>>> filter("A"<=TO_NUMBER(:X))
>>>
>>> ===================================================================================
>>> But in general, there is no silver bullet for such cases...
>>>
>>> On Tue, Oct 11, 2022 at 10:58 PM Tim Gorman <tim.evdbt_at_gmail.com> wrote:
>>>
>>>> Have you considered making TABLE1 an IOT? No, not an "internet of
>>>> things" but an "index-organized table"...
>>>>
>>>> If the primary method of access is the primary key, and TABLE1 has
>>>> "narrow" rows (i.e. AVG_ROW_LEN less than 20-25 bytes or so), then an IOT
>>>> could save on the number of logical reads. There are a lot of "if"s, but
>>>> the best approach is not to think about it, but just go ahead and test it,
>>>> side-by-side with the existing table. After all, it's only about ~100MB in
>>>> size, right?
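Such a side-by-side test could start from something like the following (a sketch only; the column datatypes and partition key values are assumptions based on the thread):

```sql
create table table1_iot (
  part_col  number,
  min_val   varchar2(30),
  max_value varchar2(30),
  column1   varchar2(30),
  constraint pk_table1_iot primary key (part_col, min_val, max_value)
)
organization index
partition by list (part_col)
  (partition p1 values (1),
   partition p2 values (2),
   partition p3 values (3));

insert /*+ append */ into table1_iot
select part_col, min_val, max_value, column1 from table1;
commit;
```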
>>>>
>>>> But, at the very least, it shouldn't be difficult to just put the table
>>>> and the PK index together into the KEEP pool of the buffer cache. After
>>>> all, although the ideal is to size the KEEP pool to accommodate the
>>>> assigned objects in their entirety, you certainly aren't required to size
>>>> it that way. You just want to size it so that buffers flush out far more
>>>> slowly than they do in the DEFAULT pool.
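The mechanics, for reference (the 256M figure is purely illustrative, and for a partitioned table like TABLE1 the storage clause may also need to be applied per partition):

```sql
-- allocate a KEEP pool (needs SGA headroom)
alter system set db_keep_cache_size = 256M scope=both;

-- assign the table and its PK index to it
alter table table1 storage (buffer_pool keep);
alter index pk_table1 storage (buffer_pool keep);
```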
>>>>
>>>> <rant>Too many DBAs think that the SGA should only use about 50% of the
>>>> available physical memory on a host, which is nonsense. The Linux/UNIX
>>>> operating systems only need a few GB of memory, and AWR can tell you
>>>> unequivocally how much space is needed for PGA, so the SGA should be sized
>>>> closer to the Oracle-imposed maximum of 90% of host physical memory. It's
>>>> there. It's not being used. Use it. If I had a nickel for every unused
>>>> GB of RAM on Oracle database servers, I could buy my own Hawaiian
>>>> island.</rant>
>>>>
>>>> Hope this helps!
>>>>
>>>> Enjoy!
>>>>
>>>>
>>>>
>>>> On 10/11/2022 2:04 PM, yudhi s wrote:
>>>>
>>>> Hello Listers, we have a customer database on Oracle version 19c. We
>>>> have a simple query, shown below, which as per the current design executes
>>>> ~200 to 300 times per second; it is part of a bigger process and thus one
>>>> of the top consumers in it. We are working to change the design so that
>>>> this query executes fewer times, which would help the process, but that
>>>> needs much more impact analysis. So we were wondering: is there any
>>>> possible easy way to make an individual execution of this query faster, or
>>>> any structural change (new index etc.) which can further drop the IO/CPU
>>>> requirement for an individual execution?
>>>>
>>>> Currently this query is accessing table TABLE1 through a primary key
>>>> which is on three columns (PART_COL,MIN_VALUE,MAX_VAL). The table is
>>>> partitioned on column PART_COL. This table contains ~400K rows and is
>>>> ~100MB in size. It's a master data kind of table.
>>>>
>>>> SELECT column1 FROM TABLE1 WHERE PART_COL = :B2 AND :B1 BETWEEN
>>>> MIN_VAL AND MAX_VALUE
>>>>
>>>> Global Information
>>>> ------------------------------
>>>> Status : DONE (ALL ROWS)
>>>> Instance ID : 1
>>>> SQL Execution ID : 16777216
>>>> Execution Started : 10/11/2022 09:36:48
>>>> First Refresh Time : 10/11/2022 09:36:48
>>>> Last Refresh Time : 10/11/2022 09:36:48
>>>> Duration : .06173s
>>>> Module/Action : SQL*Plus/-
>>>> Program : sqlplus.exe
>>>> Fetch Calls : 1
>>>>
>>>> Binds
>>>>
>>>> ===============================================
>>>> | Name | Position | Type           | Value       |
>>>> ===============================================
>>>> | :B2  |        1 | NUMBER         | 2           |
>>>> | :B1  |        2 | VARCHAR2(4000) | XXXXXXXXXXX |
>>>> ===============================================
>>>>
>>>> Global Stats
>>>>
>>>> =========================================================================================
>>>> | Elapsed | Cpu     | IO       | Concurrency | Cluster  | Fetch | Buffer | Read | Read  |
>>>> | Time(s) | Time(s) | Waits(s) | Waits(s)    | Waits(s) | Calls | Gets   | Reqs | Bytes |
>>>> =========================================================================================
>>>> |    0.06 |    0.04 |     0.02 |        0.00 |     0.00 |     1 |    911 |  778 |   6MB |
>>>> =========================================================================================
>>>>
>>>> SQL Plan Monitoring Details (Plan Hash Value=692467662)
>>>>
>>>> =======================================================================================================
>>>> | Id | Operation                                   | Name      | Rows    | Cost | Execs | Read | Read  |
>>>> |    |                                             |           | (Estim) |      |       | Reqs | Bytes |
>>>> =======================================================================================================
>>>> |  0 | SELECT STATEMENT                            |           |         |      |     1 |      |       |
>>>> |  1 |  PARTITION RANGE SINGLE                     |           |   10610 |  928 |     1 |      |       |
>>>> |  2 |   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | TABLE1    |   10610 |  928 |     1 |      |       |
>>>> |  3 |    INDEX RANGE SCAN                         | PK_TABLE1 |   10610 |  771 |     1 |  770 |   6MB |
>>>> =======================================================================================================
>>>>
>>>> Predicate Information (identified by operation id):
>>>> ---------------------------------------------------
>>>> 3 - access("PART_COL"=TO_NUMBER(:B2) AND "MAX_VALUE">=:B1 AND "MIN_VAL"<=:B1)
>>>>     filter("MAX_VALUE">=:B1)
>>>>
>>>> Statistics
>>>> ----------------------------------------------------------
>>>> 37 recursive calls
>>>> 0 db block gets
>>>> 911 consistent gets
>>>> 778 physical reads
>>>> 41076 redo size
>>>> 260 bytes sent via SQL*Net to client
>>>> 489 bytes received via SQL*Net from client
>>>> 1 SQL*Net roundtrips to/from client
>>>> 28 sorts (memory)
>>>> 0 sorts (disk)
>>>> 0 rows processed
>>>>
>>>>
>>>>
>>>
>>> --
>>> Best regards,
>>> Sayan Malakshinov
>>> Oracle performance tuning engineer
>>> Oracle ACE
>>> http://orasql.org
>>>
>>

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Oct 12 2022 - 06:29:52 CEST
