Re: Improving query performance further
Date: Wed, 12 Oct 2022 12:32:17 +0200
Message-ID: <CA+riqSVYKY9TjJyi_qF2UmuEhyYjjU=xj1fUoWwudBsqwtF_Vw_at_mail.gmail.com>
Hello,
Because max(MAX_VALUE - MIN_VAL) = 109999999999999, the first 4 digits of MIN_VAL and MAX_VALUE should always match within a row (maybe you can validate with the designers whether this is the case). If the values between 2226 and 5599 (the first 4 digits) are equally distributed, then adding an equality condition like substr(:B1,1,4) = substr(MIN_VAL,1,4), creating an index on (PART_COL, substr(MIN_VAL,1,4), MIN_VAL, MAX_VALUE, column1), and ideally keeping it in memory, will probably make your query faster.
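To illustrate (a Python sketch over made-up 19-digit values, not the real table; it depends on the assumption above that no row's [MIN_VAL, MAX_VALUE] range crosses a 4-digit-prefix boundary):

```python
# Illustrative model: rows are (min_val, max_val, column1) with made-up
# 19-digit values whose ranges never straddle a 4-digit-prefix boundary --
# the assumption the extra equality predicate relies on.
rows = [
    (2226470000000000000, 2226470099999999999, "a"),
    (3310000000000000000, 3310000000999999999, "b"),
    (5599860000000000000, 5599869999999999999, "c"),
]

def prefix4(n):
    return str(n)[:4]

def lookup(b1):
    # original predicate: :B1 between MIN_VAL and MAX_VALUE
    return [c for lo, hi, c in rows if lo <= b1 <= hi]

def lookup_with_prefix(b1):
    # rewritten predicate adds substr(:B1,1,4) = substr(MIN_VAL,1,4),
    # turning the leading expression of the proposed index into an equality
    return [c for lo, hi, c in rows
            if prefix4(b1) == prefix4(lo) and lo <= b1 <= hi]

# under the no-straddle assumption, both predicates return the same rows
for x in (2226470000000000005, 5599865000000000000, 9999999999999999999):
    assert lookup(x) == lookup_with_prefix(x)
```

With the equality on the prefix, the index probe can seek directly into the matching prefix group instead of range-scanning every MIN_VAL below :B1.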
Thanks.
On Wed, 12 Oct 2022 at 06:30, yudhi s <learnerdatabase99_at_gmail.com> wrote:
> Thank you so much.
>
> There are only three distinct values for the partition key column
> PART_COL, and thus a total of three partitions exist on this table,
> each holding roughly an equal number of rows, say ~150K each. And on
> average, this query returns zero rows in almost 9 out of 10 executions.
>
> I will try to compare the behaviour of IOT vs KEEP pool vs domain
> indexes in one of the lower environments. I have heard domain indexes
> can help with text-column searches for the LIKE operator etc., but I
> can't see right away how a domain index would help the range/between
> predicate here.
>
> _at_Sayan, Below is the output of the query :-
>
> M0  M2          M1               M3                   M7                   M5                   M8
> 19  9999999999  109999999999999  2226470000000000000  5599860000000000000  2226470099999999999  5599869999999999999
>
> The statistics on this table are not up to date, but below is how
> they look. Pattern-wise it's as below.
>
> COLUMN_NAME NUM_DISTINCT LOW_VALUE                              HIGH_VALUE                             DENSITY                NUM_NULLS
> PART_COL    3            80                                     C103                                   0.0000014697539337964  0
> LOW_VAL     114992       32323236343730303030303030303030303030 35353939383630303030303030303030303030 0.00000869625713093085 0
> HIGH_VALUE  115792       32323236343730303939393939393939393939 35353939383639393939393939393939393939 0.00000863617521072268 0
>
>
> NUM_ROWS BLOCKS AVG_ROW_LEN
> 248942 3776 96
>
>
>
>
> On Wed, Oct 12, 2022 at 6:41 AM Sayan Malakshinov <xt.and.r_at_gmail.com>
> wrote:
>
>> Hi Andy
>>
>> Oh, yes, I also had a couple of examples of my own ODCI indexes for such
>> cases, but it would be great to see your variant!
>>
>>
>> Best regards,
>> Sayan Malakshinov
>> Oracle performance tuning expert
>> Oracle Database Developer Choice Award winner
>> Oracle ACE
>> http://orasql.org
>>
>> On Wed, 12 Oct 2022, 02:05 Andy Sayer, <andysayer_at_gmail.com> wrote:
>>
>>> I have a demo for a domain index which allows this sort of query to be
>>> very fast. Essentially it creates an index entry for each value (at your
>>> desired granularity) between your min and max columns and does lookups
>>> based on that. The domain index knows what the granularity is, so as well
>>> as :bind between x and y it queries new_domain_column between :bind -
>>> granularity and :bind + granularity.
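Illustratively, that scheme can be sketched in Python (my assumptions, not Andy's actual ODCI implementation: the GRANULARITY value, the bucket layout, and the sample rows are all made up):

```python
from collections import defaultdict

GRANULARITY = 100  # assumed bucket width; the real domain index would choose this

# sample (min, max, payload) intervals -- purely illustrative
rows = [(105, 290, "a"), (300, 310, "b"), (50, 120, "c")]

# "an index entry for each value (at the chosen granularity) between min and max"
index = defaultdict(list)
for lo, hi, payload in rows:
    for bucket in range(lo // GRANULARITY, hi // GRANULARITY + 1):
        index[bucket].append((lo, hi, payload))

def lookup(x):
    # probe only the bucket containing x, then re-check the exact predicate --
    # a point probe instead of an open-ended range scan
    return sorted(p for lo, hi, p in index[x // GRANULARITY] if lo <= x <= hi)

assert lookup(110) == ["a", "c"]
```

The trade-off is index size: each interval gets one entry per granularity bucket it spans, so a coarser granularity means fewer entries but more re-checking per probe.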
>>>
>>> Unfortunately, the demo is on the pc that I left in the UK when I moved
>>> to Canada. This might be enough information for you to get going with,
>>> otherwise I was planning on publishing my work when I visit the UK in
>>> December.
>>>
>>> The other, easier but less guaranteed, way of combining two range
>>> predicates on different columns is to partition the index by one of the
>>> columns. E.g. your index would be on (some_key, start_date) and your
>>> partitioning would be on end_date. Oracle would then need to do a range
>>> scan using your some_key and start_date filters as access predicates once
>>> per partition which matches your end_date filter. You’d have to decide
>>> which column is the partition key and how wide to make the partitions, and
>>> you might not gain a lot.
>>>
>>> Thanks,
>>> Andy
>>>
>>> On Tue, Oct 11, 2022 at 5:38 PM, Sayan Malakshinov <xt.and.r_at_gmail.com>
>>> wrote:
>>>
>>>> I tuned a lot of similar queries: tried different methods (even spatial
>>>> RTree indexes) and made a lot of tests, so I'll try to describe and
>>>> summarize them all later, when I have more time, but for now just in short:
>>>>
>>>> Let's simplify it just to:
>>>> - table T(A,B)
>>>> - index (A,B)
>>>> - select * from T where :X between A and B
>>>> - ":X between A and B" is very selective
>>>>
>>>> The main problem in this case is that you need to check all index
>>>> entries A<:X, for example you have:
>>>>
>>>> A | B
>>>> --------
>>>> 1 1
>>>> 1 2
>>>> 1 3
>>>> .....
>>>> 1 100
>>>> 2 107
>>>> 2 111
>>>> ....
>>>> 2 215
>>>> 3 204
>>>> 3 206
>>>> ....
>>>> 3 299
>>>> ...
>>>> 998 99799
>>>> 999 99801
>>>> ...
>>>> 999 99900
>>>>
>>>> where each A has about 100 different B,
>>>> and your :X = 700, so, though there are just about 100 rows satisfying
>>>> both conditions, INDEX RANGE SCAN with access predicate A<:X will scan
>>>> 700*100 index entries, and filter out 699*100 of them by filter predicate
>>>> (B>:X).
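The scanned-vs-matched gap can be modeled with a quick Python sketch (synthetic short intervals, chosen only to reproduce the ratio, not the data above):

```python
# Rough model of the effect: ~100 short intervals per integer value of A,
# so only a handful of rows contain any given X -- yet a range scan on
# A alone must visit every index entry with A <= X.
rows = [(a + i / 100.0, a + i / 100.0 + 0.005)
        for a in range(1, 1000) for i in range(100)]

X = 700.5
scanned = [r for r in rows if r[0] <= X]     # access predicate: A <= X
matched = [r for r in scanned if r[1] >= X]  # filter predicate: B >= X

assert len(scanned) == 699 * 100 + 51  # ~70k index entries visited...
assert len(matched) == 1               # ...to find one real match
```

Nearly everything the access predicate touches is thrown away by the filter predicate, which is exactly the cost Sayan's 700*100 example describes.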
>>>>
>>>> So we have just a few different situations and key points:
>>>>
>>>> ===================================================================================
>>>>
>>>> *0. There is a limitation on max(B-A) and it's quite small, i.e.
>>>> max(B[x]-A[x]) << max(B) - min(A)*
>>>> In this case, you can simply add a couple of additional predicates:
>>>> select *
>>>> from T
>>>> where A between :X-:MAX_DIFF and :X
>>>> and B between :X and :X+:MAX_DIFF
>>>> ;
>>>> So you'll get a short search range for A (ACCESS predicates for the
>>>> first column of the index).
>>>> //For example, it may be short intervals or start/end dates of short
>>>> time events, with check constraints like (b-a<N)
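As a sanity check of case 0, a small Python model (synthetic rows; the constraint B - A <= MAX_DIFF is the stated assumption) confirms the extra bounds can't change the result: any row whose interval [A, B] contains X must have A in [X - MAX_DIFF, X] and B in [X, X + MAX_DIFF].

```python
# Synthetic intervals satisfying the check-constraint-style rule B - A <= MAX_DIFF
MAX_DIFF = 10
rows = [(a, a + (a * 7) % (MAX_DIFF + 1)) for a in range(0, 500, 3)]

def original(x):
    # select * from T where :X between A and B
    return [r for r in rows if r[0] <= x <= r[1]]

def bounded(x):
    # same query with the added bounds: A in [X-MAX_DIFF, X], B in [X, X+MAX_DIFF]
    return [r for r in rows
            if x - MAX_DIFF <= r[0] <= x and x <= r[1] <= x + MAX_DIFF]

# equivalent for every probe value, but bounded() gives a short ACCESS range on A
assert all(original(x) == bounded(x) for x in range(-5, 520))
```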
>>>>
>>>> ===================================================================================
>>>>
>>>>
>>>> *1. There is a perfect positive correlation between A and B, i.e. if
>>>> a2>a1, then b2>b1*
>>>> -----------------------------------------------------------------------------------
>>>>
>>>> *1.1 There is no intersection of any 2 intervals (Ax - Bx), (Ay - By)*
>>>> This means that for any :X you need to find at most 1 row, so you can
>>>> stop scanning the index after the first row found.
>>>> We can rewrite our query for that like this:
>>>>
>>>> select *
>>>> from
>>>> (select * from T where A<=:X order by A desc fetch first 1 row only)
>>>> -- to force IRS DESCENDING (IRS - index range scan) and stop after 1 row
>>>> where
>>>> B>:X -- our second predicate is one level up, so we check at most
>>>> 1 row
>>>>
>>>> -- IIRC there were some problems with IRS DESC in case of "fetch
>>>> first", so I'll rewrite it to make it more reliable:
>>>> select *
>>>> from
>>>> (select *
>>>> from
>>>> (select * from T where A<=:X order by A desc) -- to force IRS
>>>> DESCENDING
>>>> where rownum=1) -- and stop after 1 row
>>>> where
>>>> B>:X -- our second predicate
>>>>
>>>>
>>>> -----------------------------------------------------------------------------------
>>>>
>>>> *1.2 There may be intersections:*
>>>> This means that you can't stop after the first row satisfying our
>>>> conditions; we need to find all of them.
>>>> Unfortunately, there is no documented way to add a dynamic SQL
>>>> condition like "please stop, I don't need more rows", but we can simply
>>>> use PL/SQL for that:
>>>>
>>>> declare
>>>>   X number := 34;
>>>>
>>>>   cursor c_1(p number) is
>>>>     select * from T where A <= p order by A desc;
>>>>   R c_1%rowtype;
>>>> begin
>>>>   open c_1(X);
>>>>   <<LL>>
>>>>   loop
>>>>     fetch c_1 into R;
>>>>     exit LL when c_1%notfound; -- don't loop forever past the last row
>>>>     if R.B >= X then
>>>>       dbms_output.put_line(R.A || ' ' || R.B); -- or pipe row, or collect
>>>>                                                -- ROWIDs and return them
>>>>                                                -- as a collection, etc.
>>>>     else
>>>>       exit LL;
>>>>     end if;
>>>>   end loop;
>>>>   close c_1;
>>>> end;
>>>> /
>>>>
>>>> As you can see, we stop fetching cursor C_1 when R.B becomes lower than
>>>> X.
>>>> Of course to make it more convenient you can wrap it into a pipelined
>>>> pl/sql table function or even use it with INLINE pl/sql functions.
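For illustration, the same early-exit scan can be modeled in Python (synthetic positively correlated intervals of my own invention; the break plays the role of exit LL):

```python
# Model of the case-1.2 loop: A and B are positively correlated, so when we
# walk A descending from X, the first row whose B drops below X proves that
# every remaining row's B is below X too -- we can stop there.
X = 34
rows = sorted((a, a + 10 + (a % 3)) for a in range(0, 100, 4))

def lookup(x):
    hits = []
    # descending index range scan over entries with A <= x
    for a, b in sorted((r for r in rows if r[0] <= x), reverse=True):
        if b >= x:   # "if R.B >= X then" -- collect the row
            hits.append((a, b))
        else:        # "else exit LL" -- correlation guarantees no more hits
            break
    return sorted(hits)

# matches a brute-force evaluation of ":X between A and B"
assert lookup(X) == sorted((a, b) for a, b in rows if a <= X <= b)
```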
>>>>
>>>>
>>>> ===================================================================================
>>>>
>>>> *2. There is no correlation between A and B, i.e. if a2>a1, it doesn't
>>>> mean that b2>b1:*
>>>> This one is the most difficult problem, because in case of unknown X we
>>>> need to check all rows that have A<=X (or all rows with B>=X).
>>>> Obviously, it would be great if we know some common patterns and
>>>> distribution of A/B/X, for example:
>>>> - A and B are dates -- like table DT (start_date date, end_date
>>>> date...)
>>>> - min(A) and min(B) ~ 10-20 years ago
>>>> - max(A) and max(B) ~ sysdate
>>>> - and X usually is in the range of [sysdate-10; sysdate]
>>>> - the number of rows satisfying our conditions is pretty small
>>>> In this case we could create the index (B,A), simply because fewer
>>>> rows satisfy B>=X than A<=X.
>>>>
>>>> Or sometimes it may be useful to create 2 indexes, (A) and (B), and
>>>> use a BITMAP AND operation like this:
>>>>
>>>> select/*+ index_combine(t) */ * from t where :X between a and b;
>>>> ------------------------------------------------------------
>>>> | Id | Operation | Name |
>>>> ------------------------------------------------------------
>>>> | 0 | SELECT STATEMENT | |
>>>> | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T |
>>>> | 2 | BITMAP CONVERSION TO ROWIDS | |
>>>> | 3 | BITMAP AND | |
>>>> | 4 | BITMAP CONVERSION FROM ROWIDS | |
>>>> | 5 | SORT ORDER BY | |
>>>> |* 6 | INDEX RANGE SCAN | IX_B |
>>>> | 7 | BITMAP CONVERSION FROM ROWIDS | |
>>>> | 8 | SORT ORDER BY | |
>>>> |* 9 | INDEX RANGE SCAN | IX_A |
>>>> ------------------------------------------------------------
>>>>
>>>> Predicate Information (identified by operation id):
>>>> ---------------------------------------------------
>>>>
>>>> 6 - access("B">=TO_NUMBER(:X))
>>>> filter("B">=TO_NUMBER(:X))
>>>> 9 - access("A"<=TO_NUMBER(:X))
>>>> filter("A"<=TO_NUMBER(:X))
>>>>
>>>> ===================================================================================
>>>> But in general, there is no silver bullet for such cases...
>>>>
>>>> On Tue, Oct 11, 2022 at 10:58 PM Tim Gorman <tim.evdbt_at_gmail.com>
>>>> wrote:
>>>>
>>>>> Have you considered making TABLE1 an IOT? No, not an "internet of
>>>>> things" but an "index-organized table"...
>>>>>
>>>>> If the primary method of access is the primary key, and TABLE1 has
>>>>> "narrow" rows (i.e. AVG_ROW_LEN less than 20-25 bytes or so), then an IOT
>>>>> could save on the number of logical reads. There are a lot of "if"s, but the
>>>>> best approach is not to think about it, but just go ahead and test it,
>>>>> side-by-side with the existing table. After all, it's only about ~100MB in
>>>>> size, right?
>>>>>
>>>>> But, at the very least, it shouldn't be difficult to just put the
>>>>> table and the PK index together into the KEEP pool of the buffer cache.
>>>>> After all, although the ideal is to size the KEEP pool to accommodate the
>>>>> entire objects assigned to it, you certainly aren't required to size it
>>>>> that way. You just want to size it so that buffers flush out far more
>>>>> slowly than they do in the DEFAULT pool.
>>>>>
>>>>> <rant>Too many DBAs think that the SGA should only use about 50% of
>>>>> the available physical memory on a host, which is nonsense. The Linux/UNIX
>>>>> operating systems only need a few GB of memory, and AWR can tell you
>>>>> unequivocally how much space is needed for PGA, so the SGA should be sized
>>>>> closer to the Oracle-imposed maximum of 90% of host physical memory. It's
>>>>> there. It's not being used. Use it. If I had a nickel for every unused
>>>>> GB of RAM on Oracle database servers, I could buy my own Hawaiian
>>>>> island.</rant>
>>>>>
>>>>> Hope this helps!
>>>>>
>>>>> Enjoy!
>>>>>
>>>>>
>>>>>
>>>>> On 10/11/2022 2:04 PM, yudhi s wrote:
>>>>>
>>>>> Hello Listers, We have a customer database on Oracle version 19c. We
>>>>> have a simple query, shown below, which under the current design executes
>>>>> ~200 to 300 times per second; it's part of a bigger process and thus one
>>>>> of the top consumers in it. We are working on a design change to make
>>>>> this query execute less often, but that needs much more impact analysis,
>>>>> so we were wondering whether there is any easy way to make an individual
>>>>> execution of this query faster, or any structural change (a new index
>>>>> etc.) which could further reduce the IO/CPU requirement of each
>>>>> execution.
>>>>>
>>>>> Currently this query is accessing table TABLE1 through a primary key
>>>>> which is on three columns (PART_COL,MIN_VALUE,MAX_VAL). The table is
>>>>> partitioned on column PART_COL. This table contains ~400K rows and is
>>>>> ~100MB in size. It's a master data kind of table.
>>>>>
>>>>> SELECT column1 FROM TABLE1 WHERE PART_COL = :B2 AND :B1 BETWEEN
>>>>> MIN_VAL AND MAX_VALUE
>>>>>
>>>>> Global Information
>>>>> ------------------------------
>>>>> Status : DONE (ALL ROWS)
>>>>> Instance ID : 1
>>>>> SQL Execution ID : 16777216
>>>>> Execution Started : 10/11/2022 09:36:48
>>>>> First Refresh Time : 10/11/2022 09:36:48
>>>>> Last Refresh Time : 10/11/2022 09:36:48
>>>>> Duration : .06173s
>>>>> Module/Action : SQL*Plus/-
>>>>> Program : sqlplus.exe
>>>>> Fetch Calls : 1
>>>>>
>>>>> Binds
>>>>>
>>>>> =============================================
>>>>> | Name | Position | Type           | Value       |
>>>>> =============================================
>>>>> | :B2  |        1 | NUMBER         | 2           |
>>>>> | :B1  |        2 | VARCHAR2(4000) | XXXXXXXXXXX |
>>>>> =============================================
>>>>>
>>>>> Global Stats
>>>>>
>>>>> ========================================================================================
>>>>> | Elapsed | Cpu     | IO       | Concurrency | Cluster  | Fetch | Buffer | Read | Read  |
>>>>> | Time(s) | Time(s) | Waits(s) | Waits(s)    | Waits(s) | Calls | Gets   | Reqs | Bytes |
>>>>> ========================================================================================
>>>>> | 0.06    | 0.04    | 0.02     | 0.00        | 0.00     | 1     | 911    | 778  | 6MB   |
>>>>> ========================================================================================
>>>>>
>>>>> SQL Plan Monitoring Details (Plan Hash Value=692467662)
>>>>>
>>>>> ====================================================================================================================================================================================
>>>>> | Id | Operation                                 | Name      | Rows (Estim) | Cost | Time Active(s) | Start Active | Execs | Rows (Actual) | Read Reqs | Read Bytes | Activity (%) | Activity Detail (# samples) |
>>>>> ====================================================================================================================================================================================
>>>>> |  0 | SELECT STATEMENT                          |           |              |      |                |              |     1 |               |           |            |              |                             |
>>>>> |  1 | PARTITION RANGE SINGLE                    |           |        10610 |  928 |                |              |     1 |               |           |            |              |                             |
>>>>> |  2 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | TABLE1    |        10610 |  928 |                |              |     1 |               |           |            |              |                             |
>>>>> |  3 | INDEX RANGE SCAN                          | PK_TABLE1 |        10610 |  771 |                |              |     1 |               |       770 | 6MB        |              |                             |
>>>>> ====================================================================================================================================================================================
>>>>>
>>>>> Predicate Information (identified by operation id):
>>>>> ---------------------------------------------------
>>>>> 3 - access("PART_COL"=TO_NUMBER(:B2) AND "MAX_VALUE">=:B1 AND
>>>>> "MIN_VAL"<=:B1)
>>>>> filter("MAX_VALUE">=:B1)
>>>>>
>>>>> Statistics
>>>>> ----------------------------------------------------------
>>>>> 37 recursive calls
>>>>> 0 db block gets
>>>>> 911 consistent gets
>>>>> 778 physical reads
>>>>> 41076 redo size
>>>>> 260 bytes sent via SQL*Net to client
>>>>> 489 bytes received via SQL*Net from client
>>>>> 1 SQL*Net roundtrips to/from client
>>>>> 28 sorts (memory)
>>>>> 0 sorts (disk)
>>>>> 0 rows processed
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Sayan Malakshinov
>>>> Oracle performance tuning engineer
>>>> Oracle ACE
>>>> http://orasql.org
>>>>
>>>
--
http://www.freelists.org/webpage/oracle-l

Received on Wed Oct 12 2022 - 12:32:17 CEST