Re: Fixing Performance issue with less selective columns

From: Sayan Malakshinov <xt.and.r_at_gmail.com>
Date: Wed, 25 Aug 2021 23:26:42 +0300
Message-ID: <CAOVevU7FS5ZDfkU9A7PybkDUsL8trD2G6osCp0n6d47fEf6J_w_at_mail.gmail.com>



Hi Lok,

I don't see statistics by "DC_CODE" column, but looking at the results of that query I asked for:
select
  NVL(I.MA_FLG, 'N'),NVL(I.D_UNMTCH, 'N'),I.DC_CODE,count(*) FROM PP_IN_TAB I
group by NVL(I.MA_FLG, 'N'),NVL(I.D_UNMTCH, 'N'),I.DC_CODE;

NVL(I.MA_FLG,'N') NVL(I.D_UNMTCH,'N') DC_CODE COUNT(*) N N D 98348098
N N C 7510750
Y Y D 224
N Y D 9319
N N 78
N Y C 3977
Y N D 5931061
Y N C 69458
Y Y C 195

and considering your binds:
> "exec :b4:= 'A';"

and statistics:
TABLE_NAME COLUMN_NAME NUM_DISTINCT NUM_NULLS PP_IN_TAB WOF_DATE 2572 *83154* looks like these 2 predicates are most selective: AND I.DC_CODE = :B4 -- gives 0 rows
AND I.WOF_DATE IS NULL -- gives 83k rows

So I would create an index on them:
> create index ... on PP_IN_TAB (DC_CODE, WOF_DATE)
(you can consider it also in the reverse order: (WOF_DATE, DC_CODE))

What does it give: if :b4 = 'A', it will return 0 rows immediately. And even if :b4 = 'C' or 'D', this index will help you to filter out by "I.WOF_DATE IS NULL" (less than 83154 rows from 111mln). You can even check it with this query:

select DC_CODE, count(*) from PP_IN_TAB where DC_CODE is null group by DC_CODE;
This query will show the maximum number of rows you will get from this index for different :b4 values.

On Wed, Aug 25, 2021 at 10:34 PM Lok P <loknath.73_at_gmail.com> wrote:

> Thank you Lothar. Actually this is an OLTP kind of database and I do see
> UPDATE queries on these table columns. Need to see in detail about the
> frequency of those and how concurrent they are. I am Not very familiar with
> the usage of bitmap indexes , so trying to understand what exact column you
> are suggesting for the bitmap index to help this query?
>
> On Wed, Aug 25, 2021 at 11:45 PM Lothar Flatz <l.flatz_at_bluewin.ch> wrote:
>
>> This scenario cries for bitmap indexes.
>> Bitmap Indexes can deal with "not equal" as well as "is null".
>> The columns seems to be low cardinality too.
>> The only open question is how often these columns get updated. (
>> https://asktom.oracle.com/pls/apex/asktom.search?tag=bitmap-indexes-and-locking
>> ).
>>
>> Regards
>>
>> Lothar
>>
>> Am 25.08.2021 um 19:19 schrieb Mark W. Farnham:
>>
>> unfortunately you keep nearly all the rows of both MA_FLG and D_UNMTCH,
>> so this query is the opposite of those indexes being useful.
>>
>>
>>
>> IF you were looking for ‘Y’ instead of not ‘Y’ on either one it would be
>> extremely good. I didn’t see initially that these two columns are extremely
>> inclusive.
>>
>>
>>
>> I think Sayan was checking that in his query request. MA_FLG could reject
>> at most about 6 million rows, so that’s pretty worthless.
>>
>>
>>
>> *From:* Lok P [mailto:loknath.73_at_gmail.com <loknath.73_at_gmail.com>]
>> *Sent:* Wednesday, August 25, 2021 1:09 PM
>> *To:* Mark W. Farnham
>> *Cc:* Sayan Malakshinov; Oracle L
>> *Subject:* Re: Fixing Performance issue with less selective columns
>>
>>
>>
>> Thank You Mark.
>>
>> I may be wrong but in this situation I was unable to think of any other
>> way we could make this query faster , so I was thinking of creating a new
>> index. If there exists any other way to make this query faster without
>> creating any new index that would really be helpful.
>>
>>
>>
>> I am not able to get your point fully, If you can help me understand it a
>> bit more here please. Below is the data pattern for MA_FLG and D_UNMTCH.
>>
>>
>>
>> Thus , in this query condition " NVL (I.MA_FLG, 'N') <> 'Y' results in
>> ~105million and NVL (I.D_UNMTCH, 'N') <> 'Y' results in ~111million. So
>> how should I create index or modify code to make it the best access/filter
>> criteria so as to make the query faster?
>>
>>
>>
>>
>>
>> MA_FLG
>>
>> Count(*)
>>
>> N
>>
>> 105228656
>>
>> Y
>>
>> 6000938
>>
>>
>>
>> 643566
>>
>>
>>
>> D_UNMTCH
>>
>> Count(*)
>>
>> Y
>>
>> 13715
>>
>>
>>
>> 111859445
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Aug 25, 2021 at 8:00 PM Mark W. Farnham <mwf_at_rsiz.com> wrote:
>>
>> The other thing, for flag values like AND NVL (I.MA_FLG, 'N') <>
>> 'Y' AND NVL (I.D_UNMTCH, 'N') <> 'Y'
>>
>>
>>
>> if you’re thinking about adding an index, and even if you need a virtual
>> column to do this because you have too much code depending on values ‘N’
>> and ‘Y’, define the final status (the one where nearly all of them land) as
>> NULL, being the ones you are NOT interested in most of the time. In both
>> these cases it looks like ‘Y’ would then be NULL, so
>>
>>
>>
>> i.ma_flg_v is defined decoding Y to NULL and anything else to N and your
>> code becomes and i.ma_flg_v = ‘N’ and you deal with variability in
>> non-nulls that are not ‘Y’ on the original,
>>
>> or
>>
>> i.ma_flg_v decodes Y to NULL, NULL to ‘N’ and anything else unchanged and
>> your code becomes i.ma_flg_v is NOT NULL,
>>
>> or
>>
>> you make a functional index on i.ma_flg that does the equivalent.
>>
>>
>>
>> I can’t remember off the top of my head whether either way gives you a
>> real advantage over the other in stats collections and the CBO doing
>> something smart and that probably changed over the releases. That might be
>> in one of my papers.
>>
>>
>>
>> When you then index that column the nulls disappear, leaving you with a
>> very tiny index to prune your result set immediately to very small and you
>> can usually filter the rest fast without an index.
>>
>>
>>
>> Remember, ORACLE cannot assign a value to NULL in anything they do. But
>> YOU can.
>>
>>
>>
>> When this is appropriate, it is one of the neatest and easiest “magic
>> tricks” in the Oracle kit.
>>
>>
>>
>> Good luck,
>>
>>
>>
>> mwf
>>
>>
>>
>> *From:* oracle-l-bounce_at_freelists.org [mailto:
>> oracle-l-bounce_at_freelists.org] *On Behalf Of *Sayan Malakshinov
>> *Sent:* Wednesday, August 25, 2021 9:40 AM
>> *To:* Lok P
>> *Cc:* Oracle L
>> *Subject:* Re: Fixing Performance issue with less selective columns
>>
>>
>>
>> Hi Lok,
>>
>>
>>
>> > SUBSTR(:B8,0*.50*)
>>
>> Looks like this query should be analyzed and tested better.
>>
>> You haven't provided histograms and bind values statistics, so not
>> enough info to analyze it properly.
>>
>> For now it looks like "I.WOF_DATE IS NULL" is one of the most selective
>> predicates - it gives only 83154 nulls.
>>
>> In addition to histogram statistics(dba_tab_histograms) and most often
>> binds values, I would like also to see what does return this query:
>>
>> select
>> NVL(I.MA_FLG, 'N'),NVL(I.D_UNMTCH, 'N'),I.DC_CODE,count(*)
>> FROM PP_IN_TAB I
>> group by NVL(I.MA_FLG, 'N'),NVL(I.D_UNMTCH, 'N'),I.DC_CODE;
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Aug 25, 2021 at 4:14 PM Lok P <loknath.73_at_gmail.com> wrote:
>>
>> Hello , This database has version 11.2.0.4 of Oracle. We have the below
>> query which is executed thousands of times. It's used in a plsql function
>> which in turn gets called from a procedure. And this procedure gets called
>> from java thousands of times. And I see from dba_hist_sqlstat , for most
>> of the runs this below query results in zero rows. We see from the active
>> session history for the overall process this query is consuming most
>> time/resources and making the process run longer. So wanted to understand
>> if we can make this individual query execution faster which would
>> eventually make the process faster?
>>
>> The base table- PP_IN_TAB is holding ~111million rows and is ~43GB in
>> size. Column PP_ID is the primary key here. The filter predicates used in
>> this query are below. Many of them were not very selective in nature. So I
>> am not able to conclude if any composite index is going to help us here.
>> Can you please guide me , what is the correct approach to tune this process
>> in such a scenario?
>>
>> Below is the column data pattern used as filter predicate in this query.
>> Most of these are less selective in nature.
>>
>>
>>
>> *TABLE_NAME*
>>
>> *COLUMN_NAME*
>>
>> *NUM_DISTINCT*
>>
>> *NUM_NULLS*
>>
>> PP_IN_TAB
>>
>> EF_ID
>>
>> 39515
>>
>> 6151686
>>
>> PP_IN_TAB
>>
>> PE
>>
>> 103074806
>>
>> 647050
>>
>> PP_IN_TAB
>>
>> PT_Code
>>
>> 24
>>
>> 0
>>
>> PP_IN_TAB
>>
>> PT_MCODE
>>
>> 20
>>
>> 0
>>
>> PP_IN_TAB
>>
>> D_CUR_CODE
>>
>> 13
>>
>> 592784
>>
>> PP_IN_TAB
>>
>> ED_AMT
>>
>> 320892
>>
>> 6
>>
>> PP_IN_TAB
>>
>> WOF_DATE
>>
>> 2572
>>
>> 83154
>>
>> PP_IN_TAB
>>
>> PR_CTGRY
>>
>> 2
>>
>> 86
>>
>> PP_IN_TAB
>>
>> PDE_RSN_CAT
>>
>> 6
>>
>> 0
>>
>> PP_IN_TAB
>>
>> MA_FLG
>>
>> 2
>>
>> 648172
>>
>> PP_IN_TAB
>>
>> M_TXT
>>
>> 29460248
>>
>> 9118572
>>
>> PP_IN_TAB
>>
>> D_UNMTCH
>>
>> 1
>>
>> 111766716
>>
>>
>>
>>
>>
>> SELECT NVL (I.PP_ID, 0)
>> FROM PP_IN_TAB I
>> WHERE TRIM(I.M_TXT) = TRIM (SUBSTR ( :B8, 0.50)) AND I.PT_Code
>> = :B7
>> AND NVL ( :B6, I.PT_MCODE) = NVL ( :B6, :B5) AND I.DC_CODE =
>> :B4
>> AND I.D_CUR_CODE = :B3 AND I.ED_AMT = :B2
>> AND I.PR_CTGRY = :B1 AND I.PE IS NOT NULL
>> AND I.EF_ID IS NULL AND I.WOF_DATE IS NULL
>> AND NVL (I.MA_FLG, 'N') <> 'Y' AND NVL (I.D_UNMTCH, 'N') <>
>> 'Y'
>> AND ROWNUM = 1;
>>
>>
>> Global Information
>> ------------------------------
>> Status : DONE (ALL ROWS)
>> Instance ID : 1
>> SQL Execution ID : 16777216
>> Execution Started : 08/25/2021 03:53:25
>> First Refresh Time : 08/25/2021 03:53:25
>> Last Refresh Time : 08/25/2021 03:53:28
>> Duration : 3s
>> Module/Action : SQL*Plus/-
>> Program : sqlplus.exe
>> Fetch Calls : 1
>>
>> Global Stats
>>
>> =========================================================================================
>> | Elapsed | Cpu | IO | Application | Fetch | Buffer | Read |
>> Read | Cell |
>> | Time(s) | Time(s) | Waits(s) | Waits(s) | Calls | Gets | Reqs |
>> Bytes | Offload |
>>
>> =========================================================================================
>> | 3.30 | 1.15 | 2.15 | 0.00 | 1 | 6M | 44379 |
>> 43GB | 99.99% |
>>
>> =========================================================================================
>>
>> SQL Plan Monitoring Details (Plan Hash Value=1096440065)
>>
>> ==========================================================================================================================================================================================
>> | Id | Operation | Name | Rows | Cost |
>> Time | Start | Execs | Rows | Read | Read | Cell | Mem |
>> Activity | Activity Detail |
>> | | | | (Estim) | |
>> Active(s) | Active | | (Actual) | Reqs | Bytes | Offload | (Max) |
>> (%) | (# samples) |
>>
>> ==========================================================================================================================================================================================
>> | 0 | SELECT STATEMENT | | | |
>> | | 1 | | | | | |
>> | |
>> | 1 | COUNT STOPKEY | | | |
>> | | 1 | | | | | |
>> | |
>> | 2 | TABLE ACCESS STORAGE FULL | PP_IN_TAB | 1 | 128K |
>> 3 | +2 | 1 | 0 | 44379 | 43GB | 99.99% | 6M |
>> 100.00 | cell smart table scan (3) |
>> ==========================================================================================================================================================================================
>>
>>
>> Predicate Information (identified by operation id):
>> ---------------------------------------------------
>>
>> 1 - filter(ROWNUM=1)
>> 2 - storage("I"."WOF_DATE" IS NULL AND "I"."EF_ID" IS NULL AND
>> "I"."PT_Code"=:B7 AND "I"."D_CUR_CODE"=:B3 AND "I"."PR_CTGRY"=:B1 AND
>> "I"."DC_CODE"=:B4 AND "I"."ED_AMT"=TO_NUMBER(:B2) AND
>> NVL(:B6,"I"."PT_MCODE")=NVL(:B6,:B5) AND
>> TRIM("I"."M_TXT")=TRIM(SUBSTR(:B8,0.50))
>> AND "I"."PE" IS NOT NULL AND NVL("I"."MA_FLG",'N')<>'Y' AND
>> NVL("I"."D_UNMTCH",'N')<>'Y')
>> filter("I"."WOF_DATE" IS NULL AND "I"."EF_ID" IS NULL AND
>> "I"."PT_Code"=:B7 AND "I"."D_CUR_CODE"=:B3 AND "I"."PR_CTGRY"=:B1 AND
>> "I"."DC_CODE"=:B4 AND "I"."ED_AMT"=TO_NUMBER(:B2) AND
>> NVL(:B6,"I"."PT_MCODE")=NVL(:B6,:B5) AND
>> TRIM("I"."M_TXT")=TRIM(SUBSTR(:B8,0.50))
>> AND "I"."PE" IS NOT NULL AND NVL("I"."MA_FLG",'N')<>'Y'
>> AND NVL("I"."D_UNMTCH",'N')<>'Y')
>>
>>
>>
>>
>> --
>>
>> Best regards,
>> Sayan Malakshinov
>>
>> Oracle performance tuning engineer
>>
>> Oracle ACE Associate
>> http://orasql.org
>>
>>
>>

-- 
Best regards,
Sayan Malakshinov
Oracle performance tuning engineer
Oracle ACE Associate
http://orasql.org

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Aug 25 2021 - 22:26:42 CEST

Original text of this message