Re: Unique index access path seems very slow
Date: Mon, 23 Jan 2023 12:41:34 -0800
Message-ID: <CACj1VR5KHaRPxp4mmmuiPy6LfSRFPL0KbHDzsNAbN9cCQ=gS_w_at_mail.gmail.com>
Hi Pap,
No real suggestions but there’s a few things I would think about:
The join on the hash function means that you’ll be reading from all over the index randomly, you’re not going to benefit from caching unless you’ve managed to fit the whole index in the cache.
Is there additional information you can take advantage of? Eg is the related data inserted at the same time? That way you can do a hash join of the two recently inserted data sets.
Applying the hash function over and over is probably chewing up cpu and might be interfering with how Oracle can batch up those nested loops. Storing the hash value in that table when the data is inserted is probably a much better idea, rather than calculate it over and over again.
Looking at the time spent on IO and the number of requests: you’re getting about 100 IOPS. That doesn’t seem very good, perhaps you have already maxed out the IO capacity with other processes.
Hope this helps,
Andy
On Mon, Jan 23, 2023 at 10:47 AM, Pap <oracle.developer35_at_gmail.com> wrote:
> This database is on 19C of Oracle exadata. Below is the query and the sql
> monitor for both the FULL scan path and index access path. Also I had
> captured the non-zero stats from gv$sesstat for ~10minutes execution for
> each of those runs. The index access path has just processed ~82K in
> 30minutes using the index access path and has still not finished yet. We
> want to make this query finish in ~2-3minutes.
>
> SELECT TRAN_TAB.COL1_SHA_512, TAB_ENCRYPT.TAB_ENCRYPT_COL2.....
> FROM (SELECT RAWTOHEX ( STANDARD_HASH (TO_CHAR (TRIM (
> TRAN_TAB.LN4)),'SHA512')) AS COL1_SHA_512
> FROM TRAN_TAB
> WHERE TRAN_TAB.PD_CD(+) = 'XXX'
> AND TRAN_TAB.PART_DT = to_date(:B1,'DD-MON-YYYY HH24:MI:SS')
> AND TRAN_TAB.dt_cr between to_date(:B2,'DD-MON-YYYY HH24:MI:SS') and
> to_date(:B3,'DD-MON-YYYY HH24:MI:SS')
> ) TRAN_TAB, TAB_ENCRYPT
> WHERE TRAN_TAB.COL1_SHA_512 = TAB_ENCRYPT.COL1_SHA_512(+)
>
> ****** "Full scan" path sql monitor + nonzero results from gv$sesstat from
> initial ~10minutes run*******
>
> https://gist.github.com/oracle9999/b37160097fa8e3929fb66af61ebbf9ed
>
> ***** "Index access" path sql monitor + nonzero results from gv$sesstat
> from initial ~10minutes run*******
>
> https://gist.github.com/oracle9999/9792a70f748b71c365397c5458d3e9e8
>
> The requirement is something as below...
>
> We have two tables TRAN_TAB and TAB_ENCRYPT. Table(TAB_ENCRYPT) is an
> insert only table and holds the encrypted value(COL1_SHA_512) of a
> confidential column/attribute. Each row of the transaction table(TRAN_TAB)
> has to be checked/joined against all the encrypted column values of table
> TAB_ENCRYPT using the above query and the results along with additional
> columns from table tab_encrypt have to be sent to another downstream
> system. This query is supposed to run multiple times in a day so as to send
> the results to the downstream system as real time as possible based on the
> records in the transaction table tran_tab. Currently the plan is to run it
> once in ~5minutes. The table TAB_ENCRYPT holds ~3billion rows. And the
> transaction table TRAN_TAB will have Approx ~400 million rows per
> day/part_dt to get it joined with the encrypted column value- COL1_SHA_512.
>
> As in the sql monitor, the original query was taking ~30minutes to join
> with ~3million rows in the transaction table. And the majority of the
> resources seems to be spent while full scanning the table TAB_ENCRYPT. And
> thus we tried creating a unique index on column "COL1_SHA_512" thinking it
> will be fast, but it appears to be further degrading the performance of
> this query. Want to understand why it's happening this way. Is it because
> the column holds ~128byte values so the index becomes too big to fit in
> cache? and then how can we fix this query?
>
> The column "COL1_SHA_512" , holds the SHA512 encrypted value of another
> column. The sample encrypted values look something like below. The size of
> the index on this column is ~700GB. And the statistics of the index is as
> below. It contains all unique values though but i think because of its
> 128byte length the size of the index becomes this huge. The size of the
> table tran_tab is ~880GB.
>
>
> "99983FB4F3BEAA516BE40B85706338B0D7E5D14D1B2A3C945D5F74947A3E8DA8E714D1761D008715CD46B6C6CDE8B99690F2AE04D97D9KHYU67GBF4589HNFRSDR"
>
>
> *Column 'COL1_SHA_512' statistics:- *NUM_DISTINCT NULLABLE DENSITY
> NUM_NULLS AVG_COL_LEN
> 2987284438 N 0.000000000334752187397831
> 0 129
>
>
> *Unique Index(on column COL1_SHA_512) Statistics:- *INDEX_NAME
> BLEVEL PCT_FREE LEAF_BLOCKS CLUSTERING_FACTOR
> NUM_ROWS
> TAB_ENCRYPT_UK 3 10 83389146
> 3008764650 3008764650
>
> Regards
> Pap
>
>
>
>
-- http://www.freelists.org/webpage/oracle-lReceived on Mon Jan 23 2023 - 21:41:34 CET