Re: Unique index access path seems very slow

From: Jonathan Lewis <jlewisoracle_at_gmail.com>
Date: Mon, 30 Jan 2023 22:19:59 +0000
Message-ID: <CAGtsp8k5Bq8gBP1r0R2QSx4k2Y+g69hBpHtHF6EbVFZr8cYzeA_at_mail.gmail.com>



Comparing the 5 hash join plans you've posted:

(a) The example where you compared the plan including the Bloom filter against the one with the 2M hint. They are nearly identical, and critically the CPU at operations 6 and 7 is very high, with the row count dropping from 3G to 99M. The CPU at operation 2 is then small. I suspect you may have left the cardinality 2M hint in place for this example.

(b) The other three plans, where the TRAN_TAB estimate is 49,000. The CPU time in the scan of encrypt_tab is around 150 seconds, and the CPU for the hash join goes high - 350 seconds for the original and for the hinted 49,000; 450 seconds for the one with the Bloom filter disabled.

Roughly speaking, Oracle's implementation of Bloom filtering is the first check in a hash join (i.e. are there possible matches for the latest probe table row in the relevant build table hash bucket?), so we can use the CPU for that step either down at the scan of encrypt_tab or up at the hash join. We cannot avoid it. What we hope to do with the Bloom filter is reduce the volume of data that passes up a couple of lines in the plan (i.e. through a couple of call stacks). When we disable the Bloom filter that part of the test happens at the hash join instead, and all the data passed in from the cells has to pass up the plan.
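As a toy illustration (not Oracle's actual code - the filter sizing and hashing here are made up for the sketch), the "filter first, probe the bucket second" idea looks like this:

```python
# Hypothetical sketch of the "Bloom filter before bucket probe" step of a
# hash join. Oracle's real implementation differs; this only shows why the
# CPU for the check is spent either at the scan or at the join, never skipped.

def make_bloom(keys, m_bits):
    """Set one bit per build-table join key (single hash function)."""
    bits = 0
    for k in keys:
        bits |= 1 << (hash(k) % m_bits)
    return bits

def maybe_present(bits, m_bits, key):
    """True means 'possibly in the build table'; False is a definite miss."""
    return (bits >> (hash(key) % m_bits)) & 1 == 1

build_keys = ["k1", "k2", "k3"]        # build (first) table join keys
bf = make_bloom(build_keys, m_bits=64)

# Every genuine match must pass the filter; a non-matching probe key
# usually fails here and never touches a hash bucket, but a hash
# collision (false positive) can still let it through to the join.
assert all(maybe_present(bf, 64, k) for k in build_keys)
```

The point of pushing this check down to the cell server is that rows failing it never travel up through the intervening plan lines at all.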

The size of the hash table (i.e. the number of hash buckets / bits in the Bloom filter) is affected by the optimizer's estimate of the number of rows in the build (first) table. This is why the estimate of 2M eliminated more data with the Bloom filter than the estimate of 49,000; it increased the size of the Bloom filter and so reduced the "collisions" or "false positives". But there's a down-side.
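The effect of the cardinality estimate on elimination can be illustrated with the standard Bloom-filter false-positive formula (the bits-per-estimated-row figure below is a made-up assumption for the arithmetic, not an Oracle internal):

```python
import math

def fp_rate(n_build_rows, m_bits, k_hashes=1):
    """Probability that a non-matching probe row still passes the filter:
    p = (1 - e^(-k*n/m))^k for n inserted keys, m bits, k hash functions."""
    return (1.0 - math.exp(-k_hashes * n_build_rows / m_bits)) ** k_hashes

BITS_PER_EST_ROW = 10        # assumed sizing rule, purely illustrative
actual_build_rows = 2_000_000

# Filter sized from the 49K estimate vs the 2M estimate, holding the
# real build row count at 2M in both cases:
p_small = fp_rate(actual_build_rows, 49_000 * BITS_PER_EST_ROW)
p_large = fp_rate(actual_build_rows, 2_000_000 * BITS_PER_EST_ROW)
# p_small comes out close to 1 (almost every probe row slips through,
# i.e. almost no elimination); p_large is roughly 0.1.
```

So an under-estimate of the build rows produces a filter that lets nearly everything through, which matches the "eliminated SOME data, though not very much" behaviour described below.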

When the estimate was 2M I think (it's a guess) that the Bloom filter became too large to be passed down to the cell server, and that's why the actual rows for the encrypt_tab scan were 3G; but when the estimate was 49K the actual rows for the scan were 2G, because the Bloom filter (when it was enabled) was passed to the cell server, which managed to eliminate SOME data, though not very much. And, of course, since the cells had already eliminated that data, the Bloom filter operation at the database server didn't eliminate any more, so it passed 2G rows up to the hash join.

Note that the Offload Returned Bytes was 300GB for the 49K estimate with the small Bloom filter, and 500GB for the 2M estimate with the large Bloom filter.

It looks like we need to "fake" the system so that the Bloom filter (estimate) is large enough to eliminate a lot of data while being small enough to be sent to the cell server, so that the 14 concurrently active cells can do the row elimination. Beyond that I don't think there's a way to make the query go faster than the (roughly) 650 seconds you've seen so far.


The two suggestions I've made so far (use RAW not HEX, and hash partition/subpartition for partition-wise joins) can make a big difference though.

  1. using RAW means each value is half the size, so the number of hash buckets can increase significantly for the same amount of memory, and that MAY help Oracle with elimination (though you still have to avoid the bitmap / Bloom filter getting too big)
  2. if you hash partition the encrypt_tab table N ways (N = a power of 2) and hash subpartition the trans_tab table to match, then Oracle can walk through each of the N pairs in turn, and the bitmap / Bloom filter at each step can be just as big as the largest viable Bloom filter you could create for the full data set while only having to cope with 1/Nth of the data - which means fewer collisions, more elimination, and much less data passing up the plan.
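Continuing the same toy false-positive arithmetic (partition count, row count, and filter size below are all assumed numbers, not taken from your system), a partition-wise join gives each per-pair filter only 1/Nth of the build rows for the same bit budget:

```python
import math

def fp_rate(n_build_rows, m_bits):
    # Single-hash Bloom/bitmap false-positive estimate: 1 - e^(-n/m)
    return 1.0 - math.exp(-n_build_rows / m_bits)

N = 8                        # hypothetical number of hash (sub)partitions
build_rows = 2_000_000       # assumed total build-table rows
m_bits = 1_000_000           # assumed largest filter the cells will accept

p_whole = fp_rate(build_rows, m_bits)          # one filter for all the data
p_per_pair = fp_rate(build_rows // N, m_bits)  # same-size filter, 1/N rows
assert p_per_pair < p_whole  # fewer collisions on each partition pair
```

With these made-up numbers the single filter passes most non-matching rows while each per-partition filter passes only a small fraction, which is the "fewer collisions, more elimination" effect of point 2.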

Side note: Unless things have changed in recent versions, a 10104 trace will tell you what's going on with the hash join, including the number of buckets in the hash table and the distribution of rows per bucket - that might be quite interesting, and could corroborate some of my comments about collisions/false positives etc.

Regards
Jonathan Lewis

On Mon, 30 Jan 2023 at 18:37, Pap <oracle.developer35_at_gmail.com> wrote:

> Yes Mark, that is kind of what I got to know from the requirements so far.
> And as I understood from this discussion so far, the encrypted raw value
> is 50% smaller in size and is the same irrespective of the client and is
> not going to change with the new oracle version etc. So we are in
> agreement with the dev team to change the column to RAW data type. However,
> considering that will be a design change and will take some time, we were
> trying to see if we can make the existing query run faster by any quick fix
> or code change etc. And as I posted the two sql monitors above, it appears
> to be the same query sometimes running for 30 minutes and
> sometimes ~10 minutes, so trying to understand the cause and if we can make
> it consistent?
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Mon Jan 30 2023 - 23:19:59 CET
