Re: Error related to hash memory

From: Jonathan Lewis <jlewisoracle_at_gmail.com>
Date: Thu, 2 Dec 2021 10:56:27 +0000
Message-ID: <CAGtsp8=jLtzO56SLeQFwT0LENcNrLhDCTT-G9fHnWSgKppk+ew_at_mail.gmail.com>



Just a little follow-up: I'd forgotten that I had published a note a few years ago about the "depth" error in execution plans with case/decode() that included embedded scalar subqueries; but I went looking for it and it's here ( https://jonathanlewis.wordpress.com/2014/10/19/plan-depth/ ) if you want to check it out.

Regards
Jonathan Lewis

On Thu, 2 Dec 2021 at 10:10, Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:

>
> The error message is
> ORA-32690: Hash Table Infrastructure ran out of memory
> Two things to note about the wording:
> It's NOT a message saying something about being unable to allocate PGA
> memory.
> It's not a message about a hash JOIN, it's about a hash TABLE.
>
> I do not know if I'm reading too much into the words, and I don't know
> exactly how Oracle handles hash aggregation so I won't be able to answer
> any questions about the detailed mechanisms that might be causing this
> error to appear; but I do know something about how hash joins work and, in
> particular, how the code handles the in-memory BUILD TABLE component if it
> won't fit in memory.
>
> If you look at the Estimated and Actual rows for the hash aggregation you
> can see that the estimate is out by a factor of over 1,000. For hash joins
> Oracle sets up a structure for the build table that anticipates splitting
> it into partitions that can spill to disc if the volume of data is too
> large to handle in memory (hence one-pass and multipass hash joins). It's
> likely that Oracle does something similar for hash aggregation but doesn't
> allow for the estimate being out by a factor of 1,000 and the enormous
> discrepancy might have resulted (for example) in Oracle not being able to
> keep track of data that it had to dump to disc.
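>
> (As a hedged sketch, not something from your output: if you want to see
> whether the workareas for the problem statement ran optimal, one-pass or
> multipass, v$sql_workarea reports that per plan operation - substitute the
> real sql_id of your insert for the substitution variable.)
>
> select
>         operation_type, operation_id,
>         last_memory_used, max_tempseg_size,
>         optimal_executions, onepass_executions, multipasses_executions
> from    v$sql_workarea
> where   sql_id = '&m_sql_id'
> order by
>         operation_id
> ;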
>
> I can see two options (apart from the obvious "set the parameter according
> to the workaround suggested" so that the aggregation is a sort aggregate).
> The first is simply to find out why the estimate of rows is so bad and
> allow Oracle to work out a better estimate. Since there's a non-mergeable
> view in the query (operation 20) you may be able to inject a cardinality
> (or, to be fussy, an opt_estimate hint) that simply says "there will be
> 168000000 rows coming out of this query block". That might allow Oracle to
> create a suitable internal structure to handle the data volume correctly.
> However, I note that the MAX MEM for the hash aggregation is 2GB and that
> may be a hard limit (assuming it is an accurate report rather than a
> round-up from something much less), so option 2 is to reduce the absolute
> memory requirement of the hash aggregation.
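>
> Purely as a hedged illustration of the shape of that hint - the inline view,
> aliases and query block name below are invented, not taken from your plan;
> the cardinality() form is the simpler one, and the opt_estimate() form would
> name the query block reported in the '+alias' section of the plan:
>
> select  /*+ cardinality(v 168000000) */
>         v.owner, v.object_type, v.ct
> from    (
>         select  /*+ no_merge qb_name(agg_view) */
>                 owner, object_type, count(*) ct
>         from    all_objects
>         group by
>                 owner, object_type
>         ) v
> ;
>
> -- or, to be fussy, something like:
> --   /*+ opt_estimate(@agg_view query_block rows=168000000) */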
>
>
> Option 2, reducing the memory requirement: the plan is going serial for
> the hash aggregation and it would be good to find out why. If it runs
> parallel the problem may disappear anyway. However, I note that there appear
> to be 8 scalar subqueries in the select list; (operations 2 - 18), and I
> think they are growing the row length BEFORE the aggregation takes place.
> If some of these scalar subqueries can be made to operate AFTER the hash
> aggregation then you could be aggregating a much smaller volume of data and
> the memory requirement and the need to track a large volume spilled to disc
> could be reduced to a level that bypasses the error.
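>
> Again only a hedged sketch of the shape of that rewrite (the tables and
> columns are invented): aggregate inside a non-mergeable inline view first,
> then pick up the scalar subquery values against the much smaller aggregated
> row set:
>
> select
>         agg.cust_id,
>         agg.total_amount,
>         (select c.cust_name from customers c where c.cust_id = agg.cust_id) cust_name
> from    (
>         select  /*+ no_merge */
>                 cust_id, sum(amount) total_amount
>         from    sales
>         group by
>                 cust_id
>         ) agg
> ;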
>
> I happen to have published a note only yesterday with an example that
> rewrote a query to do a sort before collecting scalar subquery column
> values: https://jonathanlewis.wordpress.com/2021/12/01/best-practice/ At
> the end of it I pointed out that generally "you don't want to do this kind
> of thing unless it's for very special circumstances" - but you may have
> exactly those circumstances. Another reason for considering this strategy
> is that maybe there's something about one (or more) of your scalar
> subqueries that forces your hash aggregation to serialise - e.g. do you call
> a pl/sql function that is not parallel enabled - and moving one or more
> scalar subqueries out of the hash aggregation may result in parallel
> execution taking place.
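>
> For what it's worth, a hedged sketch of the declaration that matters (the
> function itself is invented): a function called from the select list
> generally needs to be declared parallel_enable before the operation calling
> it is allowed to run in parallel.
>
> create or replace function f_rate(p_code in varchar2)
> return number
> parallel_enable        -- without this the calling query may be forced to run serially
> as
> begin
>         return case p_code when 'GBP' then 1 else 1.2 end;
> end f_rate;
> /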
>
> (As others have pointed out - you're doing a serial insert anyway, so maybe
> a simple "enable parallel dml" would bypass the issue.)
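>
> (Only a sketch, with invented table names and an arbitrary degree - the
> point being that parallel DML has to be enabled at the session level before
> the insert itself can go parallel:)
>
> alter session enable parallel dml;
>
> insert  /*+ append parallel(target_table 8) */
> into    target_table
> select  /*+ parallel(source_table 8) */
>         *
> from    source_table
> ;
>
> commit;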
>
> (On a side note: I am curious about the operations 4 - 11 of your plan.
> The code to generate the plan "depth" has some errors in 11g, so I'd like
> to know whether that part of the plan represents a single scalar subquery
> with a couple of cascading subqueries, or a scalar subquery which uses
> decode()/case with further scalar subqueries embedded in the decode()/case,
> or has some other structure.)
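>
> (To make that question concrete, a hedged sketch with invented tables of the
> two shapes I mean:)
>
> -- shape 1: a scalar subquery with a further scalar subquery cascading inside it
> select
>         f.id,
>         (select d.val
>          from   dim1 d
>          where  d.id  = f.d1_id
>          and    d.grp = (select g.grp from grp g where g.id = d.grp_id)
>         ) v1
> from    fact f
> ;
>
> -- shape 2: decode()/case in the select list with scalar subqueries embedded in it
> select
>         f.id,
>         case f.flag
>                 when 'A' then (select d1.val from dim1 d1 where d1.id = f.d1_id)
>                 when 'B' then (select d2.val from dim2 d2 where d2.id = f.d2_id)
>         end v2
> from    fact f
> ;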
>
> Regards
> Jonathan Lewis

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Dec 02 2021 - 11:56:27 CET
