Re: Error related to hash memory

From: Jonathan Lewis <jlewisoracle_at_gmail.com>
Date: Thu, 2 Dec 2021 10:10:42 +0000
Message-ID: <CAGtsp8ka=5zC3SHD8RLQsHxa6bY7k=EkQ1zFDK0=hVz+ruoKCA_at_mail.gmail.com>



The error message is

   ORA-32690: Hash Table Infrastructure ran out of memory

Two things to note about the wording:

    It's NOT a message saying something about being unable to allocate PGA memory

   It's not a message about a hash JOIN, it's about a hash TABLE.

I don't know if I'm reading too much into the words, and I don't know exactly how Oracle handles hash aggregation, so I won't be able to answer any questions about the detailed mechanisms that might be causing this error to appear; but I do know something about how hash joins work and, in particular, how the code handles the in-memory BUILD TABLE component if it won't fit in memory.

If you look at the Estimated and Actual rows for the hash aggregation you can see that the estimate is out by a factor of over 1,000. For hash joins Oracle sets up a structure for the build table that anticipates splitting it into partitions that can spill to disc if the volume of data is too large to handle in memory (hence one-pass and multipass hash joins). It's likely that Oracle does something similar for hash aggregation but doesn't allow for the estimate being out by a factor of 1,000, and the enormous discrepancy might have resulted (for example) in Oracle not being able to keep track of the data that it had to dump to disc.

I can see two options (apart from the obvious "set the parameter according to the workaround suggested" so that the aggregation is a sort aggregate). The first is simply to find out why the estimate of rows is so bad and allow Oracle to work out a better estimate. Since there's a non-mergeable view in the query (operation 20) you may be able to inject a cardinality hint (or, to be fussy, an opt_estimate hint) that simply says "there will be 168000000 rows coming out of this query block". That might allow Oracle to create a suitable internal structure to handle the data volume correctly. However, I note that the MAX MEM for the hash aggregation is 2GB and that may be a hard limit (assuming it is an accurate report rather than a round-up from something much less), so option 2 is to reduce the absolute memory requirement of the hash aggregation.
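
As a sketch of the sort of thing I mean - the table names, view shape and query block name here are all invented since I haven't seen your SQL, so treat it as a pattern rather than a fix:

    select  /*+ cardinality(v 168000000) */
            v.cust_id, sum(v.amount)
    from    (
            select  /*+ qb_name(inline_view) */
                    cust_id, amount
            from    monthly_transactions
            ) v
    group by
            v.cust_id
    ;

    -- the fussier alternative, if I've remembered the syntax correctly, would be
    -- an outer-query hint of:  opt_estimate(@inline_view query_block rows=168000000)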

Option 2, reducing the memory requirement: the plan is going serial for the hash aggregation and it would be good to find out why; if it runs parallel the problem may disappear anyway. However, I note that there appear to be 8 scalar subqueries in the select list (operations 2 - 18), and I think they are growing the row length BEFORE the aggregation takes place. If some of these scalar subqueries can be made to operate AFTER the hash aggregation then you could be aggregating a much smaller volume of data, and the memory requirement (and the need to track a large volume spilled to disc) could be reduced to a level that bypasses the error.
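
To make the shape of that rewrite concrete - again, every name here is invented and your real query will obviously be more complicated:

    -- Before (sketch): the scalar subquery runs for every detail row and its
    -- result widens every row that feeds the hash aggregation
    select  v.cust_id, v.cust_name, sum(v.amount)
    from    (
            select  t.cust_id,
                    (select c.cust_name
                     from   customers c
                     where  c.cust_id = t.cust_id) cust_name,
                    t.amount
            from    transactions t
            ) v
    group by
            v.cust_id, v.cust_name
    ;

    -- After (sketch): aggregate the narrow rows first, then pick up the scalar
    -- subquery values from the (much smaller) aggregated result set
    select  a.cust_id,
            (select c.cust_name
             from   customers c
             where  c.cust_id = a.cust_id) cust_name,
            a.total_amount
    from    (
            select  t.cust_id, sum(t.amount) total_amount
            from    transactions t
            group by t.cust_id
            ) a
    ;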

I happen to have published a note only yesterday with an example that rewrote a query to do a sort before collecting scalar subquery column values: https://jonathanlewis.wordpress.com/2021/12/01/best-practice/ At the end of it I pointed out that generally "you don't want to do this kind of thing unless it's for very special circumstances" - but you may have exactly those circumstances. Another reason for considering this strategy is that maybe there's something about one (or more) of your scalar subqueries that forces your hash aggregation to serialise - e.g. do you call a pl/sql function that is not parallel enabled? - and moving one or more scalar subqueries out of the hash aggregation may result in parallel execution taking place.

(As others have pointed out, you're doing a serial insert anyway, so maybe
a simple "enable parallel dml" would bypass the issue.)

(On a side note: I am curious about operations 4 - 11 of your plan.
The code to generate the plan "depth" has some errors in 11g, so I'd like to know whether that part of the plan represents a single scalar subquery with a couple of cascading subqueries, a scalar subquery which uses decode()/case with further scalar subqueries embedded in the decode()/case, or some other structure.)
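
To show what I mean by those two shapes - purely invented examples, and only my guess at what your SQL might look like:

    -- shape 1: a scalar subquery with a further (cascading) scalar subquery inside it
    select  (select c.region_id +
                    (select r.uplift from regions r where r.id = c.region_id)
             from   customers c
             where  c.id = t.cust_id)  adjusted_value,
            t.amount
    from    transactions t;

    -- shape 2: a decode()/case expression with scalar subqueries embedded in it
    select  case t.status
                    when 'A' then (select c.cust_name from customers c where c.id = t.cust_id)
                    else          (select s.supp_name from suppliers s where s.id = t.supp_id)
            end  party_name,
            t.amount
    from    transactions t;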

Regards
Jonathan Lewis

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Dec 02 2021 - 11:10:42 CET
