RE: hash join waits on cpu 100% time
From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Sun, 4 Jan 2015 12:12:37 +0000
Message-ID: <CE70217733273F49A8A162EE074F64D928266A76_at_EXMBX01.thus.corp>
Date: Sun, 4 Jan 2015 12:12:37 +0000
Message-ID: <CE70217733273F49A8A162EE074F64D928266A76_at_EXMBX01.thus.corp>
You might as well capture the full SQL Monitoring output for the completed query. Then I'd check the definition of the view V_SA_tab_personal_ADD to see what it's doing with the "with subquery". I think most of the work relates purely to the number and size of the tablescans of the resultant GTT - but the CPU used still looks a little high (even if it does mange to scan 2 billiong rows in the first 1,000 seconds) so I want to know how it reduces those rows down to one per scan - as indicated by the Rows (Actual) in lines 19 and 18. I'd want the execution plan from v$sql_plan so that I could check the predicate section to see how any predicates were applied. If I were to start running the query again I would use Tanel's code to capture a snapshot of latch activity over a few seconds to see if there were any significant latch gets over and above those related to the buffer cache. Regards Jonathan Lewis http://jonathanlewis.wordpress.com _at_jloracle ________________________________________ From: GG [grzegorzof_at_interia.pl] Sent: 04 January 2015 10:48 To: Jonathan Lewis; oracle-l_at_freelists.org Subject: Re: hash join waits on cpu 100% time W dniu 2015-01-04 o 11:30, Jonathan Lewis pisze: > Checking the description of the bug it seems to say that it's the creation of the internal GTT that is the threat. In the context of the bug - it's Siebel (which typically means large numbers of columns) and a very large number of (possibly small ?) queries. This would make the recursive SQL a significant fraction of the total work done, and for concurrent queries introduce a degree of latch contention. > > In your case you are talking about the execution of a single big query; moreover, if the problem were recursive SQL then the Tanel snapshot would show a high value for STAT execute count. Finally, if you check the monitoring execution plan you can see (lines 12/13) that the Load as Select starts 8 seconds into the query and completes within one second - that's the point at which the bug would apply. > > There is one oddity that I would like an explanation for - how come line 18 started at +4 seconds, when line 12 didn't start until +8 seconds ? The only (non-buggy) thing I can think of is that there's a run-time dependency that doesn't show up in the execution plan - even then that's only showing (at worst) only a 4 second anomaly, not a very slow query. Got Your point Jonathan . Query finished after 10h , I'm out of office right now so cant provide any more details . Just wondering ,how can we narrow down the issue, should I go for detailed lvl 12 10046 trace or/and better 10053 ? Seems like Oracle instrumentation lacks of granularity if Your query 'hangs on' CPU . Maybe I'll try with few short stack dumps using oradebug when query is running . Regards GG --http://www.freelists.org/webpage/oracle-l Received on Sun Jan 04 2015 - 13:12:37 CET