Re: cache buffers chain latch contention

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Thu, 25 Mar 2004 16:57:46 -0000
Message-ID: <01c101c4128a$4c26bb10$7102a8c0@Primary>

Which version of Oracle ?

How parallel ?

Would you send the full execution plans for the parallel and serial create statements.

Contention for cache buffers chains latches only means that multiple processes want to find blocks that are protected by the same latch - this could be caused by almost any type of activity.

In the case of a complex parallel CTAS,
it could be that you have N slaves which are scanning one table and indexing into another (very small) table, and therefore continually hitting the same couple of blocks in the small table concurrently. If this is the case, then perhaps identifying that small table and rebuilding it with one row per block would be sufficient to solve the problem. But you do need to know very specifically where the problem is before you start rebuilding objects.

Since the process is stuck anyway, why not run it with 10046 set at level 8 - this should propagate to the slaves, and you can see which of the decomposed pieces of SQL they are getting stuck on, and this might help pinpoint the problem.

(In general, a db block doesn't hold rows from multiple tables - only a block belonging to a cluster has that option).

Regards

Jonathan Lewis
http://www.jlcomp.demon.co.uk

The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html

April 2004 Iceland http://www.index.is/oracleday.php June 2004 UK - Optimising Oracle Seminar

Original Message ----- From: "Koivu, Lisa" <Lisa.Koivu_at_Cendant-TRG.com> To: <oracle-l_at_freelists.org> Sent: Thursday, March 25, 2004 4:10 PM Subject: cache buffers chain latch contention

Hi everyone,

A couple of days ago I attempted a parallel mv-create statement in my dev database. Instead of completing, or even showing any progress, it sat there. When I didn't run it in parallel, I saw temp space being chewed up, to me that indicates progress. When I run it in parallel, nothing. It sat there.

Investigation of v$system_event indicated latch free wait. OK, fine, so I look at v$latch and find I have enormous misses and sleeps on cache buffers chains. I chased down the child latches and pinpointed the affected objects. I had several hash partitioned indexes (4096 partitions, yes I went crazy, hey why not?) that were on the latches. I rebuilt them range partitioned, tried the mv build again, same problem. OK, fine, then what?

My impression of this latch, and what happens, is this (PLEASE correct me if I am wrong): A db block can house records from multiple tables. When I fired the build in parallel, the slaves began fighting over who got the latch for what hot block in cache. It appears that I can begin to alleviate this problem by rebuilding objects off of hot child latches.

Am I wrong? Metalink doesn't say much other than "this is your problem". Gaja states this is because of excessive i/o according to Cary Millsap http://www.revealnet.com/newsletter-v3/0402_C.htm , and Steve Adams theorizes that it could be due to an extreme number of free extents, http://www.ixora.com.au/q+a/0104/13001720.htm and comes to the same conclusion as Cary
http://www.ixora.com.au/newsletter/2000_11.htm#hash_latches in that it's excessive i/o.

Well, yes, that's why I'm trying to build a mv to stop this nasty join between 18 tables for queries that someone set the expectation of "it should run in a few minutes" (don't ask, I'm tired of fighting). I need to get through the create.

My question is, will reorging objects really help? IN the last few days I have heard more than once "just separate data and indexes, temp and redo on different luns" and don't worry too much about it. This also came from an oracle "expert" that is coming next week to hold our hands through a Windows 2003 RAC install. Seems to me that if moving objects will help, this idea of just split it up simplistically and let it go is out the window.

What am I missing? Since this is a dw, yes, it's going to have a ton of i/o. There isn't much I can do about that - the reports for our first user group are detailed, not summarized. They really do pull thousands of records and look at them. The bottom line is I will need to turn PQO on in this database, and right now I am convinced I'll have the same lockup in prod because I see a large number of sleeps and misses on the same latch.

I have no idea why my title is "senior" dba. Today I feel like I just started learning Oracle yesterday.

I am interested in any comments... they sure would be appreciated... thanks all

Lisa Koivu
Oracle Database Monkey
Orlando, FL, USA

Please see the official ORACLE-L FAQ: http://www.orafaq.com

To unsubscribe send email to: oracle-l-request_at_freelists.org put 'unsubscribe' in the subject line.
--

Archives are at http://www.freelists.org/archives/oracle-l/ FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html

Received on Thu Mar 25 2004 - 11:09:16 CST