Re: RAC Experiences

From: Don Granaman <granaman_at_cox.net>
Date: Fri, 14 May 2004 01:27:09 -0400
Message-ID: <01db01c439e9$84d41030$6401a8c0@dilbert>

Since I have posted some of the things that might make one think that any consideration of RAC is on the lunatic fringe, I'll make some amends (perhaps)...

OPS/RAC can be useful and relatively stable. Sometimes you just hit the limit (either financially or physically) of vertical scalability and have to go horizontal. The nature of the application and its RAC implementation is, IMHO, crucial. Simply trying to prop up a poorly-designed application with a truck load of RAC hardware can be something of a disaster.

One such (painful) OPS experience of mine was an application that

(1) used no bind variables for queries (DML did and was tight code, but we had over 2500 users submitting queries)

(2) had no coherent locking strategy - leading to constant deadlocks (in RAC/OPS deadlocks can be global - and, if frequent enough, sometimes deadly)

(3) forced on *every* query on all the largest tables a hint on a sequence-generated primary key index - and a stop key (e.g. "select /*+ index (ORDERS ORDER_PK) */ ... where ACCOUNT_NO = 10289 and rownum <= 50") to sort the data in chronological order and limit the result set to what could be shown on the screen, even for queries that did not use the primary key column in the where clause - because the outsourced designers/developers knew that "Oracle sorts are too expensive". Sure. This is far better than fetching the 27 records for that account via the index on ACCOUNT_ID and sorting them. [If you want the absolute ultimate in inefficiency, try querying on COL1 and/or COL2 against a multi-million row table using a hint on the index on COLPK!] By the way, the application designers' eventual "solution" for this year+ long lunacy (after finally having to admit, in the light of a pickup load of hard evidence, that the code might *possibly* have something to do with performance) was to build the hint dynamically based on the where clause (still using a stopkey), and append the ORDER_PK column to every index - all 13 of them! [e.g. Primary key was ORDER_PK, index on (ACCOUNT_NO, ORDER_PK), index on (CLIENT_ID, ORDER_PK), etc.]

(4) when a user submitted a query, the designers' homegrown "muddleware" would submit the query on the user's behalf, but their GUI would time out after 3 seconds if the query wasn't finished - and tell the user to "Try Again". The query submitted by the muddleware would continue to run against the database though - sometimes for 45 minutes or more. Of course, the user didn't know this, so they would [loop until lunch] resubmit the query, it would "time out" but keep running against Oracle [end loop]. I once found 38 exactly identical copies of the same 40+ minute query (using the aforementioned PK hint method) running, all submitted by the same user - after the user called and complained that, even after repeated attempts, it always timed out. The eventual "solution": They changed the GUI timeout to 10 seconds.. Then 20 seconds. Ad nauseum...

(5) Led the users and others to call me at least a half-dozen times per day to ask "Why is the database so slow? Can't you tune it?" [Answer: "No"]

If you have something even vaguely like this, run for the door! Now! Putting in on RAC will likely only make it worse.

However, if you have a fairly well-designed and tuned application, some influence over its construction, and (most importantly) the respect and ears of development, you can make RAC (or OPS) work.

-Don Granaman
OraSaurus - "Entropy ain't what it used to be."

Original Message ----- From: "John P Weatherman" <asahoshi_at_infionline.net> To: "oracle-l" <oracle-l_at_freelists.org> Sent: Thursday, May 13, 2004 7:51 AM Subject: RAC Experiences

> All,
> Given some of the recent postings on RAC, we are probably crazy to
continue looking at it. Still, we have an app that we can see scaling to 40+ CPUs in the next year and want some way to avoid buying ever bigger servers (with the attendant ever bigger $$) to handle it. Currently we are looking at IBM and are left with a few questions:
>
> 1) Is anyone running production RAC on AIX and what has the experience
been from a reliability standpoint? Administrative difficulties? Etc?
>
> 2) Is anyone running production RAC on Linux on IBM hardware and what has
the experience been from a reliability standpoint? Administrative difficulties? Etc?
>
> 3) IBM claims that their cluster software handles heterogeneous boxes
quite well. We are considering the possibility of getting 3 4-way 530s to "prove out" and get RAC running. Later we would add in 650s for scaling up. Eventually the 530s go away and become something else, like a test RAC. Has anyone been running RAC in a heterogeneous environment and what have your experiences been?
>
> Any practical experience feedback would be greatly appreciated.
>
> TIA,
>
> John P Weatherman
> Oracle Database Administrator
> Advance America

Please see the official ORACLE-L FAQ: http://www.orafaq.com

To unsubscribe send email to: oracle-l-request_at_freelists.org put 'unsubscribe' in the subject line.

--
Archives are at http://www.freelists.org/archives/oracle-l/
FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------

Received on Fri May 14 2004 - 00:24:24 CDT