Re: 2-node RAC faster than 6-node!
From: Dan Norris <dannorris_at_dannorris.com>
Date: Wed, 02 Apr 2008 20:23:25 -0500
Message-ID: <47F4318D.3060703@dannorris.com>
Abdul,
There was once a whitepaper describing some scalability testing done with Oracle EBS on a RAC cluster. It showed that as far as scalability goes, the two-to-three node addition didn't yield as much scalability increase as the three-to-four node addition and all other node additions beyond that. I did a quick look, but couldn't find it.
In that study, this "dip" in scalability (throughput still increased overall) was directly related to the way RAC's global cache is managed. Whenever a requested block is cached in another instance, three parties are involved: the requestor, the holder, and the global resource master for the block. In a two-node cluster, at least two of these three roles necessarily fall on the same instance, so the messaging between them is local and very fast. Once a third node is added, a significant share of these requests will involve all three instances. In that case the communication takes longer, since every message has to cross the interconnect.
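To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes (my simplification, not something from the whitepaper) that the requestor and holder are different instances and that the resource master is equally likely to be any instance in the cluster; real mastering is not uniform, so treat the output as illustrative only. The function name is made up for this example.

```python
from fractions import Fraction


def three_way_fraction(nodes: int) -> Fraction:
    """Fraction of remote-cache requests that involve three distinct instances.

    Assumption: requestor and holder are different instances, and the
    resource master is equally likely to be any of the `nodes` instances.
    """
    if nodes < 2:
        raise ValueError("a remote holder needs at least two instances")
    # The master is a third party unless it is the requestor or the holder.
    return Fraction(nodes - 2, nodes)


if __name__ == "__main__":
    for n in (2, 3, 4, 6):
        print(n, "nodes:", three_way_fraction(n))
```

Under those assumptions the three-party case cannot happen at all with two nodes, and its share only grows as nodes are added.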
That is a long explanation, but may be applicable to some extent in your case. If your application causes the database to perform a lot of interconnect messaging (most likely related to global cache requests), then this may be part of your issue. Statistics are kept on how many block transfers are requested and successful, so it'd be interesting to identify those numbers for your test scenario while varying the number of nodes.
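As a starting point for collecting those numbers, something like the sketch below might help. The query text and the `summarize` helper are my own illustration, not an Oracle-supplied script. It assumes the 10g `gc ...` statistic names in `gv$sysstat` and that the receive-time statistics are in centiseconds; please verify both on your release. The values are cumulative since instance startup, so capture them before and after each test run and work with the differences.

```python
# Assumed query: run it yourself (e.g. in SQL*Plus) for each node
# configuration and pass the resulting rows to summarize().
GC_QUERY = """
SELECT inst_id, name, value
  FROM gv$sysstat
 WHERE name IN ('gc cr blocks received', 'gc cr block receive time',
                'gc current blocks received', 'gc current block receive time')
"""


def summarize(rows):
    """Return {inst_id: (blocks_received, avg_receive_ms)}.

    `rows` is an iterable of (inst_id, name, value) tuples, ideally the
    delta between two snapshots taken around a test run.
    """
    per_inst = {}
    for inst_id, name, value in rows:
        stats = per_inst.setdefault(inst_id, {"blocks": 0, "time_cs": 0})
        if name.endswith("blocks received"):
            stats["blocks"] += value
        elif name.endswith("block receive time"):
            stats["time_cs"] += value
    result = {}
    for inst_id, s in per_inst.items():
        # Assumed unit: centiseconds, hence the factor of 10 to get ms.
        avg_ms = 10.0 * s["time_cs"] / s["blocks"] if s["blocks"] else 0.0
        result[inst_id] = (s["blocks"], avg_ms)
    return result
```

Comparing the per-instance block counts and average receive times for the 2-node, 3-node, and 6-node runs should show whether interconnect traffic grows the way the explanation above would predict.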
On the other hand, it could be something completely unrelated :). That's the wonderful, magical world that is RAC!
Dan
A Ebadi wrote:
We have an existing 4-node RAC cluster (v440's on Solaris & Oracle 10.2 w/ ASM) that was big time CPU-bound and application performance was suffering. We added 2 new nodes (M5000's) to bring the total to 6 and the app performance actually went down significantly! After messing around with it we found out that by taking the original 4 nodes out of the picture & only running on the 2 new M5000 nodes things improved significantly - the app jobs started running in some cases 20X faster than ever! We tested adding just a single node (v440) back in the mix and performance went way down immediately, so we decided to keep things running on just the 2 M5000's for now.
Any suggestions on identifying root cause as to why the adding of any nodes back to the picture slows things way down? We have an SR opened with Oracle Support but nothing firm so far.
Thanks,
Abdul
--
http://www.freelists.org/webpage/oracle-l
Received on Wed Apr 02 2008 - 20:23:25 CDT