Re: Oracle RAC and IRQ Balance
Date: Sun, 9 Oct 2011 17:35:34 -0700
Message-ID: <CAGXkmivUtN9Re6KkZihuopfufGBVHBxWe2Dcdq6sncJVQWOEBQ_at_mail.gmail.com>
A few things:
- Just for clarity - this isn't RAC-specific. The issue of "burning" an entire core/thread on interrupts can happen on any system sending enough packets. I've seen it plenty of times on network interfaces to chatty application tiers.
- On CPU 0, the one taking all the interrupts, %user is 41.33% but %idle is only 4.08% - so this little guy is almost out of gas.
As long as no single core/thread runs out of gas, this shouldn't be an issue, but in this case it's pretty darn close -- too close for my comfort. I'd recommend enabling irqbalance and monitoring the workload & sys metrics carefully. More details on this can be found at http://irqbalance.org/
You may find that collectl [http://collectl.sourceforge.net/] comes in handy for gathering & monitoring this sys metric (and others!). My mantra on collectl is: "If your OS is Linux and you are not using collectl, you probably should be." (I'm a big fan.)
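As a quick cross-check alongside mpstat or collectl, one option is to total the per-CPU columns of /proc/interrupts and see what share the busiest CPU is taking, before and after enabling irqbalance. Below is a minimal Python sketch of that idea. It assumes the usual Linux layout (a header row of CPU names, then one row per interrupt source with one count per CPU). The function names and the SAMPLE text are made up for illustration and are not taken from Jed's system.

```python
def irq_totals_per_cpu(text):
    """Sum interrupt counts per CPU from /proc/interrupts-style text."""
    lines = text.strip().splitlines()
    cpus = lines[0].split()              # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    for line in lines[1:]:
        fields = line.split()[1:]        # drop the "NN:" label
        for i, field in enumerate(fields[:len(cpus)]):
            if not field.isdigit():      # reached the description columns
                break
            totals[i] += int(field)
    return dict(zip(cpus, totals))


def busiest_share(totals):
    """Return (cpu, fraction of all interrupts handled by that cpu)."""
    total = sum(totals.values())
    if total == 0:
        return None, 0.0
    cpu = max(totals, key=totals.get)
    return cpu, totals[cpu] / total


# Hypothetical sample with every interrupt landing on CPU0.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
  0:       1000          0          0          0    IO-APIC-edge  timer
 90:       9000          0          0          0    PCI-MSI       eth0
NMI:          0          0          0          0
ERR:          0
"""

# On a live Linux box, something like:
#   with open("/proc/interrupts") as f:
#       print(busiest_share(irq_totals_per_cpu(f.read())))
```

The counters are cumulative since boot, so to judge the effect of irqbalance, take two readings a few seconds apart and compare the differences rather than the raw totals.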
Cheers,
On Sun, Oct 9, 2011 at 10:00 AM, Riyaj Shamsudeen
<riyaj.shamsudeen_at_gmail.com> wrote:
> Hello Jed
>   NIC cards interrupt CPU for the packet delivery. Of course, in a busy RAC
> database, there can be huge amount of network packets being transferred
> leading to high IRQs. If IRQs are pinned to be interrupted to one CPU, then
> latency in that CPU can cause issues as kernel threads need to be scheduled
> to serve the irqs only in that CPU.
>   If you want IRQs to be pinned to one CPU, then you should make sure that
> no other process is scheduled to execute in that CPU. But, I see that 40% of
> usage in CPU in USER mode which indicates that this is probably not
> happening in your case.
>   But, why is this important for you? Do you see network delays causing RAC
> performance issues? If yes, then I don't see an issue of IRQs being serviced
> by all CPUs. Also, I am surprised that this is not a default.
>
>
> On Mon, Oct 3, 2011 at 4:29 PM, Walker, Jed S
> <Jed_Walker_at_cable.comcast.com> wrote:
>
>> Back to my learning of RAC. Today, it was suggested that we turn on
>> IRQBALANCE on our Oracle 11.2.0 RAC systems to help distribute the IRQ load,
>> to hopefully help with performance. I did a check and can see that just one
>> CPU appears to be handling all of these.
>> mpstat -P ALL 2
>> Linux 2.6.18-53.el5 (node-01)         10/03/2011
>>
>> 09:19:46 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
>> 09:19:48 PM  all   14.30    0.00    3.04   23.54    0.25    1.27    0.00   57.59  10903.06
>> 09:19:48 PM    0   41.33    0.00    9.18   40.31    1.02    4.08    0.00    4.08  10902.55
>> 09:19:48 PM    1    2.55    0.00    0.51   14.29    0.00    0.00    0.00   82.65      0.00
>> 09:19:48 PM    2   12.24    0.00    2.04   34.18    0.00    0.00    0.00   52.04      0.00
>> 09:19:48 PM    3    1.02    0.00    0.51    6.63    0.00    0.00    0.00   92.35      0.00
>> (this is consistent over a period of time)
>>
>> I then read an article saying that in many cases this doesn't matter -
>> something to do with processes being pinned to a CPU (Sorry, I can't find
>> the article again!).
>>
>> Does anyone have any experience, or is there a good practice for this and
>> RAC?
>>
>> service irqbalance start
>> chkconfig irqbalance on
>>
--
Regards,
Greg Rahn
http://structureddata.org
--
http://www.freelists.org/webpage/oracle-l

Received on Sun Oct 09 2011 - 19:35:34 CDT