RE: Unknown Exadata Cell Parameter - "_cell_fc_toresilver_limit_chdrs"

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Fri, 15 Dec 2023 17:33:15 -0500
Message-ID: <039101da2fa6$b283f540$178bdfc0$_at_rsiz.com>



“guidance of another consulting team to reduce Internal IO observed during that period”  

Presumably they know all about it and left copious documentation of the measurements and of why they believed setting an underscore parameter was required.

Reducing the Internal IO observed during that period would only be worth acting on if that Internal IO was degrading the response of some user process or system maintenance task.

mwf  

From: Osman DINC [mailto:dinch.osman_at_gmail.com]
Sent: Friday, December 15, 2023 4:18 PM
To: jlewisoracle_at_gmail.com; Mark W. Farnham
Cc: oracle-l_at_freelists.org
Subject: Re: Unknown Exadata Cell Parameter - "_cell_fc_toresilver_limit_chdrs"  

Thanks Mark and Jonathan,  

The environment has been running with this parameter for the last 18 months. It was set with the guidance of another consulting team to reduce Internal IO observed during that period, when Exadata was on image version 20.1.5.0.0.201209. Last week I updated all the Exadata image versions and this parameter caught my attention.

I do not know how the system will behave without this parameter set. Internal IO rates are now at a low level (1.5%) in AWR reports. That may be related to the storage server image in use or to the activity going on during that period. Today I analyzed that period further. After investigating all the cell alert.log files, I found that a large number of resilvering operations occurred at that time. In addition, all user-defined database objects were moved to another tablespace and a RAID controller card was replaced.

These maintenance operations are most likely the reason why so much Internal IO was observed and why this parameter was set.

I have decided to reset it next week, and I will then watch the cell metrics and AWR reports. Thanks for the clarifications.

Regards,

Osman DİNÇ    

On Fri, 15 Dec 2023 at 17:54, Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:

I don't know what this parameter is supposed to control, but looking at the stats you produced I would have guessed that it was something to limit the RATE at which discs were resilvered rather than having something to do with the time between checks of something.

Do you see a lot of this type of activity going on all the time, or does it only happen in occasional bursts? If it's bursts from time to time (and you haven't seen any indication that someone has decided to reconfigure or rebalance your ASM discs) then perhaps you have some failing or failed hardware that Oracle keeps trying to work around, so I'd check for any reports that might be available about the state of your ASM discs and what your disc groups look like. I don't do anything with Exadata hardware, so I can't offer any detailed suggestions about that.

Another thought about when this happens - can you find some correlation between (e.g.) activity that loads new data partitions and the resilvering - maybe it's a simple side effect of data files growing, and Oracle doing something to grow them on one half of redundant pairs and then catching up on the other half by resilvering.  

Regards

Jonathan Lewis      

On Thu, 14 Dec 2023 at 20:22, Osman DINC <dinch.osman_at_gmail.com> wrote:

Hi all Oracle enthusiasts,  

Do you have any information about the "_cell_fc_toresilver_limit_chdrs" parameter?

It is set in the cellinit.ora of my environment (X7-2). It is an undocumented, non-default setting.

I could not find any information about it, but it is commented as "It is set to reduce Internal IO on exadata cells". We do not have prior knowledge about it.  

Also I have screenshots of AWR reports - Exadata Statistics - Top IO Reasons by MB section.

(Before and after parameter change)  

Before _cell_fc_toresilver_limit_chdrs parameter is set:

https://drive.google.com/file/d/1PEcjUYAuG8SNTq1vfXLl0fyoeoffSLhq/view?usp=drive_link  

After "_cell_fc_toresilver_limit_chdrs"=6000000

https://drive.google.com/file/d/12tDsDvt22oFyEaQtRdMsXcCYduw9992A/view?usp=drive_link  

According to the screenshots, it dramatically reduces the Internal IO done by the storage servers, but there may be a trade-off that I am not aware of.

I want to clear this parameter as it is a non-default configuration. It was set on image version 20.1.5.0.0.201209 and it has been retained for years.  

Exadata image version is 22.1.17 now.

If anybody can share more information about what this parameter does, I will be glad.    

Regards,

Osman DİNÇ.    

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Dec 15 2023 - 23:33:15 CET
