Frank van Bortel wrote:
> SG wrote:
>
>> Hi all.
>>
>> I am new to Oracle so would appreciate any insight as to what are main
>> reasons based on your experience that cause corrupt redo log files and
>> data files? We had a corrupt sysaux.dbf file and constant corrupt redo
>> logs that would stop our application since it was in archive mode and
>> we'd get archiver errors. We tried creating new log groups and that
>> didn't help. We had to constantly clear unarchived log groups, etc. to
>> get it working. We rebuilt the database and used a dmp from the
>> "suspect" dbase to import our custom tables in our newly created db
>> and tablespace. All had been fine for a month, but now it's happening
>> again. I looked in the alert logs and see that now we have a corrupt
>> system.dbf file. Has anyone had this type of experience? We are
>> running Oracle 10g on Redhat ES 3.0. The kernel version on the system
>> is 2.4.21.4. An identical system with no problems, same hardware, is
>> running kernel version 2.4.21-15.0.3. Seems to point to a hardware
>> issue maybe? Any ideas would be greatly appreciated. TIA.
>>
>> SG
>>
>>
>>
> And the filesystem(s) you use?
> Ext2, ext3, Reiserfs, hardware RAID, software RAID?
> Hardware: SCSI, IDE (tinkered with params?)
>
> Why are you running unpatched kernels anyway?
> See: http://rhn.redhat.com/errata/RHSA-2005-043.html
> Linux csdb01.cs.nl 2.4.21-27.0.2.EL
Check the values of the DB_BLOCK_CHECKING and DB_BLOCK_CHECKSUM
initialization parameters, and adjust them for testing if desired.
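
For anyone unfamiliar with the idea behind block checksums, here is a minimal Python sketch of the general technique: store a checksum per fixed-size block when it is written, then recompute and compare when it is read back. This is only an illustration. The CRC32 function, the 8192-byte block size, and the helper names are my assumptions and are not Oracle's actual algorithm or internals.

```python
import zlib
from typing import List

# Assumed block size, for illustration only; not read from any database.
BLOCK_SIZE = 8192


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return a CRC32 value for each fixed-size block of data."""
    return [
        zlib.crc32(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def corrupted_blocks(stored: List[int], data: bytes,
                     block_size: int = BLOCK_SIZE) -> List[int]:
    """Return indices of blocks whose current checksum differs from the stored one."""
    current = block_checksums(data, block_size)
    return [i for i, (old, new) in enumerate(zip(stored, current)) if old != new]


if __name__ == "__main__":
    original = bytes(3 * BLOCK_SIZE)          # three blocks of zeros
    stored = block_checksums(original)        # checksums taken at "write" time

    damaged = bytearray(original)
    damaged[BLOCK_SIZE + 5] ^= 0x01           # flip one bit in the second block

    print(corrupted_blocks(stored, bytes(damaged)))
```

A mismatch of this kind only says that the block changed somewhere between write and read. It does not say whether the disk, the controller, or memory was responsible.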
In my experience, what you describe is similar to problems I've seen
with faulty disk controllers or even bad memory modules. What makes it
so difficult to troubleshoot is the intermittent and unpredictable
nature -- the system runs fine for hours, days, even weeks, and then
errors start cropping up.
-Mark Bole
Received on Tue Mar 15 2005 - 17:08:31 CST