Re: FW: My worst nightmare - ORA-8103

From: Gerwin Hendriksen <gerwin.hendriksen_at_gmail.com>
Date: Thu, 12 Jun 2014 07:10:45 +0200
Message-ID: <CAHo8gJugnsgmPAiF7KWMEpCjrnU=FUTvh_GHpDupCAxpBEr-Rw_at_mail.gmail.com>



Hello Maureen,

I think Marc might be right with index suggestion. What worries me is the fact you had a corrupted undo tablespace. I guess you used parameters like '_corrupted_rollback_segments'
and/or '_offline_rollback_segments' to startup your database before able to create a new undo tablespace. Can you explain how this was done?

Can you also state what kind of errors you encountered in your database at the time you recognized a corrupted undo tablespace? The problem is that most of this problems are going with ORA-600 errors in the 4000 range (transaction layer in the RDBMS code). So the problem you see with the table involved giving ORA-8103 might be just one of the problems in your database. Maybe some others have simply not been touched yet (don't you see any errors in your alert file?).

If the database was openened with one of the underscore parameters I mentioned the only valid thing to do is recreating the database by export / import. This is probably a very big job, but in my opinion the only official Oracle supported way to run the database.

With kind regards,

Gerwin Hendriksen

2014-06-12 6:09 GMT+02:00 Maureen English <maureen.english_at_alaska.edu>:

> Thanks! I updated our ticket with Oracle asking for info on this bug. I
> can't find anything about it on Metalink.
>
> Regarding Mark's post, does this mean that the reason users aren't seeing
> any problems is because
> they are able to access the data because the application uses the indexes
> when querying?
>
> We did have to create a new undo tablespace due to corruption in the undo
> tablespace.
>
> At least one of the indexes got an ORA-8103 when running analyze on it. I
> didn't check the others.
> Now I'm starting to panic again. Oracle says that the data is lost, but
> now it might not be???
>
> Interestingly enough, we were able to fix the other issues that we've
> found by recreating indexes, or stepping through
> a table using a value from a unique index and then find the corrupt or
> duplicate rows and delete them. This table
> doesn't have a unique index...and it has 553M+ rows. It takes 20 minutes
> just to get a rowcount back :-(
>
> I ended up following Oracle's suggestion of copying the rows that didn't
> return the ORA-8102 error to another table,
> exported that table, truncated my application table, dropped the indexes
> on the application table and imported the
> data back in. I'm currently rebuilding the indexes. Tomorrow will be
> when the 2 main Finance users will check things
> out and let me know the status.
>
> I'm so dreading the possibility of someone telling me that all of a sudden
> nothing balances...especially when it all
> seems to be in balance now....
>
> - Maureen
>
>
>
>
> On 6/11/2014 5:59 AM, Gerwin Hendriksen wrote:
>
>> Hello Maureen,
>>
>> The ORA-8103 can occur when the block contains a vector to an undo
>> segment and for some reason the change can't be found
>> in the undo segment. Years ago I logged a bug at Oracle (at that time
>> working for Oracle Support) describing a situation
>> when recovering a database ending in an ORA-8103 or ORA-600 [4146], when
>> selecting a table. Because the issue described
>> here is in a reporting database, this bug although a very old bug could
>> be underneath. The bug is caused by the way redo
>> is applied and a small issue there might give the same situation as
>> described in the bug. The bug was filed 31 march
>> 2001 and has number 1714688, the bug is probably never really fixed
>> because fixing it would mean a major design change
>> in the redo apply. Further more the bug was encountered due to human
>> error in the way the recovery was done. Just ask
>> support that this bug might give an idea why you encounter the ORA-8103.
>>
>> Good luck and kind regards,
>>
>> Gerwin Hendriksen
>>
>>
>> 2014-06-11 14:01 GMT+02:00 Mark W. Farnham <mwf_at_rsiz.com <mailto:
>> mwf_at_rsiz.com>>:
>>
>>
>> Sorry, missed the list.
>>
>> -----Original Message-----
>> From: Mark W. Farnham [mailto:mwf_at_rsiz.com <mailto:mwf_at_rsiz.com>]
>> Sent: Tuesday, June 10, 2014 2:43 PM
>> To: 'maureen.english_at_alaska.edu <mailto:maureen.english_at_alaska.edu>'
>> Subject: RE: My worst nightmare - ORA-8103
>>
>> If a unique index survives, you should be able to retrieve the rowids
>> for
>> the rows not in your recovery target table with a minus.
>> Then, using the rowids and column sets for indexes, you can retrieve
>> at
>> least those column values from the indexes.
>> This at least should provide you with a complete set of keys for the
>> missing
>> rows.
>>
>> I presume you lack the ability to do individual block patching via
>> RMAN,
>> etc.
>>
>> good luck,
>>
>> mwf
>>
>> -----Original Message-----
>> From: oracle-l-bounce_at_freelists.org <mailto:oracle-l-bounce_at_
>> freelists.org> [mailto:oracle-l-bounce_at_freelists.org
>> <mailto:oracle-l-bounce_at_freelists.org>]
>> On Behalf Of Maureen English
>> Sent: Tuesday, June 10, 2014 2:29 PM
>> To: oracle-l_at_freelists.org <mailto:oracle-l_at_freelists.org>
>> Subject: My worst nightmare - ORA-8103
>>
>> Hi,
>>
>> Original post to BOracle list...apologies to those who are seeing this
>> again.
>>
>> On 5/31, the refresh of a materialized view in our reporting instance
>> failed
>> with an ORA-8103 error, as did a gather stats job for the same table
>> in the
>> production database. We had an application upgrade done on 6/1 and
>> copied
>> our production database to a preprod version on 6/2, so the errors
>> weren't
>> caught immediately.
>>
>> Users complained that the current data wasn't available in the
>> reporting
>> instance but my attempts to refresh and recreate all failed with the
>> same
>> ORA-8103 error.
>>
>> I've been working with Oracle Support since early last week and keep
>> hitting
>> brick walls. At the moment, we've managed to copy 'uncorrupted' rows
>> out of
>> the table in our preproduction database and are 480 rows short in a
>> table
>> with 550M rows in it. Oracle is currently working to identify the
>> rowids of
>> the corrupt blocks based on the output in a trace file generated by a
>> failing query.
>>
>> I'm looking at Document 336133.1 while I wait for more info from
>> Oracle.
>>
>> Anyone have any comments/suggestions/other info that might help
>> identify and
>> fix the problem? We really can't afford to lose that much data.
>> We're
>> working on recovering our database to a different location to try to
>> get
>> back any data that we lose, but since I don't have any idea what
>> caused the
>> corruption, I'm lost.
>>
>> - Maureen
>> --
>> http://www.freelists.org/webpage/oracle-l
>>
>>
>> --
>> http://www.freelists.org/webpage/oracle-l
>>
>>
>>
>>
> --
> Maureen English
> Lead Database Administrator
> University of Alaska
> Fairbanks, AK
> (907) 450-8329
>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Jun 12 2014 - 07:10:45 CEST

Original text of this message