Re: Wrong data in 8i mview refreshed from 10g db with different characterset

From: Martin Berger <martin.a.berger_at_gmail.com>
Date: Wed, 17 Sep 2008 06:46:00 +0200
Message-Id: <D8B9E5EA-E834-49D8-BBD3-493BD8306210@gmail.com>


Maureen,

This is the correct behaviour for the database. (Well, in fact it was not 100% correct during the time, you used US7ASCII on both DBs) Let me explain, what's going on:
First of all, there is an important component involved you did not mention yet: your client (sqlplus or whatever you use) and its NLS_* settings. (especially the character set!) 1) A database converts characters only if the client and the server does NOT have the same character set and the codepoint in both caractersets does not match.
2) US7ASCII is only defined for the least 7 bit.

This brings you to some interresting situations: A) If all components (client, master, replication) hast the same character set configured, you can insert any character you can enter in the client, it will not be interpreted or converted, not even checked and stored in the data block. At a select also nothing happens, it just got read from the block and sent to the client. Even if the character does not match into the character set (e.g. has 8 bits instead of 'allowed' 7) it's handled this way. B) If the client and the master has a different character set, either of them (prefered the client, but if it cannot, the DB) converts the chartacters to the destinations code point. the same vice versa. Oracle seems to use the 'nearest' good character, if a correct conversion is not possible. (I have not checked this fully)

In your situation, the client and master has either the same character set, or there is a conversion. So the characters get in the master corect.
As the Replication is nothing more than a client-server communication, there is a character conversion. And as US7ASCII can not gather all differeent characters, i has to search for the 'nearest'.

So my summary on this is (with knowing that I had to guess a lot and could not proove my theory on your test-system) your previous setup (all US7ASCII) was well working but wrong, and you will have to switch your replication to a characterset which can hold all characters from WE8MSWIN1252. I hope, this helps.

best regards,
  Martin

--
Martin Berger             
martin.a.berger@gmail.com                                          http://berx.at/

Am 17.09.2008 um 02:44 schrieb Maureen English:


> We have an 8i database (US7ASCII characterset) with materialized views
> being refreshed from a 10g database (was US7ASCII, just changed to
> WE8MSWIN1252).
>
> Prior to converting the 10g database to WE8MSWIN1252, the data was
> displayed
> the same in both databases.
>
> For example:
>
> Langue et littérature français
>
> resulted from a select statement in both databases when they were
> both US7ASCII.
> Now, the result is the same in the WE8MSWIN1252 database, but after
> I refreshed
> the materialized view, the result from the 8i database, which is
> still US7ASCII,
> is different:
>
> Langue et litterature francais
>
>
>
> The codepoints should have been the same in both databases, but for
> some reason,
> the refresh changed things.
>
>
>
> The table in the 10g master (WE8MSWIN1252) database shows:
>
> TEST> select dump(STVCIPC_DESC) from stvcipc where
> stvcipc_code='550100';
>
> DUMP(STVCIPC_DESC)
> ------------------------------------------------------------------------------------------------------------------------------------
> Typ=1 Len=30:
> 76,97,110,103,117,101,32,101,116,32,108,105,116,116,233,114,97,116,117,114,101,32,102,114,97,110,231,97,105,115
>
>
> but the table in the 8i (US7ASCII) database shows different
> codepoints:
>
>
> RPTT> select dump(STVCIPC_DESC) from stvcipc where
> stvcipc_code='550100';
>
> DUMP(STVCIPC_DESC)
> --------------------------------------------------------------------------------
> Typ=1 Len=30:
> 76,97,110,103,117,101,32,101,116,32,108,105,116,116,101,114,97,116,117,114,101,32,102,114,97,110,99,97,105,115
>
> ^ ^
>
>
> If I select the codepoints from the 8i database, the characters show
> up correctly....
>
> RPTT> select chr(101),chr(99),chr(233),chr(231) from dual;
>
> C C C C
> - - - -
> e c é ç
>
>
> Any ideas why the refresh of the materialized view wouldn't still
> contain the correct codepoints?
>
> - Maureen
>
>
>
>
> --
> http://www.freelists.org/webpage/oracle-l
>
>

-- http://www.freelists.org/webpage/oracle-l

  • application/pkcs7-signature attachment: smime.p7s
Received on Tue Sep 16 2008 - 23:46:00 CDT

Original text of this message