Re: Characterset Question

From: Martin Berger <martin.a.berger_at_gmail.com>
Date: Sat, 23 Mar 2024 10:12:45 +0100
Message-ID: <CALH8A90-rdYo+SjvN-+W6cv2RCQ8AKyzc4dP0C-Vr03SpSDLgA_at_mail.gmail.com>





The tool to convert databases to Unicode characterset is called *Database Migration Assistant for Unicode* (DMU
<
https://www.oracle.com/de/database/technologies/appdev/oracle-database-migration-assistant.html> ).
Previous there was CSSCAN and CSALTER, but they are long gone: *CSSCAN and CSALTER To Be Desupported after DB 11.2 (Doc ID 1418321.1 <https://support.oracle.com/epmos/faces/DocContentDisplay?id=1418321.1>)*

If you face any issues with characters wrongly inserted (e.g. you have a these hex values in the dump() result of a varchar2 column 0xC2 0xB0 - which is ° in UTF8 encoding of Unicode Character “°” (U+00B0 <https://www.compart.com/en/unicode/U+00B0>) ) you can try to identify & fix this with the functions SYS_OP_CSCONVTEST() <https://berxblog.blogspot.com/2021/12/under-dmus-cover.html> and SYS_OP_CSCONV()
<https://docs.oracle.com/en/database/oracle/dmu/19.1/dumag/ch5_advanced_topics.html#GUID-3E852E81-30FD-4413-83CC-D378DE01DAFA__GUID-B3D03649-C5FE-4873-8EF6-73D0C48242B1> .

hth,
 Martin

Am Fr., 22. März 2024 um 23:58 Uhr schrieb Mark W. Farnham <mwf_at_rsiz.com>:

> If memory serves, the utility was csscan, and I **thought** it was
> upgraded at some point to flag bytes not (supposed to be) in the source
> database characterset.
>
>
>
> That’s probably moot for this case, since Scott identified that inserts
> from informatica into the new server are the apparent problem, and as Clay
> pointed out, checking the NLS_LANG (I think) on the client running
> informatica should be checked.
>
>
>
> If the client and server nls language values versions are identical, I *
> *think** no translation is done, which if Scott is correct they are both
> AL32UTF8, then the translation is the no-op, which should both work
> correctly and be infinitesimally faster.
>
>
>
> Scott, at your convenience please let us know how you make out.
>
>
>
> *From:* dimensional.dba_at_comcast.net [mailto:dimensional.dba_at_comcast.net]
> *Sent:* Friday, March 22, 2024 6:03 PM
> *To:* joncrisler_at_gmail.com; Clay.Jackson_at_quest.com
> *Cc:* srcdco_at_rit.edu; 'Mark W. Farnham'; oracle-l_at_freelists.org
> *Subject:* RE: Characterset Question
>
>
>
> The tool works great all charactersets except for databases that are set
> to us7ascii and have 8bit bytes too.
>
>
>
> *From:* oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> *On
> Behalf Of *Jon Crisler
> *Sent:* Friday, March 22, 2024 2:22 PM
> *To:* Clay.Jackson_at_quest.com
> *Cc:* srcdco_at_rit.edu; Mark W. Farnham <mwf_at_rsiz.com>;
> oracle-l_at_freelists.org
> *Subject:* Re: Characterset Question
>
>
>
> There is a utility that is part of the Oracle characterset conversion
> program, that can scan the existing DB and the proposed characterset. It
> will report back on any problem tables / columns and if the result is
> lossless, losey and/or info about manual cleanup. The exact name escapes
> me, but check out Oracle DocIDs relating to converting from one
> characterset to another and it should be mentioned. I will try to find it
> as well.
>
>
>
> On Fri, Mar 22, 2024 at 4:38 PM Clay Jackson <dmarc-noreply_at_freelists.org>
> wrote:
>
> Check the Oracle client NLS settings that the Informatica server is
> using. 😊
>
>
>
>
>
> *Clay Jackson*
>
> Database Solutions Sales Engineer
>
> <https://www.quest.com/solutions/database-performance-monitoring/>
>
> clay.jackson_at_quest.com
>
> *office* 949-754-1203 *mobile* 425-802-9603
>
>
>
> *From:* oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> *On
> Behalf Of *Scott Canaan
> *Sent:* Friday, March 22, 2024 11:54 AM
> *To:* Mark W. Farnham <mwf_at_rsiz.com>; oracle-l_at_freelists.org
> *Subject:* RE: Characterset Question
>
>
>
> *CAUTION:* This email originated from outside of the organization. Do not
> follow guidance, click links, or open attachments unless you recognize the
> sender and know the content is safe.
>
>
>
> Interesting that the source is also AL32UTF8, but the data is transferred
> via Informatica. My guess is that’s where the issue is.
>
>
>
> *Scott Canaan ‘88*
>
> *Sr Database Administrator *Information & Technology Services
> Finance & Administration
>
>
> *Rochester Institute of Technology*o: (585) 475-7886 | f: (585) 475-7520
>
> *srcdco_at_rit.edu <srcdco_at_rit.edu>* | c: (585) 339-8659
>
> *CONFIDENTIALITY NOTE*: The information transmitted, including
> attachments, is intended only for the person(s) or entity to which it is
> addressed and may contain confidential and/or privileged material. Any
> review, retransmission, dissemination or other use of, or taking of any
> action in reliance upon this information by persons or entities other than
> the intended recipient is prohibited. If you received this in error, please
> contact the sender and destroy any copies of this information.
>
>
>
> *From:* Mark W. Farnham <mwf_at_rsiz.com>
> *Sent:* Friday, March 22, 2024 2:50 PM
> *To:* Scott Canaan <srcdco_at_rit.edu>; oracle-l_at_freelists.org
> *Subject:* RE: Characterset Question
>
>
>
> That should be a lossless conversion with the possible exception of
> clob/blob.
>
>
>
> However, figuring out the appropriate NLS_PARAMETER values for the client
> and the server have to be set right.
>
>
>
> How are you bringing data from Peoplesoft over to your new database?
>
>
>
> IF it is application to database, setting the client and server parameters
> should take care of it.
>
>
>
> IF it is database (old) to database (new), then there is possibly some
> data in the (old) database that is not actually US7ASCII and some cleanup
> may be required.
>
>
>
>
> https://docs.oracle.com/en/database/oracle/oracle-database/19/nlspg/supporting-multilingual-databases-with-unicode.html
>
>
>
> should give you a leg up, and there are more similar documents on Oracle.
>
>
>
> I’m a bit rusty on this, my friend, but I think that is about correct.
>
>
>
> mwf
>
>
>
>
>
>
>
> *From:* oracle-l-bounce_at_freelists.org [
> mailto:oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org>] *On
> Behalf Of *Scott Canaan ("srcdco")
> *Sent:* Friday, March 22, 2024 2:23 PM
> *To:* oracle-l_at_freelists.org
> *Subject:* Characterset Question
>
>
>
> We are in the process of migrating all of our Oracle databases from Red
> Hat 7 to Red Hat 8. In creating the new databases in Red Hat 8, we’ve been
> going with the default characterset of AL32UTF8. Many of the old databases
> are US7ASCII. We are having trouble with applications trying to insert
> into text fields in the new database. In one case, bringing data from
> Peoplesoft that has a degree symbol fails because that comes over as 2
> bytes instead of one. I thought that AL32UTF8 should handle those
> characters. What needs to be done to accommodate this?
>
>
>
> *Scott Canaan ‘88*
>
> *Sr Database Administrator *Information & Technology Services
> Finance & Administration
>
>
> *Rochester Institute of Technology*o: (585) 475-7886 | f: (585) 475-7520
>
> *srcdco_at_rit.edu <srcdco_at_rit.edu>* | c: (585) 339-8659
>
> *CONFIDENTIALITY NOTE*: The information transmitted, including
> attachments, is intended only for the person(s) or entity to which it is
> addressed and may contain confidential and/or privileged material. Any
> review, retransmission, dissemination or other use of, or taking of any
> action in reliance upon this information by persons or entities other than
> the intended recipient is prohibited. If you received this in error, please
> contact the sender and destroy any copies of this information.
>
>
>
>

-- 
Martin Berger       Oracle ♠ Alumnus
martin.a.berger_at_gmail.com _at_martinberx <https://twitter.com/martinberx>
^∆x      http://berxblog.blogspot.com



-- http://www.freelists.org/webpage/oracle-l

image001.jpg
(image/jpeg attachment: image001.jpg)

Received on Sat Mar 23 2024 - 10:12:45 CET

Original text of this message