RE: Characterset Question

From: <dimensional.dba_at_comcast.net>
Date: Fri, 22 Mar 2024 13:02:07 -0700
Message-ID: <021001da7c93$d2894860$779bd920$_at_comcast.net>



What is the characterset of the PeopleSoft DB? Normally it wouldn't be US7ASCII. I have seen problems with US7ASCII databases because Oracle doesn't check the value of the 8th bit on insert, so bad characters can end up stored. An app using an alternate characterset can retrieve the field, and what it shows is what was input, which was really an 8-bit character byte from that alternate characterset. Some utilities, such as GoldenGate, get an error when translating it.
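As a rough illustration of the 8th-bit problem, here is a minimal Python sketch that scans raw bytes that are supposed to be US7ASCII and reports any byte with the high bit set. The function name `find_high_bit_bytes` and the sample value are made up for illustration; this is not an Oracle utility.

```python
def find_high_bit_bytes(raw: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) for every byte with the 8th bit set."""
    return [(i, b) for i, b in enumerate(raw) if b & 0x80]


# Hypothetical field value: "72", a single-byte degree sign (0xB0), then "F"
sample = b"72\xb0F"
for offset, value in find_high_bit_bytes(sample):
    print(f"offset {offset}: 0x{value:02X} is not valid 7-bit ASCII")
```

Any hit means the field holds something other than 7-bit ASCII, whatever the database characterset claims.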

If it is arriving as a two-byte character that AL32UTF8 is not interpreting, then the source is more than likely a UTF8 or UTF-16 characterset.
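To make the one-byte versus two-byte difference concrete, this small Python sketch prints how the degree sign (U+00B0) is encoded in a few encodings. The list of encodings is only an assumed shortlist; AL32UTF8 is Oracle's name for UTF-8.

```python
DEGREE = "\u00b0"  # DEGREE SIGN

# Assumed shortlist of encodings a source system might be using
for encoding in ("latin-1", "cp1252", "utf-8", "utf-16-le"):
    encoded = DEGREE.encode(encoding)
    print(f"{encoding:10} {len(encoded)} byte(s): {encoded.hex()}")
```

The single-byte code pages store it as one byte (b0), while UTF-8 and UTF-16 each need two.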

Normally this means you will have to do some data cleaning to fix the mixed-byte charactersets in your data.
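One possible shape for that cleaning step, sketched in Python under a loud assumption: any value that is not already valid UTF-8 was written in a single-byte Windows code page (cp1252 here). The function name `repair_to_utf8` and the fallback choice are hypothetical; the right fallback depends on what the bytes actually turn out to be.

```python
def repair_to_utf8(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Return raw re-encoded as UTF-8, guessing the source encoding.

    Assumption: bytes that are not valid UTF-8 were written in `fallback`.
    """
    try:
        text = raw.decode("utf-8")   # already valid UTF-8 (includes plain ASCII)
    except UnicodeDecodeError:
        text = raw.decode(fallback)  # treat as a single-byte code page
    return text.encode("utf-8")


print(repair_to_utf8(b"72\xb0F"))      # single-byte degree sign gets converted
print(repair_to_utf8(b"72\xc2\xb0F"))  # already UTF-8, returned unchanged
```

A few byte values are undefined in cp1252, so the fallback decode can still raise an error; those rows would need a manual look.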

These problems normally occur in open format free form text fields.  

We would need to see the bytes in hex or octal to help you determine which characterset they belong to, and from that what the data should be corrected to.
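As a sketch of what that inspection could look like once the raw bytes are in hand, the Python below prints them in hex and octal and shows how each of a few candidate encodings would interpret them. The `CANDIDATES` list, the function name `describe_bytes`, and the sample value are all assumptions for illustration.

```python
from typing import Optional

CANDIDATES = ("ascii", "utf-8", "cp1252", "utf-16-le")  # assumed shortlist


def describe_bytes(raw: bytes) -> dict[str, Optional[str]]:
    """Map each candidate encoding to the decoded text, or None if invalid."""
    results: dict[str, Optional[str]] = {}
    for encoding in CANDIDATES:
        try:
            results[encoding] = raw.decode(encoding)
        except UnicodeDecodeError:
            results[encoding] = None
    return results


raw = b"\xc2\xb0"  # hypothetical two-byte value pulled from a problem field
print("hex:  ", raw.hex(" "))
print("octal:", " ".join(f"{b:03o}" for b in raw))
for encoding, text in describe_bytes(raw).items():
    print(f"{encoding:10} -> {text!r}")
```

More than one encoding may decode without error, so the decoded text still has to be judged against what the field was meant to contain.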

Most of the bad freeform input I have seen comes from the Windows side, where the user is entering emoji, Wingdings, or special European-language characters (8-bit characters in the Windows code pages), or characters from various Asian charactersets that are 2-4 bytes each depending on the original characterset.
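For a feel of the byte widths involved once such input lands in a UTF-8 database, this short Python sketch prints the UTF-8 length of a few sample characters. The samples are arbitrary picks for illustration.

```python
# Arbitrary sample characters, one per category mentioned above
samples = {
    "degree sign": "\u00b0",
    "e with acute": "\u00e9",
    "CJK ideograph": "\u4e2d",
    "emoji": "\U0001F600",
}
for label, ch in samples.items():
    width = len(ch.encode("utf-8"))
    print(f"{label:14} U+{ord(ch):04X} -> {width} byte(s) in UTF-8")
```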

From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> On Behalf Of Scott Canaan ("srcdco")
Sent: Friday, March 22, 2024 11:23 AM
To: oracle-l_at_freelists.org
Subject: Characterset Question  

We are in the process of migrating all of our Oracle databases from Red Hat 7 to Red Hat 8. In creating the new databases in Red Hat 8, we've been going with the default characterset of AL32UTF8. Many of the old databases are US7ASCII. We are having trouble with applications trying to insert into text fields in the new database. In one case, bringing data from Peoplesoft that has a degree symbol fails because that comes over as 2 bytes instead of one. I thought that AL32UTF8 should handle those characters. What needs to be done to accommodate this?  

Scott Canaan '88
Sr Database Administrator
Information & Technology Services
Finance & Administration

Rochester Institute of Technology
o: (585) 475-7886 | f: (585) 475-7520

srcdco_at_rit.edu | c: (585) 339-8659


--
http://www.freelists.org/webpage/oracle-l
Received on Fri Mar 22 2024 - 21:02:07 CET
