Re: Characterset Question

From: Jon Crisler <joncrisler_at_gmail.com>
Date: Fri, 22 Mar 2024 23:00:05 -0400
Message-ID: <CAB44qRSMWE2iKqgbefRJd9u-O6Au6LX0-NYZ7W5hk_dzX0Fd=Q_at_mail.gmail.com>





When Informatica was first testing the cloud waters, my then-company was hosting Informatica's cloud servers. We had a lot of problems with source symptoms throwing unknown errors when the target was Oracle, and this included both Oracle as a source, and other non-Oracle products. One of the problems was invalid data in the source system, but there was 3-4 major problems overall, and invalid source data was maybe 1/3 of the major incidents. IIRC and this was a looong time ago, Informatica fixed it with a higher checking level that was a configurable option. Along with all the NLS_LANG checks, run appropriate checks on the source server data. Once your initial investigation is completed, open a ticket with Informatica. I recall my team co-authoring some tech notes with their devs on troubleshooting for Oracle, but again this was so long ago that I doubt they still exist. You also have the option of not changing the characterset , then do manual cleanup later.

On Fri, Mar 22, 2024 at 6:56 PM Mark W. Farnham <mwf_at_rsiz.com> wrote:

> If memory serves, the utility was csscan, and I **thought** it was
> upgraded at some point to flag bytes not (supposed to be) in the source
> database characterset.
>
>
>
> That’s probably moot for this case, since Scott identified that inserts
> from informatica into the new server are the apparent problem, and as Clay
> pointed out, checking the NLS_LANG (I think) on the client running
> informatica should be checked.
>
>
>
> If the client and server nls language values versions are identical, I *
> *think** no translation is done, which if Scott is correct they are both
> AL32UTF8, then the translation is the no-op, which should both work
> correctly and be infinitesimally faster.
>
>
>
> Scott, at your convenience please let us know how you make out.
>
>
>
> *From:* dimensional.dba_at_comcast.net [mailto:dimensional.dba_at_comcast.net]
> *Sent:* Friday, March 22, 2024 6:03 PM
> *To:* joncrisler_at_gmail.com; Clay.Jackson_at_quest.com
> *Cc:* srcdco_at_rit.edu; 'Mark W. Farnham'; oracle-l_at_freelists.org
> *Subject:* RE: Characterset Question
>
>
>
> The tool works great all charactersets except for databases that are set
> to us7ascii and have 8bit bytes too.
>
>
>
> *From:* oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> *On
> Behalf Of *Jon Crisler
> *Sent:* Friday, March 22, 2024 2:22 PM
> *To:* Clay.Jackson_at_quest.com
> *Cc:* srcdco_at_rit.edu; Mark W. Farnham <mwf_at_rsiz.com>;
> oracle-l_at_freelists.org
> *Subject:* Re: Characterset Question
>
>
>
> There is a utility that is part of the Oracle characterset conversion
> program, that can scan the existing DB and the proposed characterset. It
> will report back on any problem tables / columns and if the result is
> lossless, losey and/or info about manual cleanup. The exact name escapes
> me, but check out Oracle DocIDs relating to converting from one
> characterset to another and it should be mentioned. I will try to find it
> as well.
>
>
>
> On Fri, Mar 22, 2024 at 4:38 PM Clay Jackson <dmarc-noreply_at_freelists.org>
> wrote:
>
> Check the Oracle client NLS settings that the Informatica server is
> using. 😊
>
>
>
>
>
> *Clay Jackson*
>
> Database Solutions Sales Engineer
>
> <https://www.quest.com/solutions/database-performance-monitoring/>
>
> clay.jackson_at_quest.com
>
> *office* 949-754-1203 *mobile* 425-802-9603
>
>
>
> *From:* oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> *On
> Behalf Of *Scott Canaan
> *Sent:* Friday, March 22, 2024 11:54 AM
> *To:* Mark W. Farnham <mwf_at_rsiz.com>; oracle-l_at_freelists.org
> *Subject:* RE: Characterset Question
>
>
>
> *CAUTION:* This email originated from outside of the organization. Do not
> follow guidance, click links, or open attachments unless you recognize the
> sender and know the content is safe.
>
>
>
> Interesting that the source is also AL32UTF8, but the data is transferred
> via Informatica. My guess is that’s where the issue is.
>
>
>
> *Scott Canaan ‘88*
>
> *Sr Database Administrator *Information & Technology Services
> Finance & Administration
>
>
> *Rochester Institute of Technology*o: (585) 475-7886 | f: (585) 475-7520
>
> *srcdco_at_rit.edu <srcdco_at_rit.edu>* | c: (585) 339-8659
>
> *CONFIDENTIALITY NOTE*: The information transmitted, including
> attachments, is intended only for the person(s) or entity to which it is
> addressed and may contain confidential and/or privileged material. Any
> review, retransmission, dissemination or other use of, or taking of any
> action in reliance upon this information by persons or entities other than
> the intended recipient is prohibited. If you received this in error, please
> contact the sender and destroy any copies of this information.
>
>
>
> *From:* Mark W. Farnham <mwf_at_rsiz.com>
> *Sent:* Friday, March 22, 2024 2:50 PM
> *To:* Scott Canaan <srcdco_at_rit.edu>; oracle-l_at_freelists.org
> *Subject:* RE: Characterset Question
>
>
>
> That should be a lossless conversion with the possible exception of
> clob/blob.
>
>
>
> However, figuring out the appropriate NLS_PARAMETER values for the client
> and the server have to be set right.
>
>
>
> How are you bringing data from Peoplesoft over to your new database?
>
>
>
> IF it is application to database, setting the client and server parameters
> should take care of it.
>
>
>
> IF it is database (old) to database (new), then there is possibly some
> data in the (old) database that is not actually US7ASCII and some cleanup
> may be required.
>
>
>
>
> https://docs.oracle.com/en/database/oracle/oracle-database/19/nlspg/supporting-multilingual-databases-with-unicode.html
>
>
>
> should give you a leg up, and there are more similar documents on Oracle.
>
>
>
> I’m a bit rusty on this, my friend, but I think that is about correct.
>
>
>
> mwf
>
>
>
>
>
>
>
> *From:* oracle-l-bounce_at_freelists.org [
> mailto:oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org>] *On
> Behalf Of *Scott Canaan ("srcdco")
> *Sent:* Friday, March 22, 2024 2:23 PM
> *To:* oracle-l_at_freelists.org
> *Subject:* Characterset Question
>
>
>
> We are in the process of migrating all of our Oracle databases from Red
> Hat 7 to Red Hat 8. In creating the new databases in Red Hat 8, we’ve been
> going with the default characterset of AL32UTF8. Many of the old databases
> are US7ASCII. We are having trouble with applications trying to insert
> into text fields in the new database. In one case, bringing data from
> Peoplesoft that has a degree symbol fails because that comes over as 2
> bytes instead of one. I thought that AL32UTF8 should handle those
> characters. What needs to be done to accommodate this?
>
>
>
> *Scott Canaan ‘88*
>
> *Sr Database Administrator *Information & Technology Services
> Finance & Administration
>
>
> *Rochester Institute of Technology*o: (585) 475-7886 | f: (585) 475-7520
>
> *srcdco_at_rit.edu <srcdco_at_rit.edu>* | c: (585) 339-8659
>
> *CONFIDENTIALITY NOTE*: The information transmitted, including
> attachments, is intended only for the person(s) or entity to which it is
> addressed and may contain confidential and/or privileged material. Any
> review, retransmission, dissemination or other use of, or taking of any
> action in reliance upon this information by persons or entities other than
> the intended recipient is prohibited. If you received this in error, please
> contact the sender and destroy any copies of this information.
>
>
>
>



--
http://www.freelists.org/webpage/oracle-l


image001.jpg
(image/jpeg attachment: image001.jpg)

Received on Sat Mar 23 2024 - 04:00:05 CET

Original text of this message