Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
RE: Finding illegal UTF8 sequences
Would the Character Set Scanner on Metalink be of any use? (I've never used it myself.)
According to Metalink note 66320.1, it's included with 8.1.7 and above, or it can be downloaded from OTN.
-----Original Message-----
From: oracle-l-bounce_at_freelists.org
[mailto:oracle-l-bounce_at_freelists.org]On Behalf Of Weaver, Walt
Sent: Thursday, May 27, 2004 2:38 PM
To: oracle-l_at_freelists.org
Subject: Finding illegal UTF8 sequences
Is anyone experienced with finding illegal UTF8 sequences and doing something about them?
We have a UTF8 database containing Japanese data. One of the customers appears to have random malformed data; when the data is displayed it's displayed as random characters rather than Kanji characters.
Using the dump() function I've found sequences where there appears to be, say, a valid trail byte with no associated lead byte. I've also found a valid lead byte for a three-byte character with no associated trail bytes, and so on.
At least, I think that's what I've found.
At this point I'm still in a bit of learning mode here and am still trying to figure out what I'm looking at and what I'm going to do.
This problem is isolated to one customer and may be the result of a data import that was done some time ago.
So, does anyone know of any utilities that can find and print out illegal UTF8 sequences? Or am I going to have to hire someone to do it for me (I'm not smart enough to be able to do that sort of thing)?
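For illustration only, the kind of check being asked about can be sketched in a few lines of Python. This is not an Oracle utility. It assumes the raw column bytes have already been extracted from the database by some other means, and the function name `find_illegal_utf8` is made up for this example. It validates against standard UTF-8; note that Oracle's UTF8 character set stores supplementary characters as surrogate pairs of three bytes each, which a strict check like this one would also report.

```python
def find_illegal_utf8(data: bytes):
    """Return a list of (offset, bad_bytes, reason) for malformed UTF-8 in data.

    Hypothetical helper: relies on Python's strict UTF-8 decoder to locate
    each malformed span, then resumes scanning right after it.
    """
    problems = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the rest of the buffer is well formed
        except UnicodeDecodeError as err:
            # err.start / err.end are relative to the slice that was decoded
            start = pos + err.start
            end = pos + err.end
            problems.append((start, data[start:end], err.reason))
            pos = end
    return problems


# Two valid Kanji, then an orphan trail byte, then a lead byte cut off at the end.
sample = "日本".encode("utf-8") + b"\xa5" + b"ok" + b"\xe6"
for offset, bad, reason in find_illegal_utf8(sample):
    print(f"offset {offset}: {bad.hex()} ({reason})")
```

Re-slicing the buffer after every error keeps the sketch short but is slow for large inputs with many errors; a real scan over a whole table would want something more careful.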
Thanks,
--Walt Weaver
Bozeman, Montana
FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html
----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to: oracle-l-request_at_freelists.org
put 'unsubscribe' in the subject line.