Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Re: charset for kangi (Usenet, c.d.o.server)
On Jan 30, 2:37 pm, Frank van Bortel <frank.van.bor..._at_gmail.com>
wrote:
> I would not go for different fields at all when designing
> such an application, but rather have one characterset.
> I would always opt for the AL16UTF16, because:
> - it is closest to the Windows code set (like it or not,
> most clients use that on the desktop to enter characters)
> - it is fixed double byte.
>
> There may be other considerations, which would make the
> first option a viable choice.
> Simple UTF8, as you call it, is not 10G - AL32UTF8 is.
> And it is a valid choice.
>
Frank,
Because AL16UTF16 is a double byte character set - it cannot be used for the database character set. It can only be used for a national character set. I'm not sure what you mean by "fixed" either... AL16UTF16 expands two bytes at a time to handle multi-byte UTF characters: two bytes for most characters, four for supplementary characters. AL32UTF8 has a single byte base and expands one byte at a time, up to four bytes per character.
Why would you have the same data in two different table columns? All fields/columns that would contain Kanji data would use a national data type (nchar, nvarchar2, nclob). You can store US7ASCII characters in the national character set as well as Kanji.
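To make the width question concrete, here is a minimal sketch in Python rather than Oracle. It assumes Python's "utf-16-be" and "utf-8" codecs are reasonable stand-ins for AL16UTF16 and AL32UTF8 (Oracle's UTF-8 character set); the sample characters and the byte_widths helper are my own illustration, not anything from the database.

```python
# Assumption: Python codecs stand in for the Oracle character sets,
# "utf-16-be" for AL16UTF16 and "utf-8" for AL32UTF8.
samples = {
    "ASCII letter": "A",
    "Kanji": "\u6f22",                          # the character 漢, inside the BMP
    "supplementary character": "\U00020000",   # CJK Extension B, outside the BMP
}

def byte_widths(ch):
    """Return (utf16_bytes, utf8_bytes) needed to encode one character."""
    return len(ch.encode("utf-16-be")), len(ch.encode("utf-8"))

for label, ch in samples.items():
    utf16_len, utf8_len = byte_widths(ch)
    print(f"{label}: UTF-16 = {utf16_len} bytes, UTF-8 = {utf8_len} bytes")
```

Under those assumptions the ASCII letter takes 2 bytes in UTF-16 and 1 in UTF-8, the Kanji takes 2 and 3, and the supplementary character takes 4 in both, so neither encoding is fixed width across all of Unicode.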
As to whether different data types cause problems in applications... which single data type do you use for all your fields/columns now - char, varchar2, number, clob, blob? I find no more problems with using nvarchar2 than with using char and varchar2. Yes, the developer needs to be aware of the characteristics of the different data types, particularly when assigning character data to declared variables... but then the developer should always be aware of the source and destination data types anyway. Just because Oracle can usually implicitly convert a string to a number and back to a string without problems doesn't mean the developer has not just introduced a "bug" into the code that's going to show up as soon as numeric strings with leading zeros are used.
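The leading-zero problem is easy to reproduce outside the database. A rough sketch in Python, where the explicit int() and str() calls are an assumption standing in for the implicit string to number to string conversions, and round_trip and the sample values are hypothetical:

```python
def round_trip(numeric_string):
    """Convert string -> number -> string, as a chain of implicit conversions would."""
    return str(int(numeric_string))

plain = "12345"
padded = "00501"   # hypothetical value, e.g. a postal code kept as text

print(round_trip(plain))    # survives the round trip unchanged
print(round_trip(padded))   # comes back as "501", the leading zeros are gone
```

Nothing raises an error in either case, which is why this kind of bug tends to go unnoticed until padded values show up in the data.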
WE8MSWIN1252 is the most supported Windows character set, but XP supports multiple character sets that can be changed on the fly. That's why it's so neat for a dumb single language person like me to see the XP character set changed from American English to Canadian French to some Indian set and watch the desktop change from something I can read to something I sorta recognize (that year of high school French was a long time ago), to something only my colleague can read.
BTW - having both the database character set and the national character set in the UTF space reduces the opportunities for "losing" some bytes when accidentally crossing data types.
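A small sketch of what "losing" bytes looks like, again in Python and under assumptions: "cp1252" stands in for WE8MSWIN1252, "utf-8" and "utf-16-be" for the two Unicode character sets, and Python's "?" replacement stands in for whatever substitution character the database would use. The helper names are hypothetical.

```python
def via_single_byte(text, codec="cp1252"):
    """Push text through a single-byte character set, replacing what does not fit."""
    return text.encode(codec, errors="replace").decode(codec)

def via_unicode_pair(text):
    """Move text from a UTF-8 column to a UTF-16 column and read it back."""
    as_utf8 = text.encode("utf-8")
    as_utf16 = as_utf8.decode("utf-8").encode("utf-16-be")
    return as_utf16.decode("utf-16-be")

original = "\u6f22\u5b57 ABC"   # two Kanji characters followed by ASCII

print(via_single_byte(original))               # Kanji replaced, ASCII intact
print(via_unicode_pair(original) == original)  # True: nothing lost
```

With a single-byte set in the path the Kanji come back as "?? ABC"; with both sides in the UTF space the round trip is lossless.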
Regards,
Margaret
Received on Tue Jan 30 2007 - 15:34:58 CST