Character Set Conversion [message #168826] |
Sun, 23 April 2006 08:43 |
woods
Messages: 1 Registered: April 2006 Location: Australia
|
Junior Member |
|
|
Hi,
I am a developer in a software company and am overseeing the conversion of all of our clients' databases from character set WE8ISO8850P1 to UTF8 (or more specifically AL32UTF8). Most clients are still on 9i but a couple are on 10g. I have managed to convert one of our development databases without record loss but have a couple of questions as to whether I have taken the best path.
I cleaned up the data, based on csscan results, then performed a full export. When importing to a new instance of the database (created with CHARACTERSET = AL32UTF8) Oracle spat the dummy with the import error IMP-0019 based on ORA-00401 meaning the values inserted were too large for the column.
The records rejected had data with accented characters, which will consume more than 1 byte in utf8. Knowing that Oracle doesn't trim spaces in char data types - I realised that any multi-byte character in a char data type was going to expand the overall length of the value and cause the error when importing to the AL32UTF8 database.
My workaround was to re-create the new AL32UTF8 database adding the initialisation parameter BLANK_TRIMMING = TRUE. This allowed all data to be successfully imported.
My questions are:
1). Is AL32UTF8 the best utf-8 character set to convert to?
2). Are there any side-effects on the day-to-day running of the database by setting the BLANK_TRIMMING to TRUE?
3). Is there an alternative method (to the BLANK_TRIMMING method) I could have used to successfully import all data - bearing in mind we cannot just increase the length of column names ad hoc as we maintain a generic database structure for all of our clients.
Any advise would be greatly appreciated.
Stew.
|
|
|
|