conversion problems [message #663929] |
Mon, 26 June 2017 09:14 |
|
banach600
Messages: 14 Registered: June 2017
|
Junior Member |
|
|
Hi all,
In a project, I am led to select data in oracle database (distant server) and then inject them into a mysql database (local machine with Ubuntu 16). In the oracle database, we have data in French and Arabic, with the following parameters:
NLS_LANGUAGE FRENCH
NLS_TERRITORY FRANCE
NLS_CURRENCY ?
NLS_ISO_CURRENCY FRANCE
NLS_NUMERIC_CHARACTERS ,.
NLS_CHARACTERSET WE8ISO8859P9
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT DD/MM/YY
NLS_DATE_LANGUAGE FRENCH
NLS_SORT FRENCH
NLS_TIME_FORMAT HH24:MI:SSXFF
NLS_TIMESTAMP_FORMAT DD/MM/RR HH24:MI:SSXFF
NLS_TIME_TZ_FORMAT HH24:MI:SSXFF TZR
NLS_TIMESTAMP_TZ_FORMAT DD/MM/RR HH24:MI:SSXFF TZR
NLS_DUAL_CURRENCY ?
NLS_COMP BINARY
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_RDBMS_VERSION 9.2.0.4.0
In the local machine, when I use UTF8, the accented characters are displayed correctly, unlike the Arabic characters (ÏÈáæã). I also tested AMERICAN_AMERICA.WE8ISO8859P9, then there all the data are displayed correctly in the navigator (modulo the change of encoding in windows-1256). On the other hand, when I start inserting this data in mysql, I receive this error:
SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF4me EN...' for column 'lic_dip' at row 1
Thank to anyone who can enlighten me on this problem.
|
|
|
|
Re: conversion problems [message #663931 is a reply to message #663929] |
Mon, 26 June 2017 10:31 |
|
Michel Cadot
Messages: 68722 Registered: March 2007 Location: Saint-Maur, France, https...
|
Senior Member Account Moderator |
|
|
Quote: In the oracle database, we have data in French and Arabic, with the following parameters:
...
NLS_CHARACTERSET WE8ISO8859P9
...
Assuming what you posted are database and client character sets; from doc, WE8ISO8859P9 is ISO 8859-9 West European & Turkish.
The correct 8-bit character set for both French and Arabic is AR8MSWIN1256, but it is a partial support of both (for instance there is no French accented uppercase characters), for full support you have to use Unicode character sets: AL16UTF16 or AL32UTF8.
[Updated on: Mon, 26 June 2017 10:36] Report message to a moderator
|
|
|
Re: conversion problems [message #663932 is a reply to message #663930] |
Mon, 26 June 2017 19:21 |
|
banach600
Messages: 14 Registered: June 2017
|
Junior Member |
|
|
Hello,
Sorry if I did not explain my problem well. I try to import data from an oracle database in a remote server to a mysql database in my computer.
Oracle DB version (remote server) :
SQL> SELECT * FROM V$VERSION;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
PL/SQL Release 9.2.0.4.0 - Production
CORE 9.2.0.3.0 Production
TNS for IBM/AIX RISC System/6000: Version 9.2.0.4.0 - Production
NLSRTL Version 9.2.0.4.0 - Production
NLS parameters :
SQL> SELECT * FROM V$NLS_PARAMETERS
2 ;
PARAMETER
----------------------------------------------------------------
VALUE
----------------------------------------------------------------
NLS_LANGUAGE
FRENCH
NLS_TERRITORY
FRANCE
NLS_CURRENCY
?
PARAMETER
----------------------------------------------------------------
VALUE
----------------------------------------------------------------
NLS_ISO_CURRENCY
FRANCE
NLS_NUMERIC_CHARACTERS
,.
NLS_CALENDAR
GREGORIAN
PARAMETER
----------------------------------------------------------------
VALUE
----------------------------------------------------------------
NLS_DATE_FORMAT
DD/MM/YY
NLS_DATE_LANGUAGE
FRENCH
NLS_CHARACTERSET
WE8ISO8859P9
PARAMETER
----------------------------------------------------------------
VALUE
----------------------------------------------------------------
NLS_SORT
FRENCH
NLS_TIME_FORMAT
HH24:MI:SSXFF
NLS_TIMESTAMP_FORMAT
DD/MM/RR HH24:MI:SSXFF
PARAMETER
----------------------------------------------------------------
VALUE
----------------------------------------------------------------
NLS_TIME_TZ_FORMAT
HH24:MI:SSXFF TZR
NLS_TIMESTAMP_TZ_FORMAT
DD/MM/RR HH24:MI:SSXFF TZR
NLS_DUAL_CURRENCY
?
PARAMETER
----------------------------------------------------------------
VALUE
----------------------------------------------------------------
NLS_NCHAR_CHARACTERSET
AL16UTF16
NLS_COMP
BINARY
NLS_LENGTH_SEMANTICS
BYTE
PARAMETER
----------------------------------------------------------------
VALUE
----------------------------------------------------------------
NLS_NCHAR_CONV_EXCP
FALSE
On my computer, I have Ubuntu 16.04, and the collation of Mysql database is utf8_unicode_ci.
My problem is that I can not find the right NLS_LANG parameter. Indeed,
- Using export NLS_LANG=FRENCH_FRANCE.AL32UTF8 in my computer, selected data with arabic characters from oracle can't be displayed correctly in the navigator (ÏÈáæã).
- Using export NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P9, french and arabic words are displayed correctly in the navigator (modulo the change of encoding in windows-1256). But when I want to insert this data in mysql, I have the following error :
SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF4me EN...' for column 'lic_dip' at row 1
I hope that now my problem is clear. Thank you in advance.
[Updated on: Mon, 26 June 2017 19:32] Report message to a moderator
|
|
|
Re: conversion problems [message #663935 is a reply to message #663932] |
Tue, 27 June 2017 01:25 |
|
Michel Cadot
Messages: 68722 Registered: March 2007 Location: Saint-Maur, France, https...
|
Senior Member Account Moderator |
|
|
The problem is that if you have not a correct database set it is normal you have not the correct data.
In short, the data are wrong INSIDE the original database whatever you can see with a navigator.
Let see what happen when you inserted:
1/ You create a database with WE8ISO8859P9 character set
2/ You say Oracle that as client you use WE8ISO8859P9 character set
3/ Whatever you do now, Oracle knows that it has to make no conversion and no verification on what character you enter as you use the same character set for both
4/ You insert a character a) with a valid code point but that corresponds to another character or b) that is not in this character set (an invalid code point), it is inserted anyway
5/ You select with the same character set, no problem Oracle knows it has nothing to do so it sends you the character, or rather the byte you inserted
6/ You want to select it with another character set (for instance UTF8), Oracle has to convert this character into this later character set and then a) sends you the code point for the character it thinks it has in the new character set (so you have not your original character) or b) realizes the character it has to convert is NOT in the character set, so you have an error.
As you can see the error is already there lurked inside the database.
[Edit! missing word]
[Updated on: Wed, 28 June 2017 07:25] Report message to a moderator
|
|
|
|
|
|