Home » Developer & Programmer » Precompilers, OCI & OCCI » Data Loading problem of French Input file (Oracle 10g, SUN OS 5.6 , Korn Shell , Pro*C)
Data Loading problem of French Input file [message #563699] |
Thu, 16 August 2012 09:33 |
|
pradeepsanthosh
Messages: 7 Registered: August 2012
|
Junior Member |
|
|
Hello All,
I am trying to load a input flat file (french data in it) into the database using pro*c (not using sql loader because of some validations ). i am reading the line by line and populating it into the structure and then process it.
The input file is encoded in WE8ISO8895P1 , I want the records to be populated into a table.
so i did set NLS_LANG=French_France.WE8ISO8895P1 and ran the pro*c program . i used character host variable in the insert query , i used data from the earlier read structure and set these character host variable.
The problem i am facing is , when i am printing the values before insert i could see correct data.
For example the printed data of variable is "pas de donné " and strlen is 14. the target table field is of varchar type
( name varchar(20 char) )
but after i insert i could see only the truncated value in the database i.e "pas de donn" in the table length is 11.
kindly request you all to provide me valuable suggestions , please let me know if you need further details.
Thanks in advance !!
Regards,
Pradeep
|
|
|
|
|
Re: Data Loading problem of French Input file [message #563716 is a reply to message #563705] |
Thu, 16 August 2012 10:55 |
|
pradeepsanthosh
Messages: 7 Registered: August 2012
|
Junior Member |
|
|
Thanks for the quick response !!
please find the details below.
1) connecting to oracle
EXEC SQL WHENEVER SQLERROR GOTO connect_error;
EXEC SQL CONNECT :username IDENTIFIED BY :password USING :ssid;
2) structure is
struct ref_ident
{
char nom_commercial [40];
char adr_no_voie [4];
char tel_entite [14];
char fax_entite [14];
char no_tva [14];
char no_siret [14];
char code_ape [4];
};
host variables are declared
commercial_name char(41);
tel_number char(15);
2) read records into the structure in loop one by one into structure ref_ident and then insert into a table
while(!feof(fp_read))
{
pt_struct = &ref_ident;
taille_struct = sizeof(struct _ref_ident);
fread(pt_struct,taille_struct,1,fp_read);
-- after reading into the structure some structure values are used to insert record into a table
memset(commercial_name,'\0',strlen(commercial_name));
memset(tel_number,'\0',strlen(tel_number));
strncpy(commercial_name,ref_ident.nom_commercial,40);
strncpy(tel_number,ref_ident.tel_entite,14);
-- inserting into table
printf("NAME[%s] LEN[%d] TELE[%s] LEN[%s]",commercia_name,strlen(commercial_name),tel_number,strlen(tel_number));
EXEC SQL
INSERT INTO CUSTOMER (NAME,TELEPHONE_NUM)
VALUES (:commercial_name,:tel_number);
-- Commiting in the end
EXEC SQL
COMMIT;
fgetc(fp_read)
}
make file
----------------------------
# Suffix rules
##############
.SUFFIXES: .pc
# definition
############
CODE=ANSI_C
SHELL=/usr/bin/ksh
LIB=${VERS}/lib
BIN=${VERS}/bin
SRC=${VERS}/src/proc
MAKE=${VERS}/make
LIBHOME=${ORACLE_HOME}/lib32
PROC=${ORACLE_HOME}/bin/proc
PROFLAGS=include=${LIB} line=yes code=${CODE}
CC=/logiciel/SUN_Studio_11/SUNWspro/bin/cc
LDFLAGS=-L${LIBHOME}
LIB806=${LIBHOME}/libclntsh.so -lc -lm
PROLDLIBS= ${LDFLAGS} ${LIB806}
OBJ=${MAKE}/load_detail.o
EXE=${BIN}/load_detail.exe
RM = rm -f
# regles
########
${EXE} : ${OBJ}
${CC} ${LDFLAGS} ${OBJ} -o ${EXE} ${PROLDLIBS} -L${LIB} -llogito -L${LIB} -lmodulecrm
${MAKE}/load_detail.o : ${SRC}/load_detail.c
${CC} -c ${SRC}/load_detail.c -I/usr/include -I${LIB}
${SRC}/load_detail.c : ${SRC}/load_detail.pc
${PROC} ${PROFLAGS} iname=${SRC}/load_detail.pc SQLCHECK=SEMANTICS USERID=$username/$passwd@$sid
clean :
$(RM) $(MAKE)/load_detail.o \
$(RM) $(SRC)/load_detail.lis \
$(RM) $(SRC)/load_detail.c
---------------------
I compiled it , there were no warnings . my default server profile language $NLS_LANG was us en_us utf8.
so i changed it to French
export NLS_LANG='French_France.WE8ISO8895P1'
export LANG ='fr_FR.ISO8859-1'
then executed the binary
it got completed successfully all the records were loaded but those who had french character in it were loaded with truncated values.
I connected to database from the same putty session using sqlplus
and also checked the NLS settings were of french only
SELECT * FROM nls_session_parameters;
select name,telephone_num from customer;
-- gave me truncated values for records which had french special character
|
|
|
|
Re: Data Loading problem of French Input file [message #563731 is a reply to message #563722] |
Thu, 16 August 2012 12:22 |
|
Michel Cadot
Messages: 68732 Registered: March 2007 Location: Saint-Maur, France, https...
|
Senior Member Account Moderator |
|
|
The problem might be there:
memset(commercial_name,'\0',strlen(commercial_name));
memset(tel_number,'\0',strlen(tel_number));
As the variables are not set strlen could return anything.
It should be:
memset(commercial_name,'\0',41*sizeof(char));
memset(tel_number,'\0',15*sizeof(char));
As you have this statement before insert:
printf("NAME[%s] LEN[%d] TELE[%s] LEN[%s]",commercia_name,strlen(commercial_name),tel_number,strlen(tel_number));
Please post the result of the execution.
Note that you used "commercia_name" when the variable name is "commercial_name" so this code is not the real one or it did not compile.
Regards
Michel
[Updated on: Thu, 16 August 2012 12:23] Report message to a moderator
|
|
|
|
Re: Data Loading problem of French Input file [message #563786 is a reply to message #563736] |
Fri, 17 August 2012 02:25 |
|
dws1
Messages: 15 Registered: July 2012
|
Junior Member |
|
|
Are your ref_ident strings long enough? Remember that C stores strings with a terminating NULL, so "buffer overflow" is actually *16* characters long: 'b'(1), 'u'(2), 'f'(3),...,'o'(14), 'w'(15), '\0'(16), and would need to be stored in at least char[16]. char[15] or less would give you undefined behaviour, which could well be the cause of this issue.
>>For example the printed data of variable is "pas de donné " and strlen is 14. the target table field is of varchar type
>>( name varchar(20 char) )
Which variable? Anyway, that implies this is NOT ref_ident::nom_commercial because we know that's varchar2(40) or similar, and putting that string into any of the others would cause a buffer overflow.
|
|
|
|
|
|
|
Re: Data Loading problem of French Input file [message #563858 is a reply to message #563808] |
Fri, 17 August 2012 10:14 |
|
Michel Cadot
Messages: 68732 Registered: March 2007 Location: Saint-Maur, France, https...
|
Senior Member Account Moderator |
|
|
Quote:[ Note there are 27 spaces after the é character, which is not getting displayed thru this post reply ]
Please read How to use [code] tags and make your code easier to read.
Quote:SELECT value into :nls_lang FROM nls_session_parameters where parameter='NLS_LANGUAGE';
and the output of the nls_lang variable was "AMERICAN"
The language is not important (I'm French and I always use American language but for dates and I can enter French specific charcaters), the root point is the character set.
Quote:american_america.utf8
This is correct ONLY if you actually use UTF8 in your client program.
Are you sure your program handles UTF8 character set? It is common in Java, rare in C (C programs usually use AL16UTTF16 for Unicode).
Regards
Michel
[Updated on: Fri, 17 August 2012 10:15] Report message to a moderator
|
|
|
Re: Data Loading problem of French Input file [message #563861 is a reply to message #563858] |
Fri, 17 August 2012 10:40 |
|
pradeepsanthosh
Messages: 7 Registered: August 2012
|
Junior Member |
|
|
Thanks Michel !
Please help me understand i am little confused in understanding the issue.
1) I receive a file which is encoded in FRENCH ISO8859-1 format.
2) I write a client program in pro*c and when executing it i set the NLS_LANG as the same encode format FRENCH ISO8859-1 format
Please correct me if i am wrong My assumption is that the client program will read the file correctly and insert the records into database. Database default characterset will be overriden by the NLS_LANG set by us in the host.
But this is not happening i could see that records are correctly split by fread and structure values are set correct. but after inserting the iso8859-1 characters are not in database instead they are removed.
I also did one experiment , i converted the input file to UTF-8 format using
iconv -f ISO8859-1 -t UTF-8 my_file > new_file
i then executed the binary with the new file as input and this time i didnt change my host $NLS_LANG to french instead by default it is set to American_America.UTF8
The result was fread had issue in correctly setting the structure values , i assume this is due to fread reads by bytes and it doesnot support multibyte character , i.e for a single ISO-8859-1 special character UTF-8 file had 2 bytes.
Regards,
Pradeep
[Updated on: Fri, 17 August 2012 10:44] Report message to a moderator
|
|
|
|
Re: Data Loading problem of French Input file [message #564012 is a reply to message #563869] |
Mon, 20 August 2012 04:01 |
|
dws1
Messages: 15 Registered: July 2012
|
Junior Member |
|
|
There are two simple rules for solving virtually every NLS issue:
(1) Set the database characterset to a characterset that represents characters you want to store in the database.
(2) Set NLS_LANG to represent the characterset your application uses. This environment variable tells Oracle what characterset your program is using.
Oracle will then correctly translate characters from your application for storage in the DB, assuming the characters are represented by both charactersets, and if they aren't, you'll get an 0xBF.
Do those and virtually every NLS issue disappears, in my experience. Assume either doesn't need to hold for any reason, however reasonable it sounds, and you WILL get NLS issues.
[Updated on: Mon, 20 August 2012 04:03] Report message to a moderator
|
|
|
|
|
Goto Forum:
Current Time: Sun Feb 02 09:05:57 CST 2025
|