from html to xml and insert to table [message #478950] |
Wed, 13 October 2010 14:01 |
mape
Messages: 298 Registered: July 2006 Location: Slovakia
|
Senior Member |
|
|
Hi
I basically get it how to insert selected records from html to table by converting into the xml format thanks to
http://www.orafaq.com/forum/m/478306/88643
(Barbara and Michel know what Im talking about).
But now I cant get the hang of convert one html site
where are a very usefull information for my job.
Its about the site http://www.mobitola.sk/phones/Sony_Ericsson/C702/
where I need to get informations about phone specs and insert into the table.
Informations begin with" rozmery telefónu 106 x 48 x 16 mm"
and ends "podpora WCSS".
However I try to figure it out always I gonna get some ORA errors.
Would you be so kind and show me how to do it?
Thanks a lot
Regards
|
|
|
Re: from html to xml and insert to table [message #478952 is a reply to message #478950] |
Wed, 13 October 2010 14:14 |
mape
Messages: 298 Registered: July 2006 Location: Slovakia
|
Senior Member |
|
|
I got a problem to remote connect to database right now
but it's all about to select phone specs and insert into the table.
I try to do it like:
DECLARE
l_pieces utl_http.html_pieces;
l_url varchar2(32767) default 'http://www.mobitola.sk/phones/Sony_Ericsson/C702/';
l_page clob;
BEGIN
-- get html pieces from main url and put page into clob:
dbms_lob.createtemporary (l_page, TRUE);
l_pieces := utl_http.request_pieces(l_url);
for i in 1 .. l_pieces.count loop
l_page := l_page || l_pieces(i);
--DBMS_OUTPUT.PUT_LINE(l_pieces(i) );
end loop;
-- do whatever needs to be done to convert the html to valid xml:
l_page := substr (l_page, instr (l_page, 'základné údaje'));
l_page := substr (l_page, 1, instr (l_page, 'podpora WCSS') + 7);
DBMS_OUTPUT.PUT_LINE(l_page);
-- free temporary clob:
dbms_lob.freetemporary (l_page);
-- end loop;
end;
If you run it you will see ORA errors.
[Updated on: Wed, 13 October 2010 14:15] Report message to a moderator
|
|
|
|
|
Re: from html to xml and insert to table [message #478959 is a reply to message #478956] |
Wed, 13 October 2010 14:23 |
mape
Messages: 298 Registered: July 2006 Location: Slovakia
|
Senior Member |
|
|
I dont get it "print" in my procedure.
DBMS_OUTPUT.PUT_LINE is there just for see html parse.
I can remove this dbms_output from plsq
but the problem is still how to insert records into the table.
Green check sign means YES and other one red cross means NO.
[Updated on: Wed, 13 October 2010 14:28] Report message to a moderator
|
|
|
|
|
|
|
|
Re: from html to xml and insert to table [message #478973 is a reply to message #478969] |
Wed, 13 October 2010 15:01 |
|
Barbara Boehmer
Messages: 9100 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
If you remove the dbms_output, then there aren't any errors. What I frequently do when I need to view long things to debug them is instead of:
declare
...
begin
...
dbms_output.put_line (l_page);
...
end;
/
I use:
variable g_ref refcursor
declare
...
begin
...
open :g_ref for select l_page from dual;
...
end;
/
print g_ref
as shown below:
SCOTT@orcl_11gR2> VARIABLE g_ref REFCURSOR
SCOTT@orcl_11gR2> DECLARE
2 l_pieces UTL_HTTP.HTML_PIECES;
3 l_url VARCHAR2(32767) DEFAULT 'http://www.mobitola.sk/phones/Sony_Ericsson/C702/';
4 l_page CLOB;
5 BEGIN
6 -- get html pieces from main url and put page into clob:
7 DBMS_LOB.CREATETEMPORARY (l_page, TRUE);
8 l_pieces := UTL_HTTP.REQUEST_PIECES (l_url);
9 FOR i in 1 .. l_pieces.COUNT LOOP
10 l_page := l_page || l_pieces(i);
11 END LOOP;
12
13 -- do whatever needs to be done to convert the html to valid xml:
14 l_page := SUBSTR (l_page, INSTR (l_page, 'základné údaje'));
15 l_page := SUBSTR (l_page, 1, INSTR (l_page, 'podpora WCSS') + 7);
16
17 OPEN :g_ref FOR SELECT l_page FROM DUAL;
18
19 -- free temporary clob:
20 DBMS_LOB.FREETEMPORARY (l_page);
21 END;
22 /
PL/SQL procedure successfully completed.
SCOTT@orcl_11gR2> PRINT g_ref
:B1
--------------------------------------------------------------------------------
základné údaje</h3></td></tr>
... truncated to save space
<td width="50%"><img src="/images/pu.png" width=12 height=12> podpora
1 row selected.
SCOTT@orcl_11gR2>
|
|
|