Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> RE: pl/sql question and owa_pattern question

RE: pl/sql question and owa_pattern question

From: Guang Mei <gmei_at_incyte.com>
Date: Fri, 21 Nov 2003 08:59:31 -0800
Message-ID: <F001.005D7711.20031121085931@fatcity.com>


Perl is a good tool for text processing. But our program is already written in pl/sql long time ago and there are intensive db calls in this pl/sql program. (text processing is only part of it). So I can not change that.

BTW I did a comparison study a while ago for some of our pl/sql packages (specifically for our application). When there are lots of db calls (select, insert, update and delete), pl/sql package is faster than correponding perl program (I made sure sqls are prepared once and used bind variables in perl. All code were executed on the unix server, no other programs were running, etc). That's why we stick to pl/sql because our app need the performance. Others may have different results, it all depends on what the code does.

Guang

-----Original Message-----
Mladen Gogala
Sent: Thursday, November 20, 2003 11:14 PM To: Multiple recipients of list ORACLE-L

I don't know about PL/SQL but here is how I would get separate words from a big string:

#!/usr/bin/perl -w
use strict;
my (@ARR);
while (<>) {

    chomp;
    @ARR = split(/[^0-9a-zA-Z_\.,<>]/);
    foreach (@ARR) {

        print "$_\n";
    }
}

There is something called DBI and it can be used to insert separated words into the database, instead
of printing them. The bottom line is that perl is an excellent tool for parsing strings and all sorts of string manipulation.

On 2003.11.20 22:39, Guang Mei wrote:
> Hi:
>
> In my pl/sql program, I want to process each "word" in a string. The
> string is selected from a varchar2(300) column. The delimit that separates
> the words is not necessary space character. The definition of the delimit
> in this program is set as
>
> 1. Any character that is NOT AlphaNumerical (0-9, A-Z,a-z)
> and
> 2. the character is not one of these: '-.,/<*>_'
>
> Now my program is basically checking each character, find the delimit, and
> rebuild each word. After that I process each "word". The code looks like
> this:
>
> -------
> str := "This will be a long string with length upto 300 characters, it
> may contain some invisible characters';
> len := length(str)+1;
> for i in 1..len loop
> ch := substr(str,i,1);
> if (not strings.isAlnum(ch) and instr('-.,/<*>_', ch)<1) then
> if word is not null then
> -- do some processing to variable word !
> word := null; -- reset it
> end if;
> else
> word := word || ch; -- concat ch to word
> end if;
> end loop;
>
> -------
>
> I think It's taking too long because it loops through each characters. I
> hope I could find a way to speed it up. I don't have experiience in
> owa_pattern, but I thought there might be a way to do it here:
>
> ----
> str := "This will be a long string with length upto 300 characters, it
> may contain some invisible characters';
> newstr := str;
> pos := 1;
> while pos != 0 loop
> pos := owa_pattern.amatch(newstr, 1, '\W'); -- how can I mask out
> these '-.,/<*>_' ???
> word := substr(newstr, 1, pos-1);
> -- do some processing to variable word !
> if pos != 0 then
> newstr := substr(newstr, pos+1);
> end if;
> end loop;
> ------
>
> My simple tests showed that owa_pattern call is much slower than direct
> string manupilation. But I would like to try it in this case if I could
> easily get the "wrods" from the string. Any suggestions?
>
> TIA.
>
> Guang
>
>
> --
> Please see the official ORACLE-L FAQ: http://www.orafaq.net
> --
> Author: Guang Mei
> INET: gmei_at_incyte.com
>
> Fat City Network Services -- 858-538-5051 http://www.fatcity.com
> San Diego, California -- Mailing list and web hosting services
> ---------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an E-Mail message
> to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
> the message BODY, include a line containing: UNSUB ORACLE-L
> (or the name of mailing list you want to be removed from). You may
> also send the HELP command for other information (like subscribing).
>

--
Mladen Gogala
Oracle DBA
--
Please see the official ORACLE-L FAQ: http://www.orafaq.net
--
Author: Mladen Gogala
  INET: mgogala_at_adelphia.net

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: Guang Mei
  INET: gmei_at_incyte.com

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).
Received on Fri Nov 21 2003 - 10:59:31 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US