Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> hint needed to complete a single "tokens deduplication" update statement ...
Hello all,
I am close to have a beatiful approach
to de-duplicate tokens from each name row
in a table i.e. I have a table of names
and I want to make a single statement update
that remove duplicated terms within the name
e.g.
SULTAN HAMID SULTAN should be updated to
SULTAN HAMID I already have the core approach of the single row de-duplicating statement:
SELECT DISTINCT VALUE(term) FROM
THE(SELECT tokenize(name_full)
FROM T_NLM_NAME
WHERE name_id=<current_id>) term;
where "tokenize" is a PL/SQL function that
does exactly that: returns the collection
of tokens that occur in a VARCHAR2(X). The
collection type is defined as:
type strTable is table of VARCHAR2(X).
Now that I have the result set of distinct tokens I want to bring it back to VARCHAR2 but I do not know how to do that, would be something like (this of course does not work otherwise I would not be asking :-))
SELECT to_char(CASE WHEN RNUM=1 THEN term END)
|| to_char(CASE WHEN RNUM=2 THEN term END)
|| to_char(CASE WHEN RNUM=3 THEN term END) as deduplicated_name_full
FROM (
SELECT DISTINCT VALUE(term) as term
, row_number() over (order by VALUE(term)) as RNUM FROM THE(SELECT tokenize(name_full)
FROM T_NLM_NAME WHERE name_id=<current_id>) term);
The complete thing would look like:
UPDATE T_NLM_NAME
SET name_full=
SELECT <what_to_do_here> as name_full
FROM (
SELECT DISTINCT VALUE(term) as term
, row_number() over (order by VALUE(term)) as rnum FROM THE(SELECT tokenize(name_full)
FROM T_NLM_NAME WHERE name_id=<current_id>) term);
Thanks in advance,
Best Regards,
Giovanni
Received on Mon Oct 25 2004 - 06:45:20 CDT