Home » Server Options » Text & interMedia » Word category and word relation (Oracle text)
Word category and word relation [message #608914] |
Wed, 26 February 2014 11:25 |
ariton
Messages: 5 Registered: February 2007 Location: Bucharest / Romania
|
Junior Member |
|
|
Word category and word relations
Starting from an already example - http://www.orafaq.com/forum/t/95509/
ID RUBRO TIPO_PRODUCT MARCA
---------- ----------------------------------- ---------------------------------------- -------------------------
8818 PAPEL BOND Papel Bond Blanco 75 gramos FIRST RESMA, CONPACA
8826 PAPEL BOND Papel Bond Blanco 75 gramos, por mtr.2 APUNTA, VIVIAN
8931 PAPEL BOND PAPEL BOND COLOR MONDI / LIBRAS, MONDI
10855 CARTONES, CARTONCILLO Y CARTULINAS Cart?n Chip de 30" X 40", calibre 20 PLEK, PAINSA
11029 CARTONES, CARTONCILLO Y CARTULINAS CARTULINA OPALINA BLANCA FIRST PLIEGO, CONPACA
Is it possible using ctx index and contains operator to obtain all words relating to a specific search? (RUBRO + TIPO_PRODUCT + MARCA are togheter into an ctx index)
Ex: If I search for PAPEL I want an output like this
SELECT * FROM test_tab WHERE CONTAINS (test_col, 'PAPEL') > 0
ID Related Category
8818 PAPEL RUBRO
8818 BOND RUBRO - because RUBRO is related to PAPEL from RUBRO category
8818 BLANCO TIPO_PRODUCT - because BLANCO is related to PAPEL from TIPO_PRODUCT
8818 BOND TIPO PRODUCT - ...
8818 FIRST MARCA
8818 RESMA MARCA
8818 CONPACA MARCA
...
Thank you very much,
Claudiu
[mod-edit: code tags added by bb; next time please add them yourself]
[Updated on: Wed, 26 February 2014 13:37] by Moderator Report message to a moderator
|
|
|
Re: Word category and word relation [message #608916 is a reply to message #608914] |
Wed, 26 February 2014 16:36 |
|
Barbara Boehmer
Messages: 9100 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
The following assumes that you want to retrieve all the words from all columns in any row that contains the search term.
SCOTT@orcl12c> -- original data table:
SCOTT@orcl12c> CREATE TABLE original_table
2 (id NUMBER PRIMARY KEY,
3 rubro VARCHAR2(34),
4 tipo_product VARCHAR2(38),
5 marca VARCHAR2(22))
6 /
Table created.
SCOTT@orcl12c> -- new table:
SCOTT@orcl12c> CREATE TABLE new_table
2 (id NUMBER,
3 related VARCHAR2(30),
4 category VARCHAR2(30),
5 test_col VARCHAR2(100))
6 /
Table created.
SCOTT@orcl12c> -- trigger that applies data changes to original table to new table:
SCOTT@orcl12c> CREATE OR REPLACE TRIGGER original_table_biudr
2 BEFORE INSERT OR UPDATE OR DELETE ON original_table
3 FOR EACH ROW
4 BEGIN
5 IF DELETING OR UPDATING THEN
6 DELETE FROM new_table WHERE id = :OLD.ID;
7 END IF;
8 IF INSERTING OR UPDATING THEN
9 INSERT INTO new_table (id, related, category, test_col)
10 SELECT :NEW.id, related, category,
11 :NEW.rubro || ' ' || :NEW.tipo_product || ' ' || :NEW.marca
12 FROM (SELECT UPPER (RTRIM (REGEXP_SUBSTR (:NEW.rubro, '[^ ,/]+', 1, COLUMN_VALUE), ' ,/')) related,
13 'RUBRO' category
14 FROM DUAL,
15 TABLE
16 (CAST
17 (MULTISET
18 (SELECT LEVEL
19 FROM DUAL
20 CONNECT BY LEVEL <= REGEXP_COUNT (:NEW.rubro, ' ') + 1)
21 AS SYS.ODCINUMBERLIST))
22 UNION
23 SELECT UPPER (RTRIM (REGEXP_SUBSTR (:NEW.tipo_product, '[^ ,/]+', 1, COLUMN_VALUE), ' ,/')) related,
24 'TIPO_PRODUCT' category
25 FROM DUAL,
26 TABLE
27 (CAST
28 (MULTISET
29 (SELECT LEVEL
30 FROM DUAL
31 CONNECT BY LEVEL <= REGEXP_COUNT (:NEW.tipo_product, ' ') + 1)
32 AS SYS.ODCINUMBERLIST))
33 UNION
34 SELECT UPPER (RTRIM (REGEXP_SUBSTR (:NEW.marca, '[^ ,/]+', 1, COLUMN_VALUE), ' ,/')) related,
35 'MARCA' category
36 FROM DUAL,
37 TABLE
38 (CAST
39 (MULTISET
40 (SELECT LEVEL
41 FROM DUAL
42 CONNECT BY LEVEL <= REGEXP_COUNT (:NEW.marca, ' ') + 1)
43 AS SYS.ODCINUMBERLIST)))
44 WHERE related IS NOT NULL;
45 END IF;
46 END original_table_biudr;
47 /
Trigger created.
SCOTT@orcl12c> SHOW ERRORS
No errors.
SCOTT@orcl12c> -- text index on new table:
SCOTT@orcl12c> CREATE INDEX test_idx ON new_table (test_col) INDEXTYPE IS CTXSYS.CONTEXT
2 PARAMETERS ('SYNC (ON COMMIT)')
3 /
Index created.
SCOTT@orcl12c> -- insert data into original table and commit:
SCOTT@orcl12c> INSERT ALL
2 INTO original_table VALUES (8818, 'PAPEL BOND', 'Papel Bond Blanco 75 gramos', 'FIRST RESMA, CONPACA')
3 INTO original_table VALUES (8826, 'PAPEL BOND', 'Papel Bond Blanco 75 gramos, por mtr.2', 'APUNTA, VIVIAN')
4 INTO original_table VALUES (8931, 'PAPEL BOND', 'PAPEL BOND COLOR', 'MONDI / LIBRAS, MONDI')
5 INTO original_table VALUES (10855, 'CARTONES, CARTONCILLO Y CARTULINAS', 'Cart?n Chip de 30" X 40", calibre 20', 'PLEK, PAINSA')
6 INTO original_table VALUES (11029, 'CARTONES, CARTONCILLO Y CARTULINAS', 'CARTULINA OPALINA BLANCA', 'FIRST PLIEGO, CONPACA')
7 SELECT * FROM DUAL
8 /
5 rows created.
SCOTT@orcl12c> COMMIT
2 /
Commit complete.
SCOTT@orcl12c> -- query new table using text index:
SCOTT@orcl12c> SET AUTOTRACE ON EXPLAIN
SCOTT@orcl12c> COLUMN related FORMAT A30
SCOTT@orcl12c> COLUMN category FORMAT A30
SCOTT@orcl12c> SELECT DISTINCT related, category
2 FROM new_table
3 WHERE CONTAINS (test_col, 'papel') > 0
4 ORDER BY DECODE (category, 'RUBRO', 1, 'TIPO_PRODUCT', 2, 'MARCA', 3), related
5 /
RELATED CATEGORY
------------------------------ ------------------------------
BOND RUBRO
PAPEL RUBRO
75 TIPO_PRODUCT
BLANCO TIPO_PRODUCT
BOND TIPO_PRODUCT
COLOR TIPO_PRODUCT
GRAMOS TIPO_PRODUCT
MTR.2 TIPO_PRODUCT
PAPEL TIPO_PRODUCT
POR TIPO_PRODUCT
APUNTA MARCA
CONPACA MARCA
FIRST MARCA
LIBRAS MARCA
MONDI MARCA
RESMA MARCA
VIVIAN MARCA
17 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2737401209
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 98 | 4 (0)| 00:00:01 |
| 1 | SORT ORDER BY | | 1 | 98 | 4 (0)| 00:00:01 |
| 2 | HASH UNIQUE | | 1 | 98 | 4 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| NEW_TABLE | 1 | 98 | 4 (0)| 00:00:01 |
|* 4 | DOMAIN INDEX | TEST_IDX | | | 4 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("CTXSYS"."CONTAINS"("TEST_COL",'papel')>0)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
SCOTT@orcl12c>
|
|
|
|
Re: Word category and word relation [message #608974 is a reply to message #608938] |
Thu, 27 February 2014 10:37 |
|
Barbara Boehmer
Messages: 9100 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
ariton wrote on Thu, 27 February 2014 02:48 Thank you Barbara for your great and quick response.
I undestand the approach, but from performance point of view, it is not suitable to use directly the DR$ table with USER_DATASTORE concatenation of columns? The DR$ tables already contain the words and the relations with original table, but I don't know it is possible to get the results in this way?
Claudiu
I can't picture what you are suggesting. Can you provide a demonstration or at least a partial demonstration?
If you use either a multi_column_datastore or a user_datastore to concatenate the columns, and create a context index that uses either datastore, then you can query using CONTAINS to get all rows containing the search word. You could use WITHIN in your CONTAINS clause to select from each column, then concatenate the results, to get your category. You could then use some SQL to extract all the words from those rows. If there are very few matching rows, then that might be efficient. If there are a lot of rows in your result set, then it could take a while.
If your are planning to query the dr$your_index$i domain index table directly, I don't see how you could get the desired results. Although there is a lot of information stored, it is not stored in a way that is accessible other than through CONTAINS queries. You won't be able to get the column that it comes from or even the row that it came from in order to extract the other words in the row. Also, if you access the dr$your_index$i table directly, then you are not using an index and it can produce unexpected side effects if you try to index such a table.
Have you tested what I provided? I posted the explain plan to show the efficient plan that it produces.
|
|
|
Re: Word category and word relation [message #608981 is a reply to message #608974] |
Thu, 27 February 2014 13:40 |
ariton
Messages: 5 Registered: February 2007 Location: Bucharest / Romania
|
Junior Member |
|
|
Thank you Barbara for your complete response. I already implemended your solution and it works great.
You are right about dr$ table, it is not supported directly by Oracle and the only way to access data is the "contains" operator.
Claudiu
|
|
|
Goto Forum:
Current Time: Tue Nov 26 22:19:56 CST 2024
|