Home » RDBMS Server » Performance Tuning » Query Re-write for connect by (Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production)
Query Re-write for connect by [message #578582] |
Fri, 01 March 2013 09:43 |
manubatham20
Messages: 566 Registered: September 2010 Location: Seattle, WA, USA
|
Senior Member |
|
|
Hi,
Please consider the following scenario.
1. Place test.txt file in d: drive.
2. Go to SQL prompt (SYS or SYSTEM user)
3. Create directory object by executing below:
create directory TESTDIR as 'D:\';
4. Create table using table script:
create table test (row_no number, col_no number, item varchar2(30));
5. Run the below script to read the file and process it line by line to insert the items in the test table in order of transaction rows and columns.
DECLARE
F UTL_FILE.FILE_TYPE;
S VARCHAR2(200);
C number:=1;
COUNTER_FOR_LINE number:=1;
COUNTER number:=1;
STRLENGTH number:=NULL;
COMMA_POS number :=NULL;
ITEM_NAME varchar2(30);
BEGIN
F := UTL_FILE.FOPEN('TESTDIR','TEST.TXT','R');
LOOP
UTL_FILE.GET_LINE(F,S);
LOOP
STRLENGTH :=LENGTH(S);
COMMA_POS :=INSTR(S,',',COUNTER_FOR_LINE);
IF comma_pos IS NULL THEN
ITEM_NAME :=SUBSTR(S,COUNTER_FOR_LINE);
ELSE
ITEM_NAME :=SUBSTR(S,COUNTER_FOR_LINE,COMMA_POS-COUNTER_FOR_LINE);
END IF;
INSERT INTO TEST VALUES (C,COUNTER,ITEM_NAME);
ITEM_NAME := NULL;
COUNTER :=COUNTER +1;
COUNTER_FOR_LINE :=COMMA_POS +1;
EXIT WHEN
(COUNTER_FOR_LINE >=strLENGTH ) OR (comma_pos = NULL);
END LOOP;
COMMA_POS := NULL;
COUNTER :=1;
COUNTER_FOR_LINE := 1;
C:=C+1;
END LOOP;
EXCEPTION
WHEN NO_DATA_FOUND THEN
UTL_FILE.FCLOSE(f);
DBMS_OUTPUT.PUT_LINE('Number of lines: ' || c);
UTL_FILE.FCLOSE(f);
END;
/
6. Now to get all the possible combinations of ordered items, run the below script:
create table combinations as
with
temp as (
select row_no, row_number() over (partition by row_no order by item) col_no,item from test),
items as (
select row_no, level lvl, substr(sys_connect_by_path(item,','),2) items
from temp
connect by prior row_no = row_no and col_no > PRIOR col_no
),
counts as (
select i.items, count(*)/i.lvl cnt, i.lvl
from items i, temp t
where i.items like '%'||t.item||'%'
and i.row_no = t.row_no
group by i.items, i.lvl
),
total as ( select count(distinct row_no) total from temp )
select lvl, items, sum(cnt) cnt, sum(cnt)/total ratio
from counts, total
group by lvl, items, total
order by lvl, items
/
7. Run the below query to get the weights (ratio).
create table edges as
select '('||a.items||')-('||b.items||')' EDGE, to_char(b.ratio/a.ratio,'99.99') WEIGHTS from combinations a, combinations b
where b.lvl=a.lvl+1
and INSTR(b.items,a.items) > 0
order by a.lvl,a.items
/
Now EDGES tables have all edges.
9. Create the function multiply as below:
create or replace function multiply
(p_a in varchar2)
return number
as
v_result number;
begin
execute immediate 'select ' || p_a || ' from dual' into v_result;
return v_result;
end multiply;
/
10. Execute below query to get the desired result:
Input in below query:
1. Replace the items in alphabatic arranged format
2. Replace the level = No. of items - 1
select sys_connect_by_path(substr(edge,instr(edge,'-')+1)||'-'||substr(edge,1,instr(edge,'-')-1), '/') "PATH",
multiply(substr(sys_connect_by_path(weights,'*'),2))
from edges
where level=3
start with substr(edge,instr(edge,'-')+1)='(BEER,COKE,DIAPER,MILK)'
connect by substr(edge,instr(edge,'-')+1) = prior substr(edge,1,instr(edge,'-')-1)
/
I want to fetch the result till step 8, but not able to fetch. Can any of you please help.
I think it will require Query Re-write. Please let me know if any step require any clarification.
First try with the below example (Below is the data for Sample TEST.TXT)
BREAD,MILK,
BEER,DIAPER,BREAD,EGGS,
BEER,COKE,DIAPER,MILK,
BEER,BREAD,DIAPER,MILK,
COKE,BREAD,DIAPER,MILK,
In addition (I have not included this condition in 6th step, we can also exclude while making combinations like {BREAD}-{BREAD,MILK} or {BREAD,BUTTER}-{BREAD,BUTTER,MILK}
Regards,
Manu
-
Attachment: TEST.TXT
(Size: 135.45KB, Downloaded 2309 times)
|
|
|
Goto Forum:
Current Time: Wed Dec 18 00:44:23 CST 2024
|