Home » RDBMS Server » Performance Tuning » Perfomance problem with select distinct from cartesian join with OR-ed filter (Production: (11.2.0.1, Windows Server x64); Test: (9.2.01, Windows XP))
Perfomance problem with select distinct from cartesian join with OR-ed filter [message #522536] Mon, 12 September 2011 04:21 Go to next message
malishich
Messages: 8
Registered: September 2011
Location: ALA
Junior Member
Hello all.
Having production system: 11.2.0.1 on Windows Server x64
Test system: 9.2.0.1 on Windows XP

Problem preface: to get all unique CASEID which should be checked up by biometric system.
What i should check - all CASEs for different PERSONs having same PHONEs at least among one phone type (1..4).
Real table contains little bit more than 10 million records.
I made test scripts.

Below the DDL for test table creation:
------------------------------------------
-- Create CASEINFO test table 
------------------------------------------
DROP TABLE CASEINFO;
CREATE TABLE CASEINFO
(
       CASEID INTEGER,
       PERSON INTEGER,
       SEX INTEGER,
       FED INTEGER, REG INTEGER,
       PHONE1 VARCHAR2(24),
       PHONE2 VARCHAR2(24),
       PHONE3 VARCHAR2(24),
       PHONE4 VARCHAR2(24)
);
ALTER TABLE CASEINFO ADD PRIMARY KEY (CASEID) USING INDEX;
CREATE INDEX I_PERSON ON CASEINFO (PERSON);
CREATE INDEX I_SEX ON CASEINFO (SEX);
CREATE INDEX I_FED ON CASEINFO (FED);
CREATE INDEX I_REG ON CASEINFO (REG);
CREATE INDEX I_FED_REG ON CASEINFO (FED,REG);
CREATE INDEX I_PHONE1 ON CASEINFO (PHONE1);
CREATE INDEX I_PHONE2 ON CASEINFO (PHONE2);
CREATE INDEX I_PHONE3 ON CASEINFO (PHONE3);
CREATE INDEX I_PHONE4 ON CASEINFO (PHONE4);
CREATE INDEX I_PHONES_12 ON CASEINFO (PHONE1,PHONE2);
CREATE INDEX I_PHONES_13 ON CASEINFO (PHONE1,PHONE3);
CREATE INDEX I_PHONES_14 ON CASEINFO (PHONE1,PHONE4);
CREATE INDEX I_PHONES_23 ON CASEINFO (PHONE2,PHONE3);
CREATE INDEX I_PHONES_24 ON CASEINFO (PHONE2,PHONE4);
CREATE INDEX I_PHONES_34 ON CASEINFO (PHONE3,PHONE4);


Below i've put SQL/DLL to make test data.
Be aware, number of records inserted 2 millions.
If you need other amount # for test, please set
PERSON_COUNT := #/8;
------------------------------------------
-- fill CASEINFO with sample data
------------------------------------------
DECLARE
  I INTEGER;
  R1 INTEGER;
  R2 INTEGER;
  N INTEGER;
  M INTEGER;
  OBJECTS_PER_PERSON INTEGER;
  APERSON INTEGER;
  PERSON_COUNT INTEGER;
  CRIMINAL_PERCENT INTEGER;
  RNDSEED NUMBER(20);
  APHONE VARCHAR2(24);
BEGIN
  OBJECTS_PER_PERSON := 8;
  PERSON_COUNT := 250000;
  CRIMINAL_PERCENT := 3;
  N := OBJECTS_PER_PERSON * PERSON_COUNT;
  DBMS_OUTPUT.PUT_LINE('Record count '||TO_CHAR(N)||' ...');
  FOR I IN 0..N-1 LOOP
    APERSON := MOD(I, PERSON_COUNT);
    INSERT INTO CASEINFO 
      (
        CASEID,
        PERSON,
        SEX,
        FED,
        REG,
        PHONE1,
        PHONE2,
        PHONE3,
        PHONE4
      ) 
           VALUES
           (
             I+1,
             APERSON+1,
             MOD(APERSON,2),
             MOD(I,13)+1,
             MOD(I,97)+1,
             '+7727'||
               TRIM(TO_CHAR(APERSON+1,'0000009'))||'1',
             '+7727'||
               TRIM(TO_CHAR(APERSON+1,'0000009'))||'2',
             '+7727'||
               TRIM(TO_CHAR(APERSON+1,'0000009'))||'3',
             '+7727'||
               TRIM(TO_CHAR(APERSON+1,'0000009'))||'4'
           );
    IF MOD(I,1000)=0 THEN
      COMMIT;
      --DBMS_OUTPUT.PUT_LINE('Done '||TO_CHAR(I)||' ...');
    END IF;
  END LOOP;
  COMMIT;
  ------------------------------------------
  -- Lets make "CRIMINAL" CASEs among which we're searching for
  --on 11.2.0.1 next line isn't working
  --SELECT hsecs INTO RNDSEED FROM gv$timer;
  DBMS_RANDOM.INITIALIZE(131071);
  M := TRUNC(N * CRIMINAL_PERCENT / 100, 0);
  DBMS_OUTPUT.PUT_LINE('Criminal count '||TO_CHAR(M)||' ...');
  FOR I IN 0..M-1 LOOP
    --Phone1 => Phone2
    R1 := MOD(ABS(DBMS_RANDOM.RANDOM), N)+1;
    R2 := MOD(ABS(DBMS_RANDOM.RANDOM), N)+1;
    APHONE := '+7-CRIMINAL-'||
        TRIM(TO_CHAR(R1,'0000009'))||'-12';
    UPDATE CASEINFO SET PHONE1=APHONE WHERE (CASEID=R1);
    UPDATE CASEINFO SET PHONE2=APHONE WHERE (CASEID=R2);
    --Phone1 => Phone3
    R1 := MOD(ABS(DBMS_RANDOM.RANDOM), N)+1;
    R2 := MOD(ABS(DBMS_RANDOM.RANDOM), N)+1;
    APHONE := '+7-CRIMINAL-'||
        TRIM(TO_CHAR(R1,'0000009'))||'-13';
    UPDATE CASEINFO SET PHONE1=APHONE WHERE (CASEID=R1);
    UPDATE CASEINFO SET PHONE3=APHONE WHERE (CASEID=R2);
    --Phone1 => Phone4
    R1 := MOD(ABS(DBMS_RANDOM.RANDOM), N)+1;
    R2 := MOD(ABS(DBMS_RANDOM.RANDOM), N)+1;
    APHONE := '+7-CRIMINAL-'||
        TRIM(TO_CHAR(R1,'0000009'))||'-14';
    UPDATE CASEINFO SET PHONE1=APHONE WHERE (CASEID=R1);
    UPDATE CASEINFO SET PHONE4=APHONE WHERE (CASEID=R2);
    --Phone2 => Phone3
    R1 := MOD(ABS(DBMS_RANDOM.RANDOM), N)+1;
    R2 := MOD(ABS(DBMS_RANDOM.RANDOM), N)+1;
    APHONE := '+7-CRIMINAL-'||
        TRIM(TO_CHAR(R1,'0000009'))||'-23';
    UPDATE CASEINFO SET PHONE2=APHONE WHERE (CASEID=R1);
    UPDATE CASEINFO SET PHONE3=APHONE WHERE (CASEID=R2);
    --Phone2 => Phone4
    R1 := MOD(ABS(DBMS_RANDOM.RANDOM), N)+1;
    R2 := MOD(ABS(DBMS_RANDOM.RANDOM), N)+1;
    APHONE := '+7-CRIMINAL-'||
        TRIM(TO_CHAR(R1,'0000009'))||'-24';
    UPDATE CASEINFO SET PHONE2=APHONE WHERE (CASEID=R1);
    UPDATE CASEINFO SET PHONE4=APHONE WHERE (CASEID=R2);
    --Phone3 => Phone4
    R1 := MOD(ABS(DBMS_RANDOM.RANDOM), N)+1;
    R2 := MOD(ABS(DBMS_RANDOM.RANDOM), N)+1;
    APHONE := '+7-CRIMINAL-'||
        TRIM(TO_CHAR(R1,'0000009'))||'-34';
    UPDATE CASEINFO SET PHONE3=APHONE WHERE (CASEID=R1);
    UPDATE CASEINFO SET PHONE4=APHONE WHERE (CASEID=R2);
    --
    COMMIT;
  END LOOP;
  COMMIT;
  DBMS_RANDOM.TERMINATE;
END;


Below SQL select to check the data in created table.
------------------------------------------
-- Check test data counters
------------------------------------------
SELECT 'TOTAL',count(*) from CASEINFO
UNION ALL
SELECT 'LEGAL',count(*) from CASEINFO where 
  (PHONE1 = '+7727'||
    TRIM(TO_CHAR(PERSON,'0000009'))||'1') AND 
  (PHONE2 = '+7727'||
    TRIM(TO_CHAR(PERSON,'0000009'))||'2') AND 
  (PHONE3 = '+7727'||
    TRIM(TO_CHAR(PERSON,'0000009'))||'3') AND
  (PHONE4 = '+7727'||
    TRIM(TO_CHAR(PERSON,'0000009'))||'4')
UNION ALL
SELECT 'CRIMINAL',count(*) from CASEINFO where 
  (PHONE1 <> '+7727'||
    TRIM(TO_CHAR(PERSON,'0000009'))||'1') OR
  (PHONE2 <> '+7727'||
    TRIM(TO_CHAR(PERSON,'0000009'))||'2') OR
  (PHONE3 <> '+7727'||
    TRIM(TO_CHAR(PERSON,'0000009'))||'3') OR
  (PHONE4 <> '+7727'||
    TRIM(TO_CHAR(PERSON,'0000009'))||'4')
UNION ALL
SELECT 'PHONE1',count(*) from CASEINFO where 
  (PHONE1 <> '+7727'||
    TRIM(TO_CHAR(PERSON,'0000009'))||'1')
UNION ALL
SELECT 'PHONE2',count(*) from CASEINFO where 
  (PHONE2 <> '+7727'||
    TRIM(TO_CHAR(PERSON,'0000009'))||'2')
UNION ALL
SELECT 'PHONE3',count(*) from CASEINFO where 
  (PHONE3 <> '+7727'||
    TRIM(TO_CHAR(PERSON,'0000009'))||'3')
UNION ALL
SELECT 'PHONE4',count(*) from CASEINFO where 
  (PHONE4 <> '+7727'||
    TRIM(TO_CHAR(PERSON,'0000009'))||'4');


The PROBLEM is that i am experiencing HUGE perfomance problems on both test and production systems with that query:
select distinct b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and 
(
      (a.phone1=b.phone1) or 
      (a.phone1=b.phone2) or 
      (a.phone1=b.phone3) or 
      (a.phone1=b.phone4) or 

      (a.phone2=b.phone1) or 
      (a.phone2=b.phone2) or 
      (a.phone2=b.phone3) or 
      (a.phone2=b.phone4) or

      (a.phone3=b.phone1) or 
      (a.phone3=b.phone2) or 
      (a.phone3=b.phone3) or
      (a.phone3=b.phone4) or

      (a.phone4=b.phone1) or 
      (a.phone4=b.phone2) or 
      (a.phone4=b.phone3) or 
      (a.phone4=b.phone4)
);

This query takes almost 90 minutes to execute.
And i do not know how to avoid this.

Full SQL file to make test attached.

Any help appreciated.
Even data restructurization.
Re: Perfomance problem with select distinct from cartesian join with OR-ed filter [message #522537 is a reply to message #522536] Mon, 12 September 2011 04:38 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
phone1, phone2... should be in X rows in another table, this is 1NF (see Normalization).
What will happen if someone has 5 phone numbers?

For any peformances question, please read http://www.orafaq.com/forum/mv/msg/84315/433888/102589/#msg_433888 and post the required information.

Regards
Michel
Re: Perfomance problem with select distinct from cartesian join with OR-ed filter [message #522542 is a reply to message #522537] Mon, 12 September 2011 05:27 Go to previous messageGo to next message
malishich
Messages: 8
Registered: September 2011
Location: ALA
Junior Member
According normalization to 1NF - the data inserted from documents - one page form.
Also i've posted simplified problem and logic. I am specially connected only 4 varchar2 phones, but in real it has 30 different fields (in meaning) to be selected using special logic.
You mean 1NF - i should create tables:
1. persons with field person(primary key)
2. caseinfo with 2 fields caseid(primary), person(foreign)
3. casedata with 2 fields caseid(foreign), phone
But as i understand after that i will need to make select with 2 cartesian joins, because need to find same phone , after to link them to person table.

...
I will post missed "EXPLAIN PLAN" and "SQL Trace"...
Re: Perfomance problem with select distinct from cartesian join with OR-ed filter [message #522547 is a reply to message #522542] Mon, 12 September 2011 06:02 Go to previous messageGo to next message
malishich
Messages: 8
Registered: September 2011
Location: ALA
Junior Member
My Oracle give this plan:
"PLAN_TABLE_OUTPUT"
"Plan hash value: 3878668926"
" "
"---------------------------------------------------------------------------------------------------"
"| Id  | Operation                      | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |"
"---------------------------------------------------------------------------------------------------"
"|   0 | SELECT STATEMENT               |          |    52G|  8630G|       |  2747M  (1)|999:59:59 |"
"|   1 |  HASH UNIQUE                   |          |    52G|  8630G|  9742G|  2747M  (1)|999:59:59 |"
"|   2 |   CONCATENATION                |          |       |       |       |            |          |"
"|   3 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|   4 |     NESTED LOOPS               |          |    52G|  8630G|       |    58  (97)| 00:00:01 |"
"|   5 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|*  6 |      INDEX RANGE SCAN          | I_PHONE4 |     1 |       |       |     0   (0)| 00:00:01 |"
"|*  7 |     TABLE ACCESS BY INDEX ROWID| CASEINFO | 25114 |  2329K|       |     0   (0)| 00:00:01 |"
"|   8 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|   9 |     NESTED LOOPS               |          |   438K|    74M|       |    58  (97)| 00:00:01 |"
"|  10 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|* 11 |      INDEX RANGE SCAN          | I_PHONE3 |     1 |       |       |     0   (0)| 00:00:01 |"
"|* 12 |     TABLE ACCESS BY INDEX ROWID| CASEINFO |     1 |    95 |       |     0   (0)| 00:00:01 |"
"|  13 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|  14 |     NESTED LOOPS               |          | 21940 |  3792K|       |    58  (97)| 00:00:01 |"
"|  15 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|* 16 |      INDEX RANGE SCAN          | I_PHONE2 |     1 |       |       |     0   (0)| 00:00:01 |"
"|* 17 |     TABLE ACCESS BY INDEX ROWID| CASEINFO |     1 |    95 |       |     0   (0)| 00:00:01 |"
"|  18 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|  19 |     NESTED LOOPS               |          |  1034 |   178K|       |    58  (97)| 00:00:01 |"
"|  20 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|* 21 |      INDEX RANGE SCAN          | I_PHONE1 |     1 |       |       |     0   (0)| 00:00:01 |"
"|* 22 |     TABLE ACCESS BY INDEX ROWID| CASEINFO |     1 |    95 |       |     0   (0)| 00:00:01 |"
"|  23 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|  24 |     NESTED LOOPS               |          |    55 |  9735 |       |    58  (97)| 00:00:01 |"
"|  25 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|* 26 |      INDEX RANGE SCAN          | I_PHONE4 |     1 |       |       |     0   (0)| 00:00:01 |"
"|* 27 |     TABLE ACCESS BY INDEX ROWID| CASEINFO |     1 |    95 |       |     0   (0)| 00:00:01 |"
"|  28 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|  29 |     NESTED LOOPS               |          |     3 |   531 |       |    58  (97)| 00:00:01 |"
"|  30 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|* 31 |      INDEX RANGE SCAN          | I_PHONE3 |     1 |       |       |     0   (0)| 00:00:01 |"
"|* 32 |     TABLE ACCESS BY INDEX ROWID| CASEINFO |     1 |    95 |       |     0   (0)| 00:00:01 |"
"|  33 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|  34 |     NESTED LOOPS               |          |     1 |   177 |       |    58  (97)| 00:00:01 |"
"|  35 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|* 36 |      INDEX RANGE SCAN          | I_PHONE2 |     1 |       |       |     0   (0)| 00:00:01 |"
"|* 37 |     TABLE ACCESS BY INDEX ROWID| CASEINFO |     1 |    95 |       |     0   (0)| 00:00:01 |"
"|  38 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|  39 |     NESTED LOOPS               |          |     1 |   177 |       |    58  (97)| 00:00:01 |"
"|  40 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|* 41 |      INDEX RANGE SCAN          | I_PHONE1 |     1 |       |       |     0   (0)| 00:00:01 |"
"|* 42 |     TABLE ACCESS BY INDEX ROWID| CASEINFO |     1 |    95 |       |     0   (0)| 00:00:01 |"
"|  43 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|  44 |     NESTED LOOPS               |          |     1 |   177 |       |    58  (97)| 00:00:01 |"
"|  45 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|* 46 |      INDEX RANGE SCAN          | I_PHONE4 |     1 |       |       |     0   (0)| 00:00:01 |"
"|* 47 |     TABLE ACCESS BY INDEX ROWID| CASEINFO |     1 |    95 |       |     0   (0)| 00:00:01 |"
"|  48 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|  49 |     NESTED LOOPS               |          |     1 |   177 |       |    58  (97)| 00:00:01 |"
"|  50 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|* 51 |      INDEX RANGE SCAN          | I_PHONE3 |     1 |       |       |     0   (0)| 00:00:01 |"
"|* 52 |     TABLE ACCESS BY INDEX ROWID| CASEINFO |     1 |    95 |       |     0   (0)| 00:00:01 |"
"|  53 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|  54 |     NESTED LOOPS               |          |     1 |   177 |       |    58  (97)| 00:00:01 |"
"|  55 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|* 56 |      INDEX RANGE SCAN          | I_PHONE2 |     1 |       |       |     0   (0)| 00:00:01 |"
"|* 57 |     TABLE ACCESS BY INDEX ROWID| CASEINFO |     1 |    95 |       |     0   (0)| 00:00:01 |"
"|  58 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|  59 |     NESTED LOOPS               |          |     1 |   177 |       |    58  (97)| 00:00:01 |"
"|  60 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|* 61 |      INDEX RANGE SCAN          | I_PHONE1 |     1 |       |       |     0   (0)| 00:00:01 |"
"|* 62 |     TABLE ACCESS BY INDEX ROWID| CASEINFO |     1 |    95 |       |     0   (0)| 00:00:01 |"
"|  63 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|  64 |     NESTED LOOPS               |          |     1 |   177 |       |    58  (97)| 00:00:01 |"
"|  65 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|* 66 |      INDEX RANGE SCAN          | I_PHONE4 |     1 |       |       |     0   (0)| 00:00:01 |"
"|* 67 |     TABLE ACCESS BY INDEX ROWID| CASEINFO |     1 |    95 |       |     0   (0)| 00:00:01 |"
"|  68 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|  69 |     NESTED LOOPS               |          |     1 |   177 |       |    58  (97)| 00:00:01 |"
"|  70 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|* 71 |      INDEX RANGE SCAN          | I_PHONE3 |     1 |       |       |     0   (0)| 00:00:01 |"
"|* 72 |     TABLE ACCESS BY INDEX ROWID| CASEINFO |     1 |    95 |       |     0   (0)| 00:00:01 |"
"|  73 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|  74 |     NESTED LOOPS               |          |     1 |   177 |       |    58  (97)| 00:00:01 |"
"|  75 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|* 76 |      INDEX RANGE SCAN          | I_PHONE2 |     1 |       |       |     0   (0)| 00:00:01 |"
"|* 77 |     TABLE ACCESS BY INDEX ROWID| CASEINFO |     1 |    95 |       |     0   (0)| 00:00:01 |"
"|  78 |    NESTED LOOPS                |          |       |       |       |            |          |"
"|  79 |     NESTED LOOPS               |          |     1 |   177 |       |    58  (97)| 00:00:01 |"
"|  80 |      TABLE ACCESS FULL         | CASEINFO |  2084K|   163M|       |    28  (93)| 00:00:01 |"
"|* 81 |      INDEX RANGE SCAN          | I_PHONE1 |     1 |       |       |     0   (0)| 00:00:01 |"
"|* 82 |     TABLE ACCESS BY INDEX ROWID| CASEINFO |     1 |    95 |       |     0   (0)| 00:00:01 |"
"---------------------------------------------------------------------------------------------------"
" "
"Predicate Information (identified by operation id):"
"---------------------------------------------------"
" "
"   6 - access(""A"".""PHONE4""=""B"".""PHONE4"")"
"   7 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"")"
"  11 - access(""A"".""PHONE4""=""B"".""PHONE3"")"
"  12 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"" AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE4""))"
"  16 - access(""A"".""PHONE4""=""B"".""PHONE2"")"
"  17 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"" AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE3"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE4""))"
"  21 - access(""A"".""PHONE4""=""B"".""PHONE1"")"
"  22 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"" AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE2"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE3"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE4""))"
"  26 - access(""A"".""PHONE3""=""B"".""PHONE4"")"
"       filter(LNNVL(""A"".""PHONE4""=""B"".""PHONE4""))"
"  27 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"" AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE1"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE2"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE3""))"
"  31 - access(""A"".""PHONE3""=""B"".""PHONE3"")"
"       filter(LNNVL(""A"".""PHONE4""=""B"".""PHONE3""))"
"  32 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"" AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE4"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE1"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE2"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE4""))"
"  36 - access(""A"".""PHONE3""=""B"".""PHONE2"")"
"       filter(LNNVL(""A"".""PHONE4""=""B"".""PHONE2""))"
"  37 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"" AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE3"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE4"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE1"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE3"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE4""))"
"  41 - access(""A"".""PHONE3""=""B"".""PHONE1"")"
"       filter(LNNVL(""A"".""PHONE4""=""B"".""PHONE1""))"
"  42 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"" AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE2"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE3"") AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE4"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE2"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE3"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE4""))"
"  46 - access(""A"".""PHONE2""=""B"".""PHONE4"")"
"       filter(LNNVL(""A"".""PHONE3""=""B"".""PHONE4"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE4""))"
"  47 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"" AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE1"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE2"") AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE3"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE1"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE2"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE3""))"
"  51 - access(""A"".""PHONE2""=""B"".""PHONE3"")"
"       filter(LNNVL(""A"".""PHONE3""=""B"".""PHONE3"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE3""))"
"  52 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"" AND "
"              LNNVL(""A"".""PHONE2""=""B"".""PHONE4"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE1"") AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE2"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE4"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE1"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE2"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE4""))"
"  56 - access(""A"".""PHONE2""=""B"".""PHONE2"")"
"       filter(LNNVL(""A"".""PHONE3""=""B"".""PHONE2"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE2""))"
"  57 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"" AND "
"              LNNVL(""A"".""PHONE2""=""B"".""PHONE3"") AND LNNVL(""A"".""PHONE2""=""B"".""PHONE4"") AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE1"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE3"") AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE4"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE1"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE3"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE4""))"
"  61 - access(""A"".""PHONE2""=""B"".""PHONE1"")"
"       filter(LNNVL(""A"".""PHONE3""=""B"".""PHONE1"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE1""))"
"  62 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"" AND "
"              LNNVL(""A"".""PHONE2""=""B"".""PHONE2"") AND LNNVL(""A"".""PHONE2""=""B"".""PHONE3"") AND "
"              LNNVL(""A"".""PHONE2""=""B"".""PHONE4"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE2"") AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE3"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE4"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE2"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE3"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE4""))"
"  66 - access(""A"".""PHONE1""=""B"".""PHONE4"")"
"       filter(LNNVL(""A"".""PHONE2""=""B"".""PHONE4"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE4"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE4""))"
"  67 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"" AND "
"              LNNVL(""A"".""PHONE2""=""B"".""PHONE1"") AND LNNVL(""A"".""PHONE2""=""B"".""PHONE2"") AND "
"              LNNVL(""A"".""PHONE2""=""B"".""PHONE3"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE1"") AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE2"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE3"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE1"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE2"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE3""))"
"  71 - access(""A"".""PHONE1""=""B"".""PHONE3"")"
"       filter(LNNVL(""A"".""PHONE2""=""B"".""PHONE3"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE3"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE3""))"
"  72 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"" AND "
"              LNNVL(""A"".""PHONE1""=""B"".""PHONE4"") AND LNNVL(""A"".""PHONE2""=""B"".""PHONE1"") AND "
"              LNNVL(""A"".""PHONE2""=""B"".""PHONE2"") AND LNNVL(""A"".""PHONE2""=""B"".""PHONE4"") AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE1"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE2"") AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE4"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE1"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE2"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE4""))"
"  76 - access(""A"".""PHONE1""=""B"".""PHONE2"")"
"       filter(LNNVL(""A"".""PHONE2""=""B"".""PHONE2"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE2"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE2""))"
"  77 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"" AND "
"              LNNVL(""A"".""PHONE1""=""B"".""PHONE3"") AND LNNVL(""A"".""PHONE1""=""B"".""PHONE4"") AND "
"              LNNVL(""A"".""PHONE2""=""B"".""PHONE1"") AND LNNVL(""A"".""PHONE2""=""B"".""PHONE3"") AND "
"              LNNVL(""A"".""PHONE2""=""B"".""PHONE4"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE1"") AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE3"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE4"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE1"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE3"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE4""))"
"  81 - access(""A"".""PHONE1""=""B"".""PHONE1"")"
"       filter(LNNVL(""A"".""PHONE2""=""B"".""PHONE1"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE1"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE1""))"
"  82 - filter(""A"".""SEX""=""B"".""SEX"" AND ""A"".""PERSON""<>""B"".""PERSON"" AND "
"              LNNVL(""A"".""PHONE1""=""B"".""PHONE2"") AND LNNVL(""A"".""PHONE1""=""B"".""PHONE3"") AND "
"              LNNVL(""A"".""PHONE1""=""B"".""PHONE4"") AND LNNVL(""A"".""PHONE2""=""B"".""PHONE2"") AND "
"              LNNVL(""A"".""PHONE2""=""B"".""PHONE3"") AND LNNVL(""A"".""PHONE2""=""B"".""PHONE4"") AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE2"") AND LNNVL(""A"".""PHONE3""=""B"".""PHONE3"") AND "
"              LNNVL(""A"".""PHONE3""=""B"".""PHONE4"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE2"") AND "
"              LNNVL(""A"".""PHONE4""=""B"".""PHONE3"") AND LNNVL(""A"".""PHONE4""=""B"".""PHONE4""))"
" "
"Note"
"-----"
"   - dynamic sampling used for this statement (level=2)"
Re: Perfomance problem with select distinct from cartesian join with OR-ed filter [message #522550 is a reply to message #522547] Mon, 12 September 2011 06:14 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
Are you sure you want to return 52,000,000,000 rows? Wink
Rereading you query, it means you want all pairs of persons of same sex that share a phone number.
I think you should gather the statistics on the tables and indexes.

Regards
Michel
Re: Perfomance problem with select distinct from cartesian join with OR-ed filter [message #522553 is a reply to message #522550] Mon, 12 September 2011 06:30 Go to previous messageGo to next message
malishich
Messages: 8
Registered: September 2011
Location: ALA
Junior Member
According my test data actually i will have in worst case (because of usage DBMS_RANDOM.RANDOM) 720 000 UNIQUE records.
On small data amount (10 000 records) it works great.
Even - i've made new same table from real and copied into it first 20 000 records - work great with DISTINCT.
But for whole table of 10 million records - it is awesome...

At now guys recommended me to split my query to
simple small queries and make UNION or results.
Code below:
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone1=b.phone1)
UNION 
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone1=b.phone2)
UNION 
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone1=b.phone3)
UNION
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone1=b.phone4)
UNION
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone2=b.phone1)
UNION
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone2=b.phone2)
UNION
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone2=b.phone3)
UNION
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone2=b.phone4)
UNION
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone3=b.phone1)
UNION
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone3=b.phone2)
UNION
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone3=b.phone3)
UNION
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone3=b.phone4)
UNION
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone4=b.phone1)
UNION
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone4=b.phone2)
UNION
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone4=b.phone3)
UNION
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone4=b.phone4)


Above query work MUCH better (execution was done just now for almost 175 seconds) and it returned 286 083 UNIQUE records.

But i need faster... sure... we didn't stop on half a way!...
Laughing
Re: Perfomance problem with select distinct from cartesian join with OR-ed filter [message #522555 is a reply to message #522553] Mon, 12 September 2011 06:34 Go to previous messageGo to next message
malishich
Messages: 8
Registered: September 2011
Location: ALA
Junior Member
How to make this simple query faster?
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone4=b.phone4)

It executes 17 seconds.
Is there any way to do it faster?
I believe the way exists...

Re: Perfomance problem with select distinct from cartesian join with OR-ed filter [message #522556 is a reply to message #522553] Mon, 12 September 2011 06:36 Go to previous messageGo to next message
cookiemonster
Messages: 13958
Registered: September 2008
Location: Rainy Manchester
Senior Member
This:
UNION 
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone1=b.phone2)
Will return exactly the same data as:
UNION
select b.caseid
from CASEINFO a, CASEINFO b 
where (a.person<>b.person) and (a.sex=b.sex) and (a.phone2=b.phone1)

So you've got more checks than you need.
However I'd change the data model to third normal form and have a seperate table for phone numbers. That'll simplify the sql further and most likely speed it up.
Re: Perfomance problem with select distinct from cartesian join with OR-ed filter [message #522557 is a reply to message #522556] Mon, 12 September 2011 06:47 Go to previous messageGo to next message
malishich
Messages: 8
Registered: September 2011
Location: ALA
Junior Member
I think (a.phone1=b.phone2) and (a.phone2=b.phone1) are NOT same, because selecting always b.caseid. But i need all CASEs.
So if i have this data (assume other checks done):
CASEID, PHONE1, PHONE2
1,         1_1,    1_2
2,         FOO,    2_2
3,         3_1,    3_2
4,         4_1,    FOO
5,         5_1,    5_2


I should get both CASEID: 2 and 4
Re: Perfomance problem with select distinct from cartesian join with OR-ed filter [message #522565 is a reply to message #522557] Mon, 12 September 2011 07:09 Go to previous messageGo to next message
cookiemonster
Messages: 13958
Registered: September 2008
Location: Rainy Manchester
Senior Member
Yup. I'm thinking about it as though it's 3NF which it isn't. Any reason why you can't make it 3NF?
Re: Perfomance problem with select distinct from cartesian join with OR-ed filter [message #522616 is a reply to message #522565] Mon, 12 September 2011 16:42 Go to previous messageGo to next message
malishich
Messages: 8
Registered: September 2011
Location: ALA
Junior Member
I'd made 3NF structure.
But how to get duplicate phones for different people?

------------------------------------------
-- Create CASE 3NF test structure
------------------------------------------
DROP TABLE CASEPERSON;
DROP TABLE CASEDATA;
DROP TABLE CASEPHONE;
CREATE TABLE CASEPERSON
(
	PERSON INTEGER,
	SEX INTEGER
);
ALTER TABLE CASEPERSON ADD PRIMARY KEY (PERSON) USING INDEX;
CREATE INDEX I_CASEPERSON_SEX ON CASEPERSON (SEX);
/
CREATE TABLE CASEDATA
(
	CASEID INTEGER,
	PERSON INTEGER,
	SEX INTEGER
);
CREATE INDEX I_CASEDATA_CASEID ON CASEDATA (CASEID);
CREATE INDEX I_CASEDATA_PERSON ON CASEDATA (PERSON);
CREATE INDEX I_CASEDATA_SEX ON CASEDATA (SEX);
/
CREATE TABLE CASEPHONE
(
	CASEID INTEGER,
	PHONE VARCHAR2(24)
);
CREATE INDEX I_CASEPHONE_CASEID ON CASEPHONE (CASEID);
CREATE INDEX I_CASEPHONE_PHONE ON CASEPHONE (PHONE);


Re: Perfomance problem with select distinct from cartesian join with OR-ed filter [message #522706 is a reply to message #522616] Tue, 13 September 2011 04:36 Go to previous messageGo to next message
cookiemonster
Messages: 13958
Registered: September 2008
Location: Rainy Manchester
Senior Member
SELECT cd1.caseid
FROM casedata cd1, casedata cd2, casephone cp1, casephone cp2
WHERE cd1.caseid = cp1.caseid
AND cd2.caseid = cp2.caseid
AND cd1.person != c2.person
AND cp1.phone = cp2.phone

Re: Perfomance problem with select distinct from cartesian join with OR-ed filter [message #522722 is a reply to message #522706] Tue, 13 September 2011 05:52 Go to previous messageGo to next message
malishich
Messages: 8
Registered: September 2011
Location: ALA
Junior Member
I had checked this your query on 2 million records (casedata 2 million, casephone 8 millions, because of 4 phones for each person). Results correct, but...
Your query takes 10 minutes to execute on 11.2.0.1

The fastest solution I found at now is,
at first, make single special data table.
In this table all fields NON UNIQUE (but indexed).
CREATE TABLE CASELIST
(
	CASEID INTEGER,
	PERSON INTEGER,
	SEX INTEGER,
	PHONE VARCHAR2(24)
);
CREATE INDEX I_CASELIST_CASEID ON CASELIST (CASEID);
CREATE INDEX I_CASELIST_PERSON ON CASELIST (PERSON);
CREATE INDEX I_CASELIST_SEX ON CASELIST (SEX);
CREATE INDEX I_CASELIST_PHONE ON CASELIST (PHONE);


And the select (which executes 76 seconds):
select c.CASEID
from CASELIST c, 
(
  select b.PHONE
    rank() over 
      (partition by a.PHONE order by a.PERSON) RNK
  from CASELIST a,  
  (
    select PHONE
    from CASELIST
    having count(*)>1
    group by PHONE
  ) b
  where (a.PHONE = b.PHONE)
) d
where (d.RNK=2) and (d.PHONE=c.PHONE)


That the situation with cartesian joins in all SQL engines.
Actually i've checked up this on Oracle, SQL Server, PostgresSQL,MySQL,Firebird.
Only oracle can give me such speed.
Others cant (or even cant execute due to SQL lang limitations).

Searching more... Need 30 seconds... at
Re: Perfomance problem with select distinct from cartesian join with OR-ed filter [message #522754 is a reply to message #522722] Tue, 13 September 2011 08:29 Go to previous message
cookiemonster
Messages: 13958
Registered: September 2008
Location: Rainy Manchester
Senior Member
Turning this bit into a materialized view would probably make a lot of difference:
select PHONE
    from CASELIST
    having count(*)>1
    group by PHONE
Previous Topic: pl/sql
Next Topic: "NOT IN" REPLACE WITH "MINUS"
Goto Forum:
  


Current Time: Mon Nov 25 04:09:25 CST 2024