Wrong query result with match_recognize query and parallel degree equal to number of hash partition

From: Thomas Ranchon <thomas_at_ranchon.org>
Date: Fri, 13 Aug 2021 19:09:53 +0200
Message-ID: <CAJNN_2QtTU36VOGTvxFBRwqUa1o=+XA42VSYLs_LtpW04Kob-g_at_mail.gmail.com>



Hi,

I have a match_recognize query that I use to merge periods that have the same attributes values.
Tables are hash partitioned with 32 partitions. If parallel degree <= number of partitions / 2 periods are correctly merged. If parallel degree > number of partitions / 2 periods are not merged.

I have this problem on Oracle EE 19.8 (dev database) and Oracle EE 18.6 (prod database)
dev database will be patched to 19.12 shortly, I'll make some more tests once it's done.

Here is a simplified test case which give me this problem :

CREATE
 TABLE tmp_periode(

       id_1      NUMBER(8)
     , id_2      NUMBER(8)
     , date_deb  DATE
     , date_fin  DATE)

  PARTITION BY HASH(id_1)
  PARTITIONS 32; CREATE
 TABLE tmp_data(
       id_1      NUMBER(8)
     , id_2      NUMBER(8)
     , date_deb  DATE
     , date_fin  DATE
     , mt_taxe   NUMBER(11, 4)
     , pr        NUMBER(11, 4))

  PARTITION BY HASH(id_1)
  PARTITIONS 32; INSERT INTO tmp_periode VALUES(1, 2, TRUNC(SYSDATE) - 100, TRUNC(SYSDATE)); INSERT INTO tmp_periode VALUES(1, 2, TRUNC(SYSDATE) + 1, TRUNC(SYSDATE) + 100);

INSERT INTO tmp_data VALUES(1, 2, TRUNC(SYSDATE) - 100, TRUNC(SYSDATE), 0.0018, 0.8318);
INSERT INTO tmp_data VALUES(1, 2, TRUNC(SYSDATE) + 1, TRUNC(SYSDATE) + 100, 0.0018, 0.8318);

COMMIT; SELECT /*+ PARALLEL(16) */
       *

  FROM (SELECT a.id_1

, a.date_deb
, a.date_fin
, MAX(b.mt_taxe)
KEEP(DENSE_RANK FIRST ORDER BY b.pr DESC, a.id_2 ASC) AS mt_taxe FROM tmp_periode a
, tmp_data b
WHERE b.id_1 = a.id_1 AND b.id_2 = a.id_2 AND b.date_deb <= a.date_deb AND b.date_fin >= a.date_deb GROUP BY a.id_1
, a.date_deb
, a.date_fin)
MATCH_RECOGNIZE( PARTITION BY id_1 ORDER BY date_deb ASC MEASURES MIN(date_deb) AS date_deb
, MAX(date_fin) AS date_fin
, mt_taxe AS mt_taxe
ONE ROW PER MATCH PATTERN(strt ind_same*) DEFINE ind_same AS ( date_deb - 1 = PREV(date_fin) AND mt_taxe = PREV(mt_taxe)));

=> With PARALLEL(16) the query returns 1 line which is the expected result, with PARALLEL(32) the query returns 2 lines. => Without the KEEP(DENSE_RANK FIRST ORDER BY b.pr DESC, a.id_2 ASC) the result is correct even using PARALLEL(32) I have seen "Bug 31393600 - Dense_Rank Query Returns Incomplete Results (Doc ID 31393600.8)" but it should be fixed in 19.8

WITH
  cte_data AS (
    SELECT /*+ MATERIALIZE */

           a.id_1
         , a.date_deb
         , a.date_fin
         , MAX(b.mt_taxe)
             KEEP(DENSE_RANK FIRST ORDER BY b.pr DESC, a.id_2 ASC) AS
mt_taxe
      FROM tmp_periode a
         , tmp_data b
     WHERE b.id_1 = a.id_1
       AND b.id_2 = a.id_2
       AND b.date_deb <= a.date_deb
       AND b.date_fin >= a.date_deb
     GROUP
        BY a.id_1
         , a.date_deb
         , a.date_fin)
SELECT /*+ PARALLEL(32) */
       *

  FROM cte_data
    MATCH_RECOGNIZE(
      PARTITION BY id_1
      ORDER BY date_deb ASC
      MEASURES MIN(date_deb) AS date_deb

, MAX(date_fin) AS date_fin
, mt_taxe AS mt_taxe
ONE ROW PER MATCH PATTERN(strt ind_same*) DEFINE ind_same AS ( date_deb - 1 = PREV(date_fin) AND mt_taxe = PREV(mt_taxe)));

=> When using a CTE for the subquery the result is good no matter the parallel degree used.

Obviously plans are not the same between CTE and SUBQUERY :

CTE plan :


| Id  | Operation                                           | Name
              | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | Pstart| Pstop |
TQ |IN-OUT| PQ Distrib | OMem | 1Mem | Used-Mem |
|   0 | SELECT STATEMENT                                    |
               |        |       |    13 (100)|          |       |       |
      |      |            |       |       |          |
|   1 |  TEMP TABLE TRANSFORMATION                          |
               |        |       |            |          |       |       |
      |      |            |       |       |          |
|   2 |   PX COORDINATOR                                    |
               |        |       |            |          |       |       |
      |      |            | 73728 | 73728 |          |
|   3 |    PX SEND QC (RANDOM)                              | :TQ10001
              |      1 |   114 |    10   (0)| 00:00:01 |       |       |
Q1,01 | P->S | QC (RAND)  |       |       |          |
|   4 |     LOAD AS SELECT (TEMP SEGMENT MERGE)             |
SYS_TEMP_0FD9EC3D5_B0240DB9 |        |       |            |          |
 |       |  Q1,01 | PCWP |            |  1043K|  1043K| 1043K (0)|
|   5 |      SORT GROUP BY                                  |
               |      1 |   114 |    10   (0)| 00:00:01 |       |       |
Q1,01 | PCWP |            | 36864 | 36864 |          |
|   6 |       PX RECEIVE                                    |
               |      1 |   114 |    10   (0)| 00:00:01 |       |       |
Q1,01 | PCWP |            |       |       |          |
|   7 |        PX SEND HASH                                 | :TQ10000
              |      1 |   114 |    10   (0)| 00:00:01 |       |       |
Q1,00 | P->P | HASH       |       |       |          |
|   8 |         SORT GROUP BY                               |
               |      1 |   114 |    10   (0)| 00:00:01 |       |       |
Q1,00 | PCWP |            |  2048 |  2048 | 2048  (0)|
|   9 |          NESTED LOOPS                               |
               |      1 |   114 |    10   (0)| 00:00:01 |       |       |
Q1,00 | PCWP |            |       |       |          |
|  10 |           PARTITION HASH ALL                        |
               |      1 |    44 |    10   (0)| 00:00:01 |     1 |    32 |
Q1,00 | PCWC |            |       |       |          |
|  11 |            TABLE ACCESS FULL                        | TMP_PERIODE
               |      1 |    44 |    10   (0)| 00:00:01 |     1 |    32 |
Q1,00 | PCWP |            |       |       |          |
|  12 |           PX BLOCK ITERATOR                         |
               |      1 |    70 |     9   (0)| 00:00:01 |   KEY |   KEY |
Q1,00 | PCWC |            |       |       |          |
|* 13 |            TABLE ACCESS FULL                        | TMP_DATA
              |      1 |    70 |     9   (0)| 00:00:01 |   KEY |   KEY |
Q1,00 | PCWP |            |       |       |          |
|  14 |   PX COORDINATOR                                    |
               |        |       |            |          |       |       |
      |      |            | 73728 | 73728 |          |
|  15 |    PX SEND QC (RANDOM)                              | :TQ20001
              |      1 |    44 |     3  (34)| 00:00:01 |       |       |
Q2,01 | P->S | QC (RAND)  |       |       |          |
|  16 |     BUFFER SORT                                     |
               |      1 |    44 |            |          |       |       |
Q2,01 | PCWP |            | 36864 | 36864 |          |
|  17 |      VIEW                                           |
               |      1 |    44 |     3  (34)| 00:00:01 |       |       |
Q2,01 | PCWP |            |       |       |          |
|  18 |       MATCH RECOGNIZE SORT DETERMINISTIC FINITE AUTO|
               |      1 |    44 |     3  (34)| 00:00:01 |       |       |
Q2,01 | PCWP |            | 36864 | 36864 |          |
|  19 |        PX RECEIVE                                   |
               |      1 |    44 |     2   (0)| 00:00:01 |       |       |
Q2,01 | PCWP |            |       |       |          |
|  20 |         PX SEND HASH                                | :TQ20000
              |      1 |    44 |     2   (0)| 00:00:01 |       |       |
Q2,00 | P->P | HASH       |       |       |          |
|  21 |          VIEW                                       |
               |      1 |    44 |     2   (0)| 00:00:01 |       |       |
Q2,00 | PCWP |            |       |       |          |
|  22 |           PX BLOCK ITERATOR                         |
               |      1 |    44 |     2   (0)| 00:00:01 |       |       |
Q2,00 | PCWC |            |       |       |          |
|* 23 |            TABLE ACCESS FULL                        |
SYS_TEMP_0FD9EC3D5_B0240DB9 |      1 |    44 |     2   (0)| 00:00:01 |
 |       |  Q2,00 | PCWP |            |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Subquery plan :


| Id  | Operation                                          | Name        |
E-Rows |E-Bytes| Cost (%CPU)| E-Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib | OMem | 1Mem | Used-Mem |
|   0 | SELECT STATEMENT                                   |             |
      |       |    10 (100)|          |       |       |        |      |
        |       |       |          |
|   1 |  PX COORDINATOR                                    |             |
      |       |            |          |       |       |        |      |
        | 73728 | 73728 |          |
|   2 |   PX SEND QC (RANDOM)                              | :TQ10002    |
    1 |    44 |    10   (0)| 00:00:01 |       |       |  Q1,02 | P->S | QC
(RAND)  |       |       |          |
|   3 |    BUFFER SORT                                     |             |
    1 |    44 |            |          |       |       |  Q1,02 | PCWP |
        | 36864 | 36864 |          |
|   4 |     VIEW                                           |             |
    1 |    44 |    10   (0)| 00:00:01 |       |       |  Q1,02 | PCWP |
        |       |       |          |
|   5 |      MATCH RECOGNIZE BUFFER DETERMINISTIC FINITE AU|             |
    1 |    44 |    10   (0)| 00:00:01 |       |       |  Q1,02 | PCWP |
        | 36864 | 36864 |          |
|   6 |       PX RECEIVE                                   |             |
    1 |    44 |    10   (0)| 00:00:01 |       |       |  Q1,02 | PCWP |
        |       |       |          |
|   7 |        PX SEND HASH                                | :TQ10001    |
    1 |    44 |    10   (0)| 00:00:01 |       |       |  Q1,01 | P->P |
HASH       |       |       |          |
|   8 |         VIEW                                       |             |
    1 |    44 |    10   (0)| 00:00:01 |       |       |  Q1,01 | PCWP |
        |       |       |          |
|   9 |          SORT GROUP BY                             |             |
    1 |   114 |    10   (0)| 00:00:01 |       |       |  Q1,01 | PCWP |
        |  2048 |  2048 | 2048  (0)|
|  10 |           PX RECEIVE                               |             |
    1 |   114 |    10   (0)| 00:00:01 |       |       |  Q1,01 | PCWP |
        |       |       |          |
|  11 |            PX SEND HASH                            | :TQ10000    |
    1 |   114 |    10   (0)| 00:00:01 |       |       |  Q1,00 | P->P |
HASH       |       |       |          |
|  12 |             SORT GROUP BY                          |             |
    1 |   114 |    10   (0)| 00:00:01 |       |       |  Q1,00 | PCWP |
        |  2048 |  2048 | 2048  (0)|
|  13 |              NESTED LOOPS                          |             |
    1 |   114 |    10   (0)| 00:00:01 |       |       |  Q1,00 | PCWP |
        |       |       |          |
|  14 |               PARTITION HASH ALL                   |             |
    1 |    44 |    10   (0)| 00:00:01 |     1 |    32 |  Q1,00 | PCWC |
        |       |       |          |
|  15 |                TABLE ACCESS FULL                   | TMP_PERIODE |
    1 |    44 |    10   (0)| 00:00:01 |     1 |    32 |  Q1,00 | PCWP |
        |       |       |          |
|  16 |               PX BLOCK ITERATOR                    |             |
    1 |    70 |     9   (0)| 00:00:01 |   KEY |   KEY |  Q1,00 | PCWC |
        |       |       |          |
|* 17 |                TABLE ACCESS FULL                   | TMP_DATA    |
    1 |    70 |     9   (0)| 00:00:01 |   KEY |   KEY |  Q1,00 | PCWP |
        |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------------------

I don't know if it"s the culprit but with subquery I have a "MATCH RECOGNIZE BUFFER DETERMINISTIC FINITE AU" and with CTE I have a "MATCH RECOGNIZE SORT DETERMINISTIC FINITE AUTO" I have also found a bug related to match_recognize, fixed in 19.11 "Using descending order within the row_pattern_order_by syntax of a match_recognize query gives wrong final order", but I'm not using descending order in my match_recognize order by.

Is this a known problem?

Regards,
Thomas Ranchon

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Aug 13 2021 - 19:09:53 CEST

Original text of this message