RE: Exadata Tuning Question+

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Sun, 9 Nov 2014 17:57:40 +0000
Message-ID: <CE70217733273F49A8A162EE074F64D901E5CD88_at_exmbx06.thus.corp>



Iggy,

I think you've lost a "group by" on your example (and a LEFT JOIN to SERVICE_LOCATION).

Given the size of the sort group by that would appear (7M * 125) rows if we trust the 25 rows per sid/relationship_level, that could be a very expensive operation - and may be why the alternative in-line query approach was taken. However, if most of the rows from service_relationship are actually involved in the resultset, then it might be more efficient to join (sl, slv) to an aggregate inline view holding the result (select listagg() sr1, listagg() sr2.... from service_relationship).

An argument in favour of the join (whether or not it does the listagg early or late) is that the inline select approach seems to execute the inline subqueries at the query coordinator (i.e. a single process doing all of them, rather than the parallel slaves sharing them); the join would simply be another layer (or two) in the parallel execution.

I'd be interested to hear if anyone has any idea why the optimizer has done a "hash join buffered" at line 28 of the plan. From what we've heard so far there doesn't seem to be any need for the buffering.

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle



From: oracle-l-bounce_at_freelists.org [oracle-l-bounce_at_freelists.org] on behalf of Iggy Fernandez [iggy_fernandez_at_hotmail.com] Sent: 09 November 2014 05:07
To: Abdul.Ebadi_at_Level3.com; ORACLE-L
Subject: RE: Exadata Tuning Question+

Abdul,

One last comment. You could check if sort-merge is effective for the join between SERVICE_LOOKUP and SERVICE_RELATIONSHIP, followed by a hash join to SERVICE_RELATIONSHIP. The following hints will force such a strategy

SELECT /*+ LEADING (slv sl sr) USE_MERGE (slv sl) USE_HASH(sr) SWAP_JOIN_INPUTS(sr) */   slv.sid,
  slv.service_location_id,

  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", DECODE(sr.relationship_level, '1', sr.RELATED_SID || ',', NULL)) ORDER BY sr.RELATED_SID), '//text()' )) , ',' ) AS relatedserviceinstanceidlevel1,
  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", DECODE(sr.relationship_level, '2', sr.RELATED_SID || ',', NULL)) ORDER BY sr.RELATED_SID), '//text()' )) , ',' ) AS relatedserviceinstanceidlevel2,
  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", DECODE(sr.relationship_level, '3', sr.RELATED_SID || ',', NULL)) ORDER BY sr.RELATED_SID), '//text()' )) , ',' ) AS relatedserviceinstanceidlevel3,
  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", DECODE(sr.relationship_level, '4', sr.RELATED_SID || ',', NULL)) ORDER BY sr.RELATED_SID), '//text()' )) , ',' ) AS relatedserviceinstanceidlevel4,
  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", DECODE(sr.relationship_level, '5', sr.RELATED_SID || ',', NULL)) ORDER BY sr.RELATED_SID), '//text()' )) , ',' ) AS relatedserviceinstanceidlevel5
FROM
  service_lookup slv
  INNER JOIN service_location sl ON sl.service_location_id = slv.service_location_id   LEFT OUTER JOIN service_relationship sr ON sr.sid = slv.sid AND sr.relationship_level IN ('1', '2', '3', '4', '5');

Or

SELECT /*+ LEADING (slv sl sr) USE_MERGE (slv sl) USE_HASH(sr) SWAP_JOIN_INPUTS(sr) */   slv.sid,
  slv.service_location_id,

  LISTAGG(DECODE(sr.relationship_level, '1', sr.related_sid, NULL) ',') WITHIN GROUP (ORDER BY sr.related_sid) AS relatedserviceinstanceidlevel1,
  LISTAGG(DECODE(sr.relationship_level, '2', sr.related_sid, NULL) ',') WITHIN GROUP (ORDER BY sr.related_sid) AS relatedserviceinstanceidlevel2,
  LISTAGG(DECODE(sr.relationship_level, '3', sr.related_sid, NULL) ',') WITHIN GROUP (ORDER BY sr.related_sid) AS relatedserviceinstanceidlevel3,
  LISTAGG(DECODE(sr.relationship_level, '4', sr.related_sid, NULL) ',') WITHIN GROUP (ORDER BY sr.related_sid) AS relatedserviceinstanceidlevel4,
  LISTAGG(DECODE(sr.relationship_level, '5', sr.related_sid, NULL) ',') WITHIN GROUP (ORDER BY sr.related_sid) AS relatedserviceinstanceidlevel5
FROM
  service_lookup slv
  INNER JOIN service_location sl ON sl.service_location_id = slv.service_location_id   LEFT OUTER JOIN service_relationship sr ON sr.sid = slv.sid AND sr.relationship_level IN ('1', '2', '3', '4', '5');

From: iggy_fernandez_at_hotmail.com
To: abdul.ebadi_at_level3.com; oracle-l_at_freelists.org Subject: RE: Exadata Tuning Question+
Date: Sat, 8 Nov 2014 20:31:15 -0800

Abdul,

Thank you for the very interesting problem. I'm curious why you need to display the rows on your laptop screen but the main reason why your query takes hours is Oracle is being forced to perform 35 million full table scans of the large SERVICE_RELATIONSHIP table. The 35 million table scans can be avoided if the query is written as follows:

SELECT
  slv.sid,
  slv.service_location_id,

  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", DECODE(sr.relationship_level, '1', sr.RELATED_SID || ',', NULL)) ORDER BY sr.RELATED_SID), '//text()' )) , ',' ) AS relatedserviceinstanceidlevel1,
  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", DECODE(sr.relationship_level, '2', sr.RELATED_SID || ',', NULL)) ORDER BY sr.RELATED_SID), '//text()' )) , ',' ) AS relatedserviceinstanceidlevel1,
  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", DECODE(sr.relationship_level, '3', sr.RELATED_SID || ',', NULL)) ORDER BY sr.RELATED_SID), '//text()' )) , ',' ) AS relatedserviceinstanceidlevel1,
  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", DECODE(sr.relationship_level, '4', sr.RELATED_SID || ',', NULL)) ORDER BY sr.RELATED_SID), '//text()' )) , ',' ) AS relatedserviceinstanceidlevel1,
  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", DECODE(sr.relationship_level, '5', sr.RELATED_SID || ',', NULL)) ORDER BY sr.RELATED_SID), '//text()' )) , ',' ) AS relatedserviceinstanceidlevel1
FROM
  service_lookup slv
  INNER JOIN service_location sl ON sl.service_location_id = slv.service_location_id   LEFT OUTER JOIN service_relationship sr ON sr.sid = slv.sid AND sr.relationship_level IN ('1', '2', '3', '4', '5');

Or, more simply:

SELECT
  slv.sid,
  slv.service_location_id,

  LISTAGG(DECODE(sr.relationship_level, '1', sr.related_sid, NULL) ',') WITHIN GROUP (ORDER BY sr.related_sid) AS relatedserviceinstanceidlevel1,
  LISTAGG(DECODE(sr.relationship_level, '2', sr.related_sid, NULL) ',') WITHIN GROUP (ORDER BY sr.related_sid) AS relatedserviceinstanceidlevel2,
  LISTAGG(DECODE(sr.relationship_level, '3', sr.related_sid, NULL) ',') WITHIN GROUP (ORDER BY sr.related_sid) AS relatedserviceinstanceidlevel3,
  LISTAGG(DECODE(sr.relationship_level, '4', sr.related_sid, NULL) ',') WITHIN GROUP (ORDER BY sr.related_sid) AS relatedserviceinstanceidlevel4,
  LISTAGG(DECODE(sr.relationship_level, '5', sr.related_sid, NULL) ',') WITHIN GROUP (ORDER BY sr.related_sid) AS relatedserviceinstanceidlevel5
FROM
  service_lookup slv
  INNER JOIN service_location sl ON sl.service_location_id = slv.service_location_id   LEFT OUTER JOIN service_relationship sr ON sr.sid = slv.sid AND sr.relationship_level IN ('1', '2', '3', '4', '5');

Would you give us the query plan and the execution time for the rewritten query and let us know whether or not you see pauses when displaying the results of the rewritten query.

Kindest regards,
Iggy

P.S. Note that XMLAGG (and XMLERIALIZE) return a CLOB while LISTAGG returns VARCHAR2 which is of consequence if the comma-delimited string may have more data than can be accommodated by VARCHAR2.



From: Abdul.Ebadi_at_Level3.com
To: tim_at_evdbt.com; oracle-l_at_freelists.org Subject: RE: Exadata Tuning Question+
Date: Fri, 7 Nov 2014 22:02:51 +0000

Tim/others… thanks for your replies! Below is a slightly generalized version of the query to give you an idea of the joins and an actual explain plan:

SELECT bunch of stuff.....,

        (

                SELECT  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", sr1.RELATED_SID

                        ||

                        ',')

                ORDER BY sr1.RELATED_SID), '//text()' ) ) , ',' )

                FROM    service_relationship sr1

                WHERE   sr1.SID                    = slv.SID

                        AND sr1.RELATIONSHIP_LEVEL = '1'

                GROUP BY sr1.SID

        ) AS RELATEDSERVICEINSTANCEIDLEVEL1,

        (

                SELECT  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", sr2.RELATED_SID

                        ||

                        ',')

                ORDER BY sr2.RELATED_SID), '//text()' ) ) , ',' )

                FROM    service_relationship sr2

                WHERE   sr2.SID                    = slv.SID

                        AND sr2.RELATIONSHIP_LEVEL = '2'

                GROUP BY sr2.SID

        ) AS RELATEDSERVICEINSTANCEIDLEVEL2,

        (

               SELECT  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", sr3.RELATED_SID

                        ||

                        ',')

                ORDER BY sr3.RELATED_SID), '//text()' ) ) , ',' )

                FROM    service_relationship sr3

                WHERE   sr3.SID                    = slv.SID

                        AND sr3.RELATIONSHIP_LEVEL = '3'

                GROUP BY sr3.SID

        ) AS RELATEDSERVICEINSTANCEIDLEVEL3,

        (

                SELECT  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", sr4.RELATED_SID

                        ||

                        ',')

                ORDER BY sr4.RELATED_SID), '//text()' ) ) , ',' )

                FROM    service_relationship sr4

                WHERE   sr4.SID                    = slv.SID

                        AND sr4.RELATIONSHIP_LEVEL = '4'

                GROUP BY sr4.SID

        ) AS RELATEDSERVICEINSTANCEIDLEVEL4,

        (

                SELECT  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", sr5.RELATED_SID

                        ||

                        ',')

                ORDER BY sr5.RELATED_SID), '//text()' ) ) , ',' )

                FROM    service_relationship sr5

                WHERE   sr5.SID                    = slv.SID

                        AND sr5.RELATIONSHIP_LEVEL = '5'

                GROUP BY sr5.SID

        ) AS RELATEDSERVICEINSTANCEIDLEVEL5

FROM service_lookup slv

        LEFT JOIN service_location sl

        ON sl.service_location_id = slv.service_location_id;

PLAN_TABLE_OUTPUT


Plan hash value: 1570133209


| Id  | Operation                        | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |


--------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7331K| 5593M| 1877 (5)| 00:00:01 | | | | | 1 | SORT GROUP BY | | 1 | 22 | 368 (6)| 00:00:01 | | | | | 2 | PX COORDINATOR | | | | | | | | | | 3 | PX SEND QC (RANDOM) | :TQ10000 | 25 | 550 | 368 (6)| 00:00:01 | Q1,00 | P->S | QC (RAND) | | 4 | PX BLOCK ITERATOR | | 25 | 550 | 368 (6)| 00:00:01 | Q1,00 | PCWC | | |* 5 | TABLE ACCESS STORAGE FULL | SERVICE_RELATIONSHIP | 25 | 550 | 368 (6)| 00:00:01 | Q1,00 | PCWP | |

PLAN_TABLE_OUTPUT


|   6 |  SORT GROUP BY                   |                      |     1 |    22 |   368   (6)| 00:00:01 |        |      |            |

|   7 |   PX COORDINATOR                 |                      |       |       |            |          |        |      |            |

|   8 |    PX SEND QC (RANDOM)           | :TQ20000             |    25 |   550 |   368   (6)| 00:00:01 |  Q2,00 | P->S | QC (RAND)  |

|   9 |     PX BLOCK ITERATOR            |                      |    25 |   550 |   368   (6)| 00:00:01 |  Q2,00 | PCWC |            |

|* 10 |      TABLE ACCESS STORAGE FULL   | SERVICE_RELATIONSHIP |    25 |   550 |   368   (6)| 00:00:01 |  Q2,00 | PCWP |            |

|  11 |  SORT GROUP BY                   |                      |     1 |    22 |   368   (6)| 00:00:01 |        |      |            |

|  12 |   PX COORDINATOR                 |                      |       |       |            |          |        |      |            |

|  13 |    PX SEND QC (RANDOM)           | :TQ30000             |    25 |   550 |   368   (6)| 00:00:01 |  Q3,00 | P->S | QC (RAND)  |

|  14 |     PX BLOCK ITERATOR            |                      |    25 |   550 |   368   (6)| 00:00:01 |  Q3,00 | PCWC |            |

|* 15 |      TABLE ACCESS STORAGE FULL   | SERVICE_RELATIONSHIP |    25 |   550 |   368   (6)| 00:00:01 |  Q3,00 | PCWP |            |

|  16 |  SORT GROUP BY                   |                      |     1 |    22 |   368   (6)| 00:00:01 |        |      |            |



PLAN_TABLE_OUTPUT


|  17 |   PX COORDINATOR                 |                      |       |       |            |          |        |      |            |

|  18 |    PX SEND QC (RANDOM)           | :TQ40000             |    25 |   550 |   368   (6)| 00:00:01 |  Q4,00 | P->S | QC (RAND)  |

|  19 |     PX BLOCK ITERATOR            |                      |    25 |   550 |   368   (6)| 00:00:01 |  Q4,00 | PCWC |            |

|* 20 |      TABLE ACCESS STORAGE FULL   | SERVICE_RELATIONSHIP |    25 |   550 |   368   (6)| 00:00:01 |  Q4,00 | PCWP |            |

|  21 |  SORT GROUP BY                   |                      |     1 |    22 |   368   (6)| 00:00:01 |        |      |            |

|  22 |   PX COORDINATOR                 |                      |       |       |            |          |        |      |            |

|  23 |    PX SEND QC (RANDOM)           | :TQ50000             |    25 |   550 |   368   (6)| 00:00:01 |  Q5,00 | P->S | QC (RAND)  |

|  24 |     PX BLOCK ITERATOR            |                      |    25 |   550 |   368   (6)| 00:00:01 |  Q5,00 | PCWC |            |

|* 25 |      TABLE ACCESS STORAGE FULL   | SERVICE_RELATIONSHIP |    25 |   550 |   368   (6)| 00:00:01 |  Q5,00 | PCWP |            |

|  26 |  PX COORDINATOR                  |                      |       |       |            |          |        |      |            |

|  27 |   PX SEND QC (RANDOM)            | :TQ60002             |  7331K|  5593M|  1877   (5)| 00:00:01 |  Q6,02 | P->S | QC (RAND)  |



PLAN_TABLE_OUTPUT


|* 28 |    HASH JOIN RIGHT OUTER BUFFERED|                      |  7331K|  5593M|  1877   (5)| 00:00:01 |  Q6,02 | PCWP |            |

|  29 |     PX RECEIVE                   |                      |  3175K|   920M|   366   (3)| 00:00:01 |  Q6,02 | PCWP |            |

|  30 |      PX SEND HASH                | :TQ60000             |  3175K|   920M|   366   (3)| 00:00:01 |  Q6,00 | P->P | HASH       |

|  31 |       PX BLOCK ITERATOR          |                      |  3175K|   920M|   366   (3)| 00:00:01 |  Q6,00 | PCWC |            |

|  32 |        TABLE ACCESS STORAGE FULL | SERVICE_LOCATION     |  3175K|   920M|   366   (3)| 00:00:01 |  Q6,00 | PCWP |            |

|  33 |     PX RECEIVE                   |                      |  7331K|  3467M|  1507   (5)| 00:00:01 |  Q6,02 | PCWP |            |

|  34 |      PX SEND HASH                | :TQ60001             |  7331K|  3467M|  1507   (5)| 00:00:01 |  Q6,01 | P->P | HASH       |

|  35 |       PX BLOCK ITERATOR          |                      |  7331K|  3467M|  1507   (5)| 00:00:01 |  Q6,01 | PCWC |            |

|  36 |        TABLE ACCESS STORAGE FULL | SERVICE_LOOKUP       |  7331K|  3467M|  1507   (5)| 00:00:01 |  Q6,01 | PCWP |            |


--------------------------------------------------------------------------------------------------------------------------------------

From: Tim Gorman [mailto:tim_at_evdbt.com] Sent: Friday, November 07, 2014 2:36 PM
To: ORACLE-L; Ebadi, Abdul
Subject: Re: Exadata Tuning Question+

Abdul,

Although it might be a "best practice" to remove indexes to encourage full scans in parallel, its important to remember that in some circumstances this might cause a lot of SORT-MERGE or HASH joins, especially if the query involves joins to lots of big tables. Those SM or HA joins will spend a lot of time reading and writing to the temporary tablespace, and that I/O on temporary tablespaces do not have any special off-loading or smart-scan optimizations. Exadata only has off-loading and smart-scan optimizations for I/O operations taking place in datafiles.

For tables this size, you can't increase the PGA size enough to cache the entire SM or HA join in memory, so that's probably why you haven't seen any benefit.

Chances are good that SQL Plan Monitor or DBMS_XPLAN.DISPLAY_CURSOR(option=>'ALLSTATS ALL') would show you that the time spent scanning the row-sources is negligible, but the time spent on joins is taking the majority of elapsed time.

How many and what type of join operations? Are they straight-forward INNER joins, or OUTER joins? Are there any FULL OUTER JOINs or sub-queries?

Hope this helps...

-Tim

On 11/7/14 13:59, Ebadi, Abdul wrote:

We have a half rack 4-node Exadata (X2 high capacity) running several DW databases for us. We have a query going against a 21 million row table with several self-joins in it. This query returns 7 million rows takes way too long too run (hrs).

We have made sure it is running in parallel using cell offloading (full storage scans) and when we put a count(*) around the query it returns in only 4 seconds for 7 million rows returned.

However, when we display the output to the screen it takes hours for it to finish and we see pauses in the display every second or two while it is running.

Trying to figure out what is causing these pauses? The wait is PX Deq: type waits when it runs with these pauses. SQL Monitor doesn’t tell us much either except cell efficiency is negative 85%!

We have increased PGA size and didn’t make much difference. We are considering putting TEMP tablespace on flash cache possibly. Another DBA added an index to it just to see (bad idea on Exadata) and did’t improve it. Before making any more change we would like to see some evidence for root cause.

We were told for best practices on Exadata it is better to remove indexes and hints (if possible) and let the machine full scan in parallel using storage offloading. Do you guys agree and are there other best practices on Exadata also?

Any other suggestions on tuning this query and also general Exadata best practices?

Thanks,

Abdul

--
http://www.freelists.org/webpage/oracle-l
Received on Sun Nov 09 2014 - 18:57:40 CET

Original text of this message