Re: Performance issue on Oracle 9i

From: Martin Berger <martin.a.berger_at_gmail.com>
Date: Sat, 13 Jan 2024 19:57:47 +0100
Message-ID: <CALH8A91kri_OfeEWW0m3v+jhtA2Bg=EkvEVA6umMxKaNK97uEw_at_mail.gmail.com>

Hi Sandy,

I understand you can not share any data from your company, so I decided to create my own model and play with this one. Obviously my datas distribution doesn't match yours - if it seems to be relevant somewhere you might be able to improve the generator so we have a better understanding and you never shared anything regarding your company. We are only playing with the model.

All my tests here are done with 23c but with the hint OPTIMIZER_FEATURES_ENABLE('9.2') - so let's hope it comes close to your env.
My sandbox is 23c free (23.2) on linux with 8k blocksize, default USERS tablespace.

First the table:

CREATE TABLE txn_proc
(
ID NUMBER NOT NULL
, TX_REF NUMBER NOT NULL
, SEND_DATE DATE
, EXP_CMNT VARCHAR2(20)
, PROJNO VARCHAR2(20)
, TNO VARCHAR2(20)
, PAD VARCHAR2(999)

);

INSERT INTO txn_proc
select level as id

     , trunc(level/10) as tx_ref
     , case when mod (level,29)=1 then sysdate else null end send_date
     , case mod (level, 11)
       when 1 then 'A'
       when 2 then 'B'
       when 3 then 'C'
       when 4 then 'D'
       when 5 then 'PR'
       else 'X'
      end as exp_cmnt
    , case when mod(level, 7) = 1 then 'PR' else '0' end  projno

, case when mod(level,13) = 1 then 'PR' else '1' end tno , rpad ('X', 42, 'U') pad
from dual
connect by level < 380000
;

commit;
begin

DBMS_STATS.GATHER_TABLE_STATS (ownname => '"BERX"', tabname => '"TXN_PROC"',
estimate_percent => 100 );
end;
/

with these statistics:
(less important data removed)

TABLE: TXN_PROC
ROWS :379999
SAMPLE SIZE :379999
Columns

NAME       LOW_VALUE                 HIGH_VALUE               NUM_DISTINCT
 ID            1                         379999                   379999

 TX_REF        0                         37999                    38000

 SEND_DATE     2024.01.13.18.29.25       2024.01.13.18.29.25      1

 EXP_CMNT      A                         X                        6

 PROJNO        0                         PR                       2

 TNO           1                         PR                       2

 PAD           XU....                    XUU....                  1

The +/- original statement
select /*+ qb_name(MAIN) OPTIMIZER_FEATURES_ENABLE('9.2') DB_VERSION('9.2') */
-- BX1

*
from txn_proc t1
where t1.tx_ref not in

        (select t.tx_ref
         from txn_proc t
         where t.send_date is null
           and t.exp_cmnt not like 'PR%'
           and t.exp_cmnt != projno||tno
        )

and t1.send_date is null
/

generates an execution plan

|   0 | SELECT STATEMENT   |          |      1 |        |  *1895 *|      0
|    *6938 *|
|*  1 |  HASH JOIN ANTI    |          |      1 |      1 |  1895 |      0 |
   6938 |
|*  2 |   TABLE ACCESS FULL| TXN_PROC |      1 |    366K|   536 |    366K|
   3469 |
|*  3 |   TABLE ACCESS FULL| TXN_PROC |      1 |    254K|   536 |    333K|

3469 |

with no index - nothing spectacular.

with these 2 indices things gets more interesting:

create index txn_proc_bx_A on txn_proc( send_date, exp_cmnt, projno, tno, tx_ref);
create index txn_proc_bx_B on txn_proc( send_date, tx_ref);

leads to

|   0 | SELECT STATEMENT      |               |      1 |        |  *1548 *|
     0 |    *3853 *|
|*  1 |  HASH JOIN ANTI       |               |      1 |      1 |  1548 |
   0 |    3853 |
|*  2 |   TABLE ACCESS FULL   | TXN_PROC      |      1 |    366K|   536 |
 366K|    3469 |
|*  3 |   INDEX FAST FULL SCAN| TXN_PROC_BX_A |      1 |    254K|   189 |
 99816 |     384 |
--------------------------------------------------------------------------------------------

A slightly decrease at Costs, but buffers went down by ~45%

Playing with hints changes the situation a little bit:

select /*+ qb_name(MAIN) *INDEX(_at_"SEL$A3F38ADC" "T1"_at_"MAIN" txn_proc_bx_B )* OPTIMIZER_FEATURES_ENABLE('9.2') DB_VERSION('9.2') */ -- BX3

*
from txn_proc t1
where t1.tx_ref not in

        (select t.tx_ref
         from txn_proc t
         where t.send_date is null
           and t.exp_cmnt not like 'PR%'
           and t.exp_cmnt != projno||tno
        )

and t1.send_date is null
/

| Id  | Operation                    | Name          | Starts | E-Rows |

Cost | A-Rows | Buffers |

|   0 | SELECT STATEMENT             |               |      1 |        |  *9066
*|      0 |    *5245 *|
|*  1 |  HASH JOIN ANTI              |               |      1 |      1 |
 9066 |      0 |    5245 |
|   2 |   TABLE ACCESS BY INDEX ROWID| TXN_PROC      |      1 |    366K|
 8054 |    366K|    4861 |
|*  3 |    INDEX RANGE SCAN          | TXN_PROC_BX_B |      1 |    366K|
871 |    366K|     859 |
|*  4 |   INDEX FAST FULL SCAN       | TXN_PROC_BX_A |      1 |    254K|
189 |  99816 |     384 |
---------------------------------------------------------------------------------------------------

is worse in my env,

select /*+ qb_name(MAIN)

NO_UNNEST(_at_"SEL$1") INDEX(_at_"SEL$A3F38ADC" "T1"@"MAIN" txn_proc_bx_B ) * OPTIMIZER_FEATURES_ENABLE('9.2') DB_VERSION('9.2') */ -- BX4 * from txn_proc t1 ...

| Id  | Operation                    | Name          | Starts | E-Rows |

Cost | A-Rows | Buffers |

|   0 | SELECT STATEMENT             |               |      1 |        |   *541
*|      0 |     *166K*|
|*  1 |  FILTER                      |               |      1 |        |
    |      0 |     166K|
|*  2 |   TABLE ACCESS FULL          | TXN_PROC      |      1 |  18345 |
536 |    366K|    3469 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| TXN_PROC      |  40828 |      7 |
  5 |  40828 |     163K|
|*  4 |    INDEX SKIP SCAN           | TXN_PROC_BX_B |  40828 |     10 |
  4 |  44391 |     122K|
---------------------------------------------------------------------------------------------------

shows a very nice Cost but the Buffers are ... worse.

another approach by rewriting the query slightly (it was just a test to see if I can optimize the indexes)
SELECT /*+ QB_NAME ("MAIN") OPTIMIZE R_FEATURES_ENABLE('9.2') DB_V ERSION('9.2') */
-- BX6

      *
FROM
    TXN_PROC A1
Where A1.rowid = (
  select /*+ qb_name(RID) materialize */ rowid   from TXN_PROC A2
WHERE
    "A2"."TX_REF" <> ALL (

        SELECT
            "A3"."TX_REF" "TX_REF"
        FROM
            TXN_PROC A3
        WHERE
            "A3"."SEND_DATE" IS NULL
            AND "A3"."EXP_CMNT" NOT LIKE 'PR%'
            AND "A3"."EXP_CMNT" <> "A3"."PROJNO" || "A3"."TNO"

)
AND "A2"."SEND_DATE" IS NULL)
;

leads to

| Id  | Operation                             | Name          | Starts |

E-Rows | Cost (%CPU)| A-Rows | Buffers |

|   0 | SELECT STATEMENT                      |               |      1 |
     |   307 (100)|      0  |     207K|
|   1 |  TABLE ACCESS BY USER ROWID           | TXN_PROC      |      1 |
   1 |     1   (0)|      0  |     207K|
|   2 |   NESTED LOOPS ANTI                   |               |      1 |
   1 |   306   (0)|      0  |     207K|
|*  3 |    INDEX RANGE SCAN                   | TXN_PROC_BX_A |      1 |
 101 |     3   (0)|    366K |    1182 |
|*  4 |    TABLE ACCESS BY INDEX ROWID BATCHED| TXN_PROC      |    238K|
 254K|     3   (0)|    238K |     206K|
|*  5 |     INDEX RANGE SCAN                  | TXN_PROC_BX_B |    238K|
  10 |     2   (0)|    259K |     148K|
------------------------------------------------------------------------------------------------------------------

. even better costs, but worse Buffers.

There are many more possibilities and most of them depend on the data distribution you have.

My examples also show how different goals (cost? buffers? execution time?) can lead to different plans - choose your goals according to your needs.

I generated my plans with DBMS_XPLAN.DISPLAY_CURSOR('<SQL_ID>', ...) - in 9.2 there is only DBMS_XPLAN.DISPLAY which required a kind of plan_table. To mitigate this limitation, there is a very nice solution provided by Tom Kyte:
https://asktom.oracle.com/ords/f?p=100:11:0::::P11_QUESTION_ID:230338600346782894 - I hope this helps in your investigations.

let's see where our investigations leads us, Martin

Am Do., 11. Jan. 2024 um 14:57 Uhr schrieb Sandra Becker < sbecker6925_at_gmail.com>:

> Unfortunately, company policy prohibits me from providing the sql plan.  I
> can tell you that pre-index the query was doing full table scans for both
> the query and subquery.  Post-index build, it was using the index for
> both.  I can create new indexes and/or rewrite the query.   The sub-query
> returns 29k rows on average.    The t.exp_cmnt not like replacement since
> it can be a combination of any two letters of the alphabet.  I can play
> with that and see what I can come up with, possibly combining with the date
> column?
>
> Thanks for your assistance.
>
> Sandy
>
>
> On Thu, Jan 11, 2024 at 6:05 AM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>>
>> Are you allowed to rewrite the query or do you have to live with it and
>> tweak the database.
>> How long does it take to run (pre- and post- index creation), and what
>> did the two execution plans look like.
>> Was the index on Just the send_date, or was it on (send_date, some
>> non-null declared column).  If the former then it shouldn't have made any
>> difference to the plan (in the general case - but there are some "exotic"
>> anomalies) which would suggest that the improvement was from side effects
>> (like better statistics, or the effects of blocks being cleaned out during
>> the index build).
>>
>> If you can rewrite the query there are several possibilities, but the
>> best strategy depends on know the effects of each of the data predicates
>> and combinations of predicates. For example, how many rows have a null
>> send_date, how many rows have exp_cmnt != projno||tno.  (It would be better
>> to include the "t." table alias in that predicate - even though it should
>> make no difference in this case (unless you've been getting the wrong
>> results because it was supposed to be a predicate correlating to "t1.".))
>>
>> Example step - if rewrites are allowed and necessary:
>>     t.exp_cmnt not like 'PR%'
>> is equivalent to
>>     t.exp_cmnt >= 'PS' or t.exp_cmnt < 'PR'
>>
>> (Note: greater than or equal to, strictly less than)
>> This means the subquery could be rewritten as a UNION ALL of two query
>> blocks that could operate through an index.
>>
>>
>>
>> Regards
>> Jonathan Lewis
>>
>>
>> On Wed, 10 Jan 2024 at 22:58, Sandra Becker <sbecker6925_at_gmail.com>
>> wrote:
>>
>>> OS:   SunOS 5.8
>>> DB:   Oracle 9.2.0.5
>>>
>>> We're sitting on really old hardware with a really old version of
>>> Oracle.  There is a project to migrate to another application, but they
>>> estimate it will take another 18 months given the human resources that are
>>> available.  That being said, this is a production financial application and
>>> is performing extremely poorly for queries against a specific table.  It's
>>> not a huge table, 388,000 rows, but given the age of the hardware/software,
>>> I'm surprised we don't have more issues.
>>>
>>> The query itself is poorly written, but I haven't figured out how to
>>> make it more efficient.  I did manage to reduce the cost and execution time
>>> of the query by 50% by creating an index on the SEND_TO_DATE column--which
>>> can contain nulls--but it's still very slow.  I also set the degree on the
>>> table to 8, which gave us a minor bump in performance.  Any suggestions
>>> would be appreciated, specifically on how I can change the "not like" and
>>> "!=" predicates.
>>>
>>> select *
>>> from aps1.txn_proc t1
>>> where t1.tx_ref not in
>>>         (select t.tx_ref
>>>          from aps1.txn_proc t
>>>          where t.send_date is null
>>>            and t.exp_cmnt not like 'PR%'
>>>            and t.exp_cmnt != projno||tno
>>>         )
>>>   and t1.send_date is null
>>> /
>>>
>>>
>>> Thank you,
>>> --
>>> Sandy B.
>>>
>>>
>
> --
> Sandy B.
>
>

-- 
Martin Berger                Oracle ♠
martin.a.berger_at_gmail.com _at_martinberx <https://twitter.com/martinberx>
^∆x      http://berxblog.blogspot.com

--
http://www.freelists.org/webpage/oracle-l

Received on Sat Jan 13 2024 - 19:57:47 CET

This message: [ Message body ]
Next message: yudhi s: "Differences in key database design and architecture"
Previous message: Mark W. Farnham: "RE: Performance issue on Oracle 9i"
In reply to: Sandra Becker: "Re: Performance issue on Oracle 9i"
Next in thread: Jonathan Lewis: "Re: Performance issue on Oracle 9i"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message