Re: Finding cause and fix for Idle wait event

From: Nenad Noveljic <nenad.noveljic_at_gmail.com>
Date: Fri, 24 Feb 2023 08:24:54 +0100
Message-ID: <CAESzRb=EJMJEQoZnpXFj3qSDLpuuOdg4XTZcDxp7+3dy8oJdXg_at_mail.gmail.com>



I've seen this behaviour with Informatica. The reason were the workflows where the tool was unnecessarily grabbing large quantities of data from the database and joining them on the Informatica server. In the Informatica vernacular the joins are called "look-ups" and they require sorting the selected data. As this was very intensive, the Oracle session was idly waiting during that time. I worked together with the Informatica specialist to select only a necessary subset of data and join them in the database, which reduced the processing from hours to less than a minute.

Best regards,

Nenad

On Fri, Feb 24, 2023 at 6:21 AM Pap <oracle.developer35_at_gmail.com> wrote:

> You may try to check with informatica admin for the cpu or memory
> utilization trends.
>
> On Fri, 24 Feb, 2023, 10:19 am Lok P, <loknath.73_at_gmail.com> wrote:
>
>> Thank you Maris. So I think it means, say the etl was doing a certain
>> amount of work in x time and now it's doing the same work in a lot more
>> time and thus these high idle/"SQL * net messages from client waits".
>> Correct me if I'm wrong. Can we rule out the network settings like sdu size
>> or tdu etc to fix this issue here? Or is there any sort of oracle net
>> tracing we can do to dig more into this issue, wanting to
>> explore all possible options at hand?
>>
>> As we talked to etl team, they said its cached lookup which is happening
>> on ~200million rows and no transformation as such is happening in the ETL
>> code during this insert. We asked to check the cache lookup performance
>> trend over the past few days but they said no such instrumentation
>> available to check that trend.
>>
>>
>>
>> On Fri, 24 Feb, 2023, 2:13 am Maris Elsins, <elmaris_at_gmail.com> wrote:
>>
>>> Hey,
>>>
>>> Did you already connect to your client machine where Informatica
>>> is running, and check if it's doing fine?
>>> I wouldn't be surprised if there was something "stuck" using lots of CPU
>>> or creating some other bottleneck, which would be slowing down the ETL
>>> process' ability to push the data to the DB a lot.
>>>
>>> ---
>>> Māris Elsiņš
>>>
>>>
>>>
>>>
>>> On Thu, Feb 23, 2023 at 10:11 PM Lok P <loknath.73_at_gmail.com> wrote:
>>>
>>>> Hello Friends, It's a 19c database on exadata machine. We have a simple
>>>> insert as below executed from an ETL tool(Informatica). Suddenly the
>>>> application team complained of slowness in that INSERT query. It used to
>>>> run for 2-3hrs and now started running for 7-8hrs. While checking the query
>>>> during run time in v$session, we see the query is getting inactive
>>>> frequently and the event is showing as "SQL*Net message from client".
>>>> However the team is saying no changes have been made to the ETL code. And
>>>> it used to execute in similar fashion. Also no changes have been made in DB
>>>> side or in network too.
>>>>
>>>> Looking more into the historical execution from dba_hist_sqlstat , we
>>>> saw the elapsed_time of the query remains the same since the last couple of
>>>> months i.e ~5ms/execution. And also the rows_processed/executions is
>>>> showing consistently ~110 which means it's a batched insert getting
>>>> executed from the etl tool and batch size also remains the same.
>>>>
>>>> After monitoring it for ~5minutes duration, we see it inserted
>>>> ~1.6million rows with ~14829 executions and the total elapsed_time was
>>>> ~1.3minutes which means, rest of the time(~3.7minutes) it was doing
>>>> something at client side only. I collected the non zero stats difference
>>>> from gv$sesstat for the ~5minutes run duration of the insert query and
>>>> published it below gist. I can't figure out anything odd from these stats.
>>>>
>>>> https://gist.github.com/oraclelearner/0bcab0a80c53a66b999b4d2a696d2974
>>>>
>>>> Is there any way we can debug more into this "SQL*Net message from
>>>> client" , wait through any trace etc, to see what exactly it's doing in the
>>>> network in the client? And we see nothing is mentioned in the sqlnet.ora so
>>>> it's the default sdu_size it's using in DB side, so is there any chance of
>>>> by tweaking the sdu_size, it will help better data packet transfer and thus
>>>> we will get better performance? Or any other way out to speed up the Insert
>>>> query performance?
>>>>
>>>> The avg_row_len is 222 and it's a daily partitioned table with 4
>>>> indexes on it.
>>>>
>>>> INSERT INTO TAB1(c1,c2..........c45)
>>>> VALUES ( :1, :2, :3, :4, :5, :6, :7, :8, :9, :10, :11, :12, :13, :14,
>>>> :15, :16, :17, :18, :19, :20, :21, :22, :23, :24, :25, :26, :27, :28,
>>>> :29, :30, :31, :32, :33, :34, :35, :36, :37, :38, :39, :40, :41, :42,
>>>> :43, :44, :45)
>>>>
>>>> -------------------------------------------------
>>>> | Id | Operation | Name | Cost |
>>>> -------------------------------------------------
>>>> | 0 | INSERT STATEMENT | | 1 |
>>>> | 1 | LOAD TABLE CONVENTIONAL | | |
>>>> -------------------------------------------------
>>>>
>>>> Query Block Name / Object Alias (identified by operation id):
>>>> -------------------------------------------------------------
>>>>
>>>> 1 - INS$1
>>>>
>>>> Note
>>>> -----
>>>> - cpu costing is off (consider enabling it)
>>>>
>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Feb 24 2023 - 08:24:54 CET

Original text of this message