Re: Finding cause and fix for Idle wait event

From: Lok P <loknath.73_at_gmail.com>
Date: Fri, 24 Feb 2023 13:17:32 +0530
Message-ID: <CAKna9VYK362fqtRrs=V6_Dzum-nZOSntFvYO+9tjnAPQgUmhdw_at_mail.gmail.com>



Thank you Nenad. I was under the impression that lookups mean just filters/reads from the Informatica cache, which would be fast, but I agree that sorting and joining ~200 million rows can be resource intensive.

Is there any instrumentation available in Informatica that shows how much time/resources those cached lookups take for a workflow?

And are you saying that modifying sdu_size/tdu on the client side or the database is not going to make any difference in this case?
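
For reference, here's a minimal sketch of where SDU would be configured if we do try it; the 65535 value, the alias ORCL_ETL, and the host/service names are just placeholders, not something we have validated for this workload:

  # sqlnet.ora on the client (Informatica server)
  DEFAULT_SDU_SIZE = 65535

  # tnsnames.ora -- SDU can also be set per connect descriptor
  ORCL_ETL =
    (DESCRIPTION =
      (SDU = 65535)
      (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost01)(PORT = 1521))
      (CONNECT_DATA = (SERVICE_NAME = orclsvc))
    )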

On Fri, 24 Feb, 2023, 12:55 pm Nenad Noveljic, <nenad.noveljic_at_gmail.com> wrote:

> I've seen this behaviour with Informatica. The cause was workflows where
> the tool was unnecessarily pulling large quantities of data from the
> database and joining them on the Informatica server. In Informatica
> vernacular these joins are called "look-ups", and they require sorting the
> selected data. As this was very resource intensive, the Oracle session was
> idly waiting during that time. I worked with an Informatica specialist to
> select only the necessary subset of data and join it in the database,
> which reduced the processing time from hours to less than a minute.
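>
> As a rough illustration only (the table and column names below are made
> up, not from the actual workflow), the idea was to replace the cached
> look-up with a single set-based statement such as:
>
>   INSERT /*+ APPEND */ INTO target_tab (c1, c2, c3)
>   SELECT s.c1, s.c2, l.c3
>   FROM   stage_tab  s
>   JOIN   lookup_tab l ON l.key_col = s.key_col
>   WHERE  s.load_date >= TRUNC(SYSDATE);
>
> so the join and the filtering happen inside the database instead of after
> pulling both row sources across the network.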
>
> Best regards,
>
> Nenad
>
> On Fri, Feb 24, 2023 at 6:21 AM Pap <oracle.developer35_at_gmail.com> wrote:
>
>> You may want to check with the Informatica admin for CPU or memory
>> utilization trends.
>>
>> On Fri, 24 Feb, 2023, 10:19 am Lok P, <loknath.73_at_gmail.com> wrote:
>>
>>> Thank you Maris. So I think it means the ETL was doing a certain amount
>>> of work in x time and is now doing the same work in a lot more time,
>>> hence these high idle "SQL*Net message from client" waits. Correct me if
>>> I'm wrong. Can we rule out network settings like SDU size or TDU as a
>>> fix for this issue? Or is there some kind of Oracle Net tracing we can
>>> do to dig further into it? We want to explore all possible options.
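>>>
>>> In case it's useful, here is a minimal sketch of client-side Oracle Net
>>> tracing (the directory and file name are placeholders; level 16/SUPPORT
>>> traces grow very large, so this would only be for a short, controlled
>>> run):
>>>
>>>   # sqlnet.ora on the Informatica/client machine
>>>   DIAG_ADR_ENABLED = OFF
>>>   TRACE_LEVEL_CLIENT = 16
>>>   TRACE_TIMESTAMP_CLIENT = ON
>>>   TRACE_DIRECTORY_CLIENT = /tmp/net_trace
>>>   TRACE_FILE_CLIENT = etl_cli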
>>>
>>> When we talked to the ETL team, they said it's a cached lookup happening
>>> on ~200 million rows, and no transformation as such is happening in the
>>> ETL code during this insert. We asked them to check the cache lookup
>>> performance trend over the past few days, but they said no such
>>> instrumentation is available to check that trend.
>>>
>>>
>>>
>>> On Fri, 24 Feb, 2023, 2:13 am Maris Elsins, <elmaris_at_gmail.com> wrote:
>>>
>>>> Hey,
>>>>
>>>> Have you already connected to the client machine where Informatica is
>>>> running and checked whether it's doing fine?
>>>> I wouldn't be surprised if something were "stuck", using lots of CPU or
>>>> creating some other bottleneck, which would considerably slow down the
>>>> ETL process's ability to push data to the DB.
>>>>
>>>> ---
>>>> Māris Elsiņš
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Feb 23, 2023 at 10:11 PM Lok P <loknath.73_at_gmail.com> wrote:
>>>>
>>>>> Hello Friends, it's a 19c database on an Exadata machine. We have a
>>>>> simple insert, shown below, executed from an ETL tool (Informatica).
>>>>> Suddenly the application team complained of slowness in that INSERT
>>>>> query: it used to run for 2-3 hrs and now runs for 7-8 hrs. While
>>>>> checking the query at run time in v$session, we see the session going
>>>>> inactive frequently with the event "SQL*Net message from client".
>>>>> However, the team says no changes have been made to the ETL code and
>>>>> it used to execute in a similar fashion. No changes have been made on
>>>>> the DB side or in the network either.
>>>>>
>>>>> Looking further into the historical executions in dba_hist_sqlstat, we
>>>>> saw that the elapsed_time of the query has remained the same for the
>>>>> last couple of months, i.e. ~5 ms/execution. Also,
>>>>> rows_processed/executions is consistently ~110, which means it's a
>>>>> batched insert executed from the ETL tool and the batch size has also
>>>>> remained the same.
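>>>>>
>>>>> For completeness, this is roughly the kind of query we ran against
>>>>> dba_hist_sqlstat (&sql_id is a placeholder for the insert's sql_id;
>>>>> elapsed_time_delta is in microseconds):
>>>>>
>>>>>   SELECT sn.begin_interval_time,
>>>>>          st.executions_delta,
>>>>>          st.rows_processed_delta,
>>>>>          ROUND(st.elapsed_time_delta
>>>>>                / NULLIF(st.executions_delta, 0) / 1000, 2) AS ms_per_exec
>>>>>   FROM   dba_hist_sqlstat  st
>>>>>   JOIN   dba_hist_snapshot sn ON  sn.snap_id = st.snap_id
>>>>>                               AND sn.dbid = st.dbid
>>>>>                               AND sn.instance_number = st.instance_number
>>>>>   WHERE  st.sql_id = '&sql_id'
>>>>>   ORDER  BY sn.begin_interval_time;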
>>>>>
>>>>> After monitoring it for a ~5-minute window, we see it inserted ~1.6
>>>>> million rows with ~14,829 executions, and the total elapsed_time was
>>>>> ~1.3 minutes, which means the rest of the time (~3.7 minutes) it was
>>>>> doing something on the client side only. I collected the non-zero
>>>>> stats differences from gv$sesstat for the ~5-minute run of the insert
>>>>> query and published them in the gist below. I can't figure out
>>>>> anything odd from these stats.
>>>>>
>>>>> https://gist.github.com/oraclelearner/0bcab0a80c53a66b999b4d2a696d2974
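>>>>>
>>>>> Roughly how such a before/after snapshot can be taken (the sesstat_t0
>>>>> table name and the &sid substitution are placeholders; this may not be
>>>>> exactly the script we used):
>>>>>
>>>>>   -- snapshot at the start of the observation window
>>>>>   CREATE TABLE sesstat_t0 AS
>>>>>   SELECT inst_id, sid, statistic#, value
>>>>>   FROM   gv$sesstat
>>>>>   WHERE  sid = &sid;
>>>>>
>>>>>   -- ~5 minutes later: report only the statistics that changed
>>>>>   SELECT n.name, s.value - t0.value AS delta
>>>>>   FROM   gv$sesstat  s
>>>>>   JOIN   gv$statname n  ON  n.inst_id = s.inst_id
>>>>>                         AND n.statistic# = s.statistic#
>>>>>   JOIN   sesstat_t0  t0 ON  t0.inst_id = s.inst_id
>>>>>                         AND t0.sid = s.sid
>>>>>                         AND t0.statistic# = s.statistic#
>>>>>   WHERE  s.sid = &sid
>>>>>   AND    s.value - t0.value <> 0
>>>>>   ORDER  BY delta DESC;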
>>>>>
>>>>> Is there any way we can debug this "SQL*Net message from client" wait
>>>>> further, through a trace etc., to see what exactly it's doing in the
>>>>> network or on the client? Nothing is set in sqlnet.ora, so the DB side
>>>>> is using the default sdu_size; is there any chance that tweaking the
>>>>> sdu_size would give better data packet transfer and thus better
>>>>> performance? Or is there any other way to speed up the insert?
>>>>>
>>>>> The avg_row_len is 222 and it's a daily partitioned table with 4
>>>>> indexes on it.
>>>>>
>>>>> INSERT INTO TAB1(c1,c2..........c45)
>>>>> VALUES ( :1, :2, :3, :4, :5, :6, :7, :8, :9, :10, :11, :12, :13, :14,
>>>>> :15, :16, :17, :18, :19, :20, :21, :22, :23, :24, :25, :26, :27, :28,
>>>>> :29, :30, :31, :32, :33, :34, :35, :36, :37, :38, :39, :40, :41, :42,
>>>>> :43, :44, :45)
>>>>>
>>>>> -------------------------------------------------
>>>>> | Id | Operation | Name | Cost |
>>>>> -------------------------------------------------
>>>>> | 0 | INSERT STATEMENT | | 1 |
>>>>> | 1 | LOAD TABLE CONVENTIONAL | | |
>>>>> -------------------------------------------------
>>>>>
>>>>> Query Block Name / Object Alias (identified by operation id):
>>>>> -------------------------------------------------------------
>>>>>
>>>>> 1 - INS$1
>>>>>
>>>>> Note
>>>>> -----
>>>>> - cpu costing is off (consider enabling it)
>>>>>
>>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Feb 24 2023 - 08:47:32 CET
