Re: blocking enq-ss wait events

From: Chris Taylor <christopherdtaylor1994_at_gmail.com>
Date: Thu, 6 May 2021 16:57:02 -0400
Message-ID: <CAP79kiSzVx34HeS47k2Zt+mV+V0yG7c4sGMz5=y5VgLRJy0b0A_at_mail.gmail.com>



It's interesting that you have different TEMP space assigned to SYS and application users. Are both TEMP spaces actually TEMPFILES or is one dictionary managed?

(DBA_TEMP_FILES vs DBA_DATA_FILES)
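
A rough way to check this is sketched below. It assumes you can query the DBA_ views; 'APP_USER' is only a placeholder for your application schema, not a name taken from this thread.

```sql
-- Which temporary tablespace is each user assigned, and how is it managed?
-- 'APP_USER' is a hypothetical placeholder; substitute your own schema name.
-- Note: if a user is assigned a temporary tablespace GROUP, this join
-- will return no row for that user.
SELECT u.username,
       u.temporary_tablespace,
       t.contents,
       t.extent_management
FROM   dba_users u
       JOIN dba_tablespaces t
         ON t.tablespace_name = u.temporary_tablespace
WHERE  u.username IN ('SYS', 'APP_USER');

-- Are the temporary tablespaces backed by true tempfiles (DBA_TEMP_FILES)
-- or by ordinary datafiles (DBA_DATA_FILES)?
SELECT t.tablespace_name, t.extent_management,
       'TEMPFILE' AS file_type, f.file_name
FROM   dba_tablespaces t
       JOIN dba_temp_files f
         ON f.tablespace_name = t.tablespace_name
WHERE  t.contents = 'TEMPORARY'
UNION ALL
SELECT t.tablespace_name, t.extent_management,
       'DATAFILE' AS file_type, d.file_name
FROM   dba_tablespaces t
       JOIN dba_data_files d
         ON d.tablespace_name = t.tablespace_name
WHERE  t.contents = 'TEMPORARY';
```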

Also, there used to be a scenario where the FAST_START_PARALLEL_ROLLBACK parameter would hurt performance, but in 11.2 I would not expect that to be an issue unless you're CPU bound, with the CPUs pegged at or near 100% busy.
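
To see how the parameter is currently set and how far the rollback has progressed, something along these lines may help. It is only a sketch, assuming SELECT access to the V$ views; V$FAST_START_TRANSACTIONS was not mentioned earlier in this thread and is my suggestion for watching progress.

```sql
-- Current setting of the parallel rollback parameter (FALSE, LOW or HIGH).
SELECT name, value
FROM   v$parameter
WHERE  name = 'fast_start_parallel_rollback';

-- Progress of transactions being recovered: undo blocks done vs. total.
SELECT usn,
       state,
       undoblocksdone,
       undoblockstotal
FROM   v$fast_start_transactions;
```

If UNDOBLOCKSDONE keeps climbing toward UNDOBLOCKSTOTAL between runs, the rollback is still making progress.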

Chris

On Thu, May 6, 2021 at 2:52 PM Lok P <loknath.73_at_gmail.com> wrote:

> I see we have one temp tablespace assigned to SYS and a different one for
> application users. So why would SMON (which is a SYS process) create SS
> (sort segment) contention on a temp tablespace that is assigned to
> application users/queries, and that too after a DB reboot?
>
> On Thu, May 6, 2021 at 11:44 PM Lok P <loknath.73_at_gmail.com> wrote:
>
>> Thanks a lot. Here is what happened; I am still struggling to understand
>> how these events are logically related. We killed one long-running
>> transaction because it had been reading from UNDO for a long time
>> (~8+ hrs). That resulted in a rollback which kept going, and we used to
>> see the wait event "wait for a undo record" from multiple SYS sessions
>> (most probably SMON doing the cleanup with multiple parallel slaves). Up
>> to that point we were okay, because the other application queries/sessions
>> were running fine and the rollback was not blocking anyone.
>>
>> Then, while the above was going on, the infra team carried out another
>> planned activity for which the database had to be rebooted. They did that
>> and brought the database back online, and after this we started seeing the
>> same "wait for a undo record" wait event. We thought SMON was probably
>> resuming its rollback/cleanup and that it should not impact other
>> application queries. But then "enq: SS - contention" suddenly appeared for
>> multiple application sessions, the blocking session was waiting on "sort
>> segment request", and those application queries were just stuck.
>>
>> I am trying to understand why, even though the database was shut down and
>> started up seamlessly, the old rollback/cleanup was initiated again by
>> SMON. And even then, why, after the DB reboot, was the application session
>> stuck on "sort segment request", causing other sessions to hang on
>> "enq: SS - contention"? How is that related to the big rollback?
>>
>> Regards
>> Lok
>>
>>
>> On Thu, May 6, 2021 at 7:02 PM Chris Taylor <
>> christopherdtaylor1994_at_gmail.com> wrote:
>>
>>> No, killing the recovery process isn't a great idea, as I'm fairly certain
>>> SMON is involved, which is a critical component of database operation.
>>> Kill it and you kill the instance.
>>>
>>> Are your CPUs on the server very busy, with very low idle?
>>>
>>> Find the SID and SERIAL# of the SMON process, then query GV$PX_SESSION for
>>> QCSID = <sid of SMON> and QCSERIAL# = <serial# of SMON> to see how many
>>> parallel server processes it's using.
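>>>
>>> A rough sketch of that lookup, assuming SMON appears in GV$SESSION with
>>> "(SMON)" in the PROGRAM column (verify this on your system) and that the
>>> bind variables :smon_sid and :smon_serial, which are illustrative names,
>>> are filled in from the first query:
>>>
>>> ```sql
>>> -- Step 1: locate the SMON session on each instance.
>>> SELECT inst_id, sid, serial#
>>> FROM   gv$session
>>> WHERE  program LIKE '%(SMON)%';
>>>
>>> -- Step 2: count the parallel server processes coordinated by SMON.
>>> SELECT COUNT(*) AS px_servers
>>> FROM   gv$px_session
>>> WHERE  qcsid     = :smon_sid
>>> AND    qcserial# = :smon_serial;
>>> ```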
>>>
>>> If you kill it (or have to restart the database), be aware that the
>>> startup of the database will most likely "pause" while SMON cleans up that
>>> dead transaction before it opens the database. SMON should do that in
>>> parallel, though, and a restart might be your best bet to get the database
>>> back to normal operation. It might take a while for SMON to finish and
>>> thus delay the opening of the database; you can monitor that by tailing
>>> the alert log file if you do kill and restart the database instance.
>>>
>>> Chris
>>>
>>>
>>>
>>>
>>>
>>> On Thu, May 6, 2021 at 1:14 AM Lok P <loknath.73_at_gmail.com> wrote:
>>>
>>>> Thank you. It seems to match our symptoms (a big rollback causing
>>>> enq-ss contention).
>>>>
>>>> We were thinking about killing the system processes that are trying to
>>>> perform the rollback, as we don't need that job anymore. Is that safe?
>>>> Or do we have to wait for it to finish and let the new transactions
>>>> move on by increasing pga_aggregate_target as suggested in the
>>>> note, or else drop and recreate the temp tablespace?
>>>>
>>>> On Thu, May 6, 2021 at 10:13 AM Chris Taylor <
>>>> christopherdtaylor1994_at_gmail.com> wrote:
>>>>
>>>>> I had to look up ENQ-SS on Oracle support. That's a sort segment
>>>>> contention, usually encountered when SMON is really busy cleaning up a dead
>>>>> / killed transaction.
>>>>>
>>>>> Oracle support has a few notes on this, but the most applicable seem to
>>>>> stop at 11.1.
>>>>>
>>>>> There is one note that says to try increasing PGA_AGGREGATE_TARGET:
>>>>> SS Sort Segment Enqueue: 'enq: SS - contention' (Doc ID 2601825.1)
>>>>>
>>>>> Chris
>>>>>
>>>>> On Wed, May 5, 2021 at 11:56 PM Lok P <loknath.73_at_gmail.com> wrote:
>>>>>
>>>>>> Hi All, need some help. We killed one of the long-running
>>>>>> sessions, which had been running for ~8+ hrs, and after the kill we
>>>>>> saw a lot of "wait for a undo record" wait events. We ignored them,
>>>>>> thinking they would continue for some time because of the rollback.
>>>>>> But suddenly we are now seeing the "enq: SS - contention" wait event
>>>>>> in addition to "wait for a undo record", and that is blocking all
>>>>>> other application queries. So we are wondering how we should mitigate
>>>>>> this issue.
>>>>>>
>>>>>> The version is 11.2.0.4 on Exadata.
>>>>>>
>>>>>> Regards
>>>>>> Lok
>>>>>>
>>>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu May 06 2021 - 22:57:02 CEST
