Re: Alerting early for minimum impact
Date: Tue, 17 Sep 2024 22:00:58 +0100
Message-ID: <CAGtsp8=+-1t0GwahYPaXxpyc0zmQmfp78682UWS7xpVzNFXq4A_at_mail.gmail.com>
I haven't thought this one through, but could you do something with v$sql,
checking the related columns, possibly:
Regards
On Tue, 17 Sept 2024 at 06:38, yudhi s <learnerdatabase99_at_gmail.com> wrote:
> Hi ,
IO_CELL_OFFLOAD_ELIGIBLE_BYTES
IO_INTERCONNECT_BYTES
IO_CELL_UNCOMPRESSED_BYTES
IO_CELL_OFFLOAD_RETURNED_BYTES
Jonathan Lewis
>
> We are using Oracle exadata servers 23.1 image and the database version is
> 19c.
>
> We have just encountered a scenario in which we have one select query
> which was doing a full table scan(cell smart scan) and suddenly as part of
> a bulk migration effort that query started running Avg ~4000 times per hour
> from initial(avg ~20 runs per hour usual pattern) and this has caused the
> cell server cpu spike to ~85% from normal ~30% CPU and the disk IO response
> (mainly the write response) increased from <.5ms to double i.e >1.5 ms and
> it impacted one of the other database which was latency sensitive one
> hosted in that same box.
>
> It took us a long time to get to the sql because it was not showing as any
> of the TOP contributors in that database layer but it did impact the
> storage cell servers IO and CPU significantly. The response time of that
> sql just increased from ~5-10 seconds earlier to ~20 seconds during the
> issue period , so we were unable to find this as one of the top sql in that
> database. And this query was not stuck but finishing up.
>
> And also we were having 3 to 4 other databases hosted in that exadata
> cluster. And when we got complaints from a different database which was a
> latency sensitive one , we spent a good amount of time digging into the top
> sqls from that database whereas it was a different database query which has
> caused this issue.
>
> So I wanted to understand , in such a scenario in which one of the queries
> impacting the storage cell IO/CPU may not be one showing up in the database
> layer. So what alerting can be done to get to it as soon as possible. and
> if anything is possible in the OEM metrics itself which we can configure to
> catch these kinds of issues beforehand? Also I don't see an easy way to
> find the top queries consuming storage cell CPU or IO across all the
> databases hosted in that box. Is there any option possible from
> awr/dba_hist* views?
>
> Regards
> Yudhi
>
-- http://www.freelists.org/webpage/oracle-lReceived on Tue Sep 17 2024 - 23:00:58 CEST