Alerting early for minimum impact

From: yudhi s <learnerdatabase99_at_gmail.com>
Date: Tue, 17 Sep 2024 11:07:11 +0530
Message-ID: <CAEzWdqfkj6cMmdmZOC_jkpZLsxDK2jf5SrrtgB+zPEe_vcDP-Q_at_mail.gmail.com>



Hi ,

We are using Oracle exadata servers 23.1 image and the database version is 19c.

We have just encountered a scenario in which we have one select query which was doing a full table scan(cell smart scan) and suddenly as part of a bulk migration effort that query started running Avg ~4000 times per hour from initial(avg ~20 runs per hour usual pattern) and this has caused the cell server cpu spike to ~85% from normal ~30% CPU and the disk IO response (mainly the write response) increased from <.5ms to double i.e >1.5 ms and it impacted one of the other database which was latency sensitive one hosted in that same box.

It took us a long time to get to the sql because it was not showing as any of the TOP contributors in that database layer but it did impact the storage cell servers IO and CPU significantly. The response time of that sql just increased from ~5-10 seconds earlier to ~20 seconds during the issue period , so we were unable to find this as one of the top sql in that database. And this query was not stuck but finishing up.

And also we were having 3 to 4 other databases hosted in that exadata cluster. And when we got complaints from a different database which was a latency sensitive one , we spent a good amount of time digging into the top sqls from that database whereas it was a different database query which has caused this issue.

So I wanted to understand , in such a scenario in which one of the queries impacting the storage cell IO/CPU may not be one showing up in the database layer. So what alerting can be done to get to it as soon as possible. and if anything is possible in the OEM metrics itself which we can configure to catch these kinds of issues beforehand? Also I don't see an easy way to find the top queries consuming storage cell CPU or IO across all the databases hosted in that box. Is there any option possible from awr/dba_hist* views?

Regards
Yudhi

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Sep 17 2024 - 07:37:11 CEST

Original text of this message