Re: Query help
From: Andrew Kerber <andrew.kerber_at_gmail.com>
Date: Thu, 12 Mar 2015 11:04:14 -0500
Message-Id: <0410AEC0-80BD-45BE-A105-709EDF7AE41B_at_gmail.com>
Oh. Yes, an earlier version reported by tablespace; I just never took it out. I see your point about block size, but we only use an 8k block size, so for us it doesn't matter.
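
For anyone who does mix block sizes, a sketch along these lines (untested; it joins dba_tablespaces by name so each tablespace's own block_size replaces the hard-coded 8192, per Stéphane's point) would do it:

select get_date, sum(size_gb) tot_size, sum(used_gb) used_size
  from ( select to_char(s.begin_interval_time, 'yyyy-mm-dd') get_date
              , v.name ts_name
              -- d.block_size instead of a hard-coded 8192 (assumes dba_tablespaces is readable)
              , round(max(t.tablespace_size * d.block_size) / 1024 / 1024 / 1024, 2) size_gb
              , round(max(t.tablespace_usedsize * d.block_size) / 1024 / 1024 / 1024, 2) used_gb
           from v$tablespace v, dba_tablespaces d, dba_hist_snapshot s, dba_hist_tbspc_space_usage t
          where t.tablespace_id = v.ts#
            and d.tablespace_name = v.name
            and t.snap_id = s.snap_id
          group by to_char(s.begin_interval_time, 'yyyy-mm-dd'), v.name) group1
 group by get_date
 order by get_date
/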
Sent from my iPhone
> On Mar 12, 2015, at 9:51 AM, Stéphane Faroult <sfaroult_at_roughsea.com> wrote:
>
> Just a question: why are you all keeping a join with v$tablespace and not using anything that comes from it in the final result? For aggregation purposes, a tablespace_id does as nicely as a tablespace name.
> In fact, I *might* use it to get the block size, because seeing an 8K block size being hard-coded makes me uncomfortable ...
>
> S Faroult
>
>> On 12/03/15 09:43, stephen van linge (Redacted sender swvanlinge_at_yahoo.com for DMARC) wrote:
>> If you're curious about the "at least not without some very tricky weird SQL that probably would cause optimizer meltdown" case that Kim was referring to, I threw one together for you (sorry, I also removed the comma joins, I hate reading those). I found the window-function ordering was much quicker than the double aggregation, but we also have a much smaller system than you likely have (and a very short snapshot interval), so consider it an academic response.
>>
>> with group1 as (
>> SELECT to_char(s.begin_interval_time, 'yyyy-mm-dd') AS get_date
>> , v.name AS ts_name
>> , row_number() OVER (PARTITION BY to_char(s.begin_interval_time, 'yyyy-mm-dd'), v.name
>> ORDER BY round((t.tablespace_size * 8192) / 1024 / 1024 / 1024, 2) DESC) AS size_gb_rank
>> , row_number() OVER (PARTITION BY to_char(s.begin_interval_time, 'yyyy-mm-dd'), v.name
>> ORDER BY round((t.tablespace_usedsize * 8192) / 1024 / 1024 / 1024, 2) DESC) AS usedsize_gb_rank
>> , round((t.tablespace_size * 8192) / 1024 / 1024 / 1024, 2) AS size_gb
>> , round((t.tablespace_usedsize * 8192) / 1024 / 1024 / 1024, 2) AS used_gb
>> from dba_hist_snapshot s
>> JOIN dba_hist_tbspc_space_usage t
>> ON s.snap_id = t.snap_id
>> JOIN v$tablespace v
>> ON t.tablespace_id = v.ts#
>> )
>> select get_date
>> , sum(CASE WHEN size_gb_rank = 1 THEN size_gb ELSE 0 END) AS tot_size
>> , sum(CASE WHEN usedsize_gb_rank = 1 THEN used_gb ELSE 0 END) AS used_size
>> from group1
>> group by get_date
>> order by get_date;
>>
>> Thanks,
>>
>> Stephen
>>
>> From: Andrew Kerber <andrew.kerber_at_gmail.com>
>> To: Kim Berg Hansen <kibeha_at_gmail.com>
>> Cc: "<oracle-l_at_freelists.org>" <Oracle-L_at_freelists.org>
>> Sent: Thursday, March 12, 2015 7:24 AM
>> Subject: Re: Query help
>>
>> Hi Kim. Thanks for the input. We do not move data between tablespaces, so I wasn't even thinking about that. One of the items on my to-learn list is a decent class on analytic functions in Oracle. I'll probably be able to get to that once we implement the 30-hour day.
>>
>> Sent from my iPhone
>>
>> On Mar 12, 2015, at 8:35 AM, Kim Berg Hansen <kibeha_at_gmail.com> wrote:
>>
>>> Hi, Andrew
>>>
>>> Well, the simple way to get rid of WITH is to replace with an inline view:
>>>
>>> select get_date, sum(size_gb) tot_size, sum(used_gb) used_size
>>> from ( select to_char(s.begin_interval_time, 'yyyy-mm-dd') get_date
>>> , v.name ts_name
>>> , (round(max((t.tablespace_size * 8192)) / 1024 / 1024 / 1024, 2)) size_gb
>>> , (round(max((tablespace_usedsize * 8192)) / 1024 / 1024 / 1024, 2)) used_gb
>>> from v$tablespace v, dba_hist_snapshot s, dba_hist_tbspc_space_usage t
>>> where t.tablespace_id = v.ts# and t.snap_id = s.snap_id
>>> group by to_char(s.begin_interval_time, 'yyyy-mm-dd'), v.name) group1
>>> group by get_date
>>> order by get_date
>>> /
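>>>
>>> (As a side note on the arithmetic: tablespace_size and tablespace_usedsize are in database blocks, so the * 8192 converts blocks to bytes and the / 1024 / 1024 / 1024 converts bytes to GB - e.g. 1,310,720 blocks * 8192 = 10,737,418,240 bytes = 10 GB.)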
>>>
>>> But I guess that's not what you are really asking. You want to avoid doing TWO aggregations.
>>>
>>> If the inner aggregation had been a SUM and the outer aggregation a SUM of that sum, you could probably make do with just the outer aggregation - or with a ROLLUP, if you wished to keep the results of both the inner and the outer aggregation.
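>>>
>>> Just to show the shape (a sketch only - a SUM of usedsize across all of a day's snapshots is not a meaningful size figure, it merely illustrates how a partial ROLLUP keeps both grouping levels in one aggregation):
>>>
>>> select to_char(s.begin_interval_time, 'yyyy-mm-dd') get_date
>>>      , v.name ts_name -- NULL on the per-date ROLLUP total row
>>>      , round(sum(t.tablespace_usedsize * 8192) / 1024 / 1024 / 1024, 2) used_gb
>>>   from v$tablespace v, dba_hist_snapshot s, dba_hist_tbspc_space_usage t
>>>  where t.tablespace_id = v.ts# and t.snap_id = s.snap_id
>>>  group by to_char(s.begin_interval_time, 'yyyy-mm-dd'), rollup(v.name)
>>>  order by get_date, ts_name
>>> /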
>>>
>>> As it is, your inner aggregation uses MAX and the outer aggregation is a SUM. I don't see how that could be combined with ROLLUP or similar. When I nest aggregations like that, I do it the way you have done - with an inline view or a WITH clause.
>>>
>>> It's possible to nest an aggregate in an analytic function, like for example:
>>>
>>> select to_char(s.begin_interval_time, 'yyyy-mm-dd') get_date
>>> , v.name ts_name
>>> , (round(sum(max((t.tablespace_size * 8192)) / 1024 / 1024 / 1024) over (
>>> partition by to_char(s.begin_interval_time, 'yyyy-mm-dd')
>>> ), 2)) tot_size
>>> , (round(sum(max((tablespace_usedsize * 8192)) / 1024 / 1024 / 1024) over (
>>> partition by to_char(s.begin_interval_time, 'yyyy-mm-dd')
>>> ), 2)) used_size
>>> from v$tablespace v, dba_hist_snapshot s, dba_hist_tbspc_space_usage t
>>> where t.tablespace_id = v.ts# and t.snap_id = s.snap_id
>>> group by to_char(s.begin_interval_time, 'yyyy-mm-dd'), v.name
>>> order by get_date, ts_name
>>> /
>>>
>>> But that does not give the result you want: even though it calculates the correct tot_size and used_size, it keeps one row for each date/tablespace combination rather than aggregating to date level.
>>>
>>> I can't offhand think of a way to avoid two aggregations - as far as I can tell, two aggregations is simply the way to do what you want ;-)
>>>
>>>
>>> Do you actually want the maximum tablespace size for each tablespace per day? Or would you actually want the latest tablespace size for each tablespace per day?
>>>
>>> I mean, if during a day you move a lot of data from one tablespace to another, the max for the first tablespace will be a high value from before the move, and the max for the second tablespace will be a high value from after the move. The sum for the day will then be artificially high, as the moved data counts twice, won't it?
>>>
>>> You could utilize the KEEP functionality of aggregates in the inner aggregation to do something like this:
>>>
>>> select to_char(get_date, 'yyyy-mm-dd') the_date
>>> , round(sum(ts_size) * 8192 / 1024 / 1024 / 1024, 2) tot_gb
>>> , round(sum(ts_used) * 8192 / 1024 / 1024 / 1024, 2) used_gb
>>> from ( select trunc(s.begin_interval_time) get_date
>>> , max(t.tablespace_size) keep (
>>> dense_rank last order by s.begin_interval_time
>>> ) ts_size
>>> , max(tablespace_usedsize) keep (
>>> dense_rank last order by s.begin_interval_time
>>> ) ts_used
>>> from v$tablespace v, dba_hist_snapshot s, dba_hist_tbspc_space_usage t
>>> where t.tablespace_id = v.ts# and t.snap_id = s.snap_id
>>> group by trunc(s.begin_interval_time), v.name) group1
>>> group by get_date
>>> order by get_date
>>> /
>>>
>>> Using KEEP (DENSE_RANK LAST ORDER BY s.begin_interval_time) means that the MAX function operates only on those rows (within each group) that have the latest begin_interval_time. So that query finds not the highest values for a tablespace on a given day, but the latest.
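>>>
>>> A tiny self-contained illustration of the difference, with made-up numbers (runnable as-is):
>>>
>>> with snaps as (
>>>   select timestamp '2015-03-12 08:00:00' snap_time, 100 sz from dual
>>>   union all
>>>   select timestamp '2015-03-12 09:00:00' snap_time, 80 sz from dual
>>> )
>>> select max(sz) plain_max -- 100: the highest value
>>>      , max(sz) keep (dense_rank last order by snap_time) latest_sz -- 80: the value at the latest snap_time
>>>   from snaps
>>> /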
>>>
>>> Whether that is what you want is another question - perhaps both versions might actually be useful ;-)
>>>
>>>
>>> But I just don't think avoiding two aggregations is possible (at least not without some very tricky, weird SQL that would probably cause optimizer meltdown ;-)
>>>
>>>
>>>
>>> Regards
>>>
>>>
>>> Kim Berg Hansen
>>>
>>> http://dspsd.blogspot.com
>>> kibeha_at_gmail.com
>>> _at_kibeha
>>>
>>>
>>> On Thu, Mar 12, 2015 at 1:55 PM, Andrew Kerber <andrew.kerber_at_gmail.com> wrote:
>>> This is a learning experience for me. I wrote the query below to pull sizing trends at the database level. Note that it uses a WITH clause. I think it could be written as a single SELECT with a ROLLUP clause, or perhaps with another function I am not aware of. Any ideas?
>>>
>>> with group1 as (
>>> select to_char(s.begin_interval_time, 'yyyy-mm-dd') get_date,
>>> v.name ts_name,
>>> round(max(t.tablespace_size * 8192) / 1024 / 1024 / 1024, 2) size_gb,
>>> round(max(tablespace_usedsize * 8192) / 1024 / 1024 / 1024, 2) used_gb
>>> from v$tablespace v, dba_hist_snapshot s, dba_hist_tbspc_space_usage t
>>> where t.tablespace_id = v.ts#
>>> and t.snap_id = s.snap_id
>>> group by to_char(s.begin_interval_time, 'yyyy-mm-dd'), v.name)
>>> select get_date, sum(size_gb) tot_size, sum(used_gb) used_size
>>> from group1
>>> group by get_date
>>> order by get_date
>>>
>>> Sent from my iPad
>>> --
>>> http://www.freelists.org/webpage/oracle-l
--
http://www.freelists.org/webpage/oracle-l

Received on Thu Mar 12 2015 - 17:04:14 CET