Re: Estimating effectivity of performance improvement measures

From: Robert Klemme <shortcutter_at_googlemail.com>
Date: Wed, 26 Nov 2008 08:35:12 -0800 (PST)
Message-ID: <6a6127e1-32b7-48ed-8d58-3690c59e38d0@a12g2000yqm.googlegroups.com>


On Nov 26, 4:59 pm, Steve Howard <stevedhow..._at_gmail.com> wrote:
> On Nov 26, 5:45 am, Robert Klemme <shortcut..._at_googlemail.com> wrote:

Hi Steve,

thanks for sharing your thoughts!

> > My idea so far is this: on the test database recompile all indexes and
> > maybe reorganize all tables to get all into a defined shape.  Then
> > look at the sum of the number of blocks all involved objects use up
> > per solution implemented.  Pick the one with the smallest number.
>
> I actually think this is a bad idea unless you plan to do the same
> thing on production.  When you say this is a "copy of production", is
> the test database a block for block copy of production, i.e., a
> storage clone or RMAN restore?

Yes, the copy is a restored RMAN backup.

>  If so, I would test against that, once
> again, unless you plan on changing the indexes in production.

I am not sure what you are suggesting. I *did* test against this database, as I have tried to explain. However, the test database contains new tables and indexes which I created as part of my experiments, and there are two issues with that: first, freshly created tables and indexes have a different structure than "grown" ones (i.e. objects which have undergone insert/update/delete activity); second, the test database does not carry any load other than my tests. My posting was aimed at finding ways to overcome these limitations.
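
For reference, the kind of comparison I had in mind looks roughly like this (schema and object names are made up, of course):

  -- Sum the blocks used by all segments belonging to one solution
  -- variant; run once per variant and pick the smallest result.
  SELECT SUM(blocks) AS total_blocks
    FROM dba_segments
   WHERE owner = 'APP'                                  -- hypothetical schema
     AND segment_name IN ('ORDERS', 'ORDERS_HIST_IDX'); -- one variant's objects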

> For the physical stuff you can probably guesstimate the difference.  If
> your test database gets 10ms per single block read and your production
> database gets 7ms, then you can extrapolate time differences for the
> physical I/O.

That's not my main concern. My concern is rather (again, as I have written) the actual number and distribution of physical IOs, which is determined by query plans, concurrent activity and object structure.
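
To illustrate what I mean by "number and distribution": on production I can look at per-statement physical IO with something like the following (the text filter is made up):

  -- Physical reads per statement; many disk_reads over few executions
  -- hints at plans or object structures that do not cache well.
  SELECT sql_id,
         executions,
         disk_reads,
         buffer_gets,
         ROUND(disk_reads / NULLIF(executions, 0), 1) AS reads_per_exec
    FROM v$sql
   WHERE sql_text LIKE '%ORDER_HISTORY%'
   ORDER BY disk_reads DESC;

On the test database these figures necessarily come out different, because both the load and the object structure differ.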

> > This is even more the case as some of these tables contain historic
> > data, i.e. multiple values with different time ranges of validity,
> > which makes a certain amount of data dead - from _this_ query's point
> > of view.  So the way this "dead wood" is distributed across
> > objects may have a significant impact (for example when using
> > timestamps as leading columns for an index vs. trailing columns or
> > having them in the table only).
>
> How much has this changed since you refreshed test?  My goal would be
> to have test as close as possible to production, and identify any
> differences such as physical I/O access time and account for that.

There's not too much change (the copy is probably one week old now) compared to the overall history of the database, so the ongoing changes in the production database do not cause me headaches.

"As close as possible to production" in my case means, "with a realistic load" - not necessarily with the exact same data.

You seem to be focusing on the physical timing of individual queries, while I am searching for a way to predict (as well as possible given the constraints) the behavior of a change once it is in production. In other words, I am trying to solve a more complex problem than single query behavior. I know that I won't get an exact forecast, but I am trying to find a way to predict with a certain level of confidence that a particular change will bring benefits. :-)
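
If both databases were on 11g, Database Replay might be one way to get that realistic load onto the test copy: capture the production workload and replay it against test before and after a change. A minimal sketch of the capture side, assuming a directory object CAPTURE_DIR exists:

  BEGIN
    -- Record one hour (3600 s) of real production load; the replay
    -- side would then use DBMS_WORKLOAD_REPLAY against the test copy.
    DBMS_WORKLOAD_CAPTURE.START_CAPTURE(
      name     => 'perf_change_baseline',
      dir      => 'CAPTURE_DIR',
      duration => 3600);
  END;
  /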

Kind regards

robert
