Re: That crazy CBO.....

From: Dan Tow <dantow_at_singingsql.com>
Date: Wed, 27 Oct 2004 16:43:38 -0500
Message-ID: <1098913418.4180168a4bb05@www.singingsql.com>

I can't say why you're getting diferent stats for the two tests, but I have some general advice how you might do better than *either*:

You appear to have an example of a special case I see sometimes:

A table is laid out with one row per range of values, where the ranges of values are non-overlapping and (usually) inclusive, meaning they leave no "holes" in the midst of the ranges if the ranges are laid out end-to-end, so to speak. The most common examples of this involve date ranges, but you seem to have a case for address ranges. Thus, to point to a row, you say

:x between lowcol and highcol

or (where on ranges highcol is the next range's lowcol)

(:x > lowcol and :x <= highcol).

Most commonly, the designer chooses to create an index on (lowcol, highcol), but sometimes the order is reversed, or just one of these is indexed.

The problem is that any such index strategy means that the index range scan to find the single range that meets the condition is likely to have to read about *half* the index, if you ask for a range in the middle of the whole set of ranges. (With date ranges, you can often escape most of the dilemma because applications tend to ask for the most-recent range, or one of the most-recent ranges, anyway, so an index on *highcol* will see a very short range scan, usually.)

Date or Codd (can't recall which, off-hand) discussed a handy set of comparators, greatest-less-than, least-greater-than, greatest-less-than-or-equal-to, least-greater-than-or-equal-to. With these available, you'd just ask for an index range scan with

highcol least-greater-than-or-equal-to :x

and the database would find the very first (least) value of highcol in an index range scan on a conventional index of highcol, then would quit, because that's the row you want. The cost, if you look at the I/O, would be no different than an index unique scan!

Sadly, I've never found a database that has implemented this lovely concept. Instead, the best I've found is the following admittedly hacky solution:

(Forewarning: the following solution may offend the relational purists among you. Yes, I *know* it's a hack.) Let's say the index on highcol is called highcolind. If you want to quickly find the row you're after, the query you need is

select /*+ index(t highcolind) */ <whatever> from rangetable t
where highcol >= :x
and lowcol <= :x
and rownum=1

Here, the database won't keep searching for more ranges that might enclose :x once it finds the first one, and the range scan on highcolind fidns the first one in the first index entry it sees, as long as the database follows the hint! (If it fails to follow the hint, performance will suffer, clearly, but the database will still return the right row, since it will keep looking until it satisfies both conditions on :x. This will *not* find multiple ranges that include :x, though, if the table contains overlapping ranges - I have no easy answer of finding multiple ranges if your ranges overlap. If your ranges have holes, it will also have to do the entire range scan from :x to the end of the range before it will conclude that no range contains the particular value of :x you're looking with - that's why no-holes helps.)

To build this into a more-complex query, you'd do something like

Select /*+ ordered <maybe other hints> */ <whatever> from (select /*+ index(t highcolind) */ *

        from rangetable t
       where highcol >= :x
         and lowcol <= :x
         and rownum=1) t, taba a, tabb b, ...

where t.fkey=a.pkey
...

In your particular case, you should have to do probably 1-3 I/Os to reach the rowid you want from AIA_INDX_PR01, assuming that index leads with AIA_IP_ADDRESS_END, then a single I/O for the table row, then about the same number to reach the row you need in ADDS_USERS, probably no more than 8 I/Os, likely all of them cached. However, the *biggest* win isn't going to be the number of logical I/Os! I have found that I/Os to do the sort of large index range scan you're liely getting, now, are *way* more expensive than normal logical I/Os, because they have to do the work of examining every entry in each leaf block they hit, which is much more expensive than a logical I/O to read a single row or to reach a single rowid in a UNIQUE scan.

Hope it helps.

Thanks,

Dan Tow
650-858-1557
www.singingsql.com

Quoting "Bobak, Mark" <Mark.Bobak_at_il.proquest.com>:

> Ok, I'm a bit at a loss to explain this.....
>
> I have two tables, ADDS_USERS, which has AUSR_ID as the primary key, and
> AUTHORIZED_IP_ADDRESSES.
>
> Now, my original query looks like this:
> SELECT A.AUSR_LOGIN_SCREEN_NAME FROM ADDS_USERS A,
> AUTHORIZED_IP_ADDRESSES B
> WHERE A.AUSR_ID =3D B.AUSR_ID AND :B1 BETWEEN B.AIA_IP_ADDRESS_START AND
> B.AIA_IP_ADDRESS_END;
>
> and when run w/ autotrace set to traceonly, produces the following
> output:
>
> Execution Plan
> ----------------------------------------------------------
> 0 SELECT STATEMENT Optimizer=3DCHOOSE (Cost=3D13 Card=3D110 =
> Bytes=3D46
> 20)
>
> 1 0 NESTED LOOPS (Cost=3D13 Card=3D110 Bytes=3D4620)
> 2 1 INDEX (RANGE SCAN) OF 'AIA_INDX_PR01' (NON-UNIQUE) (Cost
> =3D4 Card=3D110 Bytes=3D3080)
>
> 3 1 TABLE ACCESS (BY INDEX ROWID) OF 'ADDS_USERS' (Cost=3D2 Ca
> rd=3D1 Bytes=3D14)
>
> 4 3 INDEX (UNIQUE SCAN) OF 'AUSR_PK' (UNIQUE)
>
>
>
>
> Statistics
> ----------------------------------------------------------
> 0 recursive calls
> 0 db block gets
> 1321 consistent gets
> 864 physical reads
> 0 redo size
> 513 bytes sent via SQL*Net to client
> 652 bytes received via SQL*Net from client
> 2 SQL*Net roundtrips to/from client
> 0 sorts (memory)
> 0 sorts (disk)
> 1 rows processed
>
> Now, that plan looks quite reasonable, but, 1321 buffer gets is too
> many, not to mention the physical I/O. This is a very frequently called
> SQL statement.
>
> So, I tried:
> re-creating ADDS_USERS ordered by AUSR_ID, to improve AUSR_PK index
> clustering factor. No use.
> creating AUTHORIZED_IP_ADDRESSES as an IOT. Also useless.
>
> So, I went back to SQL hacking, and finally came up with the somewhat
> odd looking, but effective:
> SELECT (select A.AUSR_LOGIN_SCREEN_NAME FROM ADDS_USERS A where
> a.ausr_id =3D b.ausr_id) from AUTHORIZED_IP_ADDRESSES B
> WHERE :B1 BETWEEN B.AIA_IP_ADDRESS_START AND B.AIA_IP_ADDRESS_END;
>
> which produces an output like:
> Execution Plan
> ----------------------------------------------------------
> 0 SELECT STATEMENT Optimizer=3DCHOOSE (Cost=3D2 Card=3D110 =
> Bytes=3D308
> 0)
>
> 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'ADDS_USERS' (Cost=3D2 Card
> =3D1 Bytes=3D14)
>
> 2 1 INDEX (UNIQUE SCAN) OF 'AUSR_PK' (UNIQUE) (Cost=3D1 =
> Card=3D1
> )
>
> 3 0 INDEX (RANGE SCAN) OF 'AIA_INDX_PR01' (NON-UNIQUE) (Cost=3D4
> Card=3D110 Bytes=3D3080)
>
>
>
>
>
> Statistics
> ----------------------------------------------------------
> 0 recursive calls
> 0 db block gets
> 67 consistent gets
> 0 physical reads
> 0 redo size
> 562 bytes sent via SQL*Net to client
> 652 bytes received via SQL*Net from client
> 2 SQL*Net roundtrips to/from client
> 0 sorts (memory)
> 0 sorts (disk)
> 1 rows processed
>
> Now, that's a bit odd looking plan, but, look at those consistent gets!
> Down to 67 from 1321!
>
> So, I guess what I'm wondering is why this re-write is so effective?
> I'm happy that I found a solution, but, I'm curious what's going on
> here. Intuitively, I'm thinking that all I'm doing with the re-write is
> forcing the optimizer to do what it seems to claim to be already doing
> with the original version . With the scalar subquery, it seems the
> nested loop join is implicit in the select statement.
>
> Well, anyhow, I thought this was pretty odd.
>
> Thoughts, anyone?
>
> -Mark
> --
> Mark J. Bobak
> Oracle DBA
> ProQuest Company
> Ann Arbor, MI
> "On two occasions, I have been asked [by members of Parliament], "Pray,
> Mr. Babbage, if you put into the machine wrong figures, will the right
> answers come out?' I am not able to rightly apprehend the kind of
> confusion of ideas that could provoke such a question."
> -- Charles Babbage (1791-1871)
>
> --
> http://www.freelists.org/webpage/oracle-l
>

--
http://www.freelists.org/webpage/oracle-l

Received on Wed Oct 27 2004 - 16:39:34 CDT