Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: indexing a column with only 2 or 3 values

Re: indexing a column with only 2 or 3 values

From: Richard Foote <richard.foote_at_bigpond.nospam.com>
Date: Fri, 03 Aug 2007 14:13:39 GMT
Message-ID: <nKGsi.15619$4A1.6631@news-server.bigpond.net.au>


"Richard Foote" <richard.foote_at_bigpond.nospam.com> wrote in message news:pCGsi.15616$4A1.14750_at_news-server.bigpond.net.au...
> "Thomas Kellerer" <JUAXQOSZFGQQ_at_spammotel.com> wrote in message
> news:5hgnqqF3l1n69U1_at_mid.individual.net...

>> Richard Foote wrote:
>>> "Thomas Kellerer" <JUAXQOSZFGQQ_at_spammotel.com> wrote in message 
>>> news:5hg5aiF3kia4nU1_at_mid.individual.net...
>>>> ciapecki wrote:
>>>>> Hi,
>>>>>
>>>>> Does indexing a very big table (about 5Mio records) on the columnA
>>>>> which can hold only values Y,N,<NULL> make sense?
>>>>>
>>>> Yes, that's what bitmap indexes were made for.
>>>>
>>>
>>> Hi Thomas
>>>
>>> A Bitmap index is of no use if both Y and N are roughly evenly 
>>> distributed and you have no other predicate in the query.
>>>
>>> Returning approximately 2.5 millions rows through a bitmap index would 
>>> be dramatically slower than a full table scan.
>>>
>>> A single bitmap index only would be useless in this scenario, even more 
>>> so if the table is subjected to any transactional based DML load.
>>>
>>
>> Hi Richard,
>>
>> thanks for the pointing this out.
>> I wasn't aware of that, but it does sound reasonable.
>>
>> But after all the Concepts manual says:
>>
>> "If the number of distinct values of a column is less than 1% of the
>> number of rows in the table, or if the values in a column are repeated
>> more than 100 times, then the column is a candidate for a bitmap index."
>>
>> Actually a bit further down in the Concepts manual there is an example 
>> very similar to the OP's situation:
>>
>> "There are only three possible values for marital status and region, two 
>> possible values for gender, and four for income level. Therefore, it is 
>> appropriate to create bitmap indexes on these columns"
>>
>

> Hi Thomas
>

> There are a number of classic myths associated with bitmap indexes and
> yes, Oracle is as guilty as anyone in propagating them. One is that bitmap
> indexes are only useful for low cardinality columns. The above definition
> is better than many I've read but it's still one of those rules of thumbs
> that is not entirely accurate as column values outside of the definition
> could possibly be candidates for a bitmap index.
>

> The other classic myth is that a *single* bitmap index on a very low
> cardinality column (as in the OPs example) can be very efficiently
> utilised to retrieve the required number of rows. But if this means
> retrieving 50% or 33% or 25% etc of all rows in a huge table, then it not
> going to be very efficient at all when compared to the poor old full table
> scan.
>

> Note in the above example, it mentions 4 different columns, not one
> column. These four columns when *combined* could possibly reduce the final
> result set to a small enough subset of required rows that would make
> retrieving them one at a time through the rowids a possibly attractive
> option. For example, there may not be that many single males that live in
> Canberra that have a really low income ...
>

> At the end of the day, it comes back to the overall selectivity of
> predicates and can combinations of bitmaps when and/or/not together
> produce a small enough set of rowids to make it all worthwhile to read the
> bitmap blocks, perform the set logic and retrieve the resultant rowids one
> at a time when compared with other alternatives (such as the full table
> scan).
>

> The answer is almost certainly a big no for a single bitmap index on a low
> cardinality column.

>

Hi Thomas

Ops, forgot to highlight my other original point that's ignored in the Concept recommendations you quoted in that bitmap indexes are going to be problematic in a transactional environment, primarily due to the likely locking and contention issues that would result. Although improved in latter releases, bitmap indexes can also become structurally less efficient under DML load. Therefore Bitmap indexes and transactional systems do not mix well at all ...

Cheers

Richard Received on Fri Aug 03 2007 - 09:13:39 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US