Home » RDBMS Server » Performance Tuning » Bitmap Index and Histogram
Bitmap Index and Histogram [message #181798] |
Tue, 11 July 2006 10:07  |
srinivas4u2
Messages: 66 Registered: June 2005
|
Member |
|
|
Hi,
I have a question with regard to "Bitmap Index" and "Histograms". Let me explain my understanding of them first -
Bitmap indexes- Use them in low cardinality and with "and" and "or" predicates. Low cardinality is with refernce to the total number of rows in the table.
Histograms - Create histograms on tables where the data is skewed. Now here does skewed mean totally irrelevant values (of the column say sal) between the rows or does it mean one value being very very repetitive?
What I mean by irrelevant is say -
row1 - sal =1000
row2 - sal =10000000
row3 - sal =10
row4 - sal = 999999999999
I know that's a great sal to get...I am just saying for an eg!
And repetitive -
1000 rows haning sal as 100000 another 1000 having 50000 another 500 having 25000 etc.
And on what decision does histogram help the optimizer to make? Is it with respect to using an index or the kind of join?
If histograms are to be created based on irrrelevant values then I have no confusion with histograms and bitmap. But if histograms are based on repetitive values then Bitmap and Histogram seem to be closely related. That is, if it's an OLTP then we'd not create bitmap and use b-tree if needed with histogram and if it's not an OLTP then we create bitmap (due to dml)? And after creating bitmap do we need histogram?
Thanks,
Srinivas
|
|
|
|
|
|
|
|
Re: Bitmap Index and Histogram [message #182080 is a reply to message #182042] |
Wed, 12 July 2006 22:21   |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
1. If tha table has 100 distinct values equally distributed among the 1 million rows. Bitmap if used in conjunction with other columns.
2. If the table has 10 distinct values equally distributed among the 1 million rows. Bitmap
3. If the table has 100 distinct values with 2 distinct values forming 0.9 million rows and the rest 98 distinct values forming the remaining 0.1 million rows. Bitmap and histogram
4. If the table has 10 distinct values with 2 distinct values forming 0.4 million, 4 distinct values forming 0.4 million and the remaining 2 distinct values forming 0.2 million rows. Bitmap
5. If tha table has 2 (let's say m & f) distinct values with one (m) forming 0.9 million rows and the other (f) forming 0.1 million rows. Bitmap and histogram
6. If the table has 1M rows with 100,002 distinct values with 2 distinct values forming .5 million, 100,000 distinct values forming 0.5 million. B-tree and histogram
7. If the table has 1M rows distinct values evenly distributed. B-tree
100 is not a large number of distinct values for a bitmap index. 2500 is getting up there, but I would still be benchmarking it against a b-tree index on a large table for up to 10000 distinct values.
Indexed access to a table is ONLY efficient if it will allow you to identify less than between 1 and 10 percent of the rows. Somewhere between 1% and 10% - depending on your scenario - it becomes faster to do a full table scan.
Thats why a bitmap index needs to be used in conjunction with other bitmap indexes. On its own, it will typically return more than 1% and frequently more than 10%. You need to add further AND predicates to trim down the number of matching rows.
Histograms help here, because they can tell you whether the values you used in your SQL are skewed. If so, then the CBO will be able to ignore that index REGARDLESS of whether it is bitmap or b-tree.
Ross Leishman
|
|
|
|
|
|
Goto Forum:
Current Time: Fri May 02 12:19:12 CDT 2025
|