Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Re: What is a good blocksize to use.
Comments at end
-- Regards Jonathan Lewis http://www.jlcomp.demon.co.uk Next Seminar dates: (see http://www.jlcomp.demon.co.uk/seminar.html ) ____USA__________November 7/9 (MI), 19/21 (TX) ____England______November 12/14 The Co-operative Oracle Users' FAQ http://www.jlcomp.demon.co.uk/faq/ind_faq.html ben brugman wrote in message ...Received on Mon Sep 30 2002 - 02:33:18 CDT
>
>"Jonathan Lewis" <jonathan_at_jlcomp.demon.co.uk> wrote in message
>news:an3l6q$iob$1$8300dec7_at_news.demon.co.uk...
>>
>> Given the details that Ben Brugman has supplied
>> so far for this particular case I __suspect__ that an index
>> structure (any index structure, not just an IOT) on the
>> client_id is going to degenerate quite badly because
>> of the data split across clients. It is likely (but something
>> to check) that a significant number of leaf blocks will take
>> a 50/50 split and the lower block will remain half empty.
>> But it's the nature of the data - combined with the design
>> of the feature - that introduces the side-effect.
>
>Maintenance on B-tree's is relatively low. Only a few percent
>of all diskaccesses are needed for maintenance. On growing
>datastructures, where the data is inserted 'fairly' random, on
>average each leave and each node (not the root) will be filled
>for 75 %. On a tree which has grown to 5000 blocks there have
>been 4999 splits. This is not to bad. Even for a high changing
>index the maintenance is only a few percent.
>
In this case, the maintenance aspect is pretty irrelevant. Your comment about 'fairly random' inserts leading to 75% efficiency is correct. However, your data does not have a random distribution, which is why I made the comments I did. You have a very large number of client ids (type A, say) with 3 or 4 rows, 10% have several hundred rows (type B), and a few have row counts in the hundreds of thousands (type C). In your case, therefore, there is a significant risk that the arrival of a type B client will cause a leaf block split at the usual 50/50 location. It is then less likely than average that future data (i.e for the few type A client sharing the lower block) will fill the lower data block, and the only data that could go into the higher block will be data for that one specific type B client.