List vs Range Partitioning [message #166314] |
Wed, 05 April 2006 12:28 |
a_developer
Messages: 194 Registered: January 2006
|
Senior Member |
|
|
I have a table which I can partition using Range OR List. However, I cannot quite decide which to use and quite confused about my findings. When I made the table to be RANGE partitioned, the INSERT is faster than when using LIST partitioning. But when I gather the table's stats, the LIST is faster than the RANGE partitioned. Why is it so?? I could not find much documents about the performance differences between List and RANGE partitioning..
thanks in advance..
|
|
|
|
Re: List vs Range Partitioning [message #166359 is a reply to message #166322] |
Wed, 05 April 2006 21:42 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
As Tom says in the link supplied above, one of the factors is how you use the data:
VAL = :a - doesn't matter
VAL IN (:a, :b, :c) - doesn't matter
VAL < :a - Range is better
A word of warning though: range predicates with scalar sub-queries will not use partition-pruning: eg.
WHERE VAL < (select max(val) FROM ...)
However IN-subqueries DO user partition pruning for LIST partitions: eg.
WHERE VAL IN (SELECT val FROM ...)
Ross Leishman
|
|
|
Re: List vs Range Partitioning [message #166524 is a reply to message #166359] |
Thu, 06 April 2006 18:44 |
a_developer
Messages: 194 Registered: January 2006
|
Senior Member |
|
|
Thanks, but the problem is during insert/gather_stats to the table. I am not at all querying from the tables.
For example, List_tab is partitioned by list. Range_tab is partitioned by range. They have exactly the same columns, indexes and partition keys. My process truncates both these 2 tables and inserts into them exactly the same set of records. After the insert, I get both their stats. During the insert, List is faster. But during the Gather_table_stats, the Range is faster. Please note that I am inserting using 'INSERT INTO VALUES' and not 'INSERT INTO SELECT AS' - though I don't think this info is relevant..
[Updated on: Thu, 06 April 2006 18:45] Report message to a moderator
|
|
|
|
|
Re: List vs Range Partitioning [message #166830 is a reply to message #166644] |
Sun, 09 April 2006 21:59 |
a_developer
Messages: 194 Registered: January 2006
|
Senior Member |
|
|
Ross, you're right. The timing differences is not much - just 30 minutes for gathering stats, and 15 minutes during insert. The number of rows is 43M.
Mahesh, the stmt is:
dbms_stats.gather_table_stats(ownname => user, tabname => v_table , estimate_percent => 20,method_opt => 'for all indexed columns size auto',degree => 4 ,cascade => true);
We're using 10g.
I am just curious why the difference.. Is this acceptable for a 43M rows of data? If so, is it correct to say that there should be no 'significant' difference when it comes to INSERT and 'GATHER_TABLE_STATS' between a LIST and RANGE partitioned tables?
[Updated on: Sun, 09 April 2006 21:59] Report message to a moderator
|
|
|
Re: List vs Range Partitioning [message #166836 is a reply to message #166830] |
Sun, 09 April 2006 23:07 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
Before I admitted there was a difference, I would want to:
- Isolate the hardware and the network during benchmarking - no other jobs, no other users.
- Ensure that there are no indexes on either target.
- Run each job in isolation at least 5 times, discarding the shortest and longest, and averaging the remaining times.
- Ensure that only one type of disk is used for storage. If there are varying grades, then more of one table may be stored on lower grade disk.
- If the jobs are run as INSERT INTO (rather than SQL*Loader), then disable the parallel server from the job. If the source partitioning is compatible with the target for one of the jobs, then Oracle might optimise the paralellism differently.
- Ensure both the LIST and RANGE options have the same number of partitions with the same distribution of data across partitions.
Logically, the only difference between LIST and RANGE for INSERTS is the time it takes Oracle to find the right partition for each row. If there are only a few partitions, this would be negligible. If there are lots, then some partitions might be faster to find than others (just guessing here). So if the distribution is skewed (ie. some partitions have few/no rows, some have heaps more than the average) then it is possible that by dumb luck the popular partitions are more easily found for RANGE.
If you're still skeptical, try tracing both and running TK*Prof over the trace file. If the DISK, QUERY, and CURRENT figures are a lot less for RANGE, then there is something about RANGE that faster for INSERT/analyze. If not, then it will be the result of resource contention or parallelism.
Ross Leishman
|
|
|