Home » RDBMS Server » Performance Tuning » Subpartitions number (Oracle 10.2G)
icon1.gif  Subpartitions number [message #380843] Tue, 13 January 2009 09:48 Go to next message
wakula
Messages: 150
Registered: February 2008
Location: Poland
Senior Member
Hi,

I was wondering whether there are any drawbacks of creating large number of subpartitions.
I have a partitioned table (partitioned by date thus it seems that I can only use hash subpartition template) and I wish to divide it into 64 or more partitions. Is there any negative impact of such subpartitioning?
Re: Subpartitions number [message #380850 is a reply to message #380843] Tue, 13 January 2009 10:31 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
Have you 64 or more hard disks to support your subpartitioning?
Have you many contention on hot blocks?
Why do you want to subpartition?

Regards
Michel
Re: Subpartitions number [message #380865 is a reply to message #380843] Tue, 13 January 2009 11:15 Go to previous messageGo to next message
wakula
Messages: 150
Registered: February 2008
Location: Poland
Senior Member
The goal is to split the table into smaller chunks - without changing its name or partitioning schema (new partitions are added/dropped by some other code).

I have big table (average 20 000 000 rows per partition) partitioned by date column. That partition holds about 100 sets of data. Queries are only for a single set of data.
CREATE TABLE my_table
(col_1  DATE
,col_2  NUMBER -- FOREIGN KEY
,col_3  NUMBER
,target VARCHAR2(1024)
);

Partitioned by col_1.
Example (simplified) query:
SELECT something
FROM   other_table
WHERE  other_table.target IN
(
   SELECT target
   FROM   my_table
   WHERE  col_1 = to_date('2009-01-13','YYYY-MM-DD')
   AND    col_2 = :variable_1
   AND    col_3 > :variable_2
)

If I could - I would create separate tables for every col_2 value. Right now I seek how to split that existing table into smaller chunks (using col_2) without changes in the schema (there is some code that rely on that table).

I am not sure about the hard drives - probably come RAID matrix.


Additional question:
The query is executed using DBMS_SQL.EXECUTE. We had to hard-code that to_date(...) into the query statement - full table scan was performed when it was specified as a variable. What could have been the reason?

I would be grateful for any hints/answers.
Re: Subpartitions number [message #380868 is a reply to message #380865] Tue, 13 January 2009 11:33 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
Quote:
I would be grateful for any hints/answers.

Don't touch anything.

Regards
Michel
Re: Subpartitions number [message #380881 is a reply to message #380843] Tue, 13 January 2009 14:36 Go to previous messageGo to next message
wakula
Messages: 150
Registered: February 2008
Location: Poland
Senior Member
Michel,
Should I understand that I would get no benefit from dividing that table into subpartitions?
Maybe this is worth mentioning that the table is used only by my PL/SQL code and the some maintenance scripts that I have mentioned already (responsible for partitions adding/dropping).
Re: Subpartitions number [message #380908 is a reply to message #380881] Wed, 14 January 2009 00:25 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
Quote:
Should I understand that I would get no benefit from dividing that table into subpartitions?

Most likely.

Regards
Michel
Re: Subpartitions number [message #380920 is a reply to message #380908] Wed, 14 January 2009 01:31 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
I disagree with Michel. If you have 20M rows per partiton and you typically perform SQLs that scan 0.5M - 10M rows, then further partitioning will help full partition scans.

First of all, are your current partitions DAILY, or are they some larger grain (monthly)? If they are bigger than daily and most of your queries are for a single day or a VERY small range of days, then make your partitions DAILY.

If you have partitioned to the grain of your queries (ie. daily queries and daily partitions) and you still have 20M rows per partition, then sub-partitions can help queries that include that column. Especially if you can reduce a 20M row scan down to say 1M rows.

If you have 100 distinct values that are EVENLY DISTRIBUTED, then I would suggest 5-10 hash sub-partitions.

If the values are SKEWED, you should use LIST based sub-partitions to get the most populous values in their own sub-partitions.

If the values are HEAVILY SKEWED (>90% with a single value) then there will be no benefit to sub-partitioning. You would be better to index the column; this will provide indexed access for uncommon values and full scans for the common value.


The one thing you have to watch out for when you create lots of partitions is LOCALLY PARTITIONED INDEXES.

If you have 100 partitions and 10 sub-partitions, that's 1000 index segments. If you perform a query that does NOT include the partition key, then Oracle will need to perform 1000 separate range scans.

This is not a problem with BITMAP indexes, nor is it a problem when you fetch thousands of rows (ie. long range scan or full scan). It is only problematic when you fetch a small number of rows without providing the partition key in the SQL.


Note that this is a NON-STANDARD implementation of sub-partitons. The best reasons for sub-partitioning are partition-wise joins and parallel full table scans. If you have NO NEED WHATSOEVER for these popular features, you will be OK. However if these are a factor for you, then they may be at odd with the partition-pruning approach that you are currently taking.

Ross Leishman
Re: Subpartitions number [message #380922 is a reply to message #380920] Wed, 14 January 2009 01:45 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
What benefit could you have if you only have 1 disk?
Just more contention.
What if the statement is not parallelizable?
What if the workload is 100%CPU yet?

Regards
Michel

[Updated on: Wed, 14 January 2009 01:47]

Report message to a moderator

Re: Subpartitions number [message #380931 is a reply to message #380843] Wed, 14 January 2009 03:03 Go to previous messageGo to next message
wakula
Messages: 150
Registered: February 2008
Location: Poland
Senior Member
Data distribution:
The queries are almost evenly distributed. Some data sets contains 400'000 rows while some other 50'000. Since this is about 1% of the partition - I guess that it would qualify to "small numer of rows". Data are inserted daily - so the insertion order should match partitioning key order.

Partitioning:
Partitions are created on a monthly basis.
We have 3 types of queries:
1) Query that is executed daily on 1 day of data
2) Query that is executed weekly on 7 days of data
3) Query is executed usually on 28 days of data (usually that would target 2 partitions).
Dividing the data into 1 day partitions is tempting (especially for the first group of queries).
Once in a month we drop oldest partitions and add a new one for upcoming month. This is performed by a shell/sqlplus script executed from cron (yeah - I know that it is not the best approach).

Indexes:
Indexes on almost all tables ale locally partitioned. Queries are always performed using the partitioning key (the key is always a date, the query statement has always the date range embedded into it as a string instead of variable) and some extra columns. Subpartitions would be created on hash of the "extra columns" used in the query. Local indexes are created on date and other columns. Date should be always the first query after WHERE keyword (just in case since Oracle performs the query from left to right).

Server stats:
I am not sure about the stats but some time ago the server used only 1 to 3 CPU's while processing, up to 16 GB of RAM, some swap and main bottleneck was I/O. Not sure about the disk (it is mounted using NFS) - it might be a single disk but most probably this is a RAID with 4 or more drives. The bottleneck for my queries is I/O but the table is partitioned and has local indexes. Single (huge) query should fetch/aggregate up to 2M rows at once from different tables - it takes seconds to minutes to do that. Adding small subquery (to the table that I was asking about) to that long statement would add 50k-400k rows to fetch (without aggregation) and increase processing time to minutes/hours.

Comment:
Until now I was thinking that it is not possible to use range/index partitioning/subpartitioning with a subpartitioning template. Creating the subpartitioning manually could make the logic too complex. Could you confirm that it is possible to use range/index using subpartition template?

[Updated on: Wed, 14 January 2009 04:05]

Report message to a moderator

Re: Subpartitions number [message #381205 is a reply to message #380922] Thu, 15 January 2009 14:46 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Michel Cadot wrote on Wed, 14 January 2009 18:45
What benefit could you have if you only have 1 disk?
Just more contention.
What if the statement is not parallelizable?
What if the workload is 100%CPU yet?

Regards
Michel


I don't understand the importance of CPU and number of disks to Partition Pruning. If you can partition your data in such a way that your query scans less data, it will be faster. Right?

Suggest you try daily partitioning on your test system. Instead of creating 1 partition per month, create 28-31 of them.

You may find then that IO is reduced to an extent that you are happy to live with the few that are not significantly improved.

If you are going down the sub-partitioning path, I would go with List Partitioning on a single column since it seems your data is somewhat skewed. The syntax is pretty straightforward - just read the manual. Make sure you do lots of prototyping - this is a pretty serious step that you shouldn't be rushing into production.

Ross Leishman
Re: Subpartitions number [message #381260 is a reply to message #381205] Fri, 16 January 2009 00:53 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
Quote:
I don't understand the importance of CPU and number of disks to Partition Pruning

OP was talking about hash (sub)partitioning, there is no partition pruning with hash partitioning just spreading IO (and CPU work if parallel query).

Regards
Michel

Re: Subpartitions number [message #381279 is a reply to message #381260] Fri, 16 January 2009 02:21 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Michel Cadot wrote on Fri, 16 January 2009 17:53
Quote:
I don't understand the importance of CPU and number of disks to Partition Pruning

OP was talking about hash (sub)partitioning, there is no partition pruning with hash partitioning just spreading IO (and CPU work if parallel query).

Regards
Michel



Oracle Data Warehousing Guide 10.2
Oracle Database prunes partitions when you use range, LIKE, equality, and IN-list predicates on the range or list partitioning columns, and when you use equality and IN-list predicates on the hash partitioning columns.

On composite partitioned objects, Oracle can prune at both the range partition level and at the hash or list subpartition level using the relevant predicates.
Re: Subpartitions number [message #381283 is a reply to message #381279] Fri, 16 January 2009 02:33 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
Thanks for the correction.

Regards
Michel
icon3.gif  Re: Subpartitions number [message #381304 is a reply to message #380843] Fri, 16 January 2009 04:00 Go to previous messageGo to next message
wakula
Messages: 150
Registered: February 2008
Location: Poland
Senior Member
Thanks for your replies!

I was thinking about range/hash sub-partitioning because I can set a subpartition template. That would be simple and would require no change to the existing code. Such change would be a "database administration" task (since it would be transparent to the existing code).
I agree that range/list would be better - however that require a manual (or semi-automated) definition for every subpartition thus would require code changes. That change would be "development" task.

However I might be wrong (I found that the subpartitioning documentation is not as complete as I would like).
Please correct me if I am wrong, but:
- With range/list subpartitioning I need to create every subpartition and specify its name on my own. That require a change in current script called "AddPartition.sh" (yes - that is a shell script) so that it adds subpartitions (and probably partitions/subpartitions should be created daily instead of monthly).
- With range/hash I can define the number of subpartitions and suffixes that are automatically added to the subpartitions. Thus subpartition template would create the monthly (or daily) subpartitions whenever a partition is created (subpartition name is <partition name>_<subpartition template name>). I have also noticed that the subpartition name is limited in length to 30 characters (if I specify 15 characters for subpartition template name and 15 characters for partition name then exception is thrown).
Re: Subpartitions number [message #381628 is a reply to message #381304] Mon, 19 January 2009 00:39 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
When all else fails, try it out:

SQL> CREATE TABLE sptest (
  2      a number
  3  ,   b number
  4  ,   c number
  5  )
  6  PARTITION BY RANGE (a)
  7  SUBPARTITION BY LIST (b)
  8  SUBPARTITION TEMPLATE (
  9      SUBPARTITION S1 VALUES (1)
 10  ,   SUBPARTITION S2 VALUES (2)
 11  ,   SUBPARTITION DF VALUES (DEFAULT)
 12  )
 13  (
 14          PARTITION VALUES LESS THAN (1)
 15  ,       PARTITION VALUES LESS THAN (2)
 16  )
 17  /

Table created.

SQL>
SQL> ALTER TABLE sptest
  2  ADD PARTITION VALUES LESS THAN (3)
  3  /

Table altered.


The documentation is at http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/statements_7002.htm#i2146320

Ross Leishman
Re: Subpartitions number [message #388790 is a reply to message #380843] Thu, 26 February 2009 04:13 Go to previous message
wakula
Messages: 150
Registered: February 2008
Location: Poland
Senior Member
Hello again,

I have been doing some experiments with the subpartitions.
Although I don't have as much knowledge about that subject, as I would like to (my company is not going to "waste" money on trainings and they want cheap stuff instead - sadly this is the approach of most companies nowadays) - I will try to share my experiences with anyone who might read this thread.

When I have created multiple partitions/subpartitions (total about 365 partitions with 16 subpartitions in each partition) then my database went down. The cause was disk space usage - too many extents were reserved. The database had grown a lot (initial size was about 4GiB). NOTE: the partitions/subpartitions were empty. Used Oracle was 10.2.

Right now I'm going to use bitmap indexes instead (not sure what will be the disk space usage though.
My new DB structure is going to look like (the code bellow could contain some syntax misspelling):
CREATE TABLE a_lot_of_data (EventDate DATE, CompanyID INTEGER, more_columns SOMETHING) PARTITION BY RANGE (EventDate);

There is 1 partition created every month (I am not sure if the duration is not too long thoug).
Then indexes:
CREATE BITMAP INDEX a_lot_of_data_date ON a_lot_of_data(EventDate);
CREATE BITMAP INDEX a_lot_of_data_company ON a_lot_of_data(companyid);

And code hints to use:
SELECT /*+ hint INDEX_COMBINE(a_lot_of_data a_lot_of_data_date a_lot_of_data_company) */ more_columns FROM a_lot_of_data WHERE EventDate BETWEEN :1 AND :2 AND CompanyID=:3


I will update this thread once I have completed my work.
Previous Topic: cardinality in bitmap index
Next Topic: unable to reduce the cost of the query
Goto Forum:
  


Current Time: Fri Nov 22 18:37:44 CST 2024