Partitioning and Oracle Standard Edition
I first came across partitioning with release 7, when it didn't exist. Like many DBAs, I simulated it by writing a lot of application code. You can still do this – and you may have to if you haven't bought Enterprise Edition plus the partitioning option. Here's another way to do it, with partitioned views.
Consider this example:
create table sales2016 ( prod_id number constraint s2016_prod_nn not null, cust_id number constraint s2016_cust_nn not null, time_id date constraint s2016_time_nn not null , quantity_sold number, constraint part2016 check (time_id >=to_date('01-JAN-2016','dd-mon-yyyy') and time_id <to_date('01-JAN-2009','dd-mon-yyyy')) ); create table sales2015 ( prod_id number constraint s2015_prod_nn not null, cust_id number constraint s2015_cust_nn not null, time_id date constraint s2015_time_nn not null , quantity_sold number, constraint part2015 check (time_id >=to_date('01-JAN-2015','dd-mon-yyyy') and time_id <to_date('01-JAN-2016','dd-mon-yyyy')) ); create table sales2014 ( prod_id number constraint s2014_prod_nn not null, cust_id number constraint s2014_cust_nn not null, time_id date constraint s2014_time_nn not null , quantity_sold number, constraint part2014 check (time_id >=to_date('01-JAN-2014','dd-mon-yyyy') and time_id <to_date('01-JAN-2015','dd-mon-yyyy')) ); create view sales as select * from sales2014 union all select * from sales2015 union all select * from sales2016; create or replace trigger insert_sale instead of insert on sales begin if :new.time_id between to_date('01-JAN-2014','dd-mon-yyyy') and to_date('31-DEC-2014','dd-mon-yyyy') then insert into sales2014 values( :new.prod_id, :new.cust_id, :new.time_id, :new.quantity_sold); elsif :new.time_id between to_date('01-JAN-2015','dd-mon-yyyy') and to_date('31-DEC-2015','dd-mon-yyyy') then insert into sales2015 values( :new.prod_id, :new.cust_id, :new.time_id, :new.quantity_sold); elsif :new.time_id between to_date('01-JAN-2016','dd-mon-yyyy') and to_date('31-DEC-2016','dd-mon-yyyy') then insert into sales2016 values( :new.prod_id, :new.cust_id, :new.time_id, :new.quantity_sold); end if; end; /
The intent of this code fragment is to permit developers to write code that will insert and query an object called SALES that will appear to them to be a simple table, though in fact it is several segments:
insert into sales values (13,987,to_date('01-JUN-2015','dd-mon-yyyy'),1); insert into sales values (13,987,to_date('01-JUN-2016','dd-mon-yyyy'),1); select * from sales where prod_id=13;
In principle, this is partitioning - without having to pay for it. Of course, the example needs to be completed. It needs more INSTEAD OF triggers, and it needs dimension tables, constraints, and indexes to build up the star schema:
create table products (prod_id number primary key, prod_name varchar2(20)); create table customers (cust_id number primary key, cust_name varchar2(20)); alter table sales2016 add constraint s2016_prod_fk foreign key (prod_id) references products; alter table sales2015 add constraint s2015_prod_fk foreign key (prod_id) references products; alter table sales2014 add constraint s2014_prod_fk foreign key (prod_id) references products; alter table sales2016 add constraint s2016_cust_fk foreign key (cust_id) references customers; alter table sales2015 add constraint s2015_cust_fk foreign key (cust_id) references customers; alter table sales2014 add constraint s2014_cust_fk foreign key (cust_id) references customers; create index s2016_prod_ix on sales2016 (prod_id); create index s2016_cust_ix on sales2016 (cust_id); create index s2016_time_ix on sales2016 (time_id); create index s2015_prod_ix on sales2015 (prod_id); create index s2015_cust_ix on sales2015 (cust_id); create index s2015_time_ix on sales2015 (time_id); create index s2014_prod_ix on sales2014 (prod_id); create index s2014_cust_ix on sales2014 (cust_id); create index s2014_time_ix on sales2014 (time_id);
So let's see what all this achieves. The one logical object (the SALES “table”) has been broken up into a number of smaller objects: one table for each year. This will improve manageability substantially, because the parts can be moved, indexed, and stored independently. In earlier times, the ability to spread a logical object across different tablespaces stored in datafiles on different devices was important too, though with modern storage technology this is less significant. It may also improve query performance, because (provided that the constraints are adequate enough to tell the optimizer what is going on) queries need only address a part of the logical object. Consider a query that retrieves all sales of books in the northern hemisphere summer of 2015:
select sum(quantity_sold) from sales join products using (prod_id) where time_id between to_date(’01-JUN-2015’,'dd-mon-yyyy') and to_date(’31-AUG-2015’,'dd-mon-yyyy') and prod_name='books';
Unless there is a skew in the data, a time_id index on all sales would be useless: it would select one twelfth (three months out of thirty six) of the total row count, which is probably nowhere near selective enough to be worth using. The optimizer could elect to use the index on prod_id, and then scan all the relevant rows to extract those for the summer of 2015 – but even that will be horribly inefficient because eleven twelfths of the rows retrieved will be discarded. However, with the partitioned and constrained structures, the optimizer will be aware that only the rows in the sales2015 table can be relevant, and will read only the smaller bitmap index on that table. The subsequent filtering of the rows for summer will be much quicker. Or a simple scan of sales2015 would be feasible – three times quicker than scanning the entire logical object. Check out the execution plans of a few queries against simple tables and indexes, and the same queries against objects that look the same to the user but are in fact many tables, and you’ll soon see the difference. Here's the execution plan for the query above:
Execution Plan ---------------------------------------------------------- Plan hash value: 471613365 ---------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 40 | 4 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 40 | | | | 2 | NESTED LOOPS | | 3 | 120 | 4 (0)| 00:00:01 | |* 3 | TABLE ACCESS FULL | PRODUCTS | 1 | 25 | 3 (0)| 00:00:01 | | 4 | VIEW | SALES | 3 | 45 | 1 (0)| 00:00:01 | | 5 | UNION ALL PUSHED PREDICATE | | | | | | |* 6 | FILTER | | | | | | |* 7 | TABLE ACCESS BY INDEX ROWID BATCHED| SALES2014 | 1 | 35 | 1 (0)| 00:00:01 | |* 8 | INDEX RANGE SCAN | S2014_PROD_IX | 1 | | 1 (0)| 00:00:01 | |* 9 | TABLE ACCESS BY INDEX ROWID BATCHED | SALES2015 | 1 | 35 | 1 (0)| 00:00:01 | |* 10 | INDEX RANGE SCAN | S2015_PROD_IX | 1 | | 1 (0)| 00:00:01 | |* 11 | FILTER | | | | | | |* 12 | TABLE ACCESS BY INDEX ROWID BATCHED| SALES2016 | 1 | 35 | 1 (0)| 00:00:01 | |* 13 | INDEX RANGE SCAN | S2016_PROD_IX | 1 | | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("PRODUCTS"."PROD_NAME"='books') 6 - filter(NULL IS NOT NULL) 7 - filter("TIME_ID">=TO_DATE(' 2015-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"<=TO_DATE(' 2015-08-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 8 - access("PROD_ID"="PRODUCTS"."PROD_ID") 9 - filter("TIME_ID">=TO_DATE(' 2015-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"<=TO_DATE(' 2015-08-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 10 - access("PROD_ID"="PRODUCTS"."PROD_ID") 11 - filter(NULL IS NOT NULL AND NULL IS NOT NULL AND NULL IS NOT NULL) 12 - filter("TIME_ID">=TO_DATE(' 2015-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TIME_ID"<=TO_DATE(' 2015-08-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 13 - access("PROD_ID"="PRODUCTS"."PROD_ID")
Note the filters at steps 6 and 11: NULL IS NOT NULL. This can never be true, and Oracle knows it. Therefore, even though the plan does mention the indexes and tables of data for 2014 and 2016, it will not even look at them.
And there is a second reason for breaking up the table: maintenance. It is possible to move the segment for an individual year's data without affecting the others, or to rebuild its indexes. The older data can be placed in read-only tablespaces and moved to slower, cheaper, storage.
This is what partitioning is about: adjusting the storage of data objects to improve manageability or performance or both, without affecting the logical presentation. And you can do it with Standard Edition.
--
John Watson
Oracle Certified Master DBA
http://skillbuilders.com
- John Watson's blog
- Log in to post comments