Let's save some space by using COMPRESSED INDEXES

AbidMalik's picture
articles: 

Summary: An index created on column that has many duplicated rows can be tuned to save space as well as I/O trips by compressing the index defined on it.

Details: By default when we create index in Oracle, the compression is disabled on it. What if we have an index defined on a column that contains last name of all the customers, some of the names are very common as a last name. We can take advantage of this duplicated data by compressing the index defined on it. This is where the KEY-COMPRESS INDEXES comes in handy.

COMPRESSED indexes reduce storage space, They can fit more rows in less Leaf Blocks and that mean there are less I/O trips when accessing data from these indexes.

Example:

--Create table
CREATE TABLE EMP (EmpID INT, EmpName VARCHAR2(30));

--Add 10 Rows
INSERT INTO EMP VALUES(1,'Johny Depp');
INSERT INTO EMP VALUES(2,'Sarah Pawlin');
INSERT INTO EMP VALUES(3,'Christy Martin');
INSERT INTO EMP VALUES(4,'Joe Smith');
INSERT INTO EMP VALUES(5,'Johny Boy');
INSERT INTO EMP VALUES(6,'Greg Toy');
INSERT INTO EMP VALUES(7,'Julian Clara');
INSERT INTO EMP VALUES(8,'Abid Malik');
INSERT INTO EMP VALUES(9,'Anthony Quinn');
INSERT INTO EMP VALUES(10,'Jim Craker');
COMMIT;

--Define Index on EmpName
CREATE INDEX EMP_EmpName_Idx ON EMP(EmpName);

--Analyze Table so CBO get fresh statistics
EXEC Dbms_Stats.GATHER_TABLE_STATS(OWNNAME => 'DBA_OWNER', TABNAME => 'EMP', METHOD_OPT => 'FOR ALL COLUMNS SIZE AUTO',ESTIMATE_PERCENT => DBMS_STATS.AUTO_SAMPLE_SIZE,CASCADE => TRUE);

--Get the Statistics of Index
SELECT INDEX_NAME,COMPRESSION,LEAF_BLOCKS,Round(NUM_ROWS/Decode(LEAF_BLOCKS,0,1,LEAF_BLOCKS)) "ROWS PER BLOCK", DISTINCT_KEYS,NUM_ROWS,NUM_ROWS-DISTINCT_KEYS DUP_ROWS FROM USER_INDEXES WHERE INDEX_NAME = 'EMP_EMPNAME_IDX'; ORDER BY DUPLICATE_ROWS DESC;

INDEX_NAME COMPRESSION LEAF_BLOCKS ROWS_PER_BLOCK DISTINCT_KEYS NUM_ROWS DUP_ROWS
EMP_EMPNAME_IDX DISABLED 1 10 10 10 0

--Now let's add about 1000 rows that will have same last name
DECLARE v_a NUMBER;
BEGIN
v_a := 11;
WHILE v_a < 1000
LOOP
INSERT INTO EMP VALUES(v_a,'Smith');
v_a := v_a + 1;
END LOOP;
COMMIT;
END;

--Get the Statistics of Index

INDEX_NAME COMPRESSION LEAF_BLOCKS ROWS_PER_BLOCK DISTINCT_KEYS NUM_ROWS DUP_ROWS
EMP_EMPNAME_IDX DISABLED 5 200 11 999 988

Note:
As you can see there are 5 Leaf Blocks that holding the data, also note there are 200 rows per Block.
There are total 999 total rows out of which 988 are duplicated entries (last name = Smith)

Now let's drop this index and re-create it as COMPRESSED index:
DROP INDEX EMP_EMPNAME_IDX;

CREATE INDEX EMP_EMPNAME_IDX ON EMP("EMPNAME")COMPRESS TABLESPACE USERS;

--Index Statistics

INDEX_NAME COMPRESSION LEAF_BLOCKS ROWS_PER_BLOCK DISTINCT_KEYS NUM_ROWS DUP_ROWS
EMP_EMPNAME_IDX ENABLED 2 500 11 999 988

Now there are only 2 Leaf Blocks and 500 rows can fit in one block. This is the beauty of COMPRESSED INDEXES.

Comments

Good demonstration!