Re: Performance Problems

From: Mladen Gogala <mgogala.SPAM_ME.NOT_at_verizon.net>
Date: Thu, 17 May 2007 03:33:04 GMT
Message-ID: <pan.2007.05.17.03.33.04@verizon.net>

On Wed, 16 May 2007 07:27:38 -0700, Barry Bulsara wrote:

> On May 16, 12:55 pm, Mladen Gogala <mgogala.SPAM_ME...._at_verizon.net>
> wrote:

>> On Wed, 16 May 2007 02:35:04 -0700, mjmather wrote:
>> > Anyone have suggestions, comments or considerations?
>>
>> Cluster the tables.
>>
>> --http://www.mladen-gogala.com

>
> Mladen, I was not aware that this until you wrote this. Is the reason
> for you saying cluster the tables to reduce IO as written on this web
> page.
>
> http://www.dba-oracle.com/oracle_tip_hash_index_cluster_table.htm
>
> If so, I have learned something new today. I am a developer often uses
> Oracle, I am not a DBA but I do write database scripts. Thank you
> Barry

Barry, there are two types of clustering. Hash clusters are used as method for cheating and having bitmap indexes without bitmap indexes in an OLTP environment. Index clusters are standard way of packing the rows from related tables together, so that you can retrieve rows from both tables at the same time and thus decrease the time needed for query. Oracle uses index clusters in the data dictionary:

SQL> select owner,cluster_name,tablespace_name from dba_clusters;

OWNER                CLUSTER_NAME                   TABLESPACE
-------------------- ------------------------------ ----------
SYS                  C_COBJ#                        SYSTEM
SYS                  C_TS#                          SYSTEM
SYS                  C_FILE#_BLOCK#                 SYSTEM
SYS                  C_USER#                        SYSTEM
SYS                  C_OBJ#                         SYSTEM
SYS                  C_MLOG#                        SYSTEM
SYS                  C_TOID_VERSION#                SYSTEM
SYS                  C_RG#                          SYSTEM
SYS                  C_OBJ#_INTCOL#                 SYSTEM
SYS                  SMON_SCN_TO_TIME               SYSTEM

Of course, meddling with these clusters would be ill advised and would render your database unsupported. Here is the list of tables stored in the clusters:
SQL> select cluster_name,table_name from dba_tables 2 where cluster_name > 'C'
3 order by cluster_name;

CLUSTER_NAME                   TABLE_NAME
------------------------------ -------------------------
C_COBJ#                        CDEF$
C_COBJ#                        CCOL$
C_FILE#_BLOCK#                 SEG$
C_FILE#_BLOCK#                 UET$
C_MLOG#                        MLOG$
C_MLOG#                        SLOG$
C_OBJ#                         COLTYPE$
C_OBJ#                         ICOL$
C_OBJ#                         IND$
C_OBJ#                         COL$
C_OBJ#                         CLU$
C_OBJ#                         TAB$
C_OBJ#                         LOB$
C_OBJ#                         SUBCOLTYPE$
C_OBJ#                         ATTRCOL$
C_OBJ#                         VIEWTRCOL$
C_OBJ#                         TYPE_MISC$
C_OBJ#                         NTAB$
C_OBJ#                         REFCON$
C_OBJ#                         OPQTYPE$
C_OBJ#                         ICOLDEP$
C_OBJ#                         LIBRARY$
C_OBJ#_INTCOL#                 HISTGRM$
C_RG#                          RGCHILD$
C_RG#                          RGROUP$
C_TOID_VERSION#                RESULT$
C_TOID_VERSION#                COLLECTION$
C_TOID_VERSION#                METHOD$
C_TOID_VERSION#                TYPE$
C_TOID_VERSION#                ATTRIBUTE$
C_TOID_VERSION#                PARAMETER$
C_TS#                          TS$
C_TS#                          FET$
C_USER#                        USER$
C_USER#                        TSQ$
SMON_SCN_TO_TIME               SMON_SCN_TIME

36 rows selected.

This is an oracle 10.2.0.3 database. Look at the C_OBJ# cluster and see how many tables are in there. Here is what you have in $ORACLE_HOME/rdbms/admin/sql.bsq:

create cluster c_obj# (obj# number)

  pctfree 5 size 800                           /* don't waste too much

space */

  /* A table of 32 cols, 2 index, 2 col per index requires about 2K.
   * A table of 10 cols, 2 index, 2 col per index requires about 750.
   */

storage (initial 130K next 200k maxextents unlimited pctincrease 0) /* avoid space management during IOR I */ /
create index i_obj# on cluster c_obj#

You can also test for yourself:

SQL> create cluster demo(deptno number(2)) 2 pctfree 25 size 2000;

Cluster created.

SQL> create index demo_deptno_i on cluster demo;

Index created.

Then add the two well known tables:

CREATE TABLE EMP
(EMPNO NUMBER(4,0),

	ENAME VARCHAR2(10), 
	JOB VARCHAR2(9), 
	MGR NUMBER(4,0), 
	HIREDATE DATE, 
	SAL NUMBER(7,2), 
	COMM NUMBER(7,2), 
	DEPTNO NUMBER(2,0), 
	 CONSTRAINT PK_EMP PRIMARY KEY (EMPNO)

) cluster demo(deptno);
CREATE TABLE DEPT
(DEPTNO NUMBER(2,0),

	DNAME VARCHAR2(14), 
	LOC VARCHAR2(13), 
	CONSTRAINT PK_DEPT PRIMARY KEY (DEPTNO)

) cluster demo(deptno);

And finally, insert the rows:
SQL> insert into emp select * from scott.emp;

14 rows created.

SQL> insert into dept select * from scott.dept;

4 rows created.

SQL> commit;

Commit complete.

Here is what you get. First, consider the classical, non-clustered setup:

SQL> connect scott/tiger
Connected.

SQL> set timing on
SQL> set autotrace on explain;
SQL> select ename,job,dname,loc

  2 from emp e,dept d
  3 where e.deptno=d.deptno
  4 order by e.deptno,e.sal desc;

ENAME JOB DNAME LOC
---------- --------- -------------- -------------

KING       PRESIDENT ACCOUNTING     NEW YORK
CLARK      MANAGER   ACCOUNTING     NEW YORK
MILLER     CLERK     ACCOUNTING     NEW YORK
FORD       ANALYST   RESEARCH       DALLAS
SCOTT      ANALYST   RESEARCH       DALLAS
JONES      MANAGER   RESEARCH       DALLAS
ADAMS      CLERK     RESEARCH       DALLAS
SMITH      CLERK     RESEARCH       DALLAS
BLAKE      MANAGER   SALES          CHICAGO
ALLEN      SALESMAN  SALES          CHICAGO
TURNER     SALESMAN  SALES          CHICAGO
WARD       SALESMAN  SALES          CHICAGO
MARTIN     SALESMAN  SALES          CHICAGO
JAMES      CLERK     SALES          CHICAGO

14 rows selected.

Elapsed: 00:00:00.18

Execution Plan

Plan hash value: 1736908262

| Id  | Operation                     | Name    | Rows  | Bytes | Cost (%
CPU)| T
ime     |

--------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 14 | 574 | 7 (29)| 0
0:00:01 |

|   1 |  SORT ORDER BY                |         |    14 |   574 |     7

(29)| 0
0:00:01 |

|   2 |   MERGE JOIN                  |         |    14 |   574 |     6

(17)| 0
0:00:01 |

| 3 | TABLE ACCESS BY INDEX ROWID| DEPT | 4 | 80 | 2 (0)| 0
0:00:01 |

| 4 | INDEX FULL SCAN | PK_DEPT | 4 | | 1 (0)| 0
0:00:01 |

|*  5 |    SORT JOIN                  |         |    14 |   294 |     4

(25)| 0
0:00:01 |

| 6 | TABLE ACCESS FULL | EMP | 14 | 294 | 3 (0)| 0
0:00:01 |

This is a classic merge join. Now, let's see the same query in the clustered setting:
SQL> connect /
Connected.

SQL> set timing on
SQL> set autotrace on explain;
SQL> select ename,job,dname,loc

  2 from emp e,dept d
  3 where e.deptno=d.deptno
  4 order by e.deptno,e.sal desc;

ENAME JOB DNAME LOC
---------- --------- -------------- -------------

KING       PRESIDENT ACCOUNTING     NEW YORK
CLARK      MANAGER   ACCOUNTING     NEW YORK
MILLER     CLERK     ACCOUNTING     NEW YORK
SCOTT      ANALYST   RESEARCH       DALLAS
FORD       ANALYST   RESEARCH       DALLAS
JONES      MANAGER   RESEARCH       DALLAS
ADAMS      CLERK     RESEARCH       DALLAS
SMITH      CLERK     RESEARCH       DALLAS
BLAKE      MANAGER   SALES          CHICAGO
ALLEN      SALESMAN  SALES          CHICAGO
TURNER     SALESMAN  SALES          CHICAGO
WARD       SALESMAN  SALES          CHICAGO
MARTIN     SALESMAN  SALES          CHICAGO
JAMES      CLERK     SALES          CHICAGO

14 rows selected.

Elapsed: 00:00:00.07

Execution Plan

Plan hash value: 1859139630

| 0 | SELECT STATEMENT | | 14 | 966 | 7 (15)| 00
:00:01 |

| 1 | SORT ORDER BY | | 14 | 966 | 7 (15)| 00
:00:01 |

| 2 | NESTED LOOPS | | 14 | 966 | 6 (0)| 00
:00:01 |

| 3 | TABLE ACCESS FULL | DEPT | 4 | 120 | 3 (0)| 00
:00:01 |

| 4 | TABLE ACCESS CLUSTER| EMP | 4 | 156 | 1 (0)| 00
:00:01 |

|* 5 | INDEX UNIQUE SCAN | DEMO_DEPTNO_I | 1 | | 0 (0)| 00
:00:01 |

Predicate Information (identified by operation id):

5 - access("E"."DEPTNO"="D"."DEPTNO")

Note

dynamic sampling used for this statement

SQL> Now, observe that for this plan we do not have 2 full table scans. Also, the execution was 2.5 times faster then non-clustered setup. Granted, both times are sub-second times but it shows you the kind of effect that such restructuring might have.

Trade-off comes in two forms:
1) You need to do careful space management. You don't want any chained

rows.
2) Updating cluster key for a table becomes exorbitantly expensive. Row

gets physically copied to another block.

So, you already do have some clusters in the database that you can study (although changing those wouldn't be very wise thing to do) and it is easy enough to create your own and play with them.

More about clusters can be found at: http://tinyurl.com/3do9kn

To reorganize table data for large production tables it would definitely be advisable to ask your DBA for help.

-- 
http://www.mladen-gogala.com

Received on Wed May 16 2007 - 22:33:04 CDT