
DBA Blogs

#GoldenGate Bound Recovery

DBASolved - Sun, 2014-08-03 19:40

Every once in a while, when I restart an extract, I see entries in the report file that reference “Bounded Recovery”.  What exactly is “Bounded Recovery”?

First, keep in mind that “Bounded Recovery” is only for Oracle databases!

Second, according to the documentation, “Bounded Recovery” is a component of the general extract checkpointing facility.  This component of the extract guarantees an efficient recovery after an extract stops for any reason, no matter how many uncommitted transactions are outstanding at the time.  The Bounded Recovery parameter sets an upper boundary on the maximum amount of time needed for an extract to recover to the point where it stopped before continuing normal processing.

The default setting for “Bounded Recovery” is 4 hours, and the needed recovery information is cached in the OGG_HOME/BR/<extract name> directory.  This is verified when I look at the report file for my extract named EXT.


2014-07-21 17:26:30 INFO OGG-01815 Virtual Memory Facilities for: BR
 anon alloc: mmap(MAP_ANON) anon free: munmap
 file alloc: mmap(MAP_SHARED) file free: munmap
 target directories:
 /oracle/app/product/12.1.2/oggcore_1/BR/EXT.

Bounded Recovery Parameter:
BRINTERVAL = 4HOURS
BRDIR = /oracle/app/product/12.1.2/oggcore_1

According to the documentation, the default settings for “Bounded Recovery” should be sufficient for most environments.  It is also noted that the “Bounded Recovery” settings shouldn’t be changed without the guidance of Oracle Support.

Now that the idea of “Bounded Recovery” has been established, let’s try to understand a bit more about how a transaction is recovered in Oracle GoldenGate with the “Bounded Recovery” feature.

At the start of a transaction, Oracle GoldenGate must cache the transaction (even if it contains no data).  The reason for this is the need to support future operations of the transaction.  If the extract hits a commit, the cached transaction is written to the trail file and cleared from memory.  If the extract hits a rollback, the cached transaction is discarded from memory.  As long as an extract is processing a transaction, before a commit or rollback, the transaction is considered an open transaction and will be collected.  If the extract is stopped before it encounters a commit or rollback, the extract needs all of the cached transaction information recovered before it can start.  This applies to all transactions that were open at the time the extract was stopped.

There are three ways that an extract performs recovery:

  1. No open transactions when the extract is stopped; recovery begins at the current extract read checkpoint (normal recovery)
  2. Open transactions whose start points in the log were very close in time to when the extract was stopped; the extract begins its recovery by re-reading the logs from the beginning of the oldest open transaction (also considered a normal recovery)
  3. One or more open transactions that the extract qualified as long-running open transactions; the extract begins a bounded recovery (Bounded Recovery)

What defines a long-running transaction for Oracle GoldenGate?

Transactions in Oracle GoldenGate are long-running if the transaction has been open longer than one (1) “Bounded Recovery” interval.
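To see which transactions an extract currently holds open, and which have crossed the long-running threshold, the GGSCI SEND EXTRACT command with the SHOWTRANS option can be used.  A quick example, assuming an extract named EXT:


GGSCI> SEND EXTRACT EXT, SHOWTRANS DURATION 4 HOURS

This lists only the open transactions that have been open longer than 4 hours, i.e. longer than the default “Bounded Recovery” interval.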

A “bounded recovery interval” is the amount of time between “Bounded Recovery checkpoints”, each of which persists the current state and data of the extract to disk.  “Bounded Recovery checkpoints” are used to identify a recovery position between two “Bounded Recovery intervals”.  The extract will pick up from the last “bounded recovery checkpoint” instead of processing from the log position where the open long-running transaction first appeared.
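The log positions an extract would restart from, including its recovery checkpoint, can be inspected with the SHOWCH option of INFO EXTRACT (again assuming an extract named EXT; the exact fields shown vary by version):


GGSCI> INFO EXTRACT EXT, SHOWCH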

What is the maximum Bounded Recovery time?

The maximum bounded recovery time is no more than twice the current “Bounded Recovery checkpoint” interval.  However, the actual recovery time will be dictated by the following:

  1. The time from the last valid Bounded Recovery interval to when the extract was stopped
  2. Utilization of the extract in that period
  3. The percent of utilization for transactions that were previously written to the trail

Now that the basic details of “Bounded Recovery” have been discussed, how can the settings for “Bounded Recovery” be changed?

“Bounded Recovery” can be changed by updating the extract parameter file with the following parameter:


BR
[, BRDIR directory]
[, BRINTERVAL number {M | H}]
[, BRKEEPSTALEFILES]
[, BROFF]
[, BROFFONFAILURE]
[, BRRESET]

As noted, there are a few options that can be set with the BR parameter.  If I wanted to shorten my “Bounded Recovery” interval and change the directory where the cached information is stored, I could do something similar to this:


--Bound Recovery
BR BRDIR ./dirbr BRINTERVAL 20M

In the example above, I’m changing the directory to a new directory called DIRBR (created manually as part of the subdirs) and changing the interval from 4 hours to 20 minutes.

Note: 20 minutes is the smallest accepted time for the BRINTERVAL parameter.

After adding the BR parameter with options to the extract, the extract needs to be restarted.  Once the extract is up and running, the report file for the extract can be checked to verify that the new parameters have taken effect.
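For example, something along these lines from GGSCI (assuming the extract is named EXT):


GGSCI> STOP EXTRACT EXT
GGSCI> START EXTRACT EXT
GGSCI> VIEW REPORT EXT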


2014-08-02 22:20:54  INFO    OGG-01815  Virtual Memory Facilities for: BR
    anon alloc: mmap(MAP_ANON)  anon free: munmap
    file alloc: mmap(MAP_SHARED)  file free: munmap
    target directories:
    /oracle/app/product/12.1.2/oggcore_1/dirbr/BR/EXT.

Bounded Recovery Parameter:
BRINTERVAL = 20M
BRDIR      = ./dirbr

Hopefully, this post provided a better understanding of one of the least understood options within Oracle GoldenGate.

Enjoy!!

About me: http://about.me/dbasolved


Filed under: Golden Gate
Categories: DBA Blogs

GI Commands : 2 -- Managing the Local and Cluster Registry

Hemant K Chitale - Sun, 2014-08-03 07:51
In 11gR2

There are essentially two registries, the Local Registry and the Cluster Registry.

Let's check the Local Registry :

[root@node1 ~]# cat /etc/oracle/olr.loc
olrconfig_loc=/u01/app/grid/11.2.0/cdata/node1.olr
crs_home=/u01/app/grid/11.2.0
[root@node1 ~]#
[root@node1 ~]# file /u01/app/grid/11.2.0/cdata/node1.olr
/u01/app/grid/11.2.0/cdata/node1.olr: data
[root@node1 ~]#

So, like the Cluster Registry, the Local Registry is a binary file.  It is on a local filesystem on the node, not on ASM/NFS/CFS.  Each node in the cluster has its own Local Registry.

The Local Registry can be checked for consistency (corruption) using ocrcheck with the "-local" flag.  Note: as demonstrated in my previous post, the root account must be used for the check.

[root@node1 ~]# su - grid
-sh-3.2$ su
Password:
[root@node1 grid]# ocrcheck -local
Status of Oracle Local Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 2696
Available space (kbytes) : 259424
ID : 1388021147
Device/File Name : /u01/app/grid/11.2.0/cdata/node1.olr
Device/File integrity check succeeded

Local registry integrity check succeeded

Logical corruption check succeeded

[root@node1 grid]#

Now let's look at the Cluster Registry :

[root@node1 grid]# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3668
Available space (kbytes) : 258452
ID : 605940771
Device/File Name : +DATA
Device/File integrity check succeeded
Device/File Name : /fra/ocrfile
Device/File integrity check succeeded
Device/File Name : +FRA
Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check succeeded

[root@node1 grid]#

The Cluster Registry is distributed across two ASM DiskGroups (+DATA and +FRA) and one filesystem (/fra/ocrfile). Yes, this is a special case that I've created to distribute the OCR in this manner.

I cannot add the OCR to a location which is an ASM diskgroup with a lower compatible.asm setting.

[root@node1 grid]# ocrconfig -add +DATA2
PROT-30: The Oracle Cluster Registry location to be added is not accessible
PROC-8: Cannot perform cluster registry operation because one of the parameters is invalid.
ORA-15056: additional error message
ORA-17502: ksfdcre:4 Failed to create file +DATA2.255.1
ORA-15221: ASM operation requires compatible.asm of 11.1.0.0.0 or higher
ORA-06512: at line 4

[root@node1 grid]#
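If I did want to use +DATA2, the diskgroup attribute would have to be raised first.  A possible fix, run against the ASM instance as SYSASM (note that compatible.asm cannot be lowered again once raised):

SQL> alter diskgroup DATA2 set attribute 'compatible.asm' = '11.2.0.0.0';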

I now remove the filesystem copy of the OCR.

[root@node1 grid]# ocrconfig -delete /fra/ocrfile
[root@node1 grid]# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3668
Available space (kbytes) : 258452
ID : 605940771
Device/File Name : +DATA
Device/File integrity check succeeded
Device/File Name : +FRA
Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check succeeded

[root@node1 grid]#

Note, however, that the ocrconfig delete doesn't actually remove the filesystem file that I had created.

[root@node1 grid]# ls -l /fra/ocrfile
-rw-r--r-- 1 root root 272756736 Aug 3 21:27 /fra/ocrfile
[root@node1 grid]# rm /fra/ocrfile
rm: remove regular file `/fra/ocrfile'? yes
[root@node1 grid]#

I will now add a filesystem location for the OCR.

[root@node1 grid]# touch /fra/new_ocrfile
[root@node1 grid]# ocrconfig -add /fra/new_ocrfile
[root@node1 grid]# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3668
Available space (kbytes) : 258452
ID : 605940771
Device/File Name : +DATA
Device/File integrity check succeeded
Device/File Name : +FRA
Device/File integrity check succeeded
Device/File Name : /fra/new_ocrfile
Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check succeeded

[root@node1 grid]# ls -l /fra/new_ocrfile
-rw-r--r-- 1 root root 272756736 Aug 3 21:30 /fra/new_ocrfile
[root@node1 grid]#

What about OCR backups?  (Note: Oracle takes frequent automatic backups of the OCR, but *not* of the OLR.)
N.B.: This listing doesn't show all the OCR backups you'd expect, because I don't have my cluster running continuously through all the days.
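Since the OLR gets no automatic backups, a manual one can be taken with the "-local" flag; a quick sketch, run as root:

[root@node1 ~]# ocrconfig -local -manualbackup

The "-local -showbackup" combination lists the OLR backups taken so far.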

[root@node1 grid]# ocrconfig -showbackup

node1 2014/07/06 21:53:25 /u01/app/grid/11.2.0/cdata/rac/backup00.ocr

node1 2011/10/22 03:09:03 /u01/app/grid/11.2.0/cdata/rac/backup01.ocr

node1 2011/10/21 23:06:39 /u01/app/grid/11.2.0/cdata/rac/backup02.ocr

node1 2014/07/06 21:53:25 /u01/app/grid/11.2.0/cdata/rac/day.ocr

node1 2014/07/06 21:53:25 /u01/app/grid/11.2.0/cdata/rac/week.ocr

node1 2014/07/06 22:39:55 /u01/app/grid/11.2.0/cdata/rac/backup_20140706_223955.ocr

node1 2014/07/05 17:30:25 /u01/app/grid/11.2.0/cdata/rac/backup_20140705_173025.ocr

node1 2014/06/16 22:15:07 /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221507.ocr

node1 2014/06/16 22:14:05 /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221405.ocr

node1 2011/11/09 23:20:25 /u01/app/grid/11.2.0/cdata/rac/backup_20111109_232025.ocr
[root@node1 grid]#

Let me run an additional backup from node2.

[root@node2 grid]# ocrconfig -manualbackup

node1 2014/08/03 21:37:17 /u01/app/grid/11.2.0/cdata/rac/backup_20140803_213717.ocr

node1 2014/07/06 22:39:55 /u01/app/grid/11.2.0/cdata/rac/backup_20140706_223955.ocr

node1 2014/07/05 17:30:25 /u01/app/grid/11.2.0/cdata/rac/backup_20140705_173025.ocr

node1 2014/06/16 22:15:07 /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221507.ocr

node1 2014/06/16 22:14:05 /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221405.ocr
[root@node2 grid]#

We can see that the backup done today (03-Aug) is listed at the top.  Let's check a listing from node1:

[root@node1 grid]# ocrconfig -showbackup

node1 2014/07/06 21:53:25 /u01/app/grid/11.2.0/cdata/rac/backup00.ocr

node1 2011/10/22 03:09:03 /u01/app/grid/11.2.0/cdata/rac/backup01.ocr

node1 2011/10/21 23:06:39 /u01/app/grid/11.2.0/cdata/rac/backup02.ocr

node1 2014/07/06 21:53:25 /u01/app/grid/11.2.0/cdata/rac/day.ocr

node1 2014/07/06 21:53:25 /u01/app/grid/11.2.0/cdata/rac/week.ocr

node1 2014/08/03 21:37:17 /u01/app/grid/11.2.0/cdata/rac/backup_20140803_213717.ocr

node1 2014/07/06 22:39:55 /u01/app/grid/11.2.0/cdata/rac/backup_20140706_223955.ocr

node1 2014/07/05 17:30:25 /u01/app/grid/11.2.0/cdata/rac/backup_20140705_173025.ocr

node1 2014/06/16 22:15:07 /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221507.ocr

node1 2014/06/16 22:14:05 /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221405.ocr
[root@node1 grid]#

Yes, the backup of 03-Aug is also listed.  But, wait! Why is it on node1?  Let's go back to node2 and do a filesystem listing.

[root@node2 grid]# ls -l /u01/app/grid/11.2.0/cdata/rac/backup*
ls: /u01/app/grid/11.2.0/cdata/rac/backup*: No such file or directory
[root@node2 grid]#

Yes, as we've noticed: the backup doesn't really exist on node2.

[root@node1 grid]# ls -lt /u01/app/grid/11.2.0/cdata/rac/
total 114316
-rw------- 1 root root 8024064 Aug 3 21:37 backup_20140803_213717.ocr
-rw------- 1 root root 8003584 Jul 6 22:39 backup_20140706_223955.ocr
-rw------- 1 root root 8003584 Jul 6 21:53 day.ocr
-rw------- 1 root root 8003584 Jul 6 21:53 week.ocr
-rw------- 1 root root 8003584 Jul 6 21:53 backup00.ocr
-rw------- 1 root root 8003584 Jul 5 17:30 backup_20140705_173025.ocr
-rw------- 1 root root 7708672 Jun 16 22:15 backup_20140616_221507.ocr
-rw------- 1 root root 7708672 Jun 16 22:14 backup_20140616_221405.ocr
-rw------- 1 root root 7688192 Nov 9 2011 backup_20111109_232025.ocr
-rw------- 1 root root 7667712 Nov 9 2011 backup_20111109_230940.ocr
-rw------- 1 root root 7647232 Nov 9 2011 backup_20111109_230916.ocr
-rw------- 1 root root 7626752 Nov 9 2011 backup_20111109_224725.ocr
-rw------- 1 root root 7598080 Nov 9 2011 backup_20111109_222941.ocr
-rw------- 1 root root 7593984 Oct 22 2011 backup01.ocr
-rw------- 1 root root 7593984 Oct 21 2011 backup02.ocr
[root@node1 grid]#

Yes, *ALL* the OCR backups to date have been created on node1 -- even when executed from node2.  node1 is still the "master" node for OCR backups as long as it is up and running.  I shut down Grid Infrastructure on node1.

[root@node1 grid]# crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node1'
CRS-2673: Attempting to stop 'ora.crsd' on 'node1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'node1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'node1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'node1'
CRS-2673: Attempting to stop 'ora.racdb.new_svc.svc' on 'node1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'node1'
CRS-2673: Attempting to stop 'ora.cvu' on 'node1'
CRS-2673: Attempting to stop 'ora.oc4j' on 'node1'
CRS-2673: Attempting to stop 'ora.gns' on 'node1'
CRS-2677: Stop of 'ora.cvu' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.cvu' on 'node2'
CRS-2677: Stop of 'ora.LISTENER_SCAN3.lsnr' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.scan3.vip' on 'node1'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'node1' succeeded
CRS-2677: Stop of 'ora.scan3.vip' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.scan3.vip' on 'node2'
CRS-2677: Stop of 'ora.LISTENER_SCAN2.lsnr' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.scan2.vip' on 'node1'
CRS-2677: Stop of 'ora.scan2.vip' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.scan2.vip' on 'node2'
CRS-2676: Start of 'ora.cvu' on 'node2' succeeded
CRS-2677: Stop of 'ora.racdb.new_svc.svc' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.node1.vip' on 'node1'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'node1'
CRS-2673: Attempting to stop 'ora.registry.acfs' on 'node1'
CRS-2673: Attempting to stop 'ora.racdb.db' on 'node1'
CRS-2677: Stop of 'ora.node1.vip' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.node1.vip' on 'node2'
CRS-2676: Start of 'ora.scan3.vip' on 'node2' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN3.lsnr' on 'node2'
CRS-2676: Start of 'ora.scan2.vip' on 'node2' succeeded
CRS-2677: Stop of 'ora.registry.acfs' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN2.lsnr' on 'node2'
CRS-2677: Stop of 'ora.gns' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.gns.vip' on 'node1'
CRS-2677: Stop of 'ora.gns.vip' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.gns.vip' on 'node2'
CRS-2676: Start of 'ora.node1.vip' on 'node2' succeeded
CRS-2676: Start of 'ora.gns.vip' on 'node2' succeeded
CRS-2672: Attempting to start 'ora.gns' on 'node2'
CRS-2676: Start of 'ora.LISTENER_SCAN3.lsnr' on 'node2' succeeded
CRS-2676: Start of 'ora.LISTENER_SCAN2.lsnr' on 'node2' succeeded
CRS-2677: Stop of 'ora.racdb.db' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.DATA1.dg' on 'node1'
CRS-2673: Attempting to stop 'ora.DATA2.dg' on 'node1'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'node1'
CRS-2676: Start of 'ora.gns' on 'node2' succeeded
CRS-2677: Stop of 'ora.DATA1.dg' on 'node1' succeeded
CRS-2677: Stop of 'ora.DATA2.dg' on 'node1' succeeded
CRS-2677: Stop of 'ora.oc4j' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.oc4j' on 'node2'
CRS-2676: Start of 'ora.oc4j' on 'node2' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'node1' succeeded
CRS-2677: Stop of 'ora.FRA.dg' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'node1'
CRS-2677: Stop of 'ora.asm' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'node1'
CRS-2677: Stop of 'ora.ons' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'node1'
CRS-2677: Stop of 'ora.net1.network' on 'node1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'node1' has completed
CRS-2677: Stop of 'ora.crsd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'node1'
CRS-2673: Attempting to stop 'ora.crf' on 'node1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'node1'
CRS-2673: Attempting to stop 'ora.evmd' on 'node1'
CRS-2673: Attempting to stop 'ora.asm' on 'node1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'node1'
CRS-2677: Stop of 'ora.crf' on 'node1' succeeded
CRS-2677: Stop of 'ora.asm' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'node1'
CRS-2677: Stop of 'ora.evmd' on 'node1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'node1' succeeded
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'node1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'node1'
CRS-2677: Stop of 'ora.drivers.acfs' on 'node1' succeeded
CRS-2677: Stop of 'ora.cssd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'node1'
CRS-2673: Attempting to stop 'ora.diskmon' on 'node1'
CRS-2677: Stop of 'ora.gipcd' on 'node1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'node1'
CRS-2677: Stop of 'ora.gpnpd' on 'node1' succeeded
CRS-2677: Stop of 'ora.diskmon' on 'node1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'node1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@node1 grid]#

So, all the Grid Infrastructure services are down on node1. I will run an OCR backup from node2 and verify its location.

[root@node2 grid]# ocrconfig -manualbackup

node2 2014/08/03 21:49:02 /u01/app/grid/11.2.0/cdata/rac/backup_20140803_214902.ocr

node1 2014/08/03 21:37:17 /u01/app/grid/11.2.0/cdata/rac/backup_20140803_213717.ocr

node1 2014/07/06 22:39:55 /u01/app/grid/11.2.0/cdata/rac/backup_20140706_223955.ocr

node1 2014/07/05 17:30:25 /u01/app/grid/11.2.0/cdata/rac/backup_20140705_173025.ocr

node1 2014/06/16 22:15:07 /u01/app/grid/11.2.0/cdata/rac/backup_20140616_221507.ocr
[root@node2 grid]# ls -l /u01/app/grid/11.2.0/cdata/rac/backup*
-rw------- 1 root root 8024064 Aug 3 21:49 /u01/app/grid/11.2.0/cdata/rac/backup_20140803_214902.ocr
[root@node2 grid]#

Yes, the backup got created on node2 now.

Question : Would there have been a way to create a backup on node2 without shutting down node1 ?

.
.
.

Categories: DBA Blogs

New in Oracle 12c: Querying an Associative Array in PL/SQL Programs

Galo Balda's Blog - Sat, 2014-08-02 16:23

I was aware that up to Oracle 11g, a PL/SQL program wasn’t allowed to use an associative array in a SQL statement. This is what happens when I try to do it.

SQL> drop table test_array purge;

Table dropped.

SQL> create table test_array as
  2  select level num_col from dual
  3  connect by level <= 10;

Table created.

SQL> select * from test_array;

   NUM_COL
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10                                                                      

10 rows selected.

SQL> drop package PKG_TEST_ARRAY;

Package dropped.

SQL> create or replace package PKG_TEST_ARRAY as
  2
  3    type tab_num is table of number index by pls_integer;
  4
  5  end PKG_TEST_ARRAY;
  6  /

Package created.

SQL> declare
  2    my_array pkg_test_array.tab_num;
  3  begin
  4    for i in 1 .. 5 loop
  5      my_array(i) := i*2;
  6    end loop;
  7
  8    for i in (
  9              select num_col from test_array
 10              where num_col in (select * from table(my_array))
 11             )
 12    loop
 13      dbms_output.put_line(i.num_col);
 14    end loop;
 15  end;
 16  /
            where num_col in (select * from table(my_array))
                                                  *
ERROR at line 10:
ORA-06550: line 10, column 51:
PLS-00382: expression is of wrong type
ORA-06550: line 10, column 45:
PL/SQL: ORA-22905: cannot access rows from a non-nested table item
ORA-06550: line 9, column 13:
PL/SQL: SQL Statement ignored
ORA-06550: line 13, column 26:
PLS-00364: loop index variable 'I' use is invalid
ORA-06550: line 13, column 5:
PL/SQL: Statement ignored

As you can see, the TABLE operator is expecting either a nested table or a varray.
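A common workaround on 11g (my own sketch, not part of the original post) is to declare the collection type at schema level instead of inside a package, so the SQL engine can resolve it:

create or replace type tab_num_sql as table of number;
/

declare
  my_array tab_num_sql := tab_num_sql();
begin
  for i in 1 .. 5 loop
    my_array.extend;
    my_array(i) := i*2;
  end loop;

  for i in (
            select num_col from test_array
            where num_col in (select column_value from table(my_array))
           )
  loop
    dbms_output.put_line(i.num_col);
  end loop;
end;
/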

The limitation has been removed in Oracle 12c. This is what happens now.

SQL> set serveroutput on
SQL>
SQL> drop table test_array purge;

Table dropped.

SQL> create table test_array as
  2  select level num_col from dual
  3  connect by level <= 10;

Table created.

SQL> select * from test_array;

   NUM_COL
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10                                                                      

10 rows selected.

SQL> drop package PKG_TEST_ARRAY;

Package dropped.

SQL> create or replace package PKG_TEST_ARRAY as
  2
  3    type tab_num is table of number index by pls_integer;
  4
  5  end PKG_TEST_ARRAY;
  6  /

Package created.

SQL> declare
  2    my_array pkg_test_array.tab_num;
  3  begin
  4    for i in 1 .. 5 loop
  5      my_array(i) := i*2;
  6    end loop;
  7
  8    for i in (
  9              select num_col from test_array
 10              where num_col in (select * from table(my_array))
 11             )
 12    loop
 13      dbms_output.put_line(i.num_col);
 14    end loop;
 15  end;
 16  /

2
4
6
8
10                                                                              

PL/SQL procedure successfully completed.

Here’s another example using a slightly different query.

SQL> declare
  2    my_array pkg_test_array.tab_num;
  3  begin
  4    for i in 1 .. 5 loop
  5      my_array(i) := i*2;
  6    end loop;
  7
  8    for i in (
  9              select a.num_col, b.column_value
 10              from
 11                test_array a,
 12                table (my_array) b
 13              where
 14                a.num_col = b.column_value
 15             )
 16    loop
 17      dbms_output.put_line(i.num_col);
 18    end loop;
 19  end;
 20  /

2
4
6
8
10                                                                              

PL/SQL procedure successfully completed.

Very nice stuff.


Filed under: 12C, PL/SQL Tagged: 12C, PL/SQL
Categories: DBA Blogs

Common Roles get copied upon plug-in with #Oracle Multitenant

The Oracle Instructor - Fri, 2014-08-01 08:51

What happens when you unplug a pluggable database that has local users who have been granted common roles? They get copied upon plug-in of the PDB to the target container database!

[Picture: Before unplug of the PDB]

The picture above shows the situation before the unplug command. It has been implemented with these commands:

 

SQL> connect / as sysdba
Connected.
SQL> create role c##role container=all;

Role created.

SQL> grant select any table to c##role container=all;

Grant succeeded.

SQL> connect sys/oracle_4U@pdb1 as sysdba
Connected.
SQL> grant c##role to app;

Grant succeeded.



SQL> grant create session to app;

Grant succeeded.
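As a quick sanity check before the unplug, the grant can be confirmed from the root container (a query I added for illustration):

SQL> select grantee, granted_role, common from cdb_role_privs where granted_role = 'C##ROLE';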

The local user app has now been granted the common role c##role. Let’s assume that the application depends on the privileges inside the common role. Now pdb1 is unplugged and plugged in to cdb2:

SQL> shutdown immediate
Pluggable Database closed.
SQL> connect / as sysdba
Connected.
SQL> alter pluggable database pdb1 unplug into '/home/oracle/pdb1.xml';

Pluggable database altered.

SQL> drop pluggable database pdb1;

Pluggable database dropped.

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
[oracle@EDE5R2P0 ~]$ . oraenv
ORACLE_SID = [cdb1] ? cdb2
The Oracle base for ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1 is /u01/app/oracle
[oracle@EDE5R2P0 ~]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.1.0 Production on Tue Jul 29 12:52:19 2014

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> create pluggable database pdb1 using '/home/oracle/pdb1.xml' nocopy;

Pluggable database created.

SQL> alter pluggable database pdb1 open;

Pluggable database altered.

SQL> connect app/app@pdb1
Connected.
SQL> select * from scott.dept;

    DEPTNO DNAME          LOC
---------- -------------- -------------
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON

SQL> select * from session_privs;

PRIVILEGE
----------------------------------------
CREATE SESSION
SELECT ANY TABLE

SQL> connect / as sysdba
Connected.

SQL> select role,common from cdb_roles where role='C##ROLE';

ROLE
--------------------------------------------------------------------------------
COM
---
C##ROLE
YES

As seen above, the common role has been copied upon the plug-in like the picture illustrates:
[Picture: After plug-in of the PDB]

Not surprisingly, the local user app, together with the local privilege CREATE SESSION, was moved to the target container database. But it is not so obvious that the common role is then copied to the target CDB. This is something I found out during delivery of a recent Oracle University LVC about 12c New Features, thanks to a question from one attendee. My guess was that it would lead to an error upon unplug, but this test case proves it doesn’t. I thought that behavior might be of interest to the Oracle community. As always: Don’t believe it, test it! :-)


Tagged: 12c New Features, Multitenant
Categories: DBA Blogs

Log Buffer #382, A Carnival of the Vanities for DBAs

Pythian Group - Fri, 2014-08-01 07:41

Leading the way are the blogs which are acting as beacons of information, guiding the way towards new vistas of innovation. This Log Buffer edition appreciates that role and presents you with a few of those blogs.

Oracle:

Is there any recommended duration after which Exalytics Server should be rebooted for optimal performance of Server?

GlassFish On the Cloud Consulting Services by C2B2

This introduction to SOA Governance series contains two videos. The first one explains SOA Governance and why we need it by using a case study. The second video introduces Oracle Enterprise Repository (OER), and how it can help with SOA Governance.

Oracle BI APPs provide two data warehouse generated fiscal calendars OOTB.

If you’re a community manager who’s publishing, monitoring, engaging, and analyzing communities on multiple social networks manually and individually, you need a hug.

SQL Server:

Spackle: Making sure you can connect to the DAC

Test-Driven Development (TDD) has a misleading name, because the objective is to design and specify that the system you are developing behaves in the ways that the customer expects, and to prove that it does so for the lifetime of the system.

Set a security standard across environments that developers can see and run, but not change.

Resilient T-SQL code is code that is designed to last, and to be safely reused by others. The goal of defensive database programming, the goal of this book, is to help you to produce resilient T-SQL code that robustly and gracefully handles cases of unintended use, and is resilient to common changes to the database environment.

One option to get notified when TempDB grows is to create a SQL Alert to fire a SQL Agent Job that will automatically send an email alerting the DBA when the Tempdb reaches a specific file size.

MySQL:

By default when using MySQL’s standard replication, all events are logged in the binary log and those binary log events are replicated to all slaves (it’s possible to filter out some schema).

Testing MySQL repository packages: how we make sure they work for you

If your project does not have something that you can adapt that quote to, odds are your testing is inadequate.

Compare and Synchronize with Updated Comparison Tools!

Beyond the FRM: ideas for a native MySQL Data Dictionary.

Categories: DBA Blogs

Oracle Database 12c Release 1 (12.1.0.2) Generally Available

Oracle Database Server 12.1.0.2.0 is now generally available on the Oracle Software Delivery website edelivery.oracle.com and...

We share our skills to maximize your revenue!
Categories: DBA Blogs

SQL Server and OS Error 1117, Error 9001, Error 823

Pythian Group - Thu, 2014-07-31 08:32

Along with other administrators, the life of us DBAs is no different: full of adventure.  At times we encounter an issue that is completely new to us, one that we have not faced in the past.  Today, I will be writing about such a case.  Not so long back, at the beginning of June, while I was having my morning tea, I got a page from a customer we normally do not receive pages from. While analyzing the error logs, I noticed several lines of errors like the ones below:

2014-06-07 21:03:40.57 spid6s Error: 17053, Severity: 16, State: 1.
LogWriter: Operating system error 21(The device is not ready.) encountered.
2014-06-07 21:03:40.57 spid6s Write error during log flush.
2014-06-07 21:03:40.57 spid67 Error: 9001, Severity: 21, State: 4.
The log for database 'SSCDB' is not available. Check the event log for related error messages. Resolve any errors and restart the database.
2014-06-07 21:03:40.58 spid67 Database SSCDB was shutdown due to error 9001 in routine 'XdesRMFull::Commit'. Restart for non-snapshot databases will be attempted after all connections to the database are aborted.
2014-06-07 21:03:40.65 spid25s Error: 17053, Severity: 16, State: 1.
fcb::close-flush: Operating system error (null) encountered.
2014-06-07 21:03:40.65 spid25s Error: 17053, Severity: 16, State: 1.
fcb::close-flush: Operating system error (null) encountered.

I had never seen this kind of error in the past, so my next step was to check Google, which returned too many results. Two sites were worthwhile: the first covers OS Error 1117 in a Microsoft KB article, whereas the second, by Erin Stellato ( B | T ), talks about the other errors, like Error 823 and Error 9001.  Further, I checked the server details and found that this is exactly the issue here: the server is using a PVSCSI (Para-Virtualized SCSI) controller, as opposed to LSI, on the VMware host. 

Resolving the issue

I had a call with the client and got their consent to restart the service. This was quick, and after it came back, I ran CHECKDB. “We are good!” I thought.
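For reference, the consistency check was something along these lines (a minimal sketch; the exact options used may have differed):

DBCC CHECKDB (N'SSCDB') WITH NO_INFOMSGS, ALL_ERRORMSGS;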

But wait. This was the temporary fix. Yes, you read that correctly: this was the temporary fix. The issue actually lies with VMware; it’s a known issue according to a VMware KB article. To fix it for good, we’ll have to upgrade to vSphere 5.1, according to the same KB article.

Please be advised that the first thing I did here was apply the temporary fix; the root cause analysis I did last, after the server was up and running fine.

photo credit: Andreas.  via photopin CC

Categories: DBA Blogs

Interesting Behavior of MaxCmdsInTran Parameter

Pythian Group - Wed, 2014-07-30 10:06

I recently worked on a transactional replication issue and discovered interesting behavior of the Log Reader Agent switch called MaxCmdsInTran, and I wanted to share it with you.

Let’s take a look at the use of this switch by looking at the MSDN documentation below:

MaxCmdsInTran number_of_commands

Specifies the maximum number of statements grouped into a transaction as the Log Reader writes commands to the distribution database. Using this parameter allows the Log Reader Agent and Distribution Agent to divide large transactions (consisting of many commands) at the Publisher into several smaller transactions when applied at the Subscriber. Specifying this parameter can reduce contention at the Distributor and reduce latency between the Publisher and Subscriber. Because the original transaction is applied in smaller units, the Subscriber can access rows of a large logical Publisher transaction prior to the end of the original transaction, breaking strict transactional atomicity. The default is 0, which preserves the transaction boundaries of the Publisher.
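The switch itself goes on the Log Reader Agent’s command line, typically in the SQL Agent job step that runs logread.exe; a hedged sketch with placeholder names:

-Publisher [MyPublisher] -PublisherDB [MyPublishedDB] -Distributor [MyDistributor] -DistributorSecurityMode 1 -Continuous -MaxCmdsInTran 10000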

However, I observed that an update on a primary key column won’t be split into multiple smaller transactions as described in the documentation.

Looking further into this reveals that it is probably the effect of a bounded update. A bounded update has to be processed as a whole, since it sends all deletes followed by all inserts; it can’t be broken into smaller transactions, as there is no way to know what a safe boundary would be.

The key difference comes from how updates are replicated when you update a PK column versus a non-PK column. Let’s take an example to look at this (in this example C1 is the non-PK and C2 is the PK column).

If you update the non-PK column, it is replicated as an update.

– Updating non-PK column

begin tran My_Deferred_Update_1_Row

update T1 set c1 = 1 where C1=2

commit tran My_Deferred_Update_1_Row

– Below is what gets added in msrepl_commands

exec sp_replshowcmds 1000

xact_seqno                                      command

0x0000016E000005330004 {CALL [dbo].[sp_MSupd_dbot1] (1,,2,0x01)}

What is bounded update?

However, when you do an update on PK/clustered index columns, it is replicated as a Delete/Insert pair.

– Updating unique column

begin tran My_Bounded_Update_2_Rows

update T1 set c2 = c2 + 1000

commit tran My_Bounded_Update_2_Rows

– Below is what get added in msrepl_commands

exec sp_replshowcmds 1000

xact_seqno                                        command

0x00000017000000B5000E  {CALL [dbo].[sp_MSdel_dboT1] (1)}

0x00000017000000B5000E  {CALL [dbo].[sp_MSdel_dboT1] (2)}

0x00000017000000B5000E  {CALL [dbo].[sp_MSins_dboT1] (1,3000,1)}

0x00000017000000B5000E  {CALL [dbo].[sp_MSins_dboT1] (2,1002,2)}

As you can see in the above case, when we do an update on a PK/clustered index column, the updates are sent as deletes followed by inserts (this is called a bounded update). This is one single transaction converted into a delete/insert pair: all deletes are sent first, followed by all inserts.

We cannot break this transaction (a PK update), as doing so would cause the deletes (some or all) to happen first and the inserts to follow in a separate transaction, breaking the transaction boundary. Splitting this operation into multiple transactions would cause inconsistency, and that is most probably the reason this switch doesn’t work in this situation.

Why does replication send all deletes first and then all inserts, instead of delete/insert pairs in order?

Let’s assume table A contains two rows, with unique column C1 values of 1 and 2.

Now the user runs the following: update A set c1 = c1 + 1.

The log records will be like

LOP_BEGIN_UPDATE

Del 1

Ins 2

Del 2

Ins 3

LOP_END_UPDATE

And the commands posted in the distribution database will be like

{CALL [sp_MSdel_dboA] (1)}

{CALL [sp_MSdel_dboA] (2)}

{CALL [sp_MSins_dboA] (1,2)}

{CALL [sp_MSins_dboA] (2,3)}

But if it sent updates directly, you’d see:

Update A set c1 = 2

Update A set c1 = 3

In that case, the first update would fail, since c1 = 2 already exists. That’s why it deletes the rows first before inserting them back with the new values.
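A small repro (my own illustration, not from the original post) shows why applying the updates in order would fail under a unique constraint:

CREATE TABLE A (c1 INT NOT NULL UNIQUE);
INSERT INTO A (c1) VALUES (1), (2);

-- Applying the update row by row would fail on the very first row,
-- because c1 = 2 already exists (error 2627, unique constraint violation):
UPDATE A SET c1 = 2 WHERE c1 = 1;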

I would recommend looking at the option of publishing stored procedure execution to avoid this kind of huge update, which can cause performance issues in replication.

Happy Reading!

 

 

Categories: DBA Blogs

In-Memory Column Store: 10046 May Be Lying to You!

Pythian Group - Wed, 2014-07-30 07:46

The Oracle In-Memory Column Store (IMC) is a new database option available to Oracle Database Enterprise Edition (EE) customers. It introduces a new memory area housed in your SGA, which makes use of the compression functionality brought by the Oracle Exadata platform, as well as new column-oriented data storage versus the traditional row-oriented storage. Note: you don’t need to be running on Exadata to be able to use the IMC!

 

Part I – How does it work?

In this part we’ll take a peek under the hood of the IMC and check out some of its internal mechanics.

Let’s create a sample table which we will use for our demonstration:


create table test inmemory priority high
as
select a.object_name as name, rownum as rn,
sysdate + rownum / 10000 as dt
from all_objects a, (select rownum from dual connect by level <= 500)
/

Almost immediately upon creating this table, the w00? processes will wake up from sleeping on the event ‘Space Manager: slave idle wait’ and start their analysis of the new table. By the way, the sleep times for this event are between 3 and 5 seconds, so it’s normal to experience a little bit of a delay.

The process that picked it up will then create a new entry in the new dictionary table compression$, such as this one:

SQL> exec pt('select ts#,file#,block#,obj#,dataobj#,ulevel,sublevel,ilevel,flags,bestsortcol, tinsize,ctinsize,toutsize,cmpsize,uncmpsize,mtime,spare1,spare2,spare3,spare4 from compression$');
TS# : 4
FILE# : 4
BLOCK# : 130
OBJ# : 20445
DATAOBJ# : 20445
ULEVEL : 5
SUBLEVEL : 9
ILEVEL : 1582497813
FLAGS :
BESTSORTCOL : -1
TINSIZE : 16339840
CTINSIZE :
TOUTSIZE : 9972219
CMPSIZE :
UNCMPSIZE :
MTIME : 13-may-2014 23:14:46
SPARE1 : 31
SPARE2 : 5256
SPARE3 : 571822
SPARE4 :



Plus, there is also a BLOB column in compression$, which holds the analyzer’s findings:


SQL> select analyzer from compression$;

ANALYZER
——————————————————————————
004B445A306AD5025A0000005A6B8E0200000300000000000001020000002A0000003A0000004A(output truncated for readability)


A quick check reveals that this is indeed our object:


SQL> exec pt('select object_name, object_type, owner from dba_objects where data_object_id = 20445');
OBJECT_NAME : TEST
OBJECT_TYPE : TABLE
OWNER : FOO
-----------------

PL/SQL procedure successfully completed.


And we can see the object is now stored in the IMC by looking at v$im_segments:

SQL> exec pt('select * from v$im_segments');
OWNER : FOO
SEGMENT_NAME : TEST
PARTITION_NAME :
SEGMENT_TYPE : TABLE
TABLESPACE_NAME : USERS
INMEMORY_SIZE : 102301696
BYTES : 184549376
BYTES_NOT_POPULATED : 0
POPULATE_STATUS : COMPLETED
INMEMORY_PRIORITY : HIGH
INMEMORY_DISTRIBUTE : AUTO
INMEMORY_DUPLICATE : NO DUPLICATE
INMEMORY_COMPRESSION : FOR QUERY LOW
CON_ID : 0
-----------------

PL/SQL procedure successfully completed.



Thus, we are getting the expected performance benefit of it being in the IMC:

SQL> alter session set inmemory_query=disable;

Session altered.

Elapsed: 00:00:00.01
SQL> select count(*) from test;

COUNT(*)
----------
4187500

Elapsed: 00:00:03.96
SQL> alter session set inmemory_query=enable;

Session altered.

Elapsed: 00:00:00.01
SQL> select count(*) from test;

COUNT(*)
----------
4187500

Elapsed: 00:00:00.13
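
Another way to confirm that the second run was served from the IMC is to check the session’s in-memory statistics; a quick sketch (statistic names as of 12.1.0.2):

select n.name, s.value
from v$mystat s, v$statname n
where s.statistic# = n.statistic#
and n.name in ('IM scan rows', 'table scans (IM)');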


So far, so good.


Part II – Execution Plans

There are some things we need to be aware of, though, when we are using the IMC in 12.1.0.2. One of them is that we can’t always trust the execution plans anymore.

Let’s go back to our original sample table and recreate it using the default setting of INMEMORY PRIORITY NONE.


drop table test purge
/

create table test inmemory priority none
as
select a.object_name as name, rownum as rn,
sysdate + rownum / 10000 as dt
from all_objects a, (select rownum from dual connect by level <= 500)
/



Now let’s see what plan we’d get if we were to query it right now:


SQL> explain plan for select name from test where name = 'ALL_USERS';

Explained.

SQL> @?/rdbms/admin/utlxpls

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1357081020

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |   614 | 12280 |   811  (73)| 00:00:01 |
|*  1 |  TABLE ACCESS INMEMORY FULL| TEST |   614 | 12280 |   811  (73)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - inmemory("NAME"='ALL_USERS')
       filter("NAME"='ALL_USERS')

14 rows selected.


Okay, you might say now that EXPLAIN PLAN is only a guess; it’s not the real plan, and the real plan has to be different. And you would be right. Usually.

Watching the slave processes, there is no activity related to this table. Since its PRIORITY is NONE, it won’t be loaded into the IMC until it’s actually queried the first or second time around.

So let’s take a closer look then, shall we:

SQL> alter session set tracefile_identifier='REAL_PLAN';

Session altered.

SQL> alter session set events '10046 trace name context forever, level 12';

Session altered.

SQL> select name from test where name = 'ALL_USERS';



Now let’s take a look at the STAT line in that trace file. Note: I closed the above session to make sure that we get the full trace data.


PARSING IN CURSOR #140505885438688 len=46 dep=0 uid=64 oct=3 lid=64 tim=32852930021 hv=3233947880 ad='b4d04b00' sqlid='5sybd9b0c4878'
select name from test where name = 'ALL_USERS'
END OF STMT
PARSE #140505885438688:c=6000,e=10014,p=0,cr=2,cu=0,mis=1,r=0,dep=0,og=1,plh=1357081020,tim=32852930020
EXEC #140505885438688:c=0,e=58,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=32852930241
WAIT #140505885438688: nam='SQL*Net message to client' ela= 25 driver id=1650815232 #bytes=1 p3=0 obj#=20466 tim=32852930899
WAIT #140505885438688: nam='direct path read' ela= 13646 file number=4 first dba=21507 block cnt=13 obj#=20466 tim=32852950242
WAIT #140505885438688: nam='direct path read' ela= 2246 file number=4 first dba=21537 block cnt=15 obj#=20466 tim=32852953528
WAIT #140505885438688: nam='direct path read' ela= 1301 file number=4 first dba=21569 block cnt=15 obj#=20466 tim=32852955406

FETCH #140505885438688:c=182000,e=3365871,p=17603,cr=17645,cu=0,mis=0,r=9,dep=0,og=1,plh=1357081020,tim=32857244740
STAT #140505885438688 id=1 cnt=1000 pid=0 pos=1 obj=20466 op='TABLE ACCESS INMEMORY FULL TEST (cr=22075 pr=22005 pw=0 time=865950 us cost=811 size=12280 card=614)'



So that’s still the wrong plan right there, and the STAT line clearly shows that we actually did 22005 physical reads, and therefore likely no in-memory scan but a full scan from disk. There’s clearly a bug here: the execution plan reported is plain wrong.

Thus, be careful about using INMEMORY PRIORITY NONE, as you may not get what you expect. Since the PRIORITY NONE setting may also be overridden by any other PRIORITY settings, your data may get flushed out of the IMC even though your execution plans say otherwise. And I’m sure many of you know it’s often not slow response times on queries that cause a phone to ring hot; it’s inconsistent response times. This feature, if used inappropriately, will pretty much guarantee inconsistent response times.

Apparently, what we should be doing is size the In-Memory Column Store appropriately, to hold the objects we actually need in there, and make sure they are always in there by setting a PRIORITY of LOW or higher. Use CRITICAL and HIGH to ensure the most vital objects of the application are populated first.
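For example, raising the priority of an existing segment is a simple ALTER (12c syntax; population is still subject to the store having enough space):

alter table test inmemory priority critical;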

There was one other oddity that I noticed while tracing the W00? processes.

Part III – What are you scanning, Oracle ?

The m000 process’ trace file reveals many back-to-back executions of this select:


PARSING IN CURSOR #140670951860040 len=104 dep=1 uid=0 oct=3 lid=0 tim=23665542991 hv=2910336760 ad='fbd06928' sqlid='24uqc4aqrhdrs'
select /*+ result_cache */ analyzer from compression$ where obj#=:1 and ulevel=:2



They all supply the same obj# bind value, which is our table’s object number. The ulevel values used vary between executions.


However, looking at the related WAIT lines for this cursor, we see:


WAIT #140670951860040: nam='direct path read' ela= 53427 file number=4 first dba=18432 block cnt=128 obj#=20445 tim=23666569746
WAIT #140670951860040: nam='direct path read' ela= 38073 file number=4 first dba=18564 block cnt=124 obj#=20445 tim=23666612210
WAIT #140670951860040: nam='direct path read' ela= 38961 file number=4 first dba=18816 block cnt=128 obj#=20445 tim=23666665534
WAIT #140670951860040: nam='direct path read' ela= 39708 file number=4 first dba=19072 block cnt=128 obj#=20445 tim=23666706469
WAIT #140670951860040: nam='direct path read' ela= 40242 file number=4 first dba=19328 block cnt=128 obj#=20445 tim=23666749431
WAIT #140670951860040: nam='direct path read' ela= 39147 file number=4 first dba=19588 block cnt=124 obj#=20445 tim=23666804243
WAIT #140670951860040: nam='direct path read' ela= 33654 file number=4 first dba=19840 block cnt=128 obj#=20445 tim=23666839836
WAIT #140670951860040: nam='direct path read' ela= 38908 file number=4 first dba=20096 block cnt=128 obj#=20445 tim=23666881932
WAIT #140670951860040: nam='direct path read' ela= 40605 file number=4 first dba=20352 block cnt=128 obj#=20445 tim=23666924029
WAIT #140670951860040: nam='direct path read' ela= 32089 file number=4 first dba=20612 block cnt=124 obj#=20445 tim=23666962858
WAIT #140670951860040: nam='direct path read' ela= 36223 file number=4 first dba=20864 block cnt=128 obj#=20445 tim=23667001900
WAIT #140670951860040: nam='direct path read' ela= 39733 file number=4 first dba=21120 block cnt=128 obj#=20445 tim=23667043146
WAIT #140670951860040: nam='direct path read' ela= 17607 file number=4 first dba=21376 block cnt=128 obj#=20445 tim=23667062232

… and several more.


Now, compression$ contains only a single row. Its total extent size is negligible as well:


SQL> select sum(bytes)/1024/1024 from dba_extents where segment_name = 'COMPRESSION$';

SUM(BYTES)/1024/1024
--------------------
.0625


So how come Oracle is reading so many blocks? Note that each of the above waits is a multi-block read of 128 blocks.

Let’s take a look at what Oracle is actually reading there:

begin
pt('select segment_name, segment_type, owner
from dba_extents where file_id = 4
and 18432 between block_id and block_id + blocks - 1');
end;
/

SEGMENT_NAME : TEST
SEGMENT_TYPE : TABLE
OWNER : FOO
-----------------

PL/SQL procedure successfully completed.

There’s our table again. Wait. What?

There must be some magic going on under the covers here. In my understanding, a plain select against table A should not be scanning table B.

If I manually run the same select statement against compression$, I get totally normal trace output.

This reminds me of the good old:

SQL> select piece from IDL_SB4$;
ERROR:
ORA-00932: inconsistent datatypes: expected CHAR got B4



But I digress.

It could simply be a bug that results in these direct path reads being attributed to the wrong cursor. Or it could be intended: it is, after all, this process’s job to analyze and load this table, and this way the resource usage it causes is instrumented and can be tracked.

Either way, to sum things up we can say that:

- Performance benefits can potentially be huge
- Oracle automatically scans and caches segments marked as INMEMORY PRIORITY LOW|MEDIUM|HIGH|CRITICAL (they don’t need to be queried first!)
- Oracle scans segments marked as INMEMORY PRIORITY NONE (the default) only after they’re accessed the second time, and they may get overridden by higher priorities
- Oracle analyzes the table and stores the results in compression$
- Based on that analysis, Oracle may decide to load only one column or another into the IMC, or the entire table, depending on available space and on the INMEMORY clause used
- It’s the W00? processes that use some magic to do this analysis and read the segment into the IMC
- This analysis is also likely to be triggered again whenever space management of the IMC kicks in, but I haven’t investigated that yet.

Categories: DBA Blogs

Solid Conference San Francisco 2014: Complete Video Compilation

Surachart Opun - Tue, 2014-07-29 08:17
Solid Conference focused on the intersection of software and hardware. It’s a great community of software and hardware people, where audiences can learn new ideas for combining software and hardware. It gathered ideas from engineers, researchers, roboticists, artists, founders of startups, and innovators.
O’Reilly launched HD videos for this conference (Solid Conference San Francisco 2014: Complete Video Compilation. Experience the revolution at the intersection of hardware and software, and imagine the future). The video files may be huge and take much time to download, so please use a download manager program for help.
After watching, I was excited to learn some new things (run time: 36 hours 8 minutes): machines, devices, components, etc.

Written By: Surachart Opun http://surachartopun.com
Categories: DBA Blogs

Create Windows Service for Oracle RAC

Pythian Group - Tue, 2014-07-29 08:08

It’s my first time on a RAC system for Windows, and I’m happy to learn something new to share.

I created a new service for a database (restoredb), only to find out the ORACLE_HOME for the service is "c:\oracle\product\10.2.0\asm_1".

Any ideas as to what was wrong?

C:\dba_pythian>set oracle_home=C:\oracle\product\10.2.0\db_1

C:\dba_pythian>echo %ORACLE_HOME%
C:\oracle\product\10.2.0\db_1

C:\dba_pythian>oradim -NEW -SID restoredb -STARTMODE manual
Instance created.

C:\dba_pythian>env
 1 STOPPED agent11g1Agent                                    c:\oracle\app\11.1.0\agent11g
 2 STOPPED agent11g1AgentSNMPPeerEncapsulator                c:\oracle\app\11.1.0\agent11g\bin\encsvc.exe
 3 STOPPED agent11g1AgentSNMPPeerMasterAgent                 c:\oracle\app\11.1.0\agent11g\bin\agntsvc.exe
 4 RUNNING +ASM1                                             c:\oracle\product\10.2.0\asm_1
 5 RUNNING ClusterVolumeService                              C:\oracle\product\10.2.0\crs
 6 RUNNING CRS                                               C:\oracle\product\10.2.0\crs
 7 RUNNING CSS                                               C:\oracle\product\10.2.0\crs
 8 RUNNING EVM                                               C:\oracle\product\10.2.0\crs
 9 STOPPED JobSchedulerDWH1                                  c:\oracle\product\10.2.0\db_1
10 STOPPED JobSchedulerRMP1                                  c:\oracle\product\10.2.0\db_1
11 RUNNING OraASM10g_home1TNSListenerLISTENER_PRD-DB-10G-01  C:\oracle\product\10.2.0\asm_1
12 STOPPED OraDb10g_home1TNSListener                         c:\oracle\product\10.2.0\db_1
13 STOPPED ProcessManager                                    "C:\oracle\product\10.2.0\crs"
14 RUNNING DWH1                                              c:\oracle\product\10.2.0\db_1
15 RUNNING RMP1                                              c:\oracle\product\10.2.0\db_1
16 RUNNING agent12c1Agent                                    C:\agent12c\core\12.1.0.4.0
17 RUNNING restoredb                                         c:\oracle\product\10.2.0\asm_1
18 STOPPED JobSchedulerrestoredb                             c:\oracle\product\10.2.0\asm_1

Checking the PATH variable, we find that the ASM home is listed before the DB home.

C:\dba_pythian>set
Path=C:\oracle\product\10.2.0\asm_1\bin;C:\oracle\product\10.2.0\db_1\bin;C:\WINDOWS\system32;C:\WINDOWS
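
That explains it: with the ASM home's bin directory first in the PATH, a bare oradim call resolves to the copy in the ASM home. On Windows this can be confirmed with the built-in where command (output below is illustrative):

C:\dba_pythian>where oradim
C:\oracle\product\10.2.0\asm_1\bin\oradim.exe
C:\oracle\product\10.2.0\db_1\bin\oradim.exe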

Create the database service specifying the full path to oradim in the DB home:

C:\dba_pythian>oradim -DELETE -SID restoredb
Instance deleted.

C:\dba_pythian>env
 1 STOPPED agent11g1Agent                                    c:\oracle\app\11.1.0\agent11g
 2 STOPPED agent11g1AgentSNMPPeerEncapsulator                c:\oracle\app\11.1.0\agent11g\bin\encsvc.exe
 3 STOPPED agent11g1AgentSNMPPeerMasterAgent                 c:\oracle\app\11.1.0\agent11g\bin\agntsvc.exe
 4 RUNNING +ASM1                                             c:\oracle\product\10.2.0\asm_1
 5 RUNNING ClusterVolumeService                              C:\oracle\product\10.2.0\crs
 6 RUNNING CRS                                               C:\oracle\product\10.2.0\crs
 7 RUNNING CSS                                               C:\oracle\product\10.2.0\crs
 8 RUNNING EVM                                               C:\oracle\product\10.2.0\crs
 9 STOPPED JobSchedulerDWH1                                  c:\oracle\product\10.2.0\db_1
10 STOPPED JobSchedulerRMP1                                  c:\oracle\product\10.2.0\db_1
11 RUNNING OraASM10g_home1TNSListenerLISTENER_PRD-DB-10G-01  C:\oracle\product\10.2.0\asm_1
12 STOPPED OraDb10g_home1TNSListener                         c:\oracle\product\10.2.0\db_1
13 STOPPED ProcessManager                                    "C:\oracle\product\10.2.0\crs"
14 RUNNING DWH1                                              c:\oracle\product\10.2.0\db_1
15 RUNNING RMP1                                              c:\oracle\product\10.2.0\db_1
16 RUNNING agent12c1Agent                                    C:\agent12c\core\12.1.0.4.0

C:\dba_pythian>dir C:\oracle\product\10.2.0\db_1\BIN\orad*
 Volume in drive C has no label.
 Volume Serial Number is D4FE-B3A8

 Directory of C:\oracle\product\10.2.0\db_1\BIN

07/08/2010  10:01 AM           121,344 oradbcfg10.dll
07/20/2010  05:20 PM             5,120 oradim.exe
07/20/2010  05:20 PM             3,072 oradmop10.dll
               3 File(s)        129,536 bytes
               0 Dir(s)  41,849,450,496 bytes free

C:\dba_pythian>C:\oracle\product\10.2.0\db_1\BIN\oradim.exe -NEW -SID restoredb -STARTMODE manual
Instance created.

C:\dba_pythian>env
 1 STOPPED agent11g1Agent                                    c:\oracle\app\11.1.0\agent11g
 2 STOPPED agent11g1AgentSNMPPeerEncapsulator                c:\oracle\app\11.1.0\agent11g\bin\encsvc.exe
 3 STOPPED agent11g1AgentSNMPPeerMasterAgent                 c:\oracle\app\11.1.0\agent11g\bin\agntsvc.exe
 4 RUNNING +ASM1                                             c:\oracle\product\10.2.0\asm_1
 5 RUNNING ClusterVolumeService                              C:\oracle\product\10.2.0\crs
 6 RUNNING CRS                                               C:\oracle\product\10.2.0\crs
 7 RUNNING CSS                                               C:\oracle\product\10.2.0\crs
 8 RUNNING EVM                                               C:\oracle\product\10.2.0\crs
 9 STOPPED JobSchedulerDWH1                                  c:\oracle\product\10.2.0\db_1
10 STOPPED JobSchedulerRMP1                                  c:\oracle\product\10.2.0\db_1
11 RUNNING OraASM10g_home1TNSListenerLISTENER_PRD-DB-10G-01  C:\oracle\product\10.2.0\asm_1
12 STOPPED OraDb10g_home1TNSListener                         c:\oracle\product\10.2.0\db_1
13 STOPPED ProcessManager                                    "C:\oracle\product\10.2.0\crs"
14 RUNNING DWH1                                              c:\oracle\product\10.2.0\db_1
15 RUNNING RMP1                                              c:\oracle\product\10.2.0\db_1
16 RUNNING agent12c1Agent                                    C:\agent12c\core\12.1.0.4.0
17 RUNNING restoredb                                         c:\oracle\product\10.2.0\db_1
18 STOPPED JobSchedulerrestoredb                             c:\oracle\product\10.2.0\db_1

C:\dba_pythian>
Categories: DBA Blogs

How SQL Server Browser Service Works

Pythian Group - Tue, 2014-07-29 08:07

Some of you may wonder what role the SQL Browser service plays in a SQL Server instance. In this blog post, I'll give an overview of how SQL Server Browser plays a crucial role in connectivity, and dig into its internals by capturing Network Monitor output while connecting under different scenarios.

Here is an executive summary of the connectivity flow:

(Diagram: executive connectivity workflow)

 

Here is another diagram to explain the SQL Server connectivity status for Named & Default instance under various scenarios:

(Diagram: SQL Server connectivity status for Named & Default instances under various scenarios)

Network Monitor output for connectivity to Named instance when SQL Browser is running:

In the Network Monitor capture, we can see that a UDP request on port 1434 is sent from the local machine (the client) to the SQL Server machine (the server), and that the response comes back from the server's UDP port 1434 to the client's port with the list of instances and the port each one listens on:

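You can reproduce this exchange yourself without Network Monitor. Here is a minimal Python sketch, assuming the Browser service is reachable; the host name is a placeholder, and the request byte comes from the SQL Server Resolution Protocol (0x03, CLNT_UCAST_EX, asks the Browser to enumerate all instances):

import socket

server = "myserver"  # placeholder for your SQL Server host

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.settimeout(5)
# 0x03 (CLNT_UCAST_EX) asks the Browser service to list every instance
sock.sendto(b"\x03", (server, 1434))
data, _ = sock.recvfrom(65535)
# The reply is 0x05, a two-byte length, then a semicolon-delimited string
# naming each instance and the TCP port it listens on
print(data[3:].decode("ascii", errors="replace"))
sock.close()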

 

Network Monitor output for connectivity to Named instance when SQL Browser is stopped/disabled:

We can see that the client sends five requests, none of which gets a response from the server's UDP port 1434, so connectivity to the named instance is never established.


 

Network Monitor output for connectivity to Named instance with port number specified in connection string & SQL Browser is stopped/disabled:

No call is made to the server's UDP port 1434; instead, the connection goes directly to the TCP port specified in the connection string.
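
As a concrete illustration, here is a hedged Python sketch using the pyodbc driver; the server, instance, port, and driver name are placeholders for whatever is installed in your environment. Because the TCP port follows the comma, the driver never needs the UDP 1434 lookup:

import pyodbc  # assumes pip install pyodbc and a SQL Server ODBC driver

conn = pyodbc.connect(
    "DRIVER={SQL Server};"             # substitute your installed driver name
    "SERVER=myserver\\SQLINST1,1466;"  # explicit TCP port skips SQL Browser
    "DATABASE=master;"
    "Trusted_Connection=yes;"
)
print(conn.execute("select @@servername").fetchone())
conn.close()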

Network Monitor output for connectivity to Default instance when SQL Browser is running:

We can see that no call is made to the server's UDP port 1434, on which the SQL Server Browser listens; the client connects straight to the default TCP port.


 

Network Monitor output for connectivity to a Default instance configured to listen on a port other than the default 1433, with SQL Browser running:

We can see that connectivity fails after multiple attempts, because the client assumes that a default instance of SQL Server always listens on TCP port 1433.

The blog post below describes some workarounds for this situation:

http://blogs.msdn.com/b/dataaccesstechnologies/archive/2010/03/03/running-sql-server-default-instance-on-a-non-default-or-non-standard-tcp-port-tips-for-making-application-connectivity-work.aspx

References:

SQL Server Browser Service - http://msdn.microsoft.com/en-us/library/ms181087.aspx

Ports used by SQL Server and Browser Service - http://msdn.microsoft.com/en-us/library/ms175483.aspx

SQL Server Resolution Protocol Specification - http://msdn.microsoft.com/en-us/library/cc219703(v=prot.10).aspx

Thanks for reading!

 

Categories: DBA Blogs

Query with new plan

Bobby Durrett's DBA Blog - Mon, 2014-07-28 18:24

I came up with a simple query that shows a running SQL executing a different plan than what it had in the past.  Here is the query:

-- show currently executing sqls that have history
-- but who have never run with the current plan
-- joins v$session to v$sql to get plan_hash_value of 
-- executing sql.
-- queries dba_hist_sqlstat for previously used values 
-- of plan_hash_value.
-- only reports queries that have an older plan that is 
-- different from the new one.

select
    vs.sid,
    vs.sql_id,
    vs.last_call_et,
    sq.plan_hash_value
from
    v$session vs,
    v$sql sq
where
    vs.sql_id = sq.sql_id and
    vs.sql_child_number = sq.child_number and
    sq.plan_hash_value not in
        (select ss.plan_hash_value
         from dba_hist_sqlstat ss
         where ss.sql_id = sq.sql_id) and
    0 <
        (select count(ss.plan_hash_value)
         from dba_hist_sqlstat ss
         where ss.sql_id = sq.sql_id);

Example output:

       SID SQL_ID        LAST_CALL_ET PLAN_HASH_VALUE
---------- ------------- ------------ ---------------
       229 cq8bhsxbbf9k7            0      3467505462

This was a test query.  I ran it a bunch of times with an index and then dropped the index after creating an AWR snapshot.  The query executed with a different plan when I ran it without the index.  The same type of plan change could happen in production if an index were accidentally dropped.

I’m hoping to use this query to show production queries that have run in the past but whose current plan differs from any that they have used before.  Of course, a new plan doesn’t necessarily mean you have a problem but it might be helpful to recognize those plans that are new and that differ from the plans used in the past.
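
If you want to poll for this condition from a script instead of running the query by hand, here is a minimal, hypothetical sketch using the python-oracledb driver; the connection details are placeholders and the alert is just a print:

import oracledb  # pip install oracledb; needs SELECT on the v$ and AWR views

# The query from above, trimmed to the columns we report on
PLAN_CHANGE_SQL = """
select vs.sid, vs.sql_id, vs.last_call_et, sq.plan_hash_value
from v$session vs, v$sql sq
where vs.sql_id = sq.sql_id
  and vs.sql_child_number = sq.child_number
  and sq.plan_hash_value not in
      (select ss.plan_hash_value from dba_hist_sqlstat ss
       where ss.sql_id = sq.sql_id)
  and 0 < (select count(*) from dba_hist_sqlstat ss
           where ss.sql_id = sq.sql_id)
"""

def check_new_plans(user, password, dsn):
    # dsn is a placeholder such as "dbhost/orcl"
    with oracledb.connect(user=user, password=password, dsn=dsn) as conn:
        with conn.cursor() as cur:
            cur.execute(PLAN_CHANGE_SQL)
            for sid, sql_id, last_call_et, plan_hash in cur:
                print(f"SID {sid}: {sql_id} is running new plan {plan_hash}")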

– Bobby

 

Categories: DBA Blogs

TechTalk v5.0 – The Age of Big Data with Alex Morrise

Pythian Group - Mon, 2014-07-28 14:06

Who: Hosted by Blackbird, with a speaking session by Alex Morrise, Chief Data Scientist at Pythian.

What: TechTalk presentation, beer, wine, snacks and Q&A

Where: Blackbird HQ – 712 Tehama Street (corner of 8th and Tehama) San Francisco, CA

When: Thursday July 31, 2014 from 6:00-8:00 PM

How: RSVP here!

TechTalk v5.0 welcomes to the stage Alex Morrise, Chief Data Scientist at Pythian. Alex previously worked with Idle Games, Quid, and most recently Beats Music, where he led the development of an adaptive, contextual music recommendation server. Alex earned a PhD in Theoretical Physics from UC Santa Cruz.

This edition of TechTalk will be based on how the age of big data allows statistical inference on an unprecedented scale. Inference is the process of extracting knowledge from data, many times uncovering latent variables unifying seemingly diverse pieces of information. As data grows in complexity and dimension, visualization becomes increasingly difficult. How do we represent complex data to discover implicit and explicit relationships? We discuss how to Visualize Inference in some interesting data sets that uncover topics as diverse as the growth of technology, social gaming, and music.

You won’t want to miss this event, so be sure to RSVP.

 

Categories: DBA Blogs

Unexpected Shutdown Caused by ASR

Pythian Group - Mon, 2014-07-28 13:45

In the past few days I had two incidents and an outage that lasted only a few minutes. Even a brief outage in a production environment carries a real cost, though. The server in question failed over and then failed back four or five times in 15 minutes. I was holding the pager, so I was pulled in to investigate the root cause of the fail-over and fail-back. The events in the SQL Server error logs gave me no clue about what was happening or why, so I looked at the System log in the Windows Event Viewer. I thought, “Maybe I have something there!”

There were two events that came to my attention:

Event Type:     Error
Event Source:   EventLog
Event Category: None
Event ID:       6008
Date:           7/24/2014
Time:           1:14:12 AM
User:           N/A
Computer:       SRV1
Description:
The previous system shutdown at 1:00:31 AM on 7/24/2014 was unexpected.

Event Type:     Information
Event Source:   Server Agents
Event Category: Events
Event ID:       1090
Date:           7/24/2014
Time:           1:15:16 AM
User:           N/A
Computer:       SRV1
Description:
System Information Agent: Health: The server is operational again. The server has previously been shutdown by the Automatic Server Recovery (ASR) feature and has just become operational again.

Both events point to a feature called Automatic Server Recovery (ASR), which is configured at the server level and ships with the hardware; in our case, an HP ProLiant blade server. Similar cases have already been discussed in various resources and threads, and most hardware vendors offer comparable software with similar functionality for their servers.

In my case, my working theory was that the firmware was out of date and needed updating, or that the servers had simply aged. I sent my findings to the customer in an incident report, and within a couple of hours I had a reply with exactly the feedback I was expecting: the hardware was aged. The same may apply to you when you see an event that reads like “System Information Agent: Health: The server is operational again. The server has previously been shutdown by the Automatic Server Recovery (ASR) feature and has just become operational again.” Go check with your system administrator. The root cause of such an unexpected shutdown may not be SQL Server at all, but the system itself. Keep in mind that this is one possible cause, and certainly not the only one.

References:

Automatic System Recovery

 

Categories: DBA Blogs

Four Options For Oracle DBA Tuning Training

Oracle DBAs are constantly solving problems... mysteries. That requires a constant knowledge increase. I received more personal emails from my Oracle DBA Training Options Are Changing posting than ever before. Many of these were from frustrated, angry, and "stuck" DBAs. But in some way, almost all asked the question, "What should I do?"

In response to the "What should I do?" question, I came up with four types of Oracle DBA performance tuning training that are available today. Here they are:

Instructor Led Training (ILT) 
Instructor Led Training (ILT) is the best because you have a personal connection with the teacher. I can't speak for other companies, but I strive to connect with every student and every student knows they can personally email or call me...even years after the training. In fact, I practically beg them to do what we do in class on their production systems and send me the results so I can continue helping them. To me being a great teacher is more than being a great communicator. It's about connection. ILT makes connecting with students easy.

Content Aggregators
Content Aggregators are the folks who pull together free content from various sources, organize and display it. Oh yeah... and they profit from it. Sometimes the content value is high, sometimes not. I tend to think of content aggregators like patent trolls, yet many times they can be a great resource. The problem is you're not dealing with the creator of the content. However, the creator of the content actually knows the subject matter. You can sometimes contact them...as I encourage my students and readers to do.

Content Creators
Content Creators are the folks who create content based on their experiences. We receive that content through their blogs, videos, conference presentations and sometimes through their training. I am a content creator but with an original, almost child-like curiosity, performance research twist. Content creators rarely directly profit from their posted content, but somehow try to transform it into a revenue stream. I can personally attest, it can be a risky financial strategy...but it's personally very rewarding. Since I love doing research, it's easy and enjoyable to post my findings so others may benefit.

Online Training (OLT)
Online Training (OLT) is something I have put off for years. The online Oracle training I have seen is mostly complete and total crap. The content is usually technically low and mechanical. The production quality is something a six-year-old could do on their PC. The teaching quality is ridiculous and the experience puts you to sleep. I do not ever want to be associated with that kind of crowd.

I was determined to do something different. It had to be the highest quality. I have invested thousands of dollars in time, labor, and equipment to make online video training work. (Image: Craig teaching in an OraPub Online Institute seminar.) Based on the encouraging feedback I receive, it's working!

This totally caught me by surprise. I have discovered that I can do things through special effects and a highly organized delivery that are impossible to do in a classroom. (Just watch my seminar introductions on YouTube and you'll quickly see what I mean.) This makes the content rich and highly compressed. One hour of OraPub Online Institute training is easily equivalent to two to four hours of classroom training. Easily. I have also striven to keep the price super low and the production at a professional level, and to ensure the video can be streamed anywhere in the world on any device. Online training is an option, but you have to search for it.

Summary
So there you have it. Even with the economics and the devaluation of DBAs as human beings, coupled with new technologies, the Oracle DBA still has at least four main sources of training and knowledge expansion. Don't give up learning!

Some of you reading may be surprised that I'm writing about this topic because it will hurt my traditional instructor led training (public or on-site) classes. I don't think so. If people can attend my classes in person, they will. Otherwise, I hope they will register for an OraPub Online Institute seminar. Or, at least subscribe to my blog (see upper left of page).

All the best in your quest to do great work,

Craig.
You can watch seminar introductions for free on YouTube!
If you enjoy my blog, subscribing will ensure you get a short, concise email about each new posting. Look for the form on this page.

P.S. If you want me to respond to a comment or you have a question, please feel free to email me directly at craig@orapub .com.

Categories: DBA Blogs

Logging for Slackers

Pythian Group - Mon, 2014-07-28 07:41

When I’m not working on Big Data infrastructure for clients, I develop a few internal web applications and side projects. It’s very satisfying to write a Django app in an afternoon and throw it on Heroku, but there comes a time when people actually start to use it. They find bugs, they complain about downtime, and suddenly your little side project needs some logging and monitoring infrastructure. To be clear, the right way to do this would be to subscribe to a SaaS logging platform, or to create some solution with ElasticSearch and Kibana, or just use Splunk. Today I was feeling lazy, and I wondered if there wasn’t an easier way.

Enter Slack

Slack is a chat platform my team already uses to communicate – we have channels for different purposes, and people subscribe to keep up to date about Data Science, our internal Hadoop cluster, or a bunch of other topics. I already get notifications on my desktop and my phone, and the history of messages is visible and searchable for everyone in a channel. This sounds like the ideal lazy log repository.

Slack offers a rich REST API where you can search, work with files, and communicate in channels. They also offer an awesome (for the lazy) Incoming WebHooks feature – this allows you to POST a JSON message with a secret token, which is posted to a pre-configured channel as a user you can configure in the web UI. The hardest part of setting up a new WebHook was choosing which emoji would best represent application errors – I chose a very sad smiley face, but the devil is also available.
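
Before wiring anything into Django, you can sanity-check a new WebHook with a couple of lines of Python; the URL below is a made-up placeholder, so substitute the one Slack generates for you:

import json
import requests

# Hypothetical WebHook URL - generate a real one at api.slack.com
url = "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXX"
requests.post(url, data=json.dumps({"text": "Hello from the logging kludge"}))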

The Kludge

Django already offers the AdminEmailHandler, which emails log messages to the admins listed in your project. I could have created a mailing list, added it to the admins list, and let people subscribe. They could then create a filter in their email to label the log messages. That sounds like a lot of work, and there wouldn’t be a history of the messages except in individual recipients’ inboxes.

Instead, I whipped up this log handler for Django which will post the message (and a stack trace, if possible) to your Slack endpoint:

from logging import Handler
import json
import traceback

import requests


class SlackLogHandler(Handler):
    def __init__(self, logging_url="", stack_trace=False):
        Handler.__init__(self)
        self.logging_url = logging_url
        self.stack_trace = stack_trace

    def emit(self, record):
        message = '%s' % (record.getMessage())
        # Append the traceback when the record carries exception info
        if self.stack_trace and record.exc_info:
            message += '\n' + ''.join(traceback.format_exception(*record.exc_info))
        # Post every record, not only the ones with a traceback
        requests.post(self.logging_url, data=json.dumps({"text": message}))

There you go: install the requests library, generate an Incoming WebHook URL at api.slack.com, stick the SlackLogHandler in your Django logging configuration, and your errors will be logged to the Slack channel of your choice. Stack traces are optional – I’ve also been using this to post hourly reports of active users, etc. to the channel under a different username.

For reference, here’s a log configuration for the Django settings.py. Now go write some code, you slacker.

LOGGING = {
    'version':1,
    'disable_existing_loggers':False,
    'handlers': {
        'console': {
            'level':'DEBUG',
            'class':'logging.StreamHandler',
        },
        'slack-error': {
            'level':'ERROR',
            'class':'SlackLogHandler.SlackLogHandler',
            'logging_url':'<SuperSecretWebHookURL>',
            'stack_trace':True
        }
    },
    'loggers': {
        'django': {
            'level': 'INFO',
            'handlers': ['console', 'slack-error']
        }
    }
}
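
Once Django loads this configuration, anything logged at ERROR level on the 'django' logger (or one of its children) lands in your channel. A quick, hypothetical way to watch it fire:

import logging

logger = logging.getLogger("django")
try:
    1 / 0
except ZeroDivisionError:
    # exc_info=True attaches the traceback for SlackLogHandler to format
    logger.error("Intentional test error", exc_info=True)
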
Categories: DBA Blogs

GI Commands : 1 -- Monitoring Status of Resources

Hemant K Chitale - Sun, 2014-07-27 08:10
In 11gR2

Listing the Status of Resources

[root@node1 ~]# su - grid
-sh-3.2$ crsctl status resource -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE ONLINE node1
ONLINE ONLINE node2
ora.DATA1.dg
ONLINE ONLINE node1
ONLINE ONLINE node2
ora.DATA2.dg
ONLINE ONLINE node1
ONLINE ONLINE node2
ora.FRA.dg
ONLINE ONLINE node1
ONLINE ONLINE node2
ora.LISTENER.lsnr
ONLINE ONLINE node1
ONLINE ONLINE node2
ora.asm
ONLINE ONLINE node1
ONLINE ONLINE node2
ora.gsd
OFFLINE OFFLINE node1
OFFLINE OFFLINE node2
ora.net1.network
ONLINE ONLINE node1
ONLINE ONLINE node2
ora.ons
ONLINE ONLINE node1
ONLINE ONLINE node2
ora.registry.acfs
ONLINE ONLINE node1
ONLINE ONLINE node2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE node2
ora.LISTENER_SCAN2.lsnr
1 ONLINE ONLINE node1
ora.LISTENER_SCAN3.lsnr
1 ONLINE ONLINE node1
ora.cvu
1 ONLINE ONLINE node1
ora.gns
1 ONLINE ONLINE node1
ora.gns.vip
1 ONLINE ONLINE node1
ora.node1.vip
1 ONLINE ONLINE node1
ora.node2.vip
1 ONLINE ONLINE node2
ora.oc4j
1 ONLINE ONLINE node1
ora.racdb.db
1 ONLINE ONLINE node1 Open
2 ONLINE ONLINE node2 Open
ora.racdb.new_svc.svc
1 ONLINE ONLINE node1
2 ONLINE ONLINE node2
ora.scan1.vip
1 ONLINE ONLINE node2
ora.scan2.vip
1 ONLINE ONLINE node1
ora.scan3.vip
1 ONLINE ONLINE node1
-sh-3.2$

So we see that :
a) The Cluster consists of two nodes node1 and node2
b) There are 4 ASM DiskGroups DATA, DATA1, DATA2 and FRA
c) GSD is offline as expected -- it is required only for 9i Databases
d) There is a database racdb and a service new_svc  (see my previous post)


Listing the status of SCAN Listeners

-sh-3.2$ id
uid=500(grid) gid=1001(oinstall) groups=1001(oinstall),1011(asmdba)
-sh-3.2$ srvctl status scan
SCAN VIP scan1 is enabled
SCAN VIP scan1 is running on node node2
SCAN VIP scan2 is enabled
SCAN VIP scan2 is running on node node1
SCAN VIP scan3 is enabled
SCAN VIP scan3 is running on node node1
-sh-3.2$ srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is running on node node2
SCAN Listener LISTENER_SCAN2 is enabled
SCAN listener LISTENER_SCAN2 is running on node node1
SCAN Listener LISTENER_SCAN3 is enabled
SCAN listener LISTENER_SCAN3 is running on node node1
-sh-3.2$

So we see that
a) There are 3 SCAN Listeners
b) Since this is a 2-node cluster, two of the SCAN Listeners are on one node, node1


Listing the status of the OCR

-sh-3.2$ id
uid=500(grid) gid=1001(oinstall) groups=1001(oinstall),1011(asmdba)
-sh-3.2$ ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3668
Available space (kbytes) : 258452
ID : 605940771
Device/File Name : +DATA
Device/File integrity check succeeded
Device/File Name : /fra/ocrfile
Device/File integrity check succeeded
Device/File Name : +FRA
Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check bypassed due to non-privileged user

-sh-3.2$ su root
Password:
[root@node1 grid]# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3668
Available space (kbytes) : 258452
ID : 605940771
Device/File Name : +DATA
Device/File integrity check succeeded
Device/File Name : /fra/ocrfile
Device/File integrity check succeeded
Device/File Name : +FRA
Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check succeeded

[root@node1 grid]#

So we see that :
a) The OCR is in 3 locations +DATA, +FRA and NFS filesystem /fra/ocrfile
b) A Logical corruption check of the OCR can only be done by root, not by grid


Listing the status of the Vote Disk

-sh-3.2$ crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 0e94545ce44f4fb1bf6906dc6889aaff (/fra/votedisk.3) []
2. ONLINE 0d13305520f84f3fbf6c2008a6f79829 (/data1/votedisk.1) []
Located 2 voting disk(s).
-sh-3.2$

So we see that :
a) There are 2 votedisk copies  (yes, two -- not the recommended three !)
b) Both are on filesystem
How do I happen to have 2 votedisk copies ?  I actually had 3 but removed one. DON'T TRY THIS ON YOUR PRODUCTION CLUSTER.  I am adding the third one back now :
-sh-3.2$ crsctl add css votedisk /data2/votedisk.2
Now formatting voting disk: /data2/votedisk.2.
CRS-4603: Successful addition of voting disk /data2/votedisk.2.
-sh-3.2$ crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 0e94545ce44f4fb1bf6906dc6889aaff (/fra/votedisk.3) []
2. ONLINE 0d13305520f84f3fbf6c2008a6f79829 (/data1/votedisk.1) []
3. ONLINE 41c24037b51c4f97bf4cb7002649aee4 (/data2/votedisk.2) []
Located 3 voting disk(s).
-sh-3.2$

There, I am now back to 3 votedisk copies.
.
.
.

Categories: DBA Blogs

Log Buffer #381, A Carnival of the Vanities for DBAs

Pythian Group - Fri, 2014-07-25 08:22

The rhythm of blog posts regarding database technology has remained consistent throughout the week. A few of those posts have been plucked by this Log Buffer Edition for your pleasure.

Oracle:

Sayan has shared a Standalone sqlplus script for plans comparing.

Gartner Analysis: PeopleSoft Update Manager Delivers Significant Improvements to the Upgrade Tools and Processes.

Timely blackouts, of course, are essential to keeping the numbers up and (more importantly) preventing Target Down notifications from being sent out.

Are you experiencing analytics pain points?

Bug with xmltable, xmlnamespaces and xquery_string specified using bind variable.

SQL Server:

SQL Server 2012 introduced columnstore indexes, which can immensely improve the performance of OLAP queries.

Restoring the SQL Server Master Database Even Without a Backup.

There are times when you need to write T-SQL code that creates specific T-SQL code and executes it. When you do this, you are creating dynamic T-SQL code.

A lot of numbers that we use everyday such as Bank Card numbers, Identification numbers, and ISBN codes, have check digits.

SQL-only ETL using a bulk insert into a temporary table (SQL Spackle).

MySQL:

How MariaDB makes Stored Procedures usable.

DBaaS, OpenStack and Trove 101: Introduction to the basics.

MySQL Fabric is a tool included on MySQL Utilities that helps you to manage your MySQL instances.

Showing all available MySQL data types when creating a new table with MySQL for Excel.

Why TokuDB hates Transparent HugePages.

Categories: DBA Blogs

Happy System Administrator Appreciation Day

Pythian Group - Fri, 2014-07-25 07:59

Today is our day. July 25, 2014 marks the 15th annual System Administrator Appreciation Day. On this day we pause, set aside the impossible tasks, nonexistent budgets, and often unrealistic timelines, and take a moment to say thank you to the people who keep everything working: system administrators.

So much of what has become part of everyday life, from doing our jobs to playing games online, shopping, and connecting with friends and family around the world, is possible due in large part to the tireless efforts of the system administrators who are in the trenches every hour of every day of the year, keeping the tubes clear and the packets flowing. The fact that technology has become so commonplace in our lives, and more often than not “just works”, has afforded us the luxury of forgetting (or not even knowing) the immense infrastructure complexity that system administrators work with to deliver the services we have come to rely on.

SysAdmin Appreciation Day started 15 years ago thanks to Ted Kekatos. According to Wikipedia, “Kekatos was inspired to create the special day by a Hewlett-Packard magazine advertisement in which a system administrator is presented with flowers and fruit-baskets by grateful co-workers as thanks for installing new printers. Kekatos had just installed several of the same model printers at his workplace.” Ever since then, SysAdmin Appreciation Day has been celebrated on the last Friday in July.

At Pythian, I have the privilege of being part of the Enterprise Infrastructure Services group.  We are a SysAdmin dream team of the best of the best, from around the globe. Day in and day out, our team is responsible for countless servers, networks, and services that millions of people use every day.

To all my colleagues and to anyone who considers themselves a SysAdmin, regardless of which flavour – thank you, and know that you are truly doing work that matters.

Categories: DBA Blogs