Skip navigation.

Feed aggregator

TechTalk v5.0 – The Age of Big Data with Alex Morrise

Pythian Group - Mon, 2014-07-28 14:06

Who: Hosted by Blackbird, with a speaking session by Alex Morrise, Chief Data Scientist at Pythian.

What: TechTalk presentation, beer, wine, snacks and Q&A

Where: Blackbird HQ – 712 Tehama Street (corner of 8th and Tehama) San Francisco, CA

When: Thursday July 31, 2014 from 6:00-8:00 PM

How: RSVP here!

TechTalk v5.0 welcomes to the stage, Alex Morrise, Chief Data Scientist at Pythian. Alex previously worked with Idle Games, Quid, and most recently Beats Music where he led the development of an adaptive, contextual music recommendation server.  Alex earned a PhD in Theoretical Physics from UC Santa Cruz.

This edition of TechTalk will be based on how the age of big data allows statistical inference on an unprecedented scale. Inference is the process of extracting knowledge from data, many times uncovering latent variables unifying seemingly diverse pieces of information. As data grows in complexity and dimension, visualization becomes increasingly difficult. How do we represent complex data to discover implicit and explicit relationships? We discuss how to Visualize Inference in some interesting data sets that uncover topics as diverse as the growth of technology, social gaming, and music.

You won’t want to miss this event, so be sure to RSVP.


Categories: DBA Blogs

Unexpected Shutdown Caused by ASR

Pythian Group - Mon, 2014-07-28 13:45

In past few days I had two incidents and an outage, for just a few minutes. However, outage in a production environment is related to cost relatively and strictly. The server that had outage was because of failing over and then failing back about 4 to 5 times in 15 minutes. I was holding pager, and was then involved in investigating root cause for this fail-over and failed-back. Looking at the events in SQL Server error logs did not give me any clue towards what was happening, or why so I looked at the Windows Event View’s System log. I thought, “Maybe I have something there!”

There were two events that came to my attention:

Event Type:        Error

Event Source:    EventLog

Event Category:                None

Event ID:              6008

Date:                     7/24/2014

Time:                     1:14:12 AM

User:                     N/A

Computer:          SRV1


The previous system shutdown at 1:00:31 AM on 7/24/2014 was unexpected.


Event Type:        Information

Event Source:    Server Agents

Event Category:                Events

Event ID:              1090

Date:                     7/24/2014

Time:                     1:15:16 AM

User:                     N/A

Computer:          SRV1


System Information Agent: Health: The server is operational again.  The server has previously been shutdown by the Automatic Server Recovery (ASR) feature and has just become operational again.



The errors are closely related to the feature called Automatic Server Recovery (ASR) which is mainly configured with the server, and comes with the hardware. In our case, HP Blade, ProLiant server. There has been some resources/threads already discussed around similar topic. Most of the hardware vendor has somewhat similar software with similar functionality made available for servers.

In my case, my understanding was that maybe firmware are out of date and requiring updating, or the servers are aged. Further, I have sent my findings to customer with an incident report.  In a couple of hours, I had a reply and the feedback I received was just what I was expecting, the hardware was aged.  This may be the case with you when you see a message in event viewer which reads like “System Information Agent: Health: The server is operational again.  The server has previously been shutdown by the Automatic Server Recovery (ASR) feature and has just become operational again.”  Go check with your system administrator. The root cause of this unexepcted shutdown may not be related or caused by the SQL Server, rather, the system itself.  Please keep in mind that this could be one of the reasons, and certainly not the only.


Automatic System Recovery


Categories: DBA Blogs

UTL_FILE_DIR Security Weakness: Why and How To Use Oracle Directories

UTL_FILE_DIR is the database initialization parameter the Oracle Database uses to determine what operating system directories and files PL/SQL packages, functions, and procedures may read from or write to when using the standard UTL_FILE database package.  The directories specified in the UTL_FILE_DIR parameter may be accessed by any database user, which can be a security issue.  In Oracle 9iR2, Oracle released new functionality called “Directories” that provides a more secure and robust capability to access operating system directories and files.  The advantages of using Directories over UTL_FILE_DIR are –

  • Read and/or Write access to a Directory can be granted to individual database accounts or roles
  • A Directory can be added without having to bounce the database
  • Directory name is used in the UTL_FILE.FOPEN statement rather than the directory path, which allows changes to the directory path without modification to the PL/SQL source code
Securing UTL_FILE

The UTL_FILE database package is used to read from and write to operating system directories and files.  By default, PUBLIC is granted execute permission on UTL_FILE. Therefore, any database account may read from and write to files in the directories specified in the UTL_FILE_DIR database initialization parameter.

Oracle usually assumes that PUBLIC has execute permission on UTL_FILE, therefore, many Oracle product installations do not specifically grant execute permission on UTL_FILE to Oracle installed database accounts.  Consequently, revoking execute permission on UTL_FILE from PUBLIC will result in errors in a number of standard Oracle database utilities and The Oracle E-Business Suite.  Also, some Oracle products and third party products will grant execute on UTL_FILE to PUBLIC during the installation of the product.

We do not recommend revoking execute permission on UTL_FILE from PUBLIC in database instances running the Oracle E-Business Suite and other complex applications (i.e., SAP, Peoplesoft, Oracle Clinical, etc.) due to the possibility of encountering errors in the application and third party products.  Only revoke execute permission from PUBLIC in database instances where the application, third party products, and all database management tools can be thoroughly tested.  All Oracle delivered products must be tested since Oracle often assumes UTL_FILE is granted to PUBLIC and does not provide the necessary grants when any products are installed – this includes products like Enterprise Manager Grid Control and Apex.

Security considerations with UTL_FILE can be mitigated by removing all directories from UTL_FILE_DIR and using the Directory functionality instead.

Oracle E-Business Suite and UTL_FILE_DIR

The combination of UTL_FILE being granted to PUBLIC and UTL_FILE_DIR being publicly accessible creates a significant security issue for the Oracle E-Business Suite.  The Oracle E-Business Suite uses UTL_FILE_DIR to read and write concurrent manager request temporary files.  Also, UTL_FILE_DIR is extensively used by most organizations to access interface and conversion data files from PL/SQL interface programs.

In the Oracle E-Business Suite, UTL_FILE_DIR is usually set to include at least the directories specified in $APPLPTMP and $APPLTMP – in the default installation this will include at least “/usr/tmp”.  Frequently, additional custom directories will be included for custom interfaces and other custom programs.

By accessing the APPLSYSPUB database account, an attacker can easy read and write interface data files.  Depending on the exact configuration, implemented modules, and custom interfaces, this could allow access to sensitive information including social security numbers and credit card numbers.

Migrating From UTL_FILE_DIR

For Oracle E-Business Suite customers, migrating from UTL_FILE_DIR to Directories requires only minimal changes and may require no source code changes depending on the design of the interfaces and other custom programs. The steps are as follows -

  1. Identify where UTIL_FILE is used
  2. Create Directories
  3. Change FOPEN calls to use Directories
  4. Edit UTL_FILE_DIR to remove physical Directories

Step One – Identify Where UTIL_FILE Is Used

The most difficult issue is identifying the packages, functions, and procedures using the physical directories in UTL_FILE_DIR.  The UTL_FILE_DIR physical directories are only directly referenced by the UTL_FILE.FOPEN function.  The FOPEN specifies the operating system directory and file name to open.  All subsequent read, write, and close function calls use the file handle returned by FOPEN.

The following SQL may assist in identifying uses of UTL_FILE_DIR in FOPEN statements –

SELECT * FROM dba_source 

WHERE upper(text) like '%FOPEN%' 

AND name like '%<custom prefix>%' 

AND owner = 'APPS' 

 If the calls to FOPEN are not in a common function and are not accessed through some other indirection, it should be fairly straightforward to find and change the necessary FOPEN references in any custom PL/SQL packages, functions, and procedures.  If the physical directory paths are stored in a table or in a concurrent program definition, then no changes to the source code are required.

At this time, converting the standard the Oracle E-Business Suite directories ($APPLPTMP and $APPLTMP) is not recommended as this is not supported by Oracle.  Theoretically it should work without any issues, however, the Oracle E-Business Suite references the directories in multiple places including the $APPLPTMP and $APPLTMP environmental variables, system profile options (e.g., “ECX: XSLT File Path”), and potentially in some configuration files.

Step Two – Create Directories

The following general steps are required to change the references from UTL_FILE_DIR to Directories. Please note that the directory name MUST always be in uppercase in all UTL_FILE.FOPEN statements, otherwise errors may be encountered

For each custom directory in UTL_FILE_DIR, execute the following SQL statements in each development, test, and production database instance –

CREATE OR REPLACE DIRECTORY <name> AS '<physical directory path>'; 

as an example –



The directories “/usr/tmp” and “../comn/temp” and any other directories specified in $APPLPTMP and $APPLTMP should remain in UTL_FILE_DIR, since these directories are required by Oracle.

Step Three – Change FOPEN Calls to Use Directories

Once directories have been created the next step is to edit your code to use them. The process is straightforward. If a physical directory is specified in the UTL_FILE.FOPEN statement, change the hard-coded path to the Directory name. 

As an example –

FILE_ID := UTL_FILE.FOPEN('/usr/tmp', 'dummy.txt', 'W'); 

 change to –

FILE_ID := UTL_FILE.FOPEN('TMP_DIR', 'dummy.txt', 'W'); 

 Two pointers to keep in mind:

  1. Always be sure to use the directory name in uppercase and it must be enclosed in single quotes  
  2. If the physical directory is specified in a table or as a parameter in a Concurrent Program definition, then just specify the Directory name rather than a physical path – /usr/tmp becomes TMP_DIR

Step Four – Edit UTL_FILE_DIR to Remove Physical Directories

Remove all the custom physical directories from UTL_FILE_DIR.  The standard Oracle directories of ‘/usr/tmp’, ‘../comn/temp’, etc. should not be removed.  The database must be bounced for the change to take effect.


If you have questions, please contact us at

References Tags: Information DisclosureOracle DatabaseOracle E-Business Suite
Categories: APPS Blogs, Security Blogs

<b>Contributions by Angela Golla,

Oracle Infogram - Mon, 2014-07-28 10:58
Contributions by Angela Golla, Infogram Deputy Editor

Source and Hire the Best Talent
Oracle Taleo Recruiting Cloud Service is the market leading solution that helps enterprises source, assess, and hire the best talent. In addition to automating the process for even the most complex global organization, Oracle Taleo Recruiting Cloud Service delivers insights to continuously improve talent acquisition efficiency and effectiveness.  Learn more at the Oracle Taleo Recruiting homepage. 

Plan Instability

Dominic Brooks - Mon, 2014-07-28 09:04

There seems to me to be a relatively simple choice.

Either you except that the Oracle Optimizer has a wealth of complicated strategies and, in this complex effort to get the best executions it can, will inevitably either get it wrong sometimes (or take some extra executions to realise it’s wrong).

Or you stick your head in the sand and raise a bug for every unexpected poor execution or plan flip.

But let’s say that above else you wanted Plan Stability.

This is an widespread desire.

What would be the best strategy?
And to what lengths would you take it?

SQL Plan Management features are designed to give you some degree of stability.

You could baseline statements which have given you a problem in the past.

What if that is not deemed adequate?

So, on the one hand, you could baseline everything you could find.

Taking a long history of AWR and taking regularly snaps of V$SQL, you could put all in a SQL Plan Baseline.

But, depending on your application, you might regularly get new SQL appear unprotected by a baseline.
In one Hibernate application I know, a change in the application domain model often results in the table aliases changing, meaning that there is lots of brand new, but very familiar, SQL.
So, you then become dependant on having a reliable testing infrastructure which will generate nearly all the SQL you’re likely to get in advance of production.

In addition, you might want multiple plans in your baseline – for ACS for example – and then, once that extra bindset awareness is flushed out of memory, you then need a couple of executions to rediscover it (unless you use a SQL Patch to inject BIND_AWARE into the specific sql statements).

It’s certainly no guarantee of stability.

What is the alternative?

I like to argue that most problematic executions stem from two sources:

  1. Badly written SQL
  2. Using poor combinations of features – e.g. bind variable peeking + histograms

The other day I made a comment to someone that “if you’re heavily reliant on ACS, perhaps you should rethink whether you should be using literals rather than bind variables”.

So, you might then take the position that plan instability stems from increasing complexity in the optimizer.

In which case, maybe a viable strategy might be to turn off much of the complexity:

  • Bind Variable Peeking
  • Cardinality feedback
  • Adaptive Cursor Sharing
  • Dynamic Sampling
  • Adaptive direct path reads
  • 12c adaptive execution plans
  • Anything with the word “adaptive” in it?
  • Default Stats job and/or default method_opt histogram behaviour

This seems quite radical to many. Perhaps justifiably.

Perhaps it would be nice if there were new optimizer strategies – OPTIMIZER_MODE? – perhaps “AGGRESSIVE_PERFORMANCE” (maybe “ADAPTIVE_PERFORMANCE_WITH_THE_ODD_HICCUP” ?) vs “PERFORMANCE_STABILITY” which were an umbrella of many of the features above.


To what extent are you concerned with performance stability, above all else?

How many of the optimizer features in each release are actually aligned with that widespread concern?

Four Options For Oracle DBA Tuning Training

Four Options For Oracle DBA Tuning Training
Oracle DBAs are constantly solving problems... mysteries. That requires a constant knowledge increase. I received more personal emails from my Oracle DBA Training Options Are Changing posting than ever before. Many of these were from frustrated, angry, and "stuck" DBAs. But in some way, almost all asked the question, "What should I do?"

In response to the "What should I do?" question, I came up with four types of Oracle DBA performance tuning training that are available today. Here they are:

Instructor Led Training (ILT) 
Instructor Led Training (ILT) is the best because you have a personal connection with the teacher. I can't speak for other companies, but I strive to connect with every student and every student knows they can personally email or call me...even years after the training. In fact, I practically beg them to do what we do in class on their production systems and send me the results so I can continue helping them. To me being a great teacher is more than being a great communicator. It's about connection. ILT makes connecting with students easy.

Content Aggregators
Content Aggregators are the folks who pull together free content from various sources, organize and display it. Oh yeah... and they profit from it. Sometimes the content value is high, sometimes not. I tend to think of content aggregators like patent trolls, yet many times they can be a great resource. The problem is you're not dealing with the creator of the content. However, the creator of the content actually knows the subject matter. You can somtimes contact I encourage my students and readers to do.

Content Creators
Content Creators are the folks who create content based on their experiences. We receive that content through their blogs, videos, conference presentations and sometimes through their training. I am a content creator but with an original, almost child-like curiosity, performance research twist. Content creators rarely directly profit from their posted content, but somehow try to transform it into a revenue stream. I can personally attest, it can be a risky financial strategy...but it's personally very rewarding. Since I love do research, it's easy and enjoyable to post my findings so others may benefit.

Online Training (OLT)
Online Training (OLT) is something I have put off for years. The online Oracle training I have seen is mostly complete and total crap. The content is usually technically low and mechanical. The production quality is something a six year old can do on their PC. The teaching quality is ridiculous and the experience puts you to sleep. I do not ever want to be associated with that kind of crowd.

I was determined to do something different. It had to be the highest quality. I have invested thousands of dollars in time, labor, and equipment to make online video training
Craig teaching in an OraPub Online Institute Seminarwork. Based on the encouraging feedback I receive it's working!

This totally caught me by surprise. I have discovered that I can do things through special effects and a highly organized delivery that is impossible to do in a classroom. (Just watch my seminar introductions on YouTube and you'll quickly see what I mean.) This makes the content rich and highly compressed. One hour of OraPub Online Institute training is easily equivalent to two to four hours of classroom training. Easily. I have also strive to keep the price super low, the production at a professional level and ensure the video can be streamed anywhere in the world and on any device. Online training is an option, but you have to search for it.

So there you have it. Because of economics and the devaluation of DBAs as human beings coupled with new technologies, the Oracle DBA still has at least four main sources of training and knowledge expansion. Don't give up learning!

Some of you reading may be surprised that I'm writing about this topic because it will hurt my traditional instructor led training (public or on-site) classes. I don't think so. If people can attend my classes in person, they will. Otherwise, I hope they will register for an OraPub Online Institute seminar. Or, at least subscribe to my blog (see upper left of page).

All the best in your quest to do great work,

You can watch seminar introductions for free on YouTube!
If you enjoy my blog, subscribing will ensure you get a short-concise email about a new posting. Look for the form on this page.

P.S. If you want me to respond to a comment or you have a question, please feel free to email me directly at craig@orapub .com.

Categories: DBA Blogs

PDB media failure may cause the whole CDB to crash

Yann Neuhaus - Mon, 2014-07-28 07:44

Do you remember last year, when 12c arrived with multitenant, David Hueber warned us about the fact that a single PDB can, under certain conditions, generate a complete system downtime? We are beta testers and opened a SR for that. Now one year later the first patchset is out and obviously I checked if the issue was fixed. It's a patchset afterall, which is expected to fix issues before than bringing new features.

So the issue was that when the SYSTEM tablespace is lost in a PDB, then we cannot restore it without shutting down the whole CDB. This is because we cannot take the SYSTEM tablespace offline, and we cannot close the PDB as a checkpoint cannot be done. There is no SHUTDOWN ABORT for a PDB that can force to it. Conclusion: if you loose one SYSTEM tablespace, either you accept to wait for a maintenance window before bring it back online, or you have to stop the whole CDB with a shutdown abort.

When I receive a new release, I like to check new parameters, even the undocumented ones. And in there is a new underscore parameter _enable_pdb_close_abort which has the description 'Enable PDB shutdown abort (close abort)'. Great. It has a default value of false but maybe this is how the bug has been addressed.

Before trying that parameter, let's reproduce the case:

Here are my datafiles:

RMAN> report schema;

using target database control file instead of recovery catalog
Report of database schema for database with db_unique_name CDB_SITE1

List of Permanent Datafiles
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    790      SYSTEM               YES     /u01/app/oracle/oradata/CDB/system01.dbf
3    680      SYSAUX               NO      /u01/app/oracle/oradata/CDB/sysaux01.dbf
4    215      UNDOTBS1             YES     /u01/app/oracle/oradata/CDB/undotbs01.dbf
5    250      PDB$SEED:SYSTEM      NO      /u01/app/oracle/oradata/CDB/pdbseed/system01.dbf
6    5        USERS                NO      /u01/app/oracle/oradata/CDB/users01.dbf
7    540      PDB$SEED:SYSAUX      NO      /u01/app/oracle/oradata/CDB/pdbseed/sysaux01.dbf
8    250      PDB1:SYSTEM          NO      /u01/app/oracle/oradata/CDB/PDB1/system01.dbf
9    570      PDB1:SYSAUX          NO      /u01/app/oracle/oradata/CDB/PDB1/sysaux01.dbf
10   5        PDB1:USERS           NO      /u01/app/oracle/oradata/CDB/PDB1/PDB1_users01.dbf
11   250      PDB2:SYSTEM          NO      /u01/app/oracle/oradata/CDB/PDB2/system01.dbf
12   570      PDB2:SYSAUX          NO      /u01/app/oracle/oradata/CDB/PDB2/sysaux01.dbf
13   5        PDB2:USERS           NO      /u01/app/oracle/oradata/CDB/PDB2/PDB2_users01.dbf

List of Temporary Files
File Size(MB) Tablespace           Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
1    197      TEMP                 32767       /u01/app/oracle/oradata/CDB/temp01.dbf
2    100      PDB$SEED:TEMP        32767       /u01/app/oracle/oradata/CDB/pdbseed/pdbseed_temp012014-06-15_09-46-11-PM.dbf
3    20       PDB1:TEMP            32767       /u01/app/oracle/oradata/CDB/PDB1/temp012014-06-15_09-46-11-PM.dbf
4    20       PDB2:TEMP            32767       /u01/app/oracle/oradata/CDB/PDB2/temp012014-06-15_09-46-11-PM.dbf

then I just remove the PDB2 SYSTEM datafile:

rm /u01/app/oracle/oradata/CDB/PDB2/system01.dbf 

And I go to sqlplus in order to check the state of my pdb. Remeber, I want to see if I can restore the datafile without doing a shutdown abort on my CDB instance.

SQL*Plus: Release Production on Mon Jul 27 20:31:45 2014

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> select name,open_mode from v$pdbs;
select name,open_mode from v$pdbs
ERROR at line 1:
ORA-01034: ORACLE not available
Process ID: 0
Session ID: 0 Serial number: 0

Oh... that's bad... Let's look at the alert.log:

Errors in file /u01/app/oracle/diag/rdbms/cdb/CDB/trace/CDB_ckpt_21620.trc:
ORA-01243: system tablespace file suffered media failure
ORA-01116: error in opening database file 11
ORA-01110: data file 11: '/u01/app/oracle/oradata/CDB/PDB2/system01.dbf'
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
USER (ospid: 21620): terminating the instance due to error 1243
System state dump requested by (instance=1, osid=21620 (CKPT)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/cdb/CDB/trace/CDB_diag_21608_20140727202844.trc
2014-07-27 20:28:49.596000 +02:00
Instance terminated by USER, pid = 21620

The CKPT process has terminated the instance. The whole CDB is down.

That's worse. In we had to bring down the instance, but at least we were able to choose the time and warn the users. Not here. In it crashes immediately when a checkpoint occurs.

I've opened a bug for that (Bug 19001390 - PDB SYSTEM TABLESPACE MEDIA FAILURE CAUSES THE WHOLE CDB TO CRASH) which is expected to be fixed for the next release (12.2).

Ok the good news is that once the CDB is down, recovery is straightforward:

Recovery Manager: Release - Production on Mon Jul 27 21:36:22 2014

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

connected to target database (not started)

RMAN> startup

Oracle instance started
database mounted
database opened

Total System Global Area     838860800 bytes

Fixed Size                     2929936 bytes
Variable Size                616565488 bytes
Database Buffers             213909504 bytes
Redo Buffers                   5455872 bytes

RMAN> list failure;

using target database control file instead of recovery catalog
Database Role: PRIMARY

List of Database Failures

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
3353       CRITICAL OPEN      27-JUL-14     System datafile 11: '/u01/app/oracle/oradata/CDB/PDB2/system01.dbf' is missing
245        HIGH     OPEN      27-JUL-14     One or more non-system datafiles need media recovery

RMAN> advise failure;

Database Role: PRIMARY

List of Database Failures

Failure ID Priority Status    Time Detected Summary
---------- -------- --------- ------------- -------
3353       CRITICAL OPEN      27-JUL-14     System datafile 11: '/u01/app/oracle/oradata/CDB/PDB2/system01.dbf' is missing
245        HIGH     OPEN      27-JUL-14     One or more non-system datafiles need media recovery

analyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=132 device type=DISK
analyzing automatic repair options complete

Mandatory Manual Actions
no manual actions available

Optional Manual Actions
1. If file /u01/app/oracle/oradata/CDB/PDB2/system01.dbf was unintentionally renamed or moved, restore it
2. Automatic repairs may be available if you shutdown the database and restart it in mount mode
3. If you restored the wrong version of data file /u01/app/oracle/oradata/CDB/PDB2/sysaux01.dbf, then replace it with the correct one
4. If you restored the wrong version of data file /u01/app/oracle/oradata/CDB/PDB2/PDB2_users01.dbf, then replace it with the correct one

Automated Repair Options
Option Repair Description
------ ------------------
1      Restore and recover datafile 11; Recover datafile 12; Recover datafile 13
  Strategy: The repair includes complete media recovery with no data loss
  Repair script: /u01/app/oracle/diag/rdbms/cdb/CDB/hm/

RMAN> repair failure;

Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/app/oracle/diag/rdbms/cdb/CDB/hm/

contents of repair script:
   # restore and recover datafile
   restore ( datafile 11 );
   recover datafile 11;
   # recover datafile
   recover datafile 12, 13;

Do you really want to execute the above repair (enter YES or NO)? YES
executing repair script

Starting restore at 27-JUL-14
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00011 to /u01/app/oracle/oradata/CDB/PDB2/system01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/CDB/FECFFDC5F6D31F5FE043D74EA8C0715F/backupset/2014_07_28/o1_mf_nnndf_TAG20140728T150921_9xdlw21n_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/CDB/FECFFDC5F6D31F5FE043D74EA8C0715F/backupset/2014_07_28/o1_mf_nnndf_TAG20140728T150921_9xdlw21n_.bkp tag=TAG20140728T150921
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:07
Finished restore at 27-JUL-14

Starting recover at 27-JUL-14
using channel ORA_DISK_1

starting media recovery
media recovery complete, elapsed time: 00:00:01

Finished recover at 27-JUL-14

Starting recover at 27-JUL-14
using channel ORA_DISK_1

starting media recovery
media recovery complete, elapsed time: 00:00:00

Finished recover at 27-JUL-14
repair failure complete

RMAN> alter pluggable database PDB2 open;

Statement processed

I whish that one day the PDB will have true isolation so that I can give DBA rights to the application owner on his PDB. But that means that, at least:

  • A PDB failure cannot crash the CDB instance.
  • A PDB admin cannot create datafiles anywhere on my server.
  • A PDB admin cannot run anything as the instance owner user (usually oracle)

Logging for Slackers

Pythian Group - Mon, 2014-07-28 07:41

When I’m not working on Big Data infrastructure for clients, I develop a few internal web applications and side projects. It’s very satisfying to write a Django app in an afternoon and throw it on Heroku, but there comes a time when people actually start to use it. They find bugs, they complain about downtime, and suddenly your little side project needs some logging and monitoring infrastructure. To be clear, the right way to do this would be to subscribe to a SaaS logging platform, or to create some solution with ElasticSearch and Kibana, or just use Splunk. Today I was feeling lazy, and I wondered if there wasn’t an easier way.

Enter Slack

Slack is a chat platform my team already uses to communicate – we have channels for different purposes, and people subscribe to keep up to date about Data Science, our internal Hadoop cluster, or a bunch of other topics. I already get notifications on my desktop and my phone, and the history of messages is visible and searchable for everyone in a channel. This sounds like the ideal lazy log repository.

Slack offers a rich REST API where you can search, work with files, and communicate in channels. They also offer an awesome (for the lazy) Incoming WebHooks feature – this allows you to POST a JSON message with a secret token, which is posted to a pre-configured channel as a user you can configure in the web UI. The hardest part of setting up a new WebHook was choosing which emoji would best represent application errors – I chose a very sad smiley face, but the devil is also available.

The Kludge

Django already offers the AdminEmailHandler, which emails log messages to the admins listed in your project. I could have created a mailing list, added it to the admins list, and let people subscribe. They could then create a filter in their email to label the log messages. That sounds like a lot of work, and there wouldn’t be a history of the messages except in individual recipients’ inboxes.

Instead, I whipped up this log handler for Django which will post the message (and a stack trace, if possible) to your Slack endpoint:

from logging import Handler
import requests, json, traceback
class SlackLogHandler(Handler):
   def __init__(self, logging_url="", stack_trace=False):
      self.logging_url = logging_url
      self.stack_trace = stack_trace
   def emit(self, record):
      message = '%s' % (record.getMessage())
      if self.stack_trace:
         if record.exc_info:
            message += '\n'.join(traceback.format_exception(*record.exc_info))
  , data=json.dumps({"text":message} ))

There you go: install the requests library, generate an Incoming WebHook URL at, stick the SlackLogHandler in your Django logging configuration, and your errors will be logged to the Slack channel of your choice. Stack traces are optional – I’ve also been using this to post hourly reports of active users, etc. to the channel under a difference username.

For reference, here’s a log configuration for the Django Now go write some code, you slacker.

    'handlers': {
        'console': {
        'slack-error': {
    'loggers': {
        'django': {
            'level': 'INFO',
            'handlers': ['console', 'slack-error']
Categories: DBA Blogs

The Importance of Documentation

Chris Foot - Mon, 2014-07-28 07:05

As a remote data infrastructure services provider, documenting the activities we perform, as well as our customers’ environments, is critical to our success. RDX currently supports thousands (and thousands) of database ecosystems ranging in complexity from simple to “making your hair stand on end.”

My customers hold my organization to extremely high standards. Rightfully so, they have turned over the keys to their most sensitive and mission-critical data stores to RDX. At the end of every email blast I send to our customer base, I end it with, “I personally appreciate you placing your trust in us to administer and safeguard your most valuable data assets. We take that responsibility very seriously here at RDX.” Stating that we take that responsibility seriously is kind of like saying the Titanic sprung a small leak.

Although the importance of a well thought out and detailed documentation library is blatantly obvious, creating documentation is the task most often postponed by an overworked DBA unit.

Documenting processes, procedures and best practices is a task that is often considered to be boring and mundane. Most DBAs would rather perform virtually any other activity than sit in front of a screen using a word processor. As a result, creating documentation is often postponed until the DBA has a little free time to kill. Today’s database administration units are operating with smaller staffs, tighter budgets and ever-increasing workloads. The end result is that the documentation is either never created or created and not kept current.

However, a robust detailed documentation library creates an environment that is less complex, less error-prone, reduces the amount of time DBAs spend learning new database environments and reduces the overall time spent on day-to-day support activities. DBAs are able to spend more time administering the environment rather than finding the objects they are trying to support and the processes and programs used to administer them.

The nature of my business as a remote services provider demands excellent documentation. The majority of environments we administer weren’t designed by my organization. The only way that we can ensure high quality and high-speed administration of these environments is to document them thoroughly. We document everything from initial connectivity and customer contact sheets to detailed information on database and server information, batch job streams and individual program execution (what it does, run-time windows). If we need to be aware of it, we have it documented.

Documentation is also the foundation of many of the other disciplines I will be discussing in future blogs. Let’s continue our discussion with a few helpful hints to get you started.

Understanding the Important Role Good Documentation Plays

We all generally understand the benefits that documentation provides. I think that all readers will see the importance I personally place on documentation in upcoming blogs.

Let me reaffirm my opinion in this one sentence: Good documentation is the foundation that high-quality data infrastructure services are built upon.

Creating an Organizational Environment That Fosters Good Documentation

I’ve been the Vice President of Service Delivery at RDX for 6 years now. It is my responsibility as manager to create an environment that fosters the production of robust and high-quality documentation. Let me describe some of the challenges that I have faced in the past at other organizations and how I have overcome them.

Since I view high quality documentation to be my responsibility as a manager, I ensure that it becomes part of every DBA’s performance appraisal criteria, including my own. If it isn’t on my, and my unit’s, performance appraisal forms, I will ask to have it added or make my own personal addendum and notify both the DBA team and management that I have done so.

I will add time for documentation when I estimate the amount of time it will take me to perform an administrative task during project planning meetings. I don’t settle for “we can do that after the project is complete” as an answer.

If you continuously sell the importance of documentation, sooner or later, you will begin to wear your opponents down. Although I prefer to call it “being relentless,” I’m sure that many of the application development managers (and my own managers) viewed it as “being a ….” (insert your favorite description here).

Every document I have created that provides a list of activities I , or my unit, need to perform during a project has documentation included. It helps to integrate it into the fabric and culture of my organization’s environment.

Making Documentation Second Nature

You also need to ensure that generating documentation becomes a natural part of your daily activities. You must continuously remind yourself that documentation is a primary and integral part of providing high-quality support services to your customers.

You must also remind yourself that it makes your job easier and benefits your fellow DBAs. It is a recipe for disaster when a fellow DBA needs to be out of the office for a time and asks another DBA to “help them out” by performing a complex, application-specific administrative activity and then tries to verbally tell them how to perform the 326 steps it takes to execute it.

Did you ever try to refresh an ERP application test environment from production when that test environment doesn’t have enough space to hold all of production’s data? 4,000 steps later, you begin to second-guess your choice of professions. That was the exact request from one of my fellow DBAs when I first started in this profession, and it quickly taught me the importance of good documentation. Not only did he get me to do the refresh, but I also had to document the process for him along the way. Some call that being a good coworker; I would view that as having a big sucker taped to my forehead.

The moral of this story is this: If you don’t want to be the only one that can perform that 900 step ERP application production to test refresh, document it! If you don’t want to be called by the on-call DBA because he doesn’t know exactly where to add a file in an emergency situation (like someone forgetting to tell you that they were loading 10 million additional rows into that 100 row table), document it! The more you document, the easier your life as a DBA becomes.

I’ve never had a photographic memory. It makes generating documentation easy for me. I also like to write, and that helps, but I will admit that there are times that I would rather perform virtually any other activity than document.

However, it has become easier because I continuously reaffirm to myself the importance of documentation. The more you reinforce that to yourself, the more second nature (and easier) it becomes.

Making Documentation Easy

I’m a huge fan of documentation templates. Here at RDX, we have templates and Standard Operating Procedures for everything we document. If it is repeatable or a complex process, we have an SOP for it. We have templates for documenting connections to our customers’ environments, their backup and recovery environments and their application specific processes, to name a few. If it needs to be documented on a regular basis, we have a template for it. We also have generic templates for documenting environments and activities that don’t fit into other templates.

Word Documents and Templates

Word document templates provide many features that streamline the documentation process and help to improve the quality of the content they store. I try to take advantage of as many features as I can. I use drop-down selection menus, check boxes and radio push buttons to improve the speed and quality of the documentation process. I also take advantage of the help pop-up feature that Microsoft Word provides to create a detailed description of what information is to be entered into that field, check box or radio button.


We heavily utilize Wikis to intelligently and securely display information about the environments we are tasked with supporting. A common, menu-driven interface has been designed, tuned and tweaked over our 20 year history. The Wiki’s contents include customer contact and escalation information, detailed database/server information, customer change management procedures, RDX DBAs assigned to the account, on-call team assigned, non-sensitive connection information (VPN type, VPn vendor, etc) and job information. The Wiki page also links to current tickets, current time cards and a history of past problems contained in our problem resolution library.

The Wiki content is controlled by a well-defined change management procedure and relies upon Wiki templates to ensure information is stored and displayed in a common format that adheres to RDX specifications. Once again, templates help improve the quality of content, speed data entry and ensure a uniformity of display pages and menus. We constantly review the Wiki for content and usability as well as leverage new Wiki features as they are released.

Database-Driven Content Managers

There are dozens of software companies that offer content management solutions. Database vendors have also recognized this as a lucrative market. All of the major database vendors now offer advanced content management software, each one trying to outdo the other in the number of bells and whistles that their products offer. Do a quick search on Google for documentation content management software, and you will find out just how many competing products there are.

Content management products offer check-in/check-out features, document versioning, web portal access and advanced workflow capabilities to name just a few of the features designed to improve content management. The competition in the content management market space is fierce to say the least. Content management vendors know that continuously adding new bells and whistles to their products is not just important for increasing market share, but it also is critical for their survival. Product costs can range from thousands to tens of thousands of dollars (or more).

If you have the funds and your management understands the benefits that a full-blown content management package provides, by all means begin a content management product analysis. But if you don’t have the funds, create a shared drive on your network and declare it to be the “DBA Documentation Portal.”

What to Document

By all means, this is not an all-inclusive list of what can be documented. Consider it as a starter kit to help you begin your quest for “documentis nirvanas.” Is some of this overkill for your particular environment? Maybe, but just consider this a general, high-level list. Since most readers will work for a single organization, I’m focusing my recommendations on DBA units that support one corporate environment.

Database Environment Documentation

  • Naming conventions
  • Servers (server names, operating system release, hardware vendor)
  • Databases (vendor, database version, features enabled)

Application-Specific Documentation

  • Application type (i.e. data warehouse, online transaction processing, decision support, third-party application name and functionality it provides).
  • Business unit requirements and related information for supported databases
  • Uptime requirements (i.e. 24 X 7, 8 X 5)
  • Database downtime windows
  • Critical job processes
  • Business unit and application developer contact lists
  • Turnover windows for database changes
  • Problem notification and escalation procedures
  • Security sensitivity- How sensitive is the data?

Process Documentation

  • Repeatable administrative processes (covered in an upcoming blog)
  • Backups – Probably the most critical set of documentation you will ever create- Document how it is backed up, what scripts back it up, where the backup is going to, retention periods and backup message directories. If it is involved with a backup, DOCUMENT IT. Review the document with other units that are involved in the backup and recovery process. It is your responsibility to ensure that you don’t hear an operator say, “What retention period? Nobody told me we were to have a retention on these files” when you are in a recovery situation. Remember that Oracle states that human error, including miscommunications, is responsible for over 90% of failed recoveries. If you want to reduce recovery failures, DOCUMENT THE PROCESS AND REVIEW IT.
  • Anything else you run on a regular basis to support a specific application
  • Change management- I’ll be spending an entire blog, or two, on this
  • A daily monitoring activity checklist to ensure that no activity is missed- We have daily, weekly and monthly activities that are to be performed for each of our customers
  • Complex administrative activities performed regularly
  • Test and reporting database refreshes
  • Data reorganizations
  • Disaster recovery tests- The processes required to perform the recovery AND the criteria that will be used to evaluate whether it was successful or not

Object Documentation

  • DBA-specific stored PL/SQL and TSQL programs
  • Day-to-day support scripts (where they are and what they do)
  • Monitoring scripts (where they are and what they do)
  • Scripts used to perform database administrative changes- I personally utilized specific directories that provide output from critical database changes that I have performed and other directories containing the SQL used to make that change
  • Operating system scripts- Document what the script does in the beginning of each of your scripts. Did you ever try to determine what a 400 line script does that was created by someone who knows much more about UNIX scripting than you do? We have all been in that position at one time or another during our career. Make it easy on your coworkers to understand what the script does by putting comments at the top of the script as well as in the body. Also keep a running history of script changes, what they were and the time they were made

Database Administration Unit Organizational Documentation

  • Contact Information
  • DBA roles and responsibilities- Which applications, databases and tasks they are responsible for supporting
  • DBA unavailability- Allows application developers to plan for a DBA not being available

It is a good practice to distribute this information to all business units supported by the database administration unit.

I hope you enjoyed this blog on documentation and the important role it plays.

The post The Importance of Documentation appeared first on Remote DBA Experts.

Using HAWQ with PHD service in PCF 1.2

Pas Apicella - Mon, 2014-07-28 06:01
The following demo shows how to use the PCF 1.2 PHD service with HAWQ by loading data into the PCF PaaS platform.

1. First lets setup our ENV to use the correct version of HADOOP on our local laptop.

export HADOOP_INSTALL=/Users/papicella/vmware/software/hadoop/hadoop-2.0.5-alpha
export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home


export HADOOP_OPTS="$HADOOP_OPTS  -Djava.awt.headless=true"

export YARN_OPTS="$YARN_OPTS -Djava.awt.headless=true"

hadoop version

2. Set the HADOOP_USER_NAME to ensure you have write access to load a file.

export HADOOP_USER_NAME=ucc3a04008db2486

3. Create a file called person.txt with some pipe delimited data , example below.

[Mon Jul 28 21:47:37 papicella@:~/vmware/software/hadoop/cloud-foundry/pcf12/demo ] $ head person.txt

4. Load the file into the PHD instance running in PCF 1.2. You will need to use the name node / path which is correct for your PHD instance.

[Mon Jul 28 21:51:43 papicella@:~/vmware/software/hadoop/cloud-foundry/pcf12/demo ] $ hadoop fs -put person.txt hdfs://x.x.x.x:8020/user/ucc3a04008db2486/

5. Create a HAWQ table to the file person.txt using PXF as shown below.

CREATE EXTERNAL TABLE person (id int, name text)
LOCATION ('pxf://x.x.x.x:50070/user/ucc3a04008db2486/person.txt?Fragmenter=HdfsDataFragmenter&Accessor=TextFileAccessor&Resolver=TextResolver')

6. Query the table as shown below.

For more information on the PHD service see the link below.
Categories: Fusion Middleware

How DBA services can help you manage big data

Chris Foot - Mon, 2014-07-28 02:11

Effectively being able to store and manage big data is more than simply having a lot of hard disk space. 

The variety and complexity of the information produced by Internet-connected assets has forced database administration services to adapt to new processes and environments. Their focus on accessibility and security hasn't wavered, but the manner in which they approach these priorities has transformed.

Solving the puzzle: structured and unstructured data 
BrightPlanet, a company that specializes in harvesting data from the Internet, outlined the differences between unstructured and structured data. While volume has always challenged databases to hold massive troves of organized intelligence, one of the chief difficulties resides in the act of actually arranging it. 

  • Structured: Pertains to data that is highly constructed and easy to query and is typically held in relational database systems. A spreadsheet is an example of structured information.
  • Unstructured: Applicable to data that doesn't subscribe to a particular architecture and is usually stored in NoSQL databases, which run complex algorithms to create environments capable of managing it. Social media posts are examples of unstructured information. 

What does this mean for remote database services?
According to MongoDB, conventional DBA practices needed to become more agile in order to be able to query large collections of unstructured data, giving birth to NoSQL databases. This access language sanctioned the development of "document" storage, which has spawned the following benefits:

  • Documents are regarded as independent entities, which makes it simpler to transport data across multiple virtual locations.
  • SQL queries don't need to be translated from object to application. 
  • Because a document contains whatever values the software language requires, unstructured data is easy to store. 

In response to this development, DBAs learned the administrative languages and tools needed to launch and manage document-based data environments. 

Different program, same responsibilities 
As one can imagine, DBAs are still expected to perform the same database active monitoring tasks they have been around since the inception of digital information storage. There are also a number of additional responsibilities these professionals are undertaking:

  • Understanding how clients plan on using the data. Are they simply looking to scrutinize it or allow applications to make intelligent decisions with it?
  • Securing firewall access. What tactics are cybercriminals employing in an attempt to penetrate these environments?
  • Managing and monitoring performance. How well are software deployments adapting to unstructured data? 

Outsourcing to DBAs knowledgeable in contemporary enterprise needs and NoSQL databases may be a good tactic for organizations to use. 

The post How DBA services can help you manage big data appeared first on Remote DBA Experts.

Parallel Plans

Jonathan Lewis - Sun, 2014-07-27 13:39

I’ve popped this note to the top of the stack because I’ve added an index to Randolf Geist’s series on parallel execution skew, and included a reference his recent update to the XPLAN_ASH utility.

This is the directory for a short series I wrote discussing how to interpret parallel execution plans in newer versions of Oracle.

For other aspects of parallel execution, here are links to several articles by Randolf Geist, published on his own blog or on Oracle’s Technet:

One of the awkward problems you can encounter with parallel execution is data skew – which has the potential to make just one slave in a set do (almost) all the work hence reducing the performance to something close to serial execution times.  has written a series of articles on Parallel Skew that has been published by AllthingsOracle over the last few months.

And a summary posting from Randolf listing the 5 articles above, but also including a set of short videos on the topic.


GI Commands : 1 -- Monitoring Status of Resources

Hemant K Chitale - Sun, 2014-07-27 08:10
In 11gR2

Listing the Status of Resources

[root@node1 ~]# su - grid
-sh-3.2$ crsctl status resource -t
Local Resources
Cluster Resources
1 ONLINE ONLINE node1 Open
2 ONLINE ONLINE node2 Open

So we see that :
a) The Cluster consists of two nodes node1 and node2
b) There are 4 ASM DiskGroups DATA, DATA1, DATA2 and FRA
c) GSD is offline as expected -- it is required only for 9i Databases
d) There is a database racdb and a service new_svc  (see my previous post)

Listing the status of SCAN Listeners

-sh-3.2$ id
uid=500(grid) gid=1001(oinstall) groups=1001(oinstall),1011(asmdba)
-sh-3.2$ srvctl status scan
SCAN VIP scan1 is enabled
SCAN VIP scan1 is running on node node2
SCAN VIP scan2 is enabled
SCAN VIP scan2 is running on node node1
SCAN VIP scan3 is enabled
SCAN VIP scan3 is running on node node1
-sh-3.2$ srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is running on node node2
SCAN Listener LISTENER_SCAN2 is enabled
SCAN listener LISTENER_SCAN2 is running on node node1
SCAN Listener LISTENER_SCAN3 is enabled
SCAN listener LISTENER_SCAN3 is running on node node1

So we see that
a) There are 3 SCAN Listeners
b) Since this is a 2-node cluster, 2 of the SCAN Listeners are on one node node1

Listing the status of the OCR

-sh-3.2$ id
uid=500(grid) gid=1001(oinstall) groups=1001(oinstall),1011(asmdba)
-sh-3.2$ ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3668
Available space (kbytes) : 258452
ID : 605940771
Device/File Name : +DATA
Device/File integrity check succeeded
Device/File Name : /fra/ocrfile
Device/File integrity check succeeded
Device/File Name : +FRA
Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check bypassed due to non-privileged user

-sh-3.2$ su root
[root@node1 grid]# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3668
Available space (kbytes) : 258452
ID : 605940771
Device/File Name : +DATA
Device/File integrity check succeeded
Device/File Name : /fra/ocrfile
Device/File integrity check succeeded
Device/File Name : +FRA
Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check succeeded

[root@node1 grid]#

So we see that :
a) The OCR is in 3 locations +DATA, +FRA and NFS filesystem /fra/ocrfile
b) A Logical corruption check of the OCR can only be done by root, not by grid

Listing the status of the Vote Disk

-sh-3.2$ crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 0e94545ce44f4fb1bf6906dc6889aaff (/fra/votedisk.3) []
2. ONLINE 0d13305520f84f3fbf6c2008a6f79829 (/data1/votedisk.1) []
Located 2 voting disk(s).

So we see that :
a) There are 2 votedisk copies  (yes, two -- not the recommended three !)
b) Both are on filesystem
How do I happen to have 2 votedisk copies ?  I actually had 3 but removed one. DON'T TRY THIS ON YOUR PRODUCTION CLUSTER.  I am adding the third one back now :
-sh-3.2$ crsctl add css votedisk /data2/votedisk.2
Now formatting voting disk: /data2/votedisk.2.
CRS-4603: Successful addition of voting disk /data2/votedisk.2.
-sh-3.2$ crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 0e94545ce44f4fb1bf6906dc6889aaff (/fra/votedisk.3) []
2. ONLINE 0d13305520f84f3fbf6c2008a6f79829 (/data1/votedisk.1) []
3. ONLINE 41c24037b51c4f97bf4cb7002649aee4 (/data2/votedisk.2) []
Located 3 voting disk(s).

There, I am now back to 3 votedisk copies.

Categories: DBA Blogs


Jonathan Lewis - Sun, 2014-07-27 01:02

So is out with a number of interesting new features, of which the most noisily touted is the “in-memory columnar storage” feature. As ever the key to making best use of a feature is to have an intuitive grasp of what it gives you, and it’s often the case that a good analogy helps you reach that level of understanding; so here’s the first thought I had about the feature during one of the briefing days run by Maria Colgan.

“In-memory columnar storage gives you bitmap indexes on OLTP systems without the usual disastrous locking side effects.”

Obviously the analogy isn’t perfect … but I think it’s very close:  for each column stored you use a compression technique to pack the values for a large number of rows into a very small space, and for each stored row you can derive the rowid by arithmetic.  In highly concurrent OLTP systems there’s still room for some contention as the session journals are applied to the globally stored compressed columns (but then, private redo introduces some effects of that sort anyway); and the “indexes” have to be created dynamically as tables are initially accessed (but that’s a startup cost, it’s timing can be controlled, and it’s basically limited to a tablescan).

Whatever the technical hand-waving it introduces – thinking of the in-memory thing as enabling real-time bitmaps ought to help you think of ways to make good use of the feature.




Greg Pavlik - Sat, 2014-07-26 10:26
Silence. Sometimes sought after, but in reality almost certainly feared - the absence of not just sound but voice. Silence is often associated with divine encounter - the neptic tradition of the Philokalia comes to mind - but also and perhaps more accurately with abandonment, divine or otherwise. I recently read Shusaku Endo's Silence, a remarkable work, dwelling on the theme of abandonment in the context of the extirpation of Kakure Kirishitan communities in Tokagawa Japan. Many resilient families survived and eventually came out of hiding in the liberalization in the mid-19th century, but the persecutions were terrible. Their story is deeply moving (sufficiently so that over time I find myself drawn to devotion to the image of Maria-Kannon). Endo's novel was not without controversy but remains one of the great literary accomplishments of the 20th century.

In fact, the reason for this post is a kind of double entendre on silence: the relative silence in literate western circles with respect to Japanese literature of the past century. Over the last month, I realized that virtually no one I had spoken with had read a single Japanese novel. Yet, like Russia of the 19th century, Japan produced a concentration of great writers and great novelists in the last 20th century that is set apart: the forces of of profound national changes (and defeat) created the crucible of great art. That art carries the distinctive aesthetic sense of Japan - a kind of openness of form, but is necessarily the carrier of universal, humanistic themes.

Endo is a writer in the post war period - the so-called third generation, and in my view the last of the wave of great Japanese literature. Read him. But don't stop - perhaps don't start - there. The early 20th century work of Natsume Soseki are a product of the Meiji period. In my view, Soseki is not only a father of Japenese literature but one of the greatest figures of world literature taken as a whole - I am a Cat remains one of my very favorite novels. Two troubling post-war novels by Yukio Mishima merit attention - Confessions of a Mask and the Sailor Who Fell From Grace with the Sea, both I would characterize broadly as existential masterpieces. The topic of identity in the face of westernization is also a moving theme in Osamu Dazai's No Longer Human. I hardly mean this as a complete survey - something in any case I am not qualified to provide -just a pointer toward something broader and important.

My encounter with contemporary Japanese literature - albeit limited - has been less impactful (I want to like Haruki Murakami in the same way I want to like Victor Pelevin, but both make me think of the distorted echo of something far better). And again like Russia, it is difficult to know what to make of Japan today - where its future will lead, whether it will see a cultural resurgence or decline. It is certain that its roots are deep and I hope she finds a way to draw on them and to flourish.

Oracle BPM 12c Installation Improvements and Issues with Mac

Andrejus Baranovski - Sat, 2014-07-26 10:13
Oracle BPM 12c is improved with a great installation process - simplified and stable. You only need to download single installation package, run a wizard and after several clicks it gets installed. All required products are installed for you out of the box, no extra configuration is needed - JDeveloper, BPM/SOA extensions, BPM/SOA schemas in Java DB. Developer can focus on his development tasks, no need to spend days to install and configure development environment. You should read official guide - Quick Start for Oracle Business Process Management Suite.

Installer works on Mac OS platform, you can install entire Oracle BPM 12c infrastructure and run domain. However, it fails to open Human Task wizard, while running JDEV on Mac OS. I have developed extremely basic Oracle BPM 12c sample application, to test Human Task wizard with JDeveloper installed on Oracle Enterprise Linux and on Mac OS -

It works fine to load Human Task wizard with JDeveloper 12c installed on Linux:

The same Human Task fails to open with JDeveloper 12c installed on Mac OS. There is Null Pointer exception generated, thrown from JDeveloper IDE class:

Messages in JDeveloper log for this exception:

Probably you should not use Oracle BPM 12c for development on Mac OS, rather use Linux or Windows platforms. This is a bit pity, but on other hand JDeveloper extensions (except ADF Mobile) always had issues running on Mac OS. It should be no difference, but it seems like development on Linux or Windows with JDeveloper is more stable.

Why Oracle Big Data SQL Potentially Solves a Big Issue with Hadoop Security

Rittman Mead Consulting - Sat, 2014-07-26 07:21

Oracle announced their Big Data SQL product a couple of weeks ago, which effectively extends Exadata’s query-offloading to Hadoop data sources. I covered the launch a few days afterwards, focusing on how it implements Exadata’s SmartScan on Hive and NoSQL data sources and provides a single metadata catalog over both relational, and Hadoop, data sources. In a Twitter conversation later in the day though, I made the comment that in my opinion, the biggest benefit of Big Data SQL will be in its ability to extend Oracle’s security model to Hadoop data sources, because Hadoop security Hadoop security is still a bit of a mess:

lang=”en”>To me the greatest benefit of Big Data SQL is the single security model; even with Sentry, Hadoop security is fragmented and a mess (IMO)

— Mark Rittman (@markrittman) July 17, 2014

I’ve been working on an Oracle Big Data Appliance project over the past few weeks, as the technical architect and initial sysadmin for the cluster, and it’s given me a first-hand experience of what security’s like on a Hadoop cluster. Over the past few weeks I’ve had to come up with a security policy covering HDFS, Hive and the Cloudera management tools (Cloudera Manager, Hue etc), and try and implement an access and authorisation approach that ensures only designated people can log in, and when they’re in, they can only see the data they’re supposed to see. Hadoop at this point, to my mind, suffers from a couple of major issues when it comes to security:

  • It’s fragmented, in that each tool or Hadoop product tends to have its own security setup, and the documentation is all split up, rapidly goes out of date, and is more of a reference than a tutorial (Cloudera’s Security documentation is one of the better examples, but it still splits the key information you need over several sections and several other docs)
  • It’s full of holes, such that the default security setup is considered insecure in terms of users being able to spoof their details, and making it more secure is again an exercise in hunting through docs, with some pretty complex configuration steps you need to perform (for example, configuring Kerebos authentication, a prerequisite for things like Apache Sentry)

If we take a typical security policy that a large enterprise customer’s going to want to put in place, it’ll look something like this:

  • Users should only be able to log in via their corporate LDAP account, and we’ll want that login process to be secure so it can’t easily be bypassed
  • We want to be able to secure our datasets, so that only authorised users can view particular datasets, and there’s likely to be some groups we grant read-only access to, and others we grant read-write
  • The data loading processes for the Hadoop cluster need to be locked-down so they can’t overwrite the datasets of other applications
  • Our security policy ideally needs to sync-up, or be an extension of, our existing enterprise security policy, not something we maintain separately
  • We need to be able to audit and review who’s actually accessing what dataset, to ensure that these policies are being followed and enforced
  • We also need the ability to obfuscate or depersonalise data before it gets into the cluster, and also have the option of encrypting the data at-rest as well as on-the-wire

Back in the early days of Hadoop these types of security policy weren’t often needed, as the users of the Hadoop cluster were typically a small set of data scientists or analysts who’d been cleared already to view and work with the data in the cluster (or more likely, they did it and just didn’t tell anyone). But as we move to enterprise information management architectures such as the one outlined in my two-part blog post series a few weeks ago (pt.1, pt.2), the users of Hadoop and other “data reservoir” data sources are likely to increase significantly in number as data from these systems becomes just another part of the general enterprise data set.


But in practice, this is hard to do. Let’s start with HDFS first, the Hadoop Distributed File System on which most Hadoop data is stored. HDFS aims to look as similar to a Linux or Unix-type filesystem as possible, with similar commands (mkdir, ls, chmod etc) and the same POSIX permissions model, where files and directories are associated with an owner and a group and where permissions are set for that owner, the group and all others. For example, in the HDFS file listing below, the “/user/cust_segment_analysis” directory is owned by the user “mrittman” and the group “marketing”, with the directory owner having full read, write and subdirectory traversal access to the directory, the group having read-only and subdirectory traversal access, and all others having no access at all.

[root@bdanode1 ~]# hadoop fs -ls /user
Found 13 items
drwxrwxrwx   - admin    admin               0 2014-06-02 16:06 /user/admin
drwxr-x---   - mrittman marketing           0 2014-07-26 21:31 /user/cust_segment_analysis
drwxr-xr-x   - hdfs     supergroup          0 2014-05-27 13:19 /user/hdfs
drwxrwxrwx   - mapred   hadoop              0 2014-05-25 20:47 /user/history
drwxrwxr-t   - hive     hive                0 2014-06-04 16:31 /user/hive
drwxr-xr-x   - hue      hue                 0 2014-05-31 18:51 /user/hue
drwxrwxr-x   - impala   impala              0 2014-05-25 20:54 /user/impala
drwxrwxr-x   - oozie    oozie               0 2014-05-25 20:52 /user/oozie
drwxrwxrwx   - oracle   oracle              0 2014-06-09 21:38 /user/oracle
drwxr-xr-x   - root     root                0 2014-06-06 16:25 /user/root
drwxr-xr-x   - sample   sample              0 2014-05-31 18:51 /user/sample
drwxr-x--x   - spark    spark               0 2014-05-25 20:45 /user/spark
drwxrwxr-x   - sqoop2   sqoop               0 2014-05-25 20:53 /user/sqoop2

Which all sounds great until you then have another group that needs read-write access to the directory, but you’re limited to just one group permissions setting for the directory which you’ve already used to set up read-only access for that particular group. If you therefore need to set up different sets of security access for different groups, you typically then end-up creating multiple HDFS directories and multiple copies of the dataset in question, assigning each copy to a different group, which isn’t all that convenient and gives you other problems in terms of maintenance and keeping it all in-sync.

What you of course need is something like the “access control lists” (ACLs) you get with operating systems like Windows NT and MacOS, where you can define an arbitrary number of user groups and then assign each of them their own permission set on the directory and the files it contains. The most recent versions of Hadoop actually implement a form of ACL for HDFS, with this feature making its way into the recently-released Cloudera CDH5.1, but these ACLs are an addition to the standard POSIX user, group, others model and aren’t recommended for all files in your HDFS filesystem as according to the Hadoop docs “Best practice is to rely on traditional permission bits to implement most permission requirements, and define a smaller number of ACLs to augment the permission bits with a few exceptional rules. A file with an ACL incurs an additional cost in memory in the NameNode compared to a file that has only permission bits.” Still, it’s better than not having them at all, and I’d imagine using this feature for particular directories and sets of files that need more than one set of group permissions configured for them.

In most cases though, the way you’ll present data out to non-technical end-users and applications is through Hive and Impala tables, or through tools like Pig and Spark. Under the covers, these tools still use HDFS permissions to control access to the data within Hive and Impala tables, but again by default you’re limited to granting access to whole HDFS directories, or the files contained within those directories. Something that addresses this issue is a product called Apache Sentry, an open-source project within the Hadoop family that enables role-based access control for Hive and Impala tables. Oracle are one of the co-founders of the Sentry project and include it in the base software on the Big Data Appliance, and using Sentry you can grant SELECT, INSERT or ALL privileges to a group on a particular Hive or Impala table, rather than on the underlying HDFS directories and files. A form of fine-grained access control can be set up using Sentry by creating views with particular row-level security settings, giving you the basics of a database-like security policy that you can apply over the main way that users access data in the cluster.

But Sentry itself has a few significant prerequisites – you have to enable Kerebos authentication on your cluster, which you should do anyway because of the risk of account spoofing, but is still a significant thing to set up – and of course you need to link Hive and Impala to your corporate LDAP server and configure them to work in the way that Sentry requires. Most importantly though, you’re still left with the situation where you’ve got two separate security setups – the one for your corporate data warehouse and relational data sources, and another for data accessed on Hadoop, and it’s still hard to be sure, what with all the disparate products and partially-complete open-source products, whether data in your Hadoop cluster is still really secure (though products like Cloudera Navigator aim to provide some form of data governance and auditing over these datasets); and, there’s still no straightforward way to remove individual customers’ data out of the Hadoop dataset (“data redaction”), no easy way to obfuscate or mask data, and no easy way (apart from the Hive views mentioned before) to restrict users to accessing only certain columns in a Hive or Impala table.

And so this is where Oracle’s Big Data SQL product could be very interesting. Big Data SQL takes the Exadata model of moving as much filtering and column-projection as it can to the storage server, adding Oracle SmartScan functionality to the Hadoop node and allowing it to understand the full Oracle SQL dialect (and PL/SQL security functions), rather than just the subset of SQL provided by HiveQL and Impala SQL.


More importantly, it’ll enable a single unified data dictionary over both Oracle and Hadoop data sources, presenting Hive tables and NoSQL data as regular Oracle tables and allowing the DBA to create data security, redaction and row-level filtering policies over both relational and Hadoop data – giving you potentially the ability to define a single security policy across all data in your overall information management architecture.


So I think this is actually a “big deal”, and potentially even more game-changing that the SmartScan functionality that got most of the attention with the Big Data SQL product launch. How well it’ll work in-practice, and how much will be enabled on day one it’s hard to say, but this feature meets a real need that our customers are finding now, so I’ll be very interested to try it out when the product becomes available (presumably) later in the year.

Categories: BI & Warehousing

Oracle Database 12c Release – My First Observations. Licensed Features Usage Concerns – Part II.

Kevin Closson - Fri, 2014-07-25 15:02

In this post you’ll see that I provide an scenario of accidental paid-feature “use.”  The key elements of the scenario are: 1) I enabled the feature (by “accident”) but 2) I didn’t actually use the feature because I neither created nor altered any tables.

In Part I of this series I aimed to bring to people’s attention what I see as a significant variation from the norm when it comes to Oracle licensed-option usage triggers and how to prevent them from being triggered. Oracle Database Enterprise Edition supports several separately licensed options such as Real Application Clusters, Partitioning, and so on.  A feature like Real Application Clusters is very expensive but if  “accidental usage” of this feature is a worry on an administrator’s mind there is a simple remedy: unlink it. If the bits aren’t in the executable you’re safe. Is that a convoluted procedure? No. An administrator simply executes make -d rac_off and then relinks the Oracle executable. Done.

What about other separately licensed options like Partitioning?  As I learned from Paul Bullen, once can use the Oracle-supplied chopt command to remove any chance of using Partitioning if, in fact, one does not want to use Partitioning. I thought chopt might be the solution to the issue of possible, accidental usage of the In-Memory Column Store feature/option. However, I found that chopt, as of this point, does not offer the ability to neutralize the feature as per the following screenshot.


Trivial Pursuit of the Ignoramus or Mountainous Mole Hill?

There is yet no way I know of to prevent accidental use of the In-Memory Column Store feature/option. Am I just making a mountain out of a mole hill? I’ll let you be the judge. And if you side with folks that do feel this is a mountainous-mole hill you’d be in really good company.

Lest folks think that we Oaktable Network Members are a blind, mutual admiration society, allow me to share the rather sizzling feedback I got for raising awareness to this aspect of Oracle Database 12c:



No, I didn’t just want to dismiss this feedback. Instead  I pushed the belt-sander off of my face and read the words a couple of times. The author of this email asserted I’m conveying misinformation ( aka “BS”) and to fortify that position it was pointed out that one must:

  1. Set a database (instance initialization) parameter.
  2. Bounce the instance.
  3. Alter any object to use the feature. I’ll interpret that as a DDL action (e.g., ALTER TABLE, CREATE TABLE).

Even before I read this email I knew these assertions were false. We all make mistakes–this I know!  I should point out that unlike every release of Oracle from 5.1.17 to 11gR2 I was not invited to participate in the Beta for this feature. I think a lot of Oaktable Network members were in the program–perhaps even the author of the above email snippet–but I don’t know that for certain. Had I encountered this during a Beta test I would have raised it to the Beta manager as an issue and maybe, just maybe, the feature behavior might have changed before first customer ship. Why am I blabbering on about the Beta program? Well, given the fact that even Oaktable Network members with pre-release experience with this feature evidently do not know what I’m about to show in the remainder of this post.

What Is An Accident?

Better yet, what is an accident and how full of “BS” must one be to fall prey? Maybe the remainder of the post will answer that rhetorical question. Whether or not  it does, in fact, answer the question I’ll be done with this blog series and move on to the exciting work of performance characterization of this new, incredibly important feature.

Anatomy of a “Stupid Accident.”

Consider a scenario. Let’s say a DBA likes to use the CREATE DATABASE statement to create a database. Imagine that!  Let’s pretend for a moment that DBAs can be very busy and operate in chaotic conditions. In the fog of this chaos, a DBA could, conceivably, pull the wrong database instance initialization file (e.g., init.ora or SPFILE) and use it when creating a database. Let’s pretend for a moment I was that busy, overworked DBA and I’ll show you in the following happeneds:

  1. I executed sqlplus from the bash command prompt.
  2. I directed sqlplus to execute a SQL script called cr_db.sql. Many will recognize this as the simple little create script I supply with SLOB.
  3. The cr_db.sql script uses a local initialization parameter file called create.ora
  4. sqlplus finished creating the database. NOTE: this procedure does not create a single user table.
  5. After the database was created I connected to the instance and forced the feature usage tracking views to be updated (thanks to Morgan’s Library for that know-how as well…remember, I’m a database platform engineer not a DBA so I learn all the time in that space).
  6. I executed a SQL script to report feature usage of only those features that match a predicate such as “In-%’



This screen shot shows that the list of three asserted must-happen steps (provided me by a fellow Oaktable Network member) were not, in fact, the required recipe of doom.  The output of the features.sql script proves that I didn’t even need to create a user table to trigger the feature.

The following screen shot shows what the cr_db.sql script does:


The following screenshot shows the scripts I used to update the feature usage tracking views and to report against same:


The “Solution” To The “Puzzle”

Stepping on a landmine doesn’t just happen. You have to sort of be on your feet and walking around for that to happen. In the same vein, triggering usage of the separately licensed Oracle Database 12c Release In-Memory Column Store feature/option required me to be “on my feet and walking around” the landmine–as it were. Did I have to jump through hoops and be a raging, bumbling idiot to accidentally trigger usage of this feature? No. Or, indeed, did I issue a single CREATE TABLE or ALTER TABLE DDL statement? No. What was my transgression? I simply grabbed the wrong database initialization parameter file from my repository–in the age old I’m-only-human sort of way these things often happen.

To err to such a degree would certainly not be human, would it?

The following screenshot shows the parameter file I used to prove:

  1. You do not need to alter parameters and bounce an instance to trigger this feature usage in spite of BS-asserting feedback from experts.
  2. You don’t even have to create a single application table to trigger this feature usage.



This blog thread has made me a feel a little like David Litchfield must have surely felt for challenging the Oracle9i-era claims of how Oracle Database was impenetrable by database security breaches. We all know how erroneous those claims where. Unbreakable, can’t break it, can’t break in?

Folks, I know we all have our different reasons to be fans of Oracle technology–and, indeed, I am a fan. However, I’m not convinced that unconditional love of a supposed omnipotent and omniscient god-like idol are all that healthy for the IT ecosystem. So, for that reason alone I have presented these findings. I hope it makes at least a couple of DBAs aware of how this licensed feature differs from other high-dollar features like Real Application Clusters in exactly what it takes to “use” the feature–and, moreover, how to prevent stepping on a landmine as it were.


…and now, I’m done with this series.



Filed under: oracle

SQL monitoring shows adaptive plans

Yann Neuhaus - Fri, 2014-07-25 14:23

In a previous post, I have described Adaptive Plans. Even if I prefer to show plans with the SQL Monitor active html format, I had to stick with the dbms_xplan for that because SQL Monitoring did not show all information about adaptive plans.

This has been fixed in the Patchset 1 and I have run the same query to show the new feature.

First, an adaptive plan can be in two states: 'resolving' where all alternatives are possible and 'resolved' then the final plan has been choosen. It is resolved once the first execution statistics collector has made the decision about the inflection point. We can see the state in the SQL Monitor header:




Here my plan is resolved because the first execution is finished.

The plan with rowsource statistics show only the current plan, but the 'Plan Note' shows that it is an adaptive plan:




Now we have to go to the 'Plan' tab which show the equivalent of dbms_xplan.display_cursor:




Here the format is equivalent to format=>'adaptive'. It's the 'Full' plan where all branches are shown but inactive part is grayed. We have here the Statistics Collector after reading DEPARTMENTS, and we have the inactive full table scan hash join of EMPLOYEES.

Just choose the 'Final' Plan (or 'Current' if it is not yet resolved) to get only the active part:




I often prefer the tabular format to the graphical one:




We have all information: the 7 rows from DEPARTMENTS have gone through STATISTICS COLLECTOR and NESTED LOOP with index access has been choosen. Note that it is different from the previous post where HASH JOIN with full table scan was choosen because the 7 rows were higher than the inflection point.

In my current example, because I have system statistics that costs full table scan higher:


DP: Found point of inflection for NLJ vs. HJ: card = 8.35


This is higher than ny 7 rows from DEPARTMENTS.

Here is the whole sqlmon report: and how I got it:


alter session set current_schema=HR;
select /*+ monitor */ distinct DEPARTMENT_NAME from DEPARTMENTS
 where DEPARTMENT_NAME like '%ing' and SALARY>20000;

alter session set events='emx_control compress_xml=none';set pagesize 0 linesize 10000 trimspool on serveroutput off long 100000000 longc 100000000 echo off feedback off
spool sqlmon.htm
select dbms_sqltune.report_sql_monitor(report_level=>'all',type=>'html') from dual;
spool off

Note that I used the script exposed here and I used the emx_event to get the uncompressed xml, which I got from Tyler Muth:

@FranckPachot well, I asked @DBAKevlar, she asked the developer that owns it ;)

— tmuth (@tmuth) July 25, 2014