Skip navigation.

Feed aggregator

Oracle Makes 155 fixes for Patch Tuesday [VIDEO]

Chris Foot - Tue, 2014-10-21 12:33


Welcome to RDX! For those using Oracle Products, Oracle’s October critical patch update contains an unusually high number of security bug fixes.

ZDNet contributor Liam Tun noted that Oracle released patches for 155 security flaws for 44 of its products October 14th. Fixes include 25 security fixes for Java SE. The components affected include Java SE, Java SE embedded, JavaFX and JRockit. The highest Common Vulnerability Scoring System (CVSS) rating among the Java fixes was a 10, the highest rating available.

Also included are 32 fixes to Oracle Database Server products, with at least one receiving a CVSS rating of 9, 17 fixes for Oracle Fusion Middleware, 4 fixes for Oracle Retail Applications, 15 fixes for Oracle Sun Systems Product Suite and 24 for Oracle MySQL.

Many of these vulnerabilities may be remotely exploitable without authentication.

Thanks for watching!

The post Oracle Makes 155 fixes for Patch Tuesday [VIDEO] appeared first on Remote DBA Experts.

Pythian at PASS 2014

Pythian Group - Tue, 2014-10-21 11:40

Join us in Seattle for SQL PASS Summit 2014—planned by and for the Microsoft SQL Server community—where some of our very own will be presenting. You’ll notice that PASS has a funny way of scheduling speaking sessions, so both of our experts are presenting at the same time.

Also be sure to visit us at booth #226 for a chance to win a Sonos Play: 1. The Exhibit Hall will be open all day Wednesday, Thursday, and Friday morning.


Edwin Sarmiento Configuring SharePoint 2013 as a Business Intelligence Platform by Edwin Sarmiento
Wednesday November 5 — 1:30-2:45 PM
Room 608

Edwin Sarmiento, a Microsoft MVP, Certified Master, and Principal Consultant in Pythian’s Advanced Technology Group, will be presenting a session called Configuring SharePoint 2013 as a Business Intelligence Platform.

With SharePoint as the business collaboration platform of choice for the enterprise, it has become an important piece in the overall Microsoft Business Intelligence (BI) stack. However, deploying and configuring it to support Microsoft BI solutions has become a cumbersome task for IT experts.

In this session, Edwin will demonstrate what it takes to successfully architect and design SharePoint 2013 as a BI platform. He will cover multiple configuration scenarios and the different technologies necessary to build the infrastructure. Attendees will walk away with the confidence to run Power Pivot, Power View, and Reporting Services in their SharePoint 2013 farms.


Warner ChavesThe Use Cases for In-Memory OLTP by Warner Chaves
Wednesday November 5 — 1:30-2:45 PM
Room 3AB

Warner Chaves, a Microsoft Certified Master and Principal Consultant in Pythian’s SQL Server practice, will be presenting a session called The Use Cases for In-Memory OLTP.

In this session, he will do a short introduction to the In-Memory OLTP feature before diving straight into the use cases where the new lockless/latchless concurrency control and native compilation really shine.

Demos will cover PAGELATCH contention (or lack thereof), use of non-persistent in-memory tables for ETL, and in-memory tables as “shock absorbers” for high throughput environments.

For each case, Warner will do a comparison of “classic” versus in-memory, what gains the audience can expect, and what patterns will yield the biggest benefits.


BONUS MATERIAL! Since its release back in April, SQL Server experts across the globe are becoming familiar with the top features in Microsoft SQL Server 2014—the In-Memory OLTP engine, the AlwaysOn enhancements, and more. Warner couldn’t help but notice that there were a few features that weren’t getting the same attention and filmed a video series sharing the most underrated features of SQL Server 2014. We’ll be publishing that series during PASS, so follow @Pythian on Twitter to receive our updates.


Pythian is a global leader in data consulting and managed services. We specialize in optimizing and managing mission-critical data systems, combining the world’s leading data experts with advanced, secure service delivery. Learn more about Pythian’s Microsoft SQL Server expertise.

Categories: DBA Blogs

Oracle 12c Multitenant – Inmemory admin basics

Dominic Brooks - Tue, 2014-10-21 11:15

A couple of very, very basic observations on getting going with 12c Inmemory in a multitenant database.

1. When trying to set inmemory_size within a PDB when inmemory_size is 0 in the CDB
ORA-02096: specified initialization parameter is not modifiable with this option

SQL> alter session set container = cdb$root;

Session altered.

SQL> select value from v$parameter where name = 'inmemory_size';


SQL> alter session set container = orcl;

Session altered.

SQL> alter session set inmemory_size=100M;

alter session set inmemory_size=100M
ERROR at line 1:
ORA-02096: specified initialization parameter is not modifiable with this option

2. You have to use scope=spfile when setting inmemory on CDB and it requires restart to take effect

SQL> alter session set container = cdb$root;

Session altered.

SQL> alter system set inmemory_size = 500M;

alter system set inmemory_size = 500M
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-02095: specified initialization parameter cannot be modified

SQL> alter system set inmemory_size = 500M scope=spfile;

System altered.

SQL> select * from v$sga;

NAME                 VALUE      CON_ID
-------------------- ---------- ----------
Fixed Size           2926472    0
Variable Size        1224738936 0
Database Buffers     905969664  0
Redo Buffers         13848576   0

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL> startup
ORACLE instance started.

Total System Global Area 2147483648 bytes
Fixed Size               2926472 bytes
Variable Size            1275070584 bytes
Database Buffers         318767104 bytes
Redo Buffers             13848576 bytes
In-Memory Area           536870912 bytes
Database mounted.
Database opened.

3. You don’t use scope when setting in a PDB otherwise
ORA-02096: specified initialization parameter is not modifiable with this option

SQL> alter session set container = orcl;

Session altered.

SQL> alter system set inmemory_size=400M scope=spfile;

alter system set inmemory_size=400M scope=spfile
ERROR at line 1:
ORA-02096: specified initialization parameter is not modifiable with this option

SQL> alter system set inmemory_size=400M;

System altered.

4. If you try to set the PDB inmemory_size larger than the CDB then you get
ORA-02097: parameter cannot be modified because specified value is invalid

SQL> alter session set container = orcl;

Session altered.

SQL> alter system set inmemory_size=600M;

alter system set inmemory_size=600M
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-02095: specified initialization parameter cannot be modified


5. Similar to point (2), if we change the inmemory size of the CDB, we have to restart for it to take effect

SQL> select sys_context('USERENV','CON_NAME') from dual;


SQL> alter system set inmemory_size=500M scope=spfile;

System altered.

SQL> select value from v$parameter where name = 'inmemory_size';


SQL> shutdown immediate  
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL> startup
ORACLE instance started.

Total System Global Area 2147483648 bytes
Fixed Size               2926472 bytes
Variable Size            1157630072 bytes
Database Buffers         436207616 bytes
Redo Buffers             13848576 bytes
In-Memory Area           536870912 bytes
Database mounted.
Database opened.

6. If we modify inmemory_size for PDB, it takes effect immediately (presumably because all it really is a maximum quota on the actual memory structure established in the CDB?):

SQL> alter session set container = orcl;

Session altered.

SQL> alter system set inmemory_size=200M;

System altered.

SQL> select value from v$parameter where name = 'inmemory_size';


SQL> alter system set inmemory_size=400M;

System altered.

SQL> select value from v$parameter where name = 'inmemory_size';


7. While we showed in point (4) that a single PDB inmemory_size cannot exceed the CDB size, we can overallocate such “maximum quotas” across all PDBs – i.e. the sum of inmemory_size for all PDBs can exceed CDB size (prompts future investigations):

SQL> alter session set container=cdb$root;

Session altered.

SQL> select value from v$parameter where name = 'inmemory_size';


SQL> alter session set container=orcl;

Session altered.

SQL> select value from v$parameter where name = 'inmemory_size';


SQL> alter session set container=cdb$root;

Session altered.

SQL> create pluggable database dom from orcl file_name_convert=
 2  ('/home/oracle/app/oracle/oradata/cdb1/orcl',
 3  '/home/oracle/app/oracle/oradata/cdb1/dom');

Pluggable database created.

SQL> alter pluggable database dom open;

Pluggable database altered.

SQL> alter session set container = dom;

Session altered.

SQL> alter system set inmemory_size = 400M;

System altered.

SQL> select value from v$parameter where name = 'inmemory_size';


Monitoring OBIEE with the ELK stack

Rittman Mead Consulting - Tue, 2014-10-21 09:37

Monitoring the health of an OBIEE system and diagnosing problems that may occur is a vital task for the system’s administrator and support staff. It’s one that at Rittman Mead we help customers with implementing themselves, and also provide as a managed service. In this article I am going to discuss the ELK stack, which fills a specific gap between the high-level monitoring and configuration functionality of Enterprise Manager 11g Fusion Middleware Control, and the Enterprise-grade monitoring, alerting and configuration management of Enterprise Manager 12c Cloud Control.

The ELK stack enables you to rapidly access both summary and detail information across the stack, supporting swift identification and diagnosis of any issues that may occur. The responsive interface lets you to drill into time periods or any ad-hoc field or filter as you wish, to analyse and diagnose problems. Data can be summarised and grouped arbitrarily, displaying relative error rates ensuring that genuine problems are not lost in the ‘noise’ of usual operation.

Out of the box, OBIEE ships with Enterprise Manager 11g Fusion Middleware Control (FMC), which as the name says is part of the Enterprise Manager line of tools from Oracle for managing systems. It is more of a configuration and deployment tool than it is really a monitoring and diagnostics one. The next step up is FMC’s (very) big brother, Enterprise Manager 12c Cloud Control (EM12c). This is very much its own product, requiring its own infrastructure and geared up to monitoring an organisation’s entire fleet of [Oracle] hardware and software. With this greatly enhanced functionally with EM12c also comes a license cost. The ELK stack conceptually fits perfectly alongside your existing EM FMC, providing a most excellent OBIEE monitoring dashboard and analysis tool, and allowing you to explore the kind of diagnostics and historical data that you could have access to in EM 12c.

In ELK we can see at a glance what kind of relative activity there has been on the system over the past few days:

There have been some errors, and the top three nQS and ORA error codes and messages are shown. This is an important differentiator to EM where you can search for errors, but cannot see straightaway if it is a one-off or multiple occurence. By grouping by error message it’s possible to quickly see what the biggest problem on a system may currently be:

At this point we might want to drill down into what was being run when the errors were being thrown. For example, from the error summary alone we can see the biggest problem was a locked database account – but which database was being queried? Lower down the dashboard page is a list of log details, and by clicking on the search icon against an error message we can filter the results shown:

We can use the search icon again to restrict results by ECID

And from there see all the related log entries, including which connection pool the request was against (and thus which database account is locked)

Another way of diagnosing a sudden rash of errors would be to instead drilldown on time alone to take a more holistic view at the logs (useful also given that ECIDs don’t always give the full picture). Using the system activity timeline along with the events log view it is a piece of cake to do this – simply click and drag a time window on the chart to instantly zoom into it.

Taking a step back up, we can see at a glance which areas of the OBIEE metadata model (RPD) are being used, as well as where we are pulling logs from – and all of these are clickable in order to filter the results further. So it’s easy to see, for a given subject area, what’s the current error rate? Or to quickly access all the log files for a specific set of components alone (for example, BI Server and OPMN). Any field that is displayed, whether in a chart or a detailed log view, can be clicked and used as the input for an ad-hoc filter.

It’s not just errors and logs we can monitor – the current and trending performance of the system (or a part of it; note the filtering by subject area and database described above) can be observed and of course, drilled into:

The ELK stack

The ELK stack is a suite of free software made up of three tools, the first letter of each giving it its name:

  • ElasticSearch
  • Logstash
  • Kibana

At a very high level, we collect and enrich diagnostic data from log files using logstash, store it in ElasticSearch, and present and analyse it through Kibana.

  • ElasticSearch is a document store, in which data with no predefined structure can be stored. Its origins and core strength are in full text search of any of the data held within it, and it is this that differentiates it from pure document stores such as MongoDB that Mark Rittman wrote about recently. Data is loaded and retrieved from ElasticSearch through messages sent over the HTTP protocol, and one of the applications that can send data this way and works extremely well is Logstash.
  • Logstash is an innocuous looking tool that at first glance one could mistakenly write off as “just” a log parser. It does a lot more than that and a healthy ecosystem of input, filter, codec and output plugins means that it can interface between a great variety of applications, shifting data from one to another and optionally processing and enriching it along the way.
  • The final piece of the stack is Kibana, a web application that enables one to build very flexible and interactive time-based dashboards, sourcing data from ElasticSearch. Interestingly, another of my favourite tools that I have written about before – and will write about again in this article – is Grafana which is forked from Kibana (and modified to source its data from time-series databases like graphite/carbon/whisper or InfluxDB) – thus if you’re at home with one you will be the other.

In this article I’m going to show how to set up your own ELK stack to monitor OBIEE, based on SampleApp v406.

Who is this for?

As you will see below, setting up and configuring the ELK stack does involve rolling up ones sleeves and diving right in. If you’re looking for an off-the-shelf monitoring solution then you should look elsewhere (such as EM12c). But if you want to have a crack at it I think you’ll be pleasantly surprised at what is possible once you get past the initially (bumpy) learning curve. The capabilities are great, and there’s an active support community as is the case with lots of open-source tools. With a bit of work it is possible to create a monitoring environment tailored pretty much entirely to your design.

Installing the stack

ELK runs on all common linux distributions (including Oracle Linux), as well as Mac OS. The only prerequisite is a JDK for ElasticSearch and Logstash, and web server for Kibana; here I am using Apache.

First up, let’s install JDK 1.7 (SampleApp has 1.6, which isn’t enough):

sudo yum install -y java-1.7.0-openjdk.x86_64

Apache is already installed on SampleApp, which we can verify thus:

[oracle@demo ~]$ sudo yum install -y httpd
Loaded plugins: refresh-packagekit
Setting up Install Process
Package httpd-2.2.15-29.0.1.el6_4.x86_64 already installed and latest version
Nothing to do
[oracle@demo ~]$ sudo service httpd status
httpd is stopped

It’s shutdown by default and that’s fine because we need to update the configuration on it anyway.


The easiest way to install ElasticSearch is using the yum repository:

sudo rpm --import

cat > /tmp/elasticsearch.repo<<EOF
name=Elasticsearch repository for 1.3.x packages

sudo mv /tmp/elasticsearch.repo /etc/yum.repos.d/
sudo yum install -y elasticsearch

I’d then set it to start at boot automagically:

sudo chkconfig elasticsearch on

and then start it up:

sudo service elasticsearch start

One final, optional, step in the installation is a plugin called kopf which gives a nice web dashboard for looking at the status of ElasticSearch:

export JAVA_HOME=/usr/lib/jvm/jre-1.7.0-openjdk.x86_64/
cd /usr/share/elasticsearch/bin
sudo ./plugin -install lmenezes/elasticsearch-kopf


There’s no repository for logstash, but it’s no biggie because there’s no install as such, just a download and unpack. Grab the download archive for logstash from the ELK download page, and then unpack it:

cd ~/Downloads
# Need to use sudo because /opt is owned by root
sudo tar -xf logstash-1.4.2.tar.gz --directory /opt/
sudo mv /opt/logstash-1.4.2/ /opt/logstash/
sudo chown -R oracle. /opt/logstash/


As with logstash, Kibana just needs downloading and unpacking. There’s also a wee bit of configuration to do, so that the web server (Apache, in our case) knows to talk to it, and so that Kibana knows how to find ElasticSearch.

cd ~/Downloads/
sudo tar -xf kibana-3.1.0.tar.gz --directory /opt
sudo mv /opt/kibana-3.1.0/ /opt/kibana/
sudo chown -R oracle. /opt/kibana/

Now to configure Apache, telling it where to find Kibana. If you have existing sites configured, you’ll need to sort this bit out yourself, but on a vanilla SampleApp v406 you can use the following sed command to set up the needful:

sudo sed -i'.bak' -e 's/DocumentRoot.*$/DocumentRoot "\/opt\/kibana\/"/g' /etc/httpd/conf/httpd.conf

Lastly, Kibana needs to know where to find ElasticSearch, which is where it is going to pull its data from. An important point here is that the URL of ElasticSearch must be resolvable and accessing from the web browser you run Kibana on, so if you are using a DNS name it must resolve etc. You can update the configuratinon file config.js by hand (it’s the elasticsearch: definition that needs updating), or use this sed command:

sed -i'.bak' -e 's/^\s*elasticsearch:.*$/elasticsearch: "http:\/\/",/g' /opt/kibana/config.js

Finally, [re]start Apache so that it uses the new configuration:

sudo service httpd restart

You should be able to now point your web browser at the server and see the default Kibana dashboard. So for sampleapp, if you’re running Firefox locally on it, the URL would simply be http://localhost/ (port 80, so no need to specify it in the URL). Note that if you’re doing anything funky with network, your local web browser needs to be able to hit both Apache (port 80 by default), and ElasticSearch (port 9200 by default).

Configuring ELK end-to-end

Now that we’ve got the software installed, let’s see how it hangs together and create our first end-to-end example. There’s a good logstash tutorial here that covers a lot of the functionality. Here, I’ll just look at some of the very basics, creating a very simple logstash configuration which will prompt for input (i.e. stdin) and send it straight to ElasticSearch. The kopf plugin that we installed above can show that the data made it to ElasticSeach, and finally we will create a very simple Kibana dashboard to demonstrate its use.

Logstash works by reading a configuration file and then running continually waiting for the configured input. As well as the input we configure an output, and optionally in between we can have a set of filters. For now we will keep it simple with just an input and output. Create the following file in /opt/logstash and call it logstash-basic.conf:

input {
        stdin {}
output {
        elasticsearch {}

It’s pretty obvious what it’s saying – for the input, use stdin, and send it as output to elasticsearch (which will default to the localhost).
Run this with logstash:

export JAVA_HOME=/usr/lib/jvm/jre-1.7.0-openjdk.x86_64/
cd /opt/logstash
bin/logstash -f logstash-basic.conf

After a few moments you should get a prompt. Enter some text, and nothing happens… apparently. What’s actually happened is that the text, plus some information such as the current timestamp, has been sent to ElasticSearch.

Let’s see where it went. In a web browser, got to http://localhost:9200/_plugin/kopf/. 9200 is the port on which ElasticSearch listens by default, and kopf is a plugin we can use to inspect ElasticSearch’s state and data. ElasticSeach has a concept of an index, in which documents, maybe of the same repeating structure but not necessarily, can be stored. Crudely put, this can be seen as roughly analogous to tables and rows of data respectively. When logstash sends data to ElasticSearch it creates one index per day, and in kopf now you should see an index with the current date, with a “document” for each line you entered after running logstash:

ElasticSearch can be queried using HTTP requests, and kopf gives a nice way to construct these and see the results which are in JSON format. Click on the rest (as in, REST API) menu option, leave the request as default http://localhost:9200/_search, and click send. You’ll see in the response pane a chunk of JSON in amongst which are the strings that you’ve entered to logstash:

Enter a few more lines into the logstash prompt, and then head over to http://localhost/ where you should find the default dashboard, and click on the Logstash Dashboard option:

It’s fairly bare, because there’s very little data. Notice how you have a histogram of event rates over the past day at the top, and then details of each event at the bottom. There are two things to explore here. First up, go and enter a bit more data into logstash, so that the create events have been spread out over time. Click the refresh icon on the Kibana dashboard, and then click-drag to select just the period on the chart that has data. This will zoom in on it and you’ll see in greater definition when the events were created. Go and click on one of the event messages in the lower pane and see how it expands, showing the value of each field – including message which is what logstash sent through from its input to output.

Now let’s get some proper data in, by pointing logstash at the BI Server log (nqsserver.log). Create a new configuration file, logstash-obi.conf, and build it up as follows. First we’ll use the file input to get data from …wait for it….a file! The syntax is fairly obvious:

input {
        file {
                path => "/app/oracle/biee/instances/instance1/diagnostics/logs/OracleBIServerComponent/coreapplication_obis1/nqserver.log"

Now we need to tell Logstash how to interpret the file. By default it’ll chuck every line of the log to ElasticSearch, with the current timestamp – rather than the timestamp of the actual event.

Now is time to introduce the wonderful world of the grok. A grok is one of the most important of the numerous filter plugins that are available in logstash. It defines expected patterns of content in the input, and maps it to fields in the output. So everything in a log message, such as the timestamp, user, ecid, and so on – all can be extracted from the input and stored as distinct items. They can also be used for further processing – such as amending the timestamp output from the logstash event to that of the log file line, rather than the system time at which it was processed.

So, let us see how to extract the timestamp from the log line. An important part of grok’ing is patterns. Grok statement are written as Regular expressions, or regex (obXKCD), so to avoid continual wheel-reinventing of regex statements for common objects (time, ip addresses, etc) logstash ships with a bunch of these predefined, and you can build your own too. Taking a line from nqsserver.log we can see the timestamp matches the ISO 8601 standard:

So our grok will use the pre-defined pattern TIMESTAMP_ISO8601, and then everything else (“GREEDYDATA”) after the timestamp, map to the log message field. The timestamp is in square brackets, which I’ve escaped with the backslash character. To indicate that it’s a grok pattern we want to match, it’s enclosed in %{ } markers.

\[%{TIMESTAMP_ISO8601:timestamp}\] %{GREEDYDATA:log_message}

This can be broken down as follows:

\[                                  The opening square bracket, escaped by \
%{TIMESTAMP_ISO8601:timestamp}      Capture an ISO 8601 timestamp, store it in a field called 'timestamp'
\]                                  The closing square bracket, escaped by \
%{GREEDYDATA:log_message}           Capture everything else ('GREEDYDATA' is also a grok pattern) and store it in the 'log_message' field

A grok operator in logstash is part of the filter processing, so we need a new stanza in the configuration file, after input and before output. Note that the grok operator is matching our pattern we built above against the message field, which is pre-populated by default by the input stream. You can grok against any field though.

input {
        file {
                path => "/app/oracle/biee/instances/instance1/diagnostics/logs/OracleBIServerComponent/coreapplication_obis1/nqserver.log"
filter {
        grok {
                match => ["message","\[%{TIMESTAMP_ISO8601:timestamp}\] %{GREEDYDATA:log_message}"]
output {
        elasticsearch {}

Now we can see in the resulting capture we’ve extracted the timestamp to a field called “timestamp”, with the remainder of the field in “log_message”

But – the actual timestamp of the log entry that we have attached to the event stored in ElasticSearch, a special field called @timestamp is still reflecting the timestamp at which logstash read the logfile entry (30th September), rather than when the logfile entry was created (11th June). To fix this, we use a new filter option (grok being the first), the date filter:

date {
        match => ["timestamp", "yyyy-MM-dd'T'HH:mm:ss.SSSZZ"]

This matches the timestamp field that we captured with the grok, and converts it into the special @timestamp field of the event, using the mask specified. Now if we go back to kopf, the ElasticSearch admin tool, we can see that a new index has been created, with the date of log entry that we just parsed and correctly extracted the actual date from:

And over in Kibana if you set the timeframe long enough in the filter at the top you’ll be able to find the log entries showing up, now with the correct timestamp. Note that ElasticSearch accounts for the timezone of the message, storing it in UTC:

A large part of setting up your own ELK stack is this configuration of the filters in order to imbue and extract as much information from the log as you want. Grok filters are just the beginning – you can use conditional paragraphs to grok certain fields or logs for certain strings (think, error messages and ORA codes), you can mutate output to add in your own fields and tags based on what has been read. The point of doing this is that you enrich what is in the logs by giving the different pieces of data meaning that you can then drive the Kibana dashboard and filtering through.

Tips for using logstash

To think of the logstash .conf file as merely “configuration” in the same way a simple .ini is would be to underestimate it. In effect you are writing a bit of data transformation code, so brace yourself – but the silver lining is that whatever you want to do, you probably can. Logstash is a most flexible and powerful piece of software, and one in which the model of input, filter, codec and output work very well.

In the spirit of code development, here are some tips you may find useful:

  • Grok patterns can inherit. Study the provided patterns (in the logstash patterns folder), and build your own. Look for commonality across files and look to reuse as much as possible. If you have multiple unrelated patterns for every type of log file in FMW then you’ve done it wrong.
  • When you’re building grok statements and patterns, use the superb The advantage of this over a standard regex debugger is that it supports the grok syntax of capture groups and even custom patterns.
  • For building up and testing regex from scratch, there are some useful sites:

    On the Mac there is a useful tool called Oyster which does the same as the above sites but doesn’t require an Internet connection.

  • By default a failed grok will add the tag _grokparsefailure to the event. If you have multiple grok clauses, give each its own unique tag_on_failure value so that you can see from the output message which grok statement failed.

    grok {
        match => [  "message", "%{WLSLOG}"
        tag_on_failure => ["_grokparsefailure","grok03"]

  • You can use # as a comment character, so comment your code liberally, particularly whilst you’re finding your way with the configuration language so you can find your way back from the gingerbread house.
  • grok’ing takes resources, so target your grok statements by using conditionals. For example, to only parse nqquery logs for an expected string, you could write :

    if [path] =~ /nqquery/ {
        # Do your grok here

    NB the =~ denotes a regex match, and the / delineate the regex string.

  • Use a conditional to split the output, writing and grok failures to stdout and/or a file, so that it is easier to see what’s failing and when. Note the use of the rubydebug codec here to prettify output sent to stdout.

    output {
        if "_grokparsefailure" not in [tags] {
            elasticsearch { }
        } else {
            file { path => "unprocessed.out" }
            stdout { codec => rubydebug }

  • Use the multiline codec to work with log messages that span multiple lines, and watch out for newline characters – grok does not like them so use a mutate gsub to fix them out.
Building a Kibana dashboard

Now that we’ve got the data streaming through nicely from logstash into ElasticSearch, let us take a look at building dashboards against it in Kibana. Kibana is a web application that runs within an existing web server such as Apache, and it builds dashboards from data stored in ElasticSearch.

The building blocks of a Kibana dashboard are rows, which contain panels of a given pane width, up to twelve per row. On a panel goes one of the types of object, such as a graph. We’ll see now how to build up a dashboard and the interactions that we can use for displaying and analysing data.

When you first load Kibana you get a default dashboard that links to a few other start dashboards. Here we’re going to properly start from scratch so as to build up a picture of how a dashboard is created. Click on Blank Dashboard, and then in the top-right corner click on Configure Dashboard. Click on Index and from the Timestamping option select day. What this does is tell Kibana which ElasticSearch indices it is to pull data from, in this case using the standard Logstash index naming pattern – which we observed in kopf earlier – logstash-YYYY.MM.DD. Click Save, and then select Add a row. Give the row a title such as System Activity, click on Create Row and then Save. A new row appears on the dashboard.

Now we’ll add a graph to the row. Click on Add panel to empty row, and select Histogram as the Panel Type. Note that by default the width is 4 – change this to 12. You’ll note that there are plenty of options to explore, but to start with we’ll just keep it simple, so go ahead and click Save. By default the chart will show all data, so use the Time filter dropdown option at the top of the screen to select a recent time period. Assuming your data has loaded from logstash into ElasticSearch you should see a graph similar to this:

This is a graph of the number of events (log file entries) per time period. The graph will amend the resolution according to the zoom so that a reasonable resolution of data is shown, or you can force it through the Resolution option in the graph properties. In the legend of the chart you can see the resolution currently used.

Assuming you’ve selecting a broad time interval, such as the last week, you’ll presumably want to drill into the data shown. This is very intuitive in Kibana – simply click and drag horizontally over the time period you want to examine.

There are two important concepts for selecting and grouping data in Kibana, called filters and queries. A query groups data based on conditions, and we’ll explore those later. Filtering is a predicate applied to all data returned. Think of it just like a WHERE clause on a SQL query. You can see the current filter(s) applied at the top of the dashboard.

You may well want to hide these, and they can be collapsed – as can the query row and all of the dashboard rows – by clicking on the little triangle

From the Filter area you can also add, amend, disable and remove filters.

So far all we’ve got is a graph showing system activity over time, based on events recorded in a log file. But, we’ve no way of seeing what those logs are, and this is where the Table panel comes in. Add a new row, give it a title of Log messages and add the Table panel to it specifying the span as 12. You should now see a list of messages with timestamps corresponding to the time period shown in the graph. You can customise the Table panel, for example specifying which fields to show; by default it shows _source which is the raw row returned by ElasticSearch. More useful to us is the log_message field that we parsed out using the grok in logstash earlier. You can do this by selecting the relevant field from the Fields list on the left (which can be collapsed for convenience), or editing the Table panel and specifying it in the Columns area.

From the Table panel it is possible to select the data shown even more precisely by adding additional filters based on data in the table. Clicking on a particular row will expand it and show all of the associated fields, and each field has a set of Action options. You can filter only for that value, or specifically excluding it, and you can also add each field into the table shown (just like we did above for logmessage). So here I can opt to only display messages that I’ve tagged in logstash as coming from the BI Server component itself:

You’ll note in the second screenshow, once the filter has been applied, that the graph has changed and is showing less data. That is because a filter is global to a dashboard. But what if we want to show on the graph counts for all logs, but in the data table just those for BI Server? Here is where queries come into play. A query also looks like a predicate, but rather than restricting the data returned it just identifies a set of data within what is returned. To illustrate this I’m first going to remove all existing filters except the time period one:

And now in the Query area click on the + (to the right of the line). Now there are two queries, both with a wildcard as their value meaning they’ll each match everything. In the second query box I add the query Component: OracleBIServerComponent – note for this to work your logstash must be sending messages to ElasticSearch with the necessary Component field. Once updated, the second query’s impact can be seen in the graph, which is showing both the “all” query and the BI Server component tows. Use the View > option in the top left of the graph as a quick way of getting to the graph settings, including disabling cumulative/stack view:

Each panel in Kibana can be configured to show all or some of the query groups that have been defined. This is most useful for creating breakdowns of data, including those that are splitting it in different ways and you wouldn’t want all of the options displayed in entirety on all panels. You might want to group out the components, and the types of error, and then show a break down of system activity by one or the other – but not necessarily both. To configure which query a panel is to show use the Configure option in the top-right of a panel and go to the Queries tab. If it’s set to all then each and every query set will be shown individually on the panel.

If it’s selected then you can select one or more of the defined query sets to display

There are about a dozen types of panel in Kibana, and I’m not going to cover them all here. The other ones particularly of interest for building this kind of OBIEE monitoring dashboard include:

  • Terms is basically a SELECT FIELD, COUNT(*) ... GROUP BY FIELD. It shows the top x number of terms for a given field, and how frequently they occurred. Results can be as a pie or bar chart, or just a table:From a Terms panel you can add filters by clicking on a term. In the example above, clicking on the pie segment, or table row icon, for ERROR would add a filter to show just ERROR log entries
  • Trends shows the trend of event occurrences in a given time frame. Combined with an appropriate query you can show things like error rates
  • Stats shows a set of statistics, so you can identify mean response times, maximum users logged on, and so on – assuming you have this data coming through from the logstash parsing.

Once you’ve built your dashboard save it (Kibana stores it in ElasticSearch). The dashboard is defined in json and you can opt to download this too.

The complete OBIEE monitoring view

Parsing logs is a great way to get out valuable information from the text stored and build visualisations and metrics on top of it. However, for pure metrics alone (such as machine CPU, OBIEE DMS metrics, and so on) a close-relation to Kibana, Grafana, is better suited to the task. Thus we have the text-based data going into ElasticSearch and reported through Kibana, and the pure metrics into a time-based store such as whisper (graphite’s database) and reported through Grafana. Because Grafana is a fork of Kibana, the look and feel is very similar.

Using obi-metrics-agent the DMS metrics from OBIEE can be collected and stored in whisper, and so also graphed out in Grafana alongside the system metrics. This gives us an overall architecture like this:

Obviously, it would be nice if we could integrate the fundamental time-based nature of both Kibana and Grafana together, so that drilling into a particular time period of interest maybe from an error rate point of view in the logs would also show the system and DMS metrics for the same period. There has been discussion about this (1, 2, 3) but I don’t get the impression that it will happen soon, if ever. One other item of interest here is Marvel, which is a commercial offering for monitoring ElasticSearch – through Kibana. It makes use of stock Kibana panel types, along with some new ones including the Nodes panel type, which suits the requirements we have of monitoring OBIEE/system metrics within a Kibana view, but unfortunately it looks like currently it is going to remain within Marvel only.

One other path to consider is trying to get the metrics currently sent to graphite/whisper instead into ElasticSearch so that Kibana can then report on them. The problem with this is twofold. Firstly, ElasticSearch is fundamentally a text-based store, whereas whisper fits much better for time/metric data (as would another DB such as influxDB). So trying to crow-bar the two together may not be the best solution, and instead better for it to be resolved at front end as discussed above. Secondly, Kibana’s graphing capabilities do not conceptually extend to multiple metrics in the same graph – only multiple queries – which means that graphing something that would be simple in Grafana (such as CPU wait/user/sys) would be overly complex in Kibana.

Architecting an ELK deployment

So far I’ve shown how to configure ELK on a single server, reading logs from that same server. But there are two extra things we should consider. First, Logstash in particular can be quite a ‘heavy’ process depending on how much work you’re doing with it. If you are processing all the logs that FMW writes, and have lots of grok filters (which isn’t a bad thing; it means you’re extracting lots of good information), then you will see logstash using a lot of CPU, lots IO, possibly to the detriment of other processes on the system – a tad ironic if the purpose of using logstash is to monitor for any system problems that occur. Secondly, ELK works very well with mutiple servers. You might have a scaled out OBIEE stack, or want to monitor multiple environments. Rather than replicating the ELK stack on each server instead it’s better for each server to push its log messages to a central ELK server for processing. And since the processing takes place on the ELK server and not the server being monitored we reduce the local resource footprint too.

To implement this kind of deployment, you need something like logstash-forwarder on the OBI server which is a light-touch program, sending the messages to logstash itself on the ELK server, over a custom protocol called lumberjack. Logstash then processes the messages as before, except it is reading the input from the logstash-forwarder rather than from file. An alternative approach to this is using redis as a message broker, with logstash running on both the source (sending output to redis) and ELK server (using redis as the input). This approach is documented very well here / here, and the former of using logstash-forwarder here. Logstash-forwarder worked very well for me in my tests, and seems to fit the purpose nicely.


Responsive monitoring tools are crucial for successful and timely support of an OBIEE system, and the ELK stack provides an excellent basis on which to build beyond the capabilities of Enterprise Manager Fusion Middleware Control. The learning curve is a bit steep at first, and you have to be comfortable with installing unpackaged tools, but the payoff makes it worth it! If you are interested in finding out about how Rittman Mead can help with your OBIEE implementation or other areas, please contact us.

Categories: BI & Warehousing

OCP 12C – Index and Table Enhancements

DBA Scripts and Articles - Tue, 2014-10-21 07:16

Table Enhancements Oracle 12c offers you to create invisible columns, these columns are not visible until you explicitly mention their names in the SQL statement. This functionnality allows developpers to make change to the database without conflicting with the existing application. To create an invisible column: [crayon-544a9f885cf92426881455/]  You can’t create invisible columns on : External [...]

The post OCP 12C – Index and Table Enhancements appeared first on Oracle DBA Scripts and Articles (Montreal).

Categories: DBA Blogs

APEX Connect 8th and 9th of June 2015 in Duesseldorf Germany

Denes Kubicek - Tue, 2014-10-21 03:34
APEX Connect is the first big APEX event in Germany and will be taking place in Duesseldorf on 8th and 9th of June 2015. You can still apply for a presentation using this link. I will be there with a presentation on APEX Plugins. As far as I know a couple of international guests are expected to be there as well. This event will for sure be interesting and informative so you should think about participating. Düsseldorf is close to Belgium and Holland so I think that we will have quite a lot of visitors from those counties as well. It just comes to my mind that the presentation should be in English. So, you should make sure to book your ticket, hotel and travel in time.

Categories: Development

Documentum upgrade project: from D2-Config 3.1 to D2-Config 4.1

Yann Neuhaus - Mon, 2014-10-20 22:20

During a migration from D2-Config 3.0 to D2-Config 4.1, we have encountered 2 issues. In order to have a clean installation and the insurance of being able to backup the configuration, we have decided to perform a full config export on D2-Config 3.0 and a full import with a configuration reset to D2-Config 4.1.


1. Configuration export

The first issue we experienced was regarding docapps existing in the repository. As you can see in following screenshot, D2-Config crashes very quickly.




But we were quickly able to find a solution. We just had to unselect the docapps check boxes and wait a little bit.




2. Configuration update

After the first import we discovered that some conditions in the menu management were not manageable anymore. But they still applied to D2-Client.

Actually, a class name is seen - i. e. “eu.c6.web.conditions.permits.HasPermit” - instead of the real condition: “User has minimum “write” permit on the selection”.




We have this result because no migration path was anticipated between the 2 releases of D2-Config and some configuration items kept its old class names...

Class names now have to start with “com.emc” instead of “eu.c6”


In fact (and this was confirmed by C), we need to change the menu, toolbar, and tree configuration items manually.

Manually here mean to export the full configuration and perform the changes directly inside the zip-archived XML files of the related configuration items. After a quick assessment, we counted around 100 occurrences to be changed in our case. That was too much to be done during an upgrade. So we decided to automatize the process.

We used a tailor-made Linux script to do that:



$> ./ <config archive name>



After execution, the result was a new D2-Config archive, ready to be deployed, with “Update_” as a prefix for the archive name given in parameter.

You can see the content of the script, executed and tested under RHEL 5.10:





 # Script to update D2-config from 3.0 objects level to 3.1/4.1


 if [[ $# -ne 1 ]] ; then

 echo "ERROR usage : \"$0 \""

 exit 1




 mkdir -p ${root_dir}

 unzip -d ${root_dir} "$1"


 for file in `grep -RH -i eu.c6 ${root_dir}/* | cut -d ":" -f 1`


 echo $file


 mv $file ${file}_old

 sed -e s/eu.c6/com.emc/g ${file}_old > $file

 rm ${file}_old



 cd $root_dir

 zip -r "../UPDATED_$1" *

 cd ..

 chmod 777 "UPDATED_$1"


 rm -rf $root_dir



You can now try to import the updated configuration archive and see the difference.

Be careful, a full config import with configuration reset implies the deletion and recreation of some objects like plublic searches (d2c_query with new r_object_id).

Thanks for reading!

Using ILM from Oracle Enterprise Manager

Anthony Shorten - Mon, 2014-10-20 18:18

If you are upgrading to Oracle Database 12c for an Oracle Utilities Application Framework based product then you can use Oracle Enterprise Manager to manage your Information Lifecycle Management based data management solution.

To use it you simply install Oracle Enterprise Manager and then register the 12c database instance (PDB and non-PDB are supported) and then when selecting the database target use the Information Lifecycle Management menu option from the Administration --> Storage menus. For example:

ILM Menu option from OEM

This means that you can use Oracle Enterprise Manager as a viable alternative to ILM Assistant for Oracle Database 12c targets.

RDX Services: Full DBA Support [VIDEO]

Chris Foot - Mon, 2014-10-20 13:23


Hi, and welcome to RDX. In this portion of our "services" series, we'll discuss how we provide companies with all of their database administration needs.

With RDX's full DBA support services, we become your DBA team and assume complete responsibility for the functionality, security, availability and performance of your database environments. We know that each company has unique goals and demands, which is why we also implement guidelines and protocols based on your organization's specific requirements.

In addition, we're willing to fill in any DBA role from our offerings that your company may need. You get the expertise and best practices of over 100 DBA experts for less than the cost of a single in-house resource.

Thanks for watching! Stay tuned for other ways to work with RDX soon.

The post RDX Services: Full DBA Support [VIDEO] appeared first on Remote DBA Experts.

Deploying a Private Cloud at Home — Part 5

Pythian Group - Mon, 2014-10-20 13:05

Today’s blog post is part five of seven in a series dedicated to Deploying Private Cloud at Home, where I will be demonstrating how to configure Compute node and OpenStack services on the compute node. We have already installed the MySQL Python library on compute node in previous posts.

  1. Install OpenStack compute packages on the node
    yum install -y openstack-nova-compute openstack-utils
  2. Configure Nova compute service
    openstack-config --set /etc/nova/nova.conf database connection mysql://nova:Youre_Password@controller/nova
    openstack-config --set /etc/nova/nova.conf DEFAULT auth_strategy keystone
    openstack-config --set /etc/nova/nova.conf keystone_authtoken auth_uri http://controller:5000
    openstack-config --set /etc/nova/nova.conf keystone_authtoken auth_host controller
    openstack-config --set /etc/nova/nova.conf keystone_authtoken auth_protocol http
    openstack-config --set /etc/nova/nova.conf keystone_authtoken auth_port 35357
    openstack-config --set /etc/nova/nova.conf keystone_authtoken admin_user nova
    openstack-config --set /etc/nova/nova.conf keystone_authtoken admin_tenant_name service
    openstack-config --set /etc/nova/nova.conf keystone_authtoken admin_password Your_Password
    openstack-config --set /etc/nova/nova.conf DEFAULT rpc_backend qpid
    openstack-config --set /etc/nova/nova.conf DEFAULT qpid_hostname controller
    openstack-config --set /etc/nova/nova.conf DEFAULT my_ip Your_Compute_node_IP
    openstack-config --set /etc/nova/nova.conf DEFAULT vnc_enabled True
    openstack-config --set /etc/nova/nova.conf DEFAULT vncserver_listen
    openstack-config --set /etc/nova/nova.conf DEFAULT vncserver_proxyclient_address Your_Compute_node_IP
    openstack-config --set /etc/nova/nova.conf DEFAULT novncproxy_base_url http://controller:6080/vnc_auto.html
    openstack-config --set /etc/nova/nova.conf DEFAULT glance_host controller
  3. Start the Compute service and its dependencies. Configure them to start automatically when the system boots
    service libvirtd start
    service messagebus start
    service openstack-nova-compute start
    chkconfig libvirtd on
    chkconfig messagebus on
    chkconfig openstack-nova-compute on
  4. Enable IP forwarding
    perl -pi -e 's,net.ipv4.ip_forward = 0,net.ipv4.ip_forward = 1,' /etc/sysctl.conf
    perl -pi -e 's,net.ipv4.conf.default.rp_filter = 1,net.ipv4.conf.default.rp_filter = 0,' /etc/sysctl.conf
    echo "net.ipv4.conf.all.rp_filter=0" >> /etc/sysctl.conf
    sysctl -p
  5. Install legacy networking components and Flat DHCP
    yum install -y openstack-nova-network openstack-nova-api

    We are using legacy networking and single NIC on both controller and compute nodes. Flat and public interfaces will be the same on below configuration. In this case, it is etho replace with the one you have on your system.

    openstack-config --set /etc/nova/nova.conf DEFAULT network_api_class
    openstack-config --set /etc/nova/nova.conf DEFAULT security_group_api nova
    openstack-config --set /etc/nova/nova.conf DEFAULT network_manager
    openstack-config --set /etc/nova/nova.conf DEFAULT firewall_driver nova.virt.libvirt.firewall.IptablesFirewallDriver
    openstack-config --set /etc/nova/nova.conf DEFAULT network_size 254
    openstack-config --set /etc/nova/nova.conf DEFAULT allow_same_net_traffic False
    openstack-config --set /etc/nova/nova.conf DEFAULT multi_host True
    openstack-config --set /etc/nova/nova.conf DEFAULT send_arp_for_ha True
    openstack-config --set /etc/nova/nova.conf DEFAULT share_dhcp_address True
    openstack-config --set /etc/nova/nova.conf DEFAULT force_dhcp_release True
    openstack-config --set /etc/nova/nova.conf DEFAULT flat_network_bridge br0
    openstack-config --set /etc/nova/nova.conf DEFAULT flat_interface eth0
    openstack-config --set /etc/nova/nova.conf DEFAULT public_interface eth0
  6. Start the services and configure them to start when the system bootsservice openstack-nova-network start
    service openstack-nova-metadata-api start
    chkconfig openstack-nova-network on
    chkconfig openstack-nova-metadata-api on
  7. Restart networking
    service network restart


This completes the configuration of compute node. Stay tuned for part six where we will configure network services on controller node.

Categories: DBA Blogs

Evolution of WCM – Reduce IT Burden with WCM Workflows

WebCenter Team - Mon, 2014-10-20 13:01

By Mitchell Palski, Oracle WebCenter Sales Consultant

Identifying the Problem
Public Sector organizations often times face painful challenges managing their enterprise content. Too many organizations leverage multiple disparate content management systems that present difficulties in standardizing where and how content is processed and stored. The results are manual business processes that are not only tedious and inefficient, but also lack the functionality to provide adequate content quality-assurance, approval automation, and process traceability. To compound upon those issues, these processes also frequently depend on the involvement of development staff to drive and manage their completion. Content management tasks that are business-driven rely on IT resources, limiting the productivity of the organization to deliver new innovative IT services.

Oracle WebCenter - Complete Enterprise Content Management
Oracle WebCenter is the center of engagement for business. It helps people work together more efficiently through contextual collaboration tools that optimize connections between people, information, and applications and ensures users have access to the right information in the context of the business process in which they are engaged. One of the Oracle WebCenter products is Oracle WebCenter Content – an enterprise-class content management solution that empowers employees and content-enables business processes throughout your organization. Oracle WebCenter Content helps organizations lower costs and reduce risks while improving productivity and agility.

Oracle WebCenter includes best of breed web content management (WCM) capabilities that segregates its user interface into distinct roles that provides a productive and focused user experience. These WCM capabilities provide revolutionary web content management tools that allow business users to access and update web content from their browsers, their desktops, and their business applications. Some of Oracle WebCenter’s content management features include:

  • User-focused, specialized interfaces
  • Desktop integration with Microsoft Office products and Windows Explorer
  • Mobile apps for smart phones and tablets
  • Embedded BPM Suite provides one platform for workflows and approvals 

 Some of Oracle WebCenter’s WCM-specific features include:

  • Services-based architecture enables web content management services to be consumed in web, portal and development environments 
  • Automatic conversion of +500 document and image formats 
  • Integrated with digital asset management capabilities 
  • In-context updates of content 
  • Support for multi-lingual sites

 As an integrated component of your enterprise solution, Oracle WebCenter Content: 

  •  Easily deploys and integrates for an immediate ROI
  • Ensures that content meets organizational compliance standards
  • Drives increased productivity through business user engagement

Workflow and Oracle WebCenter Content
Before designing a workflow, organizations must evaluate how their current processes operate. Process flow diagrams are a great way to create visual representations of your existing processes. It’s important to review, analyze, and improve your existing process models before implementing them. Oracle WebCenter has tight out-of-the-box integration with Oracle Business Process Management (BPM) Suite which includes Oracle Process Composer – a web-based tool that enables business users to design and test business processes in a visual representation. 

Oracle BPM is the enterprise class process management tool of choice when delivering complex service-based workflows to your organization. No matter which tool(s) your organization uses, it is always important to:

  1. Understand your process flow
  2. Verify the metadata needed to complete the workflow
  3. Understand what users and/or roles will be responsible for each step

If your workflows are relatively simple, your business analysts can model them using tools like Microsoft Visio. Your organization can then leverage the native workflow engine within Oracle WebCenter Content to deliver those content-based workflows.

The Oracle WebCenter Content native workflow engine provides three types of workflows:

  • A basic workflow defines the review process for specific content items, and must be initiated manually
  • A criteria workflow is used for content that enters a workflow automatically based on metadata that matches predefined criteria
  • A sub-workflow is initiated from a step in another workflow and is created in the same manner as criteria workflows. Sub-workflows are useful for splitting large, complex workflows into manageable pieces

Email is sent to the participating contributors and reviewers involved in workflow steps. Reviewers can review content, reject or approve content, and view information about the content and the workflow. If the content is rejected, reviewer can enter a message to explain the reason for rejection. The message is sent to the reviewers assigned to the last step allowing a contribution. Those reviewers can then check out the content, edit it and check the content back in.

Benefits of Workflow to WCM
Designing a workflow requires you to examine and understand your business processes, helping you find areas for improvement. Setting up workflows for a business process can provide several advantages:

  • Workflows provide good reporting metrics. They can produce an audit trail of who signed off on content at various points of the life cycle of the content
  • Workflows help get the right information to the right person at the right times

 Automated workflows remove human dependencies and reduce risk for your organization.

If your current processes rely heavily on manual human intervention, here are some of the long term and short term risks that your organization faces:

Short Term  Long Term  “Fat fingering” errors  Lack of institutional knowledge for workflows  Inconsistent publishing and branding  Decreased IT productivity  Lost or forgotten content updates  Inability to analyze and enhance existing processes  Frustrated employees and end users  Loss of user-loyalty due to outdated web content

Implementing workflows to manage the web content of your enterprise is a safe, low-risk investment with a tremendous upside. Aside from the ROI your organization will realize from content-publishing efficiencies, workflow-enabled web content also relieves the burden of making administrative updates from your IT staff. The result of a workflow-driven web content management system is an improved web experience for end users and a more productive, happier workplace for your employees. Consider implementing Oracle WebCenter Content and it’s WCM capabilities to empower your business users and deliver a sustainable solution for your enterprise.


Yann Neuhaus - Mon, 2014-10-20 12:37

I have recently published a script to check index fragmentation. But then, do you COALESCE or REBUILD? Well, there also is another option: ALTER INDEX SHRINK SPACE. Let's compare all those index defragmentation operations.

Fragmented index

I have an index created when the table had 1 million rows. Then, I deleted 90% of the rows. Here is the index state from:

  • dbms_space.space_usage
  • index_stats after an analyze index validate structure
  • my index fragmentation checking script with 4 buckets

and here are the results:


      :UNF       :FS1       :FS2       :FS3       :FS4      :FULL  BLOCKS
---------- ---------- ---------- ---------- ---------- ---------- -------
         0          0          1          0          0       2230    2304

------- ------- ---------- -------- -------- -------- ----------- ---------- --------
      3    2304     100404     2226     2225        5         404    1622013       10

         N to          N rows/block bytes/block %free space  blocks free
---------- -- ---------- ---------- ----------- ----------- ------- -----
        10 ->     250280         45         714          91     557 oooo
    250730 ->     500370         45         714          91     557 oooo
    500820 ->     750010         45         714          91     556 oooo
    750460 ->     999660         45         714          91     556 oooo


I have 2226 leaf blocks, the index height is 3 with 5 branch blocks. The leaves are only 91.
However dbms_space shows only full blocks: the deleted entries are still there, just marked as deleted, and the blocks are still seen as full.



Now let's COALESCE:

SQL> alter index DEMO_N coalesce;

And before checking the same values about the index space I measure the amount of work that has been done by the operation (from v$mystat):


NAME                                                                  VALUE
---------------------------------------------------------------- ----------
redo entries                                                          22067
db block gets                                                         32818
session logical reads                                                 32886
db block changes                                                      40601
undo change vector size                                            35199264
redo size                                                          47878800

      :UNF       :FS1       :FS2       :FS3       :FS4      :FULL  BLOCKS
---------- ---------- ---------- ---------- ---------- ---------- -------
         0          0       2004          0          0        227    2304

------- ------- ---------- -------- -------- -------- ----------- ---------- --------
      3    2304     100000      223      222        5           0    1591530       88

         N to          N rows/block bytes/block %free space  blocks free
---------- -- ---------- ---------- ----------- ----------- ------- -----
        10 ->     248690        452        7170          11      56
    253200 ->     500800        450        7158          11      56
    505310 ->     752020        449        7132          11      56
    756530 ->     998730        443        7038          12      55


COALESCE is an online operation that defragments the leaf blocks. We have now only 223 leaf blocks that are 90% full (because my pctfree is the default 10%). But the index stil has the same height and still has 2300 blocks. Where are the reclaimed 2000 blocks? They are available if the index need a new block (for a block split). They are seen as FS2 (at least 25 to 50% free space) by dbms_space because they still contain the deleted rows, but they are fully reclaimable anyway.



Back with the same table, and doing a SHRINK SPACE COMPACT instead of COALESCE:


NAME                                                                  VALUE
---------------------------------------------------------------- ----------
redo entries                                                          28794
db block gets                                                         40940
session logical reads                                                 41527
db block changes                                                      49387
undo change vector size                                            36990460
redo size                                                          51848880

      :UNF       :FS1       :FS2       :FS3       :FS4      :FULL  BLOCKS
---------- ---------- ---------- ---------- ---------- ---------- -------
         0          0          1          0       2003        227    2304

------- ------- ---------- -------- -------- -------- ----------- ---------- --------
      3    2304     100000      223      222        5           0    1591530       88

         N to          N rows/block bytes/block %free space  blocks free
---------- -- ---------- ---------- ----------- ----------- ------- -----
        10 ->     248690        452        7170          11      56
    253200 ->     500800        450        7158          11      56
    505310 ->     752020        449        7132          11      56
    756530 ->     998730        443        7038          12      55


So what is the difference? Slightly more work (about 20% more logical reads and block changes) for the same result. Except that now the reclaimed blocks are in FS4 (75 to 100% free space).



What if we use SHRINK SPACE instead of SHRINK SPACE COMPACT?


NAME                                                                  VALUE
---------------------------------------------------------------- ----------
redo entries                                                          29352
db block gets                                                         45496
session logical reads                                                 46190
db block changes                                                      50032
undo change vector size                                            36981524
redo size                                                          51901500

      :UNF       :FS1       :FS2       :FS3       :FS4      :FULL  BLOCKS
---------- ---------- ---------- ---------- ---------- ---------- -------
         0          0          1          0          0        227     240

------- ------- ---------- -------- -------- -------- ----------- ---------- --------
      3     240     100000      223      222        5           0    1591530       88

         N to          N rows/block bytes/block %free space  blocks free
---------- -- ---------- ---------- ----------- ----------- ------- -----
        10 ->     248690        452        7170          11      56
    253200 ->     500800        450        7158          11      56
    505310 ->     752020        449        7132          11      56
    756530 ->     998730        443        7038          12      55


With tables, the shrink space lowers the high water mark. Here it is the same idea: in addition to the shrink space the reclaimed blocks are no more allocated to the index, so it can be used for other segments in the tablespace. We see that from dbms_space: the index is now 240 blocks only.



The previous requires a lock only for a short duration (according that we did the shrink space compact before). The rebuild needs a Share lock on the table during the whole operation, blocking concurrent DML.


NAME                                                                  VALUE
---------------------------------------------------------------- ----------
db block gets                                                           953
redo entries                                                           1832
db block changes                                                       1906
session logical reads                                                  4019
undo change vector size                                                9152
redo size                                                            173732

      :UNF       :FS1       :FS2       :FS3       :FS4      :FULL  BLOCKS
---------- ---------- ---------- ---------- ---------- ---------- -------
         0          0          1          0          0        222     256

------- ------- ---------- -------- -------- -------- ----------- ---------- --------
      2     256     100000      222      221        1           0    1591520       90

         N to          N rows/block bytes/block %free space  blocks free
---------- -- ---------- ---------- ----------- ----------- ------- -----
        10 ->     248690        452        7170          11      56
    253200 ->     501250        451        7170          11      56
    505760 ->     749300        451        7170          11      55
    753810 ->     997350        448        7117          11      55


The result is the same as the previous operation (SHRINK SPACE) except that the index height has decreased. A rebuild is the right operation if the index blevel has become too high. And we did that offline but with much less work. Minimal undo and redo. And small blocks to read (when the index is still usable the rebuild can use the current index to rebuild the new segment).



Last operation, possible only in Enterprise Edition, is the rebuild online which doesn't need to lock the table.


NAME                                                                  VALUE
---------------------------------------------------------------- ----------
redo entries                                                            660
db block changes                                                        876
db block gets                                                          1419
session logical reads                                                  4989
undo change vector size                                               24932
redo size                                                            114924

      :UNF       :FS1       :FS2       :FS3       :FS4      :FULL  BLOCKS
---------- ---------- ---------- ---------- ---------- ---------- -------
         0          0          1          0          0        222     256

------- ------- ---------- -------- -------- -------- ----------- ---------- --------
      2     256     100000      222      221        1           0    1591520       90

         N to          N rows/block bytes/block %free space  blocks free
---------- -- ---------- ---------- ----------- ----------- ------- -----
        10 ->     248690        452        7170          11      56
    253200 ->     501250        451        7170          11      56
    505760 ->     749300        451        7170          11      55
    753810 ->     997350        448        7117          11      55


Here we don't see an overhead to do it online. This is because my table is small (my testcase has only one column which is the indexed one). On a real table you will probably see that the online rebuild takes longer than the offline one, because it cannot use the current index. But anyway, the fact that it is online means that the duration is not a big issue.



This is an example on a table that had a massive purge: all blocks were touched. In that case the REBUILD is the right solution. However if you are in Standard Edition and cannot do it online, then you will probably do a SHRINK SPACE COMPACT because it lets you do a SHRINK SPACE (need a quick locks but for a very short duration - to do on a period of low activity).
COALESCE will make sense here only if you know you will insert back a lot of rows, so that you don't need to deallocate the blocks from the index.

Now, what to do if the free space to reclaim is only on small part of the index blocks? As in the following case:


         N to          N rows/block bytes/block %free space  blocks free
---------- -- ---------- ---------- ----------- ----------- ------- -----
        10 ->      50468        374        5974          26     112 o
     50917 ->     100756        449        7179          11     112
    101205 ->     151044        449        7179          11     112
    151493 ->     201332        449        7179          11     112
    201781 ->     251620        449        7179          11     112
    252069 ->     301908        449        7179          11     112
    302357 ->     351747        449        7179          11     111
    352196 ->     401586        449        7179          11     111
    402035 ->     451425        449        7179          11     111
    451874 ->     501264        449        7179          11     111
    501713 ->     551103        449        7179          11     111


In that case a COALESCE is probably the best because you keep the blocks allocated to the index for future growth. And the few blocks reclaimed will probably not change the index height anyway. However, if you did a small purge that concern only a small part of the index and want to reclaim the space for other segments, then the SHRINK SPACE is the right solution. But do you really need to reclaim space in that case?

Now you see the usage for my index fragmentation script: I don't need only the average free space. I need to have a clear picture of the fragmentation in order to decide what to do and how.

OCP 12C – ADR and Network Enhancements

DBA Scripts and Articles - Mon, 2014-10-20 11:59

ADR enhancements In oracle 12c the Automatic Diagnostic Repository contains a new log directory with 2 subdirectories : DDL Debug The DDL log When you active the DDL logging in Oracle 12c using enable_ddl_logging=true, Oracle writes all DDL operations into the specific DDL log instead of writting it to the alert log. The DDL logging feature [...]

The post OCP 12C – ADR and Network Enhancements appeared first on Oracle DBA Scripts and Articles (Montreal).

Categories: DBA Blogs

OCP 12C – Emergency Monitoring, Real-Time ADDM

DBA Scripts and Articles - Mon, 2014-10-20 11:48

Emergency Monitoring Emergency monitoring is meant for extreme circumstances where it’s impossible for you to connect to the database because the database is hung. Emergency monitoring allows you to connect to the database in diagnostic mode and run a lightweight analysis to see what’s happening. You can access real-time performance data from ASH and access [...]

The post OCP 12C – Emergency Monitoring, Real-Time ADDM appeared first on Oracle DBA Scripts and Articles (Montreal).

Categories: DBA Blogs

Avro MapReduce Jobs in Oozie

Pythian Group - Mon, 2014-10-20 07:56

Normally when using Avro files as input or output to a MapReduce job, you write a Java main[] method to set up the Job using AvroJob. That documentation page does a good job of explaining where to use AvroMappers, AvroReducers, and the AvroKey and AvroValue (N.B. if you want a file full of a particular Avro object, not key-value pair of two Avro types, use AvroKeyOutputWriter as the OutputFormat, AvroKey as the key and NullWritable as the value).

Sometimes (like if you’re using Oozie), you need to set everything up without using AvroJob as a helper. The documentation is less clear here, so here’s a list of Hadoop keys and the appropriate values (for MRv2):

  • avro.schema.output.key - The JSON representation of the output key’s Avro schema. For large objects you may run afoul of Oozie’s 100,000 character workflow limit, in which case you can isolate your Avro job in a subflow
  • avro.schema.output.value – Likewise, if you’re emitting key-value pairs instead of using AvroKeyOutputWriter, put your value’s JSON schema here
  • avro.mapper - your mapper class that extends AvroMapper. You can also use a normal Mapper (with the normal Mapper configuration option), but you’ll have to handle coverting the AvroKey/AvroValue yourself
  • avro.reducer - likewise, a class that extends AvroReducer
  • mapreduce.job.output.key.class - always AvroKey
  • mapreduce.job.output.value.class – AvroValue or NullWritable, as above
  • mapreduce.input.format.class  - if you’re reading Avro files as Input, you’ll need to set this to
  • - AvroKey, if you’re using a subclass of AvroMapper. If you write your own Mapper, you can pick
  • - AvroKey or NullWritable, unless you write a Mapper without subclassing AvroMapper
  • io.serializations  – AvroJob set this value to the following:,,,

With these configuration options you should be able to set up an Avro job in Oozie, or any other place where you have to set up your MapReduce job manually.

Categories: DBA Blogs

An Introduction to Extended Data Types in Oracle 12c

Pythian Group - Mon, 2014-10-20 07:55

One of the lesser known new features that comes as a boon to many developers and DBAs is the provision of implicit handling of large data strings using scalar data types like VARCHAR2 and RAW.

When creating tables, each column must be assigned a data type, which determines the nature of the values that can be inserted into the column. Common data types include number, date, and varchar2. These data types are also used to specify the nature of arguments for PL/SQL programs like functions and procedures.

When choosing a data type, you must carefully consider the data you plan to store and the operations you may want to perform upon it. Making good decisions at the table design stage reduces the potential negative downstream impact on space utilization and performance. Space is a consideration since some data types occupy a fixed length, consuming the same number of bytes, no matter what data is actually stored in it.

In pre-12c databases, long characters strings of more than 4000 bytes had to be handled using creative solutions including: CLOB or LONG data types and multiple columns or variables. These approaches led to inefficient unnecessarily complex designs and added processing overheads.

12c introduced the MAX_STRING_SIZE system parameter that allows string data types to be much larger when the parameter is changed from its default value of STANDARD to EXTENDED. The VARCHAR2 data type, stores variable length character data from 1 to 4000 bytes if MAX_STRING_SIZE=STANDARD or up to 32767 bytes if MAX_STRING_SIZE=EXTENDED.

RAW and NVARCHAR2 data types are affected in similar ways.


Potential issues to consider:

  • Internally, extended data types are stored out-of-line using LOBs, but these cannot be manipulated using the DBMS_LOB interface.
  • When changing the MAX_STRING_SIZE parameter, objects may be updated invalidating dependent objects, so ideally, change this parameter during a maintenance window in your important databases.
  • List partitioning on EDT columns may potentially exceed the 4096 byte limit for the partition bounds. The DEFAULT partition may be used for data values that exceed the 4096 byte limit or a hash function may be used on the data to create unique identifiers smaller than 4096 bytes.
  • Indexing EDT columns may fail with “maximum key length exceeded” errors. For example, databases with an 8k default block size support a maximum key length of approximately 6400 bytes. A suggested work-around is to use a virtual column or function-based index to effectively shorten the index key length.



This feature will no doubt be improved and the shortcomings will be dealt with in future releases—but for now, it offers a clean and elegant mechanism for handling large character data within existing applications requiring minimal code changes.



Categories: DBA Blogs

Old Castles

Pete Scott - Mon, 2014-10-20 06:12
Living here on the Kent Coast we are quite blessed with the number of castles within half and hour’s drive of our cottage. English Heritage manages several nearby castles or forts. The nearest, Richborough, is out and out Roman. We had a lot of Romans roaming around here, they even strolled past my cottage along […]

Connecting to Pivotal Cloud Foundry Ops Metrics using Java VisualVM

Pas Apicella - Sun, 2014-10-19 21:34
The Pivotal Ops Metrics tool is a JMX extension for Elastic Runtime. Pivotal Ops Metrics collects and exposes system data from Cloud Foundry components via a JMX endpoint. Use this system data to monitor your installation and assist in troubleshooting. Below is the tile once installed and available with Pivotal Cloud Foundry Ops Manager

Once installed and configured, metrics for Cloud Foundry components automatically report to the JMX endpoint. Your JMX client uses the credentials supplied to connect to the IP address of the Pivotal Ops Metrics JMX Provider at port 44444

1. Start jvisualvm

2. Under plugin ensure you have the VisualVm-Mbeans plugin installed as shown below, or install it to be able to view the MBeans.

3. Create a JMX connection as shown below

4. Finally the CF MBeans can be viewed as shown below.

More Information

Deploying Pivotal Ops Metrics
Categories: Fusion Middleware