Skip navigation.

Feed aggregator

Loose Coupling and Discovery of Services With Consul — Part 1

Pythian Group - Wed, 2014-09-03 08:16

Today I would like to demonstrate the use of Consul to help achieve loose coupling and discovery of services, two important principles of service-oriented architecture (SOA) present in modern, elastic infrastructures.

If you have ever designed an architecture to be deployed in a cloud computing environment, then you are probably already familiar with the idea that you should design for failure. Having the expectation that one or more components of your infrastructure may fail at any given time forces you to design your services in such a way that they are loosely coupled from one another. This usually means running multiple instances of web, application, and database servers, and making use of load balancers, message queues and / or APIs for communication between them.

The environment is scaled horizontally, by adding more instances to the pool. Because of this, instances need to be able to advertise the service they provide, and discover providers of other services.

Where Does Consul Fit In?

Consul’s introduction page does an excellent job of describing the features it provides and what its basic architecture looks like, but I’ll provide a brief summary here.

Consul is a tool for discovering and configuring services in your infrastructure. At its core, it provides service discovery, health checking, and a key/value store. It supports multiple data centers without having to add a layer of abstraction. Nodes that provide services run a Consul agent which talks to one or more Consul servers. Services or nodes can be discovered by querying any of the Consul servers or agents. Data is exposed through DNS and / or HTTP interfaces.

Its capabilities overlap with a number of different types of software including: service discovery systems such as ZooKeeper and etcd; configuration management systems such as Puppet or Chef; and monitoring systems such as Nagios or Sensu. Again, there is an excellent page on the Consul website that describes how it compares with these tools.

The Demo Environment

For my Consul demonstration, I elected to use a combination of Docker and Vagrant. Docker because it makes it easy for me to run multiple lightweight containers on the same machine, and Vagrant because it gives me a great deal of flexibility in building and controlling the containers I will be creating. If you are unfamiliar with either of these tools, allow me to provide a brief overview.


Per Wikipedia, Docker “.. is an open-source project that automates the deployment of applications inside software containers”. It consists of: Docker Engine, a container that runs on the host operating system; and Docker Hub, a cloud service similar to GitHub which allows users to share containers and automate workflows.

Docker makes use of the following features of the Linux kernel: cgroups (or control groups), which make it possible to limit and isolate resource usage (CPU, memory, etc) of process groups; and namespace isolation, where process groups’ view of operating environment resources such as process trees, network, and file systems are isolated from one another.


Vagrant is a tool for building complete, reproducible development environments, with a focus on automation. It started out as essentially a ‘wrapper’ for VirtualBox, but now supports VMware, Docker, kvm/libvirt and Amazon EC2 as providers, as well as a number of provisioners including Puppet, Chef, and Docker.

Vagrant was created by Mitchell Hashimoto, who initially maintained it in his free time. After it went on to become wildly successful, he formed HashiCorp so he could work on the product full time. HashiCorp has since released other products including Packer, Serf, Consul, and most recently Terraform.

Installing The Tools Docker

Ubuntu 14.04 LTS includes Docker in its universe repository under the name and can therefore be installed this way:

$ sudo apt-get update
$ sudo apt-get install

Keep in mind that this may not be the latest version of Docker. If you would like to try the latest version, you will need to add the Docker repository key and add the repository to apt’s sources list. A script has been made available to automate this process for you, so all that is necessary is to run the following:

$ curl -sSL | sudo sh

On Fedora, Docker is provided by the package named docker-io. If you are running Fedora 19 and have the (unrelated) docker package installed, you will need to remove it before continuing.

$ sudo yum -y remove docker

With Fedora 21 and later, the docker package’s functionality is provided by another package named wmdocker, and it does not conflict with docker-io. Use the following command to install Docker.

$ sudo yum -y install docker-io

Whichever operating system you are running Docker on, you will likely want to be able to use the commands as your regular, non-privileged user, without having to elevate privileges to root. Therefore, you will probably want to make yourself a member of the docker group so you have access to the socket file used by the various Docker commands.

$ sudo usermod -a -G docker your_name

For Vagrant, you will need at least version 1.6.0 as that is when the Docker provider was introduced. This demonstration was tested with version 1.6.3. To install Vagrant, visit its download page and obtain the appropriate package for your operating system. You can install the package on Ubuntu using the following command:

$ sudo dpkg -i vagrant_1.6.3_x86_64.deb

That’s it. In the next section, we will install Consul and continue with setting up our cluster.

Setting Up The Consul Cluster

Let’s begin by establishing the Consul cluster, which will be used by nodes to register themselves as providers of services, and can be queried to discover which services are being provided.

The first step is to download and install Consul on the host which will be running the Docker containers.

$ wget
$ unzip -d /usr/local/bin/

Now we can start a Consul server that will bind to the IP address of the docker0 interface on your host.

$ consul agent -server -bootstrap -bind= -client= -data-dir /tmp/consul

This server will elect itself leader of the cluster (since it is currently the only member), and is what the rest of the Consul servers will connect to when joining. Ultimately, the architecture will look like this:

Consul Demo Architecture

Before we can bring up the rest of the Consul servers, however, we’ll need to do some prep work. As mentioned above, the intent is to use Vagrant with the Docker provider to create this environment. Therefore, let’s begin by creating a working directory to contain our Vagrantfile.

$ mkdir -p vagrant/consul && cd vagrant/consul
$ vagrant init

The contents of Vagrantfile should look like this:


Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  # A hash of containers to define.
  # These will be the Consul cluster members.
  consul_members = [ "consul1", "consul2", "consul3" ]
  consul_members.each do |member|
  	config.vm.define member do |consul_config|

  	  # Use Docker provider
  	  consul_config.vm.provider "docker" do |docker| = member
  	  	docker.image = 'progrium/consul'
  	  	docker.cmd = [ "-server", "-node=#{member}", "-join=#{JOIN_IP}" ]

What should be self evident is that Vagrant is assigning the value of an environment variable called JOIN_IP to a local variable with the same name, and then enumerating through an array of three Consul members which it will create using the Docker provider.

What might not be so obvious, however, are the docker.image and docker.cmd lines, so I will explain them in greater detail.

docker.image = 'progrium/consul'

This line tells Docker to launch (or ‘run’) an instance of the progrium/consul image as found on the Docker Hub Registry. It is a small container based on BusyBox used to run Consul. The project’s source page can be found on GitHub if you’re interested in learning more. The next line:

docker.cmd = [ "-server", "-node=#{member}", "-join=#{JOIN_IP}" ]

is an array of strings which is used to build a custom command to run on the container. Vagrant will perform string interpolation on the member and JOIN_IP variables, replacing them with the current member’s name and the IP address that was provided via the JOIN_IP environment variable. The end result is that the container runs a command such as this:

/bin/consul agent -config-dir=/config -server -node=consul1 -join=

Let’s see it in action by telling Vagrant to create our containers.

$ JOIN_IP= vagrant up --provider=docker
Bringing machine 'consul1' up with 'docker' provider...
Bringing machine 'consul2' up with 'docker' provider...
Bringing machine 'consul3' up with 'docker' provider...
==> consul2: Creating the container...
    consul2:   Name: consul2
    consul2:  Image: progrium/consul
    consul2:    Cmd: -server -node=consul2 -join=
    consul2: Volume: /home/bfraser/vagrant/consul:/vagrant
    consul2: Container created: d85fbfacdb45cabc
==> consul2: Starting container...
==> consul2: Provisioners will not be run since container doesn't support SSH.
==> consul1: Creating the container...
    consul1:   Name: consul1
    consul1:  Image: progrium/consul
    consul1:    Cmd: -server -node=consul1 -join=
    consul1: Volume: /home/bfraser/vagrant/consul:/vagrant
==> consul3: Fixed port collision for 22 => 2222. Now on port 2200.
==> consul3: Creating the container...
    consul3:   Name: consul3
    consul3:  Image: progrium/consul
    consul3:    Cmd: -server -node=consul3 -join=
    consul3: Volume: /home/bfraser/vagrant/consul:/vagrant
    consul1: Container created: 413dfa1a63c94bcc
==> consul1: Starting container...
==> consul1: Provisioners will not be run since container doesn't support SSH.
    consul3: Container creaited: fb54d80e8ce58a46
==> consul3: Starting container...
==> consul3: Provisioners will not be run since container doesn't support SSH.

This created three containers and substituted the node name and join address as expected. Now let’s see what Docker reports.

$ docker ps
CONTAINER ID        IMAGE                    COMMAND                CREATED              STATUS              PORTS                                                                          NAMES
fb54d80e8ce5        progrium/consul:latest   /bin/start -server -   About a minute ago   Up About a minute   53/udp, 8300/tcp, 8301/tcp, 8301/udp, 8302/tcp, 8302/udp, 8400/tcp, 8500/tcp   consul3             
413dfa1a63c9        progrium/consul:latest   /bin/start -server -   About a minute ago   Up About a minute   53/udp, 8300/tcp, 8301/tcp, 8301/udp, 8302/tcp, 8302/udp, 8400/tcp, 8500/tcp   consul1             
d85fbfacdb45        progrium/consul:latest   /bin/start -server -   About a minute ago   Up About a minute   53/udp, 8300/tcp, 8301/tcp, 8301/udp, 8302/tcp, 8302/udp, 8400/tcp, 8500/tcp   consul2             

There are in fact three containers running. Other details are provided but they aren’t relevant at this point, I just wanted to show you how to view the status of your newly created Docker containers.

Now if we check Consul, we should see each of the containers listed as members of the cluster (note: we can’t just run consul members as we need to tell it which IP address to use as the RPC address).

$ consul members -rpc-addr=
Node     Address           Status  Type    Build  Protocol
laptop  alive   server  0.3.0  2
consul2   alive   server  0.3.0  2
consul1   alive   server  0.3.0  2
consul3   alive   server  0.3.0  2

As you can see, we have successfully created a cluster of Consul servers, comprised of an initial server running on the host and three servers running as Docker containers created by Vagrant. In my next post, we will see how to add Consul clients providing services (‘providers’), register the services, and query available services from clients (‘consumers’).

Categories: DBA Blogs

Teach Your Children Well

Steve Karam - Wed, 2014-09-03 06:00

Today’s the first day of school in my city, and the plethora of “OMG 1st day of school!” posts of Facebook and my own kids heading back in to slog through another year of fundamentals got me thinking about education. My own kids (12/daughter and 10/son) came home and went swimming since it was about 104 degrees today…the hottest day of the summer, as it were.

minecraft-2__140227213724Anyways, after that activity my son decided to get down to business. And by business I mean: play Minecraft. Some of my friends and family have banned Minecraft during the school week or even as a whole, because of the incredibly addictive nature of the game. I elected instead to make my son play it on the computer instead of the Xbox or iDevice. See, on those systems it’s slightly more mindless (yet still very creative); you just pick the things you want to make and it makes them, and the game is what the game is because it’s on a console. On the computer, you have to memorize the ingredients and patterns necessary to make objects (from making andesite to baking cakes). You can mod the game with a variety of amazing mods out there in the community. You can play on servers that have goals ranging from “kill everyone you see” to “let’s build something incredible together.” I like that. It’s like legos, except with social implications, unlimited blocks of every kind, and electronics lessons all rolled into one.

What’s more, in the last few months my son has learned to install Java, use basic DOS, modify heap parameters, create a .BAT file, and many other cool things. Add that to his foray into the world of cryptocurrency, and he’s growing a set of very valuable skills. My daughter’s no slouch either, she spent a couple years on Ubuntu Linux and actually came to like it quite a bit!

Okay, so enough bragging on my kids. They’re goofballs anyways.

Teach Kids Cool Things

A while back I posted about how I became a DBA while in High School. When I was 18, I offered to help someone at a community college with a presentation to teach college students about programming. I remember them telling me that it was a ridiculous concept, because kids can’t learn those kinds of things anyways. It struck a nerve as you might imagine. Kids can learn highly technical skills if someone is willing to teach them. And with the proliferation of technical gadgets and abilities in the world, they want to learn. You can even start them young…REALLY young.stillashark

There are some great resources out there beyond build-a-world-with-blocks games like Minecraft. There are simple learning sites like, highly interactive and informative classes like Code Academy, and even specially made storyboarding and design programs like Scratch by MIT.

For books with associated exercises, we have 3D Game Programming for Kids, Learn to Program with Minecraft Plugins (Java and Minecraft!), and even Python with Python for Kids: A Playful Introduction to Programming.

Babies don’t get a pass from learning! Code Babies has a set of excellent introductions to programming including Web Design, CSS, and HTML. A is NOT for “Aardvark,” it’s for “Anchor Tag”!

There are even online courses like Learn JAVA with Minecraft (my son is doing this one and enjoys it), and summer camps as well, to teach kids online and in person.

Even Oracle has a part to play in the education of youngsters with Oracle Academy. Oracle Academy has resources for all levels of learning, contests, and even a self-study programming environment and course called Greenfoot that teaches Java from zero experience with 2D gaming design. I’ve had the privilege of judging some of their contests in the past and was astounded by the skills these young people had in highly advanced technology.

This is so awesome. I really wish these sorts of things were around when I was a kid. If you have children and want them to learn how to tech, the resources are plentiful and amazing.

The post Teach Your Children Well appeared first on Oracle Alchemist.

Who is using this index?

David Kurtz - Wed, 2014-09-03 04:30
Or, to put it another way, I want to change or drop this index, who and what will I impact?
The Challenge The problem that I am going to outline is certainly not exclusive to PeopleSoft, but I am going to illustrate it with examples from PeopleSoft. I often find tables with far more indexes than are good for them.
  • The Application Designer tool makes it very easy for developers to add indexes to tables. Sometimes, too easy!
  • Sometimes, DBAs are too quick to unquestioningly follow the advice of the Oracle tuning advisor to add indexes.
Recently, I have been working on 3 different PeopleSoft Financials systems where I have found major tables with a host of indexes.

There are several concerns:
  • Indexes are maintained during data modification. The more indexes you have, the greater the overhead. 
  • The more indexes you have, particularly if they lead on the same columns, the more likely Oracle is to use the wrong one, resulting in poorer performance.
  • There is of course also a space overhead for each index, but this is often of less concern. 
If you can get rid of an index, Oracle doesn't store, maintain or use it. 
In some cases, I have wanted to remove unnecessary indexes, and in others to adjust indexes. However, this immediately raises the question of where are these indexes used, and who will be impacted by the change. Naturally, I turn to the Active Session History (ASH) to help me find the answers. Index Maintenance Overhead during DDL ASH reports the object number, file number, block number and (from 11g) row number within the block being accessed by physical file operations. However, the values reported in v$active_session_history (and later other views) are not reliable for other events because they are merely left over from the previous file event that reported them. So, we can profile the amount of time spent on physical I/O on different tables and indexes, but not for other forms of DB Time, such as CPU time, spent accessing the blocks in the buffer cache.
Let me take an extreme example from PeopleSoft Global Payroll. The table PS_GP_RSLT_ACUM is one of the principal result tables. It has only a single unique index (with the same name). The table is populated with the simplest of insert statements.
I can profile the ASH data for just this statement over the last week on a production system. Note that DBA_OBJECTS and DBA_DATA_FILES are outer joined to the ASH data and only matched for events like 'db file%'
SELECT o.object_type, o.object_name
, f.tablespace_name, NVL(h.event,'CPU+CPU Wait') event
, SUM(10) ash_Secs
FROM dba_hist_Active_sess_history h
LEFT OUTER JOIN dba_objects o
ON o.object_id = h.current_obj#
AND h.event like 'db file%'
LEFT OUTER JOIN dba_data_files f
ON f.file_id = h.current_file#
AND h.event like 'db file%'
WHERE h.sql_id = '4ru0618dswz3y'
AND h.sample_time >= sysdate-7
GROUP BY o.object_type, o.object_name, h.event, f.tablespace_name
ORDER BY ash_secs DESC
A full payroll calculation inserts over 3 million rows on this particular system. The calculation is run incrementally several times per week during which old rows are deleted and newly recalculated rows inserted.  Looking at just this insert statement:
  • 30% of the time is spent on CPU operations, we cannot profile that time further with ASH.
  • 38% of the time is spent reading from the table and index, yet this is a simple INSERT … VALUES statement.
------------------- ------------------ --------------- ------------------------ ----------
CPU+CPU Wait 1040
UNDOTBS1 db file sequential read 900
INDEX SUBPARTITION PS_GP_RSLT_ACUM GP201408IDX db file sequential read 750
TABLE SUBPARTITION PS_GP_RSLT_ACUM GP201408TAB db file sequential read 550
gc current grant 2-way 70
cursor: pin S wait on X 60
db file sequential read 10
buffer exterminate 10
row cache lock 10
More time is spent reading the index than the table.  That is not a surprise.  When you insert a row into a table, you also insert it into the index. Rows in index leaf blocks are ordered by the key columns, and the new entry must go into the right place, so you have to read down the index from the root block, through the branch blocks, to find the correct leaf block for the new entry.
[Digression: Counter-intuitively index compression can improve DML performance. It does for this index.  The overhead of the compression processing can be outweighed by the saving in physical I/O.  It depends.]
Profile Physical I/O by Object I can twist this query around and profile DB_TIME by object for 'db file%' events
SELECT o.object_type, o.object_name, sum(10) ash_secs
FROM dba_hist_active_sess_history h
, dba_objects o
WHERE o.object_id = h.current_obj#
AND h.event LIKE 'db file%'
AND h.sample_time > sysdate-7
GROUP BY o.object_type, o.object_name
Now I can see upon which objects the most time is spent on physical I/O.
---------- ------------------ ----------

sum 1382680
This is a worthwhile exercise, it shows the sources of physical I/O in an application.

However, if you want to find where an index is used, then this query will also identify SQL_IDs where the index is either used in the query or maintained by DML. If I am interested in looking for places where changing or deleting an index could have an impact then I am only interested in SQL query activity. ASH samples which relate to index maintenance are a false positive. Yet, I cannot simply eliminate ASH samples where the SQL_OPNAME is not SELECT because the index may be used in a query within the DML statement.

Another problem with this method is that it matches SQL to ASH by object ID. If someone has rebuilt an index, then its object number changes.

A different approach is required.
Index Use from SQL Plans Captured by AWR During an AWR snapshot the top-n SQL statements by each SQL criteria in the AWR report (Elapsed Time, CPU Time, Parse Calls, Shareable Memory, Version Count) , see dbms_workload_repository. The SQL plans are exposed by the view DBA_HIST_SQL_PLAN.

On PeopleSoft systems, I generally recommend decreasing the snapshot interval from the default of 60 minutes to 15. The main reason is that SQL gets aged out of the library cache very quickly in PeopleSoft systems. They generate lots of dynamic code, often with literal values rather than bind variables. Cursor sharing is not recommended for PeopleSoft, so different bind variables result in different SQL_IDs. The dynamic code also results in different SQL IDs even with cursor sharing. Therefore, increasing the snapshot frequency means that will capture more SQL statement. This will increase total volume of the AWR repository simply because there are more snapshots. However, the overall volume of ASH data captured does not change, it just gets copied to the repository earlier.

On DBA_HIST_SQL_PLAN the object ID, owner, type and name are recorded, so I can find the plans which referenced a particular object. I am going to carry on with the example from a PeopleSoft Financials system, and look at indexes on the PS_PROJ_RESOURCE table.

These are some of the indexes on PS_PROJ_RESOURCE. We have 4 indexes that all lead on PROCESS_INSTANCE. I suspect that not all are essential, but I need to work out what is using them, and which one I should retain.
------------------ ---------- -------------------- ----------------------------------





I find it easier to extract the ASH data to my own working storage table. For each index on PS_PROJ_RESOURCE, I am going to extract a distinct list of plan hash values. I will then extract all ASH data for those plans. Note, that I have not joined the SQL_ID on DBA_HIST_SQL_PLAN. That is because different SQL_IDs can produce the same execution plan. The plan is equally valid for all SQL_IDs that produce the plan, not just the one where the SQL_ID also matches.
DROP TABLE my_ash purge
SELECT DISTINCT p.plan_hash_value, p.object#, p.object_owner, p.object_type, p.object_name
FROM dba_hist_sql_plan p
WHERE p.object_name like 'PS_PROJ_RESOURCE'
AND p.object_type LIKE 'INDEX%'
AND p.object_owner = 'SYSADM'
SELECT p.object# object_id, p.object_owner, p.object_type, p.object_name
, h.*
FROM dba_hist_active_sess_history h
, p
WHERE h.sql_plan_hash_value = p.plan_hash_value
I am fortunate that PeopleSoft is a well instrumented application. Module and Action are set to fairly sensible values that will tell me whereabouts in the application the ASH sample relates. In the following query, I have omitted any ASH data generated by SQL*Plus, Toad, or SQL Developer, and also any generated by Oracle processes to prevent statistics collection jobs being included.
Set pages 999 lines 150 trimspool on
break on object_name skip 1
compute sum of ash_secs on object_name
column ash_secs heading 'ASH|Secs' format 9999999
column module format a20
column action format a32
column object_name format a18
column max_sample_time format a19 heading 'Last|Sample'
column sql_plans heading 'SQL|Plans' format 9999
column sql_execs heading 'SQL|Execs' format 99999
SELECT object_name
, CASE WHEN h.module IS NULL THEN REGEXP_SUBSTR(h.program,'[^.@]+',1,1)
WHEN h.module LIKE 'PSAE.%' THEN REGEXP_SUBSTR(h.module,'[^.]+',1,2)
ELSE REGEXP_SUBSTR(h.program,'[^.@]+',1,1)
END as module
ELSE h.action
END as action
, CAST(sample_time AS DATE) sample_time
, sql_id, sql_plan_hash_value, sql_exec_id
FROM my_ash h
SELECT object_name, module, action
, sum(10) ash_secs
, COUNT(DISTINCT sql_plan_hash_value) sql_plans
, COUNT(DISTINCT sql_id||sql_plan_hash_value||sql_exec_id) sql_execs
, MAX(sample_time) max_sample_time
WHERE NOT LOWER(module) IN('oracle','toad','sqlplus','sqlplusw')
AND NOT LOWER(module) LIKE 'sql%'
GROUP BY object_name, module, action
ORDER BY SUBSTR(object_name,4), object_name, ash_Secs desc
Spool off
I now have a profile of how much each index is used. In this particular case I found something using every index.  It is possible that you will not find anything that uses some indexes.
                                                                             ASH   SQL    SQL Last
------------------ -------------------- -------------------------------- ------- ----- ------ -------------------

PSJPROJ_RESOURCE PC_TL_TO_PC GFCPBINT_AE.CallmeA.Step24.S 7300 1 66 06:32:57 27/08/2014
PC_PRICING GFCPBINT_AE.CallmeA.Step24.S 40 1 2 08:38:57 22/08/2014
****************** -------
sum 7340

PSLPROJ_RESOURCE PC_TL_TO_PC GFCPBINT_AE.CallmeA.Step28.S 1220 1 53 06:33:17 27/08/2014
****************** -------
sum 1220

PSMPROJ_RESOURCE PC_TL_TO_PC GFCPBINT_AE.XxBiEDM.Step07.S 60 2 6 18:35:18 20/08/2014
****************** -------
sum 60

PSNPROJ_RESOURCE PC_TL_TO_PC GFCPBINT_AE.CallmeA.Step26.S 6720 1 49 18:53:58 26/08/2014
PC_TL_TO_PC GFCPBINT_AE.CallmeA.Step30.S 3460 1 60 06:33:27 27/08/2014
GFCOA_CMSN GFCOA_CMSN.01INIT.Step01.S 2660 1 47 19:19:40 26/08/2014
PC_TL_TO_PC GFCPBINT_AE.CallmeA.Step06.S 1800 1 52 18:53:28 26/08/2014
PC_TL_TO_PC GFCPBINT_AE.CallmeG.Step01.S 1740 1 61 06:34:17 27/08/2014
PC_TL_TO_PC GFCPBINT_AE.CallmeA.Step02.S 1680 1 24 18:53:18 26/08/2014
PC_TL_TO_PC GFCPBINT_AE.CallmeA.Step10.S 1460 1 33 17:26:26 22/08/2014
PC_TL_TO_PC GFCPBINT_AE.CallmeA.Step08.S 920 1 26 17:26:16 22/08/2014
PC_TL_TO_PC GFCPBINT_AE.CallmeA.Step36.S 460 1 18 18:26:38 20/08/2014
PC_TL_TO_PC GFCPBINT_AE.CallmeA.Step09.S 420 1 16 06:33:07 27/08/2014
PC_PRICING GFCPBINT_AE.CallmeG.Step01.S 200 1 10 08:09:55 22/08/2014
PC_AP_TO_PC GFCPBINT_AE.CallmeH.Step00A.S 170 1 17 21:53:26 21/08/2014
PC_PRICING GFCPBINT_AE.CallmeA.Step36.S 20 1 1 08:02:46 05/08/2014
PC_PRICING GFCPBINT_AE.CallmeA.Step30.S 20 1 1 13:42:48 04/08/2014
PC_PRICING GFCPBINT_AE.CallmeA.Step06.S 20 1 1 15:58:35 28/07/2014
PC_TL_TO_PC GFCPBINT_AE.CallmeA.Pseudo.S 20 1 1 19:45:11 06/08/2014
****************** -------
sum 21770

The next stage is to look at individual SQL statements This query looks for which SQL statement is using a particular index on PROJ_RESOURCE. If I can't find the SQL which cost the most time, then just choose another SQL with the same plan
  • I have found that sometimes a plan is captured by AWR, but the SQL statement is not. Personally, I think that is a bug. Working around it has made the following query quite complicated.
Break on object_name skip 1 
column ash_secs heading 'ASH|Secs' format 9999999
Set long 50000
Column cmd Format a200
Spool dmk

SELECT h.object_name
, CASE WHEN h.module IS NULL THEN REGEXP_SUBSTR(h.program,'[^.@]+',1,1)
WHEN h.module LIKE 'PSAE.%' THEN REGEXP_SUBSTR(h.module,'[^.]+',1,2)
ELSE REGEXP_SUBSTR(h.program,'[^.@]+',1,1)
END as module
ELSE h.action
END as action
, h.sql_id, h.sql_plan_hash_value
, t.command_type –-not null if plan and statement captured
FROM my_ash h
FROM dba_hist_sqltext t1
, dba_hist_sql_plan p1
WHERE t1.sql_id = p1.sql_id
AND = 1
) t
ON t.sql_id = h.sql_id
AND t.dbid = h.dbid
AND h.object_Type = 'INDEX'
AND h.object_owner = 'SYSADM'
And NOT LOWER(module) IN('oracle','toad','sqlplus','sqlplusw')
AND NOT LOWER(module) LIKE 'sql%'
), x AS ( --aggregate DB time by object and statement
SELECT object_name, sql_id, sql_plan_hash_value
, sum(10) ash_secs
, 10*COUNT(command_type) sql_secs --DB time for captured statements only
WHERE NOT LOWER(module) IN('oracle','toad','sqlplus','sqlplusw')
AND NOT LOWER(module) LIKE 'sql%'
GROUP BY object_name, sql_id, sql_plan_hash_value
), y AS ( --rank DB time per object and plan
SELECT object_name, sql_id, sql_plan_hash_value
, ash_secs
, SUM(ash_secs) OVER (PARTITION BY object_name, sql_plan_hash_value) plan_ash_secs
, row_number() OVER (PARTITION BY object_name, sql_plan_hash_value ORDER BY sql_Secs DESC) ranking
), z AS (
SELECT object_name
, CASE WHEN t.sql_text IS NOT NULL THEN y.sql_id
ELSE (SELECT t1.sql_id
FROM dba_hist_sqltext t1
, dba_hist_sql_plan p1
WHERE t1.sql_id = p1.sql_id
AND p1.plan_hash_value = y.sql_plan_hash_value
AND rownum = 1) --if still cannot find statement just pick any one
END AS sql_id
, y.sql_plan_hash_value, y.plan_ash_secs
, CASE WHEN t.sql_text IS NOT NULL THEN t.sql_text
ELSE (SELECT t1.sql_Text
FROM dba_hist_sqltext t1
, dba_hist_sql_plan p1
WHERE t1.sql_id = p1.sql_id
AND p1.plan_hash_value = y.sql_plan_hash_value
AND rownum = 1) --if still cannot find statement just pick any one
END AS sql_text
from y
left outer join dba_hist_sqltext t
on t.sql_id = y.sql_id
WHERE ranking = 1 --captured statement with most time
--'SELECT * FROM table(dbms_xplan.display_awr('''||sql_id||''','||sql_plan_hash_value||',NULL,''ADVANCED''))/*'||object_name||':'||plan_ash_Secs||'*/;' cmd
ORDER BY object_name, plan_ash_secs DESC
Spool off
So now I can see the individual SQL statements.
PSJPROJ_RESOURCE   f02k23bqj0xc4          3393167302          7340 UPDATE PS_PROJ_RESOURCE C SET (C.Operating_Unit, C.CHARTFIELD1, C.PRODUCT, C.CLA
SS_FLD, C.CHARTFIELD2, C.VENDOR_ID, C.contract_num, C.contract_line_num, …

PSLPROJ_RESOURCE 2fz0gcb2774y0 821236869 1220 UPDATE ps_proj_resource p SET p.deptid = NVL (( SELECT j.deptid FROM ps_job j WH
ERE j.emplid = p.emplid AND j.empl_rcd = p.empl_rcd AND j.effdt = ( SELECT MAX (…

PSMPROJ_RESOURCE 96cdkb7jyq863 338292674 50 UPDATE PS_GFCBI_EDM_TA04 a SET a.GFCni_amount = ( SELECT x.resource_amount FROM
PS_PROJ_RESOURCE x WHERE x.process_instance = …

1kq9rfy8sb8d4 4135884683 10 UPDATE PS_GFCBI_EDM_TA04 a SET a.GFCni_amount = ( SELECT x.resource_amount FROM
PS_PROJ_RESOURCE x WHERE x.process_instance = …


9z5qsq6wrr7zp 3665912247 3500 UPDATE PS_PROJ_RESOURCE P SET P.TIME_SHEET_ID = …
If I replace the last select clause with the commented line, then I can generate the commands to extract the statement and plan from the AWR repository.
SELECT * FROM table(dbms_xplan.display_awr('45ggt0yfrh5qp',3393167302,NULL,'ADVANCED'))/*PSJPROJ_RESOURCE:7340*/;

SELECT * FROM table(dbms_xplan.display_awr('8ntxj3694r6kg',821236869,NULL,'ADVANCED'))/*PSLPROJ_RESOURCE:1220*/;

SELECT * FROM table(dbms_xplan.display_awr('96cdkb7jyq863',338292674,NULL,'ADVANCED'))/*PSMPROJ_RESOURCE:50*/;

SELECT * FROM table(dbms_xplan.display_awr('1kq9rfy8sb8d4',4135884683,NULL,'ADVANCED'))/*PSMPROJ_RESOURCE:10*/;

SELECT * FROM table(dbms_xplan.display_awr('ga2x2u4jw9p0x',2282068749,NULL,'ADVANCED'))/*PSNPROJ_RESOURCE:6760*/;
SELECT * FROM table(dbms_xplan.display_awr('9z5qsq6wrr7zp',3665912247,NULL,'ADVANCED'))/*PSNPROJ_RESOURCE:3500*/;
SELECT * FROM table(dbms_xplan.display_awr('b28btd6k3x8jt',1288409804,NULL,'ADVANCED'))/*PSNPROJ_RESOURCE:3060*/;
SELECT * FROM table(dbms_xplan.display_awr('avs70c19khxmw',2276811880,NULL,'ADVANCED'))/*PSNPROJ_RESOURCE:2660*/;
SELECT * FROM table(dbms_xplan.display_awr('b78qhsch85g4a',1019599680,NULL,'ADVANCED'))/*PSNPROJ_RESOURCE:1960*/;
SELECT * FROM table(dbms_xplan.display_awr('65kq2v1ubybps',3138703971,NULL,'ADVANCED'))/*PSNPROJ_RESOURCE:1820*/;
SELECT * FROM table(dbms_xplan.display_awr('1dj17ra70c801',1175874548,NULL,'ADVANCED'))/*PSNPROJ_RESOURCE:1460*/;
SELECT * FROM table(dbms_xplan.display_awr('3w71v896s7m5d',3207074729,NULL,'ADVANCED'))/*PSNPROJ_RESOURCE:500*/;
SELECT * FROM table(dbms_xplan.display_awr('35mz5bw7p5ubw',2447377432,NULL,'ADVANCED'))/*PSNPROJ_RESOURCE:170*/;
Ultimately, I have needed to look through the SQL plans that use an index to decide whether I need to keep that index or decide whether the statement would perform adequately using another index. In this case, on this particular system, I think the index PSMPROJ_RESOURCE would be adequate for this statement, and I would consider dropping PSLPROJ_RESOURCE.
>SELECT * FROM table(dbms_xplan.display_awr('8ntxj3694r6kg',821236869,NULL,'ADVANCED'))/*PSLPROJ_RESOURCE:1220*/;
UPDATE ps_proj_resource p SET p.deptid = NVL (( SELECT j.deptid FROM
ps_job j WHERE j.emplid = p.emplid AND j.empl_rcd = p.empl_rcd AND
j.effdt = ( SELECT MAX (j1.effdt) FROM ps_job j1 WHERE j1.emplid =
j.emplid AND j1.empl_rcd = j.empl_rcd AND j1.effdt <= p.trans_dt) AND
j.effseq = ( SELECT MAX (j2.effseq) FROM ps_job j2 WHERE j2.emplid =
j.emplid AND j2.empl_rcd = j.empl_rcd AND j2.effdt = j.effdt)),
p.deptid )
WHERE p.process_instance = …
j.deptid FROM ps_job j WHERE j.emplid = p.emplid AND j.empl_rcd =
p.empl_rcd AND j.effdt = ( SELECT MAX (j1.effdt) FROM ps_job j1 WHERE
j1.emplid = j.emplid AND j1.empl_rcd = j.empl_rcd AND j1.effdt <=
p.trans_dt) AND j.effseq = ( SELECT MAX (j2.effseq) FROM ps_job j2
WHERE j2.emplid = j.emplid AND j2.empl_rcd = j.empl_rcd AND j2.effdt =

Plan hash value: 821236869

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | UPDATE STATEMENT | | | | 63104 (100)| |
| 1 | UPDATE | PS_PROJ_RESOURCE | | | | |
| 2 | INDEX RANGE SCAN | PSLPROJ_RESOURCE | 365 | 11315 | 22 (0)| 00:00:01 |
| 3 | INDEX RANGE SCAN | PSAJOB | 1 | 23 | 3 (0)| 00:00:01 |
| 4 | SORT AGGREGATE | | 1 | 20 | | |
| 5 | INDEX RANGE SCAN| PSAJOB | 1 | 20 | 3 (0)| 00:00:01 |
| 6 | SORT AGGREGATE | | 1 | 23 | | |
| 7 | INDEX RANGE SCAN| PSAJOB | 1 | 23 | 3 (0)| 00:00:01 |
| 8 | INDEX RANGE SCAN | PSAJOB | 1 | 29 | 3 (0)| 00:00:01 |
| 9 | SORT AGGREGATE | | 1 | 20 | | |
| 10 | INDEX RANGE SCAN | PSAJOB | 1 | 20 | 3 (0)| 00:00:01 |
| 11 | SORT AGGREGATE | | 1 | 23 | | |
| 12 | INDEX RANGE SCAN | PSAJOB | 1 | 23 | 3 (0)| 00:00:01 |

I carried on with examination of SQL statements and execution plans to determine whether each index is really needed or another index (or even no index at all) would do as well.  This decision also requires some background knowledge about the application. Eventually, I decided that I want drop the J, L and N indexes on PROJ_RESOURCE and just keep M. 
Limitations of Method
    AWR does not capture all SQLs, nor all SQL plans. First the SQL has to be in the library cache and then it must be one of the top-n. A SQL that is efficient because it uses an appropriate index may not be captured, and will not be detected by this approach. This might lead you to erronously believe that the index could be dropped.
      ASH data is purged after a period of time, by default 31 days. If an index is only used by a process that has not run within the retention period, then it will not be detected by this approach. This is another reason to retain ASH and AWR in a repository for a longer period. I have heard 400 days suggested, so that you have ASH for a year and a month.
        • However, this also causes the SYSAUX tablespace to be become very large, so I would suggest regularly moving the data to a separate database. I know one customer who has built a central AWR repository for all their production and test databases and automated regular transfer of data. That repository has been of immense diagnostic value.
        [Update] This analysis will not detect index use in support constraint validation (PeopleSoft doesn't use database referential integrity constraints).  As Mark Farnham points out below, that may be a reason for retaining a particular index.Getting Rid of Indexes Obviously any index changes need to be tested carefully in all the places that reference the index, but on the other hand it is not viable to do a full regression test every time you want to change an index.
          Therefore, if all the testing is successful and you decide to go ahead and drop the index in production, you might prefer to make it invisible first for a while before actually dropping it. It is likely that the indexes you choose to examine are large and will take time to rebuild. An invisible index will not be used by the optimizer, but it will continue to be maintained during DML. If there are any unfortunate consequences, you can immediately make the index visible without having to rebuild it.
            ©David Kurtz, Go-Faster Consultancy Ltd.

            APEX 5.0 New Features Schulung

            Denes Kubicek - Wed, 2014-09-03 04:27
            Unsere nächste Schulung hat APEX 5.0 New Features im Fokus. Diese findet im Dezember am 08. und am 09. in Bensheim statt. Wenn Sie auf diesen Link klicken, können Sie die Details anschauen und sich ggf. anmelden. Dieses Thema is sehr spannend und viele neue Features bzw. Änderungen kommen auf uns zu. Wir denken, dass Dezember der richtige Zeitpunkt dafür ist, damit anzufangen.

            Categories: Development

            Order of Operation

            Jonathan Lewis - Wed, 2014-09-03 02:42

            One response to my series on reading execution plans was an email request asking me to clarify what I meant by the “order of operation” of the lines of an execution plan. Looking through the set of articles I’d written I realised that I hadn’t made any sort of formal declaration of what I meant, all I had was a passing reference in the introduction to part 4; so here’s the explanation.


            By “order of operation” I mean the order in which the lines of an execution plan start to produce a rowsource. It’s worth stating this a little formally as any other interpretation could lead to confusion; consider the following simple hash join:

            | Id  | Operation           | Name  |
            |   0 | SELECT STATEMENT    |       |
            |*  1 |  HASH JOIN          |       |
            |   2 |   TABLE ACCESS FULL | T1    |
            |   3 |   TABLE ACCESS FULL | T2    |

            The tablescan at line 2 is the first operation to start producing a rowsoruce; the hash join at line 1 consumes the output and builds a hash table – so it definitely has to do some work before line 3 starts to run – but it doesn’t start generating a rowsource at this point. It’s only after line 3 starts its tablescan and starts to generate its rowsource that line 1 can produce a rowsource by consuming the rows coming from line 3 and probing the in-memory hash table. So line 1 starts to produce its rowsource only after line 3 starts producing its rowsource. The “order of operation” is 2, 3, 1, 0. Perhaps, for the purposes of avoiding confusion, it would be better in future if I remembered to say: “the order of rowsource generation”.

   Introduction to Zone Maps Part I (Map Of The Problematique)

            Richard Foote - Wed, 2014-09-03 02:06
            Zone Maps are new index-like structures that enables the “pruning” of disk blocks during accesses of the table by storing the min and max values of selected columns for each “zone” of a table. A zone is simply a range of contiguous blocks within a table. Zone Maps are similar in concept to Exadata storage […]
            Categories: DBA Blogs

            REST with Pivotal GemFire 8.0

            Pas Apicella - Tue, 2014-09-02 21:32
            Pivotal GemFire 8.0 now includes REST support. You can read more about it as follows


            Here is how we set it up and some quick examples showing how it works with some Region data
            In the example below I have PDX setup for the cache servers as shown below.
            <!DOCTYPE cache PUBLIC
            "-//GemStone Systems, Inc.//GemFire Declarative Caching 8.0//EN"
            <pdx read-serialized="true">
            <parameter name="classes">


            1. Firstly you need to enable the REST on a cache server node as shown below. Basically set gemfire.start-dev-rest-api to TRUE , you could use a file but here we just pass it to GFSH as part of the server start command.

            start server --name=server1 --classpath=$CLASSPATH --server-port=40411 --cache-xml-file=./server1/cache.xml --properties-file=./server1/ --locators=localhost[10334] --dir=server1 --initial-heap=1g --max-heap=1g --J=-Dgemfire.http-service-port=7070 --J=-Dgemfire.http-service-bind-address=localhost --J=-Dgemfire.start-dev-rest-api=true

            2. Once started we can quickly ensure we have the REST server up on port 7070 as shown below.

            [Wed Sep 03 12:39:18 papicella@:~/ant-demos/gemfire/80/demo ] $ netstat -an | grep 7070
            tcp4       0      0         *.*                    LISTEN

            3. Next test that you can access the REST server. The command below will list all the regions available in the cluster.

            [Wed Sep 03 12:52:44 papicella@:~/ant-demos/gemfire/80/demo/rest ] $ curl -i http://localhost:7070/gemfire-api/v1
            HTTP/1.1 200 OK
            Server: Apache-Coyote/1.1
            Location: http://localhost:7070/gemfire-api/v1
            Accept-Charset: big5, big5-hkscs, euc-jp, euc-kr, gb18030, gb2312, gbk, ibm-thai, ibm00858, ibm01140, ibm01141, ibm01142, ibm01143, ibm01144, ibm01145, ibm01146, ibm01147, ibm01148, ibm01149, ibm037, ibm1026, ibm1047, ibm273, ibm277, ibm278, ibm280, ibm284, ibm285, ibm290, ibm297, ibm420, ibm424, ibm437, ibm500, ibm775, ibm850, ibm852, ibm855, ibm857, ibm860, ibm861, ibm862, ibm863, ibm864, ibm865, ibm866, ibm868, ibm869, ibm870, ibm871, ibm918, iso-2022-cn, iso-2022-jp, iso-2022-jp-2, iso-2022-kr, iso-8859-1, iso-8859-13, iso-8859-15, iso-8859-2, iso-8859-3, iso-8859-4, iso-8859-5, iso-8859-6, iso-8859-7, iso-8859-8, iso-8859-9, jis_x0201, jis_x0212-1990, koi8-r, koi8-u, shift_jis, tis-620, us-ascii, utf-16, utf-16be, utf-16le, utf-32, utf-32be, utf-32le, utf-8, windows-1250, windows-1251, windows-1252, windows-1253, windows-1254, windows-1255, windows-1256, windows-1257, windows-1258, windows-31j, x-big5-hkscs-2001, x-big5-solaris, x-compound_text, x-euc-jp-linux, x-euc-tw, x-eucjp-open, x-ibm1006, x-ibm1025, x-ibm1046, x-ibm1097, x-ibm1098, x-ibm1112, x-ibm1122, x-ibm1123, x-ibm1124, x-ibm1364, x-ibm1381, x-ibm1383, x-ibm300, x-ibm33722, x-ibm737, x-ibm833, x-ibm834, x-ibm856, x-ibm874, x-ibm875, x-ibm921, x-ibm922, x-ibm930, x-ibm933, x-ibm935, x-ibm937, x-ibm939, x-ibm942, x-ibm942c, x-ibm943, x-ibm943c, x-ibm948, x-ibm949, x-ibm949c, x-ibm950, x-ibm964, x-ibm970, x-iscii91, x-iso-2022-cn-cns, x-iso-2022-cn-gb, x-iso-8859-11, x-jis0208, x-jisautodetect, x-johab, x-macarabic, x-maccentraleurope, x-maccroatian, x-maccyrillic, x-macdingbat, x-macgreek, x-machebrew, x-maciceland, x-macroman, x-macromania, x-macsymbol, x-macthai, x-macturkish, x-macukraine, x-ms932_0213, x-ms950-hkscs, x-ms950-hkscs-xp, x-mswin-936, x-pck, x-sjis_0213, x-utf-16le-bom, x-utf-32be-bom, x-utf-32le-bom, x-windows-50220, x-windows-50221, x-windows-874, x-windows-949, x-windows-950, x-windows-iso2022jp
            Content-Type: application/json
            Content-Length: 493
            Date: Wed, 03 Sep 2014 02:52:46 GMT

              "regions" : [ {
                "name" : "demoRegion",
                "type" : "PARTITION",
                "key-constraint" : null,
                "value-constraint" : null
              }, {
                "name" : "departments",
                "type" : "PARTITION",
                "key-constraint" : null,
                "value-constraint" : null
              }, {
                "name" : "employees",
                "type" : "PARTITION",
                "key-constraint" : null,
                "value-constraint" : null
              }, {
                "name" : "complex",
                "type" : "PARTITION",
                "key-constraint" : null,
                "value-constraint" : null
              } ]

            4. We have a couple of regions in this cluster and once again I am using the classic DEPT/EMP regions here. Some simple REST command belows on the "/departments" region

            View all DEPARTMENT region entries

            [Wed Sep 03 12:53:38 papicella@:~/ant-demos/gemfire/80/demo/rest ] $ curl -i http://localhost:7070/gemfire-api/v1/departments
            HTTP/1.1 200 OK
            Server: Apache-Coyote/1.1
            Content-Location: http://localhost:7070/gemfire-api/v1/departments/20,10,30,40
            Content-Type: application/json
            Content-Length: 225
            Date: Wed, 03 Sep 2014 02:53:40 GMT

              "departments" : [ {
                "deptno" : 20,
                "name" : "RESEARCH"
              }, {
                "deptno" : 10,
                "name" : "ACCOUNTING"
              }, {
                "deptno" : 30,
                "name" : "SALES"
              }, {
                "deptno" : 40,
                "name" : "OPERATIONS"
              } ]

            VIEW a single region entry by KEY

            [Wed Sep 03 12:55:34 papicella@:~/ant-demos/gemfire/80/demo/rest ] $ curl -i http://localhost:7070/gemfire-api/v1/departments/10
            HTTP/1.1 200 OK
            Server: Apache-Coyote/1.1
            Content-Location: http://localhost:7070/gemfire-api/v1/departments/10
            Content-Type: application/json
            Content-Length: 44
            Date: Wed, 03 Sep 2014 02:55:36 GMT

              "deptno" : 10,
              "name" : "ACCOUNTING"

            VIEW multiple entries by KEY

            [Wed Sep 03 12:56:25 papicella@:~/ant-demos/gemfire/80/demo/rest ] $ curl -i http://localhost:7070/gemfire-api/v1/departments/10,30
            HTTP/1.1 200 OK
            Server: Apache-Coyote/1.1
            Content-Location: http://localhost:7070/gemfire-api/v1/departments/10,30
            Content-Type: application/json
            Content-Length: 123
            Date: Wed, 03 Sep 2014 02:56:28 GMT

              "departments" : [ {
                "deptno" : 10,
                "name" : "ACCOUNTING"
              }, {
                "deptno" : 30,
                "name" : "SALES"
              } ]

            5. We can even use the Spring REST shell as shown below.

            Obtain rest-shell using the link below.


            [Wed Sep 03 13:06:22 papicella@:~ ] $ rest-shell

            ___ ___ __ _____ __ _ _ _ _ __
            | _ \ __/' _/_ _/' _/| || | / / | \ \
            | v / _|`._`. | | `._`.| >< | / / / > >
            |_|_\___|___/ |_| |___/|_||_| |_/_/ /_/

            Welcome to the REST shell. For assistance hit TAB or type "help".
            http://localhost:8080:> baseUri http://localhost:7070/
            Base URI set to 'http://localhost:7070'
            http://localhost:7070:> follow gemfire-api
            http://localhost:7070/gemfire-api:> follow v1
            http://localhost:7070/gemfire-api/v1:> follow departments
            http://localhost:7070/gemfire-api/v1/departments:> get 20
            > GET http://localhost:7070/gemfire-api/v1/departments/20

            < 200 OK
            < Server: Apache-Coyote/1.1
            < Content-Location: http://localhost:7070/gemfire-api/v1/departments/20
            < Content-Type: application/json
            < Content-Length: 42
            < Date: Wed, 03 Sep 2014 03:07:17 GMT
            "deptno" : 20,
            "name" : "RESEARCH"

            6. Open a browser and enter the following URL to browse the Swagger-enabled REST APIs:


            7. Perform an operation as shown below.

            Categories: Fusion Middleware

            <b>Contributions by Angela Golla,

            Oracle Infogram - Tue, 2014-09-02 12:32
            Contributions by Angela Golla, Infogram Deputy Editor

            Oracle Support and Services at OpenWorld
            Oracle Support and Services will have many sessions and activities during Oracle OpenWorld, September 28th - October 2nd.  Topics include Tips and Best Practices from the Front Lines, Best Practices for Proactively Supporting E-Business Suite, Near Zero Downtime Database migration and more.  Click HERE for the complete list.  

            Data Profile: Better Knowing Your Data

            Pythian Group - Tue, 2014-09-02 08:40

            Have you ever needed to start a new data analysis project or create a report for the business users querying a database you never worked before? Or simply know the data distribution of a database to create better indexing strategies?

            Working as a consultant, I constantly face this challenge where I have to work with a customer’s database that I don’t know about very deeply. For instance, that “Gender” column stores data as “M” and “F” or “Male” and “Female”? Or even, do they use a bit column for that? (Yeah, I saw that a lot already). Does that “Surname” column accept NULL values? If so, what percent of the table contains NULL for that specific column? In a date/time column, what is the minimum and maximum values so I can create my “Time” dimension in a Data warehouse?

            This data discovery process, where I need an overview of the data, usually takes a lot of time and a lot of query writing, doing DISTINCT, MIN, MAX, AVG kind of queries and analyzing the result of each individual query. Even with a lot of really good code, completing third party tools out there, it is a cumbersome task and sometimes the customer is not willing to wait while I learn everything about their environment before expecting results.

            Today I want to show you a not-so-new feature that we have in SQL Server that will help with the data discovery process. The feature is the Data Profiler Task in SQL Server Integration Services and the Data Profile Viewer.

            Now, that’s the time when you ask me, “Data what?!

            It’s easy, you’ll see. One of the several tasks in the SQL Server Integration Services that you never use and never took the time to google what is used for is called Data Profiling Task. This task allows you to select a table and what kind of data analysis you want to do in that table/column. When you run the SSIS package it will analyze the table and generate a XML file. Once you have the XML file, all you need to do is to open it using the Data Profile Viewer, which will take care of creating a nice user interface for you to analyze the XML, as you can see in the Figure 1.



            Figure 1: Data Profile Viewer

            Cool, now let’s see how to create our own analysis.

            Step 1: Open SQL Data Tools or SQL BIDS if you’re using SQL Server 2008 R2 or below

            Step 2: Create a new SSIS project

            Step 3: Add the Data Profiling Task on your project


            Step 4: Double click in the Data Profiling task so we can configure it. In the General tab we have to set the Destination, that means, the location you want to save the XML file. You can choose to save directly to the file system using a File Connection or store in a XML variable inside your package in the case you want to do something else with the XML, maybe store in a database. Let’s leave the default FileConnection option for the Destination Type option and click in New File Connection in the Destination option.


            Step 5: Now we can choose the file location, on my example I am using one of the most used folders every on windows. The “tmp” folder, sometimes also called as “temp” or just “stuff”. (Note: the author doesn’t recommend storing everything in folders called temp nor saving everything in the desktop)


            Step 6: Ok, we’re back to the main window, we have now to choose which kind of analysis we want to run, the database and the table. We have two options, the first one is to use the Profile Requests tab and choose one by one the data analysis, table and columns. The other option and also the simplest one is to use the Quick Profile tab. Using this option we can define one specific table and what analysis you want to run on that table. If you want to run the analysis on multiple tables you will have to click in the Quick Profile option and choose one by one (nothing on this world is perfect).


            As you can see in the image above, I have chosen the Production.Product table of the AdventureWorks2012 database. In the Compute option you have to choose what data analysis you want to run, the names of the options kind of explain what they’ll do, but if you want a detailed explanation of each option you can check the product documentation on this link:

            Now all you have to do is to run the SSIS package to create the XML file. Once you’re done, you can use the Data Profile Viewer tool to open the XML and analyze its results.


            The Data Profile Viewer is a simple tool that doesn’t need much explanation, just try it for yourself and you’ll certainly like the result.

            I hope this can help you to save some time when you need to quickly learn more about the data inside a database. If you have any questions or want to share what your approach is when you need to complete this task, just leave a comment!


            Categories: DBA Blogs

            Oracle Database: Script to Purge aud$ Table Using dbms_audit_mgmt Package

            Pythian Group - Tue, 2014-09-02 07:59

            With an increase in requirements on database security, database auditing is part of most production databases nowadays. The growth on Oracle database audit table”‘aud$” always add spice on a DBA’s life. Since this table growth directly impacts the database performance, this table got special place on every DBA’s heart.

            Traditionally we follow many methods to purge the old data from this table, all these methods require application downtime for most of the time. Oracle introduced a new in-house package named “DBMS_AUDIT_MGMT”, which gives more control for a DBA over management of auditing records.

            I tried to test this package on my 11gR2 test database. Although this can be done from oracle rdbms version But we need to apply the required patches on (Patch 6989148) and (Patch 6996030) versions. This package is installed by default on versions and I configured audit_trail parameter value to db_extended and enabled database auditing. Later I moved this db audit table and associated LOB segments to dedicated tablespace named “AUDTBS”. I confirmed the audit functionality is running fine after the tablespace change.

            Thanks to MOS notes 1362997.1 and 1508787.1, I successfully tested this purge procedure. I configured the database scheduler jobs in such a way that they should run once per 12 hours, purges data from aud$ table, which are older than 7 days. Here is the script(purge_job.sql) used myself to configure the required jobs.

            Script: purge_job.sql ==> Run this script as SYS database user account.

            prompt start of the script
            set serveroutput on
            prompt Change based on our customization done
            update dam_config_param$ set string_value=’AUDTBS’ where audit_trail_type#=1 and param_id=22;

            prompt First Step: init cleanup (if not already)

            dbms_output.put_line(‘Calling DBMS_AUDIT_MGMT.INIT_CLEANUP’);
            audit_trail_type => dbms_audit_mgmt.AUDIT_TRAIL_AUD_STD,
            default_cleanup_interval => 12);
            dbms_output.put_line(‘Cleanup for STD was already initialized’);
            end if;

            prompt revert back to default values again
            update dam_config_param$ set string_value=’SYSAUX’ where audit_trail_type#=1 and param_id=22;

            prompt set last archive timestamp to older than 7 days

            audit_trail_type => DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD,
            last_archive_time => sysdate – 7);

            prompt setup a purge job

            AUDIT_TRAIL_PURGE_NAME => ‘Standard_Audit_Trail_PJ’,

            prompt call DBMS_AUDIT_MGMT.SET_LAST_ARCHIVE_TIMESTAMP regularly to advance the last archive timestamp

            create or replace procedure set_archive_retention
            (retention in number default 7) as
            audit_trail_type => DBMS_AUDIT_MGMT.AUDIT_TRAIL_AUD_STD,
            last_archive_time => sysdate – retention);

            DBMS_SCHEDULER.create_job (
            job_name => ‘advance_archive_timestamp’,
            job_type => ‘STORED_PROCEDURE’,
            job_action => ‘SET_ARCHIVE_RETENTION’,
            number_of_arguments => 1,
            start_date => SYSDATE,
            repeat_interval => ‘freq=hourly;interval=12′ ,
            enabled => false,
            auto_drop => FALSE);
            (job_name =>’advance_archive_timestamp’,
            argument_position =>1,
            argument_value => 7);

            DBMS_SCHEDULER.run_job (job_name => ‘advance_archive_timestamp’,
            use_current_session => FALSE);

            prompt End of the script

            To verify the purge status and configured jobs status execute the following queries.

            SQL> select min(NTIMESTAMP#) from aud$;
            SQL> select LAST_START_DATE,LAST_RUN_DURATION,NEXT_RUN_DATE from dba_scheduler_jobs where job_name=’ADVANCE_ARCHIVE_TIMESTAMP’;
            SQL> select LAST_START_DATE,LAST_RUN_DURATION,NEXT_RUN_DATE from dba_scheduler_jobs where job_name=’STANDARD_AUDIT_TRAIL_PJ’;

            We can definitely customize this script based on requirement, which is different for each database. But testing is required on the cloned database before configuring these purge jobs on the production database.


            Categories: DBA Blogs

            Azure Storage: Creating, Maintaining, and Deleting SQL Server Backups

            Pythian Group - Tue, 2014-09-02 07:50

            This post covers how to create, monitor, maintain, and automatically delete SQL Server backups.

            What is Azure Storage and why should you use it?

            Microsoft Windows Azure is Microsoft’s cloud offering for offsite storage. It offers the ability to seamlessly enable massive storage, Virtual Servers, SQL Server database instances, and many other options without having to worry about the hardware or maintenance in house.

            Many companies are currently using Azure as offsite storage for their nightly Production backups. A company chooses one of 15 datacenters that Microsoft has around the world. This datacenter automatically and transparently maintains three copies of each backup file, and also replicates to a second datacenter in a different geographic location. The replication is not real-time, but in general there will always be six copies of each backup file available in case of an emergency.

            In the event the Primary datacenter fails, Microsoft will decide when or if to failover to the Secondary datacenter. However, in the coming months they plan to roll out an API which would allow individual clients to make that decision.

            SQL Server 2012 SP1 CU6+ is required.

            The current pricing is about $90 per month per TB of storage used.

            Accessing the Azure front end
            To access the Azure front end:

            • Open Internet Explorer and navigate to
              • You will be prompted to login with a Microsoft MSDN Account.
            • The Azure administrator in your company should have granted this account access.
            • Click on the Azure Portal icon to bring up the Azure Main Page for your account.
            • Click on the Storage Icon on the left.
            • Drill down into your storage account to open the Storage Main Page.
            • Click on the Containers tab
            • Drill down into your containerThis is a list of all of the backups being written to the Production container.

            The backups are ordered by their name, and unfortunately there is no way to sort by Date Modified or other field. You can see more backups than are listed on the front page by clicking on the arrow at the bottom left.

            Checking the status of an Azure backup file
            To check the status of a backup to Azure storage, you have two options:
            1. The Azure Portal
            2. SQL Server Management Studio

            To use the Azure Portal, navigate to the container the backup is being written to, and find it in the list of files. If the size is 1 TB, then the backup is currently running. Using the Azure Portal, you can only see if a backup file is being created. You cannot see how much time is remaining until it is done.

            To use SQL Server Management Studio, open SSMS and connect to the instance running the backups. Run the following command to get percent done & estimated time to completion:

            CONVERT(NUMERIC(6,2),r.percent_complete) AS [PercentComplete],
            CONVERT(VARCHAR(20),DATEADD(ms,r.estimated_completion_time,GetDate()),20) AS [ETA Completion Time],
            CONVERT(NUMERIC(10,2),r.total_elapsed_time/1000.0/60.0) AS [Elapsed Min],
            CONVERT(NUMERIC(10,2),r.estimated_completion_time/1000.0/60.0) AS [ETA Min],
            CONVERT(NUMERIC(10,2),r.estimated_completion_time/1000.0/60.0/60.0) AS [ETA Hours],
            CONVERT(VARCHAR(1000),(SELECT SUBSTRING(text,r.statement_start_offset/2, CASE WHEN r.statement_end_offset = -1 THEN 1000 ELSE (r.statement_end_offset-r.statement_start_offset)/2 END) FROM sys.dm_exec_sql_text(sql_handle)))
            sys.dm_exec_requests r
            command = 'BACKUP DATABASE'

            Backing up to an Azure storage container
            In order to write a backup file to a Microsoft Azure storage container, two items are needed:

                • A SQL Server credential to connect to the Azure Account
                  • This requires the Account Name and a Storage Key.
                • The container URL

            To find the Azure Account Name and Storage Key, navigate to the Azure Main Page. At the bottom of the page, there is a “Manage Access Keys”. Clicking on this icon bring ups the Account Name and a Primary & Secondary Access Key.

            1. Copy the Account Name and one of the Access Keys. DO NOT REGENERATE THE KEYS.
            2.    Copy the URL.
            3.    Open SQL Server Management Studio and connect to the RLPRODMSSQL01 instance.
            4.    From a new query window, run the following command:

            CREATE CREDENTIAL AzureBackups
            IDENTITY = ‘Account Name’,
            SECRET = ‘Storage Key’

            5.    Run this same command on all instances that will backup to this container.
            6.    Run the following command to backup a database to the Azure Storage container:

            BACKUP DATABASE db_name
            FROM URL = ‘Container URL + Backup File Name’
            CREDENTIAL = ‘AzureBackups’

            Restoring from an Azure Storage Container
            To restore from an Azure Storage Container, two items are needed:

            1. A credential to connect to the Azure Account (See steps 1-4 of Backing up to an Azure Storage Container)
            2. The backup file URL

            To get the backup file URL, navigate to the container where the backup file is stored. The URL is to the left of the backup name.

            1. Copy the URL.
            2. Run the following command on the instance you want to restore the database onto:

            RESTORE DATABASE db_name
            FROM URL = ‘Backup File URL’
            CREDENTIAL = ‘AzureBackups’

            Deleting SQL Server backups from Azure storage

            In SQL Server 2012, Azure storage is not fully integrated with Maintenance Plans and deleting old backups is a manual process. This causes issues, because there is no way to quickly delete a batch of backups, and if this is forgotten for a few days then the cost of storage begins to rise quickly.

            I have written the below code to create an executable that will connect to the Azure storage container and delete any backups older than x days.

            In addition, the code can check for any backups that have a “locked lease”, break the lease, and then delete them.

            The parameters for the executable are:

            • Parameter 1 – MS Azure Account Name (string)
            • Parameter 2 – MS Azure Storage Key (string)
            • Parameter 3 – Azure Container Name (string)
            • Parameter 4 – Number of days backups to retain (positive integer)
            • Parameter 5 – File type to delete (.bak, .trn, etc..)
            • Parameter 6 – Delete backups with locked lease? (True/False)
              • Note that a True value for parameter 6 will cause the executable to ignore parameters 4 and 5.
              • This is meant to be run after a failed backup job.

            In order to work, the executable will need the Windows Azure Storage Client Library.

            Alternatively, you can download and run the executable using this ZIP file.


            using System;
            using System.Collections.Generic;
            using System.Linq;
            using System.Text;
            using System.Threading.Tasks;
            using Microsoft.WindowsAzure.Storage;
            using Microsoft.WindowsAzure.Storage.Auth;
            using Microsoft.WindowsAzure.Storage.Blob;

            namespace DeleteAzureBackups
            class Program

            static void Main(string[] args)
            if (args.Length != 6) { Console.WriteLine(“Please run with correct number of parameters. Type ? for help.”); return; }

            if (args[0] == “?” || args[0] == “help” || args[0] == “h”)
            Console.WriteLine(“Pythian Azure Backup Delete Utility”);
            Console.WriteLine(“Parameter 1 : String : MS Azure Account Name”);
            Console.WriteLine(“Parameter 2 : String : MS Azure Account Key”);
            Console.WriteLine(“Parameter 3 : String : Container Name”);
            Console.WriteLine(“Parameter 4 : Positive Integer : Number of days to retain backups”);
            Console.WriteLine(“Parameter 5 : String : File type to delete (.bak, .trn, etc…)”);
            Console.WriteLine(“Parameter 6 : True/False : Delete backups with locked leases (will ignore Parameters 4 & 5)”);

            // Account name and key.
            string accountName = args[0].ToLower(); //Account Name
            string accountKey = args[1]; //Account Key
            string containerName = args[2]; //Container Name
            int numberOfDays = Int16.Parse(args[3]); //Number of Days before deleting
            string fileType = args[4];
            bool deleteLockedBlobs = bool.Parse(args[5]);

            CloudBlobContainer container = openConnection(accountName, accountKey, containerName);

            if (!deleteLockedBlobs)
            { deleteAzureBackups(container, numberOfDays, fileType); }
            { unlockLeasedBlobs(container); }

            catch (Exception ex)


            static CloudBlobContainer openConnection(string accountName, string accountKey, string containerName)
            //Get a reference to the storage account, with authentication credentials
            StorageCredentials credentials = new StorageCredentials(accountName, accountKey);
            CloudStorageAccount storageAccount = new CloudStorageAccount(credentials, true);

            //Create a new client object.
            CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

            // Retrieve a reference to a container.
            CloudBlobContainer container = blobClient.GetContainerReference(containerName);

            return container;
            catch (StorageException ex)
            Console.WriteLine(“Failed to create connection to MS Azure Storage.”);
            return null;

            static void deleteAzureBackups(CloudBlobContainer container, int numberOfDays, string fileType)
            DateTimeOffset now = DateTimeOffset.Now;

            foreach (IListBlobItem item in container.ListBlobs(null, false))
            CloudPageBlob blob = (CloudPageBlob)item;

            //If date blob was last modified is more than x days out, then it gets deleted.
            if ((now – blob.Properties.LastModified.Value).Days >= numberOfDays &&
            blob.Name.Substring(blob.Name.Length – fileType.Length) == fileType)

            static void unlockLeasedBlobs(CloudBlobContainer container)
            foreach (IListBlobItem item in container.ListBlobs(null, false))
            CloudPageBlob blob = (CloudPageBlob)item;

            if (blob.Properties.LeaseStatus == LeaseStatus.Locked)
            Console.WriteLine(“Breaking lease on {0} blob.”, blob.Name);
            blob.BreakLease(new TimeSpan(), null, null, null);
            Console.WriteLine(“Successfully broken lease on {0} blob.”, blob.Name);

            catch (StorageException ex)
            Console.WriteLine(“Failed to break lease on {0} blob.”, blob.Name);

            static void deleteBlob(CloudPageBlob blob)
            Console.WriteLine(“Attempting to delete {0}”, blob.Name);
            Console.WriteLine(“Successfully deleted {0}”, blob.Name);
            catch (StorageException ex)
            Console.WriteLine(“Failed to delete {0}.”, blob.Name);


            Categories: DBA Blogs

            Interested in Oracle APEX in Belgium and around? - join ORCLAPEX-BE Meetup

            Dimitri Gielis - Tue, 2014-09-02 07:22
            Over the last months different Oracle Application Express (APEX) meetups were initiated around the world. You can read a bit of background on how the meetups started on Dan's blog.

            I think it's a great idea to have local meetups to meet other people working or interesting in APEX too. When you just start with APEX it's interesting to talk to other people how they got started. But equally for more seasoned developers it's a great place to share ideas and have discussions about where the product and industry is going, all casual and in a relaxed environment.

            That's why we started an APEX Belgium meetup group too. Our first meetup is planend on September 11th in the APEX R&D office in Leuven, Belgium.

            Previously with the Oracle BeNeLux User Group (OBUG) we created one day APEX events, which were great, but had a fixed schedule. The APEX meetups we want to do more frequently and accessible to everybody - a very low entry. We'll try to make all those meetups free-of-charge and flexible in nature. We'll have a presentation about a specific topic, but equally important (or even more important) is that you can network, have a voice and get some help of peers.

            I talked to the OBUG board too and they found it an interesting idea and are happy to sponsor the event. We can enjoy pizza thanks to OBUG :)

            Every local meetup can decide how they organise their meetup.

            Here are the details of the first ORCLAPEX-BE meetup on September 11th (2014):

            The first Oracle APEX meetup will go over the concept of the meetups and as first topic APEX 5.0 will be covered.

            We'll have an "Open Mic" too, so anybody can get on stage for 5-10 minutes and show what they did or share their ideas or ask for some ideas where they struggle with.

            The meetup is completely free.


            • 16.00  Registration
            • 16.30  Outline concept APEX Meet-up
            • 16.45  Presentation APEX 5.0
            • 18.00  Pizza break, Networking and "Open Mic"
            • 21.00  End

            Interesting in Oracle APEX and in Belgium or around? Join the ORCLAPEX-BE meetup now!
            Can't make the 11th? Make you a member of the meetup and receive updates when the next one is :)

            Look forward seeing you.
            Categories: Development

            New Formspider version integrates Git with PL/SQL

            Gerger Consulting - Tue, 2014-09-02 05:24
            Today, we are thrilled to announce the new version of Formspider. Formspider 1.8 opens up possibilities that are currently unavailable to APEX developers.

            Built-In Version Control
            Formspider 1.8 integrates the prominent version control system Git to Formspider. Formspider developers are now able to manage the source code of their applications (both Formspider application objects and PL/SQL code) right within the Formspider IDE.

            Formspider Version Control Highlights:
            • Improved Team Productivity
              Create and work on different branches of your application. Clone your application and have your team edit the same Formspider application objects and PL/SQL code units at the same time without overwriting each others updates. Pull changes from other developers and merge them easily right in the Formspider IDE using the Merge Wizard.
            • Painless Release Management
              Release maintenance versions of a production application while working on new features.
            • Instant Set Up, Seamless Integration
              Formspider automatically installs Git, features both a point&click user interface via the Formspider IDE and a PL/SQL API. Be it SQL Developer, SQL Navigator, PL/SQL Developer or TOAD, Formspider Version Control works with any PL/SQL development tool. 
            Download Formspider 1.8 today.
            Categories: Development

            John Lewis Partnership Transforms Personnel Services for 85,000 Employees by Using a Self-Service Portal

            WebCenter Team - Tue, 2014-09-02 01:00
            Oracle Customer: John Lewis Partnership
            • Location: London, United Kingdom
            • Industry: Retail
            • Employees: 85,000
            • Annual Revenue: Over $5 Billion

            John Lewis Partnership, one of the United Kingdom’s leading retailers, owns and operates 30 John Lewis department stores, 10 John Lewis at-home stores and 302 Waitrose supermarkets. The organization also runs a substantial online business through and, in addition to business-to-business contracts in the UK and abroad.

            John Lewis Partnership is the UK’s largest example of worker co-ownership where all 85,000 employees are partners in the business. Partners are dedicated to delivering excellent customer service and share in the benefits and profits of the business. John Lewis was named “Retailer of the Year 2013” at the Oracle Retail Week Awards 2013.

             A word from John Lewis Partnership

            "With Oracle WebCenter Portal, our partners have secure, straightforward, and instant access to a wide range of human resources services from any location. We also laid a strong foundation for future developments and growth within the organization.”

            – Stephen Oliphant, Technical Application Architect, Oracle Business Service Center, John Lewis Partnership

            • Centralize personnel services by forming a new shared-service center to streamline services, improve efficiency, and offer all employees self-service functionality from the office, home, or geographically dispersed retail stores
            • Save time and costs by replacing manual, time-intensive human resources processes with an easy-to-use, self-service platform that enables employees to be more self-sufficient in managing their human-resources queries
            • Enable the John Lewis Partnership to better manage growth in the quickly changing, customer-centric retail industry by building a strong and flexible infrastructure to improve management of core back-office processes, including personnel management


            • Centralized and transformed personnel services by implementing user-friendly Oracle WebCenter Portal and offering 85,000 employees self-service access to human resources information, such as payroll, holiday and maternity entitlements, and performance review details
            • Eliminated inefficient manual processes to request holidays, administer leave requests, and check payroll details by encouraging employees to use self-service for 60% of all transactions—reducing the time and costs associated with requesting information by telephone, e-mail, or in person
            • Enabled employees to access the portal, 24/7, while at work in the head office or branch offices, via terminals in all retail stores, or at home using secure identity management, providing the most convenient way to review personnel information, ask questions, and make requests
            • Improved efficiency of leave-request reviews and approvals by providing employees and line managers with all relevant information up-front, such as who else is on holiday at a particular time, by integrating the online portal with Oracle Business Intelligence Enterprise Edition
            • Delivered graphical representation of managers’ hierarchy through the portal to help managers navigate the system in a user-friendly way
            • Built manager pages within Oracle Business Intelligence Enterprise Edition to highlight upcoming activities, helping managers plan their schedules in relation to personnel administration duties
            • Established an Oracle WebCenter platform to deliver the flexibility required for additional functionality; further integrations with multiple applications, including Oracle Learning Management; and to deliver additional services throughout John Lewis Partnership in the future
            Why Oracle

            “Oracle is our strategic partner of choice, and by working with a single vendor, we benefit from consistency across technology, service, and support,” said Stephen Oliphant, technical application architect, Oracle business service center, John Lewis Partnership.

            “The scalability of the Oracle solution is very important. One of the key drivers for this transformation project was to support the rate of change and growth within John Lewis Partnership. By partnering with Oracle, we can deploy the right technology to provide a good service to the business and to scale as our organization grows,” said Oliphant.


            Hitachi ConsultingOracle Partner
            Hitachi Consulting

            John Lewis Partnership chose Oracle Diamond Partner Hitachi Consulting as the  lead systems integrator to provide expertise and resources for its organization-wide HR transformation program. Hitachi led a team of partners and worked with John Lewis Partnership from the requirements-gathering phase, through initial design workshops and conference-room pilot sessions, to the final build and testing. Hitachi deployed a large, onsite team and was responsible for functional, technical, and project management, testing, as well as governance.

            As part of this extensive program, Hitachi upgraded John Lewis Partnership’s Oracle E-Business Suite implementation to Release 12 before designing and implementing the Oracle WebCenter solution. It was important that the look and feel of Oracle E-Business Suite was familiar to users, so Hitachi designed the Oracle WebCenter interface to resemble the John Lewis Partnership intranet.

            “Hitachi provided excellent resources throughout this project. The team was more than willing to spend time with us transferring knowledge, helping to ensure that we get the best value from our Oracle implementations into the future,” said Stephen Oliphant, technical application architect, Oracle Business Service Center, John Lewis Partnership.


            John Lewis Partnership Helps 6,500 Suppliers Cut Trading Costs with Free, Self-Service Portal for Managing Accounts and Tracking Invoices and Payments

            More Reference Assets About Retail

            More WebCenter

            Why is Your Data, Your Brand, Your Customers Important (to you)?

            Bradley Brown - Mon, 2014-09-01 22:10
            We often talk about how the data is yours, but what does that mean?We mean what we say.  We don't mix words.  We don't spin words.  It's plain and simple, the data is yours.  The data about who bought what, who watched what, who owns which products, etc. - it's all yours...not ours.  We can't send your customers an email (unless it's from you telling them thank you for their purchase or upselling them).  We can't sell your data to others.  If you want your data, we'll send you your data.  There are no strings attached.
            We talk about how we help you maintain your brand, but why is that important?Sure, you can (and might want to) put your video products into a marketplace like iTunes or Amazon. When you do this, it's no longer your branding, your look and feel, your customer or your sales - it's theirs.  If I were to put my technical training courses such as my iPhone training into iTunes, when someone searched for iPhone training - once they found my training, they could easily get distracted by others offering iPhone training.  However, when they come to my InteliVideo site, they will see my content, marketing message and exactly what I want them to see - not 100 offers for iPhone training.
            We talk about how customers are your customers, but why is that important to you?It's important to you because you want to sell future products to your customers.  You will likely want to upsell your customers.  You will likely want to minimize communication to your customers so that when you target them, you're successful.  It's your customer list to do with as you wish.  This should be important to you.  You WILL NOT get this in a marketplace.

            Context in UX – What It Is, What It Isn’t, and Why It’s Important

            Oracle AppsLab - Mon, 2014-09-01 19:54
            Big Brown Bat (Eptesicus fuscus) in Flight

            Copyright@2012 Bill Kraus, All rights reserved.

            Our location is relentlessly tracked by our mobile devices. Our online transactions – both business and social – are recorded and stored in the cloud. And reams of biometric data will soon be collected by wearables.  Mining this contextual data offers a significant opportunity to enhance the state of human computer interaction. But this begs the question: what exactly is ‘context’ ?

            Consider the following sentence:

            “As Michael was walking, he observed a bat lying on the ground.”

            Now take a moment and imagine this scene in your mind.

            Got it? Good.

            Now a few questions. First, does the nearby image influence your interpretation of this sentence? Suppose I told you that Michael was a biologist hiking through the Amazonian rain forest. Does this additional information confirm your assumptions?

            Now, suppose I told you that the image has nothing to do with the sentence, but instead it’s just a photograph I took in my own backyard and inserted into this post because I have a thing for flying mammals.  Furthermore, what if I told you that Michael actually works as a ball boy at Yankee stadium? Do these additional facts alter your interpretation of the sentence? Finally, what if I confessed that I have been lying to you all along, that Michael is actually in Australia, his last name is Clarke, and that he was carrying a ball gauge? Has your idea of what I meant by ‘bat’ changed yet again? (Hint – Michael Clarke is a star cricket player.)

            The point here is that contextual information – the who, what, where, and when of a situation – provides critical insights into how we interpret data. In pondering the sentence above, providing you with context – either as additional background statements or through presumed associations with nearby content – significantly altered how you interpreted that simple sentence.

            At its essence, context allows us to resolve ambiguities. What do I mean by this? Think of the first name of someone you work with. Chances are good that there are many other people in the world (or at your company if your company is as big as Oracle) with that same first name. But if I know who you are (and ideally where you are) and what you are working on, and I have similar information about your colleagues, then I can make a reasonably accurate guess as to the identity of the person you are thinking of without you having to explicitly tell me anything other than their first name. Furthermore, if I am wrong, my error is understandable to you, precisely because my selection was the logical choice. Were you thinking of your colleague Madhuri in Mumbai that you worked with remotely on a project six months ago? But I guessed the Madhuri that has an office down the hall from you in Redwood City and with whom you are currently collaborating? Ok, I was wrong, but my error makes sense, doesn’t it?   (In intelligent human computer interactions, the machine doesn’t always need be right as long as any errors are understandable. In fact, Chris Welty of IBM’s Watson team has argued that intelligent machines will do very well to be right 80% of the time – which of course was more than enough to beat human Jeopardy champions.)

            So why is the ability to use context to resolve ambiguities important? Because – using our example – I can now take the information derived from context and provide you with a streamlined, personalized user experience that does not require you to explicitly specify the full name of your colleague – in fact, you might not need to enter any name at all if I have enough contextual background about you and what you are trying to do.

            When it comes to UX, context is actually a two-way street. Traditionally, context has flowed from the machine to the user, where layout and workflow – the consequence of both visual and interaction design – has been used to inform the user as to what something means and what to do next.  But as the availability of data and the complexity of systems have grown to the point of overwhelming the user, visualizations and interactions alone are not sufficient to stem the tide. Rather, context – this time emanating from the user to the machine – is the key for achieving  a more simplified, personalized user experience.

            Context allows us to ask the right questions and infer the correct intentions. But the retrieval of the actual answers – or the execution of the desired task – is not part of context per se. For example, using context based on user identity and past history (demographic category, movies watched in the past) can help a recommendation engine provide a more targeted search result. But context is simply used to identify the appropriate user persona – the retrieval of recommendations is done separately. Another way to express this is that context is used to decide which view to put on the data, but it is not the data itself.

            Finally, how contextual information is mapped to appropriate system responses can be divided into two (not mutually exclusive) approaches, one empirical, the other deductive. First, access to Big Data allows the use of machine learning and predictive analytics to discern patterns of behavior across many people, mapping those patterns back to individual personas and transaction histories. For example, if you are browsing for a banana slicer and Amazon’s analytics show that people who spend a lot of time on the banana slicer page also tend to buy bread slicers, then you can be sure you will see images of bread slicers.

            But while Big Data can certainly be useful, it is not required for context to be effective. This is particularly true in enterprise, where reasonable assumptions can be made from a semantic understanding of the underlying business model, and where information-rich employee data can be mined directly by the company. Are you a salesperson in territory A with customers X, Y, and Z? Well then it is safe to assume that you are interested in the economic climate in A as well as news about X, Y, and Z without you ever having to explicitly say so.

            So in closing, the use of context is essential for creating simple yet powerful user experiences – and like the term ‘user experience’ itself, there is no one single implementation of context – rather, it is a concept that should pervade all aspects of human computer interaction in its myriad of forms.Possibly Related Posts:

            Oracle Cloud Friendly - Red Samurai ADF Performance Audit

            Andrejus Baranovski - Mon, 2014-09-01 13:04
            I have deployed our tool for ADF performance monitoring to Oracle Java Cloud service. It runs perfectly on the cloud, monitors slow performance and allows to analyse collected performance data. All data is stored in Oracle Database Cloud.

            You can access performance monitoring dashboard using this address. Access will be available for a month or so, until my trial account will expire. You would need to use following login credentials - user: redsam, password: We1come@, identity: ltredsamuraictrial99050. You can login and play online with our performance audit dashboard.

            There is ADF demo application deployed as well, Summit App - accessible here (same login credentials). This standard demo application is extended with our audit listener to log slow ADF performance into Oracle Database Cloud instance. Use it to generate some audit data, simply by navigating through the application, updating and inserting data.

            Red Samurai ADF Performance Audit tool and standard ADF sample application Summit are deployed to Oracle Java Cloud:

            This is a main screen of Summit sample application (accessible here), I'm using it to track ADF performance. You could open this application, do several actions and check in performance dashboard to see if any slow actions were reported:

            Red Samurai ADF Performance Audit tool, main dashboard displays various performance issues (accessible here) logged. Audited application is fairly basic, this is why I have set performance thresholds to relatively low values - VO execution time to 10 milliseconds, large fetch to 50 rows and slow activation to 300 milliseconds:

            You could select issue from the list and check details -Application Module, View Object names. SQL statements with bind variables are available for slow SQL executions. There are very helpful features available, such as historical performance analysis option and application health automatic calculation.

            Slow activation tab displays slow activation events, this helps to locate slowest activation Application Modules and possibly to eliminate such slow activations by Application Module parameters tuning:

            Drill down displays large fetch and full scan issues for selected Application Module. The most problematic VO instances are displayed first in the graph:

            Same graph can be displayed in table view (very helpful for performance troubleshooting) with special sorting method:

            You could go and view weekly/monthly history for reported performance issues based on individual VO instance:

            Overall system performance screen displays system statistics - queries, transactions, logged users, activations:

            System load summary could be helpful, it allows to understand general system load and see the most frequent operations:

            SLOB Patch. AWR Post-Processing Script ( Fix.

            Kevin Closson - Mon, 2014-09-01 11:11

            BLOG UPDATE 2014.09.11: Please note: the following is a link to a more recent update of the script. This version adds DB Time, DB CPU and Logical I/O: click here. The MD5 sum for this version of is:  a28a38b11040bb94f08a8f817792c75c

            The SLOB kit comes with a little script that extracts interesting information from the awr.txt file produced at the end of a SLOB test. This is just a quick blog entry to point folks to a patched version of that works properly with all Oracle Database 11g releases as well as Oracle Database 12c.

            Oracle changed AWR format in the and 12c releases so the old script (in the publicly available SLOB kit) has been faulty for some time now.

            I have a release of SLOB in the works that will include this as well as improved data loader and improvements to the driver script ( that includes optional, tunable think time between iterations of the SLOB work loop in slob.sql. For the time being please get a copy of the patched version of

            New Output

            This version of also gleans and outputs logical read (SGA buffer pool cached block accesses) data.

            The following screen shot shows the patched generating proper output for awr.txt files collected by SLOB databases running out of the, and 12c releases.


            The following picture is what Microsoft Excel looks like when I cut and paste the output of I’ve highlighted the new column for logical reads.


            Exadata Cell Single Block Physical Reads?

            Yes, the above picture does show AWR output from a run where the top wait event was cell single block physical read. Exadata? Yes! That’s because SLOB users often share their testing results from the Exadata platform.  However, I do not get enough Exadata AWR reports to work through all of the issues related to Exadata. To that end, latency information is not calculated and presented as is the case with db file sequential read. For what it’s worth this particular AWR report shows Exadata single block reads serviced with average latencies of 507 microseconds ( 7233/14256602).

            Where To Get The Patch?

            The following is a link to my syncplicity share with the README for this patch: Click here for the patch README.
            Finally, the patched version of is available here:  Click here for the latest script.  Please verify the MD5 sum after you download this script to match a28a38b11040bb94f08a8f817792c75c.

            Filed under: oracle

            Oracle and Data Warehouses

            Rittman Mead Consulting - Mon, 2014-09-01 09:21

            If you follow Blogs and Tweets from the Oracle community you won’t have missed hearing about the recent release of the first patch-set for Oracle 12c. With this release there are some significant pieces of new functionality that will be of interest to Data Warehouse DBAs and architects. The headline feature that most Oracle followers will know of is the new in-memory option. In my opinion this is a game-changer for how we design reporting architectures; it gives us an effective way to build operational reporting over the reference data architecture described by Mark Rittman a few weeks ago. Of course, the database team here at Rittman Mead have been rolling up our sleeves and getting into in-memory technology for quite a while now, Mark even featured in the official launch presentation by Larry Ellison with the now famous “so easy it’s boring” quote. Last week Mark published the first of our Rittman Mead in-memory articles, with the promise of more in-memory articles to come including my article for the next edition of UKOUG’s “Oracle Scene”.

            However, the in-memory option is not the only new feature that is going to be a benefit to us in the BI/DW world. One of the new features I am going to describe is Exadata only, but the first one I am going to mention is generally available in the database.

            Typically, data warehouse queries are different from those seen in the OLTP world – in DW we tend to access a large number of rows and probably aggregate things up to answer some business question. Often we are not using indexes and instead scanning tables or table partitions is the norm. Usually, the data we need to aggregate is widely scattered across the table or partition. Data Warehouse queries often look at records that share a set of common attributes; we look at the sales for the ‘ACME’ widget or the value of items shipped to Arizona. For us there can be great advantage if data we use together is stored together, and this is where Attribute Clustering can pay a part.

            Attribute Clustering is usually configured on the table at at DDL time and in-effect controls the ordering of data inserted by DIRECT PATH operations, Oracle does not enforce this ordering for conventional inserts, this may not be an issue in data warehouses as bulk-batch operations typically use APPEND inserts, which are direct path inserts, or partition operations, it may be more of an issue with some of the real-time conventional path loading paradigms. In addition to Direct Path load operations Attribute Clustering can also occur when you do Alter table MOVE type operations (this also includes operations such as PARTITION SPLIT). On the surface, Attribute Clustering sounds little different to using an ORDER by on an append insert and hoping that Oracle actually stores the data where you expect it to. However, Attribute Clustering gives us two other possibilities in how we can order the data in the cluster.

            Firstly, we can cluster on columns from JOINED dimension tables, for example in a SALES DW we may have a sales fact with a product key at the SKU level, but we often join to the product dimension and report at the Product Category level. In this case we can cluster our sales fact table so that each product category appears in the same cluster. For example, we have just opened a chain a supermarkets with a wide but uninspiring range of brands and products (see the tiny piece of our product dimension table below)


            As you can see, our Product PK has no relationship at all to the type of product being sold. In our Kimball-style data warehouse we typically store the product key on the fact table and join to the product dimension to obtain all of the other product attributes and hierarchy members. This is essentially what we can do with join Attribute Clustering, in our example we can cluster our fact table on PRODUCT_CATEGORY so that all of the Laundry sales are physically close to each other in the Fact table.

            CREATE TABLE rm_sales (
            product_idNUMBER NOT NULL,
            store_id        NUMBER NOT NULL,
            sales_date      DATE NOT NULL,
            loyalty_card_id NUMBER ,
            quantity_sold  NUMBER(3) NOT NULL,
            value_sold    NUMBER(10,2) NOT NULL
            rm_sales JOIN products ON (rm_sales.product_id = products.product_pk)
            BY LINEAR ORDER (sales_date, product_category, store_id);

            Notice we are clustering on a join to the product dimension table’s “product_category” column, we are also clustering on sales_date, this is especially important in the case of partitioned fact tables so that the benefits of clustering align to the partitioning strategy.  We are also not restricted in our clustering to just one join, if we wanted to we could also cluster our sales by store region e.g. the Colorado laundry product sales are located in the same area of the sales table. To use Join Attribute Clustering we need to define the PK / FK relationships between fact and dimension, however it is always good practice to have that in place as it helps the CBO so much with query plan evaluation

            Secondly, notice the BY LINEAR ORDER clause in the table DDL. Of the two ordering options, Linear Order is the most basic form of clustering, it this case we have our data structured so that all the items for a sales day are clustered together and within that cluster we order by product category and those categories are in turn ordered by store_id. The other way we can cluster is BY INTERLEAVED ORDER; here, Oracle maps a combination of dimensional values to a single cluster value using a z-order curve fitting approach. This sounds complex but it ensures that items that are frequently queried together are co-located in the disk blocks in the storage.

            Interleaved ordering is probably the best choice for data warehousing at it aligns well with how we access data in our queries. Although we could include all of the dimension keys in our ordering list, it is going to be more benefit to just include a subset of dimensions; typically for retail I’d go with DATE (or something that correlates to the time based partition key of the fact table), the product  and the store. Of course we can again join to the dimension tables and cluster at higher hierarchy levels such as product category and store region. The Oracle 12c Data Warehousing guide gives some good advice, but you can’t go far wrong if you think about clustering items together that will be queried together

            Clustering data can give us some advantages in performance. Better data compression and improved index range scans spring to mind, but to get most benefits we should also look at another new feature, zone-maps. Unlike Attribute Clustering, Zone Maps are Engineered Systems only, In a way they are similar to storage indexes already found on Exadata, but they have some additional advantages, they are also somewhat different from zone maps encountered in other DB vendors’ products such as Netezza.

            In Exadata, a storage index can provide the maximum and minimum values encountered for a column in storage cell. I say “can” as there is no guarantee that range for a given column is held in the storage index. Zone Maps on the other hand will always provide maxima and minima for all of the columns specified at zone map creation. The zone map is orientated in terms of contiguous database blocks and is materialized so that it is physically persisted in the database and thus survives DB startups. Like Materialized views Materialized zone maps can become stale and need to be maintained.

            We can define a zone map on one or more table columns and just like Attribute Clustering we may also create zone maps on table joins. As a table can only have one zone map it is important to include all of the columns you wish to track. Zone Maps are designed to work well with attribute clustering, in fact it is just a simple DDL statement to add a zone-map to an Attribute Clustered table so that the zone map tracks the same attributes as the clustering. This is where we get the major performance boost from attribute clustering, Instead of looking at the whole table the zone map tells us which ranges of database blocks contain data that matches our query predicates.

            Zone Maps with Attribute Clustering gives us another powerful tool to boost DW performance on Exadata – we can do star queries without resorting to bitmap indexes and we minimise IO when scanning fact tables as we only need look where we know the data to be. Exciting times!

            Categories: BI & Warehousing