Skip navigation.

Rittman Mead Consulting

Syndicate content Rittman Mead Consulting
Delivering Oracle Business Intelligence
Updated: 1 hour 30 min ago

Contemplating Upgrading to OBIEE 12c?

Thu, 2016-04-28 04:00
Where You Are Now

OBIEE 12c has been out for some time, and it seems like most folks are delaying upgrading to OBIEE 12c until the very last minute. Or at least until Oracle decides to put out another major version change of OBIEE, which is understandable. You’ve already spent time and money and devoted hundreds of resource hours to system monitoring, maintenance, testing, and development. Maybe you’ve invested in staff training to try to maximize your ROI in your existing OBIEE purchase. And now, after all this time and effort, you and your team have finally gotten things just right. Your BI engine is humming along, user adoption and stickiness are up, and you don’t have a lot of dead objects clogging up the Web Catalog. Your report hacks and work-arounds have been worked and reworked to become sustainable and maintainable business solutions. Everyone is getting what they want.

Sure, this scenario is part fantasy, but it doesn’t mean that as a BI team lead or member, you’re not always working toward this end. It would be nice to think that the people designing the tools with which we do this work understood the daily challenges and processes we must undergo in order to maintain the precarious homeostasis of our BI ecosystems. That’s where Rittman Mead comes in. If you’re considering upgrading to OBIEE 12c, or are even curious, keep reading. We’re here to help.

So Why Upgrade

Let’s get right down to it. Shoot over here and here to check out what our very own Mark Rittman had to say about the good, the bad, and the ugly of 12c. Our Silvia Rauton did a piece on lots of the nuts and bolts of 12c’s new front-end features. They’re all worth a read. Upgrading to OBIEE 12c offers many exciting new features that shouldn’t be ignored.

Heat Map

How Rittman Mead Can Help

We understand what it is to be presented with so many project challenges. Do you really want to risk the potential perils and pitfalls presented by upgrading to OBIEE 12c? We work both harder and smarter to make this stuff look good. And we get the most out of strategy and delivery via a number of in-house tools designed to keep your OBIEE deployment in tip top shape.

Maybe you want to make sure all your Catalog and RPD content gets ported over without issue? Instead of spending hours on testing every dashboard, report, and other catalog content post-migration, we’ve got the Automated Regression Testing package in our tool belt. We deploy this series of proprietary scripts and dashboards to ensure that everything will work just the way it was, if not better, from one version to the next.

Maybe you’d like to make sure your system will fire on all cylinders or you’d like to proactively monitor your OBIEE implementation. For that we’ve got the Performance Analytics Dashboards, built on the open source ELK stack to give you live, active monitoring of critical BI system stats and the underlying database and OS.

OBIEE Performance Analytics

On top of these tools, we’ve got the strategies and processes in place to not only guarantee the success of your upgrade, but to ensure that you and your team remain active and involved in the process.

What to Expect

You might be wondering what kinds of issues you can expect to experience during upgrading to OBIEE 12c (which is to say, nothing’s going to break, right?). Are you going to have to go through a big training curve? Does upgrading to OBIEE 12c mean you’re going to experience considerable resource downtime as your team, or an even an outside company, manages this process? To answer this question, I’m reminded of a quote from the movie Fight Club: “Choose your level of involvement.”

While we always prefer to work alongside your BI or IT team to facilitate the upgrade process, we also know that resource time is valuable and that your crew can’t stop what they’re doing until things wraps up. We often find that the more clients are engaged with the process, however, the easier the hand-off is because clients better understand best practices, and IT and BI teams are more empowered for the future.

Learning More about OBIEE 12c

But if you’re like many organizations, maybe you have to stay more hands off and get training after the upgrade is complete. Check out the link here to look over the agenda of our OBIEE 12c Bootcamp training course. Like our hugely popular 11g course, this program is five days of back-to-front instruction taught via a selection of seminars and hands-on labs, designed to impart most everything your team will need to know to continue or begin their successful BI practice.

What we often find is that, in addition to being a thorough and informative course, the Bootcamp is a great way to bring together teams or team members, often dispersed among different offices, under one roof to gain common understanding about how each person plays an important role as a member of the BI process. Whether they handle the ETL, data modeling, or report development, everyone can benefit from what often evolves from a training session into some impromptu team building.

Feel Empowered

If you’re still on the fence about whether or not to upgrade, as I said before, you’re not alone. There are lots of things you need to consider, and rightfully so. You might be thinking, “What does this mean for extra work on the plates of my resources? How can I ensure the success of my project? Is it worth it to do it now, or should I wait for the next release?” Whatever you may be mulling over, we’ve been there, know how to answer the questions, and have some neat tools in our utility belt to move the process along. In the end, I hope to have presented you with some bits to aid you in making a decision about upgrading to OBIEE 12c, or at least the impetus to start thinking about it.

If you’d like any more information or just want to talk more about the ins and outs of what an upgrade might entail, send over an email or give us a call.

The post Contemplating Upgrading to OBIEE 12c? appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Data Integration Tips: Oracle Data Integrator 12c Password

Sun, 2016-04-24 23:12

Hey, everyone. It’s Sunday night, and we have just enough time for another Data Integration Tip from Rittman Mead. This one has originated from many years of Oracle Data Integrator experience—and a lost 12c password. Let me start first by stating there is never any blame placed when a password is lost, forgotten, or just never stored in a safe place. It happens more often than you might think! Unfortunately, there is no “Forgot password?” link in ODI 12c, which is why I wanted to share my approach to password recovery for these situations.


The Challenge: Lost 12c Password

There are typically two passwords used in Oracle Data Integrator 12c that are forgotten and difficult to recover:

  1. The Work Repository password, created during the setup of the ODI repositories.
  2. The SUPERVISOR user password.

Often there will be more than one ODI user with supervisor privileges, allowing the SUPERVISOR user account password to be reset and making everyone’s life a bit easier. With that, I’ll focus on the Work Repository password and a specific use case I ran into just recently. This approach will work for both lost 12c password instances and I have used it for each in the past.


Now, yes, there is a feature that allows us to change the Work Repository password from within ODI Studio. But (assuming you do have the ability to edit the Work Repository object) as you can see in the image, you also need to know the “current password.” Therein lies the problem.

The Scenario

OK, here we go. The situation I ran into was related to an ODI 11g to 12c upgrade. During the upgrade, we cloned the master and work repositories and set them up on a new database instance in order to lessen the impact on the current 11g repositories. To make this work, a few modifications are required after cloning and before the ODI upgrade assistant can be run. Find more details on these steps in Brian Sauer’s post, Upgrade to ODI 12c: Repository and Standalone Agent.

  • Modify the Work repository connection from within the Master repository. The cloned Master repository is still pointed to the original ODI 11g Work Repository and the connection must be updated.
  • Update the SYSTEM.SCHEMA_VERSION_REGISTRY$ table to add an entry for the cloned ODI repository in the new database instance.
  • Detach the Work Repository from the original Master Repository.

Easy enough. The upgrade assistant completed successfully and everything was working great during testing, until we attempted to open the Work Repository object in ODI:

“Work repository is already attached to another master repository”

Uh-oh. It seems the last bullet point above was skipped. No worries. We have a simple solution to this problem. We can detach the Work Repository from the Master, then attach it once again. Interestingly enough, the action of detaching the repository cleans up the metadata and allows the Work Repository to be added to the cloned master with no problem.

Detaching is easy. Just confirm that you want to remove the Work Repository and poof, it’s gone. It’s the reattaching where we run into an issue…our lost 12c password issue (you knew I was going to bring that up, didn’t you?). Adding a Work repository requires a JDBC connection to a new or existing repository. In this case, we choose the existing repository in our cloned database. The same one we just detached from the Master. Just make sure that you choose to keep the repository contents or you’ll have a much bigger challenge ahead of you.

But then, out of nowhere, we’re prompted for the Work Repository password.


Hmm…well, we set the ODI 11g repository up in 2011. Jim, who installed it for us, doesn’t work here any longer. “Hmm” is right!

Here’s the Tip

Before we go any further, full disclosure—this is most likely not considered a supported action in the eyes of Oracle, so beware. Warning SignAlso, I haven’t attempted to use the ODI SDK and a Groovy script to update a password, so that might be the way to go if you’re concerned about this being a hack. But desperate times require desperate measures, as they say.

In order to “recover” a password for the Work Repository, we must actually change it behind the scenes in the repository tables. There’s a great deal of metadata we can access via the repository schema, and the modification of this data via the schema is not typical nor recommended, but sometimes necessary.

Oracle Support has a Knowledge Base document, Oracle Data Integrator 11g and 12c Repository Description (Doc ID 1903225.1), which provides a nice data dictionary for the repositories. Looking at the ODI 12.2.1 version of the repository definition, we find that the table SNP_LOC_REPW in the Work Repository stores the value for the repository password in the column REP_PASSW. Now the password must be encoded to match the repository and environment, so it cannot simply be added to the table in plain text.

Encoding a password is something that Oracle Data Integrator developers and admins have been doing for years, most often when setting up a Standalone agent. As a part of the agent installation, there is a script called (or encode.bat for Windows) that will accept a plain text password as a parameter and output the encoded string. Brilliant! Let’s try it out.

Browse to the ODI agent domain home and drill into the bin directory. From there, we can execute the encode command. A quick look at the script shows us the expected input parameters.


The instance name is actually the Agent name. Ensure the agent is running and fire off the script:

[oracle@ODIGettingStarted bin]$ ./ -INSTANCE=OGG_ODI_AGENT
2016-04-24 22:00:50.791 TRACE JRFPlatformUtil:Unable to obtain JRF server platform. Probably because you are in JSE mode where oracle.jrf.ServerPlatformSupportFactory is not available which is expected.
2016-04-24 22:00:56.855 NOTIFICATION New data source: [OGG_ODI_REPO/*******@jdbc:oracle:thin:@//localhost:1521/ORCL]
2016-04-24 22:01:01.931 NOTIFICATION Created OdiInstance instance id=1
Enter password to encode:

Now you can enter a password to encode, hit return and boom! Here’s your encoded string.

Enter password to encode:

Let’s take the entire string and write a quick update statement for the Work Repository SNP_LOC_REPW table. Even though I know there is only one Work Repository, I still use a where clause to ensure I’m updating the correct row.

set REP_PASSW = 'ejjYhIeqYp4xBWNUooF31Q=='

Commit the transaction and Bob’s your uncle! Now we can continue on with adding the Work Repository through ODI Studio. Just enter the password used in the command and you’re in!

As I mentioned earlier, this same approach can be used to update the SUPERVISOR user password, or really any ODI user password (if they are stored in the repository). In this case, the use of is the same, but this time we update the SNP_USER table in the Master repository. The column PASS stores the encoded password for each user. Just remember to change the password everywhere that the user is set to access ODI (agents, etc).

So there you have it. A quick, simple way to “recover” a lost ODI 12c password. Just be sure that this information doesn’t fall into the wrong hands. Lock down your ODI agent file directory to only those administrators who require access. Same goes for the repository schemas. And finally, use this approach in only the most dire situation of a completely lost 12c password. Thanks for reading and look here if you want more DI Tips. Enjoy your week!

The post Data Integration Tips: Oracle Data Integrator 12c Password appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Rittman Mead at Collaborate 16: Data Integration Focus

Mon, 2016-04-04 03:59

It’s that time of year again when Oracle technologists from around the world gather in Las Vegas, Nevada, to teach, learn, and, of course, network with their peers. The Collaborate 16 conference, running for 10 years now, has been a collaboration, if you will, between the Independent Oracle Users Group (IOUG), Oracle Applications Users Group (OAUG), and Quest International Users Group (Quest), making it one of the largest user group conferences in the world. Rittman Mead will once again be in attendance, with two data integration focused presentations by me over the course of the week.

My first session at Collaborate 16, “A Walk Through the Kimball ETL Subsystems with Oracle Data Integration,” scheduled for Monday, April 11, at 10:30 a.m., will focus on how we can implement the ETL Subsystems using Oracle Data Integration solutions. As you know, Big Data integration has been the hot topic over the past few years, and it’s an excellent feature in the Oracle Data Integration product suite (Oracle Data Integrator, GoldenGate, & Enterprise Data Quality). But not all analytics require big data technologies, such as labor cost, revenue, or expense reporting. Ralph Kimball, dimensional modeling and data warehousing expert and founder of The Kimball Group, spent much of his career working to build an enterprise data warehouse methodology that can meet these reporting needs. His book, “The Data Warehouse ETL Toolkit,” is a guide for many ETL developers. This session will walk you through his ETL Subsystem categories: Extracting, Cleaning & Conforming, Delivering, and Managing, describing how the Oracle Data Integration products are perfectly suited for the Kimball approach.

I go into further detail on one of the ETL Subsystems in an upcoming IOUG Select Journal article, titled “Implement an Error Event Schema with Oracle Data Integrator.” The Select Journal is a technical magazine published quarterly and available exclusively to IOUG members. My recent post Data Integration Tips: ODI 12c Repository Query – Find the Mapping Target Table shows a bit of the detail behind the research performed for the article.


If you’re not familiar with the Kimball approach to data warehousing, I definitely would recommend reading one (or more) of their published books on the subject. I would also recommend attending one of their training courses, but unfortunately for the data warehousing community, the Kimball Group has closed shop as of December 2015. But hey, the good news is that two of the former Kimball team members have joined forces at Decision Works, and they offer the exact same training they used to deliver under The Kimball Group name.

GoldenGate to Kafka logo

On Thursday, April 14, at 11 a.m., I will dive into the recently released Oracle GoldenGate for Big Data 12.2 in a session titled “Oracle GoldenGate and Apache Kafka: A Deep Dive into Real-Time Data Streaming.” The challenge for us as data integration professionals is to combine relational data with other non-structured, high volume and rapidly changing datasets, known in the industry as Big Data, and transform it into something useful. Not just that, but we must also do it in near real-time and using a big data target system such as Hadoop. The topic of this session, real-time data streaming, provides us a great solution for that challenging task. By combining GoldenGate, Oracle’s premier data replication technology, and Apache Kafka, the latest open-source streaming and messaging system for big data, we can implement a fast, durable, and scalable solution.

If you plan to be at Collaborate 16 next week, feel free to drop me a line in the comments, via email at, or on Twitter @mRainey. I’d love to meet up and have a discussion around my presentation topics, data integration, or really anything we’re doing at Rittman Mead. Hope to see you all there!

The post Rittman Mead at Collaborate 16: Data Integration Focus appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

ChitChat: The Importance of BI Integrations

Thu, 2016-03-31 04:00

A user’s workflow shouldn’t change to accommodate a new tool. A new tool should fill a gap in the current workflow and help streamline the user’s process. An application without a clearly defined scope eventually overlaps with existing solutions, creating confusion and distress among users. It takes both time and effort to clarify the appropriate situations to use the application, reconcile different use cases and approaches, and resolve incorrect uses. We designed ChitChat with appropriate scopes in mind, implementing key integrations, to fit seamlessly into existing workflows.

What exactly do we mean by “scope?”

Let’s look at an example with JIRA. JIRA owns the complete ticketing process, meaning tickets are stored and maintained by the tool. Using a competing ticket solution, such as Trello, for the same purpose within the organization will cause havoc among users. However, JIRA tickets are still extremely useful outside of the JIRA application. They can be linked to and displayed inside other applications, but they are still maintained by JIRA itself.

If you can recognize that the ticketing management should be handled solely by JIRA, but exposure of those tickets outside of the tool is also important, then you understand the correct scope of the application. The scope of the application does not determine where the context of an application is useful. It only describes what section of a workflow the application has absolute control over. The question isn’t “Where should we be able to view the information?” The question is “Where should the content be maintained?”

ChitChat respects the appropriate scopes of neighboring applications and allows the flexibility to continue maintaining the scopes of these applications. With integrations to Atlassian JIRA and Confluence and Salesforce Chatter, the information you need is available where you need it, without infringing on your existing workflow.

Examples of Integrations

Let’s look at some examples. As we use a BI dashboard, we stumble upon an issue. Using ChitChat, the issue can be identified and a conversation can be made about temporarily working around the problem. However, the IT team uses JIRA to accept issues and resolves them as appropriate. We obviously want the IT team to know of this issue, so we must create a ticket in JIRA as well. Rather than going to JIRA and creating a ticket manually, we can simply export the initial annotation to JIRA. The workflow remains generally identical, but now requires less time and effort. And this comes with the added benefit of the ticket pointing directly to the location of the issue on the dashboard.

In another instance, let’s say our dashboard has some confusing calculations on it, some of which are not immediately recognizable. The formulas used, and the reasons to use such formulas, are available in Atlassian Confluence for us to view. However, not all users have a Confluence account, and even fewer have access to the document. We could copy and paste the calculations as a document using ChitChat, but now we have two separate instances of the same information. If the calculations are changed, we must ensure both locations are accurate. Alternatively, ChitChat can sync directly with Confluence and pull a page into the application. The page guarantees accuracy by consistently pulling new updates from Confluence, as well as pushing updates to Confluence if the content is changed in ChitChat.

These approaches allow the JIRA ticket and Confluence document to be maintained in the appropriate location, while also being available in a useful context. Chitchat does not impede on the purposes of other applications. ChitChat offers integrations that seamlessly enhance your workflow without making it convoluted. Our tool is designed specifically to fill the missing pieces in your BI workflow, allowing for a seamless transition between analysis and communication.

To learn more about ChitChat’s many commentary features, or to request a demo, click here.

The post ChitChat: The Importance of BI Integrations appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

New OTN Article – OBIEE Performance Analytics: Analysing the Impact of Suboptimal Design

Wed, 2016-03-30 02:09

I’m pleased to have recently had my first article published on the Oracle Technology Network (OTN). You can read it in its full splendour and glory(!) over there, but I thought I’d give a bit of background to it and the tools demonstrated within.

OBIEE Performance Analytics Dashboards

One of the things that we frequently help our clients with is reviewing and optimising the performance of their OBIEE systems. As part of this we’ve built up a wealth of experience in the kind of suboptimal design patterns that can cause performance issues, as well as how to go about identifying them empirically. Getting a full stack view on OBIEE performance behaviour is key to demonstrating where an issue lies, prior to being able to resolve it and proving it fixed, and for this we use the Rittman Mead OBIEE Performance Analytics Dashboards.

OBIEE Performance Analytics

A common performance issue that we see is analyses and/or RPDs built in such a way that the BI Server inadvertently returns many gigabytes of data from the database and in doing so often has to dump out to disk whilst processing it. This can create large NQS_tmp files, impacting the disk space available (sometimes critically), and the disk I/O subsystem. This is the basis of the OTN article that I wrote, and you can read the full article on OTN to find out more about how this can be a problem and how to go about resolving it.

OBIEE implementations that cause heavy use of temporary files on disk by the BI Server can result in performance problems. Until recently in OBIEE, it was really difficult to track because of the transitory nature of the files. By the time the problem had been observed (for example, disk full messages), the query responsible had moved on and so the temporary files deleted. At Rittman Mead we have developed lightweight diagnostic tools that collect, amongst other things, the amount of temporary disk space used by each of the OBIEE components.


This can then be displayed as part of our Performance Analytics Dashboards, and analysed alongside other performance data on the system such as which queries were running, disk I/O rates, and more:

OBIEE Temp Disk Usage

Because the Performance Analytics Dashboards are built in a modular fashion, it is easy to customise them to suit specific analysis requirements. In this next example you can see performance data from Oracle being analysed by OBIEE dashboard page in order to identify the cause of poorly-performing reports:

OBIEE Database Performance Analysis

We’ve put online a set of videos here demonstrating the Performance Analytics Dashboards, and explaining in each case how they can help you quickly and accurately diagnose OBIEE performance problems.

You can read more about our Performance Analytics offering here, or get in touch to find out more!

The post New OTN Article – OBIEE Performance Analytics: Analysing the Impact of Suboptimal Design appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

The Importance of BI Documentation

Thu, 2016-03-17 04:00
Why Is BI Documentation Important?

Business intelligence systems come with a lot of extra information. Even beautifully constructed analyses have piles of background information and histories. Administrators might often have memos and updates that they’d like share with analysts. Sales figures might have anomalies that need further explanation. But OBIEE does not currently have any options for BI Documentation inside the dashboard.

Let’s say a BI user for a cell phone distribution company is viewing a report comparing the yearly sales figures for several different cell phones. If the analyst notices that one specific cell phone is outperforming the others, but doesn’t know what makes that specific model unique, then they have to go searching for that information.

But what if the individual phone model specifications and advertising and marketing histories were already included as reports inside the dashboard? What if the analyst, with only a couple of clicks, discovered that the reason one cell phone was outperforming the others was due to its next-gen screen, camera, and chip upgrades, which proved popular with consumers? Or what if the analyst discovered that the popular phone, while containing outdated peripherals, was selling so well because a Q3 advertising push for that model only? All of this information might not be contained in the dashboard’s visuals, but greatly affects the analysts’ understanding of the reports.

Current Options for OBIEE Documentation

Some information can be displayed as visuals, but many times this isn’t a practical solution. Besides making dashboards too cluttered, memos, product descriptions, company directories, etc., are not practical as charts and graphs. As of right now, important documentation can be stored in a wide range of places outside of the BI dashboard, but the operating reality at most organizations means that important information is spread across several locations and not always accessible to the people who need it.

Workarounds are inefficient, cost time, cause BI users to leave the BI environment (potentially reducing usage), and increase frustration. If an analyst has to email several different people to locate the information she wants, that complicates her workflow and produces extraneous communications (who likes answering emails?). Before now, there wasn’t an easy solution to these problems.

ChitChat’s BI Documentation Features

With ChitChat, it’s now possible to store critical documentation where it belongs—at the source of the conversation. Keep phone directories, memos from administrators (or requests from analysts to administrators), product descriptions, analytical histories—really, the possibilities are endless—inside the dashboard where they are accessible to the people who need them. Shorten workflows and make life easier for your BI users.

ChitChat’s easy-to-use functionality allows BI users to copy and paste or write (ChitChat has a built-in WYSIWYG text editor) important information inside the BI dashboard, creating a quicker path to insightful and actionable analytics. And isn’t that the goal in the end?

To learn more about ChitChat’s many commentary features, or to request a demo, click here.

The post The Importance of BI Documentation appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

ASO Slice Clears – How Many Members?

Mon, 2016-03-14 04:00

Essbase developers have had the ability to (comparatively) easily clear portions of our ASO cubes since version 11.1.1, getting away from fiddly methods involving manually contra-ing existing data via reports and rules files, making incremental loads substantially easier.

Along with the official documentation in the TechRef and DBAG, there are a number of excellent posts already out there that explain this process and how to effect “slice clears” in detail (here and here are just two I’ve come across that I think are clear and helpful). However, I had a requirement recently where the incremental load was a bit more complex than this. I am sure people must have fulfilled in the same or a very similar way, but I could not find any documentation or articles relating to it, so I thought it might be worth recording.

For the most part, the requirements I’ve had in this area have been relatively straightforward—(mostly) financial systems where the volatile/incremental slice is typically a months-worth (or quarters-worth) of data. The load script will follow this sort of sequence:

  • [prepare source data, if required]
  • Perform a logical clear
  • Load data to buffer(s)
  • Load buffer(s) to new database slice(s)
  • [Merge slices]

With the last stage being run here if processing time allows (this operation precludes access to the cube) or in a separate routine “out of hours” if not.

The “logical clear” element of the script will comprise a line like (note: the lack of a “clear mode” argument means a logical clear; only a physical clear needs to be specified explicitly):

alter database ‘Appname‘.’DBName‘ clear data in region ‘{[Jan16]}’

or more probably

alter database ‘Appname‘.’DBName‘ clear data in region ‘{[&CurrMonth]}’

i.e., using a variable to get away from actually hard coding the member values to clear. For separate year/period dimensions, the slice would need to be referenced with a CrossJoin:

alter database ‘Appname‘.’DBName‘ clear data in region ‘Crossjoin({[Jan]},{[FY16]})’ alter database ‘${Appname}’.’${DBName}’ clear data in region ‘Crossjoin({[&{CurrMonth]},{[&CurrYear]})’

which would, of course, fully nullify all data in that slice prior to the load. Most load scripts will already be formatted so that variables would be used to represent the current period that will potentially be used to scope the source data (or in a BSO context, provide a FIX for post-load calculations), so using the same to control the clear is an easy addition.

Taking this forward a step, I’ve had other systems whereby the load could comprise any number of (monthly) periods from the current year. A little bit more fiddly, but achievable: as part of the prepare source data stage above, it is relatively straightforward to run a select distinct period query on the source data, spool the results to a file, and then use this file to construct that portion of the clear command (or, for a relatively small number, prepare a sequence of clear commands).

The requirement I had recently falls into the latter category in that the volatile dimension (where “Period” would be the volatile dimension in the examples above) was a “product” dimension of sorts, and contained a lot of changed values each load. Several thousand, in fact. Far too many to loop around and build a single command, and far too many to run as individual commands—whilst on test, the “clears” themselves ran satisfyingly quickly, it obviously generated an undesirably large number of slices.

So the problem was this: how to identify and clear data associated with several thousand members of a volatile dimension, the values of which could change totally from load to load.

In short, the answer I arrived at is with a UDA.

The TechRef does not explicitly say or give examples, but because the Uda function can be used within a CrossJoin reference, it can be used to effect a clear: assume the Product dimension had an UDA of CLEAR against certain members…

alter database ‘Appname‘.’DBName‘ clear data in region ‘CrossJoin({Uda([Product], “CLEAR”)})’

…would then clear all data for all of those members. If data for, say, just the ACTUAL scenario is to be cleared, this can be added to the CrossJoin:

alter database ‘Appname‘.’DBName‘ clear data in region ‘CrossJoin({Uda([Product], “CLEAR”)}, {[ACTUAL]})’

But we first need to set this UDA in order to take advantage of it. In the load script steps above, the first step is prepare source data, if required. At this point, a SQLplus call was inserted to a new procedure that

  1. examines the source load table for distinct occurrences of the “volatile” dimension
  2. populates a table (after initially truncating it) with a list of these members (and parents), and a third column containing the text “CLEAR”:


A “rules” file then needs to be built to load the attribute. Because the outline has already been maintained, this is simply a case of loading the UDA itself:


In the “Essbase Client” portion of the load script, prior to running the “clear” command, the temporary UDA table needs to be loaded using the rules file to populate the UDA for those members of the volatile dimension to be cleared:

import database ‘AppName‘.’DBName‘ dimensions connect as ‘SQLUsername‘ identified by ‘SQLPassword‘ using server rules_file ‘PrSetUDA’ on error write to ‘LogPath/ASOCurrDataLoad_SetAttr.err’;



With the relevant slices cleared, the load can proceed as normal.

After the actual data load has run, the UDA settings need to be cleared. Note that the prepared table above also contains an empty column, UDACLEAR. A second rules file, PrClrUDA, was prepared that loads this (4th) column as the UDA value—loading a blank value to a UDA has the same effect as clearing it.

The broad steps of the load script therefore become these:

  • [prepare source data, if required]
  • ascertain members of volatile dimension to clear from load source
  • update table containing current load members / CLEAR attribute
  • Load CLEAR attribute table
  • Perform a logical clear
  • Load data to buffers
  • Load buffer(s) to new database slice(s)
  • [Merge slices]
  • Remove CLEAR attributes

So not without limitations—if the data was volatile over two dimensions (e.g., Product A for Period 1, Product B for Period 2, etc.) the approach would not work (at least, not exactly as described, although in this instance you could possible iterate around the smaller Period dimension)—but overall, I think it’s a reasonable and flexible solution.

Clear / Load Order

While not strictly part of this solution, another little wrinkle to bear in mind here is the resource taken up by the logical clear. When initializing the buffer prior to loading data into it, you have the ability to determine how much of the total available resource is used for that particular buffer—from a total of 1.0, you can allocate (e.g.) 0.25 to each of 4 buffers that can then be used for a parallel load operation, each loaded buffer subsequently writing to a new database slice. Importing a loaded buffer to the database then clears the “share” of the utilization afforded to that buffer.

Although not a “buffer initialization” activity per se, a (slice-generating) logical clear seems to occupy all of this resource—if you have any uncommitted buffers created, even with the lowest possible resource utilization of 0.01 assigned, the logical clear will fail:


The Essbase Technical Reference states at “Loading Data Using Buffers“:

While the data load buffer exists in memory, you cannot build aggregations or merge slices, as these operations are resource-intensive.

It could perhaps be argued that as we are creating a “clear slice,” not merging slices (nor building an aggregation), that the logical clear falls outside of this definition, but a similar restriction certainly appears to apply here too.

This is significant as, arguably, the ideally optimum incremental load would be along the lines of

  • Initialize buffer(s)
  • Load buffer(s) with data
  • Effect partial logical clear (to new database slice)
  • Load buffers to new database slices
  • Merge slices into database

As this would both minimize the time that the cube was inaccessible (during the merge), and also not present the cube with zeroes in the current load area. However, as noted above, this does not seem to be possible—there does not seem to be a way to change the resource usage (RNUM) of the “clear,” meaning that this sequence has to be followed:

  • Effect partial logical clear (to new database slice)
  • Initialize buffer(s)
  • Load buffer(s) with data
  • Load buffers to new database slices
  • Merge slices into database

I.e., the ‘clear’ has to be fully effected before the initialization of the buffers. This works as you would expect, but there is a brief period—after the completion of the “clear” but before the load buffer(s) have been committed to new slices—where the cube is accessible and the load slice will show as “0” in the cube.

The post ASO Slice Clears – How Many Members? appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Use OBIEE to Achieve Your GOOOALS!!! – A Presentation for GaOUG

Thu, 2016-03-10 04:00


A few months before the start of the 2014 World Cup, Jon Mead, Rittman Mead’s CEO, asked me to come up with a way to showcase our strengths and skills while leveraging the excitement generated by the World Cup. With this in mind, my colleague Pete Tamisin and I decided to create our own game-tracking page for World Cup matches, similar to the ones you see on popular sports websites like ESPN and CBSSports, with one caveat: we would build the game-tracker inside an OBIEE dashboard.

Unfortunately, after several long nights and weekends, we weren’t able to come up with something we were satisfied with, but we learned tons along the way and kept a lot of the content we created for future use. That future use came several months later when we decided to create our own soccer match (“The Rittman Mead Cup”) and build a game-tracking dashboard that would support this match. We then had the pleasure to present our work in a few industry conferences, like the BI Forum in Atlanta and KScope in Hollywood, Florida.

GaOUG Tech Day

Recently I had the privilege of delivering that presentation one last time, at Georgia Oracle Users Group’s Tech Day 2016. With the right amount of silliness (yes, The Rittman Mead cup was played/acted by our own employees), this presentation allowed us to discuss with the audience our approach to designing a “sticky” application; meaning, an application that users and consumers will not only find useful, but also enjoyable, increasing the chances they will return to and use the application.

We live in an era where nice, fun, pretty applications are commonplace, and our audience expects the same from their business applications. Validating the numbers on the dashboard is no longer enough. We need to be able to present that data in an attractive, intuitive, and captivating way. So, throughout the presentation, I discussed with the audience the thoughtful approach we used when designing our game-tracking page. We focused mainly on the following topics: Serving Our Consumers; Making Life Easier for Our Designers, Modelers, and Analysts; and Promoting Process and Collaboration (the latter can be accomplished with our ChitChat application). Our job would have been a lot easier if ChitChat were available when we first put this presentation together….

Finally, you can find the slides for the presentation here. Please add your comments and questions below. There are usually multiple ways of accomplishing the same thing, so I’d be grateful to hear how you guys are creating “stickiness” with your users in your organizations.

Until the next time.

The post Use OBIEE to Achieve Your GOOOALS!!! – A Presentation for GaOUG appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

The Importance of BI Commentary

Mon, 2016-03-07 04:00
Why Is Commentary Important?

We communicate every day. Communication through text is especially abundant with the proliferation of new on-demand technologies. Have you gone through your emails today? Have you read the news, weather, or blogs (like this one)? Communication is the backbone to every interpersonal interaction. Without it, we are left guessing and assuming.

BI implementations are no exception when it comes to communication’s importance, and I would argue communication is a major component of every BI environment. The goal of any BI application is to discover and expose actionable information from data, but without collaboration, discovering insights becomes difficult. By allowing users to collaborate immediately in the BI application, new insights can be discovered quicker.

Any BI conversation should maintain its own dedicated communication channel, and the optimal place for these conversations is as close to the information-consumption phase as possible. By allowing users to collaborate in discussions over results at the same location as the data, users will be empowered to extract as much information as possible.

Unfortunately, commentary support is absent from OBIEE.

The Current OBIEE Communication Model

The lack of commentary support does not stop the community from developing their own methods or approaches to communicating within their BI environments. Right now, common approaches include purchasing pre-developed software, engineering custom solutions, or forcing the conversations into other channels.

Purchasing a commentary application or developing your own internal solutions expedites the user communication process. However, what about those who do not find a solution, and instead decide to use a “work-around” approach?

Choosing to ignore the missing functionality is the cheapest approach, initially, but may actually cost more in the long run. To engage in simple conversations, users are required to leave the BI dashboard, which adds time and difficulty to their daily processes. And reiterating the context of a conversation is both time consuming and error prone.

Additionally, which communication channel will the BI conversations invade? A dedicated communication channel, built specifically to easily display and relay the BI topics of interest, is the most efficient, and beneficial, solution.

How ChitChat Can Help

ChitChat provides a channel of communication directly within the BI environment, allowing users to engage in conversations as close to the data consumption phase as possible. Users will never be required to leave the BI application to engage in a conversation about the data, and they won’t need to reiterate the environment through screenshots or descriptions.

Recognizing the importance of separate channels of communication, ChitChat also easily allows each channel to maintain their respective scopes. For instance, a user may discover an error on a BI dashboard. Rather than simply identifying the error in the BI environment, the user can export the comment to Atlassian JIRA and create a ticket for the issue to be resolved, thus maintaining the appropriate scopes of both JIRA and ChitChat. Integrations allow existing channels of communication to maintain their respective importance, and appropriately restrict the scope of conversations.

ChitChat is placed in the most opportune location for BI commentary, while maintaining the correct scope of the conversation. Other approaches often ignore one of these two aspects of BI commentary, but both are required to efficiently support a community within a BI environment. The most effective solution is not one that simply solves the problem, or meets some of the criteria, but the solution that meets all of the requirements.

Commentary Made Simple

Conversation around a BI environment will always occur, regardless of the supporting infrastructure or difficulty in doing so. Rather than forcing users to spend time working around common obstacles or developing their own solutions, investing in an embedded application will save both time and money. These offerings will not only meet the basic requirements, but also ensure the best experience for users, and the most return on investment.

Providing users the exact features they need, where they need it, is one step in nurturing a healthy BI environment, and ChitChat is an excellent solution to meet these criteria.

To find out more about ChitChat, or to request a demo, click here!

The post The Importance of BI Commentary appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

OBIEE 12c – Your Answers After Upgrading

Thu, 2016-03-03 04:00

Several blogs have already been written about new functionality in OBIEE 12c. Mark Rittman, for example, posted a good one here.

Now, I’ve personally had the chance to play with it for a few weeks, mostly in Answers and some with the RPD, and wanted to share my experience. With a sleek interface and many new functionalities, 12c brings some very useful features that users will appreciate. As with most new software releases, I expected to find issues that needed to be worked out. In general, I was pleasantly surprised with the UI, the speed, and the intuitiveness that came along with OBIEE 12c.

Here, I’ll share with you some of the new features within Answers:

Percent Calculation

If you’ve created lots of percent variance columns, it’s probably second nature that you will create your formula and then multiply by 100. In 12c, you can create your percent calculation without multiplying it by 100, then set your % data formatting in the Column Properties. In the same spot where you specify how the data is displayed, you can check the x100 box, which in turn will automatically multiply your results from that column by 100. Pretty sleek solution to simplify your formulas.

Percent Calc

Saved Columns

This feature is very well described here, so I will give a high level overview: 12c gives you a very easy way to save a complex formula into the catalog. If you’ve built a lot of logic in a column’s formula, and would like to reuse the logic in future reports, you will appreciate the opportunity of saving columns. I remember creating many financial calculations that had to be reused often, and until now there was no easy way to retrieve the column formulas. Trying to simplify my life, I ended up inventing “my own method” of saving complex calculations by saving different analyses that I named as “Master – Calculation” containing the columns that I reused often. I would start many reports based on these Master reports because they had my pre-built formulas; however, this was not a clean method for others to follow. OBIEE 12c gives you this clean and simple method for storing and reusing your most wanted columns. You do this by entering your formula in edit formula and choosing to “Save Column as” for future use.

Calculated Columns

OBIEE 12c provides a more intuitive way to create calculated columns than previous versions. In 10g or 11g, you needed to add a “whatever” column to the query, and then go in Edit Formula to define the calculation for your new column. While this worked, most new users often wondered why they were “bringing in two revenue columns,” for example. In 12c, you can add only the needed columns to your Criteria, then go straight to Results. In the Results tab, there is a New Calculated Measure icon that brings you immediately to the Edit Formula screen where you can name your new measure and define its formula.

Calc Columns

Measure Abbreviation

There is also a more intuitive abbreviation of the measures that are placed on a graph. In 11g, when you dragged an amount to an axis, you may recall that the numbers would show up exactly as the raw number. So, if your result was 12,000,000, then that was exactly what you would see on the graph to begin. If you wanted to improve your graph, then you needed to go to the Graph Properties and format the data from the axis to be abbreviated into, for our example above, millions (or 12M). To save you a step, 12c will automatically abbreviate your graph data in the most user-friendly way. So, if the data is 12,000,000, you automatically get 12M!

Measure Abbreviation.png

Heat Matrix

Easy to use heat matrix!—I mean it: easy. While in 11g, you would have to be somewhat visually savvy and spend a lot of time conditionally formatting. OBIEE 12c gives you a tool that allows you to create a meaningful heat matrix in a matter of minutes—wait—even seconds. All you need is to know the two dimensions and one measure that you would like to use, and drag and drop them. Choose from an array of color schemas and how you would like to use the colors. In no time, your heat matrix is ready.

Heat Map


A new member of the OBIEE family is here to provide a visual solution for very complex activities. The Treemap provides a hierarchical structure that allows you to quickly spot patterns and outliers. At first, it may require a bit of head twisting to look at a graph like this, but remember, this is indeed a graph for complex activities. One of the most ideal usages for this new feature is the grouping by parent/children groups and the displaying of how two measures fair up inside each group.


Advanced Analytics

OBIEE 12c gives you the capability of working with statistical and R functions right from the ‘Edit formula’ pane. While I have found that this new feature was still not very user friendly, it’s a lot easier than making this functionality work in 11g. For example, to create a simple Trendline with 11g, the developer had to slowly build each step of a calculation to find the slope of a line, and then find the Y intercept. With these answers in hand, the results had to be carefully placed on a graph, so that it could render meaningful results. If you require statistical graphs within OBIEE, 12c may be a great fit for you. For example, below is a graph showing four different Trendlines:


The Criteria for building these four lines would be very intense in 11g; but in OBIEE 12c, it contained only five columns: one for the Calendar Year, and one for each Trendline. The Trendlines were created one at a time, by inserting the new “Analytics” Function in the column’s formula (see below).

AA Combo

Data Mashup

This is a dream come true to many of us, though it requires an optional data visualization license. With this new functionality, you are able to use OBIEE along with any excel spreadsheet (XSA) saved on your machine.

You can add a spreadsheet to OBIEE from two areas:

  1. When you are creating an analysis (in the Criteria tab, and then choosing to add data source as shown below), or

Add Data Source

2. By going to the Visual Analyzer Home Page.

As this blog focuses on Answers, I will review the first option here.

There are three possible ways of analyzing a spreadsheet in Answers. You either want to:

  1. Analyze the spreadsheet by itself, or
  2. Use attributes from the spreadsheet along with fact data from your enterprise system, or
  3. Use fact data from your spreadsheet along with attributes and facts from your enterprise system.

For options 2 and 3 to work properly, it is important that your joins are properly matched (watch your cardinalities!) from your spreadsheet to your enterprise data. Also, as usual, option 3 will only work along with another fact table when the two tables are joined to a conformed dimension. Cardinalities and conformed dimensions are items that we generally take for granted when working on front-end OBIEE, because these points have been carefully handled during RPD modeling. Since the spreadsheet modeling has to be done in the front end, special caution must be used when modeling them in order to avoid “exploded” results, or simply inaccurate results.

Word of caution on placing an XSA sourced analysis in a shared folder:

Once you create an analysis using a spreadsheet and save it to a shared folder, you will receive this message:


Once you choose “YES,” the spreadsheet will show as a new subject area—for you and for anyone who has access to the folder in which you placed the analysis, meaning that the catalog security just TOOK CONTROL of your spreadsheet! Below is a screenshot of how they show as new subject areas:

Subject Areas

So, if your intent was to share an analysis from a XSA, but not necessarily share the entire spreadsheet to be reused, you may want to restrict your analysis to a folder with the specific securities that you would like to apply to your spreadsheet. BUT…think carefully before saving the analysis in a shared folder. If you realize that you made a mistake, just know that deleting your analysis from the incorrect folder will NOT remove your spreadsheet as an available subject area for other users. Remember, the catalog security took control of your spreadsheet, and it’s not going to let it go! If you saved the analysis in a folder with incorrect permissions, you must delete the spreadsheet altogether from the tool, reload it, and then save the analysis in the correct folder (with the permissions that you want).

You will likely need in-depth information regarding mashup security once you are really working with it. Check out this Oracle doc for more info.

Word of caution when archiving an analysis containing a spreadsheet, or when moving that analysis between environments:

The username of the owner of the analysis gets embedded in the column formula, and so does the precise name that you gave your spreadsheet when you first loaded it. So, let’s say that you are transitioning environments and the new environment does not contain your spreadsheets. If someone else has an archived catalog containing one of your mashup queries, they will get an error when retrieving results for your query, because the tool doesn’t have your spreadsheet loaded yet. The only way for them to unarchive your analysis and retrieve results is for YOUR USER to log into OBIEE, load the original spreadsheet (saving it with the same exact name as before), and then saving the analysis in the proper shared folder once again.

Deleting the New Subject Area

One tricky thing in this new tool: even if you uploaded your spreadsheet (XSA) during an analysis in OBIEE, it can only be deleted from the “New Home Page,” which is the Home Page of Visual Analyzer. You can get to the “New Home Page” from the “Old Home Page”:

New Home Page

Once in the New Home Page, click on Data Sources. Choose your Data Source and delete it!

Delete SA

I confess that I had some trouble finding the delete button. Maybe I would have bumped into it had I played more with VA, but that was not the case. Regardless, I felt relieved that this button existed somewhere!

Data Mashup Performance

This was a bit of an issue, but mostly when combined with the Advanced Analytics functions. From my research and from talking to colleagues, I found that the following must be observed to optimize performance:

  1. Reduce the size of your spreadsheet, when possible.
  2. DB indexing on the field that you are joining.
  3. Proper cardinality on your mashup joins with your DB data.
  4. Set up caching for mashups on bi server.

Overall, the experience in OBIEE 12c Answers was very positive, and the new features could bring a great deal of time savings for any organization!

To learn more about all that OBIEE 12c has to offer, check out our upcoming bootcamps here.

Hope to see you then!

The post OBIEE 12c – Your Answers After Upgrading appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

ChitChat – Commentary Made Simple

Tue, 2016-03-01 04:00
Rittman Mead is excited to announce ChitChat, an innovative new communication tool for OBIEE.


Rittman Mead’s clients already receive the best Oracle Business Analytics and Data Integration consulting and training. Now, in a major expansion of our business capabilities, our clients can enjoy the benefits of our engineering team’s innovation and expertise.

As an innovator within the world of Oracle enterprise technologies, Rittman Mead sought to design a solution to the lack of commentary features within the business intelligence dashboard. After much work, our best and brightest software engineers developed ChitChat to transform the way you do BI.

ChitChat is a multi-tiered platform that creates a collaborative and dynamic environment for discussion. ChitChat enhances BI capabilities by bringing commenting and documentation functions into the dashboard, increasing ease-of-use and seamlessly integrating with current workflows.

Focus discussion with versatile commenting. Store critical information at the source. Enhance the BI experience. Seamlessly integrate.

Rely on Rittman Mead to get the most out of your business intelligence investments.

To learn more about ChitChat, or to request a demo, click here.

The post ChitChat – Commentary Made Simple appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Rittman Mead Sponsors Georgia Oracle Users Group’s Tech Day

Mon, 2016-02-29 09:00

As part of Rittman Mead’s continuing support and participation in the Oracle community, we are proudly sponsoring this year’s Georgia Oracle Users Group’s Tech Day on Thursday, March 3, 2016, in Downtown Atlanta at the Executive Conference Center.

Rittman Mead’s own Consulting Manager, Andy Rocha, will be speaking at this year’s event. Andy will be presenting on how we created a game-tracking page in OBIEE for the 2014 World Cup. He will break down how we created the dashboard to track real-time scores, stats, match events, and player and team information.

We will also have a booth set up, so if you are in the area, come down to see us and get a demo of some of our exciting new products and services.

To learn more about GaOUG and their events, visit their website here.

The post Rittman Mead Sponsors Georgia Oracle Users Group’s Tech Day appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

OBIEE Performance – Why Metrics Matter (and…Announcing obi-metrics-agent v2!)

Fri, 2016-02-26 08:32

One of the first steps to improve OBIEE performance is to determine why it is slow. That may sound obvious—can’t fix it if you don’t know what you’re fixing, right? Unfortunately, the “Drunk Man anti-method”, in which we merrily stumble from one change to another, maybe breaking things along the way and certainly having a headache at the end of it, is far too prevalent. This comes about partly through unawareness of a better method to follow, and partly encouraged by tuning documents comprising reams of configuration settings to “tune” and fiddle with without really knowing why or how to prove if they indeed actually fixed anything…

Determining the cause of performance problems is often a case of working out what it’s not just as much as what it is. This is for two important reasons. Firstly, we begin to narrow down the area of focus and analysis. Secondly, we know what to leave alone. If we can prove that, for example, the database is running the query behind a report quickly, then there is no point “tuning” the database, because the problem doesn’t lie there. Similarly, if we can see that a report taking 60 seconds in total to run spends 59 seconds of that in the database, fiddling with Java Heap Size settings on OBIEE is going to at the very, very most reduce our total runtime to…59 seconds! This kind of time profiling is important to do, and something that we produce automatically in our Performance Analytics Report:


So, how do we pinpoint what is, or isn’t, going wrong? We need data, and specifically, we need metrics. We need log files, too, maybe for the real nitty-gritty of explain plans, but a huge amount can be understood about a system by looking at the metrics available.

Any modern operating system, from Windows to Linux, AIX to Solaris, will have copious utilities that will expose important metrics such as CPU usage, disk throughout, and so on. These can often be of great assistance in diagnosing performance problems.


When it comes to OBIEE itself, we are spoilt by the performance counters available that since 11g (and still in 12c) have been exposed through the Dynamic Monitoring System (DMS). They were even there in 10g too, but accessed through JMX. These metrics give us information ranging from things like the number of logged in users, through how many connections are open to a given database, down to real low-level internals like how many threads are in use for handling LDAP lookups. Crucially, there are also metrics showing current and peak levels of queueing within the various internal systems in OBIEE, which is where DMS becomes particularly important.

By being able to prove that OBIEE has, for example, run out of available connections to the database, we can confidently state that by changing a given configuration parameter we will alleviate a bottleneck. Not only that, but we can monitor and determine how many connections we really do need at a given workload level. The chart below illustrates this. The capacity of the connection pool is plotted against the number of busy connections. As the number of active sessions increases so does the pressure on the connection pool, until it hits capacity at which point queueing starts—which now means queries are waiting for a connection to the database before they can even begin to execute (and it’s at this point we’d expect to see response times suffer).

So this is the kind of valuable information that is just not available anywhere other than the DMS metrics, and you can see from the above illustration just how useful it is. To access DMS metrics in OBIEE 11g and 12c, you have several options available out of the box:

Some of these are useful for programmatically scraping the data, others for interactively checking values at a point in time.

obi-metrics-agent – v2

At Rittman Mead, we always recommend collecting and storing DMS metrics (alongside others, including OS) all the time—not just if you find yourself with performance problems. That way you can compare before and after states, you can track historical trends—and you’re all set to hit the ground running with your diagnostics when (if) you do hit performance problems.

You can capture DMS metrics with the BI Management Pack in Enterprise Manager, you can write something yourself, or you can take advantage of an open-source tool from Rittman Mead, obi-metrics-agent.

I wrote about obi-metrics-agent originally when we first open-sourced it almost two years ago. The principle in version 2 is still the same, we’ve just rewritten it in Jython so as to remove the need for any dependencies like Python and associated libraries. We’ve also added native InfluxDB output, as well as retained the option to send data in the original carbon/graphite protocol.

You can run obi-metrics-agent and just write the DMS data to CSV, but our recommendation is always to persist it straight to a time series data store such as InfluxDB. Once you’ve collected the data you can analyse and monitor it with several tools, our favourite being Grafana (read more about this here).

As part of our Performance Analytics Service we’ve built a set of Performance Analytics Dashboards, making available a full-stack view of OBIEE metrics (including DMS, OS, and even Oracle ASH data), as seen in this video here (click on the image to enlarge it):

If you’d like to find out more about these and the Performance Analytics service offered by Rittman Mead, please get in touch. You can download obi-metrics-agent itself freely from our github repository.

The post OBIEE Performance – Why Metrics Matter (and…Announcing obi-metrics-agent v2!) appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Becky’s BI Apps Corner: Incrementals and Future dated Employee records

Wed, 2016-02-24 23:00

During the last few posts, we have delved into a few of the many interesting aspects of a BI Apps installation. Today I wanted to change gears a bit and talk about what starts to happen when you are past installation and configuration and begin running load plans. On a client project, I recently worked through a unique constraint error on W_EMPLOYEE_D that I found really interesting related to how the incremental logic was working in the knowledge module (KM). Before I can really get into the workaround, we need to understand how incremental loads work in general for BI Apps.

High Level Overview

In the initial run, the load will grab a full set of data, i.e. all data from the source system, based on the data load parameters set during configuration. The same load plan will be used to load data incrementally, picking up only data that has changed since the most recent load plan has completed (Last Extract Date). The pre-built mappings have incremental change capture built into the knowledge module logic. When a load runs, it will extract records that have changed or been created since the Last Extract Date. The load plan determines which rows to extract by using the formula Source Last Updated Date >= (Last Extract Date – Prune Days).

In the weeds

Is it an incremental load? How does that get decided? Actually, that isn’t decided at the load plan level. Each individual package (run as a scenario) starts with a step that refreshes a variable called #IS_INCREMENTAL.

This variable’s refresh logic, shown in the below screenshot, will determine if this package previously completed successfully. After every successful completion an entry gets made into W_ETL_LOAD_DATES with the package name and date timestamp, amongst other audit information.

So we have a scenario running now with the #IS_INCREMENTAL be set to ‘Y’. What does the Knowledge Module (KM) do? Incremental runs normally have steps to load an I$ table (flow table) from the source logic and update the records in the target table based on the DETECTION_STRATEGY option in the KM. For Fact table loads, the option can accept the possible values (explanation given).

  • OUTER: Outer join to target table when populating flow table in order to determine insert/update/useless records
  • NOT_EXISTS: NOT EXISTS clause is used when populating flow table in order to exclude records, which identically exist in target.
  • POST_FLOW: all records from source are loaded into flow table. After that an update statement is used to flag all rows in flow table, which identically exist in target.
  • NONE: all records from source are loaded into flow table. All target records are updated even when a target record is identical to flow table record.

In most cases, the option OUTER is used for facts, which updates the records based on primary keys (PK’s). Incremental decisions are based on the values of the system date columns like CHANGED_ON_DT, AUX1_CHANGED_ON_DT, AUX2_CHANGED_ON_DT, AUX3_CHANGED_ON_DT and AUX4_CHANGED_ON_DT columns populated from the source. This is better performing than the NOT_EXISTS and POST_FLOW options that compares each and every column to identify the records present.

For Slowly Changing Dimensions (like W_EMPLOYEE_D), the DETECTION_STRATEGY option can take the possible values (explanation given).

  • MINUS: MINUS clause is used when populating flow table in order to exclude records, which identically exist in target.
  • NOT_EXISTS: NOT EXISTS clause is used when populating flow table in order to exclude records, which identically exist in target.

The default option is NOT_EXISTS and Incremental decisions are based on PK’s and the date columns.

Future dated rows

Imagine now that during a full load, all records from the source tables for EMPLOYEES are brought forward into the data warehouse table W_EMPLOYEE_D. One of those records is an entry with an effective start date 2 weeks in the future. For W_EMPLOYEE_D one of the columns in the primary key is the effective start date. Fast forward two weeks to the date when the future dated row’s effective start date is the current date. During the incremental load on that date, the incremental logic for this one record is comparing the primary keys and all of the change indicator columns, and sees that the effective start date is greater than the last extract date from last night. This incremental comparison incorrectly determines this is a record that needs to be added to the fact table, even though the record is already in the fact table. Now we have an ERROR! The familiar unique constraint on the _U1 unique index rears its ugly head. On top of that, troubleshooting this duplicate is not coming up with any duplicate records in the usual places (I$, DS, source tables, nada!). Isolating the two identical records and tracking them back to the source tables however, there is the one record. The only clue is that the effective start date is today’s date. After a second occurrence, discussions and back and forth on an SR, a workaround is now available.


Step 1. Remove any Future dated rows in W_EMPLOYEE_D

Step 2. Add a filter on the interface to prevent future dated rows from loading into W_EMPLOYEE_DS until they are <= current date.

At our client, this mapping continues to run without any additional errors. The steps here are most likely version specific, and this issue is a known bug to Oracle, so please don’t hesitate to open an SR if you are getting this specific issue, as a quick turn around is very likely.

There are some other odds and ends about how incremental load plans work and I plan to gather them up and have another post about those in the coming weeks. If you want to learn more ins and outs of incrementals and more, join me for the upcoming remote ODI for BIApps course on March 14th-16th. We have only a few spots left so sign up here.

The post Becky’s BI Apps Corner: Incrementals and Future dated Employee records appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Data Integration Tips: Oracle Data Integrator – Quotes in Variables

Sun, 2016-02-21 20:37

It’s Sunday night, and we have just enough time for another quick Data Integration Tip from Rittman Mead. In another recent, yet somewhat trivial, challenge I ran into this week was the ability to pass quotations through to a Scenario as a part of an ODI 11g Variable (this is also applicable to ODI 12c). Let’s get right into the example.

To set the stage, I’m working on project that includes integration with Hyperion Planning. This specific task is a command line call to the CalcMgrCmdLineLauncher.cmd to execute a specific business rule against the data in the Planning application. In this case, our business rule name contains spaces and, for various reasons, the name cannot be changed to remove the spaces. The ODI 11g project is setup to run many of these planning business rules by calling a generic Scenario, based on a Package, passing variables to make it dynamic.

The command for the OdiStartScen ODI Tool in the calling Package looks like this:


As you can see, there are double quotes around each of the parameters and their values being passed into the ODI Tool, including the Scenario name and version, the synchronous flag, and additional variables such as the PV_PLANNING_CALCMGR_RULE. What I found was that the Scenario had issues accepting the ODI variable value with spaces.

To work around this issue, I simply added an escape to the double quotes around the variable value with spaces. The double double quotes allow the ” to be passed through as a part of the variable string value.


Now, the Scenario UTIL_EXECUTE_CALCMGR can accept the variable value with spaces, and the necessary double quotes, and pass it on through to the CalcMgrCmdLineLauncher.cmd command.

So there you have it, another Data Integration Tip from the folks at Rittman Mead. Simple as it was, I hope you find it useful! Check back regularly for more DI Tips and have a great week!


The post Data Integration Tips: Oracle Data Integrator – Quotes in Variables appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Introduction to Oracle R Training

Thu, 2016-02-18 04:00
Rittman Mead is thrilled to announce our new Introduction to Oracle R courses!

With the launch of 12c comes an exciting extension to OBIEE’s analytical capabilities—embedded R execution. Oracle has made several investments in the R language and its related technologies (Oracle R, ROracle, ORAAH, and Oracle R Enterprise). Based on our experience deploying advanced analytics solutions from within business intelligence departments, we view this simplified pairing of OBIEE and R as a great way forward in delivering advanced analytics across organizations.

Oracle R

But the world of R is a big one, with hundreds of open-source R packages and countless training avenues. With so many confusing options on the market, we wanted to simplify things for you. It can be difficult to know where to start, so we’ve designed an R course to help you bridge the gap between existing BI/DW skills and the new skills required to confidently derive insights with R. We want to help you look beyond the dashboard and delve into novel analytical techniques.

The Advanced Analytics course dives deeply into the analytical capabilities of 12c and R. We’ve designed the training so that prior knowledge of R isn’t assumed. The course builds on existing SQL and data visualization skills—covering R use cases from predictive analytics and time series forecasting to natural language processing and many others.

Oracle R

We want to share with you our vision of R-enhanced business intelligence that provides more insights, predictions, and actionable discoveries, all while scaling effortlessly with your data. With our training, you’ll learn to expand your analytic skills and become an invaluable asset to your company’s BI operation.

We are excited about what R brings to the world of BI and can’t wait to share our knowledge.

We will follow up this blog post with one about the business value of our new Introduction to Oracle R course and another post about the more technical aspects of the course.

The post Introduction to Oracle R Training appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Data Integration Tips: ODI 12c Repository Query – Find the Mapping Target Table

Mon, 2016-02-15 04:00

Well, it’s been awhile since I’ve posted one of our Rittman Mead Data Integration Tips, so I thought a recent challenge might be the next great candidate. I was working through the Kimball ETL Subsystems and the Error Event Schema, Subsystem 5 if you’re familiar with the methodology, and attempting to build the schema from Oracle Data Integrator 12c (ODI 12c) metadata tables. If you caught my presentation “A Walk Through the Kimball ETL Subsystems with Oracle Data Integration” at Oracle OpenWorld 2015, then you’ll know a bit more about where I’m coming from. You can still catch my presentation on this topic at both IOUG Collaborate16 and ODTUG KScope16 (being invited to speak at each of these events is always an honor). Now this Data Integration Tip isn’t solely related to the Kimball ETL Subsystems, though the solution did prove rather useful for my presentation (and upcoming article in the IOUG Select Journal). It’s actually an interesting twist in how Oracle Data Integrator mapping metadata is stored differently between the ODI 11g and ODI 12c repositories.

The challenge is to find the target table, or Datastore, for a given Mapping in ODI 12c, or Interface in ODI 11g. When I say “find”, I mean query the ODI work repository tables and return the target table, or tables in 12c, for a given Mapping. Luckily, I’m equipped with some guidance from our friends at My Oracle Support. If you look at support document Oracle Data Integrator 11g and 12c Repository Description (Doc ID 1903225.1) you’ll find the data dictionaries for both ODI11g ( and ODI12c (12.1.2, 12.1.3, & 12.2.1). These are invaluable resources from Oracle – though the may be works in progress and somewhat incomplete!

Let’s take a look first at ODI 11g and how simple things used to be. Back when Interfaces were the mechanism for extracting, transforming, and loading, we were allowed only 1 single target Datastore. Ahh, those were the good ‘ol days! Digging into the repository we really only had one place to look for the target table – SNP_POP. This table in the ODI 11g repository, SNP_POP (which essentially stands for Synopsis – Populate), contains a column I_TABLE. This identifying column represents the target table for that particular Interface. Here’s a query that ties it all together.

    p.pop_name interface_name,
    t.table_name target_table,
    m.cod_mod model_code
from snp_pop p inner join snp_table t on p.i_table = t.i_table
    inner join snp_model m on t.i_mod = m.i_mod;

As you can see, the key to capturing the target table for an Interface is simply in the SNP_POP.I_TABLE column. Because there is only one target, we can easily figure it out.


Now, ODI 12c is where the real challenge lies. As you may know, with the move from Interfaces in 11g to flow based Mappings in 12c, we were allowed to do new and exciting things, such as load multiple target tables from a single Mapping. We may also have a case where a Datastore component maps to a Filter component, which then maps to another Datastore component, etc. As you can see in the image below, we can have lots of tables, and lots of tables that may be sources or targets, but we’re only interested in the final target table (or tables, for that matter!).


Ok, so let’s dig into the Work Repository now. It seems an ODI Mapping can’t be that much more difficult than an Interface, right? Well…

First, there are quite a few tables related to the Mapping itself.


Whoa…we’ve got some work to do. Lucky for you, I’ve already done the work. Let’s look at how it all fits together. We can start with the Mapping (SNP_MAPPING) table. We also have different components on the Mapping, such as Lookups, etc, so we can join in the table SNP_MAP_COMP as well. Here’s the information we’ll be able to see with that simple join.

select ...
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping


That’s interesting, we’ve captured all components in the mapping. But there are still quite a lot of non-targets here. Ok, maybe if we add the connection points for each component we can find the input and output for each. Components each have an input connector point, allowing an output from a different component to flow into it.

select ...
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping
    inner join snp_map_cp cp on mc.i_map_comp = cp.i_owner_map_comp

That just added the connection point information for each component and whether it is an INPUT or OUTPUT. Not extremely useful by itself, so let’s dig a bit deeper. How about adding the SNP_MAP_REF table? This table seems to contain a reference to all types of other attributes about the Mapping and its Components. We also need to consider that Datastores, just as any other Component, will have both an input and output. Right now, the dataset shows both the input and output connectors for each Component. Let’s remove the input connection points to limit our result set.

select ...
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping
    inner join snp_map_cp cp on mc.i_map_comp = cp.i_owner_map_comp
    inner join snp_map_ref mr on mc.i_map_ref = mr.i_map_ref
where cp.direction = 'O' --output connection point only

Joining the reference table has now allowed us to focus on the Datastores in the Mapping and their OUTPUT connection point. What I really want is to see only the Datastores that do not have their OUTPUT connection point connected to an INPUT connector. Therefore, if the OUTPUT is empty, it must be the target table!

select ...
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping
    inner join snp_map_cp cp on mc.i_map_comp = cp.i_owner_map_comp
    inner join snp_map_ref mr on mc.i_map_ref = mr.i_map_ref
where cp.direction = ‘O' and --output connection point only.
    cp.i_map_cp not in
        (select i_start_map_cp from snp_map_conn) --not a starting connection point

The SNP_MAP_CONN, which stores the mapping connections, will allow me to limit the query to the components that only have an output, but not an input. The connections table will contain all component connections in the ODI 12c mappings. Here’s what we get as a result.

mapping target tables found

Hey, now we’re onto something here. In fact, this is what I was looking for! Target table(s) in a single Mapping in ODI12c. Not quite as simple as ODI11g, but with a bit of SQL and understanding of the repository tables, you can do it. Here’s the final query again, joining in the SNP_TABLE & SNP_MODEL tables to complete the dataset.

select mapping_name,
    mr.qualified_name, datastore_alias,
    t.table_name target_table,
    mdl.cod_mod model_code
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping
    inner join snp_map_cp cp on mc.i_map_comp = cp.i_owner_map_comp
    inner join snp_map_ref mr on mc.i_map_ref = mr.i_map_ref
    inner join snp_table t on mr.i_ref_id = t.i_table
    inner join snp_model mdl on t.i_mod = mdl.i_mod
where cp.direction = 'O' and --output connection point
    cp.i_map_cp not in
        (select i_start_map_cp from snp_map_conn) --not a starting connection point

Please let me know if you find this Data Integration Tip useful or if you have a better way of accessing the target table in a mapping. One of my Rittman Mead colleagues asked, “why not just use the ODI Java API?”. For accessing the ODI repository, I do prefer using some Groovy script and the API. But in this case, I’m interested in building out a dimensional schema and writing ETL to load the dimensions and facts, which lends itself to SQL rather than Groovy script.

As always, if you’re team needs help around Oracle Data Integrator, or Oracle Data Integration Solutions in general, drop us a line at Or feel free to reach out to me directly via email ( or Twitter (@mRainey). Cheers!

The post Data Integration Tips: ODI 12c Repository Query – Find the Mapping Target Table appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

OBIEE 12c – Repository Password Corruption Issue

Thu, 2016-02-11 10:00

Here at Rittman Mead we’ve been working with OBIEE 12c for some time now, as part of the beta programme and more recently with clients looking to get the most out an upgrade to OBIEE 12c. We’ve also been hard at work on our brand new OBIEE 12c training course. What we’ve seen in terms of the stability of OBIEE 12c has been pleasantly surprising. Anyone who’s worked with software long enough will be familiar with the reputation that first releases in general have for nasty bugs, and it’s probably fair to say that with the first release of 11g ( this was proven out. With the first release of OBIEE 12c, however, we’re seeing a stable tool with very few issues so far.

That said…I’m going to demonstrate an issue here that is a bit of a nasty one. It’s nasty because the trigger for it appears innocuous, and what it breaks is one of the really new things in OBIEE 12c—the way in which the RPD is stored on disk and accessed by the BI Server. This makes it a bit of a tough one to get to the bottom of at first, but it’s a good excuse to go digging!


If you open the RPD in online mode (use File –> Copy As and then use the Save option), the password on the server gets corrupted.

[nQSError: 43113] Message returned from OBIS  
[nQSError: 13042] Repository password is wrong

obiee 12c repository password corruption

From this point on you cannot checkin any changes, and when you restart the BI Server it will fail to start up.


In OBIEE (12c, and before) it is possible to open the RPD as a straight binary file on disk (“Offline” mode), or by connecting directly to the BI Server and opening the copy that it is currently running (“Online mode”). Offline mode suits larger changes and development, with the RPD then being deployed onto the server once the development work is ready to be tested. Online mode is a good way for making changes on a dedicated dev server, minor changes on a shared server, or indeed just for viewing the RPD that’s currently being run.

Here’s what we’ve seen as the problem:

  1. Open RPD in online mode
  2. File -> Copy As
  3. Enter a password with which to protect the RPD being saved on disk.
  4. Do one of:
    • File -> Close, and then when prompted to save changes click Yes
    • File -> Save
    • Click the Save icon on the toolbar
    • Ctrl-S

obiee 12c repository password error

What happens now is two-fold:

  1. You cannot check in any changes made online—the check in fails with an error from the Administration Tool:
    [nQSError: 43113] Message returned from OBIS  
    [nQSError: 13042] Repository password is wrong
  2. The BI Server will fail on restart with the same error:
    Opening latest versioned cached RPD for : /app/oracle/biee/bi/bifoundation/server/empty.rpd which is /app/oracle/biee/user_projects/domains/bi/bidata/service_instances/ssi/metadata/datamodel/customizations/liverpd.rpd_5  
    [nQSError: 13042] Repository password is wrong. [[

We saw this on SampleApp v511, as well as on vanilla installations of OBIEE. Versions on both were

After we reported this to Oracle, they agreed it was a bug and have logged it as bug number 22682937, with no patch currently (February 10, 2016) available.


If you open the RPD online and use File -> Copy As, don’t hit save or check in, even if prompted by the Admin Tool. Close the RPD straightaway.

Often people will use File -> Copy As to take a copy of the current live RPD before doing some changes to it. At Rittman Mead, we’d always recommend using source control such as git to store all code including the RPD, and using this approach you obviate the need to open the RPD online simply to get the latest copy (because the latest copy is in source control).

You can also use the data-model-cmd downloadrpd option to download the actual live RPD—that’s exactly what this option is provided for.

Solution – if BI Server (OBIS) has not yet been restarted

If you’ve hit this bug and are hitting “Repository password is wrong” when you try to checkin, and if the BI Server is still running, then redeploy the RPD using the data-model-cmd uploadrpd tool. By redeploying the RPD the password appears to get sorted out.

If the BI Server is down, then this is not an option because it has to be running in order for data-model-cmd uploadrpd to work.

Solution – if BI Server (OBIS) has been restarted and failed

At this point using data-model-cmd uploadrpd is not possible because OBIS is not running and so the data-model-cmd uploadrpd will fail with the error:

[oracle@demo ~]$ /app/oracle/biee/user_projects/domains/bi/bitools/bin/ uploadrpd -I /home/oracle/rmoff.rpd -W Password01 -U weblogic -P Admin123 -SI ssi  
Service Instance: ssi

Operation failed.  
An exception occurred during execution, please check server logs.

The only option from this point is to use importServiceInstance to reset the service instance, either to an empty, SampleAppLite, or an existing .bar export of your environment. For example:


This will enable OBIS to start up correctly, from which point the desired RPD can then be re-uploaded if required using data-model-cmd uploadrpd.


The easiest thing is to simply not use File -> Copy As in online mode. Whilst this on its own is fine, the UI means it’s easy to accidentally use the Save option, which then triggers this problem. Instead, use data-model-cmd downloadrpd, and/or use source control so that you can easily identify the latest RPD that you want to develop against.

If you do hit this repository password corruption problem, then keep calm and don’t restart the BI Server—just re-upload the RPD using data-model-cmd uploadrpd. If you have already uploaded the RPD, then you need to use importServiceInstance to restore things to a working state.

As part of the diagnostics that we did to get to the bottom of this issue, we found some interesting things in OBIEE 12c, such as a web service endpoint for RPD upload/download, as well as the detailed workings of the RPD upload process and that infamous liverpd.rpd file. Stay tuned for a blog post on this and more soon! And in the meantime, be sure to get in touch with us to discuss how we can help you with your OBIEE systems, including OBIEE 12c upgrade, and OBIEE 12c training.

The post OBIEE 12c – Repository Password Corruption Issue appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Implementing OBIEE Commentary

Thu, 2016-02-11 04:00
My Application Development Experience with Rittman Mead

Greetings, readers! My name is Nick Padgett, and this is my first blog post with Rittman Mead. I started with Rittman Mead as an intern over a year ago while I was attending Kennesaw State, a local university, but I’ve graduated from both school and the internship. Since then I’ve been working on several interesting applications. Today, I would like to take some time to discuss developing one such application, our OBIEE Commentary tool, as well as my learning experiences working at Rittman Mead.

When I first started with Rittman Mead in October 2014, I was beginning my final year in a Computer Science program. When I joined, I knew very little about Business Intelligence but was eager to learn everything I could. Learning BI while still in school created a powerful synergy between my class studies and the real world. As I learned web development and security in class, I learned about data warehouses and agile development at work. In addition, I had the opportunity to practice my new knowledge and skills on a daily basis.

The most important lesson I learned during my time as an intern was the importance of developing primarily to meet the user requirements. This subject was often ignored in school, except for the obligatory Software Engineering course. But now, as a formal developer, I was always being reminded of the users we were developing for. All features for all projects were first qualified by asking “Is this what the user wants or needs?” or “Will this add value for the users?”

The first project I was assigned was the Rittman Mead commentary tool. As I assisted in development, the team always looked back to the initial requirements: “Users want commentary in OBIEE.” All development was focused in that direction, and any feature which hindered the goal, despite the technical impressiveness, was removed from the application.

Now, after spending many hours developing the commentary tool, the team is confident it has effectively answered the primary user requirements. We’re all very excited to provide this tool to our customers, and we are confident it will meet all the requirements we spent so long accumulating and developing for.

The Primary Use-case: OBIEE Commentary

In a typical environment, users are required to leave the application to communicate with their peers. This often involves screenshots, long text descriptions of the subject, and a service such as email or Slack. However, forcing users to leave the application in order to do their job is less than optimum. What’s to keep them using the expensive tool you purchased for them if they have to leave it every few minutes?

Many organizations recognize this issue and seek to implement commentary themselves. Frequent approaches typically use either a text object on a dashboard, a write-back enabled input, or even strange iFrame solutions to achieve the desired capabilities. Obviously, these implementations have serious drawbacks, and may not be maintainable or feasible for many organizations.

The commentary tool we have developed uses none of the above approaches, but instead uses a standard web deployment, hosted in WebLogic, to provide similar capabilities. This means no RPD modifications, no requirement for dashboard designers to become involved, and no security threats with iFrames or Cross-Site Scripting (XSS) attacks, while still fulfilling the primary use case of adding commentary to your OBIEE dashboards.

obiee 12c commentary

Our tool allows for users to create several different types of comments, all displayed through standard HTML in a web browser. Comments can be added on a dashboard to provide commentary for the page as a whole, while other comments can be placed on reports, which will be shown anywhere the report is located. You can even submit replies to existing comments, allowing for comment threads over a particular subject. All of these comments are available for use without having to leave the application.

A Second Use-case: Documentation

Some comments are highly structured and curated, containing a wealth of information. Some users like to add comments explaining how a measure is calculated, or a post-mortem for the launch of a new product. These comments are best classified as documentation, as they don’t serve to engage in a conversation, but to present information. In addition, there may be formal documentation that is required to be on a dashboard, and relates directly to a series of reports. The most common way to provide this is to supply links to external applications specifically suited for documentation. Among a myriad of issues pertaining to accessibility concerns, this approach forces users to leave the application, some of whom may never return.

As an alternative, documentation can be placed directly on a dashboard or report, but this leads to design problems. Having a twenty-page document on a dashboard is hardly an acceptable solution, even though this is the best place to put your documentation. In addition, users will have to write their documentation in HTML, or at least be able to understand it.

obiee 12c documentation tool

The commentary tool also allows for this use case. Documentation can be created or edited using a WYSIWYG (What You See Is What You Get) editor, similar to any standard document editor, and added to a “Table of Contents” on a dashboard. The Table of Contents lists all current documents available to view and displays them in a dedicated area. This provides access to any pertinent content directly on a dashboard, rather than forcing users to migrate to an external application, which is the entire point for having native commentary in the first place.

A Third Use-case: Integration

A common concern may be the introduction of yet another channel for communication. For example, many organizations use Slack as a communication tool or Atlassian Confluence for documentation. Rittman Mead does not desire to replace these tools or make them obsolete to your organization. We acknowledge the fact you spend money on these applications, and they may be widely adopted within your organization.

obiee commenting tool

Instead of forcing you to maintain several channels of communication, our commentary tool allows integrations with applications such as JIRA and Confluence. Comments can be submitted as JIRA issues, and documents can be synchronized with Confluence pages. Several other integrations are available by default, while additional implementations are available on a per-customer basis. Now, if users absolutely must leave the application, they can do it on their own terms, with the convenience of a few clicks.

As I have grown during my time here with Rittman Mead, so has our commentary application. While I learned how to cater to user requirements, the tool evolved into an extension of this knowledge and was developed with the intention to meet all customer needs. I am incredibly excited to see this product released soon, and I can’t wait for users to start getting more out of their BI applications.

More information on this tool will be coming very soon, so keep an eye on our blog and website for updates!

The post Implementing OBIEE Commentary appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing

Using Python to ‘Wrangle’ Your Messy Data

Mon, 2016-02-08 00:00
 or How to Organize Your Comic Book Collection Based on Issue Popularity

In addition to being a product manager at Rittman Mead, I consider myself to be a nerd of the highest order. My love of comic books, fantasy, sci-fi and general geekery began long before the word ‘Stark’ had anything to do with Robert Downey Jr or memes about the impending winter. As such, any chance to incorporate my hobbies into my work is a welcomed opportunity. For my next few blog entries, I’ve decided to construct a predictive classification model using comic book sales data whose eventual goal will be to build a model that can accurately predict whether a comic will rocket off the shelves or if it will be a sales dud. The first blog of the series shows some of the pitfalls that can come up when preparing your data for analysis. Data preparation, or data wrangling as it has come to be known, is an imperfect process that usually takes multiple iterations of transformation, evaluation and refactoring before the data is “clean” enough for analysis.

While the steps involved in data wrangling vary based on the state and availability of the raw data, for this blog I have chosen to focus on the gathering of data from disparate sources, the enrichment of that data by merging their attributes and the restructuring of it to facilitate analysis. Comic book sales data is readily available on the interwebs, however finding that data in a usable format proved to be a more difficult task. In the end, I had to resort to dreaded process of screen scraping the data from a comic research site. For those of you who are lucky enough be unfamiliar with it, screen scraping is the process of programmatically downloading HTML data and stripping away that formatting to make it suitable for use. This is generally used as a last resort because web sites are volatile creatures that are prone to change their appearance as often as my teenage kids do preparing to leave the house. However, for the purposes of this series, as my friend Melvin the handyman would often say, “We works with what we gots.”

blog-ironman-pythonexclamation-point-icon-30522This leads us to the first issue you may run into while wrangling your data. You have access to lots of data but it’s not pretty. So make it pretty.  Working with raw data is like being a sculptor working with wood. Your job is not to change the core composition of the data to suit your purposes but to chip away at the excess to reveal what was there all along, a beautiful horse… er I mean insight. Sorry, I got lost in my analogy.  Actually to expand on this analogy a bit, the first tool I pulled out of my toolbox for this project was Python, the Leatherman of  programming languages.  Python is fast, plays well with other technologies and most importantly in this case, Python is ubiquitous. Used for tasks ranging from process automation and ETL to gaming and academic pursuits, Python is truly a multipurpose tool. As such, if you have a functional need, chances are there is a native module or someone has already written a public library to perform that function.  In my case, I needed some scripts to “scrape” HTML tables containing comic sales data and combine that data with other comic data that I retrieved elsewhere. The “other” data is metadata about each of the issues. Metadata is just data about data. In this case, information about who authored it, how it was sold, when it was published, etc… More on that later.  

blog-sales-tableLuckily for me, the format of the data I was scraping was tabular, so extracting the data and transforming it into Python objects was a relatively simple matter of iterating through the table rows and binding each table column to the designated Python object field. There was still a lot of unnecessary content on the page that needs to be ignored, like the titles and all of the other structural tags, but once I found the specific table holding the data, I was able to isolate it. At that point, I wrote the objects to to a CSV file, to make the data easy to transport and to facilitate usability by other languages and/or processes.

The heavy lifting in this process was performed by three different Python modules: urllib2, bs4 and csv. Urllib2, as the name implies, provides functions to open URLs. In this case, I found a site that hosted a page containing the estimated issue sales for every month going back to the early 1990’s. To extract each month without manually updating the hardcoded URL over and over, I created a script that accepted MONTH and YEAR as arguments,

The response from the urlopen(url) function call was the full HTML code that is typically rendered by a web browser. In that format, it does me very little good, so I needed to employ a parser to extract the data from the HTML. In this context, a parser is a program that is used to read in a specific document format, break it down into its constituent parts while preserving the established relationships between those parts, and then finally provide a means to selectively access said parts. So an HTML parser would allow me to easily access all the <TD> column tags for a specific table within an HTML document. For my purposes, I chose BeautifulSoup, or bs4.

BeautifulSoup provided search functions that I used to find the specific HTML table containing the sales data and loop through each row, while using the column values to populate a Python object.


This Python object, named data, contains fields populated with data from different sources. The year and month are populated using the arguments passed to the module. The format field is dynamically set based on logic related to the rankings and the remaining fields are set based on their source’s position in the HTML table. As you can see, there is a lot of hard coded logic that would need to be updated, should the scraped site change their format. However, for now this logic gets it done.

The final step of this task was to write those Python objects to a CSV file. The python module, CSV, provides the function writerow(), which accepts an array as a parameter and writes each of the array elements as columns in the CSV.

My first pass raised the an exception because the title field contained unicode characters that the CSV writer could not handle.

To rectify this, I had to add a check for unicode and encoded the content as UTF-8. Unicode and UTF-8 are character encodings; meaning they provide a map computers use to identify characters. This includes alphabets and logographic symbols from different languages as well as common symbols like ®.


Additionally, there was the matter of reformatting the values of some of the numeric fields to allow math to be performed on them later(ie stripping ‘$’ and commas). Other than that, the data load went pretty smoothly. A file named (MONTH)_(YEAR).CSV was generated for each month. Each file turned out like so:


While this generated tens of thousands of rows of comic sales data, it was not enough. Rather, it had the volume but not the breadth of information I needed. In order to make an accurate prediction, I needed to feed more variables to the model than just the comic’s title, issue number, and price. The publisher was not relevant as I decided to limit this exercise to only Marvel comics and passing in the the estimated sales would be cheating, as rank is based on sales. So to enhance my dataset, I pulled metadata about each of the issues down from “the Cloud” using Marvel’s Developer API. Thankfully, since the API is a web service, there was no need for screen scraping.

exclamation-point-icon-30522Retrieving and joining this data was not as easy as one might think. My biggest problem was that the issue titles from the scraped source were not an exact match to the titles stored in the Marvel database. For example, the scraped dataset lists one title as ‘All New All Different Avengers”. Using their API to search the Marvel database with that title retrieved no results. Eventually, I was able to manually find it in their database listed as “All-New All-Different Avengers”. In other cases, there were extra words like “The Superior Foes of Spider-Man” vs “Superior Foes of Spider-Man”. So in order to perform a lookup by name, I needed to know the titles as they expected them. To do this I decided to pull a list of all the series titles whose metadata was modified during the timeframes for which I had sales data. Again, I ran into a roadblock. The Marvel API only allows you to retrieve up to 100 results per request and Marvel has published thousands of titles. To get around this I had to perform incremental pulls, segmented alphabetically. 


Even then there were a few minor issues, as some letters like ‘S’ had more than 100 titles. To get around that I had to pull the list for ‘S’ titles sorted ascending and descending then combine the results, making sure to remove duplicates. So my advice on this one is be sure to read up on the limitations of any API you are using. It may enforce limits but you may be able to work around the limits using creative querying.


At this point I have my list of Marvel series titles, stored in some CSV files that I eventually combined into a single file, MarvelSeriesList.csv, for ease of use. Actually, I have more than that. While retrieving the series titles, I also pulled down the ID for the series and an appropriateness rating. Searching the API by ID will be much more accurate than name and the appropriateness  rating may be useful when building out the prediction model. The next step was to iterate through each row of the CSVs we created from the sales data, find the matching ID from MarvelSeriesList.csv and use that ID to retrieve its metadata using the API.

exclamation-point-icon-30522If you remember, the point of doing that last step was that the titles stored in the sales data files don’t match the titles in the API, so I needed to find a way to join the two sources. Rather than writing cases to handle each of the scenarios (e.g. mismatched punctuation, extra filler words), I looked for a python library to perform some fuzzy matching. What I found was a extremely useful library called, Fuzzy Wuzzy. Fuzzy Wuzzy provides a function called extractOne() that allows you to pass in a term and compare it with an array of values. The extractOne() function will then return the term in the array that has the highest match percentage. Additionally, you can specify a lower bound for acceptable matches (ie. only return result where the match is >= 90%).

Again, it took a few passes to get the configuration to work effectively. The first time through about only about 65% of the titles in the sales file found a match. That was throwing away too much data for my liking so I had to look at the exceptions and figure out why the matches were falling through. One issue that I found was titles that tacked on the publication years in the Marvel database, like “All-New X-Men (2012)”, had a match score in the 80’s when matched against a sales title like, “All New X-Men”. This was a pretty consistent issue, so rather than lowering the match percentage, which could introduce some legitimate mismatches, I decided to strip the year, if present, on mismatches and run it through that matching process again. This got me almost there. The only other issue I ran into was Fuzzy Wuzzy had trouble matching acronyms/acrostics. So ‘S.H.E.I.L.D.’  had a match score in the 50’s when matching ‘SHIELD’. That’s because half the characters (periods) were missing. Since  there were only two titles affected, I built a lookup dictionary of special cases that needed to be translated. For the purposes of this exercise, I would still have had enough matches to skip that step, but by doing it brought us up to 100% matching between the two sources. Once the matching function was working, I pulled out urllib2 and retrieved all the metadata I could for each of the issues.

The resulting files contained not only sales data (title, issue number, month rank, estimated sales), but information about the creative team, issue descriptions, characters, release dates and  associated story arcs. This would be enough to get us started with building our predictive classification model.
blog-csv-all That being said, there was still a lot of structural rearranging required to make it ready for the type of analysis I wanted to do, but we will deal with that in the next blog. Hopefully,  you picked up some useful tips on how to combine data from different sources or at the very least found solace in knowing that while you may not be the coolest person in the world, somewhere out there is a grown man who still likes to read comic books enough to write a blog about it. Be sure to tune in next week, True Believers, as we delve into The Mysteries of R!

The post Using Python to ‘Wrangle’ Your Messy Data appeared first on Rittman Mead Consulting.

Categories: BI & Warehousing