Skip navigation.

Pythian Group

Syndicate content
Love Your Data
Updated: 9 hours 37 min ago

Pythian’s Velocity of Innovation Event Gives Attendees a View Into the Future of IT

Fri, 2015-10-02 12:20

This summer I had the opportunity to moderate a discussion that brought together IT’s most innovative thought leaders with some of North America’s top CIOs. And we gave the CIOs the chance to ask the panelists anything. Here’s what happened.

When three of the most respected information technology thought leaders came together with 20 of North America’s top CIOs for an hour and a half in New York City this summer, the result was a compelling discussion about the future of IT.

Pythian supplied the panel of experts at the Velocity of Innovation CIO event: Paul Vallé, Pythian’s founder and CEO; Gene Leganza, vice-president, principal analyst serving enterprise architecture professionals at Forrester Research; and Otto Toth, CTO at Huffington Post. And the attendees supplied the questions, covering an impressive range of topics.

More than an exchange between panel members, the 90-minute session fully engaged the audience in a discussion about everything from agile software development and the importance of being a data-driven business to such forward-looking matters as artificial intelligence and virtual reality.

This blog post is the first of a series that focuses on topics covered in the Velocity of Innovation discussions in New York, and concentrates on the first question addressed: putting innovation and agility into practice.

Question: Innovation and agility are common buzz words to describe what everybody wants. What have you explicitly done to get better at innovating or to make your organization or your clients’ organizations more agile? And how has this push for agility impacted either your enterprise architecture or your enterprise culture?

Otto Toth: Innovating quickly is not always in the best interest of the business, or it may not be easy to do properly. It can actually work against the business. Instead of moving faster, being very agile can slow down everything.

Everybody has a different perception about what agile should be and we have our own definition. We started to build our own CMS three years ago. We started very agile, building everything simultaneously. Later, we decided to use Scala as the programming language, but it’s hard to find people who are educated in this language.

A year ago, I changed the process a little bit. Now we break down the process into small pieces and modules. After three years in development, we finally shipped the first module of our CMS. This is an interesting story about how big agility can slow down the whole process.

Most users don’t really know what they want. Too many decision-makers just slow down the process. It’s better to have a few selective people or a core team who make the decisions and dream about features that are not available.

Gene Leganza: What makes an enterprise agile is being able to sense what’s going on in the marketplace and responding to it. What’s happening in your marketplace is about being able to deal with data. The ability to understand everything that everybody is doing and everything their friends are doing because of available data means you can be in tune with the marketplace as never before.

What does that mean for your organization? Data modelling and data integration are starting to migrate to the business side, where people are doing self-service analytics. If you try to keep up by jumping on some of this technology, you may shoot yourself in the foot. Instead of being on the leading edge, pay attention to what’s going on and when you figure out how to do it right, then do it as soon as possible. You avoid shooting yourself in the foot and you’re not that far behind the leaders of the dot-coms.

The flip side of agility is innovation. An interesting aspect of innovation is getting really hot talent into your environment. Getting the right talent and doing smart things and being leading edge are challenges. You have to figure out what level to drop in on, where you are in the industry. Are you a startup or are you a state organization that needs to be a fast follower?

Paul Vallé: Addressing what Otto Toth highlighted, tying up capital for three years is an enormous cost, in carrying costs, as well as competitive posture. Any solution you imagined three years ago has three years of technical debt when it launches. Three years is about the amount of time we expect the system to last before renewal. This is a key problem that the agile development methodology tries to tackle.

If you can create high bandwidth, you have access to a lot of knowledge and a lot of insights. Then you have low latency, and you’re able to respond to what you learned quickly and you’re able to inform your business decisions.

Your systems must be alive and you must continuously push them forward and tighten up your release cycles. You should try to release four or eight times a year, and eventually get to a point where you can do multiple releases per day, and even do per feature releases.

Per feature releases are ideal in an agile world. It gives you the high bandwidth, the low latency and feedback loops. And a per user feature enablement lets you create an A/B testing capability for new revolutions of your application. Moving to per user feature enablement and per feature releases is an enormous transformation and has everything to do with agility and velocity.

Audience comment: As a CTO manager, what resource do you use to be skeptical of smart engineers who want to build things from scratch?

Otto Toth: I have a very simple rule that if the service costs less than an engineer’s one-year salary, then don’t even think about developing it. Two weeks can turn into two years and 10 people and can result in nothing.

Gene Leganza: There’s a notion at the engineer level that agile means “There’s nothing between me and creating a solution. That’s what you hired me for and I have the smarts. And I don’t want any bureaucracy.” Then there’s agility at the enterprise level, which is about reducing risk and understanding how soon change can be in production.

What I tell people is it depends who you talk to. The higher up you go, the more people are going to be receptive to what improves the whole portfolio rather than one project.

And this is where architects come in, someone who has been hands-on, and has the credibility and knowledge to guide the organization more strategically.

Interested in being a part of a discussion like this one? VELOCITY OF INNOVATION is a series of thought-leadership events for senior IT management hosted by Pythian. Pythian invites leading IT innovators to participate in discussions about today’s disruptive technologies: big data, cloud, advanced analytics, DevOps, and more. These events are by invitation only.

If you are interested in attending an upcoming Velocity of Innovation event in a city near you, please contact To view our schedule of upcoming events visit out Velocity of Innovation page.

Categories: DBA Blogs

Pythian at Oracle Open World 2015

Fri, 2015-10-02 11:45


Are you excited for this year’s Oracle Open World 2015? Pythian will once again be attending and participating in a number of marquee presentations and events.

This event is an amazing opportunity to hear from the finest minds in the Oracle space from Pythian and across the industry. We will be sending our Oracle experts to deliver some compelling presentations, packed with best practices, expert tips and advice that you won’t want to miss (See schedule below). Make sure you take some time to connect with us, and engage with our sought-after Oracle experts.


Oak Table World

Don’t miss the popular mini conference – Oak Table World in San Francisco, October 26th & 27th. There is a stacked agenda for this years conference along with our very own CTO Alex Gorbachev on stage at 3PM October 27th.


Bloggers Meet-up

Once again, we are thrilled to announce that we will be co-hosting the Annual Bloggers Meetup, organized in partnership with the Oracle Technology Network on October 28th (Save the date!). Be on the lookout for a blog post from Alex Gorbachev, Pythian’s CTO, providing details including the location, opportunities to connect, and activities.


Contact Us

Should you wish to schedule time to connect with one of our experts at OOW reach out to Emilia (Partner Program Mgr/ and she will be happy to help set that up. Alternatively, catch all the action live from @pythianpartners while on site. This is the best way to get a hold of any of the Pythian team attending the event.

Attention Pythian Partners & clients, if you’re attending please reach out to us for details on social happenings you won’t want to miss!

See you there! #oow15 #pythianlife



Pythian Sessions at Oracle OpenWorld 2015

Alex Gorbachev

Session 1 – Bridging Oracle Database and Hadoop

Date/Time: October 25, 8:00 am – 8:45 am | Moscone South—303

Abstract: Modern big data solutions often incorporate Hadoop as one of the components and require the integration of Hadoop with other components including Oracle Database. This presentation explains how Hadoop integrates with Oracle products focusing specifically on the Oracle Database products. Explore various methods and tools available to move data between Oracle Database and Hadoop, how to transparently access data in Hadoop from Oracle Database, and review how other products, such as Oracle Business Intelligence Enterprise Edition and Oracle Data Integrator integrate with Hadoop.


Session 2 – Solving the “Dirty Secret” of Big Data with Oracle Big Data Preparation Cloud Service

Date/Time: October 27, 5:15 pm – 6:00 pm | Moscone South—310

Abstract: Big data promises many game-changing capabilities, but the “dirty secret” is that up to 90 percent of big data project time is spent preparing the data so that it can be usefully analyzed. Join this session to see a live demonstration of a new solution to this problem—Oracle Big Data Preparation Cloud Service. See how Oracle Big Data Preparation Cloud Service takes “noisy” data from a broad variety of sources in many different formats, both structured and unstructured, and uses sophisticated statistical, knowledge-based, and machine-learning algorithms to cleanse, enrich, and blend it into useful data streams, ready for further discovery, analysis, and reporting.


Oracle Multitenant: Customer Panel Date/Time: Wednesday, Oct 28, 1:45 p.m. | Moscone South—301Abstract: Attend this session to hear from some of Oracle’s leading customers about their successful deployments of Oracle Multitenant.

Session 3 – Oracle Multitenant: Customer Panel 

Date/Time: October 28, 1:45 p.m. | Moscone South—301

Abstract: Attend this session to hear from some of Oracle’s leading customers about their successful deployments of Oracle Multitenant.

Rene Antunez

Session 1 – Oracle Exadata 101: My First 100 Days with Oracle Exadata 

Date/Time: October 25, 3:30 pm – 4:15 pm | Moscone West—3011

Abstract: The biggest headline at the 2009 Oracle OpenWorld was when Oracle’s Larry Ellison announced that Oracle was entering the hardware business with a prebuilt database machine, engineered by Oracle. Since then, businesses around the world have started to use these engineered systems. This beginner- to intermediate-level session takes you through my first 100 days of administering an Oracle Exadata machine and all the roadblocks and success I had along this new path.


Session 2 – Private Cloud Provisioning Using Oracle Enterprise Manager 12c

Date/Time: October 25, 12:00 pm – 12:45 pm | Moscone South—305

Abstract: With the newest version of Oracle Database 12c and its multitenant option, we are moving toward an era of provisioning databases to our clients faster than we ever could, even leaving out the DBA and allowing the developers and project leads to provision themselves the database that they need. This presentation guides you through the different ways you can provision data from one Oracle Database to another using Oracle Enterprise Manager 12c.


Session 3 – Oracle Recovery Manager 12c>: It’s Alive and All About Recovery, Recovery, Recovery 

Date/Time: October 25, 9:00 am – 9:45 am | Moscone South—304

Abstract: Oracle Recovery Manager (Oracle RMAN) has evolved since being released in Oracle 8i Database. With the newest version of Oracle Database 12c, Oracle RMAN has great new features that allow you to reduce your downtime in case of a disaster. In this session, learn about the new features that were introduced in Oracle Database 12c regarding recovery and configuration tips, and go over several demos so that if you are ever faced with this adversity, you will come out on top.


Session 4 – Database as a Service: What Is It and How Do You Use It? A Q&A with an Expert Panel 

Date/Time: October 29, 9:30 am – 10:15 am | Moscone South—102

Abstract: What is the best method for delivering, setting up, and using database as a service (DBaaS) utilizing Oracle software, hardware, or the cloud? Within a company (private cloud) there are several ways, outside of a company (public cloud) there are additional ways, or you can use a combination of both. For IT, Oracle Database, and technology professionals, platform as a service (PaaS) and DBaaS are technologies you need to know about and use to stay current in your profession. Attend this session to learn about technology and deployment choices available using Oracle technology and solutions. Subject matter experts from Oracle OpenWorld and Collaborate answer your questions so you can architect, plan, and implement a successful DBaaS project.


Christo Kutrovsky

Session 1 – Oracle Real Application Clusters Solving Common Scalability Problems 

Date/Time: October 25, 8:00 am – 8:45 am | Moscone South—304

Abstract: In this session, explore the promise of near-linear scalability that Oracle Real Application Clusters (Oracle RAC) makes. This session investigates several common problems in detail and presents solutions that are transparent to the application. Cache fusion contention, write-write contention, application partitioning and indexing are some of the topics discussed. Specific examples and their solutions are provided as well as performance comparisons.


Maris Elsins

Session 1 – Mining Automatic Workload Repository: Alternative Methods for Identification of the Top SQLs 

Date/Time: October 25, 1:30 pm – 2:15 pm | Moscone South—308

Abstract: We can use reports from the Automatic Workload Repository feature to quickly identify the SQL_IDs of the top statements in the database. But what if the Automatic Workload Repository report reveals no “low-hanging fruit,” and the resource usage is evenly distributed among multiple statements? Where do we start? Is there a better way to identify the starting point for the tuning of a resource-bound system? This session explains when the Automatic Workload Repository reports are misleading and how to look at stored data from a different angle to determine the top consumers. This session includes a practical demonstration using scripts for Automatic Workload Repository mining that attendees can apply to their own challenging database performance tuning problems.


Nelson Calero

Session 1 – Oracle Exadata Maintenance Tasks 101 

Date/Time: October 25, 1:30 pm – 2:15 pm | Moscone West—3011

Abstract: You have been working as a DBA for years and are planning/starting to support an Oracle Exadata installation. There are several planned and unplanned maintenance tasks that you will face, each one involving many details. Don’t wait to master them through practice—in this session, the speaker shares their experience with Oracle Exadata installations during the last few years, and shares details of the most common maintenance tasks and best practices to make better use of your resources. This includes setting up ASR, replacing components (hard and flash drives, batteries, InfiniBand switches, and more), and patching.


Session 2 – Get More Out of Your Oracle Investment in Latin America

Date/Time: October 25, 2:30 pm – 3:15 pm | Moscone West—3010

Abstract:  Nelson Calero Database ConsultantOracle ACE The LAOUC is the Latin America user group community that has an affiliated group of about 1,000 professionals. It is a self-governing community of Oracle users who volunteer to achieve common goals and objectives connected with Oracle technology. The group’s purpose is to promote membership involvement, education, and influence. Come to this session to learn more.


Deiby Gómez

Session 1 – Oracle Database New Features 

Date/Time: Sunday, Oct 25, 11:00 a.m. | Moscone West—3011

Abstract: Oracle Database was released more than one year ago; however, a lot of new features are not well known by the community yet. In this session, learn what you have to know for each new feature in order to understand how they work. This session goes deep to give you enough information to go home talking about Oracle Database and feel prepared to welcome the new versions of Oracle Database with your new knowledge.



Discover more about our expertise in the world of Oracle.

Categories: DBA Blogs

Log Buffer #443: A Carnival of the Vanities for DBAs

Fri, 2015-10-02 10:45

This Log Buffer Edition finds and publishes blog posts from Oracle, SQL Server and MySQL.


  • SCRIPT execution errors when creating a DBaaS instance with local and cloud backups.
  • Designing Surrogate vs Natural Keys with Oracle SQL Developer.
  • EBS General Ledger – Accounting Hub Reporting Cloud Service.
  • Oracle Database Standard Edition 2 is available.
  • Disable “Build After Save at JDeveloper” 12c.

SQL Server:

  • Learn where to get the latest installation and learning resources for the ever-evolving components of the Hadoop ecosystem and how those components may complement Microsoft SQL Server common everyday tasks.
  • Creating Dashboards for Mobile Devices with Datazen.
  • Install SQL Server Management Studio with Web Installer without Internet Access.
  • Using wildcard to find patterns in strings going from the basics to some more complicated conditions.
  • A general way to maintain History tables.


  • Capture database traffic using the Performance Schema.
  • Become a MySQL DBA blog series – Database Indexing.
  • Prepping your MySQL indexes for a character set change.
  • HowTo: Retrieve Direct Messages From Twitter and Store Them in MySQL.
  • Using Docker to Visualize MySQL Performance Schema.


Learn more about Pythian’s expertise in Oracle SQL Server & MySQL.

Categories: DBA Blogs

My Sales Journey: #5

Thu, 2015-10-01 13:49

I am a little sad to let go of the Life of a newly minted Sales Professional title today but happy that I am starting to develop a bit of a patina. My Sales Journey reflects my state of mind right now and feels more appropriate given the fact that today marks the end of my first month at Pythian!

Last week we explored social media and where people are building their online habitats. Today, lets journey inside the minds of these people to see what they think. Let me explain.

As a sales professional for a managed service I talk to decision makers and stakeholders. My job is to engage them and hopefully they will want to take the conversation one step further. For all this to happen I need to know what their responsibilities, their pain points and their challenges are.

So today lets step inside the mind of a VP/Director/Executive:

They are responsible for ensuring the continual functioning of mission critical operations. They have to stay relevant while bridging the gap between business and technology. They are in charge of driving innovation and improving profitability. They are also tasked with delivering flexible strategies that enable business to deliver high quality/valuable services.

Now that I know this on a broad level I can be more effective with my message. This week has been all about creating/assembling content that answers to these challenges and responsibilities. These can come in the form of white papers, relevant marketing, team resumes, cost analysis and slide decks.

Know their industry and purchasing cycles. This is a research heavy task but pays off when you have a carefully curated list that you can refer to. It becomes a powerful weapon in your sales arsenal.

Sleuth enough information to determine initial feasibility. Your first contact is twice as likely to be successful if you have qualified the type of messaging with their current challenge.

As you can tell, lots of learning happened in the last week. I hope some of these tips will help you with your outreach. Stay tuned next week when we step in to the mind of a manager who may be lower on the rung but can be a key person to land your message in front of the decision makers.

Now, It is Friday and time to kick back and relax with a team lunch! My second one in 4 weeks! Yessirrrr, that is how we roll at Pythian. See you next week!


Categories: DBA Blogs

SQL On The Edge #3 – Azure Elastic Database Jobs

Wed, 2015-09-30 10:15

Hello and welcome to episode number three of SQL On the Edge! In this episode we’re going to cover Azure SQL elastic database jobs. In the last few months, Microsoft has put a lot of effort in augmenting the elastic capabilities of the Azure SQL database service. We’re not only talking about the ability to scale UP but in this case about the capability to scale OUT.

The investments made on this front have produced:

– Elastic database tools for easier .NET development against multiple Azure SQL databases.
– Elastic database pools for better resource management.
– Elastic database jobs for easier job execution against groups of databases.
– Elastic database query for easier integrated querying of groups of databases.

Elastic database jobs are interesting because for a long time there wasn’t a native way to run jobs against Azure SQL databases. The solution we used with many of our clients was to have either a SQL Server on a VM to use the SQL Agent scheduling capabilities or using the Windows scheduler from a Windows Server VM or using the Azure job scheduler. These options, while serviceable were not really optimal for the task.

Elastic database jobs provide capabilities that cover the gaps we had before:

– You can run a job against an entire database pool, a pre-defined shard set or a custom database list.
– Job execution is logged and history can be retrieved per database.
– Information can be collected and stored from the results of each job execution.

For now, elastic database jobs are provided as a customer hosted web solution and an accompanying SQL database. At the moment, Powershell is also required to access the full functionality as the Portal only exposes a small part of it.

Let’s go over the feature and the demo in the video, enjoy!


Discover more about our expertise in SQL Server.

Categories: DBA Blogs

SQL Server: Migrations/Upgrades – 10 Things to Think About

Wed, 2015-09-30 10:04

Here at Pythian we have planned and implemented dozens of migrations and upgrades for our customers. In this blog post I am trying to summarize the 10 most important things to consider when migrating or upgrading a SQL Server environment, based on our lengthy experience.
If you need to migrate/upgrade your SQL Server environment to a new server/environment but you are afraid of having long outages, losing data or degrading performance, here are the things to consider:

  1. To Cloud or not to cloud: if you are considering moving to the cloud, you should research carefully around  the capabilities, constraints and limitations of the cloud solutions you are considering. For example, there are few things you could do on a “regular” server with a SQL Server instance installed that you cannot implement with Amazon RDS or Azure Database (DaaS). I am not posting any links here because things are changing very fast on the cloud currently, so you need to ensure you are up to date about this.
  2. Physical to Virtual: Often, when moving from a physical box with attached local disks to a VM environment with shared storage, SQL Server performance may degrade due to multiple potential factors. It is recommended to test performance of new environments prior to Production cutover and compare to the performance on the existing one. Following configuration Best Practices in all levels of the architecture is essential (please see more details in section 5).
  3. Minimal downtime during Production cutover: there are few methods to copy an instance and databases to the new environment without affecting exiting Production and then cutover with minimal downtime.
    The options range between transferring backups through portable media, setting up Database Mirroring or Log Shipping or simply backing up and restoring databases.
    The solution depends on few things such as:

    1. Current performance of your environment
    2. Database SLAs and how much downtime can be afforded
    3. Network bandwidth and capacity between existing and new environment (is there a connection between the environments at all?)
    4. Volume of data updates in your current environment
    5. Size of the databases
    6. Etc.
  4. Performance baseline comparison before and after: this is a step that is usually missed during migration and upgrade projects. Performance degradation when moving to a new environment may be a common issue but we need to understand what exactly are the new bottlenecks. For that reason, it’s important to record the performance baseline from the old environment before migrating and another baseline after migration (not right away, we need the instance to “warm up” for a while, say: a week). In comparing the two baselines, fixing the problem may be much easier and faster because it will give us an indication of what to look for.
  5. Best Practices: Microsoft and other providers or partners provide lists of Best Practices. We need to ensure we follow Best Practices in all levels: Hardware, storage, VM, OS and SQL Server. Some examples of Best Practices:
    – VMWare: VMWare Best Practices
    – Storage: Disk partition alignment for SQL Server
    – SQL Server: Best Practices list
  6. Testing phase: before implementing the real cutover, it is strongly recommended that you implement a test migration without affecting current Production environment or as required. This step may take longer and add time to the schedule but it will reduce or (hopefully) prevent issues that may occur during the cutover (i.e.: missing components, application not being able to connect to SQL Server, etc).
  7. Rollback and Plan B: when preparing for a migration or an upgrade of a critical Production environment, it’s always important to plan for a rollback in case something goes wrong. Even more important is to understand how much time a rollback would take, and the implications (technical and business wise).
  8. Upgrade plan: if you are planning to upgrade the SQL Server version or the application version at the same time as the migration is happening, you should ensure that all components can be upgraded ahead of time and decide if any database can stay in older compatibility level. There’s the Upgrade Advisor that can help you check things but there is never a 100% verification unless you test all part of the application(s).  Well.. If there’s not a lot of code running, you may be able to trace everything and have 100% verification of the code, but this rarely happens.
  9. HA and DR planning: When migrating to a new environment, it is probably time to redesign the HA and DR plan in advance. Some of the HA/DR SQL Server native solutions: Cluster, Always On Availability Groups (SQL 2012+), Database Mirroring (to be deprecated in future versions), Log Shipping, Replication, backups. There are other non-SQL Server solutions such as VM or storage mirroring as well as 3-rd party and cloud solutions. You can combine some of the HA/DR solutions together in order to meet higher SLAs as well.
  10. Refresh your monitoring abilities: since you are moving to a new environment, it’s probably also time to refresh your monitoring tools if required. A good monitoring tool will notify you about the right issues and on time, preferably one that can also do pro-active monitoring and help you prevent outages. It is important to be able to monitor hardware and storage, network, Virtual Machine (if applies), Operating System and SQL Server instance(s) so that you can always stay on top and understand in which layer there is a problem. You will need to make sure that someone gets the notifications and is able to connect on a timely manner based on your system’s SLAs to fix issues.


Discover more about our expertise in SQL Server.

Categories: DBA Blogs

Import/Export Multiple SSIS Packages

Wed, 2015-09-30 09:39

While I was working on a migration project recently, I had the task of importing and then exporting multiple packages – about 120~ on to a new server. Usually, if there are less numbers of packages we can do this task manually but, in this case it was tedious and time consuming to import and then export it manually. Here’s where the utility DTUTIL comes in handy, this utility has many options that you can use according to your need.

What I used is very simple and easy to use. I created a two batch file using the below commands:

Exporting SSIS Packages to File:

dtutil /SQL ExportPackage /COPY FILE;C:\Packages\ExportPackage.dtsx

Importing SSIS Package to SQL:

dtutil /file C:\Packages\ImportPackage.dtsx /copy sql;ImportPackage

Find out more details and a comprehensive list of options.


Discover more about our expertise in SQL Server.

Categories: DBA Blogs

Using Docker to Visualize MySQL Performance Schema

Wed, 2015-09-30 09:30

Last week, I was pleased to present at Percona Live Amsterdam 2015 regarding the importance of Performance Schema and how to begin approaching visualizing the data that is available to diagnose performance issues and fine tune your MySQL environment. You can download the slides from the Percona Live conference page.

The session highlighted using the ELK (Elasticsearch+Logstash+Kibana) stack to assist visualizing event data, in addition to graphite+graphana to visualize time-series data.

As many people who attend conferences are aware, the availability of internet access is sketchy at best. To combat this, the visualization stack that I created was strictly local utilizing Docker, specifically the Docker Toolbox.

The docker-compose.yml file that creates the stack is fairly straightforward:

  container_name: mysql_data
  image: tianon/true
    - /var/lib/mysql
    - /var/lib/mysql-files

  container_name: mysql
    - MYSQL_DATABASE=world
  build: mysql
    - mysql_data
    - "./mysql/config:/etc/mysql/conf.d"
    - "3306"
    - "3306:3306"

  container_name: sysbench
  build: ./sysbench
    - "mysql"

  container_name: elasticsearch_data
  image: tianon/true
    - /usr/share/elasticsearch/data

  container_name: elasticsearch
  build: elasticsearch
    - elasticsearch_data
    - "9200:9200"

  container_name: graphite
  image: kamon/grafana_graphite
    - "8000:80"
    - "8126:8126"

  container_name: logstash
  build: logstash
    - "./logstash/scripts:/opt/logstash/scripts"
    - mysql
    - graphite
    - "elasticsearch:es"
    - "9292:9292"

Since the session was about the upcoming changes to the Performance Schema in MySQL 5.7, I chose to use MySQL’s official 5.7 Docker image:

  container_name: mysql_data
  image: tianon/true
    - /var/lib/mysql
    - /var/lib/mysql-files

  container_name: mysql
    - MYSQL_DATABASE=world
  build: mysql
    - mysql_data
    - "./mysql/config:/etc/mysql/conf.d"
    - "3306"
    - "3306:3306"

Pay attention to two aspects of this output.

First, I am creating a mysql_data container. Using a Data Volume Container is highly recommended when working with any data that should be persisted and not baked into the Docker image.

Second, I am building the container using the `build` command. Throughout the entire docker-compose.yml file, I am adding additional functionality to base containers provided by the Docker Hub. For the MySQL container, the Dockerfile looks like this:

FROM mysql:5.7

ADD world_innodb.sql /docker-entrypoint-initdb.d/world.sql

CMD ["mysqld"]

The rest of the docker-compose file follows similar guidelines, and I won’t explain each aspect here. You can find the Dockerfiles for each component of the stack in the dtest/visualize-mysql github repository.

To bring the stack up, the steps are straightforward:

  1. Install Docker Toolbox on your machine
  2. Create the docker-machine:
    $ docker-machine create -d virtualbox \
    --virtualbox-memory "2048" --virtualbox-cpu-count "2" \
    $ eval $(docker-machine env visualize)
  3. Clone the dtest/visualize-mysql repository and change into the parent directory.
    $ git clone
    $ cd visualize-mysql
  4. Bring the environment up:
    $ docker-compose up -d

Verify that the containers are running, noting that the data volume containers should have exited with status 0:

$ docker-compose ps
       Name                     Command               State                            Ports
elasticsearch        / -Des ...   Up>9200/tcp, 9300/tcp
elasticsearch_data   /true                            Exit 0
graphite             /usr/bin/supervisord             Up>80/tcp, 8125/udp,>8126/tcp
logstash             /app/bin/boot                    Up>9292/tcp
mysql                / mysqld            Up>3306/tcp
mysql_data           /true                            Exit 0
sysbench             /etc/ -d            Up

You can then point your browser to the Kibana and Graphana dashboards. The graphana dashboard requires login, and the default credentials are admin/admin.

The dashboards for the presentation are not automatically loaded, so you can find some basic dashboards in the dashboards directory of the repository. Unfortunately, I did not export them before discarding the machine so have lost some of the fine-tuning changes I had made. I do intend to recreate them and even try out the newer versions of the ELK stack available.

The goal here is to show how easy it is to setup a local environment using Docker and existing images on the Docker Hub to get started with your own projects.


Discover more about our expertise in MySQL.

Categories: DBA Blogs

Oracle 12c – Adaptive Query Optimization

Wed, 2015-09-30 07:51

Scenario: A user complains that a batch job that is critical for the month end business processing is taking more time than expected. As a DBA you quickly check and identify the sql statement that is running, compare the current execution plan with the previous plans and come to the conclusion that it has picked up a wrong plan. Instead of a Hash Join, it is doing a Nested Loop or did not pick the parallel threads and instead does a single threaded full table scan.

At this juncture, you do not have control to change the execution plan unless you cancel the job, fix it and rerun it. A simple analogy is like getting stuck in traffic and cannot do anything about it, even though you know that there are better and faster alternative routes to reach your destination.

Luckily, the above scenario was the case with 11g, though with 12c a new feature called – Adaptive Query Optimization was introduced and this feature helps the optimizer adjust the plans based on the real time data.

12c Optimizer:- With Oracle 11g, an optimizer decides an optimal execution plan for a query based on the conditions in the query, statistical information of the underlying objects and initialization parameters that influence the optimizer. With 12c, a new adaptive approach to query optimization is introduced by adjusting execution plans based on information collected during run time. This new approach is extremely helpful when the existing statistics are not sufficient to generate an optimal plan. There are two aspects in Adaptive Query Optimization:

  1. Improving the initial execution of a query during runtime.
  2. Adaptive statistics, that provide additional information for subsequent executions.

I will be focusing on the first aspect in this article.

The two features that optimizer adjusts based on the real time data are:

Hash Join vs Nested Loops – Optimizer may pick nested loop for a join operation based on the existing table statistics information, but during run time, if it realizes that more data is being processed, it will switch to Hash Join. This adaptive optimizer feature not only changes the plan on run time, but stores the adaptive statistics in the database. This additional information will be useful for future executions.

Parallel Distribution – Hybrid Hash – When a SQL statement is executed in parallel mode, optimizer decides whether to perform certain operations like sorts, joins and aggregations in parallel or as a single threaded operation based on the statistics. With the new adaptive feature, optimizer differs this decision till run time and based on the run time data it decides whether to pick the parallel mode or not.

Good resource about this new feature can be found here and here.

Check some of our other Oracle 12c blogs


Discover more about our Oracle expertise. 

Categories: DBA Blogs

Attending Tech Conferences

Fri, 2015-09-25 11:04


Now that I am in management, and have been for some time, I have still been attending tech shows and presenting at them a few times a year. When I came back to Pythian in February 2011, we had a handful of personnel doing sessions at the shows. Since then the number of people submitting abstracts and appearing at these shows has skyrocketed. I cannot take the credit single-handedly for that growth, but I am pleased that I have been a catalyst getting more Pythianites on the global stage where they should be. Bravo to all the new techs appearing at the shows.

I just attended the EastCoastOracle show in Raleigh/Durham USA and sat in on a steady stream of sessions and found the cloud material especially intriguing. Simon Pane, Roopesh Ramklass, Subhajit Das Chaudhuri, and Gleb Otochkin shone and had substance and pizazz in their sessions. Pythian should be proud.

The Pythian name was all over the show and we were tied with another company for most sessions. There was avid interest in what we do and I did get some “You work for Pythian!” comments from attendees. Our name is out there and getting “outer” every show.

Every show I attend I am thankful for the opportunity and Pythian’s support. It sweetens the deal when I look back at the number of presenters we send now as compared to 5 years ago. I value and I’m very honoured that so many colleagues have reached out to me over the years to get over that last remaining hurdle to entering the conference presentation life …


Discover more about our expertise in Cloud and Oracle.

Categories: DBA Blogs

My Sales Journey: #4

Fri, 2015-09-25 10:53


Let me start today with some great news! This week I got my first meeting….Woohoo! It happened in totally unexpected ways. It happened on Twitter. Of the dozens of calls and emails that I had been sending out it happened with a quick, to the point tweet. It made me take a step back and look at what I was doing differently.

Sales is about reaching people with a message and hoping they get interested enough to talk to you. You have to reach people where they live. Today, people live on many different hubs. We live on Twitter, Facebook, Hootsuite, Snapchat, Email, Linkedin or in an office. People also do these things depending where they are geographically. New Yorkers are more likely to hang out on social media channels than Floridians who may love phone calls and a good chat. Reach your people where they live.

Sales is also a very creative channel. It forces you to think out of the box at all times. To be successful you must not only be creative with your messaging, you also have to be creative about how you will deliver that message. Make your message brief, relevant and fun.

I am an entrepreneur first and foremost, which makes me naturally curious about business. Reaching out to all these people is all about getting to know their business. I am genuinely interested in knowing who they are and how their business works. It is about meeting cool folks doing cool things. Be genuine. Business will follow.

Its been a good week! Do you have a story or two to share about where you meet your people. Have you had success over social media? If so, please share. I want to be inspired!



Categories: DBA Blogs

Log Buffer #442: A Carnival of the Vanities for DBAs

Fri, 2015-09-25 07:24

This Log Buffer Edition collects and then showers around some of the information-rich blog posts from Oracle, SQL Server and MySQL.


  • Generic Java Server for Static content and Directory Listing using API mode or command line mode.
  • OEM agents on each host upload data to your management servers every few minutes.
  • exitcommit … or your career down the drain.
  • Encryption is the Easy Part; Managing those Keys is Difficult.
  • Managing the OBIEE BI Server Cache from ODI 12c.

SQL Server:

  • Email addresses are very prevalent in IT systems and often used as a natural primary key. The repetitive storage of email addresses in multiple tables is a bad design choice. Following is a design pattern to store email addresses in a single table and to retrieve them efficiently.
  • OpenStack: The Good and Not-So-Good Bits.
  • By defining server- and database-level audits, you can record just about any kind of event that occurs in SQL Server, which can be an invaluable source of security troubleshooting and forensic information when security breaches occur.
  • Koen Verbeeck shows how to easily extract metadata from files in your directories with Power Query.
  • Implementing John Conway’s Game of Life in Microsoft SQL Server.


  • Oracle HA, DR, data warehouse loading, and license reduction through edge apps.
  • Easy Load-balancing and High-availability using MySQL Router.
  • When hosting data on Amazon turns bloodsport.
  • MySQL 5.7 Labs — Using Loopback Fast Path With Windows 8/2012.
  • How to evaluate if MySQL table can be recovered.


Learn more about Pythian’s expertise in Oracle SQL Server & MySQL.

Categories: DBA Blogs

Meet Pythian at AWS re:Invent 2015

Thu, 2015-09-24 12:21


We’re pumped to be sponsoring our first AWS re:Invent show on October 6-9 at The Venetian in Las Vegas!

As an AWS Advanced Consulting Partner with AWS accredited experts on our team, we’ve helped our clients design, architect, build, migrate, and manage their workloads and applications on AWS.

If you’re going to be at the show next week, stop by our booth (#1255) or book a one-on-one session with one of our leading AWS technical experts to get answers to some of your biggest cloud challenges. From strategy development and workload assessment to tool selection and advanced cloud capabilities, we can help you at any stage of your cloud journey.

Don’t miss out! Schedule a one-on-one session with one of our technical experts:

Alex GorbachevCTO, Oracle ACE Director, Cloudera Champion of Big Data, specializes in cloud strategy and architecture, data and big data in the cloud, Securing data in the cloud, and IOT strategy.

Aaron Lee – VP Transformation Services, specializes in cloud strategy and transformation solutions, DevOps, big data, advanced analytics, and application platforms and migrations to AWS

Dennis Walker – Director, Engineering, specializes in cloud solution architecture and DevOps solutions for AWS.

Categories: DBA Blogs

Creating a Test Lab Using VirtualBox / NAT networking

Tue, 2015-09-22 12:37

My job is almost completely reliant upon my ability to perform work in lab of virtual machines. Almost every action plan I write is tested locally. When I need to troubleshoot an issue for a client one of the most common first steps I’ll perform is attempting to recreate the issue in a virtual environment so I can work on it there without the risk of impacting client data.

I believe that having a place to test and grow your skills is an absolute necessity for anyone working in the IT field today regardless of your specialization, even if you’re an IT generalist. But every now and then I hear about individuals who have issues with their virtual machines or with the virtual environment provided by their employer, so I figured this was a good time to share my method of creating a virtual lab. More specifically, one that allows you to do virtual work on a commodity laptop, one that won’t break down if you lose connectivity, one that won’t be interfered with if you connect to a restrictive VPN.

We’re going to build 2 CentOS 6 virtual machines on a virtual NAT network using Oracle VirtualBox. Before you begin, all you’re going to need is the installer for VirtualBox for your host machine and an ISO of CentOS 6 (64 bit).

The first thing we’re going to do once VirtualBox is installed is setup a NAT network.  The most common application of NAT networking is likely the usage of home broadband internet ‘routers’. This takes the single IP you get from your ISP and allows multiple machines to interface with it.

We’re going to use this concept to build a NAT network in your virtual environment that your virtual machines will connect to. By internalizing the network structure I will be able to interact with my virtual machines and allow my virtual machines to interact with one another regardless of whether my PC is connected to a network or not. This can be really handy if you’re having a networking problem, if you’re traveling, or if you’re inheriting a restrictive networking policy from a client’s soft VPN connection, etc. Regardless of what happens, if you can turn on your machine you can work in your lab of virtual machines.


For this tutorial I’ve installed VirtualBox version 5.0.4 on my local machine. A Nat Network is already setup by default, but you can verify this by going to File -> Preferences -> Network (on the left menu bar). On the Nat Networks tab you will likely see an network called ‘LocalNat’. You can click on the edit button (the little screwdriver icon) to see the network configuration. If you don’t have a Nat Network by default, create one by hitting the icon that looks like a network card with a ‘+’ symbol over it, then select it and click the edit button.


You will see the following options in the Nat Network Details window:

(Checkbox) Enable Network: Make sure this is checked.

Network Name: I’ve decided to stay with the name ‘LocalNat’.

Network CCIDR: This is the IP/Netmask of the Nat Network. For this tutorial I’ve chosen to stay with the default option of For those of you unfamiliar with slash subnet notation, /24 translates to, or rather a class ‘C’ network. This means all devices (or in this case, virtual machines) that connect to this network must have an IP starting with 10.0.2.XX, but the XX can be anything you want it to be above the number 3. The reason for this the first IP (1) is reserved for the host resolution and the second IP (2) is reserved for the network gateway.

(Checkbox) Support DHCP: Leave this unchecked. We will be assigning static IPs.

(Checkbox) Support IPv6: Leave this unchecked. We only need Ipv4.


There will also be a button for port forwarding. We will come back to that later after our first virtual machine has been created. Keep clicking OK until you’re back to VM VirtualBox Manager, as any networking configuration changes you have made will not be applied until you have done so.



The network is setup and we’re ready to create our first virtual machine! Click on new and this will take you to the ‘Create Virtual Machine’ window. The first option will be to name your virtual machine. In this case I am going to name my first virtual machine ‘CentVM1’ and, will use type ‘Linux’ and Version ‘Red Hat (64-Bit)’.


NOTE: Often I hear about people having issues with the 64-bit options not being available in the version menu. If you only see 32 bit versions, learn how to enable the 64 bit options.

The next option will be to set the amount of memory that your virtual machine will use. Bare in mind that the amount of virtual memory you set will be used on your local machine whenever the virtual machine is turned on. For this tutorial I have set the memory to 1024MB (1G), meaning whenever the virtual machine is on, 1G of memory will be used on the local machine to support it.


The next step will give you the option to create a virtual hard disk for your virtual machine. Select ‘create a virtual hard disk now’ and click ‘create’.


The next option will allow you to select a hard disk type. In this example I will use a VDI (VirtualBox Disk Image).


The next option will allow you to choose whether you want the virtual hard disk to be dynamically allocated or a fixed size.

Dynamically allocated means that the file on your machine that represents the virtual hard disk will start at a small size, but will grow as needed until it meets the cap designated by the size of the drive. For example, if you specify for the drive to be 20G in size it will appear as 20G in the virtual machine, but the file that represents this on your host machine will be much smaller until the disk space is actually used. The advantage to this is that you save space initially until you actually use the hard disk space in the virtual machine. The downside to this is if you need to use more space, you may incur I/O overhead to allow your host machine to expand the size of the file on demand. The file can also easily become fragmented and distributed on the physical drive platter.

Fixed size is just what it sounds like. If you specify that the virtual machine is going to have a 20G hard disk, a 20G file is going to be created on the host machine and will likely not experience any expansion beyond that. The downside is that the file that represents the disk on your host machine will be 20G regardless of how much of the disk is actually being used. The upside is that you don’t have to worry about losing track of how much your virtual machine disk files may expand.

In this case I find dynamic allocation to be acceptable as I’m only going to use this VM for testing and development. However, you can use whichever you feel comfortable with using.


The next option allows you to set the size of the virtual hard disk. For a MySQL lab virtual machine I would recommend no less than 10G. In this case, I’m going to use 20G in case I need the extra space.


After clicking ‘create’ you will be brought back to the VM VirtualBox Manager. You will see the VM you just created, but we can’t start it up just yet! We need to attach it to the Nat Network we setup earlier and we need to load the CentOS 6 ISO. Right click on your new virtual machine (CentosVM1) and click Settings.


In the settings window for you virtual machine, click ‘Storage’ on the left menu, in the storage tree section click the ’empty’ CD-ROM designation under the IDE controller, and then to the right click in the attributes section click on the icon that looks like a CD next to the optical drive drop down menu. Then click ‘Choose Virtual Optical Disk File’


Browse to and select the CentOS 6 iso. When you return to the Virtual machines setting window you will see that the iso has been loaded into the virtual optical drive on the IDE controller.


Next, on the left menu section, click ‘Network’. Here you will see that you have a network adapter that is enabled and is attached to NAT by default. Use the ‘Attached to’ drop down menu and select ‘Nat Network’, then on the ‘Name’ drop down menu, select the Nat Network we created earlier (LocalNat). This will attach this network controller to your Nat Network.


Click OK to go back to the VM VirtualBox Manager. We are now ready to start up the virtual machine! I know we said earlier we were going to create 2 virtual machines, but we’ll take care of the second one later using cloning. For now, select your first virtual machine (CentVM1) and click start.



Install OS / Configure networking

You should install CentOS as you normally would in any other circumstance. However, I would suggest that you consider a minimal installation when given the option of what installation you would like to use. You don’t know what you will or won’t have available to you on a client system, so I strongly prefer to have a virtual machine where I can’t assume anything and will need to install packages on an as needed basis. This will allow you to troubleshoot and be ready for issues you may face on a client system.


The only other thing to note is that during the installation I selected to use the hostname cent1.localhost.

Once the installation is complete we are going to have to use the virtualbox virtual machine terminal until we have connectivity established. We can start by setting up networking. You’ll notice that if you run ‘ifconfig’ there isn’t any network controllers enabled beyond the virtual machine’s loopback ( Let’s enable the controller by editing it’s config file. You can edit the file with vi by using the following command:

> vi /etc/sysconfig/network-scripts/ifcfg-eth0

You will want the file to contain the following:

DEVICE="eth0" #this is the default, do not change
HWADDR="08:00:27:F6:B3:84" #this is going to be the mac address of the VM's network interface, do not change
NM_CONTROLLED="yes" #this is the default, do not change
ONBOOT="yes" #set to yes. This is what sets the network interface to start at boot
BOOTPROTO="none" #Since we are using a static IP, we don't want any boot networking protocol like dhcp or bootp to run
IPADDR= #The IP address of your virtual machine. I typically start my first machine with 6 and then move up from there. This is a matter of personal preference
NETMASK= #This netmask matches the /24 subnet notation we set for the Nat Network earlier
GATEWAY= #As stated earlier, VirtualBow reserves the .2 IP in the subnet for gateway
DNS1= #The DNS server you want to use. Personal preference.


Once you’re done modifying the configuration file for adapter eth0, you will want to restart networking with the following command:

> /etc/init.d/network restart

You should now see your static IP when running ifconfig and should be able to ping out.





At this point the machine is able to perform its basic functions. Note that you can’t SSH to the server yet, we’ll get to that. This is where you will want to put any finishing touches on the virtual machine and create your first snapshot. Before creating my first snapshot I will typically take the following steps.

  • Stop iptables and remove it entirely from chkconfig so it doesn’t start at boot. This is the linux software firewall and hasn’t been needed for any work I’ve ever had to do in a lab. This is typically one of the big things that throws people off when they’re trying to establish inbound connectivity for the first time so I recommend disabling it.
  • Disable SELINUX.
  • Update the OS by running yum update.

Note that I did not install MySQL. The reason for this is I want a snapshot of just the operating system load and nothing else. I utilize snapshots a lot to quickly spin up different types of working environments and I find it’s easier to restore an OS load to create a new environment then to deal with uninstalling and reinstalling packages.

For example. If I have an Oracle MySQL 5.6 VM installed and I want to switch to Percona 5.6, I would stop the VM, take a snapshot of my Oracle load for future reference, restore the OS Load snapshot, start the VM, and now I have a freshly installed CentOS VM where I can install Percona 5.6 and then create a new snapshot for it. This way if I ever need to use Oracle or Percona I can just load the appropriate snapshot and start the VM. No need to have more than 1 VM unless you’re emulating a replication or clustering environment.

So now that we have our first VM configured. Let’s shut it down and snapshot it. Shutdown your VM using the following command.

> shutdown -h now

Once shut down, go back to the VM VirtualBox Manager. Click on your VM and then click on snapshots.


Right now the only thing that you should see is the current state. No other states have been saved at this point, so create a new snapshot by clicking the menu icon that looks like a camera. This will take you to the option to create a snapshot of your virtual machine. When naming your snapshot and creating the snapshot description BE DESCRIPTIVE. You are going to want to know what is the state of the snapshot without having to load it. Once done, click ok and you will see that the new snapshot is created.




Setting up port forwarding

The last thing you need to do for this VM is make it accessible. You’ll notice that if you try to SSH to the machine at you’re not going to get very far. The reason for this is because that IP is sitting behind a NAT network, just like your machine does if you are using a router between yourself and your ISP. If your local machine on your local network is, no one outside your network is going to be able to ping your machine from the internet, they would have to connect to you using your public facing IP address with a port that you have designated to forward from your router to your local machine. Remember how we mentioned NAT Network port forwarding earlier? We’re going to configure that now.

Click on File -> Preferences -> Network (on the left menu bar), select your NAT network (LocalNat) and click the screwdriver icon to the right to edit like we did before when confirming the networking settings for the NAT network. This will bring you back to the NAT networking details window, click on the ‘port forwarding’ button.

You’ll need to setup a rule to forward a port on your local machine to the port of your virtual machine. In this example we’re going to setup an SSH port forward. We’ll use the following options…

Name: CentVM1SSH

Protocol: TCP

Host IP: <blank>

Host Port: 61022

Guest IP:

Guest Port: 22


What we have done is created a port forwarding rule called ‘CentVM1SSH’ that has stated that any inbound connectivity on port 61022 on your local machine should be forwarded to port 22 on the virtual machine we just created. Now we should be able to SSH from our local machine to our VM using Don’t forget to turn your virtual machine back on before you try this! Also, be sure that you have clicked okay on all preferences windows and are back to the VirtualBox VM Manager before attempting as new networking port forward rules will not be applied until you have done so.



Cool! As you can see from the example images above we can SSH directly to our VM using the ports we have forwarded.

Having 1 virtual machine is nice, but it’s not going to be enough. How often do you see single server solutions? We need another! However, I don’t want to go through all that configuration again, so let’s just clone this virtual machine. Make sure your first virtual machine is shut down and then follow these steps.


Cloning your virtual machine

In the VM VirtualBox Manager, right click on your first virtual machine and click on clone.


This will bring up the ‘Clone Virtual Machine’ window. Enter the name for your second virtual machine (CentVM2), MAKE SURE ‘Reinitialize the MAC address of all network cards’ IS CHECKED and then click next. If you do not reinitialize the MAC address, it means that the network card on your new virtual machine will have the same MAC address as the first one. This is bad.


The next options will be to either create a full clone or a linked clone. A linked clone is like a snapshot. I prefer to go with full clones so long as I have the disk space to support them.


The next option will ask if you want the a clone of the machine in just it’s current state, or everything including the existing snapshots. I would suggest cloning the machine in it’s current state. We are going to have to update some networking settings on the new virtual machine and create a new OS load snapshot for it later.


After you click on the clone button, you’ll see a progress bar come up. On average the cloning process of a small VM with just an OS load typically takes about 5 minutes. Once it’s done you will see that you have 2 virtual machines at your disposal. However there are going to be some issues with the second virtual machine that we will need to fix. Specifically it has the hostname, IP, and MAC address of the first virtual machine in it’s configuration files. Let’s fix that!


First, we need to know what the MAC address of the new virtual machine is. We can get that by right clicking on the second virtual machine, click on settings and then select network from the menu on the left. Drop down the advanced options and note the MAC address.


Go back to the VM VirtualBox Manager and start up your second virtual machine.

You will want to edit the following files….


Change the ‘HOSTNAME’ entry to whatever you would like your new virtual machine host name to be.


Change the ‘HWADDR’ entry to the MAC address that was reinitialized for this virtual machine. This is the MAC address you noted earlier.

Change the ‘IPADDR’ entry to the IP address you would like for this machine.


Delete everything that comes below the comments at the top of the file. This file is used to store information to tie network adapters to their MAC addresses and will be repopulated on the next reboot.

One this is done, reboot your virtual machine. The following is output that shows what the contents look like on my second virtual machine after it was rebooted.

[root@cent2 ~]# cat /etc/sysconfig/network
[root@cent2 ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
[root@cent2 ~]# cat /etc/udev/rules.d/70-persistent-net.rules
# This file was automatically generated by the /lib/udev/write_net_rules
# program, run by the persistent-net-generator.rules rules file.
# You can modify it, as long as you keep each rule on a single
# line, and change only the value of the NAME= key. #Delete below this line before reboot
# PCI device 0x8086:0x100e (e1000) (custom name provided by external tool)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:7e:a7:a5", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

At this point you’re good to go. Shut down your second virtual machine. Take an OS load snapshot of it and you’re all set. You can power on both virtual machines and they should be able to communicate with one another with no additional configuration. Don’t forget to set up another port forward so you can SSH directly to your second virtual machine, and additional ports as needed to connect to services like MySQL.

This virtual machine setup is as compact as you would like it to be, while being effective and will remain available to you so long as you can power your machine on. Enjoy! Oh, and also, everything written in this tutorial was tested using virtual machines.

Good day!


Discover more about Pythian and our technical expertise.

Categories: DBA Blogs

SQL On The Edge #2 – SQL 2016 Temporal Tables

Tue, 2015-09-22 12:31

Hello and welcome to our second episode of SQL On The Edge! On this episode we’re going to focus on a new feature of SQL Server 2016 called Temporal Tables.

What’s a Temporal table?

Temporal tables allow us to go back in time and see what the data looked like at some point in the past and are also referred to as “system-versioned” tables. This has several different uses, some of which are:

a) Auditing
b) Quickly fix mistakes
c) Data trending

A temporal table is implemented as two different tables by SQL Server and they’re displayed transparently as one table to the user. Each one of these tables also has two datetime2 columns to track the START of the validity of a record and the END of the validity. These tables are referred to as the current and the history table. Partitioning is supported for both the current and the history table to make it easier to work with large amounts of archived data.


Enabling support for a table to be temporal comes with some limitations, these are the main ones:
– A PK is required on the current table.
– The table can’t be TRUNCATED.
– No support for FILETABLE or FILESTREAM.

There are more, for the full list refer to the documentation.

Creating and Querying

For creating a temporal table and query examples let’s jump to the video, enjoy!


Discover more about our expertise in SQL Server.

Categories: DBA Blogs

Windows Containers: Installing SQL Server

Mon, 2015-09-21 14:28

This blog post is a quick introduction to Containers in the Windows world, and a walk-through on installing SQL Server in a Windows Container.


As many of you have heard, Microsoft is jumping into containers with native support for Docker containers in Windows 2016. Containers are the current big thing in virtualization, Linux, and DevOps, because the are very light-weight and allow you to quickly create a new environment without having to wait for a VM to be deployed and provisioned by the server team. I expect them to be just as useful and ubiquitous in the Windows world very soon.

Hypervisors are based on emulating hardware, and so they can be very resource intensive. At a minimum, they’re required to have an entire Operating System, CPU, RAM, and some drives assigned before they’re useable, and that’s often overkill for a VM running a single application. Containers, by contrast, virtualize only the OS level and share the kernel libraries between them, and you don’t need to worry about the rest. Containers are small and light-weight enough, that you can expect to run 4 to 6 times as many containers vs VMs on one host.

This MSDN Blog Post goes into detail on containers and their differences from VMs.

It’s important to note that containers are meant to run a single application and do not have GUI interfaces. So, everything must be run via the command line or a remote connection.

Why use containers for SQL Server?
  1. You need to quickly create a set of SQL Server instances for development or testing.
  2. Your company runs a Software-as-a-Service and wants to separate clients into different environments while squeezing everything they can from their hardware.
  3. You want to be able to share development environments without everyone getting in each others way.
  4. Your VM or Server team just isn’t very good, and they take forever to get you what you need.
Installing SQL Server in a Windows Container

The following is a walk-through for installing SQL Server in a Windows Container. You might want to reference the Docker documentation for more details on the commands I use.

When you’re done with the tutorial, try to get multiple containers and their instances of SQL Server running on the same box.

Step 1: Create a New Server

The server should be running Windows Server 2016 Technical Preview 3 (or higher) Core with the Container role enabled.

I used Azure’s “Windows Server Container Preview” VM for this tutorial, which luckily has the Host OS all setup for me. Find out more details on setting up Windows to run containers.

* A quick note for anyone who hasn’t used Windows Server Core before: Open Task Manager and use File–Run New Task to get new CMD windows.

At this point, you should also create a new directory structure in your VM:


Creating an Azure VM

Creating an Azure VM


Step 2: Configure Azure Security Rules

If you’re using Azure, you need to define the Inbound Security Rule for this port. To get there in Azure:
From the VM’s main blade click: All Settings — Network Interfaces — [Interface Name] — Network Security Group — All Settings — Inbound Security Rules.

The default rule to allow RDP traffic will be there. Create another rule to allow SQL Server traffic. For reasons I don’t understand, setting the port to 1433 here doesn’t work. You need to open it up, and hope your firewall is up to date.

Creating an Inboud Security Rule


Step 3: Configure Windows Firewall

Run the following in Powershell on your host to open the right ports. I’ll be using the default port 1433:

if (!(Get-NetFirewallRule | where {$_.Name -eq "SQLServer 1433"})) {
New-NetFirewallRule -Name "SQL Server 1433" -DisplayName "SQL Server 1433" -Protocol tcp -LocalPort 1433 -Action Allow -Enabled True


Step 4: Enable .Net 3.5 Framework

This is a hassle. The base Windows image that Microsoft provides does not have .Net Framework 3.5 enabled. So, you need to enable it in the container which should be easy enough, and we’ll get to that. Unfortunately, for reasons that I do not understand, when attempting to install .Net 3.5 in the container, it doesn’t use WindowsUpdate and fails. If you have a Windows .iso file (which I don’t), you can theoretically point the below command at it from within the container, and it should work.

The “fix” is to enable .Net 3.5 on the host, export the registry keys, and then import them into the Container’s registry. This tricks the SQL Server installer into thinking you have it enabled. Does SQL Server 2016 need anything in the .Net 3.5 SP1 Framework? Probably!

Seriously, I spent hours banging my head against this thing and if you can figure out how to get out to from your container, please let me know.

Enable and upgrade the .Net 3.5 Framework on the host server by running the following commands within Powershell. You don’t need to do this if you have a Windows .iso file because we’ll be installing it in the container later.

get-windowsfeature -name NET-Framework-Features | install-windowsfeature
get-windowsfeature -name NET-Framework-Core | install-windowsfeature


Using regedit, export the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\NET Framework Setup\NDP\v3.5 registry keys and save them as C:\mssql\install\registrykeys\registry.reg


Step 5: Download and Extract SQL Server Installation Files

Run the following commands in Powershell on your host to download the SQL Server installers. Change URLs as needed…

wget -uri '' -outfile 'SQLServer2016-x64-ENU.exe'
wget -uri '' -outfile ''

Run the executable and save the files at C:\mssql\install. You should delete or move the .exe & .box files as well.

Step 6: Create SQL Server Configuration File

As mentioned earlier, containers don’t allow any GUI interfaces, so SQL Server has to be installed silently. In addition, not everything in SQL Server is supported on Windows Server Core.

I used this configuration file. If you use the same one, make sure you change the password (search for CHANGEME).

Please put your configuration file at C:\mssql\install\configurationfile.ini.


Step 7: Create your dockerfile

Docker uses the dockerfile as a configuration file and to ensure images are built exactly the same every time.

Take the below code and save it in a text file as c:\mssql\dockerfile

The lack of extension is on purpose. This isn’t a new folder. If Windows insists on saving the file with a .txt extension, which happened to me a couple of times, use the Powershell rename-file command and remove the extension.

#Define the base image we'll be building everything else off of...
FROM windowsservercore

#Give it a label
LABEL Description=”SQL Server” Vendor=”Microsoft” Version=”13.00.500.53″

#These files and folders will be imported into the docker image and be available for us to use.
ADD install/SQLServer2016-x64-ENU /mssql/install
ADD install/configurationfile.ini /mssql/install/configurationfile.ini
ADD install/registrykeys/registry.reg /mssql/registrykeys/registry.reg


Step 8: Build Docker Image

At this point, everything you need to install SQL Server should be staged somewhere underneath the c:\mssql directory and you should have a reference to each file or the folder in your dockerfile.

To build the docker image, run this:

docker build -t mssql2016 c:\mssql

This command tells docker to build an image with a name of mssql2016, and that the dockerfile is located in the c:\mssql folder.

While it’s running, open up Task Manager and watch how much CPU & RAM it uses. Also, get some coffee and check your email. You’ve got time.


Step 9: Verify

After the build completes, run the below command to see all of the images you have available. It should be a magical rainbow of three choices.

docker images


Step 10: Run your image

This command will run your image

docker run -it --name mssqlContainer -p 1433:1433 mssql2016 cmd

Let’s walk through each of these parameters:

  • it | Runs an interactive psuedo-terminal.
  • name | The name of your running container.
  • p 1433:1433 | This binds port 1433 on the host to port 1433 on the container process.
    • In other words, any traffic coming into the host on port 1433 will be forwarded to the container’s port 1433.
  • mssql2016 | The name of the image you want to run.
  • cmd | The utility that you’ll be running.


Step 11: Enable .Net 3.5 Framework

As mentioned back in Step 4, this is a hassle.

We need to import the registry keys into the Container’s registry to trick the SQL Server installer into thinking we have .Net 3.5 SP1 installed. IN ADDITION, we need to enable as much as possible of the actual .Net 3.5 framework so it’s at least sort of usable. So, run the following commands to enable .Net 3.5 and import the registry keys.

DISM /online /enable-feature /featurename:NetFx3ServerFeatures
reg import C:\mssql\registrykeys\registry.reg


Step 12: Install SQL Server in a Windows Container

Navigate to C:\mssql\install and run the below command to install SQL Server using the values setup in your configuration file.

setup /IAcceptSQLServerLicenseTerms /ConfigurationFile=configurationfile.ini


Step 13: Fix the installation

At this point, the SQL Server instance should be up and running. Unfortunately, there’s a good chance the next time you start the container that the instance will not come up.

Here’s a blog post talking all about what happens. It appears to be due to how the container shuts down the underlying processes (or doesn’t).

The quick fix is to go against every best practice document and run SQL Server under LocalSystem.

sc config MSSQLSERVER obj=LocalSystem


Step 14: Connect to SQL Server

As a test of the instance, you can use OSQL from the command line to verify it’s up and running.
C:\Program Files\Microsoft SQL Server\130\Tools\Binn>osql -E

From your local machine, connect to the SQL Server instance. You should use your host server’s IP address (the Public IP address in Azure).

Congratulations! (but you’re not done yet)


Step 15: Save your work

Boy, it sure would be a shame if something happened to that nice, new SQL Server installation you’ve got.

Once you’re done playing with the instance, head back to the command window with access to your container. I recommend attempting to cleanly stop the container. From another command window on the host, run:

docker ps
## The output from docker ps will give you a ContainerID. Use it in the stop command.
docker stop [ContainerID]


Alternatively, just type exit as many times as necessary to get back to the host’s command line, and the container will shut down very poorly.

Type this to commit the new image to your repository
docker commit [ContainerID] mssql2016:Installed

This will save your container locally and give it a new tag of Installed. This will also take some time.

To start it again, use:
docker run -it --name mssqlcontainer mssql2016:Installed cmd


Discover more about our expertise in SQL Server.

Categories: DBA Blogs

Log Buffer #441: A Carnival of the Vanities for DBAs

Fri, 2015-09-18 13:38

This Log Buffer Edition dives deep into the ocean of blogosphere and surfaces with some cool blog posts from Oracle, SQL Server and MySQL.


  • Lets Talk DB Perth
  • The Fundamental Challenge of Computer System Performance
  • Index Advanced Compression: Multi-Column Index
  • Middleware Diagnostics Advisor (MDA) Setup
  • Most people are now aware that in 12c, a VARCHAR2 has been extended from a maximum size of 4000 to 32767 bytes.

SQL Server:

  • Creating SQL databases on Azure Blob storage
  • Monitoring Availability Group Replica Synchronization
  • PowerShell Tool Time: Controlling Our Tools
  • Automated Patching of SQL Server IaaS VMs
  • SQLCMD Mode; Run all SQL files in a directory


  • Clarification on “Call me Maybe: MariaDB Galera Cluster”
  • MySQL Aurora CPU spikes
  • Oracle HA, DR, data warehouse loading, and license reduction through edge apps
  • Protecting MySQL Passwords With the sha256_password Plugin
  • Ever get called out for a MySQL issue only to realize that there was no issue?  It was a false alarm from the monitor.
Categories: DBA Blogs

My Sales Journey: #3

Fri, 2015-09-18 13:26


This week started off with the last on boarding session with the CEO himself. It was my favorite for many reasons. It was passionate the way it can be when you build something from nothing with blood, sweat and tears and it resonated with me having been down that path myself. You hear it in the tone of their voice and the pride in their eyes. That BORG (boarding of recruited gurus!) is the inspiration for today’s post. Here are my takeaways from Week 3:

To be effective in a sales role you need to speak like a CEO, learn your product inside out and when people listen to you they must hear how amazing your product/service is without you ever saying how amazing it is. Your passion will shine through by your demonstrated knowledge.

Outreach needs organization as it demands many different tasks to be done at once. It is a daunting task but I am sure it will get better as I learn how to get the best out of my day. I am lucky to have a set up that allows me focus like two big screens, headsets, phones and most of all a manager who is there to assist and actively involved to make sure I succeed.

Being flustered is normal! You know that moment when you are making calls and leaving voice messages and suddenly you get a live one! You mumble your way through it and think Wow! I could have been so much better. I had one of those moments this week and it made me go out and do research and find better questions to ask. I chalk it down to my learning curve and move on.

As a rookie, your team is your support structure. Again, lucky to be surrounded by people who collaborate, take the time to help out, craft email templates and show you how its done on the phone. Without all that this fish jumping into unknown waters would certainly drown.

It’s been a good week! Share this with your rookies or if you are rookie tell me how your week was. Looking forward to hearing from you.


Categories: DBA Blogs

Part 3: Comparing Oracle Cloud Database Backups Options

Wed, 2015-09-16 10:43
Comparing Oracle Database Backup Service (ODBS) and Oracle Secure Backups (OSB) to Amazon Web Services (AWS)

This is part 3 of a 3 part series on “Getting Started with Oracle Cloud Backups”.

  • Part 1 covers setting up RMAN to backup directly to the new Oracle Cloud Database Backup Service (ODBS).
  • Part 2 covers setting up RMAN to backup directly to the cloud using Amazon Web Services (AWS) Simple Storage Service (S3).
  • Part 3 compares and contrasts the two services.



Oracle recently announced their Oracle Database Backup Service (ODBS) as part of their big push to the Cloud. However while the name is new, the technology really isn’t. It’s effectively just a re-brand of their Oracle Secure Backup Cloud Module which was introduced years ago, initially with the ability to backup to the Amazon Simple Storage Service (S3). The functional and non-functional differences are minor but are summarized in this article.


Use Case

Both services probably appeal mostly to small or medium sized business looking for off-site backups for whatever reason (such as DR or regulatory requirements).

Keep in mind that a service like this probably isn’t a full replacement for your onsite primary backup storage device. But it very well could replace your old-style off site backup or tape vaulting vendor, which usually involves a physical pickup of backup tapes and transportation to a storage location on a daily, weekly, or in some cases monthly basis.

And while the restore times are certainly going to be considerably slower than restoring from on-premise disk based devices, it’s undoubtedly faster than bringing back tapes from an offsite storage location through a vendor service (time of which is usually measured in days with ad-hoc recall requests often being at an additional expense).

The specifics of how to technically get started with implementing either service is discussed in the previous articles of this series.


Decision Criteria Checklist

Many factors come into consideration when deciding on whether to allow a business critical database to travel off-site and when selecting the appropriate vendor or service to do so. The following generic checklist is simply a guide of suggested criteria that one may consider:

  • Storage costs (metered or flat rate; progressive or flat incremental rates)?
  • Ease of installation (outages required or not)?
  • Effects on internal processes (i.e. does the module need to be reinstalled when home cloning; changes to RDBMS software installation processes)?
  • Ease of use?
  • Changes required to existing scripts/processes (i.e. complicated changes to RMAN commands or scripts; changes or integration with 3rd party backup tools required)?
  • Is backed-up data secured at rest (at vendor data center)?
  • Is backed-up data secured in flight (transfer to or from vendor data center through the public internet)?
  • Upload and download performance (is there an effect on Recovery Time Objectives)?
  • Is transferring the additional data going to effect the organization internet performance or costs from their ISP?
  • Additional licensing costs?


Pros and Cons

The answers to some of the above criteria quests are going to be site and/or database specific. Others have been discussed in more details in the other articles in this series.

However, the pros and cons of each service can be summarized as follows:

OSDB ProsOSDB ConsNo upfront costs (no additional licenses required)No security through keys/credentials – instead must use “users” correlated to actual named users and email addressesReasonable and competitive metered ratesNavigating between services, accounts, and domains not as simple as it should beSecurity at-rest and in-flight through mandatory encryption and HTTPS transferWebUI doesn’t show details beyond overall “space used” (i.e. doesn’t show files or per database space usage)Advanced RMAN compression option included (without requiring the Advanced Compression Option)Can’t specify Oracle data center used, just the key geographical region (i.e. North America)Data is triple mirrored in Oracle Data CenterNo ability to replicate data between Oracle data centers


OSB & AWS ProsOSB & AWS ConsAbility to create unique IDs and keys for each DB being backed up (credentials independent of named users)Requires licenses for the “Oracle Secure Backup Cloud Module”, which is licensed on a RMAN per-channel basisBilling calculator for easy cost estimationBy default data is neither secure at-rest or in-flight (though both can be enabled)Additional options with S3 such as more specific data center selection and cross-region replication

It should be noted that while the Oracle Secure Backup Cloud Module is licensed on a per “stream” or per RMAN channel basis, those RMAN channels are not dedicated to one database. Rather, they are concurrently in-use channels. So if you had licenses for 10 “streams” (channels) those could be used by concurrently by 10 different databases each only using one RMAN channel or one database using 10 RMAN channels or any combination there of.

And while both provide use of backup encryption and advanced compression options as part of “special-use licensing”, it should be noted that these options are available only for the cloud based (or in the case of OSB, SBT based) backups. Regular disk based backups of the same database(s) would still require the Advanced Security Option for RMAN backup encryption and the Advanced Compression Option for anything other than “BASIC” RMAN backup compression.

The AWS solution also provides (by default) the option of not securing the data at rest or in flight. Not encrypting RMAN backups is beneficial when trying to take advantage of storage based deduplication which is not relevant here. Hence I struggle to think of a strong business use case for not wanting to encrypt backups all else being equal? Similarly why one may want to choose HTTP over HTTP for critical business data?



One possible requirement may be the need to use both services concurrently for test/evaluation purposes. Fortunately, since the module (library, configuration, and wallet) files are all uniquely named, it’s absolutely possible to use both services concurrently, even from within the same RMAN session. For example:

RMAN> run {
2> allocate channel odbs type sbt
3> PARMS=',SBT_PARMS=(OPC_PFILE=/u01/app/oracle/product/12.1.0/dbhome_2/dbs/opcCDB121.ora)';
4> backup tablespace users;
5> }

allocated channel: odbs
channel odbs: SID=270 device type=SBT_TAPE
channel odbs: Oracle Database Backup Service Library VER=

Starting backup at 10-SEP-15
channel odbs: starting full datafile backup set
channel odbs: specifying datafile(s) in backup set
input datafile file number=00006 name=/u01/app/oracle/oradata/CDB121/users01.dbf
channel odbs: starting piece 1 at 10-SEP-15
channel odbs: finished piece 1 at 10-SEP-15
piece handle=2tqgq3t5_1_1 tag=TAG20150910T114021 comment=API Version 2.0,MMS Version
channel odbs: backup set complete, elapsed time: 00:00:15
Finished backup at 10-SEP-15
released channel: odbs

RMAN> run {
2> allocate channel aws_s3 type sbt
3> PARMS=',SBT_PARMS=(OSB_WS_PFILE=/u01/app/oracle/product/12.1.0/dbhome_2/dbs/osbwsCDB121.ora)';
4> backup tablespace users;
5> }

allocated channel: aws_s3
channel aws_s3: SID=270 device type=SBT_TAPE
channel aws_s3: Oracle Secure Backup Web Services Library VER=

Starting backup at 10-SEP-15
channel aws_s3: starting full datafile backup set
channel aws_s3: specifying datafile(s) in backup set
input datafile file number=00006 name=/u01/app/oracle/oradata/CDB121/users01.dbf
channel aws_s3: starting piece 1 at 10-SEP-15
channel aws_s3: finished piece 1 at 10-SEP-15
piece handle=2uqgq3un_1_1 tag=TAG20150910T114111 comment=API Version 2.0,MMS Version
channel aws_s3: backup set complete, elapsed time: 00:00:15
Finished backup at 10-SEP-15
released channel: aws_s3



However, if the wrong SBT library is being used by certain RMAN commands trying to access the backup pieces, the following RMAN error will be returned:

RMAN-06207: WARNING: 2 objects could not be deleted for SBT_TAPE channel(s) due
RMAN-06208:          to mismatched status.  Use CROSSCHECK command to fix status
RMAN-06210: List of Mismatched objects
RMAN-06211: ==========================
RMAN-06212:   Object Type   Filename/Handle
RMAN-06213: --------------- ---------------------------------------------------
RMAN-06214: Backup Piece    2qqgq3ik_1_1
RMAN-06214: Backup Piece    c-3847224663-20150910-01


The issue with the above example DELETE BACKUP command is resolved by simply allocating a channel using the proper library for the proper vendor.

Running similar backups as the above commands but on an entire CDB (using the RMAN command “BACKUP AS COMPRESSED BACKUPSET DATABASE;”) rather than just one tablespace in order to have a larger source or amount of data to process shows some interesting results:

SQL> SELECT status, input_type,
  2         backed_by_osb, compression_ratio,
  3         input_bytes_display, output_bytes_display,
  4         input_bytes_per_sec_display, output_bytes_per_sec_display,
  5         time_taken_display
  6    FROM v$rman_backup_job_details
  7   ORDER BY start_time;

---------- ------------- --- ----------------- -------------- -------------- -------------- -------------- ----------
COMPLETED  DB FULL       YES        6.60411404     4.47G        692.75M          1.59M        245.88K      00:48:05
COMPLETED  DB FULL       YES        6.60794224     4.47G        692.50M          1.96M        303.43K      00:38:57


The interesting points are:

  1. The backups to OSDB (top line in red) consistently took longer on my system than the backups to AWS (blue second line). Would need longer term testing to see whether this is an anomaly or a pattern. Backup performance time would need to be evaluated in detail when selecting a service.
  2. Both are recorded in the catalog as using OSB (the “backed_by_osb” column). This is no surprise as the OSDB module is simply a copy of the OSBWS module as is apparent by the API version numbers.



From an RMAN functional perspective the two are almost identical. As would be expected since the ODBS module and library is essentially just a clone of the OSB module and library. With a re-branding and slight modification simply to differentiate it and to conveniently provide Oracle an exemption from their own licensing requirements. Which is not uncommon from Oracle – after-all they do like to provide exceptions to themselves, allowing them to promote their products and services over their competition.

From a data management perspective the Amazon S3 is far ahead with additional features, such as regional replication and more granular storage location options. Something Oracle very well may catch up on but at time of writing does not yet provide.

Hence, I think the choice really comes down to priorities. Additional storage/data management options vs. additional licensing costs. For many smaller customers price is a key concern and therefore the Oracle solution likely preferable as essentially it is the same as the OSB & AWS solution but without the license requirement.


Discover more about our expertise in Oracle and Cloud.

Categories: DBA Blogs

Part 2: Oracle Cloud Backups to AWS S3

Wed, 2015-09-16 07:08
Getting Started with Oracle Cloud Backups – Amazon Web Services (AWS)

This is part 2 of a 3 part series on “Getting Started with Oracle Cloud Backups”.

  • Part 1 covers setting up RMAN to backup directly to the new Oracle Cloud Database Backup Service (ODBS).
  • Part 2 covers setting up RMAN to backup directly to the cloud using Amazon Web Services (AWS) Simple Storage Service (S3).
  • Part 3 compares and contrasts the two services.

As mentioned in part 1 of this series, it’s actually really easy to get started with writing Oracle backups directly to the cloud. So regardless of the motivation, it’s certainly possible to go from zero to fully backed-up to a cloud based service in just a few hours, hence meeting any requirements to implement off-site backups extremely quickly. Of course, there are several dependencies including the database size, internet upload speed (and cost), and change control formalities.

Recovery speed is most certainly going to be slow compared to recovering from local on-premise storage, but this service probably isn’t meant to replace your on-premise backup strategy, but rather compliment it or possibly replace your off-site tape-vaulting vendor service (at least for Oracle database backups). Regardless, recovery time and RTO must of course also be taken into consideration.

Also, while the Amazon Web Services metered usage costs are extremely competitive, directly integrating with RMAN does require the Oracle Secure Backup Cloud Module, which is a licensed option.

However, Amazon does also provide some additional unique features with their S3 storage: namely object versioning, object life-cycle rules, the ability to pick the storage region with more granularity (i.e. a specific region within the United States) and “Cross-Region Replication”.

This article provides a quick start for getting up and running with Oracle RMAN backups to AWS S3 using the Oracle Secure Backup (OSB) Cloud Module for users new to the service and cloud backups.


Backing up to Amazon Web Services (AWS)

Backing up to the AWS Simple Storage Service or S3 isn’t new. I first experimented with this in June 2011 and hadn’t really touched it since. Yet, fundamentally it hasn’t really changed at all since that time.

From a very high level the process is:

  1. Create an AWS S3 account and setup the necessary credentials.
  2. Install an AWS specific Oracle Secure Backup library into your Oracle Home.
  3. Run an RMAN backup using the SBT_TAPE device type.

Really that’s it. And the first two steps are pretty quick and easy. The 3rd is the one that will vary depending on the size or your database and your upload bandwidth.


Creating an AWS S3 Account

Creating the necessary AWS S3 account is almost trivially easy. If you or your organization does not already have an “Amazon Web Services account”, start by signing up for the Free Trial. The free trial gives you 5GB of storage (along with about 20 other services) for a full year.

For almost all organizations the 5GB of storage probably won’t be sufficient. However, Amazon does provide a simple to use billing calculator to try to estimate your monthly costs based on usage.

Amazon also provides other benefits, such as being able to choose what data centers or geographic regions your data is stored in. Further details can be found in their FAQ.

After filling in the necessary information and creating an account (if your organization does not already have one), the next step is to set up a security credential. In the top right corner of your S3 console navigate to “Security Credentials”:



From the “Users” option select “Create New Users”:



Enter a user name and do check the check box to generate an access key – this is what will be used by RMAN, not the actual username:



Once added, remember to record display (and maybe temporarily record or use the “Download Credentials” button in the bottom right) the “Access Key ID” and “Secret Access Key”, as you’ll need this information during the setup. Don’t worry, if you forget or loose this information, it’s very easy to generate new security credentials for this user. Or, to delete credentials as I have for the one shown in the screenshots. (It’s conceptually similar to the “single use password” you can generate with a Google Account with their 2-factor authentication enabled.)



Before the new credential can be used, it must be given permissions though a “Policy”. Just like within the Oracle database, the required permission (or “policy”) can be granted directly or through a role allowing for flexible management.

From “Users” click on the newly created username and then the “Attach Policy” button:



For RMAN to write to the S3 storage, the “AmazonS3FullAccess” policy will be required. After selected choose the “Attach Policy” button in the bottom right.

At this point, we’re ready to start configuring the database and using the service.


Installing the “Oracle Secure Backup Cloud Module for Amazon S3”

Installing the necessary “Oracle Secure Backup Cloud Module for Amazon S3” into your Oracle home is very easy.

From OTN download an installer Java JAR file and copy and extract the zip to your database server. When run, the installer will determine the proper database version and OS platform, as well as download the appropriate library file to your Oracle home or other specified directory.

Installation requires a few mandatory arguments, namely:

  • The AWS credentials created previously
  • Your OTN username (but not your Oracle account password)
  • The location for the library file it will download. Usually use $ORACLE_HOME/lib
  • The location for the secure wallet file which stores the AWS credentials

There are a number of other optional arguments (specified in the README or by running the JAR file without arguments) such as proxy server details if necessary.

Example installation:

$ java -jar osbws_install.jar \
>    -AWSID AKI***************QA \
>    -AWSKey no/MD*******************************upxK \
>    -otnUser \
>    -walletDir $ORACLE_HOME/dbs/osbws_wallet \
>    -libDir $ORACLE_HOME/lib
Oracle Secure Backup Web Service Install Tool, build 2015-06-22
AWS credentials are valid.
S3 user already registered.
Registration ID: 17d*****-0***-4***-8***-41e******ccc
S3 Logging Bucket: oracle-log-pane-1
Validating log bucket location ...
Validating license file ...
Oracle Secure Backup Web Service wallet created in directory /u01/app/oracle/product/12.1.0/dbhome_2/dbs/osbws_wallet.
Oracle Secure Backup Web Service initialization file /u01/app/oracle/product/12.1.0/dbhome_2/dbs/osbwsCDB121.ora created.
Downloading Oracle Secure Backup Web Service Software Library from file
Downloaded 27151475 bytes in 40 seconds. Transfer rate was 678786 bytes/second.
Download complete.


This determines the appropriate Oracle Secure Backup library file for your OS platform and database version and downloads it to the specified location (recommended $ORACLE_HOME/lib) and creates a config file and the wallet file:

$ ls -ltr $ORACLE_HOME/lib | tail -1
-rw-r--r--. 1 oracle oinstall  86629108 Sep  3 09:05

$ ls -ltr $ORACLE_HOME/dbs | tail -1
-rw-r--r--. 1 oracle oinstall      204 Sep  3 09:04 osbwsCDB121.ora

$ cat $ORACLE_HOME/dbs/osbwsCDB121.ora
OSB_WS_WALLET='location=file:/u01/app/oracle/product/12.1.0/dbhome_2/dbs/osbws_wallet CREDENTIAL_ALIAS=pane_aws'

$ ls -l $ORACLE_HOME/dbs/osbws_wallet
total 12
-rw-r--r--. 1 oracle oinstall 10228 Sep  3 09:04 cwallet.sso


At this point we’re ready to backup directly to the AWS S3 cloud.


Using with RMAN

Sending the RMAN backup pieces to the AWS S3 storage is as simple as backing via the normal RMAN commands but to the SBT_TAPE device. Of course the new OSB library file and configuration file to use must be specified. For example we can backup in a single run block without over-riding any of our existing RMAN configuration:

backup device type SBT_TAPE tablespace users;

RMAN> run {
2> allocate channel aws_s3 type sbt
3> parms=',SBT_PARMS=(OSB_WS_PFILE=/u01/app/oracle/product/12.1.0/dbhome_2/dbs/osbwsCDB121.ora)';
4> backup tablespace users;
5> }

allocated channel: aws_s3
channel aws_s3: SID=21 device type=SBT_TAPE
channel aws_s3: Oracle Secure Backup Web Services Library VER=

Starting backup at 03-SEP-15
channel aws_s3: starting full datafile backup set
channel aws_s3: specifying datafile(s) in backup set
input datafile file number=00006 name=/u01/app/oracle/oradata/CDB121/users01.dbf
channel aws_s3: starting piece 1 at 03-SEP-15
channel aws_s3: finished piece 1 at 03-SEP-15
piece handle=0aqg7f8h_1_1 tag=TAG20150903T095737 comment=API Version 2.0,MMS Version
channel aws_s3: backup set complete, elapsed time: 00:00:15
Finished backup at 03-SEP-15

Starting Control File and SPFILE Autobackup at 03-SEP-15
piece handle=c-3847224663-20150903-01 comment=API Version 2.0,MMS Version
Finished Control File and SPFILE Autobackup at 03-SEP-15
released channel: aws_s3



And to verify:

RMAN> list backup of tablespace users;

List of Backup Sets

------- ---- -- ---------- ----------- ------------ ---------------
6       Full    1.34M      DISK        00:00:00     03-SEP-15
        BP Key: 6   Status: AVAILABLE  Compressed: NO  Tag: TAG20150903T094342
        Piece Name: /u01/app/oracle/product/12.1.0/dbhome_2/dbs/06qg7eee_1_1
  List of Datafiles in backup set 6
  File LV Type Ckp SCN    Ckp Time  Name
  ---- -- ---- ---------- --------- ----
  6       Full 1366827    03-SEP-15 /u01/app/oracle/oradata/CDB121/users01.dbf

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
8       Full    1.50M      SBT_TAPE    00:00:02     03-SEP-15
        BP Key: 8   Status: AVAILABLE  Compressed: NO  Tag: TAG20150903T095453
        Handle: 08qg7f3d_1_1   Media:
  List of Datafiles in backup set 8
  File LV Type Ckp SCN    Ckp Time  Name
  ---- -- ---- ---------- --------- ----
  6       Full 1367173    03-SEP-15 /u01/app/oracle/oradata/CDB121/users01.dbf



Notice that the first backup was a local one run earlier and shows a local backup piece (file). The second shows that the media was “”. The “oracle-data-pane-1” is the “bucket” or logical container automatically created within the Amazon S3.

If we want to make the backup command automatically use the AWS S3 SBT_TAPE it’s trivial to do using the RMAN CONFIGURE command:

RMAN> configure channel device type sbt parms=',
2> SBT_PARMS=(OSB_WS_PFILE=/u01/app/oracle/product/12.1.0/dbhome_2/dbs/osbwsCDB121.ora)';

using target database control file instead of recovery catalog
new RMAN configuration parameters:
CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' PARMS  ',SBT_PARMS=(OSB_WS_PFILE=/u01/app/oracle/product/12.1.0/dbhome_2/dbs/osbwsCDB121.ora)';
new RMAN configuration parameters are successfully stored

RMAN> backup device type sbt tablespace users;

Starting backup at 03-SEP-15
released channel: ORA_DISK_1
allocated channel: ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: SID=16 device type=SBT_TAPE
channel ORA_SBT_TAPE_1: Oracle Secure Backup Web Services Library VER=
channel ORA_SBT_TAPE_1: starting full datafile backup set
channel ORA_SBT_TAPE_1: specifying datafile(s) in backup set
input datafile file number=00006 name=/u01/app/oracle/oradata/CDB121/users01.dbf
channel ORA_SBT_TAPE_1: starting piece 1 at 03-SEP-15
channel ORA_SBT_TAPE_1: finished piece 1 at 03-SEP-15
piece handle=0eqg7ft3_1_1 tag=TAG20150903T100834 comment=API Version 2.0,MMS Version
channel ORA_SBT_TAPE_1: backup set complete, elapsed time: 00:00:15
Finished backup at 03-SEP-15

Starting Control File and SPFILE Autobackup at 03-SEP-15
piece handle=c-3847224663-20150903-03 comment=API Version 2.0,MMS Version
Finished Control File and SPFILE Autobackup at 03-SEP-15



And really that’s all there is to it. Of course you can perform more advanced RMAN commands such as allocating multiple channels, etc. And we’re free to perform any combination of local backups and/or cloud backups to the SBT_TAPE device type:

RMAN> list backup summary;

List of Backups
Key     TY LV S Device Type Completion Time #Pieces #Copies Compressed Tag
------- -- -- - ----------- --------------- ------- ------- ---------- ---
1       B  F  A SBT_TAPE    01-SEP-15       1       1       NO         TAG20150901T123222
2       B  F  A SBT_TAPE    01-SEP-15       1       1       NO         TAG20150901T123222
3       B  F  A SBT_TAPE    01-SEP-15       1       1       NO         TAG20150901T123222
4       B  F  A SBT_TAPE    01-SEP-15       1       1       NO         TAG20150901T123222
5       B  F  A SBT_TAPE    01-SEP-15       1       1       NO         TAG20150901T150814
6       B  F  A DISK        03-SEP-15       1       1       NO         TAG20150903T094342
7       B  F  A DISK        03-SEP-15       1       1       NO         TAG20150903T094343
8       B  F  A SBT_TAPE    03-SEP-15       1       1       NO         TAG20150903T095453
9       B  F  A SBT_TAPE    03-SEP-15       1       1       NO         TAG20150903T095737
10      B  F  A SBT_TAPE    03-SEP-15       1       1       NO         TAG20150903T095752
11      B  F  A DISK        03-SEP-15       1       1       NO         TAG20150903T100555
12      B  F  A DISK        03-SEP-15       1       1       NO         TAG20150903T100557
13      B  F  A SBT_TAPE    03-SEP-15       1       1       NO         TAG20150903T100834
14      B  F  A SBT_TAPE    03-SEP-15       1       1       NO         TAG20150903T100850


And obviously restoring works exactly the same way.

IMPORTANT: Note here that none of the backups are encrypted or compressed. Not having to encrypt backups is the first major functional difference from the Oracle Database Backup Service (OSDB) where encryption in mandatory.

Either encryption and/or compression can be used. Normally, backup encryption requires the Oracle Advanced Security Option however one exemption to that is when backing up using Oracle Secure Backup. Similarly, “basic” RMAN backup compression is included with the database without any additional licenses. However the additional compression options (such as “HIGH”, “MEDIUM”, or “LOW”) usually do require the Oracle Advanced Compression option but are included as “Special-use licensing” of the Oracle Secure Backup product.

Compressing prior to backup is generally highly recommended. While this will consume local CPU cycles, it will minimize transfer time through the internet and S3 space used.


Advanced Configuration

Generally, the base parameters should be sufficient. However, if further customization is required there are a number of optional parameters that can be added to the configuration/initialization file (in this example “$ORACLE_HOME/dbs/osbwsCDB121.ora”). An easy way to look for the available parameters including hidden parameters is to search through the OSB Cloud Module library file. For example:

$ strings $ORACLE_HOME/lib/ |grep OSB_


Secure Transfer

Similar to how backups using the OSB cloud module do not require encryption, they also transfer the data without SSL security by default.

This is apparent from a simple Linux netstat command while the backup is running:

$ netstat | grep http
tcp        0 299300 ORALINUX.localdom:46151 s3-1-w.amazonaws.c:http ESTABLISHED


However we can easily remedy this by changing the URL in the configuration/initialization file to use the “https” address:

$ head -1 $ORACLE_HOME/dbs/osbwsCDB121.ora


After adjusting, starting a new RMAN session (critical) and re-running the same backup command now shows a secure SSL (or HTTPS) connection:

$ netstat | grep http
tcp        0 299300 ORALINUX.localdom:34048 s3-1-w.amazonaws.:https ESTABLISHED
tcp        0      0 ORALINUX.localdom:25700 s3-1-w.amazonaws.:https TIME_WAIT


One would think that the undocumented parameter “_OSB_WS_NO_SSL” (which is a boolean accepting only TRUE or FALSE as possible values) might also effect which connection is used though experimentation showed no effect.


Viewing Usage

Viewing the files stored and data usage on the AWS S3 at first is not as intuitive as one might think. From the AWS dashboard selecting S3 shows the “buckets” (logical storage containers) and allows you to drill down into each to see the actual files stored:



However, properties of the bucket such as the number of files and total size is not displayed. To find this information you must navigate to the “Billing and Cost Management” account option:



From here we can see the usage and the associated costs and other relevant details of the account:




Just like with the Oracle Database Backup Service, getting going with the Oracle Secure Backup Cloud Module and Amazon Web Services is very simple. It really can be implemented quickly and easily allowing for RMAN backups to be written directly to off-site storage almost immediately.

However unlike OSDB, the AWS OSB module is not secure by default. With Oracle’s service, the data must be encrypted and the transfer is secured by default and hence they can live up to their claim that your data is secure in flight and at rest. However with the AWS OSB module, by default the reverse is true. Though of course, both HTTPS data transfer and backup encryption can be enabled if desired.

On a positive note, the ability to generate AWS ID and Key values (or “credentials”) specific for each backup configuration can be considered a security advantage.

Apart from the security differences, functionally the two services are almost identical as would be expected. After all, the OSDB module appears to simply be a clone of the OSBWS module introduced about half a decade ago.

The one big caveat of the AWS OSB module is the licensing cost. The Oracle Secure Backup Cloud module is licensed per RMAN channel. Channel based licenses can be shared among multiple databases however.

And overall the S3 storage service is more functional and mature than Oracle’s newer public cloud service. AWS includes some nice features with respect to geographic data placement and replication.

The bottom line is that the Oracle Secure Backup Cloud Module is a great technical solution for implementing off-site Oracle RMAN backups (likely to compliment your on-premise backups) and specifically putting them onto the AWS S3. Functionally it seems great. However, the additional licensing costs is likely the road block for most small to mid-sized organizations.


Additional References


Discover more about our expertise in Oracle and Cloud.

Categories: DBA Blogs