DBA Blogs

Welcome to Blackbird.io Employees and Clients

Pythian Group - Thu, 2014-06-26 11:29

Today, we announced that Pythian has entered into an agreement to acquire Blackbird.io, itself the result of a recent merger between PalominoDB and DriveDev.

I want to start with a hearty welcome to the 40+ new esteemed collaborators joining our firm today. Simultaneously, I want to welcome Blackbird.io’s valued clients to the Pythian family.

I am looking forward to cultivating a long-lasting collaboration and friendship with each one of you, many of whom I have already counted as friends for years.

To that point, I want to highlight my longstanding friendship and collaboration with Laine Campbell, the CEO of Blackbird.io. I first met Laine in 2007 and was impressed by her intelligence, her energy, her charisma and, most of all, her remarkable passion for doing the right thing by her team, her clients, and her community.

In February 2008, I sent Laine an email with the subject “Join us?”, the most important line of which was “I’m looking for a founder for a new office in the Bay Area.”

Laine was gracious in her reply: “At this point, I’m absolutely packed with long-term clients.  I’m quite comfortable with revenue and challenge and location.  I really am flattered you’d consider me for the position, but I’m going to have to pass.” That was only about a year after she had founded PalominoDB.

Laine and I have been friends ever since and have made a discipline of trading notes and advice about our respective businesses.

As we fast-forward six years to the present, Laine and her team have achieved what many might have thought impossible. Out of thin air, with no venture capital and in only eight short years, Blackbird.io is a business eerily reminiscent of Pythian in 2008… a feat that took us 11 years.

Earlier this year, PalominoDB joined forces with DriveDev, itself a highly successful DevOps business transformation company founded in 2007 to create Blackbird.io. Blackbird.io delivers a coherent and differentiated vision that helps transform businesses through breakthrough velocity, availability, security, performance, and cost.

In what has to be one of the longest corporate romances our niche has known, Laine reached out to me in May indicating that she’d like to accept my original offer and join forces with us. It was my turn to be flattered and go through a week’s soul searching.  I was not alone in the exercise. A lot of soul searching, strategic thinking, and sheer hard work has gone into this announcement today. By the end of our efforts, it became clear that joining forces would dramatically accelerate our ability to reshape the enterprise IT services landscape.

I would like to specifically thank Laine Campbell, Aaron Lee, and Vicki Vance as owners of Blackbird.io for their courage, vision, and determination through these demanding weeks. On the Pythian side, I would like to especially thank  Andrew Waitman, without whom this deal would be impossible to contemplate, Alain Tardif and Siobhan Devlin, and the rest of the executive team at Pythian who’ve moved mountains on our behalf to make it real. I don’t want to forget to highlight as well the external support of Bob Ford at Kelly Santini and our financing partners.

We have months of hard work ahead of us integrating our businesses. It’s our goal and imperative to listen and learn from each other, and pick and choose the best parts of each respective business as we weave a coherent and integrated whole. This will be the first meaningful merger Pythian undertakes.

Together we are almost 350 strong and are home to the industry’s largest open-source database managed services capability. Together we will accelerate the adoption of Enterprise DevOps and help countless SaaS, retail, media, and online businesses leave their competitors in the dust. And that is a vision worth getting excited about.

Categories: DBA Blogs

Oracle Database: Query to List All Statistics Tables

Pythian Group - Wed, 2014-06-25 08:00

If you have performed manual database upgrades, you have probably come across this step many times in MOS notes, upgrade guides, and similar documents.

Upgrade Statistics Tables Created by the DBMS_STATS Package
If you created statistics tables using the DBMS_STATS.CREATE_STAT_TABLE procedure, then upgrade these tables by executing the following procedure:
EXECUTE DBMS_STATS.UPGRADE_STAT_TABLE('SYS','dictstattab');

In my experience, statistics tables can be created from Oracle RDBMS version 8i onward, so this step has been part of the database upgrade documentation ever since. I also noticed that the structure of the statistics table stayed the same through 10gR2, but Oracle modified it marginally in the 11g and 12c versions.

I have been using this single query to list all statistics tables that exist in a database, and it can still be used despite the changes to the table structure.

SQL> select owner,table_name from dba_tab_columns where COLUMN_NAME='STATID' AND DATA_TYPE='VARCHAR2';

Though this is not a critical step, it is required as part of the post-upgrade tasks. Here is a small action plan to generate and run the commands needed to upgrade all statistics tables.

Connect as SYS database user and run these steps:
SQL> set pages 1000
SQL> set head off
SQL> set feedback off
SQL> spool /home/oracle/stattab_upg.sql
SQL> select 'EXEC DBMS_STATS.UPGRADE_STAT_TABLE('''||owner||''','''||table_name||''');' from dba_tab_columns where COLUMN_NAME='STATID' AND DATA_TYPE='VARCHAR2';
SQL> spool off
SQL> @/home/oracle/stattab_upg.sql
SQL> exit

Categories: DBA Blogs

Virtual CPUs with Amazon Web Services

Pythian Group - Tue, 2014-06-24 15:41

Some months ago, Amazon Web Services changed the way they measure CPU capacity on their EC2 compute platform. In addition to the old ECUs, there is a new unit to measure compute capacity: vCPUs. The instance type page defines a vCPU as “a hyperthreaded core for M3, C3, R3, HS1, G2, and I2.” The description seems a bit confusing: is it a dedicated CPU core (which has two hyperthreads in the E5-2670 v2 CPU platform being used), or is it a half-core, single hyperthread?

I decided to test this out for myself by setting up one of the new-generation m3.xlarge instances (with thanks to Christo for technical assistance). It is stated to have 4 vCPUs running on an E5-2670 v2 processor at 2.5GHz on the Ivy Bridge-EP microarchitecture (or sometimes 2.6GHz in the case of xlarge instances).

Investigating for ourselves

I’m going to use paravirtualized Amazon Linux 64-bit for simplicity:

$ ec2-describe-images ami-fb8e9292 -H
Type    ImageID Name    Owner   State   Accessibility   ProductCodes    Architecture    ImageType       KernelId        RamdiskId Platform        RootDeviceType  VirtualizationType      Hypervisor
IMAGE   ami-fb8e9292    amazon/amzn-ami-pv-2014.03.1.x86_64-ebs amazon  available       public          x86_64  machine aki-919dcaf8                      ebs     paravirtual     xen
BLOCKDEVICEMAPPING      /dev/sda1               snap-b047276d   8

Launching the instance:

$ ec2-run-instances ami-fb8e9292 -k marc-aws --instance-type m3.xlarge --availability-zone us-east-1d
RESERVATION     r-cde66bb3      462281317311    default
INSTANCE        i-b5f5a2e6      ami-fb8e9292                    pending marc-aws        0               m3.xlarge       2014-06-16T20:23:48+0000  us-east-1d      aki-919dcaf8                    monitoring-disabled                              ebs                                      paravirtual     xen             sg-5fc61437     default

The instance is up and running within a few minutes:

$ ec2-describe-instances i-b5f5a2e6 -H
Type    ReservationID   Owner   Groups  Platform
RESERVATION     r-cde66bb3      462281317311    default
INSTANCE        i-b5f5a2e6      ami-fb8e9292    ec2-54-242-182-88.compute-1.amazonaws.com       ip-10-145-209-67.ec2.internal     running marc-aws        0               m3.xlarge       2014-06-16T20:23:48+0000        us-east-1d      aki-919dcaf8                      monitoring-disabled     54.242.182.88   10.145.209.67                   ebs                      paravirtual      xen             sg-5fc61437     default
BLOCKDEVICE     /dev/sda1       vol-1633ed53    2014-06-16T20:23:52.000Z        true

Logging in as ec2-user. First of all, let’s see what /proc/cpuinfo says:

[ec2-user@ip-10-7-160-199 ~]$ egrep '(processor|model name|cpu MHz|physical id|siblings|core id|cpu cores)' /proc/cpuinfo
processor       : 0
model name      : Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
cpu MHz         : 2599.998
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 1
processor       : 1
model name      : Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
cpu MHz         : 2599.998
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 1
processor       : 2
model name      : Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
cpu MHz         : 2599.998
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 1
processor       : 3
model name      : Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
cpu MHz         : 2599.998
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 1

Looks like I got some of the slightly faster 2.6GHz CPUs. /proc/cpuinfo shows four processors, each with physical id 0 and core id 0. Or in other words, one single-core processor with 4 threads. We know that the E5-2670 v2 is actually a 10-core processor, so the information we see at the OS level does not quite correspond to the physical hardware.

Nevertheless, we’ll proceed with a few simple tests. I’m going to run “gzip”, an integer-compute-intensive compression test, on 2.2GB of zeroes from /dev/zero. By using synthetic input and discarding the output, we can avoid the effects of disk I/O. I’m going to combine this test with taskset commands to impose processor affinity on the process.

A simple test

The simplest case: a single thread, on processor 0:

[ec2-user@ip-10-7-160-199 ~]$ taskset -pc 0 $$
pid 1531's current affinity list: 0-3
pid 1531's new affinity list: 0
[ec2-user@ip-10-7-160-199 ~]$ dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null
2170552320 bytes (2.2 GB) copied, 17.8837 s, 121 MB/s

With the single processor, we can process 121 MB/sec. Let’s try running two gzips at once. Sharing a single processor, we should see half the throughput.

[ec2-user@ip-10-7-160-199 ~]$ for i in {1..2}; do dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null & done
2170552320 bytes (2.2 GB) copied, 35.8279 s, 60.6 MB/s
2170552320 bytes (2.2 GB) copied, 35.8666 s, 60.5 MB/s

Sharing those cores

Now, let’s make things more interesting: two threads, on adjacent processors. If they are truly dedicated CPU cores, we should get a full 121 MB/s each. If our processors are in fact hyperthreads, we’ll see throughput drop.

[ec2-user@ip-10-7-160-199 ~]$ taskset -pc 0,1 $$
pid 1531's current affinity list: 0
pid 1531's new affinity list: 0,1
[ec2-user@ip-10-7-160-199 ~]$ for i in {1..2}; do dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null & done
2170552320 bytes (2.2 GB) copied, 27.1704 s, 79.9 MB/s
2170552320 bytes (2.2 GB) copied, 27.1687 s, 79.9 MB/s

We have our answer: throughput has dropped by a third, to 79.9 MB/sec, showing that processors 0 and 1 are threads sharing a single core. (But note that hyperthreading is still giving a performance benefit here: 79.9 MB/s on a shared core is higher than the 60.5 MB/s we saw when sharing a single hyperthread.)

Trying the exact same test, but this time, non-adjacent processors 0 and 2:

[ec2-user@ip-10-7-160-199 ~]$ taskset -pc 0,2 $$
pid 1531's current affinity list: 0,1
pid 1531's new affinity list: 0,2
[ec2-user@ip-10-7-160-199 ~]$ for i in {1..2}; do dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null & done
2170552320 bytes (2.2 GB) copied, 17.8967 s, 121 MB/s
2170552320 bytes (2.2 GB) copied, 17.8982 s, 121 MB/s

All the way up to full-speed, showing dedicated cores.

What does this all mean? Let’s go back to Amazon’s vCPU definition:

Each vCPU is a hyperthreaded core

As our tests have shown, a vCPU is most definitely not a core. It’s half of a shared core, or one hyperthread.

A side effect: inconsistent performance

There’s another issue at play here too: the shared-core behavior is hidden from the operating system. Going back to /proc/cpuinfo:

[ec2-user@ip-10-7-160-199 ~]$ grep 'core id' /proc/cpuinfo
core id         : 0
core id         : 0
core id         : 0
core id         : 0

This means that the OS scheduler has no way of knowing which processors share cores, and cannot schedule tasks around it. Let’s go back to our two-thread test, but instead of restricting it to two specific processors, we’ll let it run on any of them.

[ec2-user@ip-10-7-160-199 ~]$ taskset -pc 0-3 $$
pid 1531's current affinity list: 0,2
pid 1531's new affinity list: 0-3
[ec2-user@ip-10-7-160-199 ~]$ for i in {1..2}; do dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null & done
2170552320 bytes (2.2 GB) copied, 18.041 s, 120 MB/s
2170552320 bytes (2.2 GB) copied, 18.0451 s, 120 MB/s
[ec2-user@ip-10-7-160-199 ~]$ for i in {1..2}; do dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null & done
2170552320 bytes (2.2 GB) copied, 21.2189 s, 102 MB/s
2170552320 bytes (2.2 GB) copied, 21.2215 s, 102 MB/s
[ec2-user@ip-10-7-160-199 ~]$ for i in {1..2}; do dd if=/dev/zero bs=1M count=2070 2> >(grep bytes >&2 ) | gzip -c > /dev/null & done
2170552320 bytes (2.2 GB) copied, 26.2199 s, 82.8 MB/s
2170552320 bytes (2.2 GB) copied, 26.22 s, 82.8 MB/s

We see throughput varying between 82.8 MB/sec and 120 MB/sec for the exact same workload. To get some more performance information, we’ll configure top to take ten 3-second samples with per-processor usage information:

[ec2-user@ip-10-7-160-199 ~]$ cat > ~/.toprc <<-EOF
RCfile for "top with windows"           # shameless braggin'
Id:a, Mode_altscr=0, Mode_irixps=1, Delay_time=3.000, Curwin=0
Def     fieldscur=AEHIOQTWKNMbcdfgjplrsuvyzX
        winflags=25913, sortindx=10, maxtasks=2
        summclr=1, msgsclr=1, headclr=3, taskclr=1
Job     fieldscur=ABcefgjlrstuvyzMKNHIWOPQDX
        winflags=62777, sortindx=0, maxtasks=0
        summclr=6, msgsclr=6, headclr=7, taskclr=6
Mem     fieldscur=ANOPQRSTUVbcdefgjlmyzWHIKX
        winflags=62777, sortindx=13, maxtasks=0
        summclr=5, msgsclr=5, headclr=4, taskclr=5
Usr     fieldscur=ABDECGfhijlopqrstuvyzMKNWX
        winflags=62777, sortindx=4, maxtasks=0
        summclr=3, msgsclr=3, headclr=2, taskclr=3
EOF
[ec2-user@ip-10-7-160-199 ~]$ top -b -n10 -U ec2-user
top - 21:07:50 up 43 min,  2 users,  load average: 0.55, 0.45, 0.36
Tasks:  86 total,   4 running,  82 sleeping,   0 stopped,   0 zombie
Cpu0  : 96.7%us,  3.3%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  1.4%sy,  0.0%ni, 97.9%id,  0.0%wa,  0.3%hi,  0.0%si,  0.3%st
Cpu2  : 96.0%us,  4.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  1.0%sy,  0.0%ni, 97.9%id,  0.0%wa,  0.7%hi,  0.0%si,  0.3%st

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1766 ec2-user  20   0  4444  608  400 R 99.7  0.0   0:06.08 gzip
 1768 ec2-user  20   0  4444  608  400 R 99.7  0.0   0:06.08 gzip

Here two non-adjacent CPUs are in use. But 3 seconds later, the processes are running on adjacent CPUs:

top - 21:07:53 up 43 min,  2 users,  load average: 0.55, 0.45, 0.36
Tasks:  86 total,   4 running,  82 sleeping,   0 stopped,   0 zombie
Cpu0  : 96.3%us,  3.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  : 96.0%us,  3.6%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.3%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.3%hi,  0.0%si,  0.3%st
Cpu3  :  0.3%us,  0.0%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.3%st

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1766 ec2-user  20   0  4444  608  400 R 99.7  0.0   0:09.08 gzip
 1768 ec2-user  20   0  4444  608  400 R 99.7  0.0   0:09.08 gzip

Although usage percentages are similar, we’ve seen earlier that throughput drops by a third when cores are shared, and we see varied throughput as the processes are context-switched between processors.

This type of situation arises where compute-intensive workloads are running, and when there are fewer processes than total CPU threads. If AWS reported correct core IDs to the system, this problem wouldn’t arise: the OS scheduler would make sure processes did not share cores unless necessary.
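To illustrate what the scheduler is missing, here is a small sketch of how logical processors would be grouped into shared cores if real physical id / core id values were exposed. The sample /proc/cpuinfo fragment below is made-up data showing two hyperthreads per core; the parsing logic mirrors how topology tools group threads.

```python
# Sketch: group logical processors by (physical id, core id) from
# /proc/cpuinfo-style text. With real IDs, shared cores are obvious;
# with the flat topology AWS reports, every group would collapse to one.
from collections import defaultdict

def shared_core_groups(cpuinfo_text):
    """Return {(physical_id, core_id): [logical processor numbers]}."""
    groups = defaultdict(list)
    proc = phys = None
    for line in cpuinfo_text.splitlines():
        if ":" not in line:
            continue
        key, _, val = line.partition(":")
        key, val = key.strip(), val.strip()
        if key == "processor":
            proc = int(val)
        elif key == "physical id":
            phys = int(val)
        elif key == "core id":
            # "core id" appears after "physical id" in each block
            groups[(phys, int(val))].append(proc)
    return dict(groups)

# Hypothetical sample: 4 logical processors, 2 per physical core.
sample = """\
processor : 0
physical id : 0
core id : 0
processor : 1
physical id : 0
core id : 0
processor : 2
physical id : 0
core id : 1
processor : 3
physical id : 0
core id : 1
"""
print(shared_core_groups(sample))
# {(0, 0): [0, 1], (0, 1): [2, 3]}
```

With this information a scheduler (or an admin using taskset) can keep two busy processes on separate cores; without it, placement is effectively random, which is exactly the throughput variability seen above.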

Here’s a chart summarizing the results:

[Chart: aws-cpu test results]

Summing up

Over the course of the testing I’ve learned two things:

  • A vCPU in an AWS environment actually represents only half a physical core. So if you’re looking for equivalent compute capacity to, say, an 8-core server, you would need a so-called 4xlarge EC2 instance with 16 vCPUs. So take it into account in your costing models!
  • The mislabeling of the CPU threads as separate single-core processors can result in performance variability as processes are switched between threads. This is something the AWS and/or Xen teams should be able to fix in the kernel.
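The sizing rule in the first bullet can be written as a trivial helper. This is a sketch based on the tests in this post, which suggest 2 hyperthreads per physical core; it is not an official AWS figure, and the instance-size examples in the comment are assumptions from the M3/C3 generation.

```python
# Sketch: convert a desired number of dedicated physical cores into the
# vCPU count to provision, assuming (per the tests above) that one vCPU
# is one hyperthread and each core carries two hyperthreads.
THREADS_PER_CORE = 2

def vcpus_needed(physical_cores):
    """vCPUs required to match a given number of dedicated cores."""
    return physical_cores * THREADS_PER_CORE

# An 8-core server's worth of compute needs 16 vCPUs,
# e.g. a 4xlarge instance in the M3/C3 generation.
print(vcpus_needed(8))  # 16
```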

Readers: what has been your experience with CPU performance in AWS? If any of you has access to a physical machine running E5-2670 processors, it would be interesting to see how the simple gzip test runs.

Categories: DBA Blogs

The Art of War for Small Business

Surachart Opun - Mon, 2014-06-23 10:51
The Art of War is an ancient Chinese military treatise attributed to Sun Tzu, a high-ranking military general, strategist, and tactician. Many books have been written adapting Sun Tzu's ancient Art of War for military, political, and business purposes.

The Art of War for Small Business: Defeat the Competition and Dominate the Market with the Masterful Strategies of Sun Tzu applies the Art of War to small business. It is a fitting book for small business owners and entrepreneurs entrenched in fierce competition for customers, market share, talent, and more. The book is organized into four parts across 224 pages: SEIZE THE ADVANTAGE WITH SUN TZU; UNDERSTANDING: ESSENTIAL SUN TZU; PRINCIPLES FOR THE BATTLEFIELD; and ADVANCED SUN TZU: STRATEGY FOR YOUR SMALL BUSINESS.
It is not many pages to read. The book begins by explaining why the Art of War should be used in small business, then gives lots of examples and ideas for applying it every day: choosing the right ground for your battles, preparing without falling prey to paralysis, leveraging strengths while overcoming limitations, striking competitors' weakest points and seizing every opportunity, focusing priorities and resources on conquering key challenges, going where the enemy is not, and building and leveraging strategic alliances.

After reading, readers should have a picture of the common advantages and disadvantages of small businesses and understand why a small business needs Sun Tzu. In addition, readers will learn the basics of the Art of War and ideas for applying it to small business, illustrated with real-world small business examples.




Written By: Surachart Opun http://surachartopun.com
Categories: DBA Blogs

Working in Pythian’s Advanced Technology Consulting Group

Pythian Group - Mon, 2014-06-23 08:22

Before I joined Pythian, I had the fortune of holding a lot of good jobs across various industries. My favorite jobs were the ones that were fast paced and required me to ramp up my skills on the fly while learning new technology. My least favorite jobs were the ones where my skills went unused and the challenges were few and far between. When I joined Pythian, I hadn’t yet realized I had found my first great job.

In April 2012, I joined Pythian’s Professional Consulting Group (PCG). The members of PCG represented some of the world’s leading data experts, but the name did not adequately represent the skills of the members. Members of PCG were experts in many complementary technologies and many, if not all, were quickly becoming experts in emerging technologies such as Big Data. Because of this, the Professional Consulting Group became the Advanced Technology Consulting Group (ATCG).

As a member of ATCG, my main responsibility is to deliver consulting services to our customers either on site or remotely. Examples of some of the work we might do include: troubleshooting performance problems, migrating databases into Exadata, setting up replication with Oracle GoldenGate, and data integration with numerous sources using Oracle Data Integrator. While all of the items I mentioned deal with Oracle technologies, ATCG also has members who specialize in Microsoft SQL Server and MySQL.

The services we provide to our customers do not stop at traditional database services, ATCG also delivers Big Data services using Hadoop. Examples of some of the Hadoop work I have been involved with include: installing and configuring Cloudera Hadoop, securing Hadoop with Kerberos, and troubleshooting performance. As you can see, ATCG has the opportunity to gain valuable experience across a broad range of technologies.

All of our projects begin with a call with the potential customer. Members of ATCG serve as a technical resource on the call. It is our responsibility to understand the customer’s issue and estimate the effort required to perform the work. Sometimes this can be challenging because the customer might not have a technical resource on their side who can articulately convey the issue. Even when there is a technical resource on the customer’s side, we have to be mindful not to alienate others on the call, so it is vitally important that we convey our message in a way everybody on the call can understand.

You might be thinking “I am not a salesperson!” or “I have never used some of these technologies.” You would not be alone. ATCG members are not salespeople; we simply assist Sales by providing our technical knowledge on a call. Imagine you are speaking with your boss or a customer about a problem or issue: it really is no different. Dealing with new technology at Pythian is little different from your current job. If you don’t understand something at your current job, you can talk to a few coworkers or research it on the net. At Pythian, we can reach out to 200+ coworkers and find quite a few who have experience with the technology in question. We can search our internal technical documents, which are quite vast, as they detail all of the work we have done, and as a last resort we can search the net. At Pythian, you are never alone and you are never without resources.

There are times when we might not have a project to work on, a.k.a. downtime. During our downtime, we can build our knowledge of technologies that interest us or that we need a refresher on. We can practice our newfound knowledge by assisting other teams. We can help build the Pythian knowledge base by posting blogs and contributing to our internal documentation.

The work in ATCG is very challenging and you are always learning something new, whether it is a new technology or a new way of thinking about a particular topic. Being bored or pigeonholed is not a problem in ATCG; we are involved in some of the toughest problems and work with the latest technologies. And when we are not, we are in control of our workday, so we can pursue interests in new and emerging database technologies.

Categories: DBA Blogs

Ambari Blueprints and One-Touch Hadoop Clusters

Pythian Group - Fri, 2014-06-20 11:11

For those who aren’t familiar, Apache Ambari is the best open source solution for managing your Hadoop cluster: it’s capable of adding nodes, assigning roles, managing configuration and monitoring cluster health. Ambari is HortonWorks’ version of Cloudera Manager and MapR’s Warden, and it has been steadily improving with every release. As of version 1.5.1, Ambari added support for a declarative configuration (called a Blueprint) which makes it easy to automatically create clusters with many ecosystem components in the cloud. I’ll give an example of how to use Ambari Blueprints, and compare them with existing one-touch deployment methods for other distributions.

Why would I want that?

I’ve been working on improving the methodology used by the Berkeley Big Data Benchmark. Right now spinning up the clusters is a relatively manual process, where the user has to step through the web interfaces of Cloudera Manager and Ambari, copy-paste certificates and IPs, and assign roles to nodes. The benchmark runs on EC2 instances, so I’ve been focused on automatic ways to create clusters on Amazon:

  • Apache Whirr can create a Hadoop cluster (or a number of other Big Data technologies), including CDH5, MapR and HDP. Documentation is sparse, and there doesn’t appear to be support for installing ecosystem projects like Hive automatically.
  • Amazon EMR supports installing Hive and Impala natively, and other projects like Shark via bootstrap actions. These tend to be older versions which aren’t suitable for my purposes.
  • MapR’s distribution is also available on EMR, but I haven’t used that since the different filesystem (MapRFS vs. HDFS) would impact results.

Hive-on-Tez is only supported on HDP at the moment, so it’s crucial that I have a one-touch command to create not only CDH5 clusters but also HDP clusters. Ambari Blueprints provide a crucial piece of the solution.

The Blueprint

Blueprints themselves are just JSON documents you send to the Ambari REST API. Every Ambari Blueprint has two main parts: a list of “host groups”, and configuration.

Host Groups

Host groups are a set of machines with the same agents (“components” in Ambari terms) installed: a typical cluster might have host groups for the NameNode, SecondaryNameNode, ResourceManager, DataNodes, and client nodes for submitting jobs. The small clusters I’m creating have a “master” host group with the NameNode, ResourceManager, and HiveServer components on a single server, and then a collection of “slaves” running the NodeManager and DataNode components. Besides a list of software components to install, every host group has a cardinality. I originally thought this was a pain, with the cardinality being exact (a blueprint with 5 slave nodes requiring exactly 5 slaves), but thanks to John from HortonWorks for a correction: cardinality is an optional hint which isn’t validated by Ambari. This wasn’t clear from the docs.

To provide a concrete example, the sample host groups I’m using look like this:

"host_groups" : [
 {
 "name" : "master",
 "components" : [
 {
 "name" : "NAMENODE"
 },
 {
 "name" : "SECONDARY_NAMENODE"
 },
 {
 "name" : "RESOURCEMANAGER"
 },
 {
 "name" : "HISTORYSERVER"
 },
 {
 "name" : "ZOOKEEPER_SERVER"
 },
 {
 "name" : "HIVE_METASTORE"
 },
 {
 "name" : "HIVE_SERVER"
 },
 {
 "name" : "MYSQL_SERVER"
 }
 ],
 "cardinality" : "1"
 },
{
 "name" : "slaves",
 "components" : [
 {
 "name" : "DATANODE"
 },
 {
 "name" : "HDFS_CLIENT"
 },
 {
 "name" : "NODEMANAGER"
 },
 {
 "name" : "YARN_CLIENT"
 },
 {
 "name" : "MAPREDUCE2_CLIENT"
 },
 {
 "name" : "ZOOKEEPER_CLIENT"
 },
 {
 "name" : "TEZ_CLIENT"
 },
 {
 "name" : "HIVE_CLIENT"
 }
 ],
 "cardinality" : "5"
 }

This host_groups section describes a single node with all of the “master” components installed, and five slaves with just the DataNode, NodeManager, and client components installed. Note that some components have dependencies: it’s possible to build an invalid blueprint which contains a HIVE_METASTORE but no MYSQL_SERVER. The REST API provides appropriate error messages when such a blueprint is submitted.

Configuration

Configuration allows you to override the defaults for any services you’re installing, and it comes in two varieties: global and service-specific. Global parameters are required by certain services: to my knowledge, Nagios and Hive require global parameters to be specified. These parameters apply to multiple roles within the cluster, and the API will tell you if any are missing. Most cluster configuration (your typical core-site.xml, hive-site.xml, etc. parameters) can be overridden in the blueprint by specifying a configuration with the leading part of the file name and then providing a map of the keys to overwrite. The configuration below provides a global variable that Hive requires, and it also overrides some of the default parameters in hive-site.xml. These changes will be propagated to the cluster as if you had changed them in the Ambari UI.

"configurations": [
  {
    "global": {
      "hive_metastore_user_passwd": "p"
    }
  },
  {
    "hive-site": {
      "javax.jdo.option.ConnectionPassword": "p",
      "hive.security.authenticator.manager": "org.apache.hadoop.hive.ql.security.HadoopDefaultAuthenticator",
      "hive.execution.engine": "tez",
      "hive.exec.failure.hooks": "",
      "hive.exec.pre.hooks": "",
      "hive.exec.post.hooks": ""
    }
  }
]

This config will override some parameters in hive-site.xml, as well as setting the metastore password to ‘p’. Note that you can specify more configuration files to override (core-site.xml, hdfs-site.xml, etc.), but each file must be its own object in the configurations array, similar to how global and hive-site are handled above.

Once you’ve specified the host groups and any configuration overrides, the Blueprint also needs a stack – the versions of software to install. Right now Ambari only supports HDP – see this table for the stack versions supported in each Ambari release. As a weird constraint, the blueprint name is inside the blueprint itself, along with the stack information. This name must be the same as the name you provide to the REST endpoint, for some reason. To upload a new blueprint to an Ambari server you can use:

$ curl -X POST -H 'X-Requested-By: Pythian' <ambari-host>/api/v1/blueprints/<blueprint name> -d @<blueprint file>

The X-Requested-By header is required, and as noted the blueprint name must match the file.

You can see the entire blueprint file from this example here; feel free to use it as a baseline for your cluster.
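Before POSTing a blueprint, it can be worth running a quick local sanity check against the two constraints described above: the name embedded in the blueprint must match the name in the REST URL, and a HIVE_METASTORE needs a MYSQL_SERVER. The sketch below is illustrative only, not part of any Ambari tooling, and it assumes the blueprint name lives under a "Blueprints" key as in the sample blueprint.

```python
# Sketch: local sanity checks for an Ambari blueprint dict before
# submitting it, returning a list of human-readable problems.
def check_blueprint(blueprint, url_name):
    errors = []
    # The name inside the blueprint must equal the REST endpoint name.
    name = blueprint.get("Blueprints", {}).get("blueprint_name")
    if name != url_name:
        errors.append("blueprint name %r != URL name %r" % (name, url_name))
    # Collect every component across all host groups.
    components = {
        comp["name"]
        for group in blueprint.get("host_groups", [])
        for comp in group.get("components", [])
    }
    # Dependency noted earlier: HIVE_METASTORE requires MYSQL_SERVER.
    if "HIVE_METASTORE" in components and "MYSQL_SERVER" not in components:
        errors.append("HIVE_METASTORE requires a MYSQL_SERVER component")
    return errors

# Hypothetical minimal blueprint missing the MySQL dependency:
bp = {
    "Blueprints": {"blueprint_name": "hadoop-benchmark"},
    "host_groups": [
        {"name": "master", "components": [{"name": "HIVE_METASTORE"}]}
    ],
}
print(check_blueprint(bp, "hadoop-benchmark"))
# ['HIVE_METASTORE requires a MYSQL_SERVER component']
```

Catching these locally is faster than round-tripping through the REST API, though the API remains the authoritative validator.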

Creating the Cluster

Once you’ve written a blueprint with the services and configuration you want, you need to:

  • Create EC2 instances with the correct security groups
  • Install ambari-master on one, and ambari-agent on the others
  • Configure the agents to report to the master
  • Write a file mapping hosts to host groups
  • Push both files (the blueprint and the mapping) to the REST API

Fortunately, we have a Python script that can do that for you! This script will create a benchmarking cluster with a specific number of data nodes, an Ambari master, and a separate Hadoop master. It can easily be modified to create multiple classes of machines, if you want more host groups than “master” and “slave”. The core of the script (the EC2 interaction and Ambari RPM installation) is based on work by Ahir Reddy from Databricks, with the Ambari Blueprints support added by yours truly.

If you’re curious about the host mapping file: it has the blueprint name, and an array of host names for every host_group. Corresponding to the example above, the cluster definition would be:

{
  "blueprint" : "hadoop-benchmark",
  "host_groups" : [
    {
      "name" : "master",
      "hosts" : [ { "fqdn" : "host-1" } ]
    },
    {
      "name" : "slaves",
      "hosts" : [
        { "fqdn" : "host-2" },
        { "fqdn" : "host-3" }
      ]
    }
  ]
}

You could replace “host-n” with the real domain names for your Amazon instances (use the internal ones!), and create a new cluster over those machines using:

$ curl -X POST -H 'X-Requested-By: Pythian' <ambari-host>/api/v1/clusters/<cluster name> -d @<mapping file>
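Rather than hand-editing the mapping JSON for each cluster size, a short sketch can generate it. The hostnames below are placeholders standing in for your instances' internal EC2 DNS names; the structure matches the mapping file shown above.

```python
# Sketch: build the cluster-creation mapping document for one master
# and N slaves, matching the host-mapping format described above.
import json

def cluster_mapping(blueprint_name, master_fqdn, slave_fqdns):
    return {
        "blueprint": blueprint_name,
        "host_groups": [
            {"name": "master", "hosts": [{"fqdn": master_fqdn}]},
            {"name": "slaves",
             "hosts": [{"fqdn": f} for f in slave_fqdns]},
        ],
    }

# Placeholder hostnames; substitute internal EC2 DNS names.
mapping = cluster_mapping("hadoop-benchmark", "host-1", ["host-2", "host-3"])
print(json.dumps(mapping, indent=2))
```

The printed document can be saved and passed to the curl command above as the mapping file.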

Conclusion

Ambari Blueprints have some rough edges right now, but they provide a convenient way to deploy all of the services supported by the HDP stack. Watch this space for more posts about my effort to create a repeatable, one-touch, cross-distribution Hadoop SQL benchmark on EC2.

Categories: DBA Blogs

Log Buffer #376, A Carnival of the Vanities for DBAs

Pythian Group - Fri, 2014-06-20 08:52

This Log Buffer Edition collects and presents various blog posts from the Oracle, SQL Server and MySQL arenas.

Oracle:

Oracle SOA Suite Achieves Significant Customer Adoption and Industry Recognition

The Perks of Integrated Business Planning

Why are we so excited about Oracle Database XE R2?

Skipping ACS ramp-up using a SQL Patch

Moving datafiles has always been a pain.

SQL Server:

Using DBCC DROPCLEANBUFFERS When Testing Performance

Data Mining Introduction Part 9: Microsoft Linear Regression

PowerShell One-Liners: Accessing, Handling and Writing Data

Stairway to SQL Server Security: Level 2, Authentication

Matthew Flatt was trying to measure the performance of a tool interacting with Microsoft Azure

MySQL:

Using UDFs for geo-distance search in MySQL

Amidst various blog postings on Docker, a security issue announced yesterday that detailed an exploit of Docker that makes it possible to do container breakout.

Discover the MySQL Central @ OpenWorld Content Catalog

Shinguz: Replication channel fail-over with Galera Cluster for MySQL

SQL queries preview before sent to server in MySQL for Excel

Categories: DBA Blogs

Move That Datafile!

alt.oracle - Thu, 2014-06-19 14:56
Moving datafiles has always been a pain.  There are several steps, it’s fairly easy to make a mistake, and it requires the datafile to be offline.  There are also different steps depending on whether the database is in ARCHIVELOG mode or not.  In ARCHIVELOG mode, the steps are…
1) Take the tablespace containing the datafile offline
2) Copy/rename the datafile at the OS layer
3) Use ALTER TABLESPACE…RENAME DATAFILE to rename the datafile so that the controlfile will be aware of it
4) Backup the database for recovery purposes (recommended)
If the database is in NOARCHIVELOG mode, you have to shutdown the DB, put it in the MOUNT state, etc, etc.  That’s certainly not that hard to do, but you get the feeling that there should be a better way.  Now in Oracle 12c, there is – using the ALTER DATABASE MOVE DATAFILE command.  With this command, you can move a datafile, while it’s online, in one simple step.  Let’s set this up.
SQL> create tablespace test datafile '/oracle/base/oradata/TEST1/datafile/test01.dbf' size 10m;
Tablespace created.
SQL> create table altdotoracle.tab1 (col1 number) tablespace test;
Table created.
SQL> insert into altdotoracle.tab1 values (1);
1 row created.
SQL> commit;
Commit complete.
Let’s go the extra mile and lock the table in that datafile in another session.
SQL> lock table altdotoracle.tab1 in exclusive mode;
Table(s) Locked.
Now let’s use the command.
SQL> alter database move datafile '/oracle/base/oradata/TEST1/datafile/test01.dbf'
  2  to '/oracle/base/oradata/TEST1/datafile/newtest01.dbf';
Database altered.
That’s all there is to it.  Datafile moved/renamed in one step while a table it contained was locked.
SQL> select file_name from dba_data_files where file_name like '%newtest%';
FILE_NAME
--------------------------------------------------------------------------------
/oracle/base/oradata/TEST1/datafile/newtest01.dbf
Categories: DBA Blogs

Intermediate Python Practical Techniques for Deeper Skill Development

Surachart Opun - Thu, 2014-06-19 09:40
It's time to learn more about Python. I found the "Intermediate Python Practical Techniques for Deeper Skill Development" video course by Python expert Steve Holden.
It's very useful for learning Python by video, but viewers should already know the basics of Python, and they must install IPython.
Note: start IPython with the "ipython notebook" command. The example code for the course can be downloaded at https://github.com/DevTeam-TheOpenBastion/int-py-notes

This video course covers Python topics in depth using IPython, including:

  • Functions: return values, arguments, decorators, and the function API
  • Comprehensions, generator functions, and generator expressions
  • Understanding the import system and namespace relationships
  • Using the Python DB API to query and maintain relational data, and JSON to extract data from the Web
  • The NumPy, SciPy, and Matplotlib libraries for numerical and analytical computing
  • An introduction to unit testing with unittest
  • Deeper understanding of Unicode, with explanations of encoding and decoding techniques and the relationship between byte strings and text
  • An introduction to textual analysis using regular expressions
  • Information sources for documentation, further research, and coding style considerations
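To give a flavor of two of the topics above (decorators and generator expressions), here is a minimal self-contained sketch; it is my own illustration, not taken from the course materials:

```python
import functools

def logged(func):
    """Decorator that records the arguments of each call to the wrapped function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls.append(args)
        return func(*args, **kwargs)
    wrapper.calls = []
    return wrapper

@logged
def square(x):
    return x * x

# A generator expression evaluates lazily: nothing is computed
# until sum() consumes the values one by one.
total = sum(square(n) for n in range(4))
print(total)         # 0 + 1 + 4 + 9 = 14
print(square.calls)  # the four recorded calls: [(0,), (1,), (2,), (3,)]
```

Because of functools.wraps, the decorated function keeps its original name and docstring, which is one of the function-API details the course discusses.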

First of all, users should install IPython and download the example code. This makes each topic easier to learn, because it's easy to follow along with each example demo in the video. This video course, together with IPython, is a very good way to improve your Python skills.

Written By: Surachart Opun http://surachartopun.com
Categories: DBA Blogs

Service Reliability: How Systems Administration is Evolving – Part Two

Pythian Group - Thu, 2014-06-19 08:06

In part one of this mini-series, I wrote about how the old way of systems administration is broken and shared a story from my personal experience. Today I’ll be talking about how it has evolved with Site Reliability Engineering (also known as Service Reliability Engineering or SRE).

SRE is born

Interestingly, this is a concept I was pushing back around 2008-2010, but unfortunately my ideas fell on deaf ears.  How do Netflix, Facebook, Dropbox, and Google provide such reliable service? First, I can tell you what they don’t do: they don’t throw more hardware or RAM at the problem.

So how do you do it?

First everyone has to agree that service reliability is everyone’s problem.  It is the reason our jobs exist! Management must agree, because the way a company thinks and operates needs to change from the top down, and the focus needs to be on service reliability.

Operations and Development teams need to be aligned so that outages are everyone’s problem. Remember the story in my previous blog post? What do you think would have happened if I had forced the developers to reboot the servers every 12 hours, then every 24 hours? Would it have taken weeks or months to resolve the problem? Of course not.

You need your Ops team to know how to code, and understand development. I’m not saying they have to actually be on Dev team, but they need to understand coding, speak the language, and script their way out of issues. Most importantly, they must be on the hook if things go wrong.

At Google and Facebook, their developers are on call. If a release suddenly causes a significant amount of paging/failures, they don’t hire more Ops employees. The developers fill the gap until the issue is fixed, which clearly happens very quickly.

No one wants to be getting paged all night, especially developers. If they’re getting paged due to a bug in their software, you can bet the issue will be resolved in days, not weeks. Making developers responsible for the service reliability means they are required to think about failure in their software development – they’ll have to design in graceful failure modes, and expect their underlying infrastructure to be unreliable. If their little widget takes down the entire site, you can be sure they’re going to be up until it’s corrected.

The bottom line is that software and hardware come together to provide a service, and one isn’t more important than the other. Major wins in reliability can be realized if you align software design, automation, and hardware design.

At Pythian, we offer this as a managed service. Our team has the experience, background, and skills to help get you to this promised land of highly reliable, scalable systems. We are building a team of Site Reliability Engineers to help companies benefit and transition to this new paradigm in systems administration, where service reliability is everyone’s problem.

Categories: DBA Blogs

Bought Matthew Morris’ OCP 12c upgrade book for Key DBA Skills section

Bobby Durrett's DBA Blog - Wed, 2014-06-18 17:22

I haven’t even started on Sam Alapati’s OCP book, but I realized today that Matthew Morris’s OCP 12c upgrade book has an outline of the Key DBA Skills section of the OCP exam, and that is what I need most.  I was two questions away on that section when I took the test the first time, and I need to make sure I’m studying the right things.  After I got back to the office after failing my first attempt (a $245 lesson), I made notes about what I remembered being unsure of, and that gave me some ideas of what to study.  But it is nice to get some advice from someone else who has passed the exam.  I saved a couple of bucks by getting the Kindle version for $9.99, so it seemed like a safe bet.  Hopefully I can hit it hard on both books and my practice test software, which also covers the key DBA skills, and be able to take the test again by the end of August.

Categories: DBA Blogs

Service Reliability: How Systems Administration is Evolving – Part One

Pythian Group - Wed, 2014-06-18 07:52

The world of systems administration is changing, and it’s affecting everyone involved. Today’s blog post is the first of two in a mini-series dedicated to Service Reliability: How Systems Administration is Evolving.

The days of formulas that tell us “you need X system admins for every Y physical servers and every Z VMs” are coming to a close. Even the world of IT management is changing.

Why? Because as scale increases, it’s simply impossible to continue at the pace we were at. Google saw this in the mid 2000’s and began the next evolution of systems administration. They recognized that there was no way they could scale up the way things had been managed for decades. In fact, Netflix came to a similar conclusion.

Interestingly, what Google did has nothing to do with technology; rather, it has to do with the philosophy of systems administration. They started a new group originally called Production Engineering, later renamed Site Reliability Engineering, also known as Service Reliability Engineering or SRE.  At its core, SRE changes the fundamental thinking of IT management: it recognizes site reliability as everyone’s responsibility. Some might say that’s obvious, but in the past it wasn’t.

(Image: “Worked fine in dev” meme)

The old way is broken

Most companies have two very separate and distinct groups. Operations and Development. Historically these two groups are highly siloed, and in some cases, do not get along very well. Why? It comes down to philosophy, really.

Operations folks are driven to ensure systems are up, secure, and reliable. Developers, on the other hand, are driven to create cool new features and applications. Here lies one of the biggest problems.

Years back I worked as an Operations Director, and had a counterpart on the development side who was the Software Engineering Director. We had just completed releasing a major update for one of our platforms, and very quickly we saw we had major issues. Our primary application servers (25+ physical boxes) were becoming unstable after about 12 hours of production load (I won’t go into why this happened, that’s a story for another day.) We quickly identified this, so the Ops team began rebooting these boxes in a rolling fashion. They were boxes that had some specialized hardware in them, and starting/stopping, then testing them took about 15-30 minutes each.  We had a team of about 5 people, which was not a 24/7 group. Clearly this caused significant pain for our Operations staff. We determined that part of the problem was a memory leak. Due to the nature of the release, rolling back simply was not an option.

The initial response I received was that we would just have to deal with it for now, as there were a few other pressing issues they wanted to resolve first. After many sleepless nights and lost weekends, we finally were able to get an update so the systems only needed to be rebooted daily, 7 days a week. It stayed this way for months.

Crazy, right?

But why? It was because the software team, and the management we both reported to, was far more interested in hitting deadlines for features, and new functionality – not how much sleep, or how many days off our Ops employees were getting. I was told on more than one occasion that high availability and recovery were Ops problems, not Development problems.

The core of this problem is simple. Development felt that service reliability was 100% an Operations problem. Our new release takes 2x more RAM? Add more RAM to 100 servers! Our new application requires 20 new servers? Sure, with some work it could be cut down to 2-3, but just get the 20 servers. That’s easy!

Without naming names, has anyone else faced this issue? Comment below. Stay tuned for part two, where I’ll be discussing the birth of SRE, how it’s allowed systems administration to evolve, and how to achieve it.

Categories: DBA Blogs

Oracle Announces the Next-Generation of Virtual Compute Appliance

Oracle Virtual Compute Appliance, the latest addition to Oracle's Engineered Systems portfolio, enables rapid, repeatable software-defined infrastructure deployment for virtually any x86...

We share our skills to maximize your revenue!
Categories: DBA Blogs

Room for Improvement – Using DBMS_REDEFINITION with Spatial Data

Pythian Group - Tue, 2014-06-17 07:53
Resizing a Column in a Table With a Spatial Column

Recently a client asked for help with using DBMS_REDEFINITION to modify a column in a table.

As the intent was to reduce the precision of a numeric column, simply modifying the column would not work, as there was already data in the column:

13:51:44 ora11203fs.jks.com - jkstill@js01 SQL> /
alter table t modify(id number(6,2))
*
ERROR at line 1:
ORA-01440: column to be modified must be empty to decrease precision or scale

The client had made a few attempts to use DBMS_REDEFINITION, but ran into several issues with each attempt.

In this case the owner of the destination table was different from the owner of the source table.
As per My Oracle Support, this situation requires granting several permissions that may not normally be necessary for using DBMS_REDEFINITION.
However, I have seen similar problems when the source and destination user are the same.

The client's database version is 11.2.0.3, running on Solaris.
These tests were also performed on 11.2.0.3, but on Linux 5 rather than Solaris.

Most of the issues involve permissions that must be granted to the owner of the destination object.
This is true even if the job is being run as SYSDBA.
These errors were due to the table in question having a column of spatial data with a corresponding spatial index.

17:27:49 ora11203fs.jks.com - sys@js01 SQL> desc spdata
 Name                                      Null?    Type
 ----------------------------------------- -------- --------------------------------
 ID                                        NOT NULL NUMBER(28)
 RADIUS                                    NOT NULL NUMBER(16,8)
 LOCATION                                  NOT NULL MDSYS.SDO_GEOMETRY

Workflow

The information in this article will be presented somewhat backwards to the way that is usually seen.
First I will show a working example of using DBMS_REDEFINITION to redefine the column.
Following that some of the problems will be highlighted, and then some technical references shown.

Doing so will make this article a little more user-friendly, I think.
If you are facing a similar issue then the parts you need the most are right up front.

A Working Example

Following are the steps to create the test data, setup DBMS_REDEFINITION and complete the process

Create the Test Data

The first step is to create a test user. The permissions granted to this user are quite important, as will be demonstrated later on.

-- redefdest_user.sql

create user redefdest identified by redefdest;

grant resource, connect, create session, unlimited tablespace to redefdest;

grant alter session to redefdest;

grant execute on dbms_redefinition to redefdest;

grant create any sequence  to redefdest;
grant create any table  to redefdest;
grant alter any table  to redefdest;
grant drop any table  to redefdest;
grant lock any table  to redefdest;
grant select any table to redefdest;
grant create any index to redefdest;
grant create any trigger  to redefdest;

Once this process is completed, remember to revoke any privileges that are not needed on a permanent basis.

Now let’s create a simple table with some spatial data:

-- tab_create.sql

create table redefdest.spdata
(
   id number(28,0) not null,
   radius number(16,8) not null,
   location mdsys.sdo_geometry not null
);

The next step is to setup the Geometry Metadata. This must be run as the owner of the SPDATA table.

-- insert_geo_data.sql

delete from user_sdo_geom_metadata where  table_name = 'SPDATA' and column_name = 'LOCATION';

insert into user_sdo_geom_metadata
(table_name, column_name, diminfo, srid)
values
(
   'SPDATA', 'LOCATION',
   sdo_dim_array (
      sdo_dim_element ('Longitude', -180, 180, 0.005),
      sdo_dim_element ('Latitude', -90, 90, 0.005)
   ),
   8307 -- SRID - Spatial Reference IDentifier - see mdsys.cs_srs
);

commit;

select * from user_sdo_geom_metadata;

Now we can insert some test data into the table.

-- insert.sql

insert into spdata(id, radius, location)
select
   id,
   dbms_random.value(10000,20000),
   sdo_geometry(2001, 8307,
       sdo_point_type (floor(dbms_random.value(-180,180)),floor(dbms_random.value(-90,90)) , null),
       null, null
   )
from (
select level id
from dual
connect by level <= 100
) data;

commit;

Now create indexes on the ID column and the spatial data column.

-- idx_create.sql

create index redefdest.spdata_id_idx on redefdest.spdata (id);

create index redefdest.spdata_location_idx on redefdest.spdata (location)
   indextype is mdsys.spatial_index  parameters ('SDO_DML_BATCH_SIZE=2000');
Configure DBMS_REDEFINITION

The goal of this exercise is to change the scale and precision of the RADIUS column.

That cannot be done directly via ALTER TABLE, as any attempt to do so will cause ORA-01440.

The first task is to create the interim table. This table will be created nearly identical to the source table. The RADIUS column in the interim table will have a smaller scale and precision than the source table.

The columns for the table SPDATA_INTERIM are all set as nullable.
The reason for this will be explained later on.

-- create_interim_table.sql

create table redefdest.spdata_interim
(
   id number(28,0),
   radius number(12,4),
   location mdsys.sdo_geometry
);

Now geo data for the interim table must be created. This is not the same script as seen previously. Though the script name is the same, it is stored in a different directory. The only difference from the previous script is the name of the table used – SPDATA_INTERIM rather than SPDATA.
As before, this SQL must be run as the owner of the table in question, SPDATA_INTERIM in this case.

-- insert_geo_data.sql

insert into user_sdo_geom_metadata
(table_name, column_name, diminfo, srid)
VALUES
(
   'SPDATA_INTERIM', 'LOCATION',
   sdo_dim_array (
      sdo_dim_element ('Longitude', -180, 180, 0.005),
      sdo_dim_element ('Latitude', -90, 90, 0.005)
   ),
   8307 -- SRID  - see mdsys.cs_srs
)
/

commit;

Now that the interim table has been created and the geo data inserted, the redefinition process can begin:

-- redefine.sql

  1  declare
  2  v_col_map varchar(2048) := 'ID ,RADIUS ,LOCATION';
  3  begin
  4     dbms_redefinition.start_redef_table (
  5              uname          => 'REDEFDEST'
  6             ,orig_table     => 'SPDATA'
  7             ,int_table      => 'SPDATA_INTERIM'
  8             ,col_mapping    => v_col_map
  9             ,options_flag   => dbms_redefinition.cons_use_rowid
 10             ,orderby_cols   => null
 11             ,part_name      => null
 12  );
 13* end;
17:34:51 ora11203fs.jks.com - jkstill@js01 SQL> /

PL/SQL procedure successfully completed.

Once the redefinition process has completed, the dependent objects can be created.

-- copy_dependent_objects.sql

  1  declare
  2     v_number_of_errors number := 0;
  3  begin
  4     dbms_redefinition.copy_table_dependents(
  5             uname             => 'REDEFDEST'
  6             ,orig_table       => 'SPDATA'
  7             ,int_table        => 'SPDATA_INTERIM'
  8             ,copy_indexes     =>  dbms_redefinition.cons_orig_params
  9             ,copy_triggers    =>  true
 10             ,copy_constraints =>  true
 11             ,copy_privileges  =>  true
 12             ,ignore_errors    => false
 13             ,num_errors       => v_number_of_errors
 14             ,copy_statistics  => true
 15             ,copy_mvlog       => true
 16     );
 17     dbms_output.put_line('Number of Errors' || v_number_of_errors);
 18* END;
17:35:58 ora11203fs.jks.com - jkstill@js01 SQL> /

PL/SQL procedure successfully completed.

And now finish the redefinition process, check the SPDATA.RADIUS column to ensure it is now NUMBER(12,4), and drop the interim table (now the original table).

-- finish_redef.sql

  1  begin
  2    dbms_redefinition.finish_redef_table (
  3      uname          => 'REDEFDEST'
  4     ,orig_table     => 'SPDATA'
  5     ,int_table      => 'SPDATA_INTERIM'
  6  );
  7* end;
17:35:59 ora11203fs.jks.com - jkstill@js01 SQL> /

17:36:43 ora11203fs.jks.com - jkstill@js01 SQL> alter table redefdest.spdata modify (id not null);

Table altered.

17:44:05 ora11203fs.jks.com - jkstill@js01 SQL>  alter table redefdest.spdata modify (radius not null);
alter table redefdest.spdata modify (radius not null)
                                     *
ERROR at line 1:
ORA-01442: column to be modified to NOT NULL is already NOT NULL

17:44:05 ora11203fs.jks.com - jkstill@js01 SQL> alter table redefdest.spdata modify (location not null);

Table altered.

PL/SQL procedure successfully completed.

16:01:40 ora112304a.jks.com - redefdest@js02 SQL> desc spdata
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                        NOT NULL NUMBER(28)
 RADIUS                                             NUMBER(12,4)
 LOCATION                                  NOT NULL MDSYS.SDO_GEOMETRY

16:01:40 ora112304a.jks.com - redefdest@js02 SQL> select count(*) from spdata where radius is not null;

  COUNT(*)
----------
       100

With all of the required permissions in place, everything works as expected.

Getting to that point however required reading a few Oracle Support notes and some experimentation.

There are several ways this process can fail if one of the requirements is not met.

Alter/Create/Drop Any Table

If the owner of the tables is missing any of ALTER/CREATE/DROP ANY TABLE, the process will fail when copying dependent objects.
This is due to a bug in Oracle.

In this test the CREATE ANY TABLE privilege has been revoked, and the entire test re-run:

18:04:25 ora11203fs.jks.com - sys@js01 SQL> revoke create any table from redefdest;

Revoke succeeded.

## Copy Dependents ##

declare
*
ERROR at line 1:
ORA-29855: error occurred in the execution of ODCIINDEXCREATE routine
ORA-13249: internal error in Spatial index: [mdidxrbd]
ORA-13249: Error in Spatial index: index build failed
ORA-13249: Error in R-tree: [mdrcrtscrt]
ORA-13231: failed to create index table [MDRT_190DB$] during R-tree creation
ORA-13249: Stmt-Execute Failure: CREATE TABLE "REDEFDEST".MDRT_190DB$ (NODE_ID NUMBER, NODE_LEVEL NUMBER, INFO BLOB)
LOB (INFO) STORE AS (NOCACHE)  PCTFREE 2
ORA-29400: data cartridge error
ORA-01031: insufficient privileges
ORA-06512: at "SYS.DBMS_REDEFINITION", line 1155
ORA-06512: at "SYS.DBMS_REDEFINITION", line 1885
ORA-06512: at line 4

Likewise if CREATE ANY SEQUENCE is revoked, the process will fail.
(CREATE ANY TABLE had already been re-granted)


18:12:23 ora11203fs.jks.com - sys@js01 SQL> revoke create any sequence from redefdest;

Revoke succeeded.

## Copy Dependents ##

declare
*
ERROR at line 1:
ORA-29855: error occurred in the execution of ODCIINDEXCREATE routine
ORA-13249: internal error in Spatial index: [mdidxrbd]
ORA-13249: Error in Spatial index: index build failed
ORA-13249: Error in R-tree: [mdrcrtscrt]
ORA-13231: failed to create index table [MDRS_190F9$] during R-tree creation
ORA-13249: Stmt-Execute Failure: CREATE SEQUENCE "REDEFDEST".MDRS_190F9$ ORDER START WITH 1 CACHE 100
ORA-29400: data cartridge error
ORA-01031: insufficient privileges
ORA-06512: at "SYS.DBMS_REDEFINITION", line 1155
ORA-06512: at "SYS.DBMS_REDEFINITION", line 1885
ORA-06512: at line 4

So each of these permissions is required for this process to succeed, even when running as SYSDBA.

Permissions Required

The test objects created for this article are quite simple.
As such there are other potential errors that could occur due to the destination account not having all required permissions for a successful use of dbms_redefinition on a table with spatial data.
Before granting new privileges to an account, be sure to audit the current privileges.
That way you will know which privileges can be safely revoked when the table modifications are complete.

Constraint Issues

There are some bugs associated with DBMS_REDEFINITION.

The COPY_DEPENDENT_OBJECTS subprogram seemingly does not handle some constraints very well.

The original table SPDATA was created with all columns set to NOT NULL.

The interim table SPDATA_INTERIM was created with all columns set to NULL.

If the interim table is created with one or more columns as NOT NULL, the following error occurs:

## Copy Dependents ##
declare
*
ERROR at line 1:
ORA-01442: column to be modified to NOT NULL is already NOT NULL
ORA-06512: at "SYS.DBMS_REDEFINITION", line 984
ORA-06512: at "SYS.DBMS_REDEFINITION", line 1899
ORA-06512: at line 4

Part of the job of the procedure is to create check constraints as found on the original table.

That part of the process doesn’t seem to work quite correctly.

When the process has completed, the columns appear as NULLable to the sqlplus DESC command, but that is not the case.

18:21:15 ora11203fs.jks.com - jkstill@js01 SQL> desc spdata
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                                 NUMBER(28)
 RADIUS                                             NUMBER(12,4)
 LOCATION                                           MDSYS.SDO_GEOMETRY

18:21:17 ora11203fs.jks.com - jkstill@js01 SQL> insert into spdata values(null,null,null);
insert into spdata values(null,null,null)
                          *
ERROR at line 1:
ORA-01400: cannot insert NULL into ("REDEFDEST"."SPDATA"."ID")

18:21:35 ora11203fs.jks.com - jkstill@js01 SQL> @check_cons

TABLE NAME		       CONSTRAINT_NAME		      C SEARCH_CONDITION	       STATUS
------------------------------ ------------------------------ - ------------------------------ --------
SPDATA			       SYS_C0018231		      C "ID" IS NOT NULL	       ENABLED
SPDATA			       SYS_C0018232		      C "RADIUS" IS NOT NULL	       ENABLED
SPDATA			       SYS_C0018233		      F "LOCATION" IS NOT NULL	       ENABLED

3 rows selected.

The output of the DESC command and the contents of DBA_CONSTRAINTS do not agree.

While it would be interesting to dig into the lower-level reasons for why this is happening, it is for now just an inconvenience while there are more pressing matters to work on.

References

Following are the Oracle Notes considered when troubleshooting this issue.

Registering constraints

The DBMS_REDEFINITION.REGISTER_DEPENDENT_OBJECT procedure could be used to better deal with constraints, at least according to this Oracle Note:
HOW TO USE DBMS_REDEFINITION.REGISTER_DEPENDENT_OBJECT (Doc ID 1304838.1)
I have not yet tried this.

In regard to desc table not showing constraints

This may be the issue: Bug 16023293 : DBMS_REDEFINITION.COPY_TABLE_DEPENDENTS DOES NOT VALIDATE ALL CONSTRAINTS
This bug affects Oracle 11.2.0.3 on Solaris, so it seems likely to be an issue for the client, and may well affect other platforms as well.

Direct privs granted to table owner

How to Re-Organize a Table Online (Doc ID 177407.1)
ORA-13233 and ORA-01031 When Creating a Spatial Index in Another Schema (Doc ID 251593.1)

Test Code: spatial_redef_dist.zip

Categories: DBA Blogs

OPN Solutions Catalog Redesigned

Good news for our partners and our customers looking for partner solutions! We are pleased to announce the launch of the brand new, redesigned and mobile-ready OPN Solutions Catalog. The OPN...

Categories: DBA Blogs

The Hitchhiker’s Guide to the EXPLAIN PLAN (Act II)

Iggy Fernandez - Mon, 2014-06-16 13:04
Over at ToadWorld: Part 5: SQL Sucks! Part 6: Trees Rule Part 7: Don’t pre-order your EXPLAIN PLAN Part 8: Tree Menagerie Bonus article: Equivalence of Relational Algebra and Relational Calculus The story so far: A relational database is “a database in which: the data is perceived by the user as tables (and nothing but tables)  and […]
Categories: DBA Blogs

Instant REST API For Any Database

Pythian Group - Mon, 2014-06-16 07:48

Not so long ago, I was playing with ElasticSearch, which has the interesting characteristic of having a REST API as its primary interface. Sure, it’s a little more stilted and awkward than any native interface but, on the other hand, it’s a nice universal type of API. Any language that can make a http request can talk to it and, hey, bad comes to worse, even ‘curl’ will do. It would be kinda cool if other databases had such a web service.

And then I began to think…

Don’t we have DBIx::Class::Schema::Loader, which can connect to a database and auto-generate its DBIx::Class schema?


package MyDB;

use parent 'DBIx::Class::Schema::Loader'; 

...;

# later on

my $schema = MyDB->connect( 'dbi:SQLite:foo.db' ); # boom, we have our schema

And once we have a DBIx::Class representation of a schema, can’t we introspect it and pretty much get everything there is to know about it?


use Data::Printer;

# get all the table names
my @tables = $schema->sources;

# and all the columns of all the tables
for my $table ( $schema->sources ) {
    say "Table $table";
    p $schema->source($table)->columns_info;
}

That is, that’s if we want to do it manually, considering that there’s already SQL::Translator that can do most of the job for us.


use SQL::Translator;

print SQL::Translator->new (
    parser      => 'SQL::Translator::Parser::DBIx::Class',
    parser_args => {
        dbic_schema => $schema,
    },
    producer    => 'JSON',
)->translate;

Of course, since we are talking web service, we will want to pass everything back and forth using JSON, including database entries. Well, that’s hardly a problem if we use DBIx::Class::Helper::Row::ToJSON.

So it seems we have the database side covered. For the web framework? You’ll probably not be surprised to see me go with Dancer. Not only can we leverage the serializers and plugins like Dancer::Plugin::DBIC, but setting routes are ridiculously easy.


get '/_tables' => sub {
    return [ schema->sources ];
};

Even niftier: remember that Dancer routes are defined at runtime, so we can introspect that schema as much as we want and come up with any route we can dream of.


my @primary_key = schema->source($table)->primary_columns;
my $row_url = join '/', undef, $table, ( '*' ) x @primary_key;
 # GET /<table>/<primary key values>
get $row_url => sub {
    my @ids = splat;
    return $schema->resultset($table)->find({
        zip @primary_key, @ids
    });
};
 # GET /<table>
get "/$table" => sub {
    my @things = $schema->resultset($table)->search({ params() })->all;
    return \@things;
};
 # create new entry
post "/$table" => sub {
    $schema->resultset($table)->create({ params() });
};

Added bonus: the way Dancer’s params() conglomerate parameters defined in the query string and in the serialized body of the request plays in our favor: simple queries can be passed directly via the url, and more complicated ones can be defined as JSON structures.
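Since the service speaks plain HTTP and JSON, any language can play client. A sketch of the two query styles in Python (the table and column names here are hypothetical examples, and no request is actually sent):

```python
import json
from urllib.parse import urlencode

# Table route as served by waack; "Album" is one of the tables
# listed by GET /_tables in the Digikam example.
base = "http://enkidu:3000/Album"

# Simple query: parameters go straight into the query string.
simple_url = base + "?" + urlencode({"relativePath": "/2014"})

# More complicated query: a nested structure serialized as the JSON
# request body (sent with Content-Type: application/json), which
# Dancer's params() merges with any query-string parameters.
complex_body = json.dumps({"date": {">=": "2014-01-01"}})

print(simple_url)
print(complex_body)
```

Either way, the route handler sees one flat bag of parameters, which is exactly the behavior described above.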

So, you put all of this together, and you obtain waack. All it needs is a DSN pointing to the right database (and credentials, if needed). To illustrate, let's try it with my Digikam SQLite database.


$ waack dbi:SQLite:digikam4.db
>> Dancer 1.3124 server 28914 listening on http://0.0.0.0:3000
>> Dancer::Plugin::DBIC (0.2100)
== Entering the development dance floor ...

And now, let’s fire up App::Presto as our REST client.


$ presto http://enkidu:3000

http://enkidu:3000> type application/json

First, we can retrieve all the table names.


http://enkidu:3000> GET /_tables
[
   "TagsTree",
   "ImageMetadata",
   "Tag",
   "Setting",
   "ImageRelation",
   "ImageTag",
   "ImageProperty",
   "ImageInformation",
   "ImageHaarMatrix",
   "ImageCopyright",
   "VideoMetadata",
   "ImageHistory",
   "DownloadHistory",
   "Search",
   "ImageTagProperty",
   "Image",
   "Album",
   "ImagePosition",
   "TagProperty",
   "AlbumRoot",
   "ImageComment"
]

We can also get the whole schema.


http://enkidu:3000> GET /_schema
{
   "translator" : {
      "producer_args" : {},
      "show_warnings" : 0,
      "add_drop_table" : 0,
      "parser_args" : {
         "dbic_schema" : null
      },
      "filename" : null,
      "no_comments" : 0,
      "version" : "0.11018",
      "parser_type" : "SQL::Translator::Parser::DBIx::Class",
      "trace" : 0,
      "producer_type" : "SQL::Translator::Producer::JSON"
   },
   "schema" : {
      "tables" : {
         "ImageRelations" : {
            "options" : [],
            "indices" : [],
            "order" : "12",
            "name" : "ImageRelations",
            "constraints" : [
               {
                  "type" : "UNIQUE",
                  "deferrable" : 1,
                  "name" : "subject_object_type_unique",
                  "on_delete" : "",
                  "reference_fields" : [],
                  "fields" : [
                     "subject",
                     "object",
                     "type"
                  ],
                  "match_type" : "",
                  "reference_table" : "",
                  "options" : [],
                  "expression" : "",
                  "on_update" : ""
               }
            ],
...

Too much? We can get the columns of a single table.


http://enkidu:3000> GET /Tag/_schema
{
   "iconkde" : {
      "is_nullable" : 1,
      "data_type" : "text",
      "is_serializable" : 1
   },
   "name" : {
      "is_serializable" : 1,
      "data_type" : "text",
      "is_nullable" : 0
   },
   "id" : {
      "is_nullable" : 0,
      "data_type" : "integer",
      "is_auto_increment" : 1,
      "is_serializable" : 1
   },
   "icon" : {
      "is_nullable" : 1,
      "data_type" : "integer",
      "is_serializable" : 1
   },
   "pid" : {
      "is_serializable" : 1,
      "is_nullable" : 1,
      "data_type" : "integer"
   }
}

Query that table, with a simple condition…


http://enkidu:3000> GET /Tag id=1
[
   {
      "name" : "orchid",
      "icon" : null,
      "id" : 1,
      "pid" : 0,
      "iconkde" : null
   }
]

… or with something a little more oomphie.


$ curl -XGET -H Content-Type:application/json --data '{"name":{"LIKE":"%bulbo%"}}' http://enkidu:3000/Tag
[
   {
      "pid" : 1,
      "name" : "Bulbophyllum 'Melting Point'",
      "icon" : null,
      "id" : 32,
      "iconkde" : "/home/yanick/Pictures/My Plants/IMG_0461.JPG"
   },
   {
      "id" : 56,
      "iconkde" : "tag",
      "icon" : null,
      "pid" : 39,
      "name" : "Bulbophyllum ebergardetii"
   },
   {
      "name" : "bulbophyllum",
      "pid" : 564,
      "iconkde" : null,
      "id" : 565,
      "icon" : 0
   }
]

Btw: I cheated for that last one. Presto doesn't send a body with GET requests, and Dancer doesn't deserialize GET bodies either. Patches will be written tonight.

Anyway, back to the show. We can also select specific rows by primary key.


http://enkidu:3000> GET /Tag/1
{
   "id" : 1,
   "iconkde" : null,
   "pid" : 0,
   "icon" : null,
   "name" : "orchid"
}

Create new rows.


http://enkidu:3000> POST /Tag '{"name":"nepenthes","pid":0}'
{
   "pid" : 0,
   "name" : "nepenthes",
   "iconkde" : null,
   "icon" : null,
   "id" : 569
}

And do updates.


http://enkidu:3000> PUT /Tag/569 '{"icon":"img.png"}'
{
   "icon" : "img.png",
   "iconkde" : null,
   "pid" : 0,
   "name" : "nepenthes",
   "id" : 569
}

Not too shabby, is it? Especially considering that, if you look at the source of waack, you'll see it barely clocks in at just over 100 lines of code. Take a minute and let that sink in.

One hundred lines of code. For a universal database REST web service.

If that’s not standing on the shoulders of giants, then I don’t know what is.

Categories: DBA Blogs

Internet Scale Design: Part Two

Pythian Group - Fri, 2014-06-13 08:01

In my previous blog post, I emphasized that internet scale design can be implemented for any type of company, whether it's a small, bootstrapped startup or a rapidly growing, well-funded tier 2. But if it suits that many companies, why isn't everyone moving into the cloud? In my opinion, there are two reasons.

First, the model of utility computing doesn't fit every business. It is most effective where demand fluctuates, where larger-scale systems see peaks and valleys. It also works well as a way to get a startup or project off the ground with little-to-no capital investment. In the story I began in my previous blog post, the seasonality of their environment made them a perfect candidate.

The second is more of a people problem. In many companies, the IT leadership, SysAdmins, Developers, DBAs, and everyone else involved in service management have been working with the same technology stack for years. It's important to remember that most SysAdmins see their primary job as keeping things up and running, so we typically prefer working with the things we know over the things we don't.

If a C-level executive or VP returns from a conference about cloud and issues a mandate to "move everything to the cloud!" to remain "competitive," the SysAdmins will likely fail. Why? Not because they aren't smart enough, but because they simply don't know enough about it.

While it would be ideal for the COO to say, “I want us to look into moving our platform into AWS, so I’m going to send you to get Amazon certified,” it rarely happens. Usually it sounds more like, “You’re smart, you’ll figure it out. Oh sure, you already have a full-time job keeping the lights on, but just squeeze it in when you can. We’ll need to see a POC by the end of the quarter.”

I don't need to tell you how this ends: it's obvious. It will fail almost every time.

One of the amazing benefits of the Pythian model is that our teams are exposed to a wide variety of systems. We have built auto-scaling systems in AWS, OpenStack systems, and VMware systems, and we also support legacy physical systems. Our teams are not relegated to whichever tech stack they happened to be stuck with for the last five years.

The bottom line here is that it doesn't matter what kind of company you're at: whether it's a small retailer, a midsize tier 3, or a larger tier 2, if you're willing to sign on for the concept of site reliability engineering and commit to it, together we can accomplish some amazing things, all at a price you can afford.

Categories: DBA Blogs

Log Buffer #375, A Carnival of the Vanities for DBAs

Pythian Group - Fri, 2014-06-13 07:57

What's better: watching the Football World Cup or reading Log Buffer? Yes, right, Log Buffer, but please also take out some time to watch the matches, as this tournament comes around only once every 4 years. No? Okay, as you say. So read along then.

Oracle:

Alan Hargreaves gives his reasons on why you should patch NTP.

This post examines the results of a recent database protection survey conducted by Database Trends and Applications (DBTA) Magazine.

Keep jobs active with screen command.

ORE Getting Connected: ore.connect and other commands.

End-to-End ODI12c ETL on Oracle Big Data Appliance Pt.4 : Transforming Data using Python & Hive Streaming

SQL Server:

Enabling and Configuring Reporting and Logging for Maintenance Plans in SQL Server 2012

The articles collected here will help you understand the theories and methodologies behind every stage of the database delivery pipeline, starting when database changes are checked in, and ending when they’re deployed to production.

Stairway to Database Source Control Level 2: Getting a Database into Source Control.

What does it take to become a database administrator, and what traits should you look for when hiring a DBA? Those traits can be summarized in two categories: technical and personal.

Display all errors thrown within a catch block by a restore statement in a stored procedure using extended events.

MySQL:

RHEL7 & the transition from MySQL to MariaDB: A first look.

FairWarning Privacy Monitoring Solutions Rely on MySQL to Secure Patient Data

MariaDB 5.5.38 Overview and Highlights

Recruiters Looking for MySQL DBAs and MySQL Developers

MariaDB Galera Cluster 10.0.11 now available

On-disk/block-level encryption for MariaDB

Categories: DBA Blogs

Gather Statistics Enhancements in 12c

Hemant K Chitale - Fri, 2014-06-13 01:32
Here are 5 posts that I did on Gather Statistics Enhancements in 12c:

1. During a CTAS

2. In a Direct Path INSERT

3. Reports on Statistics

4. Does not COMMIT a GTT

5. Report on COL_USAGE


Categories: DBA Blogs