
Pythian Group

Love Your Data

Log Buffer #448: A Carnival of the Vanities for DBAs

Fri, 2015-11-06 11:38

This Log Buffer is dedicated to top-quality news from the arenas of Oracle, SQL Server, and MySQL.

Oracle:
  • We had a question on AskTom the other day, asking us to explain what a “latch” was.
  • Jonathan Lewis thinks column groups can be amazingly useful in helping the optimizer to generate good execution plans because of the way they supply better details about cardinality.
  • Today it’s all about developing software that makes access to your product easier.
  • Steve Jobs sets a great perspective on the journey of simplicity. It starts from simple, goes through complexity and ends up in simplicity.
  • AWR period comparison is pretty easy if you have access to the two periods in the same AWR repository.

SQL Server:

  • Understanding Peer-to-Peer Transactional Replication, Part 2.
  • Knee-Jerk Wait Statistics : PAGELATCH.
  • Stairway to Columnstore Indexes Level 5: Adding New Data To Columnstore Indexes.
  • SQL Server Reporting Services General Best Practices.
  • Hello Azure: Azure IaaS – Getting Started.
  • A Skills Roadmap for DBAs in the Cloud Era.

MySQL:
  • MySQL Performance: 1M QPS on mixed OLTP_RO with MySQL 5.7 GA.
  • Deploying MongoDB, MySQL, PostgreSQL & MariaDB’s MaxScale in 40min.
  • ClusterControl Tips & Tricks: wtmp Log Rotation Settings for Sudo User.
  • Setting-up second mysql instance & replication on Linux in 10 steps.
  • s9s Tools and Resources: ‘Become a MySQL DBA’ series, ClusterControl 1.2.11 release, and more!


Learn more about Pythian’s expertise in Oracle, SQL Server & MySQL.

Categories: DBA Blogs

Nagios Authentication with Active Directory

Wed, 2015-11-04 10:14


Nagios authentication with Active Directory aligns with the user-management consolidation policies of most organizations. This post explains how to set up Nagios authentication against Active Directory, using Apache as the web server.

mod_authz_ldap is an Apache LDAP authorization module. It can be used to authorize a user based on an LDAP query.

Install mod_authz_ldap.

# yum install mod_authz_ldap

Make sure that the module is loaded in Apache:

# cat /etc/httpd/conf.d/authz_ldap.conf
LoadModule authz_ldap_module modules/mod_authz_ldap.so

To query LDAP, ldapsearch can be used. Install the following package:

# yum install openldap-clients

Active Directory will not allow an LDAP client to operate against it anonymously, so a user DN and password with minimal permissions are required.

For example: CN=Nagios User,CN=Users,DC=hq,DC=CORP,DC=abc,DC=org

The CN attribute corresponds to the “Display Name” of the account in Active Directory.

ldapsearch can be used to query the LDAP server, in this case Active Directory.

In this example, we will look at how to enable access for all members of the ‘Pythian’ group who also have membership in the ‘Nagios Admins’ group.
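The requirement above boils down to a set intersection: a user must appear both under the user base (the Pythian group) and in the group required for access (Nagios Admins). A minimal shell sketch of that logic, using CNs taken from this post's example output (the files and variable names are illustrative, not part of the original article):

```shell
# Hypothetical sketch of the access rule: a login succeeds only when the
# account is found in the Pythian member list AND in the Nagios Admins list.
pythian_members='CN=Joseph Minto
CN=Test All'
nagios_admins='CN=Joseph Minto'
printf '%s\n' "$pythian_members" | sort > /tmp/pythian_members.txt
printf '%s\n' "$nagios_admins" | sort > /tmp/nagios_admins.txt
# comm -12 keeps only the lines common to both sorted lists
allowed=$(comm -12 /tmp/pythian_members.txt /tmp/nagios_admins.txt)
echo "$allowed"    # -> CN=Joseph Minto
```

mod_authz_ldap performs the equivalent check against the directory at login time; the sketch only shows why membership in both groups is what grants access.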

To find the members of the Pythian group, run the following command:

# ldapsearch -x -LLL -D 'CN=Nagios User,CN=Users,DC=hq,DC=CORP,DC=abc,DC=org' -W -H ldap:// -b 'CN=Pythian,OU=Internal Groups,DC=hq,DC=CORP,DC=abc,DC=org'
Enter LDAP Password:
dn: CN=Pythian,OU=Internal Security Groups,DC=hq,DC=CORP,DC=abc,DC=org
objectClass: top
objectClass: group
cn: pythian
description: General Pythian group.
member: CN=Joseph Minto,OU=Service Consultants,OU=Consultants,OU=User Accounts,DC=hq,DC=CORP,DC=abc,DC=org <—————
member: CN=Test All,OU=Service Consultants,OU=Consultants,OU=User Accounts,DC=hq,DC=CORP,DC=abc,DC=org <—————
distinguishedName: CN=pythian,OU=Internal Security Groups,DC=hq,DC=CORP,DC=abc,DC=org
instanceType: 4
whenCreated: 20120720203444.0Z
whenChanged: 20150611152516.0Z
uSNCreated: 11258263
memberOf: CN=OA Admins,OU=Internal Security Groups,DC=hq,DC=CORP,DC=abc,DC=org
uSNChanged: 128023795
name: pythian
objectGUID:: XY68X44xZU6KQckM3gckcw==
sAMAccountName: pythian
sAMAccountType: 268435456
groupType: -2147483646
objectCategory: CN=Group,CN=Schema,CN=Configuration,DC=CORP,DC=abc,DC=org
dSCorePropagationData: 20140718174533.0Z
dSCorePropagationData: 20121012140635.0Z
dSCorePropagationData: 20120823115415.0Z
dSCorePropagationData: 20120723133138.0Z
dSCorePropagationData: 16010714223649.0Z

To find the details of a user account, the following command can be used:

# ldapsearch -x -LLL -D 'CN=Nagios User,CN=Users,DC=hq,DC=CORP,DC=abc,DC=org' -W -H ldap:// -b 'CN=Pythian,OU=Internal Groups,DC=hq,DC=CORP,DC=abc,DC=org' -s sub "sAMAccountName=jminto"
Enter LDAP Password:
dn: CN=Joseph Minto,OU=Service Consultants,OU=Consultants,OU=User Accounts,DC= hq,DC=CORP,DC=abc,DC=org
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: user
cn: Joseph Minto
sn: Minto
c: US
l: Arlington
st: VA
description: 09/30/15 – Consultant – Pythian
postalCode: 22314
telephoneNumber: 1 866 – 798 – 4426
givenName: Joseph
distinguishedName: CN=Joseph Minto,OU=Service Consultants,OU=Consultants,OU=User Accounts,DC=hq,DC=CORP,DC=abc,DC=org
instanceType: 4
whenCreated: 20131203160403.0Z
whenChanged: 20150811045216.0Z
displayName: Joseph Minto
uSNCreated: 62354283
info: sponsored by:
memberOf: CN=Pythian,OU=Internal Security Groups,DC=hq,DC=CORP,DC=abc,DC=org
memberOf: CN=Nagios Admins,OU=Nagios Groups,OU=AppSecurityGroups,DC=hq,DC=CORP,DC=abc,DC=org <————-
uSNChanged: 137182483
co: United States
name: Joseph Minto
objectGUID:: uh9bC/ke6Uap0/dUk9gyLw==
userAccountControl: 512
badPwdCount: 0
codePage: 0
countryCode: 840
badPasswordTime: 130360542953202075
lastLogoff: 0
lastLogon: 130844674893200195
scriptPath: callsl.bat
logonHours:: ////////////////////////////
pwdLastSet: 130305602432591455
primaryGroupID: 513
adminCount: 1
accountExpires: 130881456000000000
logonCount: 116
sAMAccountName: jminto
sAMAccountType: 805306368
objectCategory: CN=Person,CN=Schema,CN=Configuration,DC=CORP,DC=abc,DC=org
dSCorePropagationData: 20150320162428.0Z
dSCorePropagationData: 20140718174545.0Z
dSCorePropagationData: 20131203161019.0Z
dSCorePropagationData: 16010101181632.0Z
lastLogonTimestamp: 130837423368430625

Following are the ldapsearch switches used above:

-x
     Use simple authentication instead of SASL.

-L
     Search results are displayed in LDAP Data Interchange Format, detailed in ldif(5). A single -L restricts the output to LDIFv1; a second -L disables comments; a third -L disables printing of the LDIF version. The default is to use an extended version of LDIF.

-D binddn
     Use the Distinguished Name binddn to bind to the LDAP directory. For SASL binds, the server is expected to ignore this value.

-W
     Prompt for simple authentication. This is used instead of specifying the password on the command line.

-H ldapuri
     Specify URI(s) referring to the LDAP server(s); a list of URIs, separated by whitespace or commas, is expected; only the protocol/host/port fields are allowed. As an exception, if no host/port is specified but a DN is, the DN is used to look up the corresponding host(s) using DNS SRV records, according to RFC 2782. The DN must be a non-empty sequence of AVAs whose attribute type is "dc" (domain component), and must be escaped according to RFC 2396.

-b searchbase
     Use searchbase as the starting point for the search instead of the default.

-s {base|one|sub|children}
     Specify the scope of the search to be one of base, one, sub, or children for a base-object, one-level, subtree, or children search. The default is sub. Note: the children scope requires the LDAPv3 subordinate feature extension.

In the Nagios configuration in Apache, mod_authz_ldap parameters can be used to validate a user the same way we did with ldapsearch:

# cat /etc/httpd/conf.d/nagios.conf
# Last Modified: 11-26-2005
# This file contains examples of entries that need
# to be incorporated into your Apache web server
# configuration file. Customize the paths, etc. as
# needed to fit your system.

ScriptAlias /nagios/cgi-bin/ "/usr/lib64/nagios/cgi-bin/"

<Directory "/usr/lib64/nagios/cgi-bin/">
   Options ExecCGI
   AllowOverride None
   Order allow,deny
   Allow from all
   AuthName "Nagios Access"
   AuthType Basic
   AuthzLDAPMethod ldap
   AuthzLDAPServer ""
   AuthzLDAPBindDN "CN=Nagios User,CN=Users,DC=hq,DC=CORP,DC=abc,DC=org"
   AuthzLDAPBindPassword "typepasswordhere"
   AuthzLDAPUserKey sAMAccountName
   AuthzLDAPUserBase "CN=Pythian,OU=Internal Groups,DC=hq,DC=CORP,DC=abc,DC=org"
   AuthzLDAPUserScope subtree
   AuthzLDAPGroupKey cn
   AuthzLDAPMemberKey member
   AuthzLDAPSetGroupAuth ldapdn
   require group "Nagios Admins"
</Directory>

Alias /nagios "/usr/share/nagios/html"

<Directory "/usr/share/nagios/html">
   Options None
   AllowOverride None
   Order allow,deny
   Allow from all
   AuthName "Nagios Access"
   AuthType Basic
   AuthzLDAPMethod ldap
   AuthzLDAPServer ""
   AuthzLDAPBindDN "CN=Nagios User,CN=Users,DC=hq,DC=CORP,DC=abc,DC=org"
   AuthzLDAPBindPassword "typepasswordhere"
   AuthzLDAPUserKey sAMAccountName
   AuthzLDAPUserBase "CN=Pythian,OU=Internal Groups,DC=hq,DC=CORP,DC=abc,DC=org"
   AuthzLDAPUserScope subtree
   AuthzLDAPGroupKey cn
   AuthzLDAPMemberKey member
   AuthzLDAPSetGroupAuth ldapdn
   require group "WUG Admins"
</Directory>

In the above configuration, mod_authz_ldap uses parameters such as the LDAP server, bind DN, bind password, scope and search base to check whether the supplied user credentials can be found in Active Directory. It also checks whether the user is a member of the ‘Nagios Admins’ group.

Restarting Apache enables Active Directory based authentication for Nagios.


Discover more about our expertise in Infrastructure Management.

Categories: DBA Blogs

Online Resizing of ASM Disks

Wed, 2015-11-04 07:01


The SAN administrator has informed us that they have extended the disks. This is the information I had from our client. The disks were labelled:



The original size of the disks was 300GB and they had been extended to 600GB. These were multipath disks belonging to the diskgroup ARCH, which was being used to store archive logs in ASM. The database was in a 2-node RAC configuration. The server was Red Hat Linux 5.9 (2.6.18-406.el5, 64-bit).

I checked the disks using fdisk (as the root user) and got the following:

fdisk -l /dev/mpath/mpath_compellent_oraarch

Disk /dev/mpath/mpath_compellent_oraarch: 322.1 GB, 322122547200 bytes
255 heads, 63 sectors/track, 39162 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/mpath/mpath_compellent_oraarch doesn't contain a valid partition table

This confirmed that the OS was not aware of the size extension.


Firstly, I wanted to confirm that the correct disks had been extended. So the first place to look is in ASM:

select g.name
     , d.path
     , d.os_mb
     , d.total_mb
from v$asm_diskgroup g
   , v$asm_disk      d
where g.group_number = d.group_number
and g.name = 'ARCH'

NAME       PATH                           OS_MB   TOTAL_MB
---------- ------------------------- ---------- ----------
ARCH       ORCL:ASMDISK_NEW_ARCH03       307200     307200
ARCH       ORCL:ASMDISK_NEW_ARCH01       307200     307200
ARCH       ORCL:ASMDISK_NEW_ARCH02       307200     307200


Now we need to match these names to those provided by the SAN administrator.

Check the directory:

ls -l /dev/oracleasm/disks/ASMDISK_NEW_ARCH*
brw-rw---- 1 oracle dba 253, 30 Oct  6 00:35 /dev/oracleasm/disks/ASMDISK_NEW_ARCH01
brw-rw---- 1 oracle dba 253, 29 Oct  6 00:35 /dev/oracleasm/disks/ASMDISK_NEW_ARCH02
brw-rw---- 1 oracle dba 253, 32 Oct  6 00:35 /dev/oracleasm/disks/ASMDISK_NEW_ARCH03

This gives us the major and minor numbers for the disks: the major number is 253 and the minor numbers are 30, 29 and 32.
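With more than a few disks, pulling the device numbers out programmatically is less error-prone than eyeballing the ls output. A minimal sketch, assuming the exact ls -l field layout shown above (the sample lines are copied from this post's listing; the awk field positions are specific to that layout):

```shell
# Extract "major,minor path" pairs from ls -l output of block devices.
asm_ls='brw-rw---- 1 oracle dba 253, 30 Oct  6 00:35 /dev/oracleasm/disks/ASMDISK_NEW_ARCH01
brw-rw---- 1 oracle dba 253, 29 Oct  6 00:35 /dev/oracleasm/disks/ASMDISK_NEW_ARCH02'
# Field 5 is "<major>," and field 6 is the minor number; $NF is the path.
pairs=$(echo "$asm_ls" | awk '{print $5 $6, $NF}')
echo "$pairs"
# -> 253,30 /dev/oracleasm/disks/ASMDISK_NEW_ARCH01
# -> 253,29 /dev/oracleasm/disks/ASMDISK_NEW_ARCH02
```

Running the same extraction against the /dev/mapper listing lets you join the two sets on the major,minor pair instead of comparing them by eye.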


Then compare these numbers against the devices listed in:

ls -l /dev/mapper/mpath_compellent_oraarch*
brw-rw---- 1 root disk 253, 30 Oct  6 00:34 /dev/mapper/mpath_compellent_oraarch
brw-rw---- 1 root disk 253, 29 Oct  6 00:34 /dev/mapper/mpath_compellent_oraarch02
brw-rw---- 1 root disk 253, 32 Oct  6 00:34 /dev/mapper/mpath_compellent_oraarch03

The numbers match, showing that they are the same devices.


Now we need to find the actual disks that make up the multipath devices.

multipath -l
Output truncated for brevity

mpath_compellent_oraarch03 (36000d310009aa700000000000000002b) dm-32 COMPELNT,Compellent Vol
[size=300G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 13:0:0:3  sdfm       130:128 [active][undef]
 \_ 11:0:0:3  sdgd       131:144 [active][undef]

mpath_compellent_oraarch02 (36000d310009aa700000000000000002a) dm-29 COMPELNT,Compellent Vol
[size=300G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 12:0:0:2  sdfi       130:64  [active][undef]
 \_ 14:0:0:2  sdfk       130:96  [active][undef]

mpath_compellent_oraarch (36000d310009aa7000000000000000026) dm-30 COMPELNT,Compellent Vol
[size=300G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 13:0:0:1  sdfj       130:80  [active][undef]
 \_ 11:0:0:1  sdgc       131:128 [active][undef]

From here we can see the disks:


We need to find this information on the other node as well, as the underlying disk names will very likely be different on the other server.


Now, for each disk, we need to rescan the device to register the new size. To do this, run the following for each disk on both nodes:

echo 1 > /sys/block/sdfm/device/rescan

Then we can check each disk to make sure it has successfully been extended:

fdisk -l /dev/sdfm

Disk /dev/sdfm: 644.2 GB, 644245094400 bytes
255 heads, 63 sectors/track, 78325 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdfm doesn't contain a valid partition table

Looks good – once done on all nodes we can then extend the multipath devices for each device name on both nodes:

multipathd -k'resize map mpath_compellent_oraarch'

Then we can check the multipath device disk size:

fdisk -l /dev/mpath/mpath_compellent_oraarch

Disk /dev/mpath/mpath_compellent_oraarch: 644.2 GB, 644245094400 bytes
255 heads, 63 sectors/track, 78325 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/mpath/mpath_compellent_oraarch doesn't contain a valid partition table

Looks good – once done on both nodes we can then resize the ASM disks within ASM:

SQL> select g.name
     , d.path
     , d.os_mb
     , d.total_mb
from v$asm_diskgroup g
   , v$asm_disk      d
where g.group_number = d.group_number
and g.name = 'ARCH'

NAME       PATH                           OS_MB   TOTAL_MB
---------- ------------------------- ---------- ----------
ARCH       ORCL:ASMDISK_NEW_ARCH03       614400     307200
ARCH       ORCL:ASMDISK_NEW_ARCH01       614400     307200
ARCH       ORCL:ASMDISK_NEW_ARCH02       614400     307200

SQL&gt; alter diskgroup ARCH resize all;

Diskgroup altered.

SQL> select g.name
     , d.path
     , d.os_mb
     , d.total_mb
from v$asm_diskgroup g
   , v$asm_disk      d
where g.group_number = d.group_number
and g.name = 'ARCH'

NAME       PATH                           OS_MB   TOTAL_MB
---------- ------------------------- ---------- ----------
ARCH       ORCL:ASMDISK_NEW_ARCH03       614400     614400
ARCH       ORCL:ASMDISK_NEW_ARCH01       614400     614400
ARCH       ORCL:ASMDISK_NEW_ARCH02       614400     614400

The disks and diskgroup were successfully resized.


Discover more about our expertise in Database Management.

Categories: DBA Blogs

Log Buffer #447: A Carnival of the Vanities for DBAs

Mon, 2015-11-02 09:34


This Log Buffer Edition covers the weekly blog posts of Oracle, SQL Server and MySQL.

Oracle:
  • An Index or Disaster, You Choose (It’s The End Of The World As We Know It).
  • SQL Monitoring in Oracle Database 12c.
  • RMAN Full Backup vs. Level 0 Incremental.
  • Auto optimizer stats after CTAS or direct loads in #Oracle 12c.
  • How to move OEM12c management agent to new location.

SQL Server:

  • Automate SQL Server Log Monitoring.
  • 10 New Features Worth Exploring in SQL Server 2016.
  • The ABCs of Database Creation.
  • Top 10 Most Common Database Scripts.
  • In-Memory OLTP Table Checkpoint Processes Performance Comparison.

MySQL:
  • The Client Library, Part 1: The API, the Whole API and Nothing but the API.
  • Performance of Connection Routing plugin in MySQL Router 2.0.
  • MariaDB 10.0.22 now available.
  • Semi-synchronous Replication Performance in MySQL 5.7.
  • MySQL and Trojan.Chikdos.A.


Learn more about Pythian’s expertise in Oracle, SQL Server & MySQL.

Categories: DBA Blogs

Does your business need to become data-driven?

Sun, 2015-11-01 18:52

Find out how the experts answered at Pythian’s Velocity of Innovation event in NYC

This summer I had the pleasure of moderating a panel discussion that brought some of IT’s most innovative thought leaders together with some of North America’s top CIOs. This is the second in our series of posts that outline the discussions that took place at our Velocity of Innovation event this summer in New York. Our panel of experts consisted of: Paul Vallé, Pythian’s founder and CEO; Gene Leganza, vice-president, principal analyst serving enterprise architecture professionals at Forrester Research; and Otto Toth, CTO at Huffington Post. Event attendees included IT leaders from across a range of industries who supplied the questions that formed the discussion.

This series of blog posts focuses on topics covered in the New York Velocity of Innovation discussion this past summer. This post concentrates on a discussion between Gene and Paul about the importance of data as a driver of success. The question was: “Do we think that every organization must become data-driven, or is this just for customer-facing marketing organizations?”

Here’s just a sample of the discussion:
Paul: Gene is one of the world’s leading experts on that exact subject. I’m just dying to hear what he’s going to say about this.

Gene: I’m a believer. It’s a really interesting space. But the question is how to get there, and the culture that needs to exist. I’ve had a lot of discussions with vendors doing interesting things with data. It’s funny: if I ask what their software technology does, how it’s being implemented, and the success their customers are having, somehow they always end up saying, “Well, but that’s the technology; the culture is a lot slower to change.”
Somehow the culture needs to change in order to implement a lot of these technologies and make organizations really data-driven. It is not something that’s going to change overnight. It’s largely why, I think, organizations are now embracing the notion of a chief data officer role. When I first did surveys on this maybe two years ago, four or five percent of organizations said they had or were planning on hiring a chief data officer in the next year. Now, it’s up to 50 to 60 percent depending on the industry and region. People don’t always do what they say they’re going to do in surveys. But I think what’s behind these responses is an understanding that someone has to lead the charge, not only on the technology and analytic strategy side, but on the cultural side — making the move to change the organization.

There’s so much data out there, and increasingly so, that there’s probably data to support any decision you want to make relating to your business. If you’re not using data to make the decision, you’re just going to be behind the curve. For a while, you can say, “Oh, we have really smart people who have been in the industry forever. They go by their gut and I really trust that, and that’s really cool.” That will be good a little while longer, but sooner or later as your competition starts getting better at using data to understand what to do and get predictive about what to do, you’re just going to be behind the curve. It takes a while to get there so I think people need to get started now.

When you’re making business decisions and having business conversations, which way the market is going, what people are buying, what the competition is buying, and what your customers’ friends are buying are all things you can now find out. Making assumptions about them is really unfounded; there’s just too much data. It’s really a race to see who can be better than the next guy at finding the relevant information and getting the answers out of the data. That’s what it’s coming down to: making it easier and easier for the business people to do that without having to ask IT.

The old school here is if you need to analyze something, you need to ask IT to provision something in the data warehouse so we can do analytics. Now, you need to play with the data for a while before you know what you really want to look at. The ability to monkey with data from multiple sources to explore the answers to questions has become easy to do. If you don’t do it, your competition will, and you’ll be behind the curve. I think that’s really just the bottom line.
But you also have to be an evangelist and a cheerleader: being opportunistic and explicitly leading the charge to get people to think differently about it, so that you can eventually change processes. It eventually gets done.

I recently talked with someone who felt extremely behind the curve; only recently did they start doing things like treating data design as integral to the design phase of their system development. They had always, of course, reviewed the design of their applications before implementing them, but data was always an afterthought. Applications create data, so what?

Now that they’re trying to get better at data, they had to explicitly make sure a data person was in there designing the data, looking at the impact of what the application is doing on the landscape or information architecture they have, and reviewing that before going off and actually writing code.

That was new for them. It’s important for them that they instantiate that in the process, because it starts to win over the hearts and minds of the developers doing this sort of work. In terms of getting all the various players to get it and do their part, using data instead of flying by the seat of their pants, it takes a lot of socializing and conversation, but also baking data-driven practices into processes.

It’s not something you can do upfront and just expect a process to take over, because then it looks like bureaucracy and people hate you for slowing down their lives. You have to be very careful about introducing ideas into the organization, and about trying to bake them into its fabric before people have seen the actual value and actually bought into the concept.

Paul: I have something to chime in with; it’s actually in contradiction to Gene, but I think it will prompt an interesting discussion right along the lines of your question.

I find that one of the nicest ways to look at a business at a holistic level is to put a lens in front of it. I’m just going to use FreshDirect as an example because I’m in New York City and I see their trucks all the time. You all live here; you might order their groceries. It’s useful to think of FreshDirect and look at the business through the lens of a grocer: you’re going to talk about the thickness of the steaks and the freshness of the vegetables, and that’s great. It’s a great way to look at that business. You can also look at that business and think of it as a technology business. To what degree is FreshDirect a technology business that happens to deliver groceries, the way Amazon is a technology business that happens to deliver books?

That’s a useful way to look at the business. To me, this whole data idea and the chief data scientist idea is the new lens and that’s what’s powerful about it. It lets you look at FreshDirect and say, “Hey, to what degree is Fresh Direct a data business?”

As for Huffington Post, it’s a media business. You need to be able to look through the media lens to look at the quality of the content. You need to look at your reach and your audience, but to what degree is the Huffington Post a technology business? That’s a new lens. To what degree is the Huffington Post a data business?

That’s where sometimes, not always and not for every business, but sometimes you realize, “Holy cow! I am a data business.” Sometimes you realize that the most game-changing investments that you can make in your business involve adopting insights that emerge from the data. If you look at Fresh Direct and you say, “Well, Fresh Direct is a logistics business.” It is. Maybe when you look at Fresh Direct through that lens, you have a ‘holy cow’ moment and you realize that the investment in your supply chain management or your logistics allows you to compete better. That’s where your most efficient investments are. That’s okay.
What’s powerful about data is that we never used to talk about the power of our data as a lens through which to look at our businesses holistically and then to quantify our investments. To rephrase your question, I think it’s a really good question, but I wanted to just disagree with Gene just a little bit. I don’t know whether the crowd will lie, but in general, you should listen to Gene.

I think one of the interesting twists on your question is: which of us, which of you, are looking at your businesses through the data lens and seeing opportunities for investment, feeling that pressure in your belly, in your gut, that you need to claim that opportunity?

Attendee # 1: Data can make an enormous difference to your business. Deciding which data to pay attention to, which to collect, where to source data, and what insights you’re looking for is almost more of an art than a science at this point.
That’s why the whole data scientist movement exists; it’s the most sought-after position and a huge growth space, because there isn’t any prescriptive method. You have to bring judgement to the table. Choosing the data that’s important is probably the harder part right now; if you choose the wrong data, the answers won’t make sense.

Attendee # 2: I think you need to invest not just in good talent, but also in good knowledge and optimization especially for a legacy business model.

Gene: That’s a really big challenge. We just got some results from a survey that asked what the biggest obstacle was to getting more data-driven. There were a bazillion answers that all have some weight, and at the top of the list is hiring the right data talent. Obviously, that’s a big deal for everybody, and it’s not just data scientists. It’s always interesting to look at how you define what a data scientist is. The best definition of a data scientist is somebody who would be totally impossible to find, because they would bring together a PhD-level education and a hard-science level of math and algorithm knowledge, combined with the skills of a software engineer, plus the domain knowledge we were just talking about. It’s not just numbers.

This is just a sample of our discussion on data at the New York event this summer. More of these sessions are planned for the coming weeks. To request an invitation to a Velocity of Innovation event in a city near you, visit

Velocity of Innovation is a series of thought-leadership events for senior IT management hosted by Pythian. The format is a panel discussion in which Pythian leads conversations around today’s disruptive technologies. Topics range from big data and cloud to advanced analytics and DevOps. These events are by invitation only. If you’re interested in attending one of our upcoming Velocity of Innovation events, contact

Categories: DBA Blogs

RMAN Full Backup vs. Level 0 Incremental

Thu, 2015-10-29 11:23

Perhaps you’ve wondered about this. What is the difference between taking an RMAN full backup and a level 0 incremental backup?

If you read the documentation, the following explanation can be found under Incremental Backups:

The only difference between a level 0 incremental backup and a full backup is that a full backup is never included in an incremental strategy. Thus, an incremental level 0 backup is a full backup that happens to be the parent of incremental backups whose level is greater than 0.

Logically that is true, but for many folks that is not enough. Are they different physically, and if so exactly how are they different physically?

If you google for an explanation you will find a number of ideas, most based on what the documentation says.

Here are some examples of what I found:

  • The only difference between level 0 and full is that a level 0 is marked in the RMAN catalog as a level 0.
  • Level 0 incremental and full backups are physically identical.

If you are an experienced IT professional, then by now you have learned not to assume that articles found on the web are always correct. It seemed like it might be an interesting exercise to find out what the differences might be.

This called for some experimentation; here is the general procedure followed:

  • Put the database in mount state so there is no activity in the database.
  • Create two RMAN ‘full’ backups.
  • Create two RMAN incremental level 0 backups.
  • Open the database and perform some DML and/or DDL.
  • Create an incremental level 1 backup.
  • Do more DML/DDL activity.
  • Create another incremental level 1 backup.

This will provide some backup files to examine. You may be wondering why two of each backup type were made. The reason is to filter out changes in the backup file that are not relevant to the difference in backup types. That should become clearer with the examples.
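One way to carry out that filtering, once the backups exist, is a byte-level comparison: cmp -l prints one line per differing byte, so bytes that differ between two backups of the same type can be dismissed as run-specific noise (timestamps, backup-set identifiers), while any additional differences between a full and a level 0 backup would point at a real physical difference. A small illustration with throwaway files, not actual backup pieces:

```shell
# Two same-sized files that differ in exactly two bytes (positions 3 and 6).
printf 'ABCDEF' > /tmp/bk_a
printf 'ABXDEY' > /tmp/bk_b
# cmp -l prints "<offset> <octal old> <octal new>" for every differing byte.
diff_bytes=$(cmp -l /tmp/bk_a /tmp/bk_b | wc -l)
echo "$diff_bytes"    # -> 2
```

Applied to same-sized backup pieces, the offsets reported by cmp -l are what you would feed to a block dump to see what actually changed.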

The test environment is as follows:

  • Test database: Oracle CDB on Linux 6.5.
  • RMAN Catalog database: Oracle on Linux 5.5.

The following backup commands were used to create two full and two level 0 backups:


rman target / catalog rman12c/rman12c@oravm <<-EOF
run {
allocate channel ch1 device type disk format '/mnt/oracle-backups/ora12c/full-01/%U';
backup database tag 'full-01' plus archivelog tag 'full-01';
}
EOF

rman target / catalog rman12c/rman12c@oravm <<-EOF
run {
allocate channel ch1 device type disk format '/mnt/oracle-backups/ora12c/full-02/%U';
backup database tag 'full-02' plus archivelog tag 'full-02';
}
EOF

rman target / catalog rman12c/rman12c@oravm <<-EOF
run {
allocate channel ch1 device type disk format '/mnt/oracle-backups/ora12c/lvl0-01/%U';
backup incremental level 0 database tag 'lvl0-01' plus archivelog tag 'lvl0-01';
}
EOF

rman target / catalog rman12c/rman12c@oravm <<-EOF
run {
allocate channel ch1 device type disk format '/mnt/oracle-backups/ora12c/lvl0-02/%U';
backup incremental level 0 database tag 'lvl0-02' plus archivelog tag 'lvl0-02';
}
EOF

Let’s take a look at the size of the files:

[root@lestrade ora12c]# ls -l full-01/*
-rw-r----- 1 root root    6878208 Oct 27 15:15 full-01/0sqkp2dl_1_1
-rw-r----- 1 root root  761749504 Oct 27 15:16 full-01/0tqkp2dn_1_1
-rw-r----- 1 root root 1696112640 Oct 27 15:17 full-01/0uqkp2eq_1_1
-rw-r----- 1 root root  735043584 Oct 27 15:17 full-01/0vqkp2g7_1_1
-rw-r----- 1 root root  623837184 Oct 27 15:17 full-01/10qkp2h0_1_1

[root@lestrade ora12c]# ls -l full-02/*
-rw-r----- 1 root root    6878208 Oct 27 15:18 full-02/12qkp2hm_1_1
-rw-r----- 1 root root  761749504 Oct 27 15:18 full-02/13qkp2ho_1_1
-rw-r----- 1 root root 1696112640 Oct 27 15:19 full-02/14qkp2ir_1_1
-rw-r----- 1 root root  735043584 Oct 27 15:19 full-02/15qkp2k8_1_1
-rw-r----- 1 root root  623837184 Oct 27 15:20 full-02/16qkp2l2_1_1

[root@lestrade ora12c]# ls -l lvl0-01/*
-rw-r----- 1 root root    6878208 Oct 27 15:20 lvl0-01/18qkp2lm_1_1
-rw-r----- 1 root root  761749504 Oct 27 15:21 lvl0-01/19qkp2lo_1_1
-rw-r----- 1 root root 1696112640 Oct 27 15:21 lvl0-01/1aqkp2mr_1_1
-rw-r----- 1 root root  735043584 Oct 27 15:22 lvl0-01/1bqkp2o8_1_1
-rw-r----- 1 root root  623837184 Oct 27 15:22 lvl0-01/1cqkp2p2_1_1

[root@lestrade ora12c]# ls -l lvl0-02/*
-rw-r----- 1 root root    6878208 Oct 27 15:23 lvl0-02/1eqkp2q4_1_1
-rw-r----- 1 root root  761749504 Oct 27 15:23 lvl0-02/1fqkp2q6_1_1
-rw-r----- 1 root root 1696112640 Oct 27 15:24 lvl0-02/1gqkp2r9_1_1
-rw-r----- 1 root root  735043584 Oct 27 15:25 lvl0-02/1hqkp2sm_1_1
-rw-r----- 1 root root  623837184 Oct 27 15:25 lvl0-02/1iqkp2tf_1_1

The number and sizes of the files appear the same regardless of the backup type, lending some credence to the idea that they may be physically identical. So now let’s dump some of the files to determine the difference.

For purposes of this experiment we are going to examine the backup files that contain datafile 1 from the database, the SYSTEM tablespace datafile.

We can find these backup files with the RMAN list command:

RMAN> list backup of datafile 1;

List of Backup Sets

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ -------------------
1259    Full    1.58G      DISK        00:00:39     2015-10-27 15:15:13
        BP Key: 1267   Status: AVAILABLE  Compressed: NO  Tag: FULL-01
        Piece Name: /mnt/oracle-backups/ora12c/full-01/0uqkp2eq_1_1
  List of Datafiles in backup set 1259
  File LV Type Ckp SCN    Ckp Time            Name
  ---- -- ---- ---------- ------------------- ----
  1       Full 50957811   2015-10-27 18:08:29 +DATA/JS03/DATAFILE/system.258.856612879

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ -------------------
1341    Full    1.58G      DISK        00:00:39     2015-10-27 15:17:22
        BP Key: 1349   Status: AVAILABLE  Compressed: NO  Tag: FULL-02
        Piece Name: /mnt/oracle-backups/ora12c/full-02/14qkp2ir_1_1
  List of Datafiles in backup set 1341
  File LV Type Ckp SCN    Ckp Time            Name
  ---- -- ---- ---------- ------------------- ----
  1       Full 50957811   2015-10-27 18:08:29 +DATA/JS03/DATAFILE/system.258.856612879

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ -------------------
1433    Incr 0  1.58G      DISK        00:00:39     2015-10-27 15:19:30
        BP Key: 1441   Status: AVAILABLE  Compressed: NO  Tag: LVL0-01
        Piece Name: /mnt/oracle-backups/ora12c/lvl0-01/1aqkp2mr_1_1
  List of Datafiles in backup set 1433
  File LV Type Ckp SCN    Ckp Time            Name
  ---- -- ---- ---------- ------------------- ----
  1    0  Incr 50957811   2015-10-27 18:08:29 +DATA/JS03/DATAFILE/system.258.856612879

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ -------------------
1531    Incr 0  1.58G      DISK        00:00:42     2015-10-27 15:21:55
        BP Key: 1539   Status: AVAILABLE  Compressed: NO  Tag: LVL0-02
        Piece Name: /mnt/oracle-backups/ora12c/lvl0-02/1gqkp2r9_1_1
  List of Datafiles in backup set 1531
  File LV Type Ckp SCN    Ckp Time            Name
  ---- -- ---- ---------- ------------------- ----
  1    0  Incr 50957811   2015-10-27 18:08:29 +DATA/JS03/DATAFILE/system.258.856612879

Now dump the backup pieces to hex files. Just the first 100M is being dumped from each, resulting in ~300M text files.

mkdir -p $DUMPDIR

for f in full-01/0uqkp2eq_1_1 full-02/14qkp2ir_1_1 lvl0-01/1aqkp2mr_1_1 lvl0-02/1gqkp2r9_1_1
do
        tag=$(dirname $f)
        filename=$(basename $f)
        echo $filename
        # dump first 100M
        dd if=$f bs=512 count=204800 2>/dev/null | hexdump -C > $DUMPDIR/${tag}_${filename}.txt
done

[root@lestrade hexdump]# ls -l full* lvl0*
-rw-r--r-- 1 root root 305807309 Oct 27 16:11 full-01_0uqkp2eq_1_1.txt
-rw-r--r-- 1 root root 305807309 Oct 27 16:11 full-02_14qkp2ir_1_1.txt
-rw-r--r-- 1 root root 305807309 Oct 27 16:11 lvl0-01_1aqkp2mr_1_1.txt
-rw-r--r-- 1 root root 305807309 Oct 27 16:11 lvl0-02_1gqkp2r9_1_1.txt

We can compare the dump of the two full backup files. As there were no changes to the database during this time (you may recall the database is in MOUNT state) the only difference in the files should be changes to metadata.

[root@lestrade hexdump]# diff full-01_0uqkp2eq_1_1.txt full-02_14qkp2ir_1_1.txt
< 00002010  46 7d 00 00 00 02 10 0c  00 02 10 0c c8 29 05 07  |F}...........)..|
< 00002020  4a 53 30 33 00 00 00 00  de 19 01 00 00 00 00 00  |JS03............|
> 00002010  dd 7e 00 00 00 02 10 0c  00 02 10 0c c8 29 05 07  |.~...........)..|
> 00002020  4a 53 30 33 00 00 00 00  fd 19 01 00 00 00 00 00  |JS03............|
< 00002060  01 00 00 00 da 89 4c 35  1e 00 00 00 04 00 00 00  |......L5........|
< 00002070  01 00 00 00 46 55 4c 4c  2d 30 31 00 00 00 00 00  |....FULL-01.....|
> 00002060  01 00 00 00 5b 8a 4c 35  24 00 00 00 04 00 00 00  |....[.L5$.......|
> 00002070  01 00 00 00 46 55 4c 4c  2d 30 32 00 00 00 00 00  |....FULL-02.....|

These two full backup files differ only by these four lines. These differences are in the 17th OS block of the files and can be identified as metadata by the tags FULL-0[12] and the database name JS03. There are other differences that do not appear as printable characters. I did spend a little time trying to decode their meanings; they may be backup piece numbers, backup piece keys, backup set keys, and similar. Determining the meaning of these didn't really seem necessary to understand the difference between full and incremental level 0 backups, so I did not continue. If you do know what these values represent, please leave a comment.

Doing a diff on the level 0 hex dump files shows changes in the same lines:

[root@lestrade hexdump]# diff lvl0-01_1aqkp2mr_1_1.txt lvl0-02_1gqkp2r9_1_1.txt
< 00002010  bc 02 00 00 00 02 10 0c  00 02 10 0c c8 29 05 07  |.............)..|
< 00002020  4a 53 30 33 00 00 00 00  1a 1a 01 00 00 00 00 00  |JS03............|
> 00002010  34 03 00 00 00 02 10 0c  00 02 10 0c c8 29 05 07  |4............)..|
> 00002020  4a 53 30 33 00 00 00 00  39 1a 01 00 00 00 00 00  |JS03....9.......|
< 00002060  01 00 00 00 db 8a 4c 35  2a 00 00 00 04 00 00 00  |......L5*.......|
< 00002070  01 00 00 00 4c 56 4c 30  2d 30 31 00 00 00 00 00  |....LVL0-01.....|
> 00002060  01 00 00 00 69 8b 4c 35  30 00 00 00 04 00 00 00  |....i.L50.......|
> 00002070  01 00 00 00 4c 56 4c 30  2d 30 32 00 00 00 00 00  |....LVL0-02.....|

Now it is time to compare a full backup file to a level 0 backup file:

[root@lestrade hexdump]# diff full-01_0uqkp2eq_1_1.txt lvl0-01_1aqkp2mr_1_1.txt
< 00002010  46 7d 00 00 00 02 10 0c  00 02 10 0c c8 29 05 07  |F}...........)..|
< 00002020  4a 53 30 33 00 00 00 00  de 19 01 00 00 00 00 00  |JS03............|
> 00002010  bc 02 00 00 00 02 10 0c  00 02 10 0c c8 29 05 07  |.............)..|
> 00002020  4a 53 30 33 00 00 00 00  1a 1a 01 00 00 00 00 00  |JS03............|
< 00002060  01 00 00 00 da 89 4c 35  1e 00 00 00 04 00 00 00  |......L5........|
< 00002070  01 00 00 00 46 55 4c 4c  2d 30 31 00 00 00 00 00  |....FULL-01.....|
> 00002060  01 00 00 00 db 8a 4c 35  2a 00 00 00 04 00 00 00  |......L5*.......|
> 00002070  01 00 00 00 4c 56 4c 30  2d 30 31 00 00 00 00 00  |....LVL0-01.....|
< 00002090  00 00 00 00 04 04 00 00  5a 08 00 00 c3 2c 04 00  |........Z....,..|
> 00002090  00 00 00 00 05 04 00 00  5a 08 00 00 c3 2c 04 00  |........Z....,..|

Notice the new difference at line 00002090? This is the line that contains the difference between a full and a level 0 backup file.

Before continuing it is time to create a couple of level 1 backups:

SQL> create user dilbert identified by phb;

User created.

SQL> alter user dilbert quota 10m on users;

User altered.

SQL> create table dilbert.t1 as select systimestamp t1 from dual;

Table created.

rman target / catalog rman12c/xxx@oravm <<-EOF
run {
allocate channel ch1 device type disk format '/mnt/oracle-backups/ora12c/lvl1-01/%U';
backup incremental level 1 database tag 'lvl1-01' plus archivelog tag 'lvl1-01';
}
EOF

Create another table in the dilbert schema:

SQL> create table dilbert.t2 as select * from dba_users;

Table created.

rman target / catalog rman12c/xxx@oravm <<-EOF
run {
allocate channel ch1 device type disk format '/mnt/oracle-backups/ora12c/lvl1-03/%U';
backup incremental level 1 database tag 'lvl1-03' plus archivelog tag 'lvl1-03';
}
EOF

Note:  lvl1-02 backup was discarded as I forgot to make db changes before making it.

If we were to go on and compare all of the backup files that contain datafile 1:

[root@lestrade hexdump]# grep ^00002090 *.txt
full-01_0uqkp2eq_1_1.txt:00002090  00 00 00 00 04 04 00 00  5a 08 00 00 c3 2c 04 00  |........Z....,..|
full-02_14qkp2ir_1_1.txt:00002090  00 00 00 00 04 04 00 00  5a 08 00 00 c3 2c 04 00  |........Z....,..|
lvl0-01_1aqkp2mr_1_1.txt:00002090  00 00 00 00 05 04 00 00  5a 08 00 00 c3 2c 04 00  |........Z....,..|
lvl0-02_1gqkp2r9_1_1.txt:00002090  00 00 00 00 05 04 00 00  5a 08 00 00 c3 2c 04 00  |........Z....,..|
lvl1-01_1mqkp7jk_1_1.txt:00002090  00 00 00 00 05 00 00 00  5a 08 00 00 c3 2c 04 00  |........Z....,..|
lvl1-03_22qks49l_1_1.txt:00002090  00 00 00 00 05 00 00 00  5a 08 00 00 c3 2c 04 00  |........Z....,..|

A bitmask emerges that is peculiar to each backup type:

Starting at offset 0x2090
     00 01 02 03 04 05 06 07
Full 00 00 00 00 04 04 00 00  
LVL0 00 00 00 00 05 04 00 00
LVL1 00 00 00 00 05 00 00 00

What might these mean?
Here’s a possible explanation:

0x2094 = 0x04 = FULL
0x2094 = 0x05 = Incremental

0x2095 = 0x04 = level 0 if the previous byte is 0x05
0x2095 = 0x00 = level 1 if the previous byte is 0x05

Of course this doesn’t explain the purpose of 0x04 at 0x2095 if the backup is ‘FULL’.

To fully understand these flags would require dumping and examining many more backup files, not just of datafiles but of archive logs, controlfiles, etc.
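For fun, the observed flag bytes can be wrapped in a tiny classifier. Treat this as a sketch only: the offsets and values come solely from the dumps above, they are not documented by Oracle, and they may well differ across versions and platforms.

```python
# Classify an RMAN backup piece as FULL, LVL0 or LVL1 using only the two
# flag bytes observed at offsets 0x2094/0x2095 in this experiment.
# These offsets are undocumented and inferred from a single test; do not
# rely on them for anything beyond curiosity.

FLAG_OFFSET = 0x2094

def classify_flags(b0, b1):
    """Map the two observed flag bytes to a backup type."""
    if b0 == 0x04:
        return "FULL"
    if b0 == 0x05:
        return "LVL0" if b1 == 0x04 else "LVL1"
    return "UNKNOWN"

def classify_backup_piece(path):
    """Read the flag bytes from a backup piece file and classify it."""
    with open(path, "rb") as f:
        f.seek(FLAG_OFFSET)
        b0, b1 = f.read(2)  # Python 3 unpacks bytes into ints
    return classify_flags(b0, b1)
```

Run against the six backup pieces above, this would be expected to agree with the grep output: FULL for the full-* pieces, LVL0 for the lvl0-* pieces, and LVL1 for the lvl1-* pieces.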

One thing is clear from this experiment; there are differences in the metadata contained in the file headers when comparing a full backup to a level 0 incremental backup. It is not simply a matter of the one backup being cataloged differently by RMAN; the backup pieces can be identified as either full or incremental level backups based on the metadata in the file header.

So what good is all this effort for a little bit of knowledge?

How about the next time you are asked about the difference between a level 0 and a full backup? Now you can say something other than quoting the documentation or some article on the internet (one that lacks anything to back up the assertions made).

If the question is posed to you during an interview, this information becomes even more valuable.

Now another question comes to mind: Can backup files made with a full backup be modified to work as incremental level 0 backups?

That would be an interesting experiment. This is not something one would normally do, but it is not too hard to imagine a scenario where only a full backup, an incremental level 1 backup and the following archive logs were available to recover a database. This would be a desperate measure and a last resort, but sometimes last resorts are necessary.

Do you think this can be done? If so please give it a try, but only test it on an expendable test system please.

Please be sure to leave a comment if you try this experiment; it would be quite interesting to discuss.


Discover more about our expertise in Oracle.

Categories: DBA Blogs

Pythian More Innovative Than GE?

Tue, 2015-10-27 08:41


GE is known as an innovative company and, according to Forbes, is one of the world’s most valuable brands. Late last summer they made headlines with the announcement that they were retiring performance management reviews.

No big deal? Think again. GE built its reputation on Jack Welch’s rigid performance management programs, cutting the bottom 10 percent of underperformers each year.

First, I applaud GE for the bold move. Any company-wide change in an organization the size of GE’s would be challenging. Second, what took you so long? In 2011, Pythian went against the crowd and ditched the prevailing zeitgeist by implementing a performance feedback program.

At the time, with approximately one hundred employees and a brand new HR team we were just beginning to establish our HR programs. Like many small companies, we did not have a structured performance management program. We were growing rapidly and identified a need to provide employees with useful feedback and career discussions.

Ratings, rankings and bell curves, “Oh My!” We didn’t even consider them. We designed a program that removed standard performance management elements like numerical rankings. The program was created to facilitate formal, organized feedback discussions, in a comfortable environment, between an employee and their manager. The idea was to base the discussion on each team member’s career aspirations and journey. The first steps of the career journey begin during new hire orientation. Every six months thereafter, we schedule time to sit down and have focused career feedback discussions. During these discussions, goals are established, recent successes reviewed, progress on goals updated, and challenges chronicled along with suggestions for overcoming them. Furthermore, career aspirations and plans for professional development are discussed and established.

The feedback program is constantly evolving and improving to meet the changing needs and scale of the company. Of course we listen to employee feedback about the program and implement changes after reviewing the suggestions. Change can be difficult for people. Initially, employees more accustomed to traditional performance management were hesitant, but they quickly responded to the easy and relaxed format of our program.

Regular feedback is key. We encourage two-way feedback: up and/or down, across the organization, in real time, all the time. We are always working to improve our programs, processes and ideas, e.g. “upping our game” as a company. We believe it’s important to build a culture of constant feedback. A culture of two-way feedback built on trust and transparency is a team effort by all members of the Pythian team.

During orientation I enjoy encouraging and empowering all new employees with permission to ask their leaders for feedback at any time. I encourage them not to wait to share what’s going well, and to disclose where they need extra support or further direction. In my own team meetings I ask what I could be doing more of and less of, and how I can be a better communicator and leader. I create a safe environment for the team to provide feedback so we can collectively improve.

It will be interesting to see if GE’s announcement encourages more companies to re-evaluate their approach to Performance Management systems and encourage more effective dialogue and feedback discussions with their employees.

Categories: DBA Blogs

SQL Server Row Level Security

Tue, 2015-10-27 08:27


Row Level Security (RLS) has been implemented in SQL Server 2016 for both on-premise and v12 of Azure instances.

The problem this solves is: a company with multiple applications accessing sensitive data in one or more tables.

How do you ensure the data being read or written is only the data that login is authorized to see? In the past, this has been accomplished with a complicated series of views or functions, and there was no guarantee a bug or malicious user wouldn’t be able to bypass those measures. With Row Level Security, the filtering is enforced regardless of what privileges you have (including sysadmin) or how you try to access the data.

How it Works

Row Level Security has two options: You can either FILTER the rows or BLOCK the operation entirely. The BLOCK functionality is not yet implemented in CTP 2.4, but the FILTER logic works like a charm.

The steps are very simple:
1 – Figure out what you’re going to associate with your users and data. You will need to create some link between your data and a login’s or user’s properties: something that will allow the engine to say This Row is OK for This User.

2 – Create a Function defining the relationship between users and the data.

3 – Create a Security Policy for the function and table(s). You can use the same policy on multiple tables or views.

Once the Security Policy has been created, every query or DML operation on the tables or views you’re filtering will automatically have the function applied to the WHERE or HAVING clause. You can see the filter working by reviewing the execution plan as well. SQL Server will generate the Plan Hash value with the filtering logic in place. This allows Plan Re-Use with Row Level Security, and it’s a big improvement over Oracle’s implementation which doesn’t do this (as of Oracle 10g, the last time I worked with it) :-). See the bottom of this post for an example of the same query with RLS turned on & off.

What is particularly nice about these policies is that the data is filtered regardless of the user’s privileges. A sysadmin or other superuser who disables the policy is just an Audit log review away from having to explain what they were doing.
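Conceptually, the engine behaves as if every statement against the table had the predicate function AND-ed into it. A toy model in Python (purely illustrative; the data and names are made up, and this is not how the engine is actually implemented):

```python
# Toy model of a row-level security filter predicate: every read of the
# table is implicitly filtered by a per-row predicate that consults the
# current user, mirroring the fn_FindBusinessCreditCard idea.
# All data and names here are hypothetical.

def make_rls_filter(predicate):
    """Wrap table access so each row must pass the predicate for the user."""
    def secured_select(rows, current_user):
        return [row for row in rows if predicate(row, current_user)]
    return secured_select

cards = [
    {"CreditCardID": 1, "UserName": "Business1"},
    {"CreditCardID": 2, "UserName": "Business2"},
    {"CreditCardID": 3, "UserName": "Business1"},
]

select_cards = make_rls_filter(lambda row, user: row["UserName"] == user)
mine = select_cards(cards, "Business1")  # rows with CreditCardID 1 and 3
```

The point of the model is that the caller never states the filter; it is bound to the table once and applied to every access, which is exactly what the security policy below does in T-SQL.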

Row Level Security Walk Through

This is an example of setting up an RLS system for the Credit Card data in the AdventureWorks database. After this is completed, only users associated with a Business Entity in the Person.Person table will be able to see or update any credit card information, and the data they can touch will be limited to just their business.

Step 1: Add user_name column to Person.Person table

In this example, I’m associating the user_name() function’s value for each login with the BusinessEntityID. Of course, you can use any value you want, as long as you can access it from a SELECT statement in a Schema-Bound function. This means many system tables are off-limits.

USE AdventureWorks
GO

ALTER TABLE person.person
ADD UserName nvarchar(128) NULL;
GO

-- Associate some person.person rows with a login too.
UPDATE person.person
SET UserName = 'Business1'
WHERE BusinessEntityID IN (301, 303, 305);

Step 2: Create Users to Test

I’m just creating a login named Business1 to demonstrate this. Note that the user has db_owner in AdventureWorks

USE [master]
GO
-- Create the login first (password here is just a placeholder)
CREATE LOGIN [business1] WITH PASSWORD = N'ChangeMe1!'
GO
USE [AdventureWorks]
GO
CREATE USER [business1] FOR LOGIN [business1]
GO
ALTER ROLE [db_owner] ADD MEMBER [business1]
GO

Step 3: Create Function to Filter Data

This function finds all credit cards for the user_name() running the query. Any values not returned by this function will be inaccessible to this user.

CREATE FUNCTION [Sales].[fn_FindBusinessCreditCard] (@CreditCardID INT)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
SELECT 1 AS result
FROM person.person p INNER JOIN
     sales.PersonCreditCard pcc ON p.BusinessEntityID = pcc.BusinessEntityID INNER JOIN
     sales.CreditCard cc ON pcc.CreditCardID = cc.CreditCardID
WHERE cc.CreditCardID = @CreditCardID AND
      p.UserName = user_name();

Step 4: Create a Security Policy

This creates a security policy on the Sales.CreditCard table.

CREATE SECURITY POLICY sales.RestrictCreditCardToBusinessEntity
ADD FILTER PREDICATE sales.fn_FindBusinessCreditCard(CreditCardID)
ON Sales.CreditCard

Step 5: Test Away

For all of the following examples, you should be logged in as the Business1 user, who can only see 3 credit cards. In reality, there are 19,118 rows in that table.

-- Will return three records
SELECT pcc.*, cc.*
FROM Sales.PersonCreditCard pcc INNER JOIN
     Sales.CreditCard cc ON cc.CreditCardID = pcc.CreditCardID;

-- Will only update three records
UPDATE Sales.CreditCard
SET ExpYear = '2020';

These are the execution plans for the above query with Row Level Security turned on and off:

Turned On: (and missing an index…)

Execution Plan With Security Turned On.

Turned Off:

Execution Plan With Security Turned Off.


Discover more about our expertise in SQL Server.

Categories: DBA Blogs

A Cassandra Consistency Use Case

Tue, 2015-10-27 08:07


I recently completed a project where I worked with a company using Cassandra to keep metadata about objects stored in an object store. The application keeps track of individual objects as rows within a partition based on user id. In an effort to save space, there is also a mechanism to track duplicate references to the objects in another table. Object writes take place as background activity, and the time it takes to complete those writes is invisible to the application’s end users. The time it takes to retrieve an object, though, is very visible to the end user. The keyspace was defined with a network topology strategy using two data centers (actual data centers, about 50 ms apart) with replication factor 3 in each.

Initially the application was set up to use consistency ONE for both writes and reads. This seemed to be working okay until we started failure testing, at which point objects would come up missing due to the delay in pushing hinted handoffs from one node to another. A simple solution was to make all writes and reads LOCAL_QUORUM. Doing so did resolve pretty much all of the testing errors, but at a much increased latency, about three times longer than with consistency ONE, on both reads and writes. Even so, the latencies were deemed acceptable since they were still well under the anticipated network latencies outside of the data centers, which is what the users would be seeing.
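The latency difference is easy to reason about: ONE waits for a single replica, while LOCAL_QUORUM must wait for a majority of the replicas in the local data center. A quick sketch of the arithmetic, using the standard Cassandra quorum formula (floor(RF/2) + 1):

```python
# Replicas that must acknowledge a request at each consistency level,
# for this post's topology: replication factor 3 in each of two data centers.

def quorum(replication_factor):
    """Cassandra quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1

rf_per_dc = 3
acks_one = 1                              # ONE: any single replica
acks_local_quorum = quorum(rf_per_dc)     # 2 of the 3 local replicas
acks_full_quorum = quorum(rf_per_dc * 2)  # 4 of the 6 replicas overall
```

Having to hear back from two replicas instead of one (plus waiting on the slowest of the two) is where the extra read and write latency comes from.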

Could we have done better than that though?

The writes are a background activity not visible to the end user, so the increased write latency is reasonable there. The read latency is visible to the user. There is an option which guarantees finding the stored object references while still keeping the latency to a minimum. This is what I propose: the default read consistency is set back to ONE, and most of the time a read to Cassandra will find the object reference, as was clear in the initial testing. But if a read returns no object reference, a second read is issued using LOCAL_QUORUM. This way most (more than 99%) of all reads are satisfied with the much lower latency of consistency ONE, only occasionally needing the second read. This can be extended further to a full QUORUM read if the LOCAL_QUORUM read fails.

It is important to note that this approach only works if there are no row versions, i.e. rows only ever exist or do not exist. It would not work if a row may have different versions over time, as it might if the row were updated rather than just inserted and later deleted. It is also important to note that it’s possible to find a deleted row this way. For this use case, these qualifications are not issues.
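The proposed escalating read can be sketched generically in Python. The fetch callables stand in for driver queries issued at ONE, LOCAL_QUORUM and then QUORUM; all names here are illustrative, not a real driver API:

```python
# Escalating-consistency read: try the cheapest consistency level first
# and retry at stronger levels only when nothing is found. Because rows
# in this use case are only inserted and later deleted (never updated),
# any hit can be trusted; escalation is needed only on a miss.

def read_with_fallback(fetchers):
    """fetchers: callables ordered cheapest consistency first.
    Returns the first non-empty result, or None if every level misses."""
    for fetch in fetchers:
        result = fetch()
        if result:
            return result
    return None
```

In practice more than 99% of reads would be satisfied by the first (ONE) fetch, so the LOCAL_QUORUM and QUORUM round trips are paid only on the rare miss.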


Discover more about our expertise in Cassandra.

Categories: DBA Blogs

Running SQL Server Jobs Against an Azure SQL Database

Tue, 2015-10-27 07:50


I recently had a client ask how to run SQL Server jobs against an Azure SQL Database. Because SQL DB doesn’t have a SQL Agent like most other versions of SQL Server, it isn’t always obvious how to implement them. Fortunately, we have several options in Azure and within a “normal” instance.

Options to Run Jobs

The first three options require a non-Azure version of SQL Server to be running and to have connectivity to the Azure SQL DB.

  1. Linked Server
  2. Maintenance Plan w/new connection
  3. Powershell Script
  4. Azure Services
Elastic DB Jobs

As mentioned by Warner Chaves in the comments, and currently (Nov 2015) in preview, the new Elastic Database Jobs might also work well.

Linked Server

To create a linked server to an Azure DB, you just need to get the ODBC connection string from the Azure portal and use it as the “Provider String” when defining the LS.




Maintenance Plan w/New Connection

Simply create your maintenance plan with all the required logic for your job, then click the “Manage Connections” menu at the top of the screen and define the new connection to be used for this plan.




Powershell Script

You can use the Powershell Invoke-SqlCmd functionality and run it as a Powershell job step.

Invoke-Sqlcmd -Query "SELECT GETDATE() AS TimeOfQuery;" -ServerInstance "AzureInstance" -Username "UserName" -Password "Password"

Azure Services

These change, it seems like, almost daily, so I’ve listed some of the common ways this is currently done.

  1. The Azure Scheduler used in conjunction with mobile services.
  2. Azure Automation


Discover more about our expertise in Cloud and SQL Server.

Categories: DBA Blogs

Log Buffer #446: A Carnival of the Vanities for DBAs

Fri, 2015-10-23 15:25

This Log Buffer Edition covers some more Oracle, SQL Server and MySQL blog posts from across the world.


  • Should I monitor the MGMTDB database?
  • Connecting MAF Applications to Mobile Cloud Service
  • JRE 6, 7, and 8 now certified with EBS 11i, 12.1, and 12.2
  • The database writer copies data blocks from the buffer cache onto disk. The db_writer_processes initialization parameter determines how many processes will do this task. Its default value is 1 or cpu_count / 8, whichever is greater. I found an Oracle 9 database on a Tru64 server with cpu_count set to 1
  • How To Install Latest Verisign G5 Root Certificates

SQL Server:

  • Dynamic Pivot, Passing Temp Table Variables to Dynamic SQL
  • Oracle offers a results cache in the database (from 11.2 onwards) and in PL/SQL (again, from 11.2 onwards) which can greatly reduce execution time of repeated statements when insert/update/delete activity is not heavy.
  • Maintaining a grouped running MAX (or MIN)
  • Harnessing SQL Server Metadata- Disabling and Rebuilding Indexes
  • Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator


  • ClusterControl Tips & Tricks: Updating your MySQL Configuration
  • Become a MySQL DBA blog series – Optimizer Hints for faster query execution
  • Loading Arbitary XML documents into MySQL tables with p_load_xml
  • Use MySQL and Perl to automatically find, follow and unfollow twitter users
  • Great Advice on Monitoring Amazon RDS For MySQL From Datadog

Learn more about Pythian’s expertise in Oracle SQL Server & MySQL.

Categories: DBA Blogs

SQL On The Edge #4 – SQL 2016 Stretch Database

Wed, 2015-10-21 09:49


The Microsoft SQL Server team continues to build very interesting integrations between the full SQL Server box product we all know and the Azure cloud platform. Stretch database is another integration that will be included as part of SQL Server 2016, and it will help customers solve these issues:

– Running out of space in your own data center?
– Have long archival requirements for your data?
– Users still query data from many years ago on occasion?
– You don’t want to do code changes to deal with these scenarios?


Stretch database attacks those problems by allowing us to “stretch” tables from on-premises (or full VM) SQL Servers into a table living in Azure SQL Database. Your users and applications will not be aware of where the data is coming from; they just query, and SQL Server handles the behind-the-scenes magic of going either to the local data files or to the cloud to get the records.

So if you’re running out of space, you can stretch to a Standard tier db and get 250GB instantly, 500GB on Premium, or even 1TB if you get a P11! Of course you will have to pay for the Azure SQL database, but considering it’s a fully managed service, in some scenarios this can easily offset the cost of the on-premises storage plus managing all the archival, if your system is busy and large enough.

The feature also works transparently with the backup and restore operations. When you back up your SQL Server database, it will contain the metadata for the Stretch database, and if you have the proper credentials, you can re-enable the connection to the Azure SQL database on RESTORE. If you just want to RESTORE as, let’s say, a DEV copy, then you can either not reconnect the stretch database at all or reconnect it to some other DEV Azure copy.

As is usually the case, this feature does come with some limitations. For example, it can’t be used on replicated tables, In-Memory tables, or tables using the CLR-based data types (hierarchy, XML, spatial types, etc.). The main limitation right now is that UPDATE and DELETE are not supported, so it’s very much for ‘append-only’ data. However, for many scenarios these limitations can be more than acceptable. You can see the full list here.

To see how we enable this feature and how it works, let’s jump to the video. Enjoy!


Discover more about our expertise in SQL Server.

Categories: DBA Blogs

Gotta Love English

Wed, 2015-10-21 07:28


I do love to write, and have always enjoyed the idiosyncrasies of this wonderful language. One of my favourite features of English is how I struggle with the following items almost daily:

  1. I am doing a few things to alleviate the need to worry about something. Do I write There is a handful of things … or There are a handful of things …? I recently asked a handful of colleagues their feelings on the above and the results are in … 5-5. I guess the jury is still out on that one, but I still maintain there is a handful of questions as yet unanswered.
  2. I hope my decision does not affect (or is that effect) others’ opinion of my writing skills. Even though classically I use affect as a verb and effect as a noun … hold on … wait a sec … maybe it’s the other way around. I will still struggle and every time hope when I use one of them wrong, the effect of that mistake will not affect my feelings for the wonders of this fine language.
  3. It’s time for me to take its advice. I am not sure if it stands on its own with respect to ownership. Most words in this insane language use the ‘s to show possession but alas not all. Needless to say, it’s not obvious and I pause for a moment of thought every time I use this word, with or without the s on the end and/or the ‘s.
  4. That pesky ing ending to so many words. I saw him walking into a store. Is this a gerund or a present participle? I am not sure if anyone cares, maybe except me :). Google says that a gerund “is a noun made from a verb by adding ‘-ing.’” Thus, I believe walking, in the context of the above statement, is a gerund, being a verb behaving like a noun. Pencil is a noun and, when someone loses one, we say Have you seen his pencil not Have you seen him pencil. If this is the case, why would one say I saw him walking rather than I saw his walking? A noun is a noun.

To sum up this musing, suffice it to say that a handful of questions has its effect on my studying more about this quirky language we all love.

Categories: DBA Blogs

Errors in a Pluggable Database?

Wed, 2015-10-21 06:23


There might be a situation where executing some DDL in a pluggable database causes the following error:

ORA-65040: operation not allowed from within a pluggable database

This error can occur if a tablespace being dropped from within a PDB is a former default tablespace that still contains some system objects. Even system objects cannot be moved with simple alter statements from within PDBs.

So in order to move these objects from within PDBs, you should use the procedure dbms_pdb.exec_as_oracle_script, which is undocumented so far.

For example:

exec dbms_pdb.exec_as_oracle_script('alter table <owner>.<table_name> move tablespace <tablespace name>');

From My Oracle Support, Doc ID 1943303.1 lists:

—   This procedure enables execution of certain restricted statements (most DDLs) on metadata-linked objects, from within a PDB.


Discover more about our expertise in the world of Oracle.

Categories: DBA Blogs

The 2nd Annual PASS Summit Bloggers Meetup (2015)

Tue, 2015-10-20 17:20

I’m excited to announce the second annual PASS Summit Bloggers Meetup! We began this last year and it was cool but this year will be even cooler!

What: PASS Summit Bloggers Meetup 2015
When: Thursday, October 29th, 5:30pm
Where: Sports Restaurant & Bar, 140 4th Ave North, Suite 130, Seattle, WA 98109.
How: Please comment with “COUNT ME IN” if you’re coming — we need to know attendance numbers.

We’re excited to meet old friends, and make new ones in Seattle this year. Pythian will sponsor munchies and drinks as usual. There will be a networking contest with some cool prizes, plus you will get your very own Love Your Data t-shirt (at least the first 50 people). Meetups are a great way to make new friends in the community, so come on out — all bloggers are welcome!

If you are planning to attend, please comment here with the phrase “COUNT ME IN”. This will help us ensure we have the attendance numbers right. Please provide your blog URL with your comment — it’s a Bloggers Meetup after all! Make sure you comment here if you are attending so that we have enough room, food, and drinks.

See the photos from last year’s meetup, courtesy of Pat Wright.

SQL PASS 2014 Bloggers Meetup photo

The location is perfect to get ready for the Community Appreciation Party — a few minutes walk from EMP Museum! Snacks and drinks before the big event and mingling with fellow bloggers. What can be better?

Of course, do not forget to blog and tweet about this year’s bloggers meetup using #Summit15 #sqlpass. See you there!


Discover more about our expertise in SQL Server.

Categories: DBA Blogs

Oracle OpenWorld 2015 – Bloggers Meetup

Tue, 2015-10-20 16:51

Many of you are coming to San Francisco next week for Oracle OpenWorld 2015, and many of you have already booked time on your calendars on Wednesday evening before the appreciation event. You are right — the annual Oracle Bloggers Meetup, one of your favorite events of the OpenWorld, is happening at the usual place and time!

What: Oracle Bloggers Meetup 2015.

When: Wed, 28-Oct-2015, 5:30pm.

Where: Main Dining Room, Jillian’s Billiards @ Metreon, 101 Fourth Street, San Francisco, CA 94103 (street view). Please comment with “COUNT ME IN” if you’re coming — we need to know the attendance numbers.

As usual, Oracle Technology Network and Pythian sponsor the venue, drinks and cool fun social stuff. This year we are dropping a cool app and resurrecting traditions — you know what it means, and if not, come and learn. All blogger community participants are invited — self-qualify is what that means ;).

As usual, vintage t-shirts, ties, or bandanas from previous meetups will make you look cool — feel free to wear them.

For those of you who don’t know the history: The Bloggers Meetup during Oracle OpenWorld was started by Mark Rittman and continued by Eddie Awad, and then I picked up the flag in 2009. This year we have Oracle Technology Network taking more leadership on the organization of the event in addition to just being a “corporate sponsor”.

The meetups have been a great success for making new friends and catching up with the old, so let’s keep them this way! To give you an idea, here are the photos from the OOW08 Bloggers Meetup (courtesy of Eddie Awad) and OOW09 meetup blog post update from myself, and a super cool video by a good blogging friend, Bjorn Roest from OOW13.

While the initial meetings were mostly targeted at Oracle database folks, people from many Oracle technologies — Oracle database, MySQL, Apps, Sun technologies, Java and more — join in the fun. All bloggers are welcome. Last year we crossed 150 attendees, and I expect this year we may set a new record.

If you are planning to attend, please comment here with the phrase “COUNT ME IN”. This will help us ensure we have the attendance numbers right. Please provide your blog URL (or whatever you consider a replacement of that — I’ll leave it to your interpretation) with your comment — it’s a Bloggers Meetup after all! Please do make sure you comment here if you are attending so that we have enough room, food, and (most importantly) drinks.

Of course, do not forget to blog, tweet, linkedin, G+, instagram, email and just talk about this year’s bloggers meetup. See you there — it will be fun!


Discover more about our expertise in the world of Oracle.

Categories: DBA Blogs

Log Buffer #445: A Carnival of the Vanities for DBAs

Fri, 2015-10-16 13:45

This Log Buffer edition works its way through some of the coolest blog posts from Oracle, SQL Server and MySQL of the past week.


Oracle:

  • What if I’m a developer or QA person using a copy of a production database to do my work? What if my copy of production is now out of date and I want to refresh my data with the data as it is on production?
  • Direct path and buffered reads again.
  • Copy Data Management for Oracle Database with EMC AppSync and XtremIO.
  • Little things worth knowing: automatic generation of extended statistics in 12c.
  • Any time you execute more than one SQL statement in a PL/SQL procedure, the results of executing that procedure may not be self-consistent unless you have explicitly locked your session SCN to a fixed value!

SQL Server:

  • An Introduction to the SQLCMD Mode in SSMS (SQL Spackle).
  • Describes the idle connection resiliency feature, which allows ODBC and SqlClient data access applications to maintain their connections to SQL Server 2014 or an Azure SQL Database.
  • How to Upgrade SQL Server.
  • Integration Services Logging Levels in SQL Server 2016.
  • Understanding the OVER clause.


MySQL:

  • Proxy Protocol and Percona XtraDB Cluster: A Quick Guide.
  • Do not run those commands with MariaDB GTIDs.
  • What If You Can’t Trace End-to-End?
  • Storing UUID Values in MySQL Tables.
  • How to Deploy and Manage MaxScale using ClusterControl.


Learn more about Pythian’s expertise in Oracle, SQL Server & MySQL.

Categories: DBA Blogs

Log Buffer #444: A Carnival of the Vanities for DBAs

Fri, 2015-10-09 13:53

This Log Buffer Edition covers some blog posts of Oracle, SQL Server and MySQL from this past week.


Oracle:

  • Oracle Utilities Application Framework V4. includes a new help engine and changes to the organization of help.
  • Very simple oracle package for HTTPS and HTTP.
  • Oracle ZFS Storage Appliance surpasses $1B in Revenue (Oct 6th).
  • Tim spotted a problem with the PDB Logging Clause.
  • How to Pass Arguments to OS Shell Script from Oracle Database.

SQL Server:

  • How efficient is your covered index?
  • Storing Passwords in a Secure Way in a SQL Server Database.
  • There is plenty that is novel and perhaps foreign to a new R user, but it’s no reason to abandon your hard-earned SQL skills.
  • SSIS Design Pattern – Staging Fixed Width Flat Files.
  • Shorten Audits, Tighten Security with Minion Enterprise.


MySQL:

  • Confusion and problems with lost+found directory in MySQL/Galera cluster configuration.
  • Simplifying Docker Interactions with BASH Aliases.
  • Getting started MySQL Group Replication on Ubuntu with MySQL Sandbox.
  • MySQL lost “AUTO_INCREMENT” after a long time.
  • Using Apache Spark and MySQL for Data Analysis.


Learn more about Pythian’s expertise in Oracle, SQL Server & MySQL.

Categories: DBA Blogs

My Sales Journey: #6

Fri, 2015-10-09 13:43


Last week, I delved into the mind of our executive tier to pinpoint how to make outreach appealing to that group. Although most companies want to speak to these decision makers, it is an uphill battle to get the ear of a VP/Exec. Sometimes we must reach the people who are talking to their bosses each day: the managers in the trenches, living with the problems we want to solve.
So today, let’s step inside the mind of a Manager:

Managers maintain the integrity and continual functioning of mission-critical operations. They evaluate and make recommendations regarding new technologies, tools and techniques. They keep a head count and know where the gaps exist in their teams. Managers are responsible for their team’s performance while also being expected to do more with less.

They are also the people who will most likely be willing to listen if you have a solution to their problems. Tell them how you or your company has solved similar problems with others. Give them proof.

Assure them of their importance. Make seeking your help a positive experience for them and their team. Managers need to know that you are not taking their team’s jobs or their own. As a managed services firm, we like our clients to know that we can augment and work as an intimate extension of their in-house team.

If you should be so lucky that a forward-thinking manager wants to introduce you to their executive team, give them credit. There are tons of people who are too afraid to do things differently and resistant to change, so when someone takes that leap for you, do not leave them behind.

As a sales professional, you can work either up the food chain or down. Know that catching the big fish is hard but rewarding, and most times you will need to get creative with your hook. Isn’t that what sales is all about, in a nutshell? Personally, I do not have a preference, but I am also new to the game.

I’d love to hear your thoughts about your sales process: who do you talk to first? What gets you a higher response rate? Do you go after the big fish or work your way up the ladder? Leave me a message!

Categories: DBA Blogs

When it comes to Black Friday’s online rush, even we Brits don’t want to queue

Fri, 2015-10-09 08:10

Black Friday is a pretty new concept in the UK. Traditionally an American post-Thanksgiving shopping day, it has recently gained popularity in the UK. It really took off last year and defined the start of what I heard being called the Golden Quarter – from November through to the end of January – when retailers will make 40-50 percent of their annual sales.

With anything new, people take a while to find their feet and will try out new things. I hope that one of the technologies trialled widely on e-commerce sites during this period last year isn’t used again this year: the queuing system.

The idea was sound: instead of having too many customers hitting a site and causing poor performance for everyone, a number were put into a virtual waiting room and allowed onto the site in order. This meant that, once in, everyone could shop quickly without any risk of a crash from overload.

But in practice, it seemed to customers as if the site wasn’t working. The press reported that sites had “crashed” anyway, and the user experience was awful. You might queue to get into Harrods on the first day of a sale, or you might queue online to get a ticket to the last One Direction concert, but with plenty of choices available, users simply hopped elsewhere.

To me, the most frustrating thing was that this seemed like a lazy solution. It is not difficult to build your e-commerce site for the peaks you should be expecting.

Why not ensure you can spin up extra capacity with a cloud provider, if needed? And why not take the time to configure the database structure? This would mean that, for those few days when 1,000 people a minute want to see the shoes you have in stock, they all can. Easily and without delay.

Building an e-commerce site – or indeed any application – to be scalable to handle peak traffic should be a high priority for your developers, DBAs, and sys admins. With the range of available cloud technologies, database technologies, and automation tools there is no excuse.

Let’s hope that for Black Friday 2015, for once the UK is queue free!

By the way, if you need a hand with any of the areas discussed above, please do get in touch. As the majority of Pythian’s retail clients are in the US, we’ve had many years of practice, and we ensure our clients’ peak demands are handled smoothly.

Categories: DBA Blogs