
Pythian Group

Official Pythian Blog - Love Your Data

Log Buffer #355, A Carnival of the Vanities for DBAs

Fri, 2014-01-17 07:55

It’s no secret anymore that social networks like Facebook are fast losing their hyper-active charm. But ask anybody about blogging, and it’s still there, with a future as solid as rock. Innovation, creation, and quality content with public engagement are what make it last. This Log Buffer acknowledges that fact for the database bloggers.
Oracle:

The Critical Patch Update (CPU) for January 2014 was released on January 14, 2014. Oracle strongly recommends applying the patches as soon as possible.

Inject filesystem I/O latency. Oracle Linux 5 and 6

What to expect at the Oracle Value Chain Summit

Get a Closer Look at Oracle Database Appliance

EBS Release 12 Certified with Mac OS X 10.9 with Safari 7 and JRE 7

SQL Server:

Calculating Duration Using DATETIME Start and End Dates (SQL Spackle)

Learn SQL Server in Arabic

There is nothing mysterious about SQL Server backups. They are essential, however you use your databases.

Stairway to Advanced T-SQL Level 1: The TOP Clause

Free eBook: Troubleshooting SQL Server: A Guide for the Accidental DBA

MySQL:

Multi-master, multi-region MySQL deployment in Amazon AWS

Docker is, in fact, a wrapper around LXC, and it is fun to use. Docker’s philosophy is to virtualize single applications using LXC.

Analyzing WordPress MySQL queries with Query Analytics

MySQL Cluster: The Latest Developments in Management, Free webinar

MySQL Proxy lives – 0.8.4 alpha released

Categories: DBA Blogs

My first week at Pythian

Mon, 2014-01-13 09:13

“You are lucky,” the taxi driver told me when he picked me up at the Ottawa airport. “It began snowing right after your flight landed.” I personally like snow — just like the Chinese saying, “heavy snow brings a heavy harvest”. Yes, I am lucky to be able to work at Pythian, where the world’s top 5% of talented data lovers come together. I am lucky to begin my first week at Pythian with harsh weather and hard work, which, I have always believed, eventually leads to a heavy harvest.

I felt that Pythian is like a warm home when Dag met me early on Monday morning to bring me to the office. After that, Jennifer and Sarah introduced me to every person we met, and they all showed their kindness to me. I felt that Pythian is transparent when the HR, Marketing, Sales, and SD departments shared as much information as possible with us new employees. We can even see everyone’s calendar and work postings. Yes, I mean it, EVERYONE! Which is really good, because being transparent means being confident, reliable, and responsible. I also felt inspired when Paul, Pythian’s founder, shared his story and his thoughts about the company.

Besides all the good things above, you need to work hard as well, since there is a lot of information and many skills waiting for you. No worries: you will have an experienced SDC to help you, as we have Shawna. Here are 3 tips to help you, as a DBA, survive and enjoy your first week at Pythian.

1. Taking those BORG sessions is important, but it is only part of your tasks. The onboarding checklist your SDC sent to you before your arrival is really important. Make sure you walk through all the items in the list and try your best to understand them.

2. Start using Support Track to book your time as early as possible. Make sure you book your time under the right CR numbers listed in the email that comes with the checklist.

3. Your TTL may send you an email asking you to do some Bootcamp tasks. Spend some time on them, because they cover the technical side of your job. Just follow the instructions; it is not that hard.

Go relax and enjoy your first week at Pythian!

By the way, if you are lucky enough to have a seat near Michael S. Abbey, he will share his experiences with you, calm you down, and ease your stress, which is really helpful.

Last but not least, I would like to say thank you to Christina, Danil, Amanda, Manoj and Steve, who made the wonderful decision to bring me in!

Categories: DBA Blogs

Deduping 100 Gigs Worth of Files? Gimme 5 Minutes…

Mon, 2014-01-13 08:57

Shortly before the Holidays, I became aware of the Dallas/Fort Worth Perl Mongers Winter Hackaton Contest. The challenge was simple enough: Here’s a directory structure filled with 100G worth of files, go and find all the duplicate files in there, as fast as you can.

You know me: I can’t resist a contest. But since I knew I didn’t have heaps of time to sink into it, and because — let’s face it — the Perl community is despairingly brim-full of smart people who always beat me to the finish line, I decided to compete on slightly different terms. My main objective would be to get maximum leverage from modern Perl tools, so that the effort to get a working solution would be minimal. As for the performance of the solution, I decided that I’d be satisfied, and that the case for modernity and laziness would be made, if it was at least in the ballpark of the other entries.

And so it began…

The Application Interface

Writing code to interact with the users is booooooring.

The hard part, the fun part, the challenging part, is to come up with the function do_it( $foo, $bar ). Once this is done, all that remains is to turn the handle and give the user a way to set $foo and $bar. And to validate the incoming values for $foo and $bar. And to document what $foo and $bar are. Blergh. It’s dotting the i’s and crossing the t’s. It’s writing the police report after that blood-pumping drug cartel take-down.

Fortunately, there are a few modules that will take that tedium off your hands. One of my favorites is MooseX-App, and for this project I used its single-command variation, MooseX::App::Simple. Its use is pretty straightforward. The main module of the application is a Moose class:


package Deduper;

use strict;
use warnings;

use MooseX::App::Simple;

Attributes of the class that you want to be accessible as options of the script are defined using the option keyword. Same story for positional parameters, with the parameter keyword. Other attributes stay untouched and won’t be visible to the user.


parameter root_dir => (
    is => 'ro',
    required => 1,
    documentation => 'path to dedupe',
);

option hash_size => (
    isa => 'Int',
    is => 'ro',
    default => '1024',
    trigger => sub {
        my $self = shift;
        Deduper::File->hash_size( $self->hash_size );
    },
    documentation => 'size of the file hash',
);

option stats => (
    isa => 'Bool',
    is => 'ro',
    documentation => 'report statistics',
);

option max_files => (
    traits => [ 'Counter' ],
    isa => 'Int',
    is => 'ro',
    predicate => 'has_max_files',
    default => 0,
    documentation => 'max number of files to scan (for testing)',
    handles => {
        dec_files_to_scan => 'dec',
    },
);

has start_time => (
    is => 'ro',
    isa => 'Int',
    default => sub { 0 + time },
);

Last required touch for the class: a run() method that will be invoked when the app is run:


sub run {
    my $self = shift;

    say join "\t", map { $_->path } @$_ for $self->all_dupes;
}

With that, your app is in working condition. Command-line parsing, argument validation, --help text, it’s all taken care of for you:


$ cat ./dedup.pl

use Deduper; 
Deduper->new_with_options->run;

$ ./dedup.pl
Required parameter 'root_dir' missing
usage:
    dedup.pl [long options...]
    dedup.pl --help

parameters:
    root_dir  path to dedupe [Required]

options:
    --hash_size           size of the file hash [Default:"1024"; Integer]
    --help -h --usage -?  Prints this usage information. [Flag]
    --max_files           max number of files to scan (for testing) [Default:
                          "0"; Integer]
    --stats               report statistics [Flag]

Traversing Directory Structures

Traversing directory structures isn’t terribly hard. But there are a few things like symlinks that you have to watch for. What we care about, really, is to have each file of the structure handed to us on a silver platter. Let’s have somebody else deal with the underlying menial work.

Here, this somebody else is Path::Iterator::Rule. With it, visiting all files of the directory structure boils down to creating an attribute


has file_iterator => (
    is => 'ro',
    lazy => 1,
    default => sub {
        my $self = shift;
        return Path::Iterator::Rule->new->file->iter_fast( 
            $self->root_dir, {
                follow_symlinks => 0,
                sorted          => 0,
        });
    },
);

and, well, using it


while( my $file = $self->file_iterator->() ) {
    # do stuff with $file...
}
A Couple Of No-Brainish Optimizations

Everybody loves optimizations that require no more effort than pressing a ‘turbo’ button. On this project, two modules providing that kind of boost were just begging to be used.

The first is MooseX::XSAccessor, which invests the accessors of the class with super-speedy XS powers. Alas, that module does not speed up lazily-defined attributes, which I used in spades. But since using it only requires dropping a


use MooseX::XSAccessor;

at the top of the file, it’s not like I have much to lose anyway.

The second module is MooseX::ClassAttribute. As you probably realized from the previous code snippets, my code creates an object per file. For 100G worth of files, that turns out to be lots of objects. So having configuration attributes that all end up with the same value over and over again for each object would be quite wasteful. In that case, using an attribute whose value is shared by all objects of the class makes much more sense. And, again, using that module is dead simple:


package Deduper::File;

use Moose;
use MooseX::ClassAttribute;

class_has hash_size => (
    isa => 'Int',
    is => 'rw',
    default => 1024,
);

sub foo {
    my $self = shift;

    # hash_size can be used as a regular attribute
    my $hs = $self->hash_size;
    ...
}

The cherry on top of everything is that those class attributes are totally transparent to the rest of the code. If you ever find that you want per-object values after all, change ‘class_has’ to ‘has’, and you’re done.

Keeping Different Functionalities Nicely Segregated

A bonus that using a Moose-based class brought to the project was keeping different functionalities logically apart via method modifiers. For example, the run() method of an app typically juggles the different options requested:


sub run {
    my $self = shift;

    my $n = $self->max_files || -1;

    until ( $self->finished ) {
        print $self->next_dupe;
        if ( $n > -1 ) {
            $n--;
            $self->finished(1) if $n == 0;
        }
    }
    $self->print_stats if $self->stats;
}

Instead, one can encapsulate those different behaviors in separate functions


sub check_max_files {
    my $self = shift;

    $self->dec_files_to_scan;
    $self->finished(1) if $self->files_to_scan == 0;
}

sub print_stats {
    # print a lot of stuff
}

sub run {
    my $self = shift;

    print $self->next_dupe until $self->finished;
}

and stitch in whatever is needed at creation time:


sub BUILD {
    my $self = shift;

    $self->meta->make_mutable;

    $self->meta->add_after_method_modifier(
        next_dupe => \&check_max_files,
    ) if $self->max_files;

    $self->meta->add_after_method_modifier( run => \&print_stats )
        if $self->stats;

    $self->meta->make_immutable;
}

If managed properly, this makes each method much smaller and much more single-minded. As a nice side effect, the final application object will only contain the code that it requires. If the option --max_files isn’t passed, the algorithm won’t have an additional ‘if’ statement to deal with on each iteration. It’s not much, but it will make the default case just a tad faster. I must say, however, that this is not a free lunch: if the option is passed, the running time will be slower than if a simple ‘if’ had been used. But that trade-off can make sense, as it does here, where I’m ready to accept a slight penalty during my test runs, but really want the tires screeching for the main event.

Finally, The Core Algorithm

With all the boilerplate stuff dealt with, we can focus on the core problem. After some experimenting, I ended up with a recipe that isn’t exactly ground-breaking, but seems to be efficient. The encountered files are first classified by file size. For files of the same size, we compare the first X bytes of the file (where ‘X’ is fairly small). If they share that beginning, we compare their last X bytes (in case we’re dealing with file types sharing the same preamble). Then, ultimately, we compare full-file digests (using Digest::xxHash, which is pretty damn fast). Everything is computed only if required, and then cached so that we do the work only once.

Translated into code, it looks pretty much like



# in class Deduper

sub next_dupe {
    my $self = shift;

    return if $self->finished;

    while( my $file = $self->file_iterator->() ) {
        $file = Deduper::File->new( path => $file );
        my $orig = $self->is_dupe($file) or next;
        return $orig => $file;
    }

    $self->finished(1);
    return;
}

sub is_dupe {
    my( $self, $file ) = @_;

    return $_ for $self->find_orig($file);

    # not a dupe? enter it in the registry
    $self->add_file($file);

    return;
}

sub find_orig {
    my( $self, $file ) = @_;

    # do we have any file of the same size?
    my $candidates = $self->files->{$file->size}
        or return;

    my @c;

    # only have a sub-hash if we have more than one
    # file, so as not to compute the 'hash' (beginning of the file) 
    # needlessly
    if( ref $candidates eq 'Deduper::File' ) {
        return if $candidates->hash ne $file->hash;
        @c = ( $candidates );
    }
    else {
        @c = @{ $candidates->{$file->hash} || return };
    }

    # first check if any share the same inode
    my $inode = $file->inode;
    for ( @c ) {
        return $_ if $_->inode == $inode;
    }

    # then check if dupes
    for ( @c ) {
        return $_ if $_->is_dupe($file);
    }

    return;
}

sub add_file {
    my( $self, $file ) = @_;

    if( my $ref = $self->files->{$file->size} ) {
        if ( ref $ref  eq 'Deduper::File' ) {
            $ref = $self->files->{$file->size} = { $ref->hash => [ $ref ] };
        }
        push @{$ref->{$file->hash}}, $file;
    }
    else {
        # nothing yet, just put the sucker
        $self->files->{$file->size} = $file;
    }

}

# and then in Deduper::File

sub is_dupe {
    my( $self, $other ) = @_;

    # if we are here, it's assumed the sizes are the same
    # and the beginning hashes are the same

    # different hashes?
    return $self->end_hash eq $other->end_hash
        && $self->digest eq $other->digest;
}
And I Give You The Pudding, Fully Cooked

The application, as submitted to the contest, can be found on GitHub.

“And how did it fare, performance-wise?”, you ask? If the contest results are to be trusted, it processed the 100G monster in a little less than 4 minutes, which, I think, ain’t too shabby.

Especially considering that, y’know, it turned out to be the best time of all submitted entries. :-)

Categories: DBA Blogs

Meet us at NRF14 booth #2233

Fri, 2014-01-10 10:18
Categories: DBA Blogs

SQL Server 2012: Installing and configuring Distributed Replay

Fri, 2014-01-10 09:46

Today we are going to discuss one of the powerful new features of SQL Server 2012. The feature is called Distributed Replay, and it can be used to assess the impact of changes, upgrades, or SQL Server tuning by replaying a trace captured from a production SQL Server environment against a test environment.
What are the benefits?

Unlike SQL Server Profiler, Distributed Replay can run the recorded trace against as many as 16 Distributed Replay Clients, all of them controlled by one Distributed Replay Controller. The Distributed Replay components are installed separately from the regular SQL Server installation. The Distributed Replay Controller can act as a replacement for Ostress, except for the ability to replay SQL and RML files.

This powerful feature will help customers and DBAs understand the impact of application or configuration changes on performance across multiple servers. You can customize the trace replay by choosing the appropriate mode for your environment.

Environment:

We will use a Distributed Replay Controller and a Distributed Replay Client on server SQLNODE1, a second Distributed Replay Client on server SQLNODE2, and a target SQL Server 2008 R2 instance.

For the installation we need to create two Active Directory accounts: CORP\DR_Controller for the Distributed Replay Controller and CORP\DR_Client for the Distributed Replay Client. Below are the screenshots for reference.

Distributed Replay installation

We will start the installation now. Make sure that you install the Management Tools too, as they provide the Distributed Replay administration tool. For this installation I will install only the Distributed Replay Controller and the Distributed Replay Client, as I already have the Management Tools installed on my SQL Server instances.

We will start the installation on SQLNODE1, as it will act as a controller as well as a client.

Step #1: Select the installation type. Here we will add features to an existing standalone installation, so select that option.

Step #2: Setup will install setup support files.

Step #3: You have the option to include SQL Server updates.

Step #4: Setup will install setup files.

Step #5: Setup should pass all setup support rules. You may ignore warnings and go ahead on a case-by-case basis.

Step #6: Since I already have the DB Engine installed, I will choose to Add features to existing instance of SQL Server 2012.

Step #7: Since this server will act as both a Controller and a Client, select both here.

Step #8: This page will review the disk space requirement.

Step #9: On this page, provide both of the accounts created in Active Directory for the Distributed Replay Controller and the Distributed Replay Client.

Step #10: On Distributed Replay Controller page, add the Active Directory account that we have created.

Step #11: On Distributed Replay Client page, provide the controller name.

Step #12: On the Error Reporting page you can opt to send error reporting information to Microsoft.

Step #13: On this page setup will check if the installation will be blocked by running rules. All green here.

Step #14: We are ready to install. You might like to cross-verify the selected configuration before proceeding.

Step #15: Installation is in progress.

Step #16: Finally the setup is successful. You can close the window now.

Windows Firewall configurations

We will configure the Windows Firewall on the Distributed Replay Controller to allow inbound connections for the DReplayController.exe application.

Go to Windows firewall, Inbound Rules, and add a new rule for a program:

Browse to the DReplayController.exe location:

Allow the Connection.

Specify a name to the rule.

Similarly, create a rule on each Distributed Replay Client to allow the Distributed Replay Clients to connect and register with the Distributed Replay Controller.
Verification

Start the services and check the client registrations. First, start the Controller service on the Distributed Replay Controller SQLNODE1: open a command prompt and enter:

NET STOP “SQL Server Distributed Replay Controller”

NET START “SQL Server Distributed Replay Controller”

You should see the result of this command in the Distributed Replay Controller log folder under C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayController\log. It should look like this:

2014-01-06 20:30:42:482 OPERATIONAL [Controller Service] Microsoft SQL Server Distributed Replay Controller – 11.0.2100.60.

2014-01-06 20:30:42:482 OPERATIONAL [Controller Service] © Microsoft Corporation.

2014-01-06 20:30:42:482 OPERATIONAL [Controller Service] All rights reserved.

2014-01-06 20:30:42:482 OPERATIONAL [Controller Service] Current edition is: [Enterprise Edition].

2014-01-06 20:30:42:482 OPERATIONAL [Controller Service] The number of maximum supported client is 16.

2014-01-06 20:30:42:482 OPERATIONAL [Controller Service] Windows service “Microsoft SQL Server Distributed Replay Controller” has started under service account “CORP\DR_Controller”. Process ID is 2888.

2014-01-06 20:30:42:498 OPERATIONAL [Controller Service] Time Zone: India Standard Time.

2014-01-06 20:30:42:498 OPERATIONAL [Common] Initializing dump support.

2014-01-06 20:30:42:498 OPERATIONAL [Common] Dump support is ready.

Now start the Distributed Replay Client service on your Distributed Replay Clients, SQLNODE1 and SQLNODE2, and verify that they are correctly synchronized with your Distributed Replay Controller.

On both servers, open a command prompt screen and run:

NET STOP “SQL Server Distributed Replay Client”

NET START “SQL Server Distributed Replay Client”

Open the latest log file in C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\log and check for the text “Registered with controller”:

2014-01-06 20:30:51:000 OPERATIONAL [Client Service] Microsoft SQL Server Distributed Replay Client – 11.0.2100.60.

2014-01-06 20:30:51:000 OPERATIONAL [Client Service] © Microsoft Corporation.

2014-01-06 20:30:51:000 OPERATIONAL [Client Service] All rights reserved.

2014-01-06 20:30:51:000 OPERATIONAL [Client Service] Current edition is: [Enterprise Edition].

2014-01-06 20:30:51:015 OPERATIONAL [Common] Initializing dump support.

2014-01-06 20:30:51:015 OPERATIONAL [Common] Dump support is ready.

2014-01-06 20:30:51:015 OPERATIONAL [Client Service] Windows service “Microsoft SQL Server Distributed Replay Client” has started under service account “CORP\DR_Client”. Process ID is 2872.

2014-01-06 20:30:51:015 OPERATIONAL [Client Service] Time Zone: India Standard Time.

2014-01-06 20:30:51:015 OPERATIONAL [Client Service] Controller name is “SQLNODE1”.

2014-01-06 20:30:51:015 OPERATIONAL [Client Service] Working directory is “C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\WorkingDir”.

2014-01-06 20:30:51:015 OPERATIONAL [Client Service] Result directory is “C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\ResultDir”.

2014-01-06 20:30:51:015 OPERATIONAL [Client Service] Heartbeat Frequency(ms): 3000

2014-01-06 20:30:51:015 OPERATIONAL [Client Service] Heartbeats Before Timeout: 3

2014-01-06 20:30:51:078 OPERATIONAL [Client Service] Registered with controller “SQLNODE1”.

You might get the error message below in the client log if you forgot to add the client service account on the Controller Service access permission page during setup, or if you changed the Distributed Replay Client service account afterwards. I reproduced the scenario, and below is the log file:

2014-01-06 18:04:54:603 OPERATIONAL [Client Service] Microsoft SQL Server Distributed Replay Client – 11.0.2100.60.

2014-01-06 18:04:54:619 OPERATIONAL [Client Service] © Microsoft Corporation.

2014-01-06 18:04:54:619 OPERATIONAL [Client Service] All rights reserved.

2014-01-06 18:04:54:806 OPERATIONAL [Client Service] Current edition is: [Enterprise Edition].

2014-01-06 18:04:54:806 OPERATIONAL [Common] Initializing dump support.

2014-01-06 18:04:54:853 OPERATIONAL [Common] Dump support is ready.

2014-01-06 18:04:54:884 OPERATIONAL [Client Service] Windows service “Microsoft SQL Server Distributed Replay Client” has started under service account “CORP\DR_Client”. Process ID is 2800.

2014-01-06 18:04:55:196 OPERATIONAL [Client Service] Time Zone: India Standard Time.

2014-01-06 18:04:55:243 OPERATIONAL [Client Service] Controller name is “SQLNODE1”.

2014-01-06 18:04:55:243 OPERATIONAL [Client Service] Working directory is “C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\WorkingDir”.

2014-01-06 18:04:55:258 OPERATIONAL [Client Service] Result directory is “C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\ResultDir”.

2014-01-06 18:04:55:258 OPERATIONAL [Client Service] Heartbeat Frequency(ms): 3000

2014-01-06 18:04:55:258 OPERATIONAL [Client Service] Heartbeats Before Timeout: 3

2014-01-06 18:04:55:648 CRITICAL [Client Service] [0xC8100005 (6)] Failed to connect controller with error code 0x80070005.

The error code 0x80070005 translates to “ACCESS DENIED.” Below are the steps to fix it.

  1. On the server where the controller is installed, go to Start -> Run, type dcomcnfg, and open Component Services.
  2. Navigate to Console Root -> Component Services -> Computers -> My Computer -> DCOM Config -> DReplayController.
  3. Open the properties of DReplayController and select the Security tab.
  4. Edit “Launch and Activation Permissions” and grant the Distributed Replay Client service account the “Local Activation” and “Remote Activation” permissions.
  5. Edit “Access Permissions” and grant the Distributed Replay Client service account the “Local Access” and “Remote Access” permissions.
  6. Add the Distributed Replay Client service account domain user to the “Distributed COM Users” group.
  7. Restart the controller and client services as shown below:

NET STOP “SQL Server Distributed Replay Controller”

NET STOP “SQL Server Distributed Replay Client”

NET START “SQL Server Distributed Replay Controller”

NET START “SQL Server Distributed Replay Client”

  8. Check the Distributed Replay Client log file and look for the message “Registered with controller SQLNODE1”.

Conclusion

At this point our Distributed Replay lab is ready to use; the clients and controller are registered with each other. We will walk through a demonstration of this powerful feature in an upcoming blog post.

Categories: DBA Blogs

Log Buffer #354, A Carnival of the Vanities for DBAs

Fri, 2014-01-10 09:33

The new year is here with a bang, and the rhythm of everything new is propagating into the database technologies, giving database bloggers ample material to write about. This Log Buffer Edition covers all that and more.

Oracle:

Do you know that DB12c sqlldr is a little smarter… finally? Now a *.ctl file is not always necessary.

ODI – High performance data movement using Datapump.

Informatica PowerCenter 9.5.1 Hot Fix 2 is now available in the Business Intelligence (BI) 11g media pack for download and use with BI Applications 7.9.x.

This was tested on Oracle 11.2. If you create a tablespace with a datafile with autoextend on, the file’s maxsize cannot be less than its size. If it is, you get an ORA-02494.

Oracle Exadata X4 (Part 2): The All Flash Database Machine?

SQL Server:

What are companies doing to optimize BYOD practices?

Stairway to SQLCLR Level 1: What is SQLCLR?

Relational database or graph database? Why not have both?

SQL Server 2014 Hybrid Cloud Scenarios: Migrating On-Premises SQL Server to Windows Azure Virtual Machines

An AlwaysOn Bug with Service Broker Transmission Queue

MySQL:

MySQL Enterprise Monitor 2.3.15 Is Now Available.

New MySQL web installer, Have you tried it yet?

One of the most important config for InnoDB is the innodb_buffer_pool_size.

XtraBackup Complains of Missing perl-DBD-MySQL.

High-Velocity Data—AKA Fast Data or Streaming Data—seems to be all the rage these days.

Categories: DBA Blogs

How to Install a Clustered SQL Server 2012 Instance – Step-by-step – Part 4

Fri, 2014-01-10 09:32

We have reached the last article of this series. To close the series out, we will talk about the Distributed Transaction Coordinator, or simply DTC. I’ll try to give a simple explanation, and after that I’ll demonstrate how to prepare the DTC for a clustered instance.

What is DTC (MS DTC)?
The MS DTC is an OS-level service which comes automatically installed and running under the Network Service account. Its role is to ensure that a distributed transaction is consistent, even in the presence of failures.
Those transactions might be initiated when a transaction deals with data on multiple computers via the network, or when the transaction deals with multiple processes on a single computer.

All the participants in a distributed transaction work in sync with the other participants (computers) involved in the transaction, looking for the right moment to commit or abort their work. For this reason, we need to make sure that the computers can reach each other.
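As a rough illustration (the linked server name REMOTESRV and the table names below are placeholders, not part of any real environment), a transaction like the following spans two instances and would therefore be coordinated by MS DTC:

BEGIN DISTRIBUTED TRANSACTION;

    -- local update on the first instance
    UPDATE dbo.Orders
    SET    Status = 'Shipped'
    WHERE  OrderID = 42;

    -- update on a second instance, reached through an assumed linked server
    UPDATE REMOTESRV.Inventory.dbo.Stock
    SET    Quantity = Quantity - 1
    WHERE  ProductID = 7;

COMMIT TRANSACTION;

If MS DTC is not available to every computer involved, the transaction cannot be promoted to a distributed transaction and will fail.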

Do I need to configure MS DTC on my environment?
The answer to this question is the standard one for almost everything involving SQL Server: it depends. You need to understand whether or not you will perform distributed transactions. If you have more than one instance on the same computer, you won’t need the DTC. On the other hand, if you have a two-node cluster with two clustered instances communicating with each other, you will need the DTC – the instances could be on different nodes. Another possible scenario is when you have the database engine and SSIS installed; in this case you will also need to configure the DTC.

For more information, check this link: http://msdn.microsoft.com/en-us/library/ms189910#MSDTC

How to create a clustered MS DTC?

Since Windows 2008, we have been allowed to have more than one instance of MS DTC on a server/cluster. So for clustered SQL Server installations, it is a best practice to have a Role exclusively for the DTC, and a dedicated DTC for each SQL Server Role.

As documented by Microsoft, SQL Server follows this path to choose the MS DTC instance to use:

  1. Use the MS DTC installed to the local group, else
  2. Use the mapped instance of MS DTC, else
  3. Use the cluster’s default instance of MS DTC, else
  4. Use the local machine’s installed instance of MS DTC.

To configure the DTC in a cluster, we will need a disk and a hostname.

To configure a Role exclusively for the DTC, follow these steps:

  1. Right-click on Roles and pick the “Configure Role” option.
  2. A new window will open. Click “Next”.
  3. Choose the “Distributed Transaction Coordinator (DTC)” option from the list. Click “Next”.
  4. Fill in the hostname in the “Name” field and the IP in the “Network” section. Click “Next”.
  5. Pick the disk to be used. Click “Next”.
  6. Review the configuration and click “Next”.
  7. The installation will run, and in the last step you will see a report. Click “Finish”.
  8. Now you will be able to see a new Role created in the cluster, with all the indicated resources.

To add a DTC resource into the SQL Server Role, follow the steps:

  1. Right-click the Role and go to “Add Resource” -> “More Resources” -> “Distributed Transaction Coordinator”.
  2. The resource will be created in the selected Role; now we need to configure it. Right-click the “New Distributed Transaction Coordinator” and click “Properties”.
  3. As mentioned earlier in this article, the DTC needs a hostname and a disk to work. Under Dependencies you can pick those items as shown, then click “OK”.
  4. Now, let’s bring it online.

How to configure the network for distributed transactions?

Note: In clustered environments, you only need to perform the following steps once.

  1. In “Server Manager” go to “Tools” -> “Component Services”, or run the command “dcomcnfg”.
  2. Expand the tree, right-click the desired DTC, and choose “Properties”.
  3. Go to the “Security” tab and check “Network DTC Access” as well as “Allow Inbound” and “Allow Outbound”, as shown below. Click OK.
  • Let’s briefly describe some of the options on this window:
    • “Network DTC Access”: Enables/disables network access for the DTC.
    • “Allow Inbound”: Permits a distributed transaction originated on another computer to run on the current computer.
    • “Allow Outbound”: Permits a distributed transaction initiated on the current computer to run on a remote computer.
    • “Enable XA Transactions” and “Enable SNA LU 6.2 Transactions”: Enable/disable those particular specifications for distributed transactions.

Troubleshooting DTC

There’s a tool called DTC Ping which can help us verify whether the DTC is working correctly on all the computers that should be involved in a transaction.

You can download this tool here: http://www.microsoft.com/en-us/download/details.aspx?id=2868

I recommend reading this article to learn how to use this tool, as well as how to troubleshoot the possible errors: Troubleshooting MSDTC issues with the DTCPing tool.

Another great tool is DTC Tester. With it you can simulate a distributed transaction on SQL Server.


To download and get more info about this tool, check this link: http://support.microsoft.com/kb/293799.

The End

And with that, we finish this series on how to install a clustered instance. There are still many details to cover, and I will try to create separate articles with best practices, configuration alternatives, etc.

I hope you enjoyed this series and as always, if you have any doubts, contact me!

Thank you for reading!

If you want to check the other parts of this series, here are the links:

Categories: DBA Blogs

A New Year, a New Dancecard

Fri, 2014-01-10 09:29

First things first: Happy New Year y’all! Health, happiness, and all that jazz to each and every one of youses!

So the Holidays are over — or almost over, if you are one of the few lucky souls. Which means that we have to get back into the saddle. Considering that, ahem, some of us have just fallen off the grid and into the eggnog for the last two weeks, that’s easier said than done. In order to jog those hacking muscles back into shape, I decided to revisit an idea that I outlined in the Holidays of 2012: dancecard.

To recap: when I start a new Dancer (or Catalyst) application, I always have to drop in the JavaScript libraries I’m using. That’s tedious. Instead, I would like a little tool that invites chosen libraries to come and dance with the app. A kind of Tanzkarte, if you will.

For the 2014 edition of this dancecard, I decided to take a slightly different approach. Instead of keeping a local repository of the libraries, I wanted to be even lazier and simply have a main registry pointing to the download location or git repository of each library.

For the moment I won’t go into the details of the code, mostly because I pretty much banged the keyboard like a semi-hungover monkey for the last 2 hours, and it shows in the quality of the codebase. Let it just be said that the git repo is at the usual place, and that I used MooseX::App to build the app.

So, what did those few hours of hacking give me? Well, with the config file ~/.dancecard/config.yaml, which looks like:


---
browser_command: firefox %s
libs:
    backbone:
        location: http://backbonejs.org/backbone.js
        website: http://backbonejs.org
        documentation: http://backbonejs.org
        dependencies:
            - underscore
    underscore:
        location: http://underscorejs.org/underscore.js
        website: http://underscorejs.org
        documentation: http://underscorejs.org
...

I can now get a list of all libraries I have at my disposal:


$ dancecard list
INSTALLED LIBS
    no lib installed
AVAILABLE LIBS
    underscore
    bootstrap
    backbone
    slippy
    font-awesome

Of course, I can also install any of those. And… look Ma: dependencies are handled automatically!


$ dancecard install backbone
 backbone installed
 underscore installed

Something I also often do is peruse the online documentation of those libraries. Notice those “documentation” URLs in the YAML registry? They’re there so that I can do


$ dancecard doc backbone

and have the documentation page opened automatically in Firefox. Or, for maximum laziness, I can also do,


$ dancecard doc --installed

which will open the documentation URL of every library found in the current project.

Next Steps

If anyone shows interest in the thing, I’ll bundle the app for CPAN consumption. As for features, things that I’ll probably tackle for my own use are:

  • Tags (no software is true software without tags nowadays);
  • Virtual libraries (just as a means to install a group of libraries in a single go);
  • Fish completion (I wuv the fish shell);
  • Cached local documentation (’cause hitting the ‘Net each time I look for a piece of documentation fills me with guilt);
  • An online main registry of libraries;
  • And whatever else I may think of.
Categories: DBA Blogs

Log Buffer #353, A Carnival of the Vanities for DBAs

Fri, 2014-01-10 08:48

The fresh new year is already here amidst fireworks, hard partying, resolutions, and new hopes. Log Buffer also enters this brand new year, focusing on emerging trends while appreciating proven technologies. Enjoy the first Log Buffer of the year.

Oracle:

Keeping up with the latest Java Runtime Environment (JRE) updates is both a priority and a challenge for most organizations.

Oracle‘s product portfolio offers a multitude of opportunities to combine cross-stack products and deliver incremental value to customers and partners.

Do you Deal with Supplier Profile Management flows?

In case you took a break over New Year’s and aren’t quite ready for “real” work, but still feel obligated to visit sites such as this one, here’s a reward for your efforts.

Using SSD for a temp tablespace on Exadata

SQL Server:

This paper discusses whether using Hadoop as the analytics infrastructure is the right approach.

Microsoft is releasing SQL Server 2014 with a new set of In-Memory OLTP features to significantly improve the OLTP performance and reduce the processing time for servers with a large amount of memory and multi-core processors.

SQL Server Optimizer — Under the Hood series of short articles.

How to use Profiler to generate TSQL scripts that can be modified and automated to run as a server-side trace by Grant Fritchey.

Big data is now a standard part of information technology architecture for most large organizations.

MySQL:

SSL with MySQL does not have to be complicated

This is a HOW-TO about installing MariaDB Galera Cluster on Debian/Ubuntu.

In many parts of the MySQL world, whether you have users, developers or administrators of MySQL, the season holidays are for family, relax, prayers, travel, or they are simply a moment where you can enjoy what you like or care most about.

How to recover table structure from .frm files with MySQL Utilities

Chinese and Japanese and Korean (CJK) text usually has no spaces between words. Conventional full-text search does its tokenizing by looking for spaces. Therefore conventional full-text search will fail for CJK.

Categories: DBA Blogs

Disabling Triggers in Oracle 11.2.0.4

Wed, 2014-01-08 08:10

In March 2012, I put together a blog post entitled Disabling Oracle triggers on a per-session basis, outlining a way to suspend trigger execution for the current session through a PL/SQL call. Commenter Bryan posted a comment saying he couldn’t get it working in 11.2.0.4:

Unfortunately Oracle seems to have disabled this use in 11.2.0.4, and most likely 12.1 as well. Boo-Hiss! This is needed functionality for DBAs!

A new parameter: enable_goldengate_replication

I tried this on an Oracle 11.2.0.4 system, and I indeed got an error:

SQL> exec sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true);
BEGIN sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true); END;

*
ERROR at line 1:
ORA-26947: Oracle GoldenGate replication is not enabled.
ORA-06512: at "SYS.DBMS_XSTREAM_GG_INTERNAL", line 46
ORA-06512: at "SYS.DBMS_XSTREAM_GG", line 13
ORA-06512: at line 1

A quick look at oerr gives a path forward, assuming you do indeed have a GoldenGate license:

[oracle@ora11gr2b ~]$ oerr ora 26947
26947, 00000, "Oracle GoldenGate replication is not enabled."
// *Cause: The 'enable_goldengate_replication' parameter was not set to 'true'.
// *Action: Set the 'enable_goldengate_replication' parameter to 'true'
//           and retry the operation.
//          Oracle GoldenGate license is needed to use this parameter.

The Oracle reference gives a bit more info

ENABLE_GOLDENGATE_REPLICATION controls services provided by the RDBMS for Oracle GoldenGate (both capture and apply services). Set this to true to enable RDBMS services used by Oracle GoldenGate.

The RDBMS services controlled by this parameter also include (but are not limited to):

Service to suppress triggers used by GoldenGate Replicat

As do the GoldenGate 12.1.2 docs:

The database services required to support Oracle GoldenGate capture and apply must be enabled explicitly for an Oracle 11.2.0.4 database. This is required for all modes of Extract and Replicat.

To enable Oracle GoldenGate, set the following database initialization parameter. All instances in Oracle RAC must have the same setting.

ENABLE_GOLDENGATE_REPLICATION=true

So here goes nothing:

SQL> alter system set enable_goldengate_replication=true;

System altered.

SQL> exec sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true);
BEGIN sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true); END;

*
ERROR at line 1:
ORA-01031: insufficient privileges
ORA-06512: at "SYS.DBMS_XSTREAM_GG_INTERNAL", line 46
ORA-06512: at "SYS.DBMS_XSTREAM_GG", line 13
ORA-06512: at line 1

Another error: missing privileges. I checked and double-checked that the required GoldenGate privileges were indeed assigned.

Tracing and permission checks

It’s time to run a 10046 trace (SQL trace) to see what’s really going on.

SQL> alter session set events '10046 trace name context forever, level 12';

Session altered.

SQL> exec sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true);
BEGIN sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true); END;

*
ERROR at line 1:
ORA-01031: insufficient privileges
ORA-06512: at "SYS.DBMS_XSTREAM_GG_INTERNAL", line 46
ORA-06512: at "SYS.DBMS_XSTREAM_GG", line 13
ORA-06512: at line 1

And the tracefile does show some interesting information. A few of the more interesting snippets:

PARSING IN CURSOR #140324121137184 len=76 dep=0 uid=91 oct=47 lid=91 tim=1388531465245781 hv=1323338123 ad='6c1f63a0' sqlid='gvq73797f12cb'
BEGIN sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true); END;
END OF STMT
...
PARSING IN CURSOR #140324121064984 len=187 dep=1 uid=0 oct=3 lid=0 tim=1388531465246387 hv=2028900049 ad='6c128db8' sqlid='aa9h2ajwfx3qj'
SELECT COUNT(*) FROM ( SELECT GP.USERNAME FROM DBA_GOLDENGATE_PRIVILEGES GP WHERE GP.USERNAME = :B1 UNION ALL SELECT GRANTEE FROM DBA_ROLE_PRIVS WHERE GRANTEE=:B1 AND GRANTED_ROLE='DBA' )
END OF STMT
...
 Bind#0
...
  value="GGS"
...
 Bind#1
...
  value="GGS"
...

The SQL statement is actually checking two things. The first is looking for the current username in the dba_goldengate_privileges view. This view isn’t listed in the Oracle 11.2 documentation, but it does appear in the 12c docs:

ALL_GOLDENGATE_PRIVILEGES displays details about Oracle GoldenGate privileges for the user.

Oracle GoldenGate privileges are granted using the DBMS_GOLDENGATE_AUTH package.

Related Views

DBA_GOLDENGATE_PRIVILEGES displays details about Oracle GoldenGate privileges for all users who have been granted Oracle GoldenGate privileges.

USER_GOLDENGATE_PRIVILEGES displays details about Oracle GoldenGate privileges. This view does not display the USERNAME column.

I had previously run dbms_goldengate_auth to grant the privileges here, so this should be OK.
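For reference, the grants in question were along these lines (GGS being the GoldenGate user in this database; adjust the name for your own environment):

exec dbms_goldengate_auth.grant_admin_privilege('GGS');
grant dba to GGS;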

The second check simply verifies that the DBA role has been granted to the current user, again as recommended by the documentation. (A side note: in previous versions, I had avoided granting the overly broad DBA role to the GoldenGate user in favor of specific grants for the objects it uses. There’s no reason for the GoldenGate user to be able to read and modify data objects that aren’t related to its own replication activities, for example. And I would argue that it helps avoid errors, such as putting the wrong schema in a map statement, when permissions are restricted. But sadly this is no longer possible in the world of 11.2.0.4.)

Running the query manually to verify that the grants are indeed in place:

SQL> SELECT COUNT(*) FROM ( SELECT GP.USERNAME FROM DBA_GOLDENGATE_PRIVILEGES GP WHERE GP.USERNAME = 'GGS'
UNION ALL SELECT GRANTEE FROM DBA_ROLE_PRIVS WHERE GRANTEE='GGS' AND GRANTED_ROLE='DBA' );

  COUNT(*)
----------
         2

Looks good, so that doesn’t seem to be the problem.

Tracing #2: system properties

Back to the 10046 tracefile:

PARSING IN CURSOR #140324119717656 len=45 dep=1 uid=0 oct=3 lid=0 tim=1388531465253124 hv=3393782897 ad='78ae2b40' sqlid='9p6bq1v54k13j'
select value$ from sys.props$ where name = :1
END OF STMT
...
 Bind#0
...
  value="GG_XSTREAM_FOR_STREAMS"
...
FETCH #140324119717656:c=0,e=44,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=1,plh=415205717,tim=1388531465254441

Because this SQL statement involves an ordinary select without an aggregate function, I can look at the FETCH line in the tracefile to get the number of rows returned. In this case it’s r=0, meaning no rows returned.

The query itself is looking for a system property I haven’t seen before: GG_XSTREAM_FOR_STREAMS. A Google search returns only a single result: the PDF version of the Oracle 11.2 XStream guide. Quoting:

ENABLE_GG_XSTREAM_FOR_STREAMS Procedure

This procedure enables XStream capabilities and performance optimizations for Oracle Streams components.

This procedure is intended for users of Oracle Streams who want to enable XStream capabilities and optimizations. For example, you can enable the optimizations for an Oracle Streams replication configuration that uses capture processes and apply processes to replicate changes between Oracle databases.

These capabilities and optimizations are enabled automatically for XStream components, such as outbound servers, inbound servers, and capture processes that send changes to outbound servers. It is not necessary to run this procedure for XStream components.

When XStream capabilities are enabled, Oracle Streams components can stream ID key LCRs and sequence LCRs. The XStream performance optimizations improve efficiency in various areas, including:

  • LCR processing
  • Handling large transactions
  • DML execution during apply
  • Dependency computation and scheduling
  • Capture process parallelism

On the surface, I don’t see what this would have to do with trigger execution, but I’m going to try enabling it as per the newly read document anyway:

SQL> exec dbms_xstream_adm.ENABLE_GG_XSTREAM_FOR_STREAMS(enable=>true);

PL/SQL procedure successfully completed.

SQL> exec sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true);
BEGIN sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true); END;

*
ERROR at line 1:
ORA-01031: insufficient privileges
ORA-06512: at "SYS.DBMS_XSTREAM_GG_INTERNAL", line 46
ORA-06512: at "SYS.DBMS_XSTREAM_GG", line 13
ORA-06512: at line 1

No dice.

Tracing #3: process names

Onto the next SQL in the tracefile:

PARSING IN CURSOR #140324120912848 len=114 dep=1 uid=0 oct=3 lid=0 tim=1388531465255628 hv=1670585998 ad='6c2d6098' sqlid='a9mwtndjt67nf'
SELECT COUNT(*) FROM V$SESSION S, V$PROCESS P WHERE P.ADDR = S.PADDR AND S.PROGRAM LIKE 'extract%' AND p.spid = :1
END OF STMT
...
 Bind#0
...
  value="2293"

Now we look in v$session to see if the session associated with the process with OS PID 2293 (which happens to be the SPID of our current shadow process) has a PROGRAM column starting with the word extract. extract is, naturally, the name of the GoldenGate executable that captures data from the source system. In a GoldenGate system, however, trigger suppression does not happen in the extract process at all, but rather in the replicat process that applies changes on the target system. So I’m going to skip this check and move on to the next one in the tracefile:

PARSING IN CURSOR #140324120905624 len=169 dep=1 uid=0 oct=3 lid=0 tim=1388531465257346 hv=3013382849 ad='6c122b38' sqlid='38pkvxattt4q1'
SELECT COUNT(*) FROM V$SESSION S, V$PROCESS P WHERE P.ADDR = S.PADDR AND (S.MODULE LIKE 'OGG%' OR S.MODULE = 'GoldenGate') AND S.PROGRAM LIKE 'replicat%' AND p.spid = :1
END OF STMT
...
 Bind#0
...
  value="2293"

This SQL is similar to the previous one, but instead of looking for a program called extract, it looks for one called replicat, and it adds an extra check to see whether the module column either starts with OGG or is called GoldenGate. And since it’s the replicat process that does the trigger disabling in GoldenGate, this check is likely to be the relevant one.

To make this check succeed, I’m going to have to change both the program and module columns in v$session for the current session. Of the two, module is much easier to modify: a single call to dbms_application_info.set_module. Modifying program is less straightforward. One approach is to use Java code with Oracle’s JDBC Thin driver, setting the aptly-named v$session.program property, as explained at De Roeptoeter. But I’m hoping to stay with something I can do in SQL*Plus. If you’ve looked through a packet trace of a SQL*Net connection being established, you will know that the program name is passed by the client when the connection is set up, so it could be modified by altering the network packet in transit. This is also complex to get working, as it involves fixing checksums and the like. There’s a post on Slavik’s blog with a sample OCI C program that modifies its program information. Again, more complexity than I’d like, but it gave me an idea: if program is populated with the name of the client-side executable, why don’t we simply copy sqlplus to a name that dbms_xstream_gg likes better?

[oracle@ora11gr2b ~]$ cp $ORACLE_HOME/bin/sqlplus ./replicat
[oracle@ora11gr2b ~]$ ./replicat ggs

SQL*Plus: Release 11.2.0.4.0 Production on Mon Dec 30 14:09:05 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Enter password:

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> exec dbms_application_info.set_module('OGG','');

PL/SQL procedure successfully completed.

SQL> exec sys.dbms_xstream_gg.set_foo_trigger_session_contxt(fire=>true);

PL/SQL procedure successfully completed.

Success!
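A quick way to confirm what that internal check now sees for your session is to query v$session directly (just a sanity check using standard columns):

select s.program, s.module
  from v$session s
 where s.sid = sys_context('userenv','sid');

With the copied executable and the set_module call in place, program now starts with replicat and module is OGG, which is what the check in the tracefile was looking for.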

Wrapping up

So it looks like you can disable triggers per-session in 11.2.0.4 just like in previous versions, but you need to jump through a few more hoops to do it. A few conclusions to draw:

  • Oracle patchsets, while normally intended to contain bugfixes, can include major changes to underlying functionality too. See Jeremy Schneider’s post on adaptive log file sync for an even more egregious example. So before applying a patchset, test thoroughly!
  • The enforcement of full DBA privileges for the GoldenGate user in Oracle 11.2.0.4 means that very broad permissions are required to use GoldenGate, which can be a concern in security-conscious or consolidated environments.

TL;DR: Yes, you can still disable triggers per-session in Oracle 11.2.0.4, but you have to have a GoldenGate license, set the enable_goldengate_replication parameter, use a program name that starts with replicat, and set your module to OGG.

Categories: DBA Blogs

Statistics gathering and SQL Tuning Advisor

Thu, 2014-01-02 07:50

Our monitoring software found a long-running job on one of our client’s databases. The job was one of Oracle’s auto tasks, running statistics gathering for more than 3 hours. I was curious to know why it took so long, so I used a query against ASH to find the most common SQL during the job run, based on the module name. The results surprised me: the top SQL came with a SQL Tuning Advisor comment.

Here is the SQL I used:

SQL> select s.sql_id, t.sql_text, s.cnt
  2  from
  3    (select *
  4     from
  5      (
  6        select sql_id, count(*) cnt
  7        from v$active_session_history
  8        where action like 'ORA$AT_OS_OPT_SY%'
  9        group by sql_id
 10        order by count(*) desc
 11      )
 12     where rownum <= 5
 13    ) s,
 14    dba_hist_sqltext t
 15  where s.sql_id = t.sql_id;

SQL_ID        SQL_TEXT                                                                                CNT
------------- -------------------------------------------------------------------------------- ----------
020t65s3ah2pq select substrb(dump(val,16,0,32),1,120) ep, cnt from (select /*+ no_expand_table        781
byug0cc5vn416 /* SQL Analyze(1) */ select /*+  full(t)    no_parallel(t) no_parallel_index(t)          43
bkvvr4azs1n6z /* SQL Analyze(1) */ select /*+  full(t)    no_parallel(t) no_parallel_index(t)          21
46sy4dfg3xbfn /* SQL Analyze(1) */ select /*+  full(t)    no_parallel(t) no_parallel_index(t)        1559

So most of the queries come with a “SQL Analyze” comment right at the beginning of the SQL, even though they are being run from a DBMS_STATS call, which is confusing. After some bug searching I found MOS Doc ID 1480132.1, which includes a PL/SQL stack trace from a DBMS_STATS procedure call that goes up to DBMS_SQLTUNE_INTERNAL. This means DBMS_STATS has a call to the SQL Tuning package; very odd:

SQL> select * from dba_dependencies where name = 'DBMS_STATS_INTERNAL' and referenced_name = 'DBMS_SQLTUNE_INTERNAL';

OWNER                          NAME                           TYPE               REFERENCED_OWNER       REFERENCED_NAME
------------------------------ ------------------------------ ------------------ ------------------------------ ----------------------------------
REFERENCED_TYPE    REFERENCED_LINK_NAME                                                                                                     DEPE
------------------ -------------------------------------------------------------------------------------------------------------------------------
SYS                            DBMS_STATS_INTERNAL            PACKAGE BODY       SYS                    DBMS_SQLTUNE_INTERNAL
PACKAGE                                                                                                                                     HARD

It turns out this call has nothing to do with SQL Tuning: it is just a call to a procedure in that package, and the procedure happens to run SQL that looks like it came from the SQL Tuning Advisor. I traced a GATHER_TABLE_STATS call in a test database, first with SQL trace and then with DBMS_HPROF.
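For completeness, here is a minimal sketch of how such a hierarchical profile can be collected; it assumes a DIRECTORY object named HPROF_DIR and that the profiler tables have been created with $ORACLE_HOME/rdbms/admin/dbmshptab.sql, and it is illustrative rather than the exact commands used:

declare
  l_runid number;
begin
  dbms_hprof.start_profiling( location => 'HPROF_DIR', filename => 'gather_stats.trc' );
  dbms_stats.gather_table_stats( ownname => 'TIM', tabname => 'T1' );
  dbms_hprof.stop_profiling;
  -- load the raw profile into the DBMSHP_* tables and get a run id back
  l_runid := dbms_hprof.analyze( location => 'HPROF_DIR', filename => 'gather_stats.trc' );
  dbms_output.put_line( 'runid = ' || l_runid );
end;
/

The query below, against the resulting DBMSHP_* tables, shows what the call tree looks like: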

SELECT RPAD(' ', (level-1)*2, ' ') || fi.owner || '.' || fi.module AS name,
       fi.function,
       pci.subtree_elapsed_time,
       pci.function_elapsed_time,
       pci.calls
FROM   dbmshp_parent_child_info pci
       JOIN dbmshp_function_info fi ON pci.runid = fi.runid AND pci.childsymid = fi.symbolid
WHERE  pci.runid = 1
CONNECT BY PRIOR childsymid = parentsymid
  START WITH pci.parentsymid = 27;
NAME                                     FUNCTION                       SUBTREE_ELAPSED_TIME FUNCTION_ELAPSED_TIME                CALLS
---------------------------------------- ------------------------------ -------------------- --------------------- --------------------
...
SYS.DBMS_STATS_INTERNAL                  GATHER_SQL_STATS                           21131962                 13023                    1
  SYS.DBMS_ADVISOR                       __pkg_init                                       88                    88                    1
  SYS.DBMS_SQLTUNE_INTERNAL              GATHER_SQL_STATS                           21118776                  9440                    1
    SYS.DBMS_SQLTUNE_INTERNAL            I_PROCESS_SQL                              21107094              21104225                    1
      SYS.DBMS_LOB                       GETLENGTH                                        37                    37                    1
      SYS.DBMS_LOB                       INSTR                                            42                    42                    1
      SYS.DBMS_LOB                       __pkg_init                                       15                    15                    1
      SYS.DBMS_SQLTUNE_INTERNAL          I_VALIDATE_PROCESS_ACTION                        74                    39                    1
        SYS.DBMS_UTILITY                 COMMA_TO_TABLE                                   35                    35                    1
      SYS.DBMS_SQLTUNE_UTIL0             SQLTEXT_TO_SIGNATURE                            532                   532                    1
      SYS.DBMS_SQLTUNE_UTIL0             SQLTEXT_TO_SQLID                                351                   351                    1
      SYS.XMLTYPE                        XMLTYPE                                        1818                  1818                    1
    SYS.DBMS_SQLTUNE_UTIL0               SQLTEXT_TO_SQLID                                528                   528                    1
    SYS.DBMS_UTILITY                     COMMA_TO_TABLE                                   88                    88                    1
    SYS.DBMS_UTILITY                     __pkg_init                                       10                    10                    1
    SYS.SQLSET_ROW                       SQLSET_ROW                                       33                    33                    1
    SYS.XMLTYPE                          XMLTYPE                                        1583                  1583                    1
  SYS.DBMS_STATS_INTERNAL                DUMP_PQ_SESSTAT                                  73                    73                    1
  SYS.DBMS_STATS_INTERNAL                DUMP_QUERY                                        2                     2                    1
...

So there is a procedure, DBMS_SQLTUNE_INTERNAL.GATHER_SQL_STATS, which is called by DBMS_STATS_INTERNAL, and this procedure actually runs SQL like this:

/* SQL Analyze(0) */ select /*+  full(t)    no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring no_substrb_pad  */to_char(count("ID")),to_char(substrb(dump(min("ID"),16,0,32),1,120)),to_char(substrb(dump(max("ID"),16,0,32),1,120)),to_char(count("X")),to_char(substrb(dump(min("X"),16,0,32),1,120)),to_char(substrb(dump(max("X"),16,0,32),1,120)),to_char(count("Y")),to_char(substrb(dump(min("Y"),16,0,32),1,120)),to_char(substrb(dump(max("Y"),16,0,32),1,120)),to_char(count("PAD")),to_char(substrb(dump(min("PAD"),16,0,32),1,120)),to_char(substrb(dump(max("PAD"),16,0,32),1,120)) from "TIM"."T1" t  /* NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL,NDV,NIL,NIL*/

This is just the approximate NDV calculation, so there is nothing to be afraid of; it is simply the way the code is organized: DBMS_STATS uses the API of the SQL Tuning framework when you use DBMS_STATS.AUTO_SAMPLE_SIZE as the ESTIMATE_PERCENT (which is the default and recommended value in 11g and later).
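The behaviour is easy to reproduce: gather statistics with the default sample size, and the approximate NDV cursor shows up in the shared pool with the “SQL Analyze” comment even though no tuning task was ever created. A minimal sketch, using the TIM.T1 table from the trace above:

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'TIM',
    tabname          => 'T1',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE  -- the 11g+ default, spelled out
  );
END;
/

-- The NDV query is in the shared pool, but it did not come from a tuning task
SELECT sql_id, substr(sql_text, 1, 60) AS sql_text
FROM   v$sql
WHERE  sql_text LIKE '/* SQL Analyze%';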

Categories: DBA Blogs

SQL Server 2014 CTP2 – Memory Optimization Advisor

Thu, 2014-01-02 07:49

Today I am going to discuss one of the amazing new features of SQL Server 2014 CTP2, the Memory Optimization Advisor. I discussed the architectural details of the SQL Server 2014 In-Memory Optimizer in my last blog post here.

We will discuss the Memory Optimization Advisor implementation with a demo in this blog post. This tool helps customers migrate disk-based tables to memory-optimized tables with ease.

Memory Optimization Advisor – In Action

I will use the AdventureWorks2008 database. For the purpose of the demo I have created a copy of the Employee table named EmpTempTbl, with the OrganizationNode, SalariedFlag, and CurrentFlag columns removed.

Let us explore the wizard step by step.

Step #1: Open Management Studio, connect to the instance and the AdventureWorks2008 database, and right-click the table EmpTempTbl. Now choose Memory Optimization Advisor.

Step #2: The Advisor tool is launched with a page describing the features of the tool. Click Next.

Step #3: On this screen the tool checks whether the selected table is fit for migration to a memory-optimized table. The tool reports immediately if anything about the table prevents it from being migrated.

We also have the option of generating a report using the Generate Report button.

If you click the Generate Report button, the wizard provides an option to save the report (an HTML file) anywhere on the local disk. In my case everything is green, so click Next.

Step #4: This screen shows migration warnings about the limitations of memory-optimized objects, along with a link that explains those limitations in detail. Click Next.

Step #5: The wizard then takes us to the screen below, which lets us select the options for memory optimization.

  • Memory-Optimized Filegroup: you can have just one per database, and it must be created before moving a disk-based table to a memory-optimized table, otherwise you will get an error (see the T-SQL sketch at the end of this step).
  • Logical file name: here you can change the logical file name.
  • Path: points to the location where you will save your logical file.
  • Estimated Current Memory Cost (MB): the estimated current memory cost for this table.

There are two checkboxes on this page, described below:

  • The option to copy data from the disk-based table to the new memory-optimized table during the migration process.
  • The option to change the durability of the table to schema only (SCHEMA_ONLY), in which case the data will be lost after each SQL Server service restart. By default, schema and data (SCHEMA_AND_DATA) is applied.
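For reference, the memory-optimized filegroup can also be created outside the wizard with T-SQL along these lines (a sketch only; the filegroup name, logical name, and path are illustrative):

-- A database can have only one MEMORY_OPTIMIZED_DATA filegroup
ALTER DATABASE AdventureWorks2008
    ADD FILEGROUP AdventureWorks2008_mod CONTAINS MEMORY_OPTIMIZED_DATA;

-- The "file" for this filegroup is really a container, i.e. a folder on disk
ALTER DATABASE AdventureWorks2008
    ADD FILE (NAME = 'AdventureWorks2008_mod_dir',
              FILENAME = 'C:\Data\AdventureWorks2008_mod_dir')
    TO FILEGROUP AdventureWorks2008_mod;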

Step #6: The next screen in the wizard lets you decide the name of the primary key, its member columns, and its type. You can choose between a nonclustered hash index and a nonclustered index here.

I have selected one integer column and one character column for the primary key.

I selected the Char data type to show that we have no option other than a BIN2 collation for memory-optimized tables. Click Next.

Step #7: This screen of the wizard lists the summary of migration actions. You have the option to script those operations by clicking the Script button, which I did. Click Migrate.
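For reference, the heart of that script is a CREATE TABLE for the memory-optimized version of the table, roughly along these lines (a sketch only, not the exact wizard output; the column list and BUCKET_COUNT are illustrative):

-- New memory-optimized table with a nonclustered hash primary key;
-- note the mandatory BIN2 collation on the character key column
CREATE TABLE dbo.EmpTempTbl
(
    BusinessEntityID  INT          NOT NULL,
    NationalIDNumber  CHAR(15)     COLLATE Latin1_General_100_BIN2 NOT NULL,
    JobTitle          NVARCHAR(50) NOT NULL,
    HireDate          DATE         NOT NULL,

    CONSTRAINT PK_EmpTempTbl
        PRIMARY KEY NONCLUSTERED HASH (BusinessEntityID, NationalIDNumber)
        WITH (BUCKET_COUNT = 131072)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);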

Step #8: The migration process can take a while, depending on the number of objects. In this case it succeeded. Click OK.

Now our table is in-memory. Let’s check the properties of the table to verify it. In SSMS you can see the old table, now renamed to EmpTempTbl_old, and the new table created under the Tables folder of the database.

Right-click the newly created table and go to Properties. You can see that the Memory Optimized option is set to True and the Durability is set to SchemaAndData.
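The same check can also be made from T-SQL using the catalog views (a quick sketch, assuming the migrated table kept the name EmpTempTbl):

SELECT name, is_memory_optimized, durability_desc
FROM   sys.tables
WHERE  name = N'EmpTempTbl';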

This is a very user-friendly tool, with explanations and warnings that help users iron out issues well before implementing the In-Memory technology. As this blog post was written against a CTP version of SQL Server 2014, things might change in the final release of SQL Server 2014.

Categories: DBA Blogs

Log Buffer #352, A Carnival of the Vanities for DBAs

Thu, 2014-01-02 07:44

Jingle bells ring all the way into the happy new year. Whether the database bloggers are basking on white, sandy beaches or warming up indoors on white, snowy mornings, they are gearing up with their blog posts to start the new year with a bang. Log Buffer shares that enthusiasm with them.
Oracle:

When diagnosing ASM issues, it helps to know a bit about the setup – disk group names and types, the state of disks, ASM instance initialisation parameters and if any rebalance operations are in progress.

The ADF framework provides parameters for bounded task flows, and developers can give them default values using JDeveloper.

If you would like to increase the space in a Linux volume group, you can do it while the machine is up.

How to Find Your Oracle Voice in the Oracle Sales Cloud?

ORA-14696: MAX_STRING_SIZE migration is incomplete for pluggable database

SQL Server:

How to Compare Rows within Partitioned Sets to Find Overlapping Dates.

Free eBook: SQL Server Transaction Log Management

A Greg Larsen classic – he discusses how the SQL Server optimizer tries to parameterize a query if it can, as well as how you can build your own parameterized query.

Learn how to create a Windows\SQL Server 2008 virtual cluster.

This technical note provides guidance for Reporting Services. The focus of this technical note is to optimize your Reporting Services architecture for better performance and higher report execution throughput and user loads.

MySQL:

Busy database-backed websites often hit scalability limits in the database first. In tuning MySQL, one of the first things to look at is the max_connections parameter, which is often too low.

One of the routine tasks for a DBA is renaming database schemas, and as such MySQL added a command to carry out that purpose called “RENAME DATABASE <database_name>”.

MySQL Connector/Java supports multi-master replication topographies as of version 5.1.27, allowing you to scale read load to slaves while directing write traffic to multi-master (or replication ring) servers.

This post shows you how to perform a schema upgrade using the Rolling Schema Upgrade (RSU) method.

Several users have reported that certain queries with IN predicates can’t use index scans even though all the columns in the query are indexed.

Categories: DBA Blogs