
Pythian Group

Love Your Data

Log Buffer #469: A Carnival of the Vanities for DBAs

Wed, 2016-04-13 08:40

This Log Buffer Edition digs deep into the realms of Oracle, SQL Server and MySQL and brings together a few of the top blog posts.

Oracle

We’ve all encountered a situation where you want to check a simple query or some syntax for your SQL and don’t have a database around. Of course, most of us have at least a virtual machine for that, but it takes time to fire it up, and if you’re working on battery, it can drain it pretty quickly.

View Criteria is set to execute in Database mode by default. There is an option to change the execution mode to Both, which executes the query and fetches results from both the database and memory. Such query execution is useful when we want to include newly created (but not yet committed) rows in the View Criteria result set.

Upgrading database hardware in an organization is always a cumbersome process. The most time-consuming step is planning for the upgrade, which mainly includes choosing the right hardware for your Oracle databases. After deciding on the hardware type for your databases, the rest will be taken care of by the technical teams involved.

Gluent New World #02: SQL-on-Hadoop with Mark Rittman

The pre-12c implementation of DCD used TNS packets to “ping” the client and relied on the underlying TCP stack, which sometimes may take longer. In 12c this has changed, and DCD probes are implemented by the TCP stack. The DCD probes now use the TCP KEEPALIVE socket option.
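
For context, DCD is controlled on the database server through the expire time setting in sqlnet.ora; a minimal sketch, with the 10-minute interval being just an example value:

# $ORACLE_HOME/network/admin/sqlnet.ora -- enable Dead Connection Detection probes
SQLNET.EXPIRE_TIME = 10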

 

SQL Server

Snippets will allow you to code faster by inserting chunks of code with a few keystrokes.

One of the more common concerns among database administrators who consider migrating their estate to Azure SQL Database is their ability to efficiently manage the migrated workloads.

A SQL Server Patching Shortcut

Move an Existing Log Shipping Database to a New Monitor Server

Knee-Jerk Performance Tuning : Incorrect Use of Temporary Tables

 

MySQL

MySQL 5.7 sysbench OLTP read-only results: is MySQL 5.7 really faster?

7 Galera Cluster presentations in Percona Live Santa Clara 18-21.4. Meet us there!

Generate JSON Data with dbForge Data Generator for MySQL v1.6!

Extending the SYS schema to show metadata locks

EXPLAIN FORMAT=JSON wrap-up

Categories: DBA Blogs

Installing SQL Server 2016 – Standalone Instance – New Features

Tue, 2016-04-12 09:48

In this article I am going to go through a typical install of SQL Server 2016, and explain some of the best practices to follow when setting up a production server. I will also take a look at the new features when installing SQL Server 2016 compared to older versions.

The version of SQL Server I am using is SQL Server 2016 RC2. That means that some of the features may change between the time of writing and the retail release of SQL Server 2016.

 

Let’s Get Started!

The first screen you come to after opening the installation media:

Initial Screen

  1. Click the highlighted Installation button, and then the first option in the list, New SQL Server Standalone Installation.

 

New for SQL Server 2016
Seasoned SQL Server DBAs will notice a few additions to this screen.

  • SQL Server Management Tools can now be installed from this screen. The files and binaries will be downloaded when you click this link, as they are no longer bundled with the installation media.
  • SQL Server Data Tools can also be installed from this screen.
  • Additionally, a standalone instance of R can be installed. R is a statistical programming language embedded into SQL Server 2016, making it easy for data scientists and BI professionals to get a good level of analysis without leaving the SQL Server environment.

 

  2. Select the edition of SQL Server you would like to install. I chose Developer, as it is a full-featured installation of SQL Server that can be used for development only, not in a production environment. You could also enter a product key here instead of selecting an edition.

Edition

  3. Read and accept the licence.
  4. SQL Server will then check a few rules to make sure you can install SQL Server on your hardware.
  5. Select whether or not to allow Microsoft to check for updates, and click Next.
  6. Setup will then install some files needed for installation and perform a few more checks. As you can see below, a warning appeared asking me to make sure the correct ports are open on the firewall. Click OK and then Next to proceed with the installation.

Rule Check

  7. The next screen is where we select the features we want to install. As my machine is being set up as a dev machine to test SQL Server 2016 features, I am installing all of them. In a production environment, only install the features that are needed. You can also select where to install the binaries and the instance root from this screen. Click Next once you have selected the settings you need.

Features

New for SQL Server 2016

  • The ability to install a standalone instance of R is now available in the Shared Features Section.
  • The ability to install R services in the database engine is now available.
  • The ability to install PolyBase Query Service for external data sources is now available. PolyBase allows users to query big data sources such as Hadoop using common T-SQL statements. If you are planning on installing the PolyBase feature, the Java SE Runtime Environment (JRE) needs to be installed first.

 

  8. The next screen is where you will need to name the instance if this is not the only SQL Server installation on this hardware. It also confirms the installation directory from the previous screens. As this is the only installation of SQL Server on this machine, I am going to leave these settings at their defaults. Click Next when ready to proceed.

Default Instance

  9. The next screen is the Server Configuration screen. You should run each service under a domain account with a strong password. Whether you use managed accounts or do not enforce password expiration is up to you; however, these accounts are going to be running your services and should always be available. As this is a test machine not connected to a domain, I will leave the defaults. You can also select your default server collation from this window by clicking on the tab at the top (highlighted in yellow). It is also important to set the start-up type so that services start automatically on reboot if needed.

The ability to check Perform Volume Maintenance Tasks from this screen is a new feature. Granting this right is a best practice among SQL Server DBAs because it allows instant file initialization; previously this had to be configured outside of the installation wizard.

Server Config

 

New for SQL Server 2016

  • The ability to allow Perform Volume Maintenance tasks to be checked from the Server Configuration Screen.

 

  10. Next up is the Database Engine Configuration screen, and we are going to step through the tabs.

In the first tab you will want to add all the users that require sysadmin access to administer the server, using the Add Current User and Add buttons. I always use mixed mode authentication so I can still get into the server if Active Directory plays up, and I set a complex password for security. When you're done, click on the Data Directories tab at the top.

Database Config1

This is the screen where you set up all of your default directories for your databases.

Best practice states that we should put log files on separate disks from data files, as the two have different access patterns and perform better on separate disks. Backups should also be on their own disks where possible. Additionally, the OS should have its own drive separate from all SQL Server files. As this is a test server and I only have a C drive, I will leave them at the defaults. Click the TempDB tab when ready.

Database Config 2

 

New For SQL 2016

  • This is new in SQL Server 2016; previously, tempdb had to be configured after install. This screen allows us to create the tempdb data files. Best practice states there should be one file per logical core, up to a maximum of eight; as I have four cores in my machine, I created four files. You can also spread the files over more than one disk if needed. Once you're happy with your selections, click Next. (A rough T-SQL equivalent is shown below the screenshot.)

Database Config 3
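
For reference, here is a minimal, hedged T-SQL sketch of what this tab sets up; the file names, paths, sizes and growth values are assumptions and should be adjusted to your own drives and core count:

-- Example only: resize the primary tempdb file and add three more, one per logical core
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 1024MB, FILEGROWTH = 256MB);
ALTER DATABASE tempdb ADD FILE (NAME = tempdev2, FILENAME = 'C:\SQLData\tempdb2.ndf', SIZE = 1024MB, FILEGROWTH = 256MB);
ALTER DATABASE tempdb ADD FILE (NAME = tempdev3, FILENAME = 'C:\SQLData\tempdb3.ndf', SIZE = 1024MB, FILEGROWTH = 256MB);
ALTER DATABASE tempdb ADD FILE (NAME = tempdev4, FILENAME = 'C:\SQLData\tempdb4.ndf', SIZE = 1024MB, FILEGROWTH = 256MB);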

 

  11. The next screen is where you configure Analysis Services. I have configured mine in Multidimensional mode, adding myself as a sysadmin and setting directories using the same best practices as for the database engine. Look out for a further blog article on SQL Server Analysis Services.

SSAS

 

  12. Leave the default options for Reporting Services. Again, keep an eye out for another article on Reporting Services.
  13. On the next screen, configure the users allowed to use the Distributed Replay Controller.
  14. Give a name to the DRC.

DRC_Name

  15. On the next screen, accept the terms of the features being installed by clicking Accept, and then Next.
  16. Finally, click Install and that's it! SQL Server 2016 and all its new features are installed.

Final

 

Here is a great link on some of the new features available in SQL Server 2016

Categories: DBA Blogs

SQL Server 2016 : A New Security Feature – Always Encrypted

Tue, 2016-04-12 08:15

Security. This word is so important when it comes to data, and there is a reason why. Every business has its vital data, and this data has to be accessible only to those who are authorized. Back in 2009, I wrote an article on what measures to take when it comes to securing SQL Server.

There was a time when we had to use a third-party tool to encrypt the data inside SQL Server. Later, Microsoft introduced Transparent Data Encryption (TDE) with the release of SQL Server 2008. You may be wondering why it is so important to encrypt the data. Inside our database, the customer/application may have to enter and store sensitive information such as Social Security Numbers (SSNs) or financial/payment data, which should not be readable in plain text, even by a DBA. With this requirement, strong encryption and/or data masking comes into the picture.

With the launch of SQL Server 2016 Release Candidate 0 (RC0), Microsoft introduced two new features that are my personal favorites: 1) Always Encrypted and 2) Dynamic Data Masking. Today I am going to walk you through the Always Encrypted feature.

First and foremost, we need to have SQL Server 2016 RC0 installed so that we can test this feature. You can download RC0 here. Once you are ready with RC0, create a test database and a table with an encrypted column to store the sensitive data. There are a few prerequisites, which I will list here. If you want, you can use the sample schema from MS.

  1. Create a sample database
  2. Create Column Master Key
  3. Generate a self-signed certificate (well, you will need to install this certificate on the machine where the application will run)
  4. Configure Column Encryption Key
  5. Create a test table with Always Encrypted column
  6. Create an application to insert data into the sample table we created in the previous step

I have created a sample app and a demo script for reference, which you can download here. Basically, what we have to remember is that we cannot insert values into the Always Encrypted table directly; we need to use the tool/app, and the data will always be encrypted when it goes into the database. This ensures that an intruder cannot read the data, as it travels to the database in ciphertext form.
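
As a rough illustration of step 5 above (this sketch is mine, not part of the downloadable demo), a table with an Always Encrypted column might look like the following; the column encryption key name CEK_Auto1 is an assumption and must match the key configured in step 4:

CREATE TABLE dbo.Customers (
    CustomerId   INT IDENTITY(1,1) PRIMARY KEY,
    CustomerName NVARCHAR(100) NOT NULL,
    -- stored encrypted; only a client holding the column master key certificate can decrypt it
    SSN CHAR(11) COLLATE Latin1_General_BIN2
        ENCRYPTED WITH (
            COLUMN_ENCRYPTION_KEY = CEK_Auto1,
            ENCRYPTION_TYPE = DETERMINISTIC,
            ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'
        ) NOT NULL
);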

Here is some further reading on this topic. Enjoy reading and testing an excellent feature of SQL Server 2016 RC0.

Categories: DBA Blogs

SQL On The Edge #9 – Azure SQL Database Threat Detection

Mon, 2016-04-11 14:30

Despite being well documented for several years now, every now and then we still run into clients that have had bad experiences because of SQL injection attacks. If you're not familiar, a SQL injection attack happens when an attacker exploits a vulnerability in how an application passes queries and data to the database, inserting their own malicious SQL code to be executed. If you want to see different examples and get the full details, the Wikipedia page is very comprehensive.

Depending on how the application is configured, this kind of attack can go all the way from enabling attackers to see data they shouldn’t, to dropping an entire database if your application is allowed to do so. The fact that it’s an application based vulnerability also means that it really depends on proper coding and testing of all inputs in the application to prevent it. In other words, it can be very time-consuming to go back and plug all the holes if the application wasn’t securely built from the ground up.
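
To make the vulnerability concrete, here is a small hedged T-SQL sketch of my own (the table and parameter names are hypothetical): the first pattern concatenates user input into the statement, while the second passes it as a typed parameter so it can never be parsed as SQL.

-- Vulnerable pattern: input such as  ' OR 1=1 --  would return every row
DECLARE @UserInput NVARCHAR(50) = N'O''Brien';
DECLARE @UnsafeSql NVARCHAR(MAX) =
    N'SELECT * FROM dbo.Customers WHERE LastName = ''' + @UserInput + N'''';
EXEC (@UnsafeSql);

-- Safer pattern: the value is bound as a parameter, not concatenated
EXEC sp_executesql
    N'SELECT * FROM dbo.Customers WHERE LastName = @LastName',
    N'@LastName NVARCHAR(50)',
    @LastName = @UserInput;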

Built-in Threat Detection

To attack this issue, and as part of the ongoing security story of SQL Server, Microsoft has now invested in the feature called Database Threat Detection. When enabled, the service will automatically scan the audit records generated from the database and will flag any anomalies that it detects. There are many patterns of injections so it makes sense to have a machine be the one reading all the SQL and flagging them. MS is not disclosing the patterns or the algorithms in an effort to make working around the detection more difficult.

What about on-premises?

This feature right now is only available on Azure SQL Db. However, we all know that Azure SQL Db is basically the testing grounds for all major new features coming to the box product. I would not be surprised if the threat detection eventually makes it to the on-premises product as well (my speculation though, nothing announced about this).

Pre-requisites
For this new feature you will need Azure SQL Db, and you will also need to have auditing enabled on the database. The current implementation works by analyzing the audit records, so it's 100% reactive, nothing proactive. You will need a storage account as well, since that's where the audit logs get stored. The portal will walk you through this whole process; we'll see that in the demo video.

Current State
As I mentioned, right now this is more of a reactive tool, as it only lets you know after it has detected an anomaly. In the future, I would love to see a preventive configuration where one can specify a policy to completely prevent suspicious SQL from running. Sure, there can always be false alarms; however, if all the application query patterns are known, this number should be very low. If the database is open to ad-hoc querying, then a policy could allow only blocking the suspicious queries, or even shutting down the database after several different alerts have been generated. The more flexible the configuration, the better, but in the end what I want to see is a move from alerting me to preventing the injection to begin with.

In the demo, I’m going to go through enabling Azure SQL threat detection, some basic injection patterns and what the alerts look like. Let’s check it out!

 

Categories: DBA Blogs

Enter the Exadata X6

Mon, 2016-04-11 14:20

Data is exploding and Exadata is catching up. With the proliferation of cloud technology and in-memory databases, the Oracle Exadata X6-2 and X6-8 have it all. They seem to be an ideal platform for hyper-convergence for any data center running Oracle products.

Following are some of the salient features of Oracle X6:

  • The compute nodes have two 22-core Intel Xeon E5-2699 v4 processors.
  • The memory is 256 GB of DDR4, expandable to 768 GB.
  • The local storage can now be upgraded to 8 drives from the default of 4.
  • The cell servers have ten-core Intel Xeon E5-2630 v4 processors.
  • Flash cache has become massive, reaching up to 12.8 TB per storage server; a full rack has 179 TB of flash.
  • There is up to 1.7 PB of raw disk capacity per rack.
  • The InfiniBand network runs at 40 Gb/second. There is no change to InfiniBand in X6; in X5 it became active/active, but that's the only difference.
  • On the software side there are many improvements, but one thing which caught my eye is the feature that enables storage indexes to be moved along with the data when a disk hits a predictive or actual failure. This will surely improve performance by a long way.

With X6, Exadata has surely come a long way forward.

Categories: DBA Blogs

Installing Sample Databases To Get Started In Microsoft SQL Server

Fri, 2016-04-08 08:20

Anyone interested in getting started in SQL Server will need some databases to work with. This article hopes to help the new and future DBA/developer get started with a few databases.

There are several places to get sample databases, but anyone starting out in SQL Server should go to the Microsoft sample databases. The reason for this is that there are thousands of blogs/tutorials on the internet that use these databases as the basis for the tutorial.

The steps below detail how to get the sample databases and how to attach them to SQL Server to start working with them.

This blog assumes you have a version of SQL Server installed; if not, you can click here for a great tutorial.

These two databases are a great start to learning SQL Server from both a transactional and a data warehousing point of view.

  • Now that we have downloaded these two files, we will need to attach them one at a time. First, open SQL Server Management Studio and connect to your instance.
  • Expand the Object Explorer tree until you can right-click on the Databases folder, then click Attach…

ObjectExplorer

  • Click the Add button, then navigate to and select the .mdf file (this is the database file you downloaded).

AtatchDatabase1

  • There is one step that a lot of people getting started in SQL Server often miss. We have just attached a data file, but in order for SQL Server to bring the database online it also needs a log file, which we don't have. The trick is that if we remove the log file from the Attach Databases window, SQL Server will automatically create a log file for us. To do this, simply select the log file and click Remove.

LogRemove

  • Finally, when your window looks like the one below, simply click OK to attach the database.

FinalAtatch

  • Repeat the previous steps for the second database file and any others you wish to attach.
  • The databases are now online in SQL Server and ready to be used.

FinalObjectTree

And that’s it! You now have an OLTP database and a DW database for BI.
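
If you prefer to script the attach rather than use the GUI, a minimal hedged T-SQL sketch follows; the database name and file path are assumptions and should match wherever you saved the downloaded .mdf file.

-- Attach the data file and let SQL Server rebuild a new log file,
-- the scripted equivalent of removing the log entry in the Attach Databases dialog
CREATE DATABASE AdventureWorks2014
ON (FILENAME = 'C:\SQLData\AdventureWorks2014_Data.mdf')
FOR ATTACH_REBUILD_LOG;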

Below are links to some good starting tutorials and some additional databases.

Databases

CodePlex

Stack Overflow

Tutorials

W3Schools

Enjoy!

Categories: DBA Blogs

Oracle Live SQL: Explain Plan

Thu, 2016-04-07 12:14

We’ve all encountered a situation where you want to check a simple query or some syntax for your SQL and don’t have a database around. Of course, most of us have at least a virtual machine for that, but it takes time to fire it up, and if you’re working on battery, it can drain it pretty quickly. Some time ago, Oracle began to offer a new service called “Oracle Live SQL”. It provides the ability to test a SQL query, procedure or function, and has a code library containing a lot of examples and scripts. Additionally, you can store your own private scripts to re-execute them later. It’s a really great online tool, but it lacks some features. I tried to check the execution plan for my query but, unfortunately, it didn’t work:

explain plan for 
select * from test_tab_1 where pk_id<10;

ORA-02402: PLAN_TABLE not found

So, what can we do to make it work? The workaround is not perfect, but it works and can be used in some cases. We need to create our own plan table using the script $ORACLE_HOME/rdbms/admin/utlxplan.sql from an installed Oracle database home. We can open the file and copy the CREATE TABLE statement for the plan table into the SQL worksheet in Live SQL. You can also save the script in the Live SQL code library and make it private to reuse later, because you will need to recreate the table every time you log in to your environment again. So far so good. Is it enough? Let's check.

explain plan for 
select * from test_tab_1 where pk_id<10;

Statement processed.

select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
ERROR: an uncaught error in function display has happened; please contact Oracle support
       Please provide also a DMP file of the used plan table PLAN_TABLE
       ORA-00904: DBMS_XPLAN_TYPE_TABLE: invalid identifier


Ok, the package doesn't work. I tried to create the types in my schema, but that didn't help either. So dbms_xplan is not going to work for us, and we have to request the information directly from our plan table. It is maybe not as convenient, but it gives us enough and, don't forget, you can save your script and just reuse it later. You don't need to memorize the queries. Here is a simple example of how to get information about your last executed query from the plan table:

SELECT parent_id,id, operation,plan_id,operation,options,object_name,object_type,cardinality,cost from plan_table where plan_id in (select max(plan_id) from plan_table) order by 2;

PARENT_ID	ID	OPERATION	PLAN_ID	OPERATION	OPTIONS	OBJECT_NAME	OBJECT_TYPE	CARDINALITY	COST
 - 	0	SELECT STATEMENT	268	SELECT STATEMENT	 - 	 - 	 - 	9	49
0	1	TABLE ACCESS	268	TABLE ACCESS	FULL	TEST_TAB_1	TABLE	9	49

I tried a hierarchical query but didn't find it too useful in the Live SQL environment. Also, you may want to put a unique identifier on your query to find it more easily in the plan_table.

explain plan set statement_id='123qwerty' into plan_table for
select * from test_tab_1 where pk_id<10;

SELECT parent_id,id, operation,plan_id,operation,options,object_name,object_type,cardinality,cost from plan_table where statement_id='123qwerty' order by id;

PARENT_ID	ID	OPERATION	PLAN_ID	OPERATION	OPTIONS	OBJECT_NAME	OBJECT_TYPE	CARDINALITY	COST
 - 	0	SELECT STATEMENT	272	SELECT STATEMENT	 - 	 - 	 - 	9	3
0	1	TABLE ACCESS	272	TABLE ACCESS	BY INDEX ROWID BATCHED	TEST_TAB_1	TABLE	9	3
1	2	INDEX	272	INDEX	RANGE SCAN	TEST_TAB_1_PK	INDEX	9	2

Now I have my plan_table script and query saved in Live SQL and reuse them when I want to check the plan for a query. I posted feedback about the ability to use dbms_xplan, and an Oracle representative replied promptly and assured me they are already working on implementing the dbms_xplan feature and many others, including the ability to run only a selected SQL statement in the SQL worksheet (like we do in SQL Developer). It sounds really good and promising, and is going to make the service even better. Stay tuned.

Categories: DBA Blogs

Which Cassandra version should you use for production?

Thu, 2016-04-07 11:47
What version for a Production Cassandra Cluster?

tl;dr; Latest Cassandra 2.1.x

Long version:

A while ago, Eventbrite wrote:
“You should not deploy a Cassandra version X.Y.Z to production where Z <= 5.” (Full post).

And, in general, it is still valid today! Why “in general“? That post is old, and Cassandra has moved a lot since then. So today we can point to a different set of statements:

For those who don't want to follow the links and would still pick 3.x for production use, read this:

“Under normal conditions, we will NOT release 3.x.y stability releases for x > 0.  That is, we will have a traditional 3.0.y stability series, but the odd-numbered bugfix-only releases will fill that role for the tick-tock series — recognizing that occasionally we will need to be flexible enough to release an emergency fix in the case of a critical bug or security vulnerability.

We do recognize that it will take some time for tick-tock releases to deliver production-level stability, which is why we will continue to deliver 2.2.y and 3.0.y bugfix releases.  (But if we do demonstrate that tick-tock can deliver the stability we want, there will be no need for a 4.0.y bugfix series, only 4.x tick-tock.)”

What about end of life?

Well, it is about stability; there are still a lot of clusters out there running 1.x and 2.0.x. And since it is open source software, you can always search in the community or even contribute.

If you still have doubts about which version, you can always contact us!

Categories: DBA Blogs

Log Buffer #468: A Carnival of the Vanities for DBAs

Wed, 2016-04-06 14:38

This Log Buffer Edition rounds up Oracle, SQL Server, and MySQL blog posts of the week.

Oracle:

When using strings such as "FREQ=DAILY; BYDAY=MON,TUE,WED,THU,FRI; BYHOUR=9,10" within the scheduler, sometimes it's not readily apparent how this will translate to the actual dates and times of day that the scheduled activity will run. To help you understand, a nice little utility to use is EVALUATE_CALENDAR_STRING.
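
As a quick hedged sketch of the idea (mine, not from the linked post), DBMS_SCHEDULER.EVALUATE_CALENDAR_STRING can be called in a loop to print the next few run times for a calendar string:

-- print the next five run times for a scheduler calendar string
DECLARE
  l_start    TIMESTAMP WITH TIME ZONE := SYSTIMESTAMP;
  l_after    TIMESTAMP WITH TIME ZONE := SYSTIMESTAMP;
  l_next_run TIMESTAMP WITH TIME ZONE;
BEGIN
  FOR i IN 1 .. 5 LOOP
    DBMS_SCHEDULER.EVALUATE_CALENDAR_STRING(
      calendar_string   => 'FREQ=DAILY; BYDAY=MON,TUE,WED,THU,FRI; BYHOUR=9,10',
      start_date        => l_start,
      return_date_after => l_after,
      next_run_date     => l_next_run);
    DBMS_OUTPUT.PUT_LINE(TO_CHAR(l_next_run, 'YYYY-MM-DD HH24:MI:SS TZR'));
    l_after := l_next_run;
  END LOOP;
END;
/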

Most developers have struggled with wires in SOA composites. You may find yourself in a situation where a wire has been deleted. Some missing wires are restored by JDeveloper. Other missing wires have to be added manually, by simply re-connecting the involved adapters and components. Simple.

In-Memory Parallel Query and how it works in 12c.

Oracle recently launched a new family of offerings designed to enable organizations to easily move to the cloud and remove some of the biggest obstacles to cloud adoption. These first-of-a-kind services provide CIOs with new choices in where they deploy their enterprise software and a natural path to easily move business critical applications from on premises to the cloud.

Two Oracle Server X6-2 systems, using the Intel Xeon E5-2699 v4 processor, produced a world record x86 two-chip single application server SPECjEnterprise2010 benchmark result of 27,509.59 SPECjEnterprise2010 EjOPS. One Oracle Server X6-2 system ran the application tier and the second Oracle Server X6-2 system ran the database tier.

SQL Server:

To be able to make full use of the system catalog to find out more about a database, you need to be familiar with the metadata functions.

Powershell To Get Active Directory Users And Groups into SQL!

A code review is a serious business; an essential part of development. Whoever signs off on a code review agrees, essentially, that they would be able to support it in the future, should the original author of the code be unavailable to do it.

Change SQL Server Service Accounts with Powershell

Learn how to validate integer, string, file path, etc. input parameters in PowerShell as well as see how to test for invalid parameters.

MySQL:

The MySQL Utilities team has announced a new beta release of MySQL Utilities. This release includes a number of improvements for usability and stability, and a few enhancements.

In this webinar, we will discuss the practical aspects of migrating a database setup based on traditional asynchronous replication to multi-master Galera Cluster.

Docker has gained widespread popularity in recent years as a lightweight alternative to virtualization. It is ideal for building virtual development and testing environments. The solution is flexible and seamlessly integrates with popular CI tools.

How ProxySQL adds Failover and Query Control to your MySQL Replication Setup

Read-write split routing in MaxScale

Categories: DBA Blogs

Six things I learned about privacy from some of the world’s top IT pros

Wed, 2016-04-06 07:32

 

Everyone has their own experiences with data privacy. Ask almost anyone, and they’ll be able to tell you how a retail company found some innovative way of collecting their personal data, or how Facebook displayed a suggested post or ad that was uncannily targeted at them based on personal information they did not purposely share.

 

The discussion about privacy can get pretty interesting when you get a group of CIOs and IT leaders in the same room to talk about it.  And that’s just what we did at Pythian’s Velocity of Innovation (Velocity) events in New York, San Francisco and Sydney. With their ears constantly to the ground for the latest trends and unparalleled insider IT knowledge, there’s no better group with whom to talk security and privacy than our Velocity panelists and attendees.

 

At Pythian we manage the revenue-generating systems for some of the world’s leading enterprises, and work with our clients on some of the most difficult security and privacy issues. But staying on top of the evolving threats to data privacy is a constant challenge, even for the most knowledgeable security experts. One of our biggest challenges is helping our clients use data responsibly to improve customer experiences, while protecting individual privacy and safeguarding personally identifiable information from malicious threats. So there’s always more we can learn. This is one of the reasons we keep listening to our customers and peers through various channels and events like our Velocity discussions.

 

Privacy is a high stakes game for companies that want to maximize their use of client data while adhering to legislation designed to protect personal information. In many countries, new, more rigorous laws are emerging to enable individuals to maintain more control over their personal information, and to help them understand when they are being monitored or when their personal information is recorded, shared or sold. These laws include the recently  reformed  EU Data protection rules and PIPEDA in Canada. But companies are responding by finding clever ways of making it attractive for consumers to trade their data for services, or worse, by making it impossible to get service at all without giving up certain information.

With these issues in mind, we had some interesting discussions at each of our Velocity sessions. Here are six main privacy-related themes that emerged at these recent events:

 

  1. Privacy and security are not the same

Although they are related, they refer to different ideas. Privacy is a major goal of security, and relates to a consumer's right to safeguard their personal information. It can involve vulnerable data such as social media data, customer response data, demographic data or other personal information. In general, privacy is the individual's right to keep his or her data to himself or herself. By contrast, security refers to the protection of enterprise or government systems. Security may incorporate customer privacy as part of its agenda, but the two are not synonymous. According to Tim Sheedy, principal analyst with Forrester Research and a panelist at our Sydney event, security and privacy are different things. "But if you have a security breach, privacy becomes an issue," he said. "So they're closely linked."

 

  2. Balance is key

Our panelists agreed that you have to balance protecting customer data with moving the business forward. But if you're in the insurance business, like San Francisco Velocity panelist and RMS CTO Cory Isaacson, that's a tough balance to strike. He has to worry about how his company's data privacy measures up against industry standards and the regulations they have to meet. At the same time, he has to ensure that data security doesn't paralyze innovation in the business. He says you have to be innovative in moving the business forward, as well as keeping up with standards and regulations.

“If you listen to your ISO auditor you’re never going to get out of bed in the morning,” Isaacson said. “If you’re faced with securing highly sensitive customer data, you have to incorporate it into the way you work every day instead of letting it slow you down.”

 

  3. Don't be creepy

Companies have to walk a fine line between using personal data to enhance the customer experience and being intrusive for no good reason. Forrester's Sheedy warns that perception is everything. "Don't be creepy," he said. "When you're creepy, that's when privacy becomes an issue. Don't underestimate the ability of people to give away information for a discount or for a benefit. But the key is that there has to be a benefit. When there isn't an upside, it becomes creepy for customers, and you have to wonder if you've gone too far."

 

  4. People will share data in exchange for perks.

Our Sydney panelists and attendees exchanged ideas about new ways that companies are obtaining information about customers, behaviours and preferences. They talked about social media companies learning your behaviours by using more than just the data you share on your profile, including using tactics like tracking your location through GPS and triangulation between cell phone towers.

 

“When I visit Pythian’s head office in Ottawa, the nearby coffee shop has an app that asks if I want them to prepare my usual drink when it detects I’m nearby. Is that intrusive? No, because I opted in,” said Francisco Alvarez, vice president, APAC at Pythian.

 

A Marketing Magazine article reported that in a recent survey, a majority of consumers will share data for certain benefits: 80 percent said they would share data for rewards from a company, 79 percent would share data for cash back, and 77 percent would share data for coupons. The majority also said they would share personal data for location-based discounts (69 percent).

 

  5. Millennials have a higher tolerance for sharing data.

The Australian panelists talked about this extensively, citing personal experience. They expressed the same concern many of us feel about how much data their children's generation is sharing online and with companies. One of our attendees had a child who was denied a prestigious scholarship because his digital footprint wasn't favourable. But, Alvarez said, this type of consequence doesn't seem to deter millennials from freely sharing very personal information. And when it comes to sharing information with businesses in exchange for perceived benefits, their tolerance is very high.

 

In fact, according to a survey conducted by the USC Annenberg Center for Digital Future and Bovitz Inc., millennials (individuals between the ages of 18 and 35) have a different attitude than Internet users 35 years and older when it comes to sharing their personal data online with businesses.  According to the study results, millennials were more likely than older respondents (35 years and older) to trade some of their personal information in exchange for more relevant advertising.

 

  6. New approaches for protecting customer data are on the horizon

In our San Francisco discussion, Pythian Chief Data Officer Aaron Lee cited Google’s new initiative for protecting customer data.  The new approach involves placing corporate applications on the Internet itself and focusing its security efforts on registered devices and user authentication.
“It’s really simple idea,” Lee said.  “In most companies the internal network is special. If you’re on the internal network you get stuff you can’t get from outside just because you’re sitting there on the network. That’s not the case anymore. With Google’s Beyond Corp initiative it is like there are no more internal networks. If you want to talk to my service, you treat it like you came in from the internet just like everybody else. That way I can apply a consistent approach to authentication, authorization, all that good stuff. It’s actually acting to reduce the surface of complexity. It’s an approach that takes the simplest way that we can secure these things, and treats them the same regardless of where they actually live,” he said.

Categories: DBA Blogs

Redshift Table Maintenance: Vacuuming

Tue, 2016-04-05 12:38
Overview

Part of the appeal of AWS' Redshift is that it's a managed service, which means lower administration costs. While you don't have to hire a full-time DBA to make sure it runs smoothly (from Pythian's experience it takes ~10-20 hours/month to manage Redshift), there are still some tasks that need attention to keep it happy:

  • Vacuuming
  • Analyzing
  • Skew analysis
  • Compression analysis
  • Query monitoring

Let us start with Vacuuming as the first topic of a series of deeper dives into this list.

Vacuuming is an integral part of performance maintenance of Redshift.  Since deletes and updates both flag the old data, but don’t actually remove it, if we’re doing those kinds of actions, vacuuming is needed to reclaim that space. Updates and deletes can be pretty big performance hits (a simple update can easily take 60 secs on a 50 million record table on a small cluster, so we’re looking at 20 minutes for a similar update on a 1 billion record table), so we try to avoid them as much as we can on large tables. The space reclamation portion of the vacuum typically accounts for 10% of the time we see spent on the tables.  We can use the SORT ONLY parameter to skip this phase, but we generally have no compelling reason to.

In addition, if tables have sort keys and table loads have not been optimized to sort as they insert, then vacuums are needed to re-sort the data, which can be crucial for performance. While loads into empty tables automatically sort the data, subsequent loads are not sorted automatically. We have seen query times drop by 80% from the implementation of vacuuming, but of course the impact varies with table usage patterns.

The biggest problem we face with vacuuming is the time it takes. While vacuuming does not block reads or writes, it can slow them considerably as well as take significant resources from the cluster, and you can only vacuum one table at a time. Remember that resource utilization can be constrained through WLM queues. A typical pattern we see among clients is that a nightly ETL load will occur, then we will run vacuum and analyze processes, and finally open the cluster for daily reporting. The faster the vacuum process can finish, the sooner the reports can start flowing, so we generally allocate as many resources as we can.

Operations

Let us start with the process itself. It's simple enough, and you can get syntax documentation from AWS. There's not too much that's tricky with the syntax, and for most use cases

VACUUM myschema.mytablename;

will suffice.  Note that INTERLEAVED sort keys need the REINDEX parameter added for all re-indexing to occur.  You can discern which tables have this set up by using the query:

select schemaname, tablename
from (SELECT
n.nspname AS schemaname
,c.relname AS tablename
,min(attsortkeyord) min_sort FROM pg_namespace AS n
INNER JOIN pg_class AS c ON n.oid = c.relnamespace
INNER JOIN pg_attribute AS a ON c.oid = a.attrelid
WHERE c.relkind = 'r'
AND abs(a.attsortkeyord) > 0
AND a.attnum > 0
group by 1,2 )
where min_sort < 0

In order to give the vacuum process more resources, we preface this command with

SET wlm_query_slot_count TO <N>;

where N is the maximum number of query slots we think we can get away with. If you're not sure what that number should be (we'll discuss WLM queues in another post), 5 is usually a safe number, though be warned that if the value of wlm_query_slot_count is larger than the number of available slots for the service class, the vacuum command will fail.
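
Putting the pieces together, a hedged sketch of a typical vacuum session might look like this (the slot count of 5 is just the example value from above, and the ANALYZE afterwards reflects our usual practice rather than part of the VACUUM syntax):

SET wlm_query_slot_count TO 5;   -- temporarily grab more concurrency slots
VACUUM myschema.mytablename;
ANALYZE myschema.mytablename;    -- refresh optimizer statistics after the vacuum
SET wlm_query_slot_count TO 1;   -- return to the default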

Knowing when to vacuum is reasonably straight forward. Anytime after substantial inserts, updates, or deletes are made is always appropriate, but you can be more exacting by querying two tables:

select * from STL_ALERT_EVENT_LOG where Solution LIKE '%VACUUM command%'

and

select * from SVV_TABLE_INFO where unsorted > 8

The latter check works great for daily loads. We will often set the threshold at 8 (percent) immediately after the loads, then run another vacuum process in the evening with a lower threshold (4 percent) that addresses larger tables that take a fair amount of time to vacuum since we want to avoid that situation in the morning.

Last fall AWS built a nice tool to automate vacuums, Analyze & Vacuum Schema Utility, that incorporated these queries. It works quite well, and we recommend it to our clients as a simple way to set up this maintenance. However, note that it does not automatically add the REINDEX parameter for those tables with INTERLEAVED sortkeys. The code is all available, so it is easy enough to adjust to make more custom filtering of tables (on fact_* and dim_* for instance) within a schema.

AWS has built a very useful view, v_get_vacuum_details, (and a number of others that you should explore if you haven’t already) in their Redshift Utilities repository that you can use to gain some insight into how long the process took and what it did. None of the system tables for vacuuming keep any historical information which would be nice for tracking growing process times, but you can see them for a week in STL_QUERY which gets purged to a history of 7 days. I recommend creating a simple process to track the vacuum data:

First create the table:

create table vacuum_history sortkey (xid) as select * from v_get_vacuum_details where processing_seconds > 0;

Then set up a cron process to populate:

0 18 * * * psql -h myRScluster -U myUser -p5439 -c "INSERT INTO vacuum_history SELECT * FROM v_get_vacuum_details WHERE xid > (SELECT MAX(xid) FROM vacuum_history) AND processing_seconds > 0;" &> /var/log/vacuum_history.log

Once you start to see tables taking an inordinate amount of time to vacuum, some additional intervention may be appropriate. Our team recently ran into a sizable table (3 billion records) that had been taking 3 hours to vacuum daily. Some issue occurred where the table needed a partial reload of 2 billion rows. Once that finished, we ran a vacuum which kept going all afternoon. Checking SVV_VACUUM_PROGRESS we could see that it would take almost 30 hours to complete. Note that restarting a stopped vacuum does not mean the process will pick up where it left off. Since this would have impacted the daily load performance, we killed the vacuum with “cancel <pid>” using the pid pulled from

select pid, text from SVV_QUERY_INFLIGHT where text like '%Vacuum%'

We then ran a deep copy (created a new version of the table and ran a SELECT INTO) which took about 5 hours. The load into an empty table triggers the correct sorting, so a subsequent vacuum took only a few minutes to complete.

Just a note on killing long-running vacuums: it sometimes doesn't work, especially once it's in the initialize merge phase. We've found that continually issuing the cancel command while it's in the sort phase is effective, but the point is to be wary of vacuuming large tables for the first time. Vacuums on large, unsorted tables write temporary data to disk, so there is also the potential to run out of disk and freeze the cluster, so be sure to always check that up to 3x the table size of disk space is available.

There are a few simple strategies to prevent long running vacuums:

  • Load your data in SORTKEY order: The incoming data doesn’t have to be pre-ordered, just greater than existing data.
  • Vacuum often: A table with a small unsorted region vacuums faster than one with a large unsorted region.
  • If tables become too large to vacuum within a maintenance window, consider breaking them apart: We often see multi-billion record tables where the only data being queried is from the last month or two.
  • Deep copies can be a faster solution than vacuums.
Categories: DBA Blogs

Considering Total Cost of Incident in the Age of “What if?”

Tue, 2016-04-05 11:03

 

I’m currently in the process of moving houses and planning a wedding – busy, exciting, happy times. There are lots of positives – more yard space, more square footage, time with friends and family, cake…the list goes on. However, mixed in with these big, happy life changes is an underlying anxiety that something could go wrong. There’s always a chance that accidents, inclement weather or illness could ruin the party for everyone. It’s not pleasant to consider and I’d honestly rather sweep it all under the rug and go on with the merry-making (mmm cake!). But the voice at the back of my head won’t stop whispering what if?

 

It’s this voice that’s prompted me to take steps towards protecting my investments with liability insurance. Now, I’m not keen on laying out funds where I don’t have to, and I’m always the first to try and save a buck (coupons are my best friends!), but where any large investment is being made, it only makes sense for me to part with some of my hard coupon-saved dollars to protect that. I may never need it, true… but what if?

 

At Pythian, we spend a lot of our time exploring the world of what if? What if there’s an outage? What if there’s a breach? What will we lose? What will it cost? These are uncomfortable questions for a lot of organizations and especially uncomfortable when things are going well. If you haven’t had an outage or a data breach yet, it is easier to assume that this won’t happen. What if we invest in protection and that spend is wasted because nothing ever goes wrong? Naïve? Maybe. Comfortable? Oh yeah.

 

In a post-Snowden, post-Target, post-Home Depot world, it’s more important than ever for organizations to move out of the comfortable and really consider what the total impact of any one incident could be on their environment and take steps to insure themselves against it. Let’s spend some time being uncomfortable, considering more than just the Total Cost of Operations and think instead about the Total Cost of Incident.

Categories: DBA Blogs

Mind your rdbms/audit on 12c

Tue, 2016-04-05 09:17

Recently we ran into an interesting question from one of our clients. They were seeing messages of the following form in syslog:


"Aug 11 11:56:02 ***** kernel: EXT3-fs warning (device ******): ext3_dx_add_entry: Directory index full!"

I haven’t encountered this before, and did a bit of research. My initial suspicion ended up being correct, and it was due to too many files being created, somewhere in that file system. I had a look around, and eventually checked out the ORACLE_HOME of the ASM / Grid Infrastructure software, which is running version 12.1.0.2 on that host.

I snooped around using du -sh to check which directories or sub-directories might be the culprit, and the disk usage utility came to a halt after the "racg" directory. Next in line would be "rdbms". The bulb lit up somewhat brighter now. Entering the rdbms/audit directory, I issued the common command you would use to look at a directory's contents: "ls".

Five minutes later, there was still no output on my screen. Okay, we found the troublemaker. So we're now being faced with a directory that has potentially millions of files in it. Certainly we are all aware that "rm" isn't really able to cope with a situation like this. It would probably run for a couple of minutes until it's done parsing the directory index, and then yell "argument list too long" at us. Alternatively, we could use find, combined with -exec (bad idea), -delete, or even pipe into rm using xargs. Looking around a bit on our good ol' friend Google, I came across this very interesting blog post by Sarath Pillai.

I took his PERL one-liner, adjusted it a wee bit since I was curious how many files we actually got in there and ran it on a sandbox system with a directory with 88’000 files in it:


perl -e 'my $i=0;for(<*>){$i++;((stat)[9]<(unlink))} print "Files deleted: $i\n"'

It completed in 4.5 seconds. That’s pretty good. In Sarath’s tests he was able to delete half a million files in roughly a minute. Fair enough.

After getting the OK from the client, we ran it on the big beast. It took 10 minutes.


Files deleted: 9129797

9.1 million files. Now here comes the interesting bit. This system has been actively using 12.1.0.2 ASM since May 6th, 2015. That’s only 3 months. That translates to 3 million files per month. Is this really a desirable feature? Do we need to start running Hadoop just to be able to mine the data in there?

Looking at some of the files, it seems ASM is not only logging user interactions there, but also anything and everything done by any process that connects to ASM.

As I was writing this, I happened to take another peek at the directory.


[oracle@cc1v3 audit]$ ls -1 | wc -l
9134657

Remember those numbers from before? Three million a month? Double that.

I suspect this was due to the index being full, and Linux has now re-populated the index with the next batch. Until it ran full again.

A new syslog entry just created at the same time seems to confirm that theory:

Aug 12 00:09:11 ***** kernel: EXT3-fs warning (device ******): ext3_dx_add_entry: Directory index full!

After running the PERL one-liner again, we deleted another vast amount of files:


Files deleted: 9135386

It seems that the root cause is the time stamp added to the file names of the audit files that Oracle writes in 12.1. The file names are much more unique, which gives Oracle the opportunity to generate so many more of them. Where in previous versions, with an adequately sized file system, you'd probably be okay for a year or more, on 12.1.0.2 on an active database (and our big beast is very active) you have to schedule a job to remove them, and ensure it runs frequently (think 18+ million files in 3 months to put "frequently" into perspective).

Categories: DBA Blogs

Enabling Large Pages on Oracle Database 11g running on IBM AIX

Mon, 2016-04-04 17:30

To implement Large Pages on AIX, you first need to choose a large page size at the OS level.
On AIX you can have multiple page sizes: 4KB, 64KB, 16MB, and 16GB.

In this example we will be using a large page size of 16MB.

Steps for implementation:

1- Based on MOS Doc ID 372157.1, first you need to enable Large Pages at the OS level:

# vmo -r  -o lgpg_size=16777216 -o lgpg_regions=<Total number of pages>
# vmo -o lru_file_repage=0
# vmo -p -o v_pinshm=1
# lsuser -a capabilities oracle
# chuser capabilities=CAP_BYPASS_RAC_VMM,CAP_PROPAGATE oracle
# bosboot -a

This needs a server reboot

For complete instructions, please review MOS note ID 372157.1.

2- Setting parameters at instance level

On AIX databases you only need to set LOCK_SGA to TRUE:

alter system set lock_sga=TRUE scope=spfile;

Note: On AIX databases, the USE_LARGE_PAGES parameter has NO impact.
This parameter is only valid for databases running on Linux; its value, even if set to FALSE, will be ignored on AIX.

By default, when Large Pages are available on AIX they will be used by database instances regardless of the USE_LARGE_PAGES parameter value. You only need to set LOCK_SGA.

3- Restart the instance and confirm Large Pages are in use:

After setting lock_sga, the instance must be restarted.
As I explained above, when Large Pages are available at the OS level they will be used by the instance, but the key point here is how to confirm whether Large Pages are actually in use.

How to check if Large Pages are used by the Oracle instance:

For Oracle 11g running on AIX, no informational message is written to the alert log like the one we see in the alert log of databases running on Linux.

So for your database instance running on AIX, do NOT expect the following lines in the alert log:

****************** Large Pages Information *****************
Total Shared Global Region in Large Pages = xx MB (100%)
Large Pages used by this instance: xxx (xxx MB)
Large Pages unused system wide = x (xxx MB) (alloc incr 4096 KB)
Large Pages configured system wide = xxx (xxx MB)
Large Page size = 16 MB
***********************************************************

The only way you can make sure Large Pages are being used by the instance is to check memory usage at the OS level:

Consider that SGA_TARGET in your instance is 8G.
The total number of Large Pages (with a size of 16M) will be 8G/16M + 1, which is: 8589934592 / 16777216 + 1 = 513.
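
As a small hedged convenience (not part of the original note), the same arithmetic can be done in SQL against the instance; this assumes sga_target is explicitly set:

select ceil(to_number(value) / 16777216) + 1 as large_pages_16m
from v$parameter
where name = 'sga_target';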

Check the number of large 16M pages in use at OS level before starting your instance:

$ vmstat -P all

System configuration: mem=98304MB

pgsz            memory                           page
----- -------------------------- ------------------------------------
           siz      avm      fre    re    pi    po    fr     sr    cy
   4K  4420992  2926616   487687     0     0     5  1280   2756     0
  64K   582056   581916      754     0     0     0     0      0     0
  16M     2791       87     2704     0     0     0     0      0     0

In this example, the number of 16M pages in use before starting the instance is 87, out of a total of 2791 available pages.

We start the instance with SGA size of 8G:

SQL> startup
ORACLE instance started.

Total System Global Area 8551575552 bytes
Fixed Size                  2238616 bytes
Variable Size            2348812136 bytes
Database Buffers         6190792704 bytes
Redo Buffers                9732096 bytes
Database mounted.
Database opened.
SQL> show parameter sga_target

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
sga_target                           big integer 8G

SQL> show parameter lock_sga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
lock_sga                             boolean     TRUE

Then we check Large pages in use again :

$ vmstat -P all

System configuration: mem=98304MB

pgsz            memory                           page
----- -------------------------- ------------------------------------
           siz      avm      fre    re    pi    po    fr     sr    cy
   4K  4428160  2877041   420004     0     0     5  1279   2754     0
  64K   581608   530522    51695     0     0     0     0      0     0
  16M     2791      600     2191     0     0     0     0      0     0

As you can see, the total number of 16M pages in use is now 600, which is exactly 513 pages more than before the instance startup.
This proves that 16M pages are being used by our instance.

You can also check the memory usage of your instance by looking at one of the instance processes, like pmon:

$ ps -ef|grep pmon
  oracle 14024886 31392176   0 14:05:34  pts/0  0:00 grep pmon
  oracle 41681022        1   0   Mar 11      -  3:12 ora_pmon_KBS

Then check that the memory used by this process comes from 16M pages:

$ svmon -P 41681022

-------------------------------------------------------------------------------
     Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit Mthrd  16MB
41681022 oracle         2180412  2109504     1778  2158599      Y     N     Y

     PageSize                Inuse        Pin       Pgsp    Virtual
     s    4 KB               31820          0       1650       9975
     m   64 KB                2959        516          8       2961
     L   16 MB                 513        513          0        513

I hope this will be useful for you, and good luck.

Categories: DBA Blogs

More Fun With Oracle Timestamp Math

Mon, 2016-04-04 14:23
Timestamp Math

Several years ago I wrote an article on Oracle date math.
Amazingly, that article was still available online at the time of this writing.

Working With Oracle Dates

An update to that article is long overdue.
While date math with the DATE data type is fairly well known and straightforward, date math with Oracle TIMESTAMP data is less well known and somewhat more difficult.

Data Types and Functions

Let’s begin by enumerating the data types and functions that will be discussed

Datetime and Interval Data Types

Documentation for Datetime and Interval Data Types

  • Timestamp
  • Timestamp with Time Zone
  • Interval Day to Second
  • Interval Year to Month
Datetime Literals

Documentation for Datetime Literals

  • Date
  • Timestamp
  • Timestamp with Time Zone
  • Timestamp with Local Time Zone
Interval Literals

Documentation for Interval Literals

  • Interval Day to Second
  • Interval Year to Month
Datetime/Interval Arithmetic

Documentation for Datetime/Interval Arithmetic

There is no link to that heading; just scroll down the page until you find it.

Timestamp Functions

Documentation for Datetime Functions

There are quite a few of these available. Most readers will already be familiar with many of them, and so only some of the more interesting functions related to timestamps will be covered; a short usage example follows the list below.

  • extract
  • to_dsinterval
  • to_yminterval
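
As a quick hedged sketch (mine, not from the original article), here is how these three functions might be combined; the interval literals are arbitrary example values:

-- extract a component from a timestamp, and build intervals from strings
select
   extract(hour from systimestamp)           as current_hour,
   systimestamp + to_dsinterval('PT2H30M')   as plus_2h_30m,
   systimestamp + to_yminterval('01-02')     as plus_1y_2m
from dual;
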
Timestamp Internals

It is always interesting to have some idea of how different bits of technology work. In Working With Oracle Dates I showed how Date values are stored in the database, as well as how a Date stored in the database is somewhat different than a date variable.

Let’s start by storing some data in a timestamp column and comparing how it differs from systimestamp.

Test table for Timestamp Math Blog:

col c1_dump format a70
col c1 format a35
col funcname format a15

set linesize 200 trimspool on
set pagesize 60

drop table timestamp_test purge;

create table timestamp_test (
c1 timestamp
)
/

insert into timestamp_test values(systimestamp );
insert into timestamp_test values(systimestamp - 366);
commit;

select
'timestamp' funcname, c1, dump(c1) c1_dump
from timestamp_test
union all
select
'systimestamp' funcname, systimestamp, dump(systimestamp) systimestamp_dump
from dual
/

FUNCNAME        C1                                  C1_DUMP
--------------- ----------------------------------- ----------------------------------------------------------------------
timestamp       26-MAR-16 03.09.27.649491 PM -07:00 Typ=180 Len=11: 120,116,3,26,16,10,28,38,182,114,56
timestamp       26-MAR-15 03.09.27.000000 PM -07:00 Typ=180 Len=7: 120,115,3,26,16,10,28
systimestamp    26-MAR-16 03.09.27.687416 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,9,27,0,192,34,249,40,252,0,5,0,0,0,0,0

3 rows selected.

One of the first things you might notice is that the value for Typ is 180 for TIMESTAMP columns, but for SYSTIMESTAMP Typ=188.
The difference is due to TIMESTAMP being an internal data type as stored in the database, while SYSTIMESTAMP is dependent on the compiler used to create the executables.

Another difference is the length; the first TIMESTAMP value has a length of 11, whereas the SYSTIMESTAMP value's length is 20. And what about that second TIMESTAMP row? Why is its length only 7?

TIMESTAMP with length of 7

An example will show why the second row inserted into TIMESTAMP_TEST has a length of only 7.

  1* select dump(systimestamp) t1, dump(systimestamp-1) t2, dump(sysdate) t3 from dual
15:34:32 ora12c102rac01.jks.com - jkstill@js122a1 SQL- /

T1                                       T2                                       T3
---------------------------------------- ---------------------------------------- ----------------------------------------
Typ=188 Len=20: 224,7,3,22,22,34,35,0,16 Typ=13 Len=8: 224,7,3,21,18,34,35,0      Typ=13 Len=8: 224,7,3,22,18,34,35,0
0,181,162,17,252,0,5,0,0,0,0,0


1 row selected.

T2 was implicitly converted to the same data type as SYSDATE because standard date math was performed on it.

The same thing happened when the second row was inserted into TIMESTAMP_TEST.

Oracle implicitly converted the data to a DATE data type and then implicitly converted it back to a TIMESTAMP; only the standard DATE information survived the first conversion.

You may have noticed that in this example the length of the data is 8, while the value stored in the table has a length of 7. This is because SYSDATE uses an external data type, whereas DATE data stored in the database uses an internal data type that always has a length of 7.
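To see that difference directly, here is a minimal sketch (the table and column names are made up for illustration); a DATE value stored in a table dumps as the internal type (Typ=12, Len=7), while SYSDATE dumps as the external type (Typ=13, Len=8) seen earlier:

create table date_len_test (d1 date);

insert into date_len_test values (sysdate);
commit;

-- compare the internal (stored) and external (SYSDATE) representations
select dump(d1) stored_date, dump(sysdate) external_date from date_len_test;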

SYSTIMESTAMP Byte Values

Let’s see if we can determine how each byte is used in a SYSTIMESTAMP value

The following SQL uses the current time as a baseline and works backward through the previous 10 seconds, one second at a time, showing each timestamp value and its internal representation.

 

col t1 format a35
col t2 format a38
col dump_t1 format a70
col dump_t2 format a70

set linesize 250 trimspool on

/*
 using to_dsinterval() allows performing timestamp math without implicit conversions

see https://en.wikipedia.org/wiki/ISO_8601
 for an explanation of the PTnS notation being used in to_dsinterval()

*/

alter session set nls_date_format = 'yyyy-mm-dd hh24:mi:ss';

-- subtract 1 second from the current date
-- do it 10 times
select
 --systimestamp t1,
 --dump(systimestamp) dump_t1,
 systimestamp - to_dsinterval('PT' || to_char(level) || 'S') t2,
 dump(systimestamp - to_dsinterval('PT' || to_char(level) || 'S')) dump_t2
from dual connect by level <= 10
order by level desc
/


T2                                     DUMP_T2
-------------------------------------- ----------------------------------------------------------------------
26-MAR-16 03.34.55.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,34,55,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.34.56.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,34,56,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.34.57.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,34,57,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.34.58.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,34,58,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.34.59.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,34,59,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.35.00.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,35,0,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.35.01.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,35,1,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.35.02.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,35,2,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.35.03.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,35,3,0,152,108,205,20,252,0,5,0,0,0,0,0
26-MAR-16 03.35.04.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,35,4,0,152,108,205,20,252,0,5,0,0,0,0,0

10 rows selected.

 

From the output we can see that seconds are numbered 0-59 and that the 7th byte of the internal format is where the second is stored. We can also see that the month is represented by the 3rd byte and the day by the 4th byte.

One would then logically expect the 5th byte to show us the hour. Glancing at the actual time of 3:00 PM, it seems curious that the value we expect to be the hour is 19 rather than 15.

The server where these queries are being run has a time zone of EDT. Next I ran the same queries on a server with a TZ of PDT, and though the time in the timestamp appeared 3 hours earlier, the value stored in the 5th byte was still 19. Oracle is storing the hour in UTC and then using the TZ from the server to derive the local time.
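One quick way to confirm this from SQL is to compare SYSTIMESTAMP with its UTC equivalent; this minimal sketch uses the built-in SYS_EXTRACT_UTC function, and the hour shown in utc_ts should match the 5th byte of the dump:

col local_ts format a38
col utc_ts format a30

-- the hour in utc_ts corresponds to the 5th byte of the SYSTIMESTAMP dump
select
   systimestamp                  local_ts
 , sys_extract_utc(systimestamp) utc_ts
 , dump(systimestamp)            dump_ts
from dual;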

Playing with Time Zones

We can modify the local session time zone to find out how Oracle is calculating the times.

The first attempt is made on the remote client where scripts for this article are developed. The TZ will be set for Ethiopia and then the time checked at the Linux command line and in Oracle.

 

# date
Sat Mar 26 13:16:12 PDT 2016

# TZ='Africa/Addis_Ababa'; export TZ

# date
Sat Mar 26 23:16:17 EAT 2016

# sqlplus jkstill/XXX@p1
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
SQL- !date
Sat Mar 26 23:16:40 EAT 2016

SQL-  l
 1 select
 2 systimestamp t1,
 3 dump(systimestamp) dump_t1
 4* from dual
23:16:50 ora12c102rac01.jks.com - jkstill@js122a1 
SQL- /
T1                                  DUMP_T1
----------------------------------- ----------------------------------------------------------------------
26-MAR-16 04.16.56.254769 PM -04:00 Typ=188 Len=20: 224,7,3,26,20,16,56,0,104,119,47,15,252,0,5,0,0,0,0,0

Setting the TZ on the client clearly has no effect on the time returned from Oracle. Now let’s try while logged on to the database server.

$ date
Sat Mar 26 16:20:23 EDT 2016

$ TZ='Africa/Addis_Ababa'; export TZ

$ date
Sat Mar 26 23:20:38 EAT 2016

$ sqlplus / as sysdba
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

SQL- l
 1 select
 2 systimestamp t1,
 3 dump(systimestamp) dump_t1
 4* from dual
SQL- /

T1                                  DUMP_T1
----------------------------------- ----------------------------------------------------------------------
26-MAR-16 11.22.48.473298 PM +03:00 Typ=188 Len=20: 224,7,3,26,20,22,48,0,80,244,53,28,3,0,5,0,0,0,0,0

 

This experiment has demonstrated two things for us:

  1. Oracle is storing the hour as UTC time
  2. Setting the TZ on the client does not have any effect on the calculation of the time.
What About the Year?

Given the location of the month, you would expect to find the year in the byte just before the month byte. There is not just one byte before the month, though, but two. You will recall that SYSTIMESTAMP has a different internal representation than the TIMESTAMP data type; Oracle uses both of these bytes to store the year.

Working with this timestamp from an earlier example, we can use the information in Oracle Support Note 69028.1 to see how this works.

 

T2                                     DUMP_T2
-------------------------------------- ----------------------------------------------------------------------
26-MAR-16 03.34.55.349007000 PM -04:00 Typ=188 Len=20: 224,7,3,26,19,34,55,0,152,108,205,20,252,0,5,0,0,0,0,0

 

For the timestamp of March 26 2016, the first two bytes of the timestamp are used to represent the year.

The Formula for AD dates is Byte 1 + ( Byte 2 * 256 ). Using this formula the year 2016 can be arrived at:

224 + ( 7 * 256) = 2016

For the TIMESTAMP data type, the format is somewhat different for the year; actually it works the same way it does for the DATE data type, in excess 100 notation.

 

SQL- l
1* select c1, dump(c1) dump_c1 from timestamp_test where rownum < 2
SQL- /

C1 DUMP_C1
------------------------------ ---------------------------------------------------
26-MAR-16 03.09.27.649491 PM Typ=180 Len=11: 120,116,3,26,16,10,28,38,182,114,56

 

Here the first byte stores the century in excess 100 notation (120 - 100 = 20) and the second byte stores the year within that century (116 - 100 = 16), giving 2016.
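As a quick arithmetic check against the two dumps shown above, this minimal sketch evaluates both year encodings (nothing here is Oracle-specific beyond DUAL):

-- SYSTIMESTAMP (Typ=188): year = byte1 + (byte2 * 256)
select 224 + (7 * 256) systimestamp_year from dual;

-- TIMESTAMP (Typ=180): excess 100 century and year bytes
select (120 - 100) * 100 + (116 - 100) timestamp_year from dual;

Both queries return 2016.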

Decode All Timestamp Components

Now let’s decode all of the data in a TIMESTAMP. First we need some TIMESTAMP test data.

Creating the test data

The following SQL will provide some test data for following experiments.

We may not use all of the columns or rows, but they are available.

 

drop table timestamp_tz_test purge;

create table timestamp_tz_test (
 id integer,
 c1 timestamp,
 c2 timestamp with time zone,
 c3 timestamp with local time zone
)
/

-- create 10 rows each on second apart

begin
for i in 1..10
loop
 insert into timestamp_tz_test values(i,systimestamp,systimestamp,systimestamp );
 dbms_lock.sleep(1);
 null;
end loop;
commit;
end;
/

 

We already know that TIMESTAMP data can store fractional seconds to a billionth of a second.

Should you want to prove that to yourself, the following bit of SQL can be used to insert TIMESTAMP data into a table, with each row being 1E-9 seconds later than the previous row. This will be left as an exercise for the reader.

 

create table t2 as
select level id,
 to_timestamp('2016-03-29 14:25:42.5' || lpad(to_char(level),8,'0'),'yyyy-mm-dd hh24.mi.ssxff') c1
from dual
connect by level <= 1000
/

col dump_t1 format a70
col c1 format a35
col id format 99999999

select id, c1, substr(dump(c1),instr(dump(c1),',',-1,4)+1) dump_t1
from t2
order by id
/

 

Oracle uses the final 4 bytes of a timestamp to store the fractional seconds as nanoseconds, treating those bytes as a base-256 number: the first of the four bytes is multiplied by 256^3, the next by 256^2, the next by 256, and the last byte is taken at face value.

The following SQL will make this clearer. Don’t spend too much time at first trying to understand the SQL, as it will become clearer after you see the results.

SQL to decode 1 row of TIMESTAMP data.

 

col id format 99
col t1 format a35
col dumpdata format a50
col tz_type format a10
col ts_component format a40
col label format a6
col real_value format a50

set linesize 200 trimspool on

alter session set nls_timestamp_format = 'yyyy-mm-dd hh24.mi.ssxff';
alter session set nls_timestamp_tz_format = 'yyyy-mm-dd hh24.mi.ssxff tzr';


with rawdata as (
 select c2 t1, dump(c2) dump_t1
 from timestamp_tz_test
 where id = 1
),
datedump as (
 select t1,
 substr(dump_t1,instr(dump_t1,' ',1,2)+1) dumpdata
 from rawdata
),
-- regex from http://nuijten.blogspot.com/2009/07/splitting-comma-delimited-string-regexp.html
datebits as (
 select level id, regexp_substr (dumpdata, '[^,]+', 1, rownum) ts_component
 from datedump
 connect by level <= length (regexp_replace (dumpdata, '[^,]+')) + 1
),
labeldata as (
 select 'TS,DU,CC,YY,MM,DD,HH,MI,SS,P1,P2,P3,P4' rawlabel from dual
),
labels as (
 select level-2 id, regexp_substr (rawlabel, '[^,]+', 1, rownum) label
 from labeldata
 connect by level <= length (regexp_replace (rawlabel, '[^,]+')) + 1
),
data as (
 select db.id, db.ts_component
 from datebits db
 union
 select 0, dumpdata
 from datedump dd
 union select -1, to_char(t1) from rawdata
)
select d.id, l.label, d.ts_component,
 case l.label
 when 'DU' then d.ts_component
 when 'CC' then 'Excess 100 - Real Value: ' || to_char(to_number((d.ts_component - 100)*100 ))
 when 'YY' then 'Excess 100 - Real Value: ' || to_char(to_number(d.ts_component - 100 ))
 when 'MM' then 'Real Value: ' || d.ts_component
 when 'DD' then 'Real Value: ' || d.ts_component
 when 'HH' then 'Excess 1 - Real Value: ' || to_char(to_number(d.ts_component)-1)
 when 'MI' then 'Excess 1 - Real Value: ' || to_char(to_number(d.ts_component)-1)
 when 'SS' then 'Excess 1 - Real Value: ' || to_char(to_number(d.ts_component)-1)
 when 'P1' then 'Fractional Second P1 : ' || to_char((to_number(d.ts_component) * POWER(256,3) ) / power(10,9))
 when 'P2' then 'Fractional Second P2 : ' || to_char((to_number(d.ts_component) * POWER(256,2) ) / power(10,9))
 when 'P3' then 'Fractional Second P3 : ' || to_char((to_number(d.ts_component) * 256 ) / power(10,9))
 when 'P4' then 'Fractional Second P4 : ' || to_char((to_number(d.ts_component) + 256 ) / power(10,9))
 end real_value
from data d
join labels l on l.id = d.id
order by 1
/

When the values for the Pn fractional second columns are added up, they will be equal to the (rounded) value shown in the timestamp.

 

 ID LABEL  TS_COMPONENT                             REAL_VALUE
--- ------ ---------------------------------------- --------------------------------------------------
 -1 TS     2016-03-31 09.14.29.488265 -07:00
  0 DU     120,116,3,31,17,15,30,29,26,85,40,13,60  120,116,3,31,17,15,30,29,26,85,40,13,60
  1 CC     120                                      Excess 100 - Real Value: 2000
  2 YY     116                                      Excess 100 - Real Value: 16
  3 MM     3                                        Real Value: 3
  4 DD     31                                       Real Value: 31
  5 HH     17                                       Excess 1 - Real Value: 16
  6 MI     15                                       Excess 1 - Real Value: 14
  7 SS     30                                       Excess 1 - Real Value: 29
  8 P1     29                                       Fractional Second P1 : .486539264
  9 P2     26                                       Fractional Second P2 : .001703936
 10 P3     85                                       Fractional Second P3 : .00002176
 11 P4     40                                       Fractional Second P4 : .000000296

13 rows selected.
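As a cross-check, this minimal sketch treats the four fraction bytes from the dump above (29, 26, 85 and 40) as a base-256 count of nanoseconds:

-- decode the four fraction bytes into seconds
select (29 * power(256,3) + 26 * power(256,2) + 85 * 256 + 40) / power(10,9) fractional_seconds
from dual;

The result is .488265, matching the fractional seconds displayed in the TS row above.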

Time zones are recorded in an additional two bytes in the TIMESTAMP WITH TIME ZONE and TIMESTAMP WITH LOCAL TIME ZONE data types.

Decoding those two bytes is left as an exercise for the reader.

Timestamp Arithmetic

Now that we have had some fun exploring and understanding how Oracle stores TIMESTAMP data, it is time to see how calculations can be performed on timestamps.

Note: See this ISO 8601 Article to understand the notation being used in to_dsinterval().

Interval Day to Second

It is a common occurrence to add or subtract time to or from Oracle Dates.

How that is done with the Oracle DATE data type is fairly well known.

  • Add 1 Day
    • DATE + 1
  • Add 1 Hour
    • DATE + (1/24)
  • Add 1 Minute
    • DATE + ( 1 / 1440)
  • Add 1 Second
    • DATE + (1/86400)

Following is a brief refresher on that topic:

 

alter session set nls_date_format = 'yyyy-mm-dd hh24:mi:ss';

select sysdate today, sysdate -1 yesterday from dual;

select sysdate now, sysdate - (30/86400) "30_Seconds_Ago" from dual;

select sysdate now, sysdate + ( 1/24 ) + ( 15/1440 ) + ( 42/86400) "1:15:42_Later" from dual;

SQL- @date-calc

Session altered.

TODAY YESTERDAY
------------------- -------------------
2016-03-30 13:39:06 2016-03-29 13:39:06
NOW 30_Seconds_Ago
------------------- -------------------
2016-03-30 13:39:06 2016-03-30 13:38:36

NOW 1:15:42_Later
------------------- -------------------
2016-03-30 13:39:06 2016-03-30 14:54:48

 

While this same method will work with timestamps, the results may not be what you expect. As noted earlier, Oracle will perform an implicit conversion to a DATE data type, resulting in truncation of some timestamp data. The next example makes it clear that implicit conversions have converted the TIMESTAMP to a DATE data type.

alter session set nls_timestamp_format = 'yyyy-mm-dd hh24.mi.ssxff';
alter session set nls_date_format = 'DD-MON-YY';

select systimestamp today, systimestamp -1 yesterday from dual;

select systimestamp now, systimestamp - (30/86400) "30_Seconds_Ago" from dual;

select systimestamp now, systimestamp + ( 1/24 ) + ( 15/1440 ) + ( 42/86400) "1:15:42_Later" from dual;

SQL- @timestamp-calc-incorrect

TODAY                                                                       YESTERDAY
--------------------------------------------------------------------------- ---------
2016-03-31 11.35.29.591223 -04:00                                           30-MAR-16

NOW                                                                         30_Second
--------------------------------------------------------------------------- ---------
2016-03-31 11.35.29.592304 -04:00                                           31-MAR-16

NOW                                                                         1:15:42_L
--------------------------------------------------------------------------- ---------
2016-03-31 11.35.29.592996 -04:00                                           31-MAR-16

 

Oracle supplies functions to properly perform calculations on timestamps. The previous example works correctly when to_dsinterval is used, as seen in the next example.

 

col c30 head '30_Seconds_Ago' format a38
col clater head '1:15:42_Later' format a38
col now format a35
col today format a35
col yesterday format a38

alter session set nls_timestamp_format = 'yyyy-mm-dd hh24.mi.ssxff';
alter session set nls_timestamp_tz_format = 'yyyy-mm-dd hh24.mi.ssxff tzr';


-- alternate methods to subtract 1 day
select systimestamp today, systimestamp - to_dsinterval('P1D') yesterday from dual;
select systimestamp today, systimestamp - to_dsinterval('1 00:00:00') yesterday from dual;

-- alternate methods to subtract 30 seconds
select systimestamp now, systimestamp - to_dsinterval('PT30S') c30 from dual;
select systimestamp now, systimestamp - to_dsinterval('0 00:00:30') c30 from dual;

-- alternate methods to add 1 hour, 15 minutes and 42 seconds
select systimestamp now, systimestamp + to_dsinterval('PT1H15M42S') clater from dual;
select systimestamp now, systimestamp + to_dsinterval('0 01:15:42') clater from dual;

TODAY                               YESTERDAY
----------------------------------- --------------------------------------
2016-03-30 18.10.41.613813 -04:00 2016-03-29 18.10.41.613813000 -04:00

TODAY                               YESTERDAY
----------------------------------- --------------------------------------
2016-03-30 18.10.41.614480 -04:00 2016-03-29 18.10.41.614480000 -04:00

NOW                                 30_Seconds_Ago
----------------------------------- --------------------------------------
2016-03-30 18.10.41.615267 -04:00 2016-03-30 18.10.11.615267000 -04:00

NOW                                 30_Seconds_Ago
----------------------------------- --------------------------------------
2016-03-30 18.10.41.615820 -04:00 2016-03-30 18.10.11.615820000 -04:00

NOW                                 1:15:42_Later
----------------------------------- --------------------------------------
2016-03-30 18.10.41.616538 -04:00 2016-03-30 19.26.23.616538000 -04:00

NOW                                 1:15:42_Later
----------------------------------- --------------------------------------
2016-03-30 18.10.41.617161 -04:00 2016-03-30 19.26.23.617161000 -04:00

 

Extract Values from Timestamps

The values for years, months, days, hours, minutes and seconds can all be extracted from a timestamp via the extract function. The following code demonstrates a few uses of this, along with examples of deriving intervals from two dates.

The values in parentheses for the day() and year() intervals specify the numeric precision to be returned.

 

def nls_tf='yyyy-mm-dd hh24.mi.ssxff'

alter session set nls_timestamp_format = '&nls_tf';

col d1_day format 999999
col full_interval format a30
col year_month_interval format a10

with dates as (
   select
      to_timestamp_tz('2014-06-19 14:24:29.373872', '&nls_tf') d1
      , to_timestamp_tz('2016-03-31 09:42:16.8734921', '&nls_tf') d2
   from dual
)
select
   extract(day from d1) d1_day
   , ( d2 - d1) day(4) to second full_interval
   , ( d2 - d1) year(3) to month year_month_interval
   , extract( day from d2 - d1) days_diff
   , extract( hour from d2 - d1) hours_diff
   , extract( minute from d2 - d1) minutes_diff
   , extract( second from d2 - d1) seconds_diff
from dates
/


 D1_DAY FULL_INTERVAL                  YEAR_MONTH  DAYS_DIFF HOURS_DIFF MINUTES_DIFF SECONDS_DIFF
------- ------------------------------ ---------- ---------- ---------- ------------ ------------
     19 +0650 19:17:47.499620          +001-09           650         19           17   47.4996201

Building on that, the following example demonstrates how the interval value that represents the difference between dates d1 and d2 can be added back to d1 to yield a timestamp with the same value as d2.

 

def nls_tf='yyyy-mm-dd hh24.mi.ssxff'

alter session set nls_timestamp_format = '&nls_tf';

col d1 format a30
col d2 format a30
col full_interval format a30
col calc_date format a30

with dates as (
   select
      to_timestamp('2014-06-19 14:24:29.373872', '&nls_tf') d1
      , to_timestamp('2016-03-31 09:42:16.873492', '&nls_tf') d2
   from dual
)
select
   d1,d2
   , ( d2 - d1) day(4) to second  full_interval
   , d1 + ( d2 - d1) day(4) to second calc_date
from dates
/


D1                             D2                             FULL_INTERVAL                  CALC_DATE
------------------------------ ------------------------------ ------------------------------ ------------------------------
2014-06-19 14.24.29.373872000  2016-03-31 09.42.16.873492000  +0650 19:17:47.499620          2016-03-31 09.42.16.873492000

 

PL/SQL Interval Data Types

 

The ISO 8601 Article previously mentioned will be useful for understanding how time durations may be specified with interval functions.

The following combination of SQL and PL/SQL is used to convert the difference between two timestamps into seconds. The code is incomplete in the sense that it assumes the largest component of the INTERVAL is hours. In the use case for this code that is true; however, for larger intervals there could also be days, months and years to account for.

The following code is sampled from the script ash-waits-use.sql and uses PL/SQL to demonstrate the use of the INTERVAL DAY TO SECOND data type in PL/SQL.

 

var v_wall_seconds number
col wall_seconds new_value wall_seconds noprint

declare
	ash_interval interval day to second;
begin

	select max(sample_time) - min(sample_time) into ash_interval from v$active_session_history;


	select
		max(sample_time) - min(sample_time) into ash_interval
	from v$active_session_history
	where sample_time 
	between
		decode('&&snap_begin_time',
			'BEGIN',
			to_timestamp('1900-01-01 00:01','yyyy-mm-dd hh24:mi'),
			to_timestamp('&&snap_begin_time','yyyy-mm-dd hh24:mi')
		)
		AND
		decode('&&snap_end_time',
			'END',
			to_timestamp('4000-12-31 23:59','yyyy-mm-dd hh24:mi'),
			to_timestamp('&&snap_end_time','yyyy-mm-dd hh24:mi')
		);

	:v_wall_seconds := 
		(extract(hour from ash_interval) * 3600 )
		+ (extract(minute from ash_interval) * 60 )
		+ extract(second from ash_interval) ;
end;
/


select round(:v_wall_seconds,0) wall_seconds from dual;
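If the interval could span more than a day, the day component needs to be folded in as well. Here is a minimal sketch in plain SQL, using a made-up interval literal rather than ASH data:

with iv as (
	select interval '2 03:04:05' day to second ash_interval from dual
)
select
	  extract(day    from ash_interval) * 86400
	+ extract(hour   from ash_interval) * 3600
	+ extract(minute from ash_interval) * 60
	+ extract(second from ash_interval)  total_seconds
from iv;

This returns 183845, that is, 2*86400 + 3*3600 + 4*60 + 5 seconds.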

 

Similarly, the to_yminterval function is used to perform timestamp calculations with years and months.


col clater head 'LATER' format a38
col now format a35
col today format a35
col lastyear format a38
col nextyear format a38

alter session set nls_timestamp_format = 'yyyy-mm-dd hh24.mi.ssxff';
alter session set nls_timestamp_tz_format = 'yyyy-mm-dd hh24.mi.ssxff tzr';

-- alternate methods to add 1 year
select systimestamp today, systimestamp + to_yminterval('P1Y') nextyear from dual;
select systimestamp today, systimestamp + to_yminterval('01-00') nextyear from dual;


-- alternate methods to subtract 2 months
select systimestamp now, systimestamp - to_yminterval('P2M') lastyear from dual;
select systimestamp now, systimestamp - to_yminterval('00-02') lastyear from dual;

-- alternate methods to add 2 year, 4 months, 6 days ,1 hour, 15 minutes and 42 seconds
select systimestamp now, systimestamp + to_yminterval('P2Y4M')  + to_dsinterval('P2DT1H15M42S') clater from dual;
select systimestamp now, systimestamp + to_yminterval('02-04')  + to_dsinterval('2 01:15:42') clater from dual;

TODAY                               YESTERDAY
----------------------------------- --------------------------------------
2016-03-31 09.06.22.060051 -07:00   2016-03-30 09.06.22.060051000 -07:00

TODAY                               YESTERDAY
----------------------------------- --------------------------------------
2016-03-31 09.06.22.061786 -07:00   2016-03-30 09.06.22.061786000 -07:00


NOW                                 30_Seconds_Ago
----------------------------------- --------------------------------------
2016-03-31 09.06.22.063641 -07:00   2016-03-31 09.05.52.063641000 -07:00


NOW                                 30_Seconds_Ago
----------------------------------- --------------------------------------
2016-03-31 09.06.22.064974 -07:00   2016-03-31 09.05.52.064974000 -07:00

NOW                                 1:15:42_Later
----------------------------------- --------------------------------------
2016-03-31 09.06.22.066259 -07:00   2016-03-31 10.22.04.066259000 -07:00


NOW                                 1:15:42_Later
----------------------------------- --------------------------------------
2016-03-31 09.06.22.067600 -07:00   2016-03-31 10.22.04.067600000 -07:00

While date math with the DATE data type is somewhat arcane, it is not too complex once you understand how it works.

When Oracle introduced the TIMESTAMP data type, that all changed. Timestamps are much more robust than dates, and also more complex.

Timestamps bring a whole new dimension to working with dates and times; this brief introduction to working with timestamp data will help demystify the process of doing math with timestamps.

Categories: DBA Blogs

Migrate a SQL Server environment with complex replication without reinitializing or rebuilding replication

Mon, 2016-04-04 11:14

When you have a SQL Server environment with a very complex replication setup in place, and you need to migrate or move (without upgrading) some or all of the servers involved in the replication topology to new servers/virtual machines or to a new data center/cloud, this blog post is for you!

Let’s assume you also have Transactional and/or Merge publications and subscriptions in place, and you need to move the publisher(s) and/or distributor(s) to a new environment. You also have one or more of the following restrictions:

  • You are not sure if the schema at the subscribers is identical to the publisher (e.g., different indexes, different columns, etc.).
  • You cannot afford downtime to reinitialize the subscriber(s)
  • There are too many subscribers to reinitialize and you cannot afford the downtime if anything goes wrong.

Here are the general steps for this migration:
Prior the migration date:

  • The new instance must have the same SQL Server version, edition, and patch level as the old instance. The Windows version and edition can be different, but you need to ensure the version of Windows supports the version of SQL Server.
  • The directory structure for the SQL Server files should be identical on the new server to the old server, with the same permissions:
    • Same path for SQL Server binaries
    • Same path and database files names in both servers for system databases
    • Same directories where user database files and T-logs reside
    • Same path for the replication directories (when applies)
  • Copy over any instance-level objects (logins, linked servers and jobs) to the new instance; leave jobs disabled where applicable, or stop SQL Server Agent on the new server

On migration date:

  • Disable any jobs, backups and maintenance that should run during the migration window on old server
  • Stop all database activity on old instance or disable logins
  • Restart old instance and verify there is no activity
  • Synchronize all replication agents that are related to the server being migrated
  • Stop and disable replication agents related to the server being migrated
  • Stop both instances
  • Copy over all system database files from old to new server
  • Copy over all user database files from old server to new one
    • Alternatively, backup all user databases on old server before stopping service and copy the files to new server
  • Shutdown old server
  • Rename new server to the name of old server and change the IP of new server to old server’s IP
  • Start the new server
  • Verify that the name of the new instance is like the old server and it’s local
  • If you backed up the user databases previously, you need to restore them to the same locations and file names as on the old server, with RECOVERY and KEEP_REPLICATION (see the example restore sketch after this list)
  • Verify that all user databases are online and publications + subscribers are there
  • Start all replication agents related to the migrated server and verify replication is working properly
  • Verify that applications are able to connect to the new instance (no need to modify instance name as it is the same as before and same IP)
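For the restore step mentioned above, the key is to keep the replication metadata intact. Here is a minimal T-SQL sketch; the database name, logical file names and paths are hypothetical and need to match your environment:

RESTORE DATABASE [PublishedDB]
FROM DISK = N'D:\MigrationBackups\PublishedDB.bak'
WITH
    MOVE N'PublishedDB_Data' TO N'E:\SQLData\PublishedDB.mdf',
    MOVE N'PublishedDB_Log'  TO N'F:\SQLLogs\PublishedDB_log.ldf',
    KEEP_REPLICATION,  -- preserve publication/subscription metadata
    RECOVERY;          -- bring the database online

KEEP_REPLICATION prevents the restore from removing the replication settings, which is what allows you to avoid reinitializing or rebuilding the topology.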

In any case, it is strongly recommended to test the migration prior to the real cutover, even if the test environment is not identical to production, just to get a feel for it. Ensure that you include most of the replication scenarios you have in production during your test phase.

The more scripts you have handy for the cutover date, the less downtime you may have.

It is extremely important to also have a good and tested rollback plan.

In future Blog posts I will discuss more complex replication scenarios to be migrated and rollback plans.

If you would like to make suggestions for future blogs, please feel free to add a comment and I will try to include your request in future posts.

Categories: DBA Blogs

Why You Should Consider Moving Your Enterprise Application to the Oracle Cloud

Mon, 2016-04-04 09:32

 

If you’ve decided to migrate your Oracle enterprise applications to the public cloud, it’s a good idea to consider Oracle Cloud alongside alternatives such as Amazon Web Services (AWS) and Microsoft Azure.

Oracle has made big strides in the cloud lately with platform-as-a-service (PaaS) offerings for its middleware and database software, culminating in the release of its first infrastructure-as-a-service (IaaS) offering in late 2015.

Oracle has a clear advantage over the competition when it comes to running its own applications in the cloud: it has full control over product licensing and can optimize its cloud platform for lift-and-shift migrations. This gives you a low-risk strategy for modernizing your IT portfolio.

 

What to expect from Oracle Cloud IaaS

Because Oracle’s IaaS offering is quite new, it has yet to match the flexibility and feature set of Azure and AWS. For example, enterprise VPN connectivity between cloud and on-premises infrastructure is still very much a work in progress. Unlike AWS, however, Oracle provides a free software appliance for accessing cloud storage on-premises. In addition to offering an hourly metered service, Oracle also provides unmetered compute capacity with a monthly subscription. Some customers prefer this option because it allows them to more easily control their spending through a predictable monthly fee rather than a pure pay-as-you-go model.

At the same time, Oracle Cloud IaaS has a limited selection of instance shapes, there is no SSD storage yet or guaranteed input/output performance levels, and transferring data is more challenging for large-volume migrations.

 

What to expect from Oracle Cloud PaaS

Oracle’s PaaS offerings are quickly becoming among the most comprehensive cloud-based services for Oracle Database. They include:

 

Oracle Database Schema Service

This is the entry-level unmetered offering, available starting at $175 a month for a 5GB database schema limit. Tenants share databases but are isolated in their own schemas. This means you have no control over database parameters, only the schema objects created. This service is currently available only with Oracle Database 11g Release 2 (i.e., it is not yet included in the latest release of Oracle Database 12c).

 

Oracle Exadata Cloud Service

This is a hosted service with monthly subscriptions starting at $70,000 for a quarter rack with 28 OCPUs enabled and 42TB of usable storage provisioned. You have full root OS access and SYSDBA database access, so you have total flexibility in managing your environment. However, this means Oracle manages only the bare minimum—the external networking and physical hardware—so you may end up expending the same effort as you would managing Exadata on-premises.

 

Oracle Database Virtual Image Service

This is a Linux VM with pre-installed Oracle Database software. The license is included in the rate. It’s available metered (priced per OCPU per hour of runtime) and unmetered (priced per OCPU allocated per month). As you’ll need to manage everything up from the VM level, including OS management and full DBA responsibilities, the metered service is a particularly good option for running production environments that require full control over the database deployment.

 

Oracle Database-as-a-Service (DBaaS)

This is an extension of Virtual Image Service and includes additional automation for database provisioning during service creation, backup, recovery, and patching. While you are still responsible for the complete management of the environment, the embedded automation and tooling can simplify some DBA tasks.

I should point out that, with the exception of Oracle Database Schema Service, these are not “true” PaaS offerings; they function more like IaaS-style services but with database software licenses included. But this is on the way, as Oracle recently announced plans for a fully managed DBaaS offering  similar to the one available through AWS.

 

While Oracle’s cloud options are still quite new and require additional features for broad enterprise adoption, if this option sparks your interest, now is the time to take the first steps. If you want to learn more about the migration path to Oracle Cloud, check out our white paper, Migrating Oracle Databases to Cloud.

migratingtocloud

Categories: DBA Blogs

Best practice for setting up MySQL replication filters

Fri, 2016-04-01 13:23

It is not uncommon that we need to filter out some databases or tables while setting up replication. It is important to understand how MySQL evaluates and processes the replication filtering rules, to avoid conflicts or confusion when setting them up. The purpose of this blog is to illustrate the rules and provide some suggestions for best practice.

MySQL provides 3 levels of filters for setting up replication: binary log, DB and table. The binlog filters apply on the master and control which changes are logged; since MySQL replication is based on the binlog, they are the first-level filter and have the highest priority. The DB-level and table-level filters apply on the slaves, and since each table belongs to a schema, the DB-level filters have higher priority than the table-level ones. Within the table-level filters, MySQL evaluates the options in the order: --replicate-do-table, --replicate-ignore-table, --replicate-wild-do-table, --replicate-wild-ignore-table.

Based on that, we have the following suggestions for setting up MySQL replication filters as best practice:

I) Do not set up any binlog-level filters unless you really need to and can afford to lose the chance of having an extra full copy of the master's data changes.

II) For DB-level filters, use either one or none of the two options: --replicate-do-db or --replicate-ignore-db. Never use both at the same time.

III) When using binlog_format='statement' or 'mixed' (in mixed mode, if a transaction is deterministic it will be stored in statement format) with --replicate-do-db or --replicate-ignore-db set on the slaves, make sure you never change tables across the default database on the master; otherwise you might lose those changes on the slave because the default database does not match the filter.

IV) For table-level filters, use only one of the 4 options, or use the following two in combination: --replicate-ignore-table and --replicate-wild-do-table, to avoid conflicts and confusion.

For MariaDB, replication filters within a Galera cluster should be used with caution. As a general rule, except for InnoDB DML updates, the following replication filters are not honored in a Galera cluster: binlog-do-db, binlog-ignore-db, replicate-wild-do-db, replicate-wild-ignore-db. However, the replicate-do-db and replicate-ignore-db filters are honored for DDL and DML for both the InnoDB and MyISAM engines, and they might create discrepancies and cause replication to abort (see MDEV-421 and MDEV-6229, and https://mariadb.com/kb/en/mariadb/mariadb-galera-cluster-known-limitations/). For slaves replicating from the cluster, the rules are similar to the normal replication settings above.

Here are the details/reasons:

1) Binlog-level filters

A) How MySQL processes the binlog-level filters

There are 2 options for setting a binlog filter on the master: --binlog-do-db and --binlog-ignore-db. MySQL checks --binlog-do-db first; if it is set, MySQL applies it and ignores --binlog-ignore-db. If --binlog-do-db is NOT set, then MySQL checks --binlog-ignore-db. If both of them are empty, it logs changes for all DBs.

See the examples below. In scenario 1) no binlog-level filters are set, so all changes were logged. In scenario 2) --binlog-do-db and --binlog-ignore-db are both set to m_test; changes on the DB m_test were logged and changes on the DB test were NOT logged. In scenario 3) only --binlog-ignore-db is set to m_test, so changes on the DB m_test were NOT logged and changes on the DB test were logged.

Scenario 1) --binlog-do-db and --binlog-ignore-db are NOT set:

mysql> show master status;

+——————+———-+————–+——————+——————-+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+——————+———-+————–+——————+——————-+

| vm-01-bin.000003 |      120 |              |                  |                   |

+——————+———-+————–+——————+——————-+

1 row in set (0.00 sec)

mysql> show binlog events in "vm-01-bin.000003" from 120;

Empty set (0.00 sec)

mysql> insert into t1(id,insert_time) values(10,now());

Query OK, 1 row affected (0.05 sec)

 

mysql> show binlog events in "vm-01-bin.000003" from 120;

+——————+—–+————+———–+————-+—————————————————————+

| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                                          |

+——————+—–+————+———–+————-+—————————————————————+

| vm-01-bin.000003 | 120 | Query      |         1 |         211 | BEGIN                                                         |

| vm-01-bin.000003 | 211 | Query      |         1 |         344 | use `m_test`; insert into t1(id,insert_time) values(10,now()) |

| vm-01-bin.000003 | 344 | Xid        |         1 |         375 | COMMIT /* xid=17 */                                           |

+——————+—–+————+———–+————-+—————————————————————+

3 rows in set (0.00 sec)

Scenario 2) --binlog-do-db=m_test and --binlog-ignore-db=m_test:

-- insert into tables of DB m_test was logged

mysql> show master status;

+——————+———-+————–+——————+——————-+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+——————+———-+————–+——————+——————-+

| vm-01-bin.000004 |      656 | m_test       | m_test           |                   |

+——————+———-+————–+——————+——————-+

1 row in set (0.00 sec)

 

mysql> use m_test

 

mysql> insert into t1(insert_time) values(now());

Query OK, 1 row affected (0.02 sec)

 

mysql> show binlog events in "vm-01-bin.000004" from 656;

+——————+—–+————+———–+————-+———————————————————+

| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                                    |

+——————+—–+————+———–+————-+———————————————————+

| vm-01-bin.000004 | 656 | Query      |         1 |         747 | BEGIN                                                   |

| vm-01-bin.000004 | 747 | Intvar     |         1 |         779 | INSERT_ID=13                                            |

| vm-01-bin.000004 | 779 | Query      |         1 |         906 | use `m_test`; insert into t1(insert_time) values(now()) |

| vm-01-bin.000004 | 906 | Xid        |         1 |         937 | COMMIT /* xid=26 */                                     |

+——————+—–+————+———–+————-+———————————————————+

4 rows in set (0.00 sec)

-- insert into tables of DB test was NOT logged

mysql> use test;

 

mysql> show master status ;

+——————+———-+————–+——————+——————-+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+——————+———-+————–+——————+——————-+

| vm-01-bin.000004 |      937 | m_test       | m_test           |                   |

+——————+———-+————–+——————+——————-+

 

mysql> insert into t1(`a`) values('ab');

Query OK, 1 row affected (0.03 sec)

 

mysql> show binlog events in "vm-01-bin.000004" from 937;

Empty set (0.00 sec)

 

Scenario 3) --binlog-do-db not set and --binlog-ignore-db=m_test:

mysql> use m_test

mysql> show master status;

+——————+———-+————–+——————+——————-+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+——————+———-+————–+——————+——————-+

| vm-01-bin.000005 |      120 |              | m_test           |                   |

+——————+———-+————–+——————+——————-+

mysql> insert into t1(insert_time) values(now());

Query OK, 1 row affected (0.01 sec)

 

mysql> show binlog events in "vm-01-bin.000005" from 120;

Empty set (0.00 sec)

 

mysql> use test

mysql> insert into t1(`a`) values('ba');

Query OK, 1 row affected (0.03 sec)

 

mysql> show binlog events in "vm-01-bin.000005" from 120;

+——————+—–+————+———–+————-+———————————————-+

| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                         |

+——————+—–+————+———–+————-+———————————————-+

| vm-01-bin.000005 | 120 | Query      |         1 |         199 | BEGIN                                        |

| vm-01-bin.000005 | 199 | Query      |         1 |         305 | use `test`; insert into t1(`a`) values(‘ba’) |

| vm-01-bin.000005 | 305 | Xid        |         1 |         336 | COMMIT /* xid=22 */                          |

+——————+—–+————+———–+————-+———————————————-+

3 rows in set (0.00 sec)

 

B) Best practice for setting up the binlog-level filters

So, for the binlog-level filter, use at most one of the 2 options: --binlog-do-db to make MySQL log changes only for the DBs in the list, or --binlog-ignore-db to make MySQL log changes only for the DBs NOT in the list. Or leave both of them empty to log changes for all DBs.

However, we usually recommend NOT setting up any binlog-level filters. Logging changes for all DBs and setting up filters only on the slaves achieves the same purpose, while leaving us an extra full copy of the master's data changes in case we need it for recovery.
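If you do decide to filter at the binlog level anyway, pick exactly one of the two options on the master. A minimal my.cnf sketch (the schema names are made up for illustration):

[mysqld]
# log binlog events only for this schema ...
binlog-do-db = app_db

# ... OR log everything except this schema -- never combine the two
# binlog-ignore-db = archive_db

Remember that with statement-based logging these options are matched against the default (USE) database, just like the DB-level filters discussed below.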

 

2) DB-level filters

A) How MySQL processes the DB-level filters

There are 2 options for setting DB-level filters: --replicate-do-db and --replicate-ignore-db. MySQL processes these two filters in a way similar to the binlog-level filters; the difference is that they apply ONLY on the slaves and so affect how the slaves replicate from their master. MySQL checks --replicate-do-db first; if it is set, it replicates the DBs in that list and ignores --replicate-ignore-db. If --replicate-do-db is NOT set, then MySQL checks --replicate-ignore-db and replicates all DBs except the ones in that list. If both of them are empty, it replicates all DBs. The decision process is illustrated in a flowchart in the MySQL documentation: http://dev.mysql.com/doc/refman/5.7/en/replication-rules-db-options.html

There is a catch with DB-level filters, though, if binlog_format is set to statement or mixed (mixed applies here too, because in mixed mode a deterministic transaction is resolved to statement format, which behaves the same as statement mode). As the documentation puts it, "With statement-based replication, the default database is checked for a match" (http://dev.mysql.com/doc/refman/5.7/en/replication-rules-db-options.html). If you set up --replicate-do-db and you update a table outside the default database on the master, the update statement will not be replicated if the default database you are running the command from is not in --replicate-do-db. For example, suppose there are 2 DBs on the master, you set binlog_format='statement' or 'mixed', and you set --replicate-do-db=DB1 on the slave. If you execute: use DB2; update DB1.t1 … this update will not be executed on the slave. To have the update statement replicated to the slave, you need to run: use DB1; update t1 …

For example, with binlog_format=statement or binlog_format=mixed, we insert into m_test.t1 in two ways: once with the default DB set to m_test and once with the default DB set to test. Both changes are logged on the master, but on the slave, after it has caught up, only the insert made with the default DB m_test is replicated; the insert made with the default DB test is NOT replicated. As shown below:

Scenario 1) binlog_format=statement

On the master: insert into m_test.t1 both ways, once with the default DB m_test and once with the default DB test; both changes are logged.

mysql> use m_test

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

 

Database changed

mysql> delete from t1;

Query OK, 16 rows affected (0.02 sec)

 

mysql> select * from m_test.t1;

Empty set (0.00 sec)

 

mysql> use m_test

Database changed

mysql> insert into m_test.t1(insert_time) values(now());

Query OK, 1 row affected (0.04 sec)

 

mysql> use test;

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

 

Database changed

mysql> insert into m_test.t1(insert_time) values(now());

Query OK, 1 row affected (0.03 sec)

 

mysql> show binlog events in “vm-01-bin.000006” from 654;

+——————+——+————+———–+————-+—————————————————————-+

| Log_name         | Pos  | Event_type | Server_id | End_log_pos | Info                                                           |

+——————+——+————+———–+————-+—————————————————————-+

| vm-01-bin.000006 |  654 | Xid        |         1 |         685 | COMMIT /* xid=39 */                                            |

| vm-01-bin.000006 |  685 | Query      |         1 |         768 | BEGIN                                                          |

| vm-01-bin.000006 |  768 | Query      |         1 |         860 | use `m_test`; delete from t1                                   |

| vm-01-bin.000006 |  860 | Xid        |         1 |         891 | COMMIT /* xid=48 */                                            |

| vm-01-bin.000006 |  891 | Query      |         1 |         982 | BEGIN                                                          |

| vm-01-bin.000006 |  982 | Intvar     |         1 |        1014 | INSERT_ID=17                                                   |

| vm-01-bin.000006 | 1014 | Query      |         1 |        1148 | use `m_test`; insert into m_test.t1(insert_time) values(now()) |

| vm-01-bin.000006 | 1148 | Xid        |         1 |        1179 | COMMIT /* xid=52 */                                            |

| vm-01-bin.000006 | 1179 | Query      |         1 |        1268 | BEGIN                                                          |

| vm-01-bin.000006 | 1268 | Intvar     |         1 |        1300 | INSERT_ID=18                                                   |

| vm-01-bin.000006 | 1300 | Query      |         1 |        1432 | use `test`; insert into m_test.t1(insert_time) values(now())   |

| vm-01-bin.000006 | 1432 | Xid        |         1 |        1463 | COMMIT /* xid=60 */                                            |

+——————+——+————+———–+————-+—————————————————————-+

12 rows in set (0.00 sec)

 

mysql> select * from m_test.t1;

+—-+———————+

| id | insert_time         |

+—-+———————+

| 17 | 2016-03-20 14:59:41 |

| 18 | 2016-03-20 15:00:01 |

+—-+———————+

2 rows in set (0.00 sec)

 

On the slave: after it caught up, only the first insert (default DB m_test) was replicated; the insert made with default DB test was NOT replicated.

mysql> show slave status\G

*************************** 1. row ***************************

              Slave_IO_State: Waiting for master to send event

                 Master_Host: 10.0.2.6

                 Master_User: repl

                 Master_Port: 3306

               Connect_Retry: 10

             Master_Log_File: vm-01-bin.000006

         Read_Master_Log_Pos: 1463

              Relay_Log_File: ewang-vm-03-relay-bin.000017

               Relay_Log_Pos: 1626

       Relay_Master_Log_File: vm-01-bin.000006

            Slave_IO_Running: Yes

           Slave_SQL_Running: Yes

             Replicate_Do_DB: m_test

         Replicate_Ignore_DB:

          Replicate_Do_Table:

      Replicate_Ignore_Table:

     Replicate_Wild_Do_Table:

 Replicate_Wild_Ignore_Table:

                  Last_Errno: 0

                  Last_Error:

                Skip_Counter: 0

         Exec_Master_Log_Pos: 1463

             Relay_Log_Space: 1805

             Until_Condition: None

              Until_Log_File:

               Until_Log_Pos: 0

          Master_SSL_Allowed: No

          Master_SSL_CA_File:

          Master_SSL_CA_Path:

             Master_SSL_Cert:

           Master_SSL_Cipher:

              Master_SSL_Key:

       Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

               Last_IO_Errno: 0

               Last_IO_Error:

              Last_SQL_Errno: 0

              Last_SQL_Error:

 Replicate_Ignore_Server_Ids:

            Master_Server_Id: 1

                 Master_UUID: a22b3fb2-5e70-11e5-b55a-0800279d00c5

            Master_Info_File: /mysql/data/master.info

                   SQL_Delay: 0

         SQL_Remaining_Delay: NULL

     Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it

          Master_Retry_Count: 86400

                 Master_Bind:

     Last_IO_Error_Timestamp:

    Last_SQL_Error_Timestamp:

              Master_SSL_Crl:

          Master_SSL_Crlpath:

          Retrieved_Gtid_Set:

           Executed_Gtid_Set:

               Auto_Position: 0

1 row in set (0.00 sec)

 

mysql> select * from m_test.t1;

+—-+———————+

| id | insert_time         |

+—-+———————+

| 17 | 2016-03-20 14:59:41 |

+—-+———————+

1 row in set (0.00 sec)

 

Scenario 2) binlog_format=mixed

In master:

mysql> show variables like 'binlog_format';

+—————+——-+

| Variable_name | Value |

+—————+——-+

| binlog_format | MIXED |

+—————+——-+

1 row in set (0.00 sec)

 

mysql> show master status;

+——————+———-+————–+——————+——————-+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+——————+———-+————–+——————+——————-+

| vm-01-bin.000007 |      120 |              |                  |                   |

+——————+———-+————–+——————+——————-+

1 row in set (0.00 sec)

 

mysql> use m_test

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

 

Database changed

mysql> insert into m_test.t1(insert_time) values(now());

Query OK, 1 row affected (0.04 sec)

 

mysql> use test;

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

 

Database changed

mysql> insert into m_test.t1(insert_time) values(now());

Query OK, 1 row affected (0.04 sec)

 

mysql> show binlog events in “vm-01-bin.000007” from 120;

+——————+—–+————+———–+————-+—————————————————————-+

| Log_name         | Pos | Event_type | Server_id | End_log_pos | Info                                                           |

+——————+—–+————+———–+————-+—————————————————————-+

| vm-01-bin.000007 | 120 | Query      |         1 |         211 | BEGIN                                                          |

| vm-01-bin.000007 | 211 | Intvar     |         1 |         243 | INSERT_ID=19                                                   |

| vm-01-bin.000007 | 243 | Query      |         1 |         377 | use `m_test`; insert into m_test.t1(insert_time) values(now()) |

| vm-01-bin.000007 | 377 | Xid        |         1 |         408 | COMMIT /* xid=45 */                                            |

| vm-01-bin.000007 | 408 | Query      |         1 |         497 | BEGIN                                                          |

| vm-01-bin.000007 | 497 | Intvar     |         1 |         529 | INSERT_ID=20                                                   |

| vm-01-bin.000007 | 529 | Query      |         1 |         661 | use `test`; insert into m_test.t1(insert_time) values(now())   |

| vm-01-bin.000007 | 661 | Xid        |         1 |         692 | COMMIT /* xid=53 */                                            |

+——————+—–+————+———–+————-+—————————————————————-+

8 rows in set (0.00 sec)

mysql> select * from m_test.t1;

+—-+———————+

| id | insert_time         |

+—-+———————+

| 17 | 2016-03-20 14:59:41 |

| 18 | 2016-03-20 15:00:01 |

| 19 | 2016-03-20 15:09:14 |

| 20 | 2016-03-20 15:09:25 |

+—-+———————+

4 rows in set (0.00 sec)

 

In slave:

mysql> show variables like 'binlog_format';

+—————+——-+

| Variable_name | Value |

+—————+——-+

| binlog_format | MIXED |

+—————+——-+

1 row in set (0.00 sec)

 

mysql> show slave status\G

*************************** 1. row ***************************

              Slave_IO_State: Waiting for master to send event

                 Master_Host: 10.0.2.6

                 Master_User: repl

                 Master_Port: 3306

               Connect_Retry: 10

             Master_Log_File: vm-01-bin.000007

         Read_Master_Log_Pos: 692

              Relay_Log_File: ewang-vm-03-relay-bin.000023

               Relay_Log_Pos: 855

       Relay_Master_Log_File: vm-01-bin.000007

            Slave_IO_Running: Yes

           Slave_SQL_Running: Yes

             Replicate_Do_DB: m_test

         Replicate_Ignore_DB:

          Replicate_Do_Table:

      Replicate_Ignore_Table:

     Replicate_Wild_Do_Table:

 Replicate_Wild_Ignore_Table:

                  Last_Errno: 0

                  Last_Error:

                Skip_Counter: 0

         Exec_Master_Log_Pos: 692

             Relay_Log_Space: 1034

             Until_Condition: None

              Until_Log_File:

               Until_Log_Pos: 0

          Master_SSL_Allowed: No

          Master_SSL_CA_File:

          Master_SSL_CA_Path:

             Master_SSL_Cert:

           Master_SSL_Cipher:

              Master_SSL_Key:

       Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

               Last_IO_Errno: 0

               Last_IO_Error:

              Last_SQL_Errno: 0

              Last_SQL_Error:

 Replicate_Ignore_Server_Ids:

            Master_Server_Id: 1

                 Master_UUID: a22b3fb2-5e70-11e5-b55a-0800279d00c5

            Master_Info_File: /mysql/data/master.info

                   SQL_Delay: 0

         SQL_Remaining_Delay: NULL

     Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it

          Master_Retry_Count: 86400

                 Master_Bind:

     Last_IO_Error_Timestamp:

    Last_SQL_Error_Timestamp:

              Master_SSL_Crl:

          Master_SSL_Crlpath:

          Retrieved_Gtid_Set:

           Executed_Gtid_Set:

               Auto_Position: 0

1 row in set (0.00 sec)

 

mysql> select * from m_test.t1;

+—-+———————+

| id | insert_time         |

+—-+———————+

| 17 | 2016-03-20 14:59:41 |

| 19 | 2016-03-20 15:09:14 |

+—-+———————+

2 rows in set (0.00 sec)

 

B) Best practice for setting up the DB-level filters

Use either one or none of the two options: --replicate-do-db or --replicate-ignore-db. Never use both at the same time.

If you use binlog_format='statement' or 'mixed' and set up --replicate-do-db or --replicate-ignore-db on the slaves, make sure you never change tables across the default database; otherwise data discrepancies are to be expected on the slaves.
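As a reminder of the safe pattern, here is a minimal sketch reusing the m_test schema from the examples above, on a slave configured with --replicate-do-db=m_test and statement or mixed format:

-- replicated: the default database matches the filter
use m_test;
update t1 set insert_time = now() where id = 17;

-- NOT replicated: the default database (test) does not match the filter,
-- even though the statement fully qualifies m_test.t1
use test;
update m_test.t1 set insert_time = now() where id = 17;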

 

3)Table-level filters

There are four options for setting table-level filters: --replicate-do-table, --replicate-ignore-table, --replicate-wild-do-table and --replicate-wild-ignore-table. MySQL evaluates these options in order. You can find the evaluation process in the chart at http://dev.mysql.com/doc/refman/5.6/en/replication-rules-table-options.html

 

The chart shows that MySQL checks --replicate-do-table first: tables listed there are replicated and will not be ignored by the later options such as --replicate-ignore-table or --replicate-wild-ignore-table. MySQL then checks --replicate-ignore-table: tables listed there are ignored even if they also match --replicate-wild-do-table. --replicate-wild-ignore-table has the lowest priority.

B) Best practice for setting up the table-level filters

Because of these priorities among the four table-level options, to avoid confusion and conflicts we suggest using only one of the four, or using the following two together: --replicate-ignore-table and --replicate-wild-do-table. That way it is clear that the tables listed in --replicate-ignore-table will be ignored and the tables matching --replicate-wild-do-table will be replicated, as in the sketch below.
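A minimal my.cnf sketch of this combination (the database and table names here are purely illustrative):

[mysqld]
# Checked before the wild rules: this table is always ignored
replicate-ignore-table = m_test.t_archive
# Everything else in m_test is replicated
replicate-wild-do-table = m_test.%

The active filters then show up in SHOW SLAVE STATUS under Replicate_Ignore_Table and Replicate_Wild_Do_Table, just as Replicate_Do_DB appears in the slave status output above.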

 

Categories: DBA Blogs

What Are Your Options For Migrating Enterprise Applications to the Cloud?

Fri, 2016-04-01 08:16

Migrating your enterprise applications from on-premises infrastructure to the public cloud is attractive for a number of reasons. It eliminates the costs and complexities of provisioning hardware and managing servers, storage devices, and network infrastructure; it gives you more compute capacity per dollar without upfront capital investment; and you gain opportunities for innovation through easier access to new technologies, such as advanced analytical capabilities.

So how do you get there?

You have a few options. At one end of the spectrum, you could simply wait and rationalize, making continuous incremental changes to gain efficiencies. This is obviously a “slow burn” approach. In the middle is a “lift-and-shift” from your current environment into the public cloud. And at the far extreme, you could plunge right in and re-architect your applications—a costly and probably highly complex task.

 

In fact, a true migration “strategy” will involve elements of each of these. For example, you could perform short-term optimizations and migrations on a subset of applications that are ready for the cloud, while transforming the rest of your application stack over the longer term.

 

What to expect from the major public cloud platforms

There are three leading public cloud platforms: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). As Google doesn’t seem to be driving customers to lift-and-shift their applications to GCP, I’m going to focus on AWS and Azure as potential cloud destinations and, for specificity, take Oracle enterprise databases as the use case.

 

Amazon Web Services

You have two options for migrating Oracle databases to the AWS cloud: infrastructure-as-a-service (IaaS) and platform-as-a-service (PaaS).

 

Deploying Oracle applications in AWS IaaS is much like deploying them on your in-house infrastructure. You don’t get flexible licensing options, but you do have the ability to easily allocate more or less capacity as needed for CPU, memory, and storage. However, because AWS IaaS is virtualized infrastructure, you may experience slower performance due to suboptimal CPU core allocation or processor caches. You’ll also have less flexibility with instance sizes, network topology, storage performance tiers, and the like.

 

AWS Relational Database Service (RDS) for Oracle is a managed PaaS offering where, in addition to giving you the benefits of IaaS, Amazon takes on major DBA and system administrator tasks including provisioning, upgrades, backups, and multi-availability zone replication. This significantly simplifies your operations, but it also means less control over areas such as configuration, patching, and maintenance windows. AWS RDS for Oracle can also be used with a pay-as-you-go licensing model, where the Oracle license cost is included in the hourly rate.

 

Microsoft Azure

Azure does not have a managed offering for Oracle databases, so the only way to run Oracle Database on Azure is through its IaaS platform. The benefits are very similar to AWS IaaS, but Azure offers additional licensing options (with Windows-based license-included images) and its instances are billed by the minute rather than by the hour. What’s important to keep in mind is that Azure is not as broadly adopted as AWS and offers less flexibility for storage performance tiers and instance sizes. Oracle Database software running on Windows is also not as common as running on Linux.

 

For more in-depth technical details on these options, I encourage you to read our white paper, Migrating Oracle Databases to Cloud. My next blog in this series will look at one other option not discussed here: migrating to Oracle Cloud.


Categories: DBA Blogs

5 Phases for Migrating to a Cloud Platform

Thu, 2016-03-31 13:11

Businesses today are increasingly looking to migrate to the cloud to realize lower costs and increase software velocity. They are now asking themselves “when” they should migrate rather than “if” they should, and with many vendors and solutions on the market, it can be difficult to take the first steps in creating a cloud strategy.

In our latest on-demand webinar, Chris Presley, Solution Architect at Pythian, and Jim Bowyer, Solution Architect at Azure-Microsoft Canada, discuss a five phase framework for cloud transformations, and the benefits of migrating to the cloud with Microsoft Azure.

The five phase framework helps businesses understand the journey to successfully migrate current applications to a cloud platform. Here is a snapshot of the five phases:

 

1. Assessment: Analysis and Planning

A majority of the time investment should be upfront in assessment and preparation because it sets the stage for the actual development and migration, resulting in faster projects, lower costs, and less risk.

In this phase, businesses want to begin understanding the performance and user characteristics of their applications, and gathering any additional information that will be important during the transformation, such as regulatory, compliance, and legal requirements.

 

2. Preparation: POC, Validation and Final Road Map

The preparation phase is meant to help understand what the rest of the migration is going to look like.

While beneficial in any project, proofs of concept (POCs) are increasingly simple to create and are a great strength when leveraging the cloud. POCs are used to show some functionality and advantage early so you can get everyone – especially business owners – excited about the migration.

 

3. Build: Construct Infrastructure

Once the expectations around the final migration road map are set, the infrastructure can be built. Jim notes that it is important to start thinking about automation during this phase, and Chris agrees, particularly when it comes to developing an automated test bed to help smooth out the migration.

 

4. Migration: Execute Transformation

The migration activity for cloud environments is very short. By this stage, if the planning and preparation have been done properly, “flicking the light switch” to the new environment should be seamless and feel like the easiest part.

Chris talks about creating both detailed success and rollback criteria and how they are both crucial for success in the migration phase. Jim mentions that Microsoft Azure provides a variety of tools to help make rollbacks easier and safer.

 

5. Optimization: IaaS Enhancements

Continually transforming and enhancing after the migration is complete is important for increasing software velocity, which is why businesses migrate to the cloud in the first place. While a piece of functionality may not be available today, it may be available tomorrow.

By going back to iterate and take advantage of new functionalities, businesses are able to squeeze out more improvements and create opportunities for differentiation.

 

Learn More

To learn about these five cloud transformation phases in more depth, and how to leverage the cloud with Microsoft Azure, download our free on-demand webinar.


Categories: DBA Blogs