Skip navigation.

Feed aggregator

How to set up Flashback for MongoDB Testing

Pythian Group - Fri, 2016-04-22 11:52


After you’ve upgraded your database to a new version, it’s common that the performance degrades in some cases. To prevent this from happening, we could capture the production database operations and replay them in the testing environment which has the new version installed.

Flashback is a MongoDB benchmark framework that allows developers to gauge database performance by benchmarking queries. Flashback records the real traffic to the database and replays operations with different strategies. The framework is comprised of a set of scripts that fall into 2 categories:

  1. Records the operations(ops) that occur during a stretch of time
  2. Replays the recorded ops

The framework was tested on Ubuntu 10.04.4 LTS


-go 1.4

-git 2.3.7

-python 2.6.5

-pymongo 2.7.1

-libpcap0.8 and libpcap0.8-dev


  1. Download Parse/Flashback source code

# go get

  1. Manually modify the following file to workaround a mongodb-tools compatibility issue

In pass_util.go file:

func GetPass() string {
–    return string(gopass.GetPasswd())
+    if data, errData := gopass.GetPasswd(); errData != nil {
+        return “
+    } else {
+        return string(data)
+    }


  1. Compile the go lang part of the tool

# go build -i ./src/



Suppose you have to two shards, Shard a and Shard b. Each shard has 3 nodes. In each shard a, primary is a1. In shard b, primary is b2.

1. copy sample config file for editing

# cp ./src/

2. Change config for testing


# Indicates which database(s) to record.

“target_databases”: [“test”],

# Indicates which collections to record. If user wants to capture all the

# collections’ activities, leave this field to be `None` (but we’ll always

# skip collection `system.profile`, even if it has been explicit

# specified).

“target_collections”: [“testrecord”],


“oplog_servers”: [

{ “mongodb_uri”: “mongodb://” },

{ “mongodb_uri”: “mongodb://” }




# In most cases you will record from the profile DB on the primary

# If you are also sending queries to secondaries, you may want to specify

# a list of secondary servers in addition to the primary

“profiler_servers”: [

{ “mongodb_uri”: “mongodb://” },

{ “mongodb_uri”: “mongodb://mongodb.b2:27018” }



“oplog_output_file”: “./testrecord_oplog_output”,

“output_file”: “./testrecord_output”,


# If overwrite_output_file is True, the same output file will be

# overwritten is False in between consecutive calls of the recorer. If

# it’s False, the recorder will append a unique number to the end of the

# output_file if the original one already exists.

“overwrite_output_file”: True,


# The length for the recording

“duration_secs”: 3600




“logging_level”: logging.DEBUG



duration_secs indicates the length for the recording. For production capture, should set it at least to 10-12 hrs.

Make sure has write permission to the output dir

  1. Set all primary servers profiling level to 2


2. Start operations recording


3. The script starts multiple threads to pull the profiling results and oplog entries for collections and databases that we are interested in. Each thread works independently. After fetching the entries, it will merge the results from all sources to get a full picture of all operations as one output file.

4. You can run the from any server as long as the server has flashback installed  and can connect to all mongod servers.

5. As a side note, running mongod in replica set mode is necessary (even when there is only one node), in order to generate and access the oplogs


  1. Run flashback. Style can be “real” or ”stress”

        Real: replay ops in accordance to their original timestamps, which allows us to imitate regular traffic.

        Stress: will preload the ops to the memory and replay them as fast as possible. This potentially limits the number of  ops played back per session to the             available memory on the Replay host.

For sharded collections, point the tool to a mongos. You could also point to a single shard primary for non-sharded collections.

./flashback -ops_filename=”./testrecord_output” -style=”real” -url=”localhost:27018″ -workers=10

  • Several pymongo (python’s MongoDB driver) arguments in the code are deprecated causing installation and running errors.
  • Need to define a faster restore method (ie. LVM snapshots) to rollback the test environment after each replay.
  • Need to capture execution times for each query included in the test set to be able to detect excecution plan changes.
  • In a sharded cluster, record can be executed from a single server with access to all primaries and/or secondaries.
  • Pulling oplogs from secondaries is recommended if we are looking to reduce load on the primaries.
  • Memory available would dramatically affect operation’s merge process after recording
  • Memory available would also affect replay times (see Tests summary)
Tests summary


Record test scenario 1


Record server: mongos server (8G RAM)

Time : about 2 hours to finish the recording

Details: Ran record while inserting and updating 1000 documents


Record test scenario 2


Record server: shard a primary node a1 (80G RAM)

Time: about 2 minutes to finish the recording

Details: Ran record while inserting and updating 1000 documents

Record test scenario 3


Record server: shard a primary node a1 (80G RAM)

Time: it took about 20 minutes to finish the recording

Details: Ran record while inserting and updating 100,000 documents

Replay test scenario 1

Replay server: mongos server (8G RAM)

Time: it took about 1 hour to finish the replay

Details: replayed 1000 operations in “real” style


Replay test scenario 2

Replay server: shard a primary node a1 (80G RAM)

Time: about 5 minutes to finish the replay

Details: replayed 1000 operations in “real” style

Replay test scenario 3

Replay server: mongos server (8G RAM)

Time: failed due to insufficient memory

Details: replayed 1000 operations in “stress” style


Replay test scenario 4

Replay server: shard a primary node a1 (80G RAM)

Time: about 1minute to finish the replay

Details: replayed 1000 operations in “stress” style


Replay test scenario 5

Replay server: shard a primary node a1 (80G RAM)

Time: about 20 minutes to finish the replay

Details: replayed 50,000 operations in “stress” style

Categories: DBA Blogs

Pictures from the good ol’ days

Dan Norris - Fri, 2016-04-22 06:44

My friends from childhood will know my dad. He was likely their high school principal (he was mine too) in a very small town (of about 2500 people on a good day). Those who knew our school may have seen the inside of his office; some were there because they stopped in for a nice visit, others were directed there by upset teachers. In either case, seeing the wall in his office was somewhat overwhelming. At peak, he had 70+ 8×10 photos framed and hanging on his wall. The pictures were of various sports teams and graduating classes from his tenure as principal.

I found those pictures in some old boxes recently. Almost 100% of them were taken by one of our high school math teachers, Jim Mikeworth, who was also a local photographer. Mr. Mike said he was fine with me posting the pictures, so I scanned all of them in and posted them online. If you have a facebook account, you may have already seen them, but if not, they are still accessible without a facebook account. You can find the pictures at I hope you enjoy them!

My dad died almost 20 years ago and arguably was one of the most loved men in the history of Villa Grove. He would love for everyone to enjoy this shrine to his office wall of pictures–he was very proud of all the kids that passed through VGHS during his time there (1978-1993, I think).

Have you seen the menu?

Darwin IT - Fri, 2016-04-22 05:49
And did you like it? Hardly possible to miss I think. It kept me nicely busy for a few hours. Got some great examples, and this one is purely based on css and unnumbered lists in combination with anchors. Unfortunately the menu worked with non-classed <ul>, <li> and <a> tags. So embedding the css, caused my other elements to be redefined. (It even redefined the padding of all elements).

But with some trial and error I got it working in a subclassed form. And I like it, do you?

I also found that besides articles, you also can create pages in blogger. Did not know about that, completely overlooked that. I think I try something out, so if you're a regular visitor, you might find that there's work in progress.

The wish for a menu popped up a little while ago, and I kept thinking about it, to be able to get some structure in my articles. From the beginning I tagged every article, but not with a real plan. So I got tags stating 'Oracle BPM Suite', but also 'SOA Suite'. And 'Database', but also 'Database 11g'. Not so straightforward and purposeful.

But a purpose arose. For a longer while I'm thinking about if writing a book would be something for me. I like to write articles on a (ir)regular basis. On this blog you can find a broad range of subjects. But could I do a longer series on a particular subject? And could it lead to a more structured and larger form like a book? I learned from a former co-worker that he had this idea to write articles on a regular basis to buildup a book gradually. And I like that. But what subject would it be? My core focus area is SOA Suite and BPM Suite. But loads of books are written about that. Well, maybe not loads, but at least some recognized, good ones. And even PCS (Process Cloud Service) and ICS (Integration Cloud Service) are (being) covered.

But when Oracle acquired Collaxa in 2004, I worked at Oracle Consulting and got to work with it in the very early days. And I think in the Netherlands at least, I was (one of) the first one(s) from Oracle to provide training on BPEL, at least for Oracle University in the Netherlands. So I got involved in BPEL from the first hour Oracle laid hands on it. Could BPEL be a subject I could cover? Of course I'll not be the first one to cover that. Both on BPEL 1.1 as on 2.0 you can google up a book (is that already a term?), the one on 1.1 I still had stacked in a pile behind another one on my bookshelf.

So let's see where this leads me. You can expect a series on BPEL, in parallel of other articles on subjects that come around during my work. From real novice (do you already use scopes and local variables?), up to some more advanced stuff (how about dynamic partnerlinks; are you already into Correlation Sets, transaction handling, BPEL and Spring? )

It might bleed to death. It might become a nice series and nothing more than that. And it might turn out a real informative stack of articles that could be re-edited into a book. But when I'm at it, turning to cover the more advanced subjects, I plan to pol for what you want to have covered. I think I do know something about BPEL. But as you read with me, maybe you could point me out to subjects I don't know yet. Consider yourself invited to read along.

Introducing Oracle WebLogic Server 12.2.1 Multitenancy: A Q&A Game

Next to our Partner Webcast «Oracle WebLogic Server 12.2.1 Multitenancy and Continuous Availability» delivered earlier this month on the 21st of April 2016, where we've focused on the two new main...

We share our skills to maximize your revenue!
Categories: DBA Blogs

Web technology in APEX Development

Dimitri Gielis - Fri, 2016-04-22 03:54
How did you get started with developing your first APEX app? 

My guess is either you went to and got a free account or Oracle Application Express was already in your company and somebody told you the url you could connect to. For me that is really the power of APEX, you just go to an url and within minutes you created your first app.

Staying within the APEX framework?

With APEX you create web application, but you don't have to worry about CSS, JavaScript, HTML5, Session State etc. it just comes with the framework. In APEX you have Universal Theme to visually adapt the look and feel of your app, there're Dynamic Actions that do all the JavaScript for you and the framework is generating all the HTML and processing that is necessary.
So although we are creating web applications, at first we are not doing what typical web developers do (creating html, css, javascript files).
Oracle closely looks at all the web technology, makes choices which streams they will follow (e.g. JQuery framework) and implements and tests it so we don't have to worry about a thing.

Going to the next level?

The web is evolving fast, and I mean really fast (!) so maybe you saw something really nice on the web you wish you had in your APEX app, but it's not yet declaratively available... now the nice thing about APEX is that you can extend it yourself by using plugins (see the plugins section on or just by writing the code yourself as other web developers do.

Trying new web technology locally

When you want to try those shiny new web things in your APEX app, I recommend trying to get those things working locally first. Last year for example I gave a presentation about Web Components at different Oracle conferences and this year I'll present on Service Workers. All the research I did on those topics where initially not in an APEX context. But how do you get started to try this now?

The first thing you need is a local web server. Depending the OS you're on, you might already have one (e.g. IIS, Apache, ...), if not, here's what I do on OSX.
OSX comes with Python and that allows to create a simple web server.
Open Terminal and go to the directory where you want to test your local files and run:

$ python -m SimpleHTTPServer 8000   (Python 2.7)
$ python3 -m http.server 8000   (Python 3.0)

There're many other ways to have a local web server, see for example this article or a simple web server based on node.js.

The next thing is to start developing your HTML, CSS, JavaScript etc.
To do this development, you probably want some tools; an editor like Sublime or Atom, a CSS and JS preprocessor, Browser extensions, build tools like Gulp etc.
You don't need all those tools, just an editor is fine, but soon enough you want to be more efficient in your development, and tools just help :) Here're some nice articles about different tools: Google Developers - Getting Started, Keenan Payne 13 useful web dev tools and Scott Ge list of web development tools.

Going from custom web development to APEX - use APEX Front-End Boost

So you have your local files developed and next is to integrate them in your APEX app.
You add some code to your APEX pages and upload the files so APEX can see them.
If everything works immediately - great, but most of the time you probably need to make more changes, so you change your local files, test again, upload etc. You could streamline this a bit with setting up a proxy or referencing localhost files while in development... But then you're happy your part of the APEX community...

To ease the above development and integration with APEX, Vincent Morneau and Martin Giffy D'Souza created the excellent APEX Front-End Boost package. The package is using many of the above tools behind the scenes, but it's all integrated in a nice box. This video goes in full detail what the tool is doing for you and how to use it. In short; it fills the bridge of working with a file locally, making it production ready and seeing it immediately in your APEX app :)

In the next post I'll talk about the importance of using https and also setting it up for localhost (also for APEX Front-End Boost).

    Categories: Development

    So how about other cloud providers

    Pat Shuff - Fri, 2016-04-22 01:07
    If you are looking for a cloud hosting provider, the number one question that comes up is which one to use. There are a ton of cloud providers. How do you decide which one is best for you? To be honest, the answer is it depends. It depends on what your problem is and what problem you are trying to solve. Are you trying to solve how you communicate with customers? If so do you purchase something like SalesForce or Oracle Sales Cloud, you get a cloud based sales automation tool. Doing a search on the web yields a ton of references. Unfortunately, you need to know what you are searching for. Are you trying to automate your project management (Oracle Primavera or Microsoft Project)? Every PC magazine and trade publication have opinions on this. Companies like Gartner and Forrester write reviews. Oracle typically does not rate well with any of these vendors for a variety of reasons.

    My recommendation is to look at the problem that you are trying to solve. Are you trying to lower your cost of on-site storage? Look at generic cloud storage. Are you trying to reduce your data center costs and go with a disaster recovery site in the cloud? Look at infrastructure in the cloud and compute in the cloud. I had a chance to play with VMWare VCloud this week and it has interesting features. Unfortunately, it is a really bad generic cloud storage company. You can't allocate 100 TB of storage and access it remotely without going through a compute engine and paying for a processor, operating system, and OS administrator. It is really good if I have VMWare and want to replicate the instances into the cloud or use VMotion to move things to the cloud. Unfortunately, this solution does not work well if I have a Solaris of AIX server running in my data center and want to replicate into the cloud.

    The discussion on replication opens a bigger can of worms. How do you do replication? Do you take database and java files and snap mirror them to the cloud or replicate them as is done inside a data center today? Do you DataGuard the database to a cloud provider and pay on a monthly basis for the database license rather than owning the database? Do you setup a listener to switch between your on-site database and cloud database as a high availability failover? Do you setup a load balancer in front of a web server or Java app server to do the same thing? Do you replicate the visualization files from your VMWare/HyperV/OracleVM/Zen engine to a cloud provider that supports that format? Do you use a GoldenGate or SOA server to physically replicate objects between your on-site and cloud implementation? Do you use something like the Oracle Integration server to synchronize data between cloud providers and your on-premise ERP system?

    Once you decide on what level to do replication/fail over/high availability you need to begin the evaluation of which cloud provider is best for you. Does your cloud provider have a wide range of services that fits the majority of your needs or do you need to get some solutions from one vendor and some from another. Are you ok standardizing on a foundation of a virtualization engine and letting everyone pick and choose their operating system and application of choice? Do you want to standardize at the operating system layer and not care about the way things are virtualized? When you purchase something like SalesForce CRM, do you even know what database or operating system they use or what virtualization engine supports it? Do or should you care? Where do you create your standards and what is most important to you? If you are a health care provider do you really care what operating system that your medical records systems uses or are you more interested in how to import/export ultrasound images into your patients records. Does it really matter which VM or OS is used?

    The final test that you should look at is options. Does your cloud vendor have ways of easily getting data to them and easily getting data out. Both Oracle and Amazon offer tape storage services. Both offer disks that you can ship from your data center to their cloud data centers to load data. Which one offers to ship tapes to you when you want to get them back? Can you only backup from a database in the cloud to storage in the cloud? What does it cost to get your data back once you give it to a cloud provider? What is the outbound charge rate and did you budget enough to even terminate the service without walking away from your data? Do they provide an un-limited read and write service so that you don't get charged for outbound data transfer.

    Picking a choosing a cloud vendor is not easy. It is almost as difficult as buying a house, a car, or a phone carrier. You will never know if you made the right choice until you get penalized for making the wrong choice. Tread carefully and ask the right questions as you start your research.

    Links for 2016-04-21 []

    Categories: DBA Blogs

    Blackboard CEO’s First 100 Days: Reorganization and Learn Ultra Updates

    Michael Feldstein - Thu, 2016-04-21 22:22

    By Phil HillMore Posts (403)

    Just over four years after Providence Equity Partners acquired Blackboard and three years after they brought in Jay Bhatt to replace co-founder Michael Chasen, the company hired Bill Ballhaus as its new CEO at the beginning of January. 100 days in, Ballhaus is starting to make changes to the organization and providing some insights into future corporate directions.

    The most significant change is a reorganization that combines strategy, product management and marketing in one group under Katie Blot. In an interview Michael and I had with Ballhaus and Blot earlier this week, they described the primary motivation for the organizational change as the need to more tightly align those functions. Also significant is that this change means the departure of Mark Strassman, SVP Product Marketing & Management, and Tracey Stout, SVP of Marketing & Sales Effectiveness. Blackboard provided the following statement.

    We are deeply grateful for the many contributions both Mark and Tracey have made at Blackboard. Both of these individuals have been critical to driving the transformation and evolution of our PMM and Marketing organizations.

    Katie Blot joined Blackboard in 2008 as President of Global Services and has been SVP of Corporate & Industry Strategy since early 2015. Her long experience at Blackboard is worth considering as is the fact that both departing executives both worked with Jay Bhatt at Autodesk earlier in their careers and were brought into Blackboard as part of his new management team. I am not suggesting that the purpose of the move was based on corporate pedigree, but I am suggesting that the move effectively changes the balance in how much ed tech experience and even Blackboard experience rests with the top company executives.

    When we asked Ballhaus about lessons learned after his listening tour with customers, he told us that the company must do a few things very well. And the top of his priority list is the Learn LMS product family. This focus on products stands in contrast to Bhatt’s broader and more vague focus on solutions. Michael noted the change in tone back in the 2013 BbWorld keynote:

    The big corporate keynote had to be one of the strangest I’ve ever seen. CEO Jay Bhatt ran through a whole long list of accomplishments for the year, but he only gave each one a few seconds as he rattled through the checklist. He mentioned that the company has a new mission statement but didn’t bother to explain it. It took nearly an hour of mostly talking about big macro trends in education and generalities about the categories of goals that the company has set before he finally got around to new product announcements. And then commenced what I can only describe as a carpet bombing run of announcements—a series of explosions that were over by the time you realized that they had started, leaving you to wonder what the heck had just happened.

    At that same 2013 keynote (and in Michael’s post) Blackboard announced a major UX overhaul for Learn (the Ultra part) and a move to the cloud (the SaaS part). By the 2015 BbWorld conference Michael shared how Ultra was a year late and not yet ready for schools to test. The company has tripped over itself in not getting product out the door and not being able to create effective messaging. Just what is Learn Ultra and Learn SaaS and when will real colleges and universities get to evaluate them?

    When we asked when Learn Ultra would be available for schools to actively pilot (real courses, real students, with major integrations to student rosters, etc), it was interesting to hear both Ballhaus and Blot take a very different approach and give what appears to be much more conservative estimates. Learn Ultra should be available for limited-functionality pilots for specific faculty (e.g. for courses not using the LMS heavily) by Fall 2016 and more broadly for institutions in Spring 2017, leading to general availability in Summer or Fall 2017.

    It is encouraging that Blackboard appears to be increasing its focus on getting the core LMS product updates, and we have also noticed a tighter message about Ultra over the past two months. There is now a Learn Ultra preview for educators, where people can sign up and play around with courses both in Original View (what you know as Learn 9.1) and Ultra View (the new UX). Part of the purpose of this preview is to enable customers to get a better feel of Learn SaaS and also to help them determine whether a Fall 2016 or a Spring 2017 Learn Ultra pilot makes sense for them.

    We will bring you more analysis of the Learn Ultra preview and of the broader analysis of the organizational changes at Blackboard in future posts. Stay tuned, and you can also sign up for more information on our upcoming e-Literate LMS subscription service.

    "2016 is going to be an eventful year for the LMS" ® by @mfeldstein67

    — Phil Hill (@PhilOnEdTech) February 19, 2016

    The post Blackboard CEO’s First 100 Days: Reorganization and Learn Ultra Updates appeared first on e-Literate.

    Storage on Azure

    Pat Shuff - Thu, 2016-04-21 01:07
    Yesterday we were out on a limb. Today we are going to be skating on thin ice. Not only do I know less about Azure than AWS but Microsoft has significantly different thoughts and solutions on storage than the other two cloud vendors. First, let's look at the available literature on Azure storage

    There are four types of storage available with Azure storage services; blob storage, table storage, queue storage, and file storage. Blob storage is similar to the Oracle Block Storage or Amazon S3 storage. It provides blocks of pages that can be used for documents, large log files, backups, databases, videos, and so on. Blobs are objects placed inside of containers that have characteristics and access controls. Table storage offers the ability to store key/attribute entries in a semi-structured dataset similar to a NoSQL database. The queue storage provides a messaging system so that you can buffer and sequence events between applications. The third and final is file based storage similar to dropbox or google docs. You can read and write files and file shares and access them through SMB file mounts on Windows systems.

    Azure storage does give you the option of deciding upon your reliability model by selecting the replication model. The options are locally triple redundant storage, replication between two data centers, replication between different geographical locations, or read access geo-redundant storage.

    Since blob storage is probably more relevant for what we are looking for, let's dive a little deeper into this type of storage. Blobs can be allocated either as block blobs or page blobs. Block blobs are aggregation of blocks that can be allocated in different sizes. Page blobs are of smaller fixed size chunks of 512 bytes for each page blob. Page blogs are the foundation of virtual machines and are used by default to support operating systems running in a virtual machine. Blobs are allocated into containers and inherit the characteristics of the container. Blobs are accessed via REST apis. The address of a blob is formatted as http://(account-name) Note that the account name is defined by the user. It is important to note that the account-name is not unique to your account. This is something that you create and Microsoft adds it to their DNS so that your ip address on the internet can be found. You can't choose simple names like test, testing, my, or other common terms because they have been allocated by someone else.

    To begin the process we need to log into the Azure portal and browser for the Storage create options.

    Once we find the storage management page we have to click the plus button to add a new storage resource.

    It is important to create a unique name. This name will be used as an extension of the REST api and goes in front of the server address. This name must be unique so picking something like the word "test" will fail since someone else has already selected it.

    In our example, we select wwpf which is an abbreviation for a non-profit that I work with, who we play for. We next need to select the replication policy to make sure that the data is highly available.

    Once we are happy with the name, replication policy, resource group, and payment method, we can click Create. It takes a while so we see a deploying message at the top of the screen.

    When we are finished we should see a list of storage containers that we have created. We can dive into the containers and see what services each contains.

    Note that we have the option of blob, table, queue, and files at this point. We will dive into the blob part of this to create raw blocks that can be used for backups, holding images, and generic file storage. Clicking on the blob services allows us to create a blob container.

    Note that the format of the container name is critical. You can't use special characters or capital letters. Make sure that you follow the naming convention for container names.

    We are going to select a blob type container so that we have access to raw blocks.

    When the container is created we can see the REST api point for the newly created storage.

    We can examine the container properties by clicking on the properties button and looking at when it was created, lease information, file count, and other things related to container access rights.

    The easiest way to access this newly created storage is to do the same thing that we did with Oracle Storage. We are going to use the CloudBerry Explorer. In this gui tool we will need to create an attachment to the account. Note that the tool used for Azure is different from the Oracle and Amazon tools. Each cost a little money and they are not the same tool unfortunately. They also only work on a Windows desktop which is challenging if you use a Mac of Linux desktop.

    To figure out your access rights, go to the storage management interface and click on the key at the top right. This should open up a properties screen showing you the account and shared access key.

    From here we can access the Azure blob storage and drag and drop files. We first add the account information then navigate to the blob container and can read and write objects.

    In this example, we are looking at virtual images located on our desktop "E:\" drive and can drag and drop them into a blob container for use by an Azure compute engine.

    In summary, Azure storage is very similar to Amazon S3 and Oracle Storage Cloud Services. The cost is similar. The way we access it is similar. The way we protect and restrict access to it is similar. We can address it through a REST api (which we did not detail) and can access it from our desktop or compute server running in Azure. Overall, storage in the cloud is storage in the cloud. You need to examine your use cases and see which storage type works best for you. Microsoft does have an on-premise gateway product called Azure SimpleStor which is similar to the Amazon Storage Gateway or the Oracle Cloud Storage Appliance. It is more of a hardware solution that attaches via iSCSI to existing servers.

    Learning to answer questions for yourself!

    Tim Hall - Thu, 2016-04-21 00:56

    notes-514998_640It’s not important that you know the answer. It’s important you know how to get the answer!

    I’m pretty sure I’ve written this before, but I am constantly surprised by some of the questions that come my way. Not surprised that people don’t know the answer, but surprised they don’t know how to get the answer. The vast majority of the time someone asks me a question that I can’t answer off the top of my head, this is what I do in this order.

    1. Google their question, often using the subject line of their post or email. A lot of the time, the first couple of links will give me the answer. Sometimes it’s one of my articles that gives me the answer.

    Data Encryption at Rest in Oracle MySQL 5.7

    Pythian Group - Wed, 2016-04-20 12:28


    I’ve previously evaluated MariaDB’s 10.1 implementation of data encryption at rest (, and recently did the same for Oracle’s implementation ( in their MySQL 5.7.


    First, here’s a walkthrough of enabling encryption for MySQL 5.7:

    1. Install keyring plugin.

    1a. Add the following to the [mysqld] section of /etc/my.cnf:


    <script src=””></script>
    1b. Restart the server:

    service mysqld restart

    1c. Verify:

    | keyring_file | ACTIVE        |

    2. Ensure innodb_file_per_table is on.

    2a. Check.

    mysql> show global variables like 'innodb_file_per_table';
    | Variable_name         | Value |
    | innodb_file_per_table | ON    |

    2b. If OFF, add the following to the [mysqld] section of /etc/my.cnf, restart, and alter each existing table to move it to its own tablespace:


    Get list of available InnoDB tables:

    mysql>select table_schema, table_name, engine from information_schema.tables where engine='innodb' and table_schema not in ('information_schema');

    Run ALTER … ENGINE=INNODB on each above InnoDB tables:



    Next, I walked through some testing.

    1. Create some data.

    [root@localhost ~]# mysqlslap --concurrency=50 --number-int-cols=2 --number-char-cols=3 --auto-generate-sql --auto-generate-sql-write-number=10000 --no-drop

    2. Observe the mysqlslap.t1 table is not automatically encrypted. Unlike MariaDB’s implementation, there is not an option to encrypt tables by default.

    2a. Via the mysql client:

    Empty set (0.05 sec)

    2b. Via the command line:

    (Install xxd if required.)

    [root@localhost ~]# yum install vim-common
    [root@localhost ~]# xxd /var/lib/mysql/mysqlslap/t1.ibd | grep -v "0000 0000" | less
    0010dc0: 5967 4b30 7530 7942 4266 664e 6666 3143  YgK0u0yBBffNff1C
    0010dd0: 5175 6470 3332 536e 7647 5761 3654 6365  Qudp32SnvGWa6Tce
    0010de0: 3977 6576 7053 3730 3765 4665 4838 7162  9wevpS707eFeH8qb
    0010df0: 3253 5078 4d6c 6439 3137 6a7a 634a 5465  2SPxMld917jzcJTe

    3. Insert some identifiable data into the table:

    mysql> <strong>insert</strong> into mysqlslap.t1 values (1,2,"private","sensitive","data");
    Query OK, 1 row affected (0.01 sec)
    mysql> select * from mysqlslap.t1 where charcol2="sensitive";
    | intcol1 | intcol2 | charcol1 | charcol2  | charcol3 |
    |       1 |       2 | private  | sensitive | data     |
    1 row in set (0.02 sec)

    4. Observe this data via the command line:

    [root@localhost ~]# xxd /var/lib/mysql/mysqlslap/t1.ibd | grep -v "0000 0000" | less
    04fa290: 0002 7072 6976 6174 6573 656e 7369 7469  ..privatesensiti

    5. Encrypt the mysqlslap.t1 table:

    mysql> <strong>alter</strong> table mysqlslap.t1 encryption='Y';
    Query OK, 10300 rows affected (0.31 sec)
    Records: 10300  Duplicates: 0  Warnings: 0

    6. Observe the mysqlslap.t1 table is now encrypted:

    6a. Via the mysql client:

    | mysqlslap    | t1         | ENCRYPTION="Y" |

    6b. Via the command line:

    [root@localhost ~]# xxd /var/lib/mysql/mysqlslap/t1.ibd | grep "private"
    [root@localhost ~]#

    6c. Observe snippet of the file:

    [root@localhost ~]# xxd /var/lib/mysql/mysqlslap/t1.ibd | grep -v "0000 0000" | less
    0004160: 56e4 2930 bbea 167f 7c82 93b4 2fcf 8cc1  V.)0....|.../...
    0004170: f443 9d6f 2e1e 9ac2 170a 3b7c 8f38 60bf  .C.o......;|.8`.
    0004180: 3c75 2a42 0cc9 a79b 4309 cd83 da74 1b06  &amp;lt;u*B....C....t..
    0004190: 3a32 e104 43c5 8dfd f913 0f69 bda6 5e76  :2..C......i..^v

    7. Observe redo log is not encrypted:

    [root@localhost ~]# xxd /var/lib/mysql/ib_logfile0 | less
    23c6930: 0000 0144 0110 8000 0001 8000 0002 7072
    23c6940: 6976 6174 6573 656e 7369 7469 7665 6461  ivatesensitiveda
    23c6950: 7461 3723 0000 132e 2f6d 7973 716c 736c  ta7#..../mysqlsl

    This is expected because the documentation ( reports encryption of files outside the tablespace is not supported: “Tablespace encryption only applies to data in the tablespace. Data is not encrypted in the redo log, undo log, or binary log.”


    I found in my testing of MariaDB’s implementation of data encryption at rest that there were still places on the file system that a bad actor could view sensitive data. I’ve found the same in this test of Oracle’s implementation. Both leave data exposed in log files surrounding the tablespace files.


    As a bonus to this walkthrough, during this testing, the table definition caught my eye:

    mysql> show create table mysqlslap.t1\G
    *************************** 1. row ***************************
           Table: t1
    Create Table: CREATE TABLE `t1` (
      `intcol1` int(32) DEFAULT NULL,
      `intcol2` int(32) DEFAULT NULL,
      `charcol1` varchar(128) DEFAULT NULL,
      `charcol2` varchar(128) DEFAULT NULL,
      `charcol3` varchar(128) DEFAULT NULL
    1 row in set (0.00 sec)

    As discussed in, the MariaDB implementation does not include the “encrypted=yes” information in the table definition when tables are implicitly encrypted.

    I was curious what would happen if I did a mysqldump of this encrypted table and attempted to restore it to a nonencrypted server. DBAs expect mysqldump to create a portable file to recreate the table definition and data on a different version of mysql. During upgrades, for example, you might expect to use this for rollback.

    Here is my test. I first did the dump and looked inside the file.

    [root@localhost ~]# mysqldump mysqlslap t1 > mysqlslap_t1_dump
    [root@localhost ~]# less mysqlslap_t1_dump
    CREATE TABLE `t1` (
      `intcol1` int(32) DEFAULT NULL,
      `intcol2` int(32) DEFAULT NULL,
      `charcol1` varchar(128) DEFAULT NULL,
      `charcol2` varchar(128) DEFAULT NULL,
      `charcol3` varchar(128) DEFAULT NULL
    <strong>INSERT</strong> INTO `t1` VALUES (

    As expected, that definition makes the dump less portable. The restore from dump is not completed and throws an error (this is not remedied by using –force):

    On a slightly older 5.7 version:

    mysql> select version();
    | version() |
    | 5.7.8-rc  |
    [root@centosmysql57 ~]# mysql mysqlslap < mysqlslap_t1_dump
    ERROR 1064 (42000) at line 25: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'ENCRYPTION='Y'' at line 7

    On a different fork:

    MariaDB [(none)]> select version();
    | version()       |
    | 10.1.12-MariaDB |
    1 row in set (0.00 sec)
    [root@maria101 ~]# mysql mysqlslap < mysqlslap_t1_dump
    ERROR 1911 (HY000) at line 25: Unknown option 'ENCRYPTION'

    This doesn’t have anything to do with the encrypted state of the data in the table, just the table definition. I do like the encryption showing up in the table definition, for better visibility of encryption. Maybe the fix is to have mysqldump strip this when writing to the dump file.

    Categories: DBA Blogs

    Well Run: Oracle PaaS and Fusion Middleware Community Forum XXII

    Usable Apps - Wed, 2016-04-20 12:03

    The Oracle Fusion Middleware Partner Community Forum, run by Oracle EMEA Alliances and Channels Community Manager Jürgen Kress (@soacommunity) is the premiere EMEA event to attend for Oracle Partners who build solutions using Oracle technology.

    This forum is one awesome annual opportunity for partners to come together to show off their greatness, find out about the latest technology from Oracle, and take advantage of hands-on workshops. The forum is about learning and working collaboratively and generating ideas that'll convert into cloud and middleware business wins for partners.

    The Oracle PaaS and Fusion Middleware Community Forum XXII (note that emphasis on PaaSwas held in Valencia, Spain, March 15-18, 2016, which coincided with the amazing color and sounds of Valencia's San José Fallas festival.

    OAUX at the OFMFORUM

    We run this cloud: #OAUX at the #OFMFORUM 

    Jürgen’s report from the event makes impressive reading: There were over 650 attendees, of which about 80% were external (to Oracle) partners, with many already working on live PaaS projects and with lots more PaaS projects in the planning phase. This year saw a big attendance from the Oracle U.S. HQ  team, too, along with some 22 OTN ACEs from around the world. (I caught up with my old pal Rolando Carrasco [@borland_c] from Mexico and many others.)

    All of the sessions were rated "excellent" or "good" in the feedback survey (not one was rated average or below). Bottom line: The event is excellent value for money and a very attractive proposition for partners working on PaaS and middleware projects, and it is of great interest to those working with Oracle ADF, BPM, JET, MAF, SOA, and so on and the range of Oracle Cloud services (ICS, IoT, JCS, MCS, PCS, and more). 

    This year a User Experience and BPM track was added to the program. With so many PaaS partners interested in cloud UX as a competitive differentiator, the forum was a "must-do" outreach opportunity for Oracle Applications User Experience (OAUX). I represented OAUX, co-presenting with Lonneke Dikmans (@lonnekedikmans), Managing Partner at eProseed NL, and spoke about the OAUX Cloud UX Rapid Development Kit (RDK) and how we enable partners to rapidly design, build, and deploy reusable simplified UI cloud solutions with Oracle ADF, Oracle Alta UI, and other technology.

    Lonneke talks about winning business using developer productivity offered by the RDK

    Lonneke talks about winning business using the developer productivity and other benefits offered by the RDK. 

    (Of course, Valencia is La Ciudad del Running, too. How could I not attend?)

    I demoed the Release 10 Cloud UX RDK and showed off the design patterns, eBooks, and developer workspace involved, highlighting key features of the RDK, such as how partners can integrate typical web services easily and quickly deploy a winning simplified UI custom app to PaaS. An excellent overview of the RDK is available on the Fusion Applications Developer Relations blog.
    Lonneke provided the partner side of our enablement story by telling the audience about the eProseed experience of developing a smart city IoT solution to enable modern ways of working (the famous activity-based approach of Erik Veldhoen). eProseed achieved some impressive results from the project through customer insight, storyboards, the RDK's UX design patterns, Oracle ADF templates and components, Oracle Alta UI, and with workshops that directly involved the customer and OAUX. eProseed benefitted from:
    • A higher quality solution where the concept of a modern working style fit easily with the design and development framework of the RDK. 
    • Reusable, consistent development through library and component reuse across different use cases.
    • Increased developer productivity resulting in a faster time to market, a quicker transfer of the ownership of the solution to the customer, and tighter collaboration between their design and development teams. Nothing got "lost in translation."
    • An impressive user experience based on the simplified UI Glance, Scan, Commit design philosophy. This approach crafts solutions that are easy to use because of the application of proven UX design patterns, which also enables users to make seamless transitions between the cloud's integrated apps while maintaining flexible ways of working across different devices using data in the cloud and responsive UIs.
    • An enhanced partner reputation for eProseed. Showcasing a solution with a clear eProseed UX signature offers customers confidence about pursuing solutions in the cloud.

    You can read more about the enablement event we held with eProseed last year.

    Lonneke concluded our presentation by illustrating real results by demoing the very cool "Licht is Uit" IoT solution.

    eProseed is right up there in EMEA with providing solutions that have a smart user experience, are built with Oracle technology, and that leverage OAUX enablement. Catch up with any eProseed sessions you can at other events! 

    Jurgen and Lonneke at the speaker's dinner. Post-sailing and post-running!

    Jürgen and Lonneke at the speakers' dinner. Post-sailing and post-running! 

    And the running? Ah yes. Oracle folks and partners did some running in the city proper and on the coast, too, taking in all of the sights. I am pleased to say that the OAUX running club Design Time @ Run Time technical shirts were a big hit, too. Watch out for those at the next event!

    The Ultan and Lonneke Inaugural EMEA Forum 10K

    The Lonneke and Ultan Inaugural EMEA Forum 10K in Valencia. A half marathon is planned for next year's event.

    Partners with access to the community's Oracle Beehive space can now get my own and Lonneke's presentation at the Forum.


    See you at Forum XXIII!

    Log Buffer #470: A Carnival of the Vanities for DBAs

    Pythian Group - Wed, 2016-04-20 11:39

    This Log Buffer Edition rounds up blog posts from Oracle, SQL Server and MySQL.


    jq is a unix utility that can parse json files and pull out individual elements – think of it as sed/awk for json files.

    Some thoughts about analyzing performance problems.

    Microsites: Add a Map Component and Publish your Microsite

    New Installation Cookbook: Oracle Linux 6.7 with Oracle RAC

    Are you worried about what to do now that Discoverer is almost out of support

    Automatic Big Table Caching in RAC


    SQL Server:

    sp_msforeachdb: Improving on an Undocumented Stored Procedure

    The Practical Problems of Determining Equality and Equivalence in SQL

    TEMPDB Enhancements in SQL Server 2016

    Performance Surprises and Assumptions : DATEADD()

    Knee-Jerk Performance Tuning : Incorrect Use of Temporary Tables



    Virtual Hosting with vsftpd + TLS encryption and MySQL on Ubuntu 15.10

    MySQL 5.7.12 – Part 4: A new MySQL Command Line Shell

    Database Firewall Filter in MaxScale 1.4.1

    Orchestrator-agent: How to recover a MySQL database

    Rosetta Stone: MySQL, Pig and Spark (Basics)

    Categories: DBA Blogs

    5 Reasons to Take a New Look at Oracle ERP Cloud

    Linda Fishman Hoyle - Wed, 2016-04-20 11:17

    A Guest Post by Vice President Terrance Wampler, Oracle Financials Product Strategy (pictured left) 

    Is your finance team spending its time collecting data, updating spreadsheets, and developing reports? If so, it's probably just keeping score for you rather than changing the game. To change the game, your team needs modern technology so it can do value-added work. Wampler presents us with five reasons why nearly two thousand companies around the world have chosen Oracle ERP Cloud―and why you should too. This is a summary of the post and here's a link to the full article.

    Reason #1: Increase Insight
    Insight is listed as #5 in the article, but it’s the one reason why every customer chooses Oracle ERP Cloud. It has a unique reporting platform built with a multi-dimensional model that pulls data directly from the general ledger. This provides real-time reporting on live financial information from a single, central point. Now, finance teams have up-to-the-minute insight into profitability and performance instead of waiting for month end.
    Reason #2: Optimize
    Most people think “big” when they think of Oracle. But SaaS has opened the door to companies that never would have considered us before. Many of our ERP Cloud customers are fast growing companies ($10-$20M revenue) that want the power of an enterprise system. They’ll never have to worry about outgrowing ERP Cloud, which is capable of processing over 230 million transactions per hour. And, they’ll be ready to expand geographically with country-specific tax and legal reporting, payment processing, and budgetary control.
    Reason #3: Simplify
    Traditional ERP systems did wonders for process efficiency, and now it’s time to do the same for information workers. ERP Cloud is designed to be ultra efficient for today’s data-driven, on-the-go employees. They know what needs their immediate attention with real-time graphical indicators on dashboards and work areas. The new infolets page also allows them to see the status of key performance indicators and drill further for more detail.
    Reason #4: Modernize
    Oracle ERP Cloud is made for the digital workplace, enabling finance teams to work socially, collaboratively, and logically. Social capabilities are embedded in the business processes, so team members can always track and refer back to conversations immediately. Other innovations include new integrations between Oracle Procurement Cloud and Oracle Project Management Cloud. Buyers can manage supplier negotiations and requests-for-quote as projects instead of negotiation-by-spreadsheet. They can also author documents using standard, pre-approved contract language; use embedded social workflows to collaborate and review the documents; upload supplier responses into the system; and compare bids using live, negotiation-monitoring analytics.
    Reason #5: Enforce Controls
    ERP Cloud meets the highest security requirements and standardizes data, rules and processes across the enterprise, including accounting, taxes, and payments.

    PS360: A Utility to Extract and Present PeopleSoft Configuration and Performance Data

    David Kurtz - Wed, 2016-04-20 11:16
    IntroductionMost of a PeopleSoft application is itself stored in the database in PeopleTools tables.  Therefore there is lot of information about the configuration and operation of a PeopleSoft system.  There are also performance metrics, particularly about batch processes.
    PS360 is a new tool on which I am working.  It just uses SQL scripts to extract that data to html files, and package them up in a zip file so that they can be sent for further analysis.  The style and method is closely modelled on Enkitec's EDB360 by Carlos Sierra.  This is another free tool used for health check and performance analysis of any Oracle database system.  PS360 aims to gather PeopleSoft specific information that is not presented by EDB360.  It also runs in Oracle's SQL*Plus tool, and so is only available for use with an Oracle database.
    Every section of PS360 is just the output of a SQL query, sometimes pre-processing is done in an anonymous PL/SQL block.  It does not install anything into the database, and does not update any table (other than the PLAN_TABLE which is used for temporary working storage).  Each report is in tabular and/or graphical format.  All the charts are produced with the Google chart API.
    The output falls into three sections.  
    • Configuration: Simple reports of certain configuration tables.
    • Performance Metrics: Process Scheduler and Application Engine timings
    • Consistency Checks: both within PeopleTools tables and between the PeopleSoft and Oracle catalogues.
    Sample PS360 Index Page generated on PeopleSoft Demo DatabaseInstructionsThe tool can be run by anyone with access to the PeopleSoft Owner database user (usually SYSADM).  That user will already have privilege to read the Oracle catalogue.
    Download the tool and unzip it into a directory.  Navigate to the ps360 (master) directory, open SQL*Plus and connect as SYSADM.  Execute the script ps360.sql.  The output will be written to a zip file in the same directory.  Unpack that zip file on your own PC and open the file ps360_[database name]_0_index.html with a browser.
    Feedback EnhancementsI am looking for feedback about the tool, and suggestions for further enhancements.
    Please either leave comments here or e-mail me at

    ©David Kurtz

    Partition Storage -- 2 : New Rows Inserted in 12.1 Partitioned Table

    Hemant K Chitale - Wed, 2016-04-20 09:41
    Following up from the previous post,  given the current sizes of the Partitions (The underlying Tablespace is AUTOALLOCATE)  :

    SQL> col segment_name format a30
    SQL> col partition_name format a12
    SQL> l
    1 select segment_name, partition_name, segment_type, bytes/1024
    2 from user_segments
    3 where segment_name like 'MY_PART_%'
    4* order by 1,2
    SQL> /

    ------------------------------ ------------ ------------------ ----------

    8 rows selected.


    I shall now insert rows so that a Partition has to grow beyond the first extent.

    SQL> insert into my_part_tbl                      
    2 select 25, 'New Row'
    3 from dual
    4 connect by level < 100001;

    100000 rows created.

    SQL> insert into my_part_tbl
    2 select 45, 'New Row'
    3 from dual
    4 connect by level < 500001;

    500000 rows created.

    SQL> /

    500000 rows created.

    SQL> commit;

    Commit complete.

    SQL> select segment_name, partition_name, segment_type, bytes/1024
    2 from user_segments
    3 where segment_name like 'MY_PART_%'
    4 order by 1,2;

    ------------------------------ ------------ ------------------ ----------

    8 rows selected.


    So, now, the first Partition P_100 has grown to exceed 24MB. Let's check the Extent Sizes.

    SQL> select extent_id, blocks, blocks*8192/1024 Size_KB
    2 from user_extents
    3 where segment_name = 'MY_PART_TBL'
    4 and segment_type = 'TABLE PARTITION'
    5 and partition_name = 'P_100'
    6 order by 1;

    ---------- ---------- ----------
    0 1024 8192
    1 1024 8192
    2 1024 8192

    SQL> l
    1 select extent_id, blocks, blocks*8192/1024 Size_KB
    2 from user_extents
    3 where segment_name = 'MY_PART_TBL_NDX'
    4 and segment_type = 'INDEX PARTITION'
    5 and partition_name = 'P_100'
    6* order by 1
    SQL> /

    ---------- ---------- ----------
    0 8 64
    1 8 64
    2 8 64
    3 8 64
    4 8 64
    5 8 64
    6 8 64
    7 8 64
    8 8 64
    9 8 64
    10 8 64
    11 8 64
    12 8 64
    13 8 64
    14 8 64
    15 8 64
    16 128 1024
    17 128 1024
    18 128 1024
    19 128 1024
    20 128 1024
    21 128 1024
    22 128 1024
    23 128 1024
    24 128 1024
    25 128 1024
    26 128 1024
    27 128 1024
    28 128 1024
    29 128 1024
    30 128 1024
    31 128 1024
    32 128 1024
    33 128 1024
    34 128 1024
    35 128 1024
    36 128 1024
    37 128 1024
    38 128 1024
    39 128 1024
    40 128 1024
    41 128 1024
    42 128 1024

    43 rows selected.


    So, the Table Partition grew as 8MB Extents but the Index Partition used the normal AutoAllocate routine of 64KB Extents for the first 1MB (16 Extents) followed by 1MB Extents.

    What might have been the expected size of the Table Partition ?   ** see the UPDATE at the end of this post.

    SQL> exec dbms_stats.gather_table_stats('','MY_PART_TBL',granularity=>'ALL');

    PL/SQL procedure successfully completed.

    SQL> select partition_name, trunc(1.2*avg_row_len*num_rows/1024) Expected_KB
    2 from user_tab_partitions
    3 where table_name = 'MY_PART_TBL'
    4 order by 1
    5 /

    ------------ -----------
    P_100 14179
    P_200 0
    P_300 0
    P_400 0
    P_MAX 0


    I use a 20% factor to account for PCTFREE and block/row overheads.  **  see the UPDATE at the end of  this post.

    Next, what if I use Parallel Insert ?

    SQL> alter table my_part_tbl parallel 4;

    Table altered.

    SQL> alter session enable parallel dml ;

    Session altered.

    SQL> insert /*+ PARALLEL (MANUAL) */
    2 into my_part_tbl
    3 select 125, 'New Row'
    4 from dual
    5 connect by level < 100001;

    100000 rows created.

    SQL> commit;

    Commit complete.

    SQL> insert /*+ PARALLEL (MANUAL) */
    2 into my_part_tbl
    3 select 125, 'New Row'
    4 from dual
    5 connect by level < 500001;

    500000 rows created.

    SQL> commit;

    Commit complete.

    SQL> insert /*+ PARALLEL (MANUAL) */
    2 into my_part_tbl
    3 select 125, 'New Row'
    4 from dual
    5 connect by level < 500001;

    500000 rows created.

    SQL> commit;

    Commit complete.

    SQL> l
    1 select segment_name, partition_name, segment_type, extents, bytes/1024
    2 from user_segments
    3 where segment_name like 'MY_PART_%'
    4* order by 1,2
    SQL> /

    ------------------------- ------------ ------------------ ---------- ----------

    8 rows selected.


    Note the size of P_200.  Parallel Insert actually allocated more extents because each PX server process created at least 1 extent for the rows it inserted (Note : For a smaller row-count it is not necessary that each PX server would have inserted rows, some may not have). So, Parallel Insert can result in more allocated space.

    ** UPDATE :  What about the "expected size" computation based on AVG_ROW_LEN x NUM_ROWS ?

    SQL> select partition_name, avg_row_len, num_rows, blocks, trunc(1.2*avg_row_len*num_rows/8192) Expected_Blocks
    2 from user_tab_partitions
    3 where table_name = 'MY_PART_TBL'
    4 order by 1;

    ------------ ----------- ---------- ---------- ---------------
    P_100 11 1100001 3022 1772
    P_200 12 1100001 3668 1933
    P_300 26 1 1006 0
    P_400 28 1 1006 0
    P_MAX 0 0 0 0


    This shows how any"formula" based on AVG_ROW_LEN and NUM_ROWS can be inaccurate in forecasting the expected size (blocks used).  Also note that P_100 was populated with a Serial Insert while P_200 was populated with a Parallel (Degree=4) Insert.

    Categories: DBA Blogs

    No Arguments Here – Use xargs for File Management Performance

    Pythian Group - Wed, 2016-04-20 09:30


    Database Administrators and System Administrators have this in common: managing a large number of log files is just part of the job on Linux systems.

    Tools such as logrotate significantly simplify the file management task for routinely created log files. Even so, there are still many ‘opportunities’ to exercise your command line fu to manage thousands or millions of files.  These may be files that need to be moved, removed or searched.

    When the files span multiple directories the find command is often used. The following command for instance will find all log files of a certain age and size and remove them.

    find . -name "*.log" -size +1M -exec rm  {} \;


    For a few files this will work just fine, but what happens if the number of files to be processed is several thousands, or even millions?

    The xargs Difference

    Let’s first create 200k files to use for testing. These files will all be empty, there is no need for any content for these tests.

    The script can be used to create the directories and empty files.

    As it takes some time to create the files, we will not use the rm command here, but rather just the file command. The command will be timed as well.

    #  time find . -type f -name file_\* -exec file {} \; >/dev/null
    real    1m24.764s
    user    0m4.624s
    sys     0m12.581s

    Perhaps 1 minute and 24 seconds seems to be a reasonable amount of time to process so many files.

    It isn’t.

    Let’s use a slightly different method to process these files, this time by adding xargs in a command pipe.

     time find . -type f -name file_\* | xargs file >/dev/null
    real    0m0.860s
    user    0m0.456s
    sys     0m0.432s

    Wait, what?!  0.8 seconds? Can that be correct?

    Yes, it is correct. Using xargs with find can greatly reduce the resources needed to iterate through files.

    How then, is is possible for the command that used xargs to complete so much faster than the command that did not use xargs?

    When iterating through a list of files with the -exec  argument to the find command, a new shell is forked for each execution of find.

    For a large number of files this requires a lot of resources.

    For demonstration purposes I will be using the ‘file’ command rather than ‘rm’.

    Could it be that the xargs method may have benefited from the caching effects of running the first find command?

    Could be – let’s run find … -exec again and see if it benefits from caching.

    # time find . -type f -name file_\* -exec file {} \; >/dev/null
    real    1m25.722s
    user    0m3.900s
    sys     0m11.893s

    Clearly any caching didn’t help find … -exec.

    Why Is xargs Fast?

    Why is the use of xargs so much faster than find? In short it is due to find starting a new process for each file it finds when the -exec option is used.

    The command ‘find | xargs’ was wrapped in a shell script to facilitate the use of strace.

    The script takes 2 arguments; the number of files to pipe to xargs and the number files that xargs should send to the file command for each invocation of file.

    The number of files to process is controlled by piping the output of find to head.

    The xargs –max-args argument is used to control how many arguments are sent to each invocation of find.

    We can now use strace with the -c option; -c accumulates a count of all calls along with timing information.

    Calling the script to run for the first 10000 files, with 1000 files sent to each invocation of find:

    # strace -c -f  ./ 10000 1000
    MAX_FILES: 10000
    MAX_ARGS: 1000
    Process 11268 attached
    Process 11269 attached
    Process 11267 resumed
    Process 11269 detached
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     99.55    0.080017        5001        16         2 wait4
      0.35    0.000280           0     12372           newfstatat
      0.09    0.000074           0       208           getdents
      0.01    0.000006           0     10000           lstat
      0.00    0.000000           0       199           read
      0.00    0.000000           0       276         1 write
      0.00    0.000000           0       384        91 open
      0.00    0.000000           0       313         4 close
      0.00    0.000000           0        68        42 stat
      0.00    0.000000           0       189           fstat
      0.00    0.000000           0         5         1 lseek
      0.00    0.000000           0       209           mmap
      0.00    0.000000           0        71           mprotect
      0.00    0.000000           0        37           munmap
      0.00    0.000000           0        72           brk
      0.00    0.000000           0        41           rt_sigaction
      0.00    0.000000           0        80           rt_sigprocmask
      0.00    0.000000           0         2           rt_sigreturn
      0.00    0.000000           0        13        12 ioctl
      0.00    0.000000           0        77        77 access
      0.00    0.000000           0         2           pipe
      0.00    0.000000           0         6           dup2
      0.00    0.000000           0         1           getpid
      0.00    0.000000           0        14           clone
      0.00    0.000000           0        14           execve
      0.00    0.000000           0         2           uname
      0.00    0.000000           0         4         1 fcntl
      0.00    0.000000           0       206           fchdir
      0.00    0.000000           0         5           getrlimit
      0.00    0.000000           0         1           getuid
      0.00    0.000000           0         1           getgid
      0.00    0.000000           0         1           geteuid
      0.00    0.000000           0         1           getegid
      0.00    0.000000           0         1           getppid
      0.00    0.000000           0         1           getpgrp
      0.00    0.000000           0        14           arch_prctl
      0.00    0.000000           0         2         1 futex
      0.00    0.000000           0         1           set_tid_address
      0.00    0.000000           0         1           set_robust_list
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.080377                 24910       232 total

    The largest chunk of time was spent in the wait4 system call. These are waits on execve, of which there were 14.

    Of the 14 calls to execve, there was 1 each for the use of bash (the script itself), find, head and xargs, leaving 10 calls to be consumed by file.

    The following command can be used if you would like to try this yourself:

    strace  -f -e trace=execve  ./ 10000 1000  2>&1 | grep execve

    What happens when the same type of test is run against find with the -exec argument?

    There is no method (that I can find in the man page anyway) by which we can limit the number of files that are sent to the program specified in the -exec argument of find.

    We can still learn what is going on, it is just necessary to wait 1.5 minutes for the command to complete.

    # strace -c -f find . -type f -name file_\*  -exec file {} \; >/dev/null
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     96.80    4.101094          21    200000           wait4
      0.69    0.029305           0    200000           clone
      0.46    0.019278           0   2602351   1400007 open
      0.44    0.018833           0    600001           munmap
      0.31    0.013108           0   3200017           mmap
      0.30    0.012715           0   1401173           fstat
      0.16    0.006979           0   1200006   1200006 access
      0.15    0.006543           0   1202345           close
      0.15    0.006288           0   1000004    600003 stat
      0.13    0.005632           0   1000004           read
      0.12    0.004981           0    200000           lstat
      0.09    0.003704           0    600026           brk
      0.07    0.003016           0   1000009           mprotect
      0.07    0.002776           0    200001    200000 ioctl
      0.03    0.001079           0    201169           newfstatat
      0.02    0.000806           0      2347           getdents
      0.01    0.000600           0    200000           write
      0.00    0.000003           0    200001           arch_prctl
      0.00    0.000002           0    202341           fchdir
      0.00    0.000000           0         3           rt_sigaction
      0.00    0.000000           0         1           rt_sigprocmask
      0.00    0.000000           0    400001    200000 execve
      0.00    0.000000           0         1           uname
      0.00    0.000000           0         1           fcntl
      0.00    0.000000           0         2           getrlimit
      0.00    0.000000           0         2         1 futex
      0.00    0.000000           0         1           set_tid_address
      0.00    0.000000           0         1           set_robust_list
    ------ ----------- ----------- --------- --------- ----------------
    100.00    4.236742              15811808   3600017 total


    You may have noticed there are twice as many calls to execve than there were files to process.

    This is due to something referenced in the comments of Unless a full path name is specified when running a command, the PATH variable is searched for that command. If the command is not found by the first invocation of execve, then another attempt is made the next directory in PATH.

    The following example shows the difference between using the command name only, and then using the fully pathed name of the file command.

    # strace -e trace=execve -f find -maxdepth 1 -type f -name \*.sh  -exec file {} \;  2>&1 | grep execve
    execve("/usr/bin/find", ["find", "-maxdepth", "1", "-type", "f", "-name", "*.sh", "-exec", "file", "{}", ";"], [/* 83 vars */]) = 0
    [pid  9267] execve("/usr/local/bin/file", ["file", "./"], [/* 83 vars */]) = -1 ENOENT (No such file or directory)
    [pid  9267] execve("/usr/bin/file", ["file", "./"], [/* 83 vars */]) = 0
    [pid  9268] execve("/usr/local/bin/file", ["file", "./"], [/* 83 vars */]) = -1 ENOENT (No such file or directory)
    [pid  9268] execve("/usr/bin/file", ["file", "./"], [/* 83 vars */]) = 0
    [pid  9269] execve("/usr/local/bin/file", ["file", "./"], [/* 83 vars */]) = -1 ENOENT (No such file or directory)
    [pid  9269] execve("/usr/bin/file", ["file", "./"], [/* 83 vars */]) = 0
    # strace -e trace=execve -f find -maxdepth 1 -type f -name \*.sh  -exec /usr/bin/file {} \;  2>&1 | grep execve
    execve("/usr/bin/find", ["find", "-maxdepth", "1", "-type", "f", "-name", "*.sh", "-exec", "/usr/bin/file", "{}", ";"], [/* 83 vars */]) = 0
    [pid  9273] execve("/usr/bin/file", ["/usr/bin/file", "./"], [/* 83 vars */]) = 0
    [pid  9274] execve("/usr/bin/file", ["/usr/bin/file", "./"], [/* 83 vars */]) = 0
    [pid  9275] execve("/usr/bin/file", ["/usr/bin/file", "./"], [/* 83 vars */]) = 0
    Too Much Space

    Regardless of how bad a practice it may be, there will be times that file and directory names may contain space characters. Literal spaces, newlines and tabs can all play havoc with file name processing;  xargs has you covered.

    Two files are created to demonstrate:


    # touch 'this filename has spaces' this-filename-has-no-spaces
    # ls -l
    total 0
    -rw-r--r-- 1 jkstill dba 0 Apr 15 09:28 this filename has spaces
    -rw-r--r-- 1 jkstill dba 0 Apr 15 09:28 this-filename-has-no-spaces

    What happens when the output of find it piped to xargs?


     find . -type f | xargs file
    ./this-filename-has-no-spaces: empty
    ./this:                        ERROR: cannot open `./this' (No such file or directory)
    filename:                      ERROR: cannot open `filename' (No such file or directory)
    has:                           ERROR: cannot open `has' (No such file or directory)
    spaces:                        ERROR: cannot open `spaces' (No such file or directory)

    The spaces in one of the filenames causes xargs to treat each word in the filename as a separate file.

    Because of this it is a good idea to use the -print0 and -0 args as seen in the following example. These arguments change the output terminator of find to the null character, as well as changing the input terminator of xargs to the null character to deal with space characters in file and directory names.


     find . -type f -print0  | xargs -0 file
    ./this-filename-has-no-spaces: empty
    ./this filename has spaces:    empty

    There is quite a bit more to xargs than this, I would encourage you to read the man page and experiment with the options to better learn how to make use of it.

    Hope For find

    For many versions of GNU find there is an easy modification that can be made to the command line that will cause the -exec option to emulate the method xargs uses pass input to a command.

    Simply by changing -exec command {} \; to  -exec command {} +, the find command will execute much faster than previously.

    Here the find command has matched the performance of xargs when processing 200k files:


    # time find . -type f -name file_\*  -exec file {} +  | wc
     200000  400000 8069198
    real    0m0.801s
    user    0m0.436s
    sys     0m0.404s

    This may mean a quick and simple change to maintenance scripts can yield a very large increase in performance.

    Does this mean there is no longer a need for xargs?  Not really, as xargs offers levels of control over the input to piped commands that simply are not available in the find command.

    If you’ve never used xargs, you should consider doing so, as it can reduce the resource usages on your systems and decrease the runtime for maintenance tasks.

    Categories: DBA Blogs

    Oracle Security Vulnerability Scoring Metric Change (CVSS)

    No, Oracle security vulnerabilities didn’t just get a whole lot worse this quarter.  Instead, Oracle updated the scoring metric used in the Critical Patch Updates (CPU) from CVSS v2 to CVSS v3.0 for the April 2016 CPU.  The Common Vulnerability Score System (CVSS) is a generally accepted method for scoring and rating security vulnerabilities.  CVSS is used by Oracle, Microsoft, Cisco, and other major software vendors.

    As we have discussed previously, CVSS v2 did score Oracle security vulnerabilities for the database, middleware, and applications lower than operating system and network component vulnerabilities.  Contrary to what many security researchers claim, the problem is with the CVSS standard, not manipulation of the scores by Oracle.  CVSS v2 puts a premium on the ability to compromise the entire operating system (i.e., root account) or device.  For most Oracle security vulnerabilities, across all products, it is very difficult to compromise the root operating system account by exploiting an Oracle Database, Fusion Middleware, or application (Oracle E-Business Suite, PeopleSoft, etc.) security bug.  Although, there are some exceptions mostly limited to the Oracle Database running on Microsoft Windows Server, which allow compromise of the administrator account.

    To account for this limitation in CVSS, Oracle included in the CPU advisory matrices for informational purposes only a “Partial+” to indicate where the entire database, middleware server, or application could be compromised.  However, this was not reflected in the score since the CVSS standard says a “Complete” impact “… is total information disclosure, resulting in all system files being revealed.”  As a result, Oracle CVSS v2 scores for critical or severe bugs tended to be 6.5 for the Oracle Database, 7.5 for Fusion Middleware, and 6.4 for applications like the Oracle E-Business Suite and PeopleSoft.

    CVSS v3.0 changes the scoring to put more of an emphasis on the asset or component being protected (i.e., database or application).  The key CVSS definition has changed from “system” to “impacted component.”  The scoring algorithm also includes more granularity for privileges required to exploit and the scope of the exploit, such as can a database attack compromise the underlying operating system.

    The Oracle CVSS v3.0 scores will be much higher now, especially for the Fusion Middleware and applications like Oracle E-Business Suite and PeopleSoft.  Critical Fusion Middleware security bugs will rise from 7.5 to 9.8.  Oracle E-Business Suite and PeopleSoft critical security bugs like unauthenticated SQL injection will jump from 6.4 to 9.8.  As almost all Oracle Database security bugs require database authentication, the Oracle Database CVSS scores will go from 6.5 to 8.8 for easy to exploit SQL injection vulnerabilities in PUBLIC packages.

    The critical risk associated with most Oracle security vulnerabilities is still critical.  Now the CVSS score properly reflects the critical nature of many of these bugs.

    Oracle Critical Patch Updates
    Categories: APPS Blogs, Security Blogs

    Oracle E-Business Suite 11i CPU Security Patches Only Available for Tier 1 Support Customers

    Oracle E-Business Suite 11i is impacted by 8 security vulnerabilities in the April 2016 CPU, which includes the Oracle Configurator and Oracle Complex Maintenance, Repair, and Overhaul security bugs listed under the Oracle Supply Chain Products Suite.

    Starting with the April 2016 Critical Patch Update (CPU), Oracle E-Business Suite 11i security patches are only available for Oracle customers with Tier 1 Support contracts, previously referred to as Advanced Customer Support (ACS).  Tier 1 Support must be purchased and is an additional fee on top of standard Oracle maintenance.  Optional Tier 1 Support will include CPU security patches through October 2016.

    CPU information for 11i has been moved from the standard quarterly CPU My Oracle Support (MOS) note for Oracle E-Business Suite to MOS Note ID 2126170.1 “Oracle E-Business Suite Release 11i Critical Patch Update Knowledge Document (For Oracle E-Business Suite 11i Tier 1 Support Customers).”

    For more information on CPU support for 11i, please see MOS Note ID 1596629.1 “ANNOUNCEMENT: Additional Coverage Options for 11.5.10 E-Business Suite Sustaining Support.”

    As an alternative to Oracle Tier 1 Support or as an additional layer of defense for Oracle E-Business Suite 11i, Integrigy’s web application firewall for Oracle E-Business Suite, AppDefend, provides virtual patching of Oracle E-Business Suite web security vulnerabilities, web application attack surface reduction, and protection from SQL injection and cross-site scripting (XSS) attacks.

    Oracle E-Business Suite, Oracle Critical Patch Updates
    Categories: APPS Blogs, Security Blogs

    A Taste of FinTech: Bitterballen and Banking in the Cloud with Profource

    Usable Apps - Wed, 2016-04-20 07:05

    Financial technology (#FinTech) innovation and the future of banking are hot topics. If you wondered for one tiny moment how the importance of financial applications and the cloud fit into the Oracle Cloud User Experience #PaaS4SaaS enablement that Oracle Applications User Experience (OAUX) offers Oracle Partners, well, here's one rocking example from EMEA!

    Recently, we (OAUX) held a hands-on enablement event at Oracle Nederland in Utrecht to work alongside one of our leading EMEA partners, Profource B.V., to design and build a simplified UI Banking Cloud solution using the Cloud UX Rapid Development Kit (RDK) for Release 10. This event was the culmination of upfront design and development exploration done collaboratively between the two teams online.

    Profource and OAUX Teams in Oracle Nederland

    Part of the Profource team with the OAUX enablers: (L-R): Julian Orr, Lancy Silveira, Ronald van Herpen, Martijn Rijpkema, Pam Koertshuis, and Ultan Ó Broin

    Held over 2.5 days, the Microsoft PowerPoint Karaoke was cut to a minimum as design and development teams stormed and formed around the solution requirements and worked fast to iterate designs and build a modern banking cloud solution that was then deployed as a service to the Profource cloud. A great success!

    Banking Cloud Simplified UI Launch Page Wireframe

    Part of the Banking Cloud simplified UI launch experience wireframe. Wireframing the solution using the RDK tools, garnering agreement, and transferring the design to the development team made for rapid, agile innovation and iteration, right through to deployment.

    Banking Cloud Statements Wireframe

    Simply that simplified UI again: This time for an Oracle ERP Cloud Release 10 solution. This is a wireframe view of part of the user experience flow, built using the RDK's Oracle ADF page templates and declarative components, the Oracle Alta UI design system, and the agile Learn-Design-Build approach of the RDK's guidance.

    You can read more about the Banking Cloud event on Profource consultant Pam Koertshuis's (@pkoertshuis) blog: Simplified UI for PAAS 4 SAAS solution.

    Profource consultant, Hakan Biroglu (@hakanbiroglu), said about the event:

    "We [the Profource team] have learned a lot. Not just on a technical level, on how to reuse your RDK, but also on an architectural and design level. Your workshop forced us to rethink our UX solution and to evaluate every item on every page in every flow, 'Why is this needed? What information does it provide me? What does it trigger me to do?'" 

    Normal 0 false false false EN-US JA X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:8.0pt; font-family:"Times New Roman Bold"; color:black; mso-fareast-language:JA;}

    If you're heading to OBUG's APPSCONNECTed16 event in Arnhem, check out the sessions about the Profource Banking Cloud and their other Release 10 cloud solutions (HCM, PPM, and Resource Management) and about how you can use the RDK and OAUX enablement to do the same for your business.

    Many thanks to the Profource team and to Oracle Nederland.  

    If you are an Oracle Partner who wants to get ahead in the Oracle Cloud, you can contact us through the usual OAUX channels or your Oracle PartnerNetwork contacts.

    Oracle Usable Apps Catds and Skerches

    Oracle Applications Cloud User Experience: Enablement for partners from design to deployment. It starts with a sketch . . . .

    Oh, the bittterballen?

    Since you asked...


    Bitterballen: An Oracle Nederland culinary delight that somehow always features as part of OAUX enablement events in Utrecht.