Skip navigation.

Feed aggregator

Old Castles

Pete Scott - Mon, 2014-10-20 06:12
Living here on the Kent Coast we are quite blessed with the number of castles within half and hour’s drive of our cottage. English Heritage manages several nearby castles or forts. The nearest, Richborough, is out and out Roman. We had a lot of Romans roaming around here, they even strolled past my cottage along […]

The Ad Hoc Reporting Myth

Tim Hall - Mon, 2014-10-20 01:42

Empowering users! Giving users access to the information they need, when they need it! Allowing users to decide what they need! These are all great ideas and there are plenty of products out there that can be used to achieve this. The question must be, is it really necessary?

There will always be some users that need this functionality. They will need up-to-the-second ad hoc reporting and will invest their time into getting the most from the tools they are given. There is also a large portion of the user base that will quite happily use what they are given and will *never* invest in the tool set. They don’t see it as part of their job and basically just don’t care.

Back when I started IT, most projects had some concept of a reporting function. A group of people that would discuss with the user base the type of reporting that was needed and identify what was *really needed* and what were just the never ending wish list of things that would never really be used. They would build these reports and they would go through user acceptance and be signed off. It sounds like the bad old days, but what you were left with were a bunch of well defined reports, written by people who were “relatively speaking” skilled at reporting. What’s more, the reporting function could influence the application design. The quickest way to notice that “One True Lookup Table” is a bad design is to try and do some reporting queries. You will soon change your approach.

With the advent of ad hoc reporting, the skills base gradually eroded. We don’t need a reporting function any more! The users are in charge! All we need is this semantic layer and the users can do it all for themselves! Then the people building the semantic layers got lazy and just generated what amounts to a direct copy of the schema. Look at any database that sits behind one of these abominations and I can pretty much guarantee the most horrendous SQL in the system is generated by ad hoc reporting tools! You can blame the users for not investing more time in becoming an expert in the tool. You can blame the people who built the semantic layer for doing a poor job. You can blame the tools. What it really comes down to is the people who used ad hoc reporting as a “quick and easy” substitute for doing the right thing.

There will always be a concept of “standard reports” in any project. Stuff that is known from day one that the business relies on. These should be developed by experts who do it using efficient SQL. If they are not time-critical, they can be scheduled to spread out the load on the system, yet still be present when they are needed. This would relieve some of the sh*t-storm of badly formed queries hitting the database from ad hoc reporting.

I’m going to file this under #ThoseWereTheDays, #GrumpyOldMen and #ItProbablyWasntAsGoodAsIRemember…

Cheers

Tim…

The Ad Hoc Reporting Myth was first posted on October 20, 2014 at 8:42 am.
©2012 "The ORACLE-BASE Blog". Use of this feed is for personal non-commercial use only. If you are not reading this article in your feed reader, then the site is guilty of copyright infringement.

Connecting to Pivotal Cloud Foundry Ops Metrics using Java VisualVM

Pas Apicella - Sun, 2014-10-19 21:34
The Pivotal Ops Metrics tool is a JMX extension for Elastic Runtime. Pivotal Ops Metrics collects and exposes system data from Cloud Foundry components via a JMX endpoint. Use this system data to monitor your installation and assist in troubleshooting. Below is the tile once installed and available with Pivotal Cloud Foundry Ops Manager


Once installed and configured, metrics for Cloud Foundry components automatically report to the JMX endpoint. Your JMX client uses the credentials supplied to connect to the IP address of the Pivotal Ops Metrics JMX Provider at port 44444

1. Start jvisualvm

2. Under plugin ensure you have the VisualVm-Mbeans plugin installed as shown below, or install it to be able to view the MBeans.



3. Create a JMX connection as shown below


4. Finally the CF MBeans can be viewed as shown below.



More Information

Deploying Pivotal Ops Metrics
http://docs.pivotal.io/pivotalcf/customizing/deploy-metrics.htmlhttp://feeds.feedburner.com/TheBlasFromPas
Categories: Fusion Middleware

SQLShell accessing Pivotal GemFire XD 1.3

Pas Apicella - Sun, 2014-10-19 19:09
I stumbled open SQLShell recently as per the URL below. Below I will show how you can connect to Pivotal GemFireXD using SQLShell. I used this to export query results using CSV output.

http://software.clapper.org/sqlshell/users-guide.html

Note: Assuming SQLShell is already installed and instructions below are for Mac OSX

1. Create a file in $HOME/.sqlshell/config as shown below, I just took the sample it ships with. Notice how I have added an alias for "gemfirexd", highlighted below.

# ---------------------------------------------------------------------------
# initialization file for SQLShell

[settings]
#colspacing: 2

[drivers]
# Driver aliases.
postgresql = org.postgresql.Driver
postgres = org.postgresql.Driver
mysql = com.mysql.jdbc.Driver
sqlite = org.sqlite.JDBC
sqlite3 = org.sqlite.JDBC
oracle = oracle.jdbc.driver.OracleDriver
access = sun.jdbc.odbc.JdbcOdbcDriver
gemfirexd = com.pivotal.gemfirexd.jdbc.ClientDriver

[vars]
historyDir: ${env.HOME}/.sqlshell

[db_postgres]
aliases: post
url: jdbc:postgresql://localhost:5432/sampledb
driver: postgres
user: ${system.user.name}
password:
history: $vars.historyDir/postgres.hist

[db_mysql]
#aliases:
driver: mysql
url: jdbc:mysql://localhost:3306/sampledb
user: ${system.user.name}
password:
history: $vars.historyDir/mysql.hist

[db_sqlite3]
aliases: sqlite3
url: jdbc:sqlite:/tmp/sample.db
driver: sqlite
history: $vars.historyDir/sqlite3.hist

[db_oracle]
aliases: ora
schema: example
url: jdbc:oracle:thin:@localhost:1521:sampledb
user: ${system.user.name}
password:
driver: oracle
history: $vars.historyDir/scrgskd

[db_access]
driver: access
url: jdbc:odbc:Driver={Microsoft Access Driver (*.mdb)};DBQ=/tmp/sample.mdb;DriverID=22}

2. Add Pivotal GemFireXd client driver "gemfirexd-client.jar" to "/Applications/sqlshell/lib"

3. With Pivotal GemFireXD cluster up and running connect and run some commands as shown below.
  
[Mon Oct 20 11:56:10 papicella@:~/vmware/software/sqlshell ] $ sqlshell gemfirexd,jdbc:gemfirexd://localhost:1527
SQLShell, version 0.8.1 (2012/03/16 09:43:31)
Copyright (c) 2009-2011 Brian M. Clapper
Using JLine
Type "help" for help. Type ".about" for more information.

sqlshell> .set schema APP

sqlshell> .show tables
ALL_EMPS APPLES_OFFHEAP CUSTOMERS
DEPT EMP EMPLOYEES
EMPS_IN_DEPT_10 EMPS_IN_DEPT_20 EMPS_IN_DEPT_30
EMPS_IN_DEPT_40 OFFICES ORDERDETAILS
ORDERS PAYMENTS PERSON
PRODUCTLINES PRODUCTS TEST_ASYNC
TEST_ASYNC2 TEST_CALLBACKLISTENER
sqlshell> select * from dept;
Execution time: 0.21 seconds
Retrieval time: 0.6 seconds
7 rows returned.

DEPTNO DNAME LOC
------ ---------- --------
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BRISBANE
50 MARKETING ADELAIDE
60 DEV PERTH
70 SUPPORT SYDNEY

sqlshell> .capture to /tmp/results.csv
Capturing result sets to: /tmp/results.csv

sqlshell> select * from emp where deptno = 10;
Execution time: 0.18 seconds
Retrieval time: 0.5 seconds
3 rows returned.

EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
----- ------ --------- ---- --------------------- ---- ---- ------
7782 CLARK MANAGER 7839 1981/06/09 00:00:00.0 2450 NULL 10
7839 KING PRESIDENT NULL 1981/11/17 00:00:00.0 5000 NULL 10
7934 MILLER CLERK 7782 1982/01/23 00:00:00.0 1300 NULL 10

sqlshell> .capture off
No longer capturing query results.

sqlshell>
http://feeds.feedburner.com/TheBlasFromPas
Categories: Fusion Middleware

Y Combinator companies has more funding than the sum total of all remaining accelerators

Nilesh Jethwa - Sun, 2014-10-19 13:05

After finishing our call with Bed Bugs , we decided to check out what the startup scene looks like. We used the data from seed-db to let our analytical juices flowing.

First we asked what is the top most program (duh!!) but by how much and who are next in the list and so on.

Like most Data scientists who believe in the power of simple bar graphs we used our first “chart weapon” of choice and here it is what it rendered.

Image

Y Combinator is freaking huge like a dinasaur, infact very much resembles the grass eating Sauropods. In fact we had to create a chart that was 3000 pixels wide just to accommodate all.

Image

See the resemblance between the chart and the Sauropod?

To get better perspective we rendered it in a Treemap as shown

Image

Looking at the treemap, Y Combinator occupies more than the sum total of all the remaining accelerators. That is super amazing but the problem our charts were not coming up beautiful. YC is clearly the outlier and was causing us difficulty to understand the remainder startup ecosystem.

We said, lets cut off the head to dig deeper.

The moment we filtered out YC from our analysis, all of the regions became colorful and that was certainly a visual treat.

Image

Now we could clearly see what are the other accelerators/programs that are roughly the same size.

For example,

TechStars Boulder and AngelPad are roughly the same

TechStars NYC, TechStars Boston and 500Startups are in the same club

Similarly DreamIT, fbFund and Mucker Lab share the same color.

Now let us try to see from the location angle

Image

So we re-established that YC is freaking huge and having them on a chart with other accelerators does not create beautiful visualizations.

ShareThis

Plan depth

Jonathan Lewis - Sun, 2014-10-19 11:20

A recent posting on OTN reminded me that I haven’t been poking Oracle 12c very hard to see which defects in reporting execution plans have been fixed. The last time I wrote something about the problem was about 20 months ago referencing 11.2.0.3; but there are still oddities and irritations that make the nice easy “first child first” algorithm fail because the depth calculated by Oracle doesn’t match the level that you would get from a connect-by query on the underlying plan table. Here’s a simple fail in 12c:


create table t1
as
select
	rownum 			id,
	lpad(rownum,200)	padding
from	all_objects
where	rownum <= 2500
;

create table t2
as
select	* from t1
;

-- call dbms_stats to gather stats

explain plan for
select
	case mod(id,2)
		when 1 then (select max(t1.id) from t1 where t1.id <= t2.id)
		when 0 then (select max(t1.id) from t1 where t1.id >= t2.id)
	end id
from	t2
;

select * from table(dbms_xplan.display);

It ought to be fairly clear that the two inline scalar subqueries against t1 should be presented at the same level in the execution hierarchy; but here’s the execution plan you get from Oracle:

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  2500 | 10000 | 28039   (2)| 00:00:02 |
|   1 |  SORT AGGREGATE      |      |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL  | T1   |   125 |   500 |    11   (0)| 00:00:01 |
|   3 |    SORT AGGREGATE    |      |     1 |     4 |            |          |
|*  4 |     TABLE ACCESS FULL| T1   |   125 |   500 |    11   (0)| 00:00:01 |
|   5 |  TABLE ACCESS FULL   | T2   |  2500 | 10000 |    11   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T1"."ID"<=:B1)
   4 - filter("T1"."ID">=:B1)

As you can see, the immediate (default?) visual impression you get from the plan is that one of the subqueries is subordinate to the other. On the other hand if you check the id and parent_id columns from the plan_table you’ll find that lines 1 and 3 are both direct descendents of line 0 – so they ought to have the same depth. The plan below is what you get if you run the 8i query from utlxpls.sql against the plan_table.


SQL> select id, parent_id from plan_table;

        ID  PARENT_ID
---------- ----------
         0
         1          0
         2          1
         3          0
         4          3
         5          0

--------------------------------------------------------------------------------
| Operation                 |  Name    |  Rows | Bytes|  Cost  | Pstart| Pstop |
--------------------------------------------------------------------------------
| SELECT STATEMENT          |          |     2K|    9K|  28039 |       |       |
|  SORT AGGREGATE           |          |     1 |    4 |        |       |       |
|   TABLE ACCESS FULL       |T1        |   125 |  500 |     11 |       |       |
|  SORT AGGREGATE           |          |     1 |    4 |        |       |       |
|   TABLE ACCESS FULL       |T1        |   125 |  500 |     11 |       |       |
|  TABLE ACCESS FULL        |T2        |     2K|    9K|     11 |       |       |
--------------------------------------------------------------------------------

So next time you see a plan and the indentation doesn’t quite seem to make sense, perhaps a quick query to select the id and parent_id will let you check whether you’ve found an example where the depth calculation produces a misleading result.

 

Update 20th Oct 2014

A question via twitter – does the error also show up with dbms_xplan.display_cursor(), SQL tuning sets, AWR, etc. or is it just a defect of explain plan. Since the depth is (probably) a derived value for display purposes that Oracle doesn’t use internally for executing the plan I would be inclined to assume that the defect is universal, but I’ve only checked it through explain plan/display, and through execution/display_cursor().

 

 

 


Learning Spark Lightning-Fast Big Data Analytics by Holden Karau, Andy Konwinski, Patrick Wendell, Matei Zaharia; O'Reilly Media

Surachart Opun - Sat, 2014-10-18 12:45
Apache Spark started as a research project at UC Berkeley in the AMPLab, which focuses on big data analytics. Spark is an open source cluster computing platform designed to be fast and general-purpose for data analytics - It's both fast to run and write. Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly much quicker than with disk-based systems like Hadoop MapReduce. Users can write applications quickly in Java, Scala or Python. In additional, it's easy to run standalone or on EC2 or Mesos. It can read data from HDFS, HBase, Cassandra, and any Hadoop data source.
If you would like a book about Spark - Learning Spark Lightning-Fast Big Data Analytics by Holden Karau, Andy Konwinski, Patrick Wendell, Matei Zaharia. It's a great book for who is interested in Spark development and starting with it. Readers will learn how to express MapReduce jobs with just a few simple lines of Spark code and more...
  • Quickly dive into Spark capabilities such as collect, count, reduce, and save
  • Use one programming paradigm instead of mixing and matching tools such as Hive, Hadoop, Mahout, and S4/Storm
  • Learn how to run interactive, iterative, and incremental analyses
  • Integrate with Scala to manipulate distributed datasets like local collections
  • Tackle partitioning issues, data locality, default hash partitioning, user-defined partitioners, and custom serialization
  • Use other languages by means of pipe() to achieve the equivalent of Hadoop streaming
With Early Release - 7 chapters. Explained Apache Spark overview, downloading and commands that should know, programming with RDDS (+ more advance) as well as working with Key-Value Pairs, etc. Easy to read and Good examples in a book. For people who want to learn Apache Spark or use Spark for Data Analytic. It's a book, that should keep in shelf.

Book: Learning Spark Lightning-Fast Big Data Analytics
Authors: Holden KarauAndy KonwinskiPatrick WendellMatei ZahariaWritten By: Surachart Opun http://surachartopun.com
Categories: DBA Blogs

Going dotty – Generating a Filename containing a parameter value in SQL*Plus

The Anti-Kyte - Sat, 2014-10-18 12:01

As I have alluded to previously, I was not born in the UK.
Nope, my parents decided to up-sticks and move from London all the way to the
other side of the world, namely Auckland.
Then they had me. Then they came back.
To this day, they refuse to comment on whether these two events were related.

I went back to New Zealand a few years ago.
As I wandered around places that I hadn’t seen since I was five, it was strange how memories that I had forgotten came flooding back.
That last sentence doesn’t make much sense. It’s probably more accurate to say that memories I hadn’t thought about for years came flooding back.

I recently remembered something else I once knew, and then forgot – namely how to generate a SQL*Plus file name which includes a parameter value.

The scenario

I’ve got a script that lists all of the employees in a given department :

accept deptno prompt 'Enter Department ID : '
spool department.lis

select first_name, last_name
from hr.employees
where department_id = &deptno
order by employee_id
/

spool off

Now, rather than it just creating a file called department.lis, I want to create a file that includes the department number I’m querying.

Obvious…but wrong

You might think the following is a reasonable attempt to do this :

accept deptno prompt 'Enter Department ID : '
spool department_&deptno.lis

select first_name, last_name
from hr.employees
where department_id = &&deptno
order by employee_id
/

spool off

Unfortunately, SQL*Plus insists on being obtuse and outputting the following file :

ls
department_10lis.lst

It is at this point that a colleague came to the rescue ( thanks William)…

Going dotty

This will do the job…

accept deptno prompt 'Enter Department ID : '

spool department_&deptno..lis

select first_name, last_name
from hr.employees
where department_id = &deptno
order by employee_id
/

spool off

Run this and we not only get :

Enter Department ID : 10
old   3: where department_id = &deptno
new   3: where department_id = 10

FIRST_NAME           LAST_NAME
-------------------- -------------------------
Jennifer             Whalen

SQL> 

…we get a file, appropriately named :

ls
department_10.lis

The magic here is that the “.” character delimits the variable substitution.
Just to prove the point, we can do the same with a positional parameter :

set verify off

spool department_&1..lis

select first_name, last_name
from hr.employees
where department_id = &1
order by employee_id
/

spool off

…run this and we get :

SQL> @position_param.sql 10

FIRST_NAME           LAST_NAME
-------------------- -------------------------
Jennifer             Whalen

SQL> 

…and the appropriate file…

ls
department_10.lis

On that note, I’m off to the pub. Now, where did I leave my keys ?


Filed under: Oracle, SQL Tagged: spool; filename including a variable value, SQL*Plus

Video and Slides - Data Caching Strategies for Oracle Mobile Application Framework

Andrejus Baranovski - Sat, 2014-10-18 09:08
I have recorded a video tutorial, based on my OOW'14 session - Data Caching Strategies for Oracle Mobile Application Framework. ADF developers who could not attend OOW'14 in San Francisco, this is for you !

Here you can view the slides:


Data Caching Strategies for Oracle Mobile Application Framework
Watch the first part of the tutorial:


Watch the second part of the tutorial:


Described solution is based on the sample application from the blog post - Transactional Data Caching for ADF Mobile MAF Application.

Bandwidth and Latency

Hemant K Chitale - Sat, 2014-10-18 08:40
Here is, verbatim, an article I posted on Linked-In yesterday  [For other posts on Linked-In, view my Linked-In profile] :

Imagine an 8-lane highway. Now imagine a 4-lane highway. Which has the greater bandwidth ?Imagine your organisation sends its employees on a wekend "retreat" by bus. You have the choice of two locations, one that is 200kilometres away and the other is 80kilometres away. Assume that buses travel at a constant speed of 80kmph. Which resort will your employees get to faster ?The first question is about bandwidth. The second is about latency.(Why should I assume a fixed speed for the buses ? Because, I can assume a fixed speed at which electrons transfer over a wire or photons over a light channel).Expand the question further. What if the organisation needs to send 32 employees in a 40-seater bus. Does it matter that the bus can travel on an 8-lane highway versus a 4-lane highway (assuming minimal other traffic on the highways at that time) ?Too often, naive "architects" do not differentiate between the two. If my organisation needs to configure a standby (DR) location for the key databases and has a choice of two locations but varying types of network services, it should consider *both* bandwidth and latency. If the volume of redo is 1000MBytes per minute and this, factoring overheads for packetizing the "data", translates to 167Mbits per second, should I just go ahead and buy bandwidth of 200Mbits per second ? If the two sites have two different network services providers offering different bandwidths, should I simply locate at the site with the greater bandwidth ? What if the time it takes to synchronously write my data to site "A" is 4ms and the time to site "B" is 8ms ? Should I not factor the latency ? (I am assuming that the "write to disk" speed of hardware at either site is the same -- i.e. the hardware is the same). I can then add the complications of network routers and switches that add to the latency. Software configurations, flow-control mechanisms, QoS definitions and hardware configuration can also impact bandwidth and latency in different ways.Now, extend this to data transfers ("output" or "results") from a database server to an application server or end-user. If the existing link is 100Mbps and is upgraded to 1Gbps, the time to "report" 100 rows is unlikely to change as this time is a function of the latency. However, if the number of concurrent users grows from 10 to 500, the bandwidth requirement may increase and yet each user may still have the same "wait" time to see his results (assuming that there are no server hardware constraints returning results for 500 users).On the flip side, consider ETL servers loading data into a database. Latency is as important as bandwidth. An ETL scheme that does "row-by-row" loads relies on latency, not bandwidth. Increasing bandwidth doesn't help such a scheme.Think about the two.
Categories: DBA Blogs

ReadyNAS 104

Tim Hall - Sat, 2014-10-18 07:36

My old NAS went pop a little while ago and I’ve spent the last few weeks backing up to alternate servers while trying to decide what to get to replace it.

Reading the reviews on Amazon is a bit of a nightmare because there are always scare stories, regardless how much you pay. In the end I decided to go for the “cheap and cheerful” option and bought a ReadyNAS 104. I got the diskless one and bought a couple of 3TB WD Red disks, which were pretty cheap. It supports the 4TB disks, but they are half the price again and I’m mean. Having just two disks means I’ve got a single 3TB RAID 1 volume. I can add a third and fourth disk, which will give me approximately 6 or 9 TB. It switches to RAID 5 by default with more than 2 disks.

The setup was all web based, so I didn’t have any OS compatibility issues. Installation was really simple. Slipped in the disks. Plugged the ethernet cable to my router and turned on the power. I went to the website (readycloud.netgear.com), discovered my device and ran through the setup wizard. Job done. I left it building my RAID 1 volume overnight, but I was able to store files almost immediately, while the volume was building.

The web interface for the device is really simple to use. I can define SMB/AFP/NFS shares in a couple of clicks. Security is really obvious. I can define iSCSI LUNs for use with my Linux machines and it has TimeMachine integration if you want that.

The cloud-related functionality is optional, so if you are worried about opening up a potential security hole, you can easily avoid it. I chose not to configure it during the setup wizard.

I was originally going to spend a lot more on a NAS, but I thought I would chance this unit. So far I’m glad I did. It’s small, solid and silent. Fingers crossed it won’t go pair-shaped. :)

I’ve got all the vital stuff on it now. I’ll start backing up some of my more useful VMs to it and see if I need to buy some more disks. I’ve got about 10TB of storage on servers, but most of it is taken up with old VMs I don’t care about, so I will probably be a bit selective.

Cheers

Tim…

PS. I think NetGear might be doing a revamp of their NAS lineup soon, so you might get one of these extra cheap in the near future. They’re already claim to be about 50% RRP, but most RRPs are lies. :)

 

ReadyNAS 104 was first posted on October 18, 2014 at 2:36 pm.
©2012 "The ORACLE-BASE Blog". Use of this feed is for personal non-commercial use only. If you are not reading this article in your feed reader, then the site is guilty of copyright infringement.

Bed bugs in Boston – Analysis of Boston 311 public dataset

Nilesh Jethwa - Fri, 2014-10-17 11:22

Digging into the Boston public Dataset can reveal interesting and juicy facts.

Even though there is nothing juicy about Bed bugs but the data about Boston open cases for Bed bugs is quite interesting and worth looking at.

We uploaded the entire 50 mb data dump which is around 500K rows into the Data Visualizer and filtered the category for Bed Bugs. Splitting the date into its date hierarchy components we then plotted the month on the Y axis.

InfoCaptor : Analytics & dashboards

It seems that the City of Boston started collecting this data around 2011 and has only partial data for that year.

Interestingly, the number of Bed bug cases seem to rise during the summer months.

Now if we break the lines into Quarters (we just add the quarter hierarchy to the mix)

InfoCaptor : Analytics & dashboards

ShareThis

Yosemite : It’s like OS X, but more boring to look at!

Tim Hall - Fri, 2014-10-17 11:19

I went on my MacBook last night and saw I had updates available on the App Store. I figured this was one of those Twitter updates that seem to happen every time you blink. Much to my surprise it was a new version of OS X. You can tell how little of an Apple fanboy I am. I didn’t even know this was due, let alone here already. :)

I figured, what the heck and let it start. About 20 minutes later it was done and now I have Yosemite on my MacBook Pro (mid 2009). I wasn’t really timing, so that’s a guess.

First impressions.

  • It’s like OS X, but more boring to look at! Everything is flat and looks a little bland. I’m told this is the look and feel from the iPhone, but I don’t have one of those so I don’t know. I’m sure in a week I won’t remember the old look. The only reminder is the icons for all the non-Apple software I have installed, which still look like they are trying to fit in with the old look. :)
  • I asked one of my colleagues at work and he said it is meant to be faster. I don’t see that myself, but this is a 5 year old bit of kit.
  • Launchpad is straight out of GNOME3. I never use it anyway. Perhaps it always looked like this???
  • Mission Control and Dashboard are also things I never use, so I can’t tell if they have changed for the better or not. :)
  • The light colour background of the Application menu looks odd. Not bad, but different.

What’s broken? So far nothing. I can run VirtualBox, iTerm, Chrome and PowerPoint, so that is pretty much all I do with the laptop.

So in conclusion, Yosemite has completely changed my whole world and Apple are a bunch of geniuses right? Well, actually it’s a pretty mundane change as far as my usage is concerned. I’m sure it’s all terribly cloudy and someone will throw a “rewritten from the ground up” in there somewhere, kinda like Microsoft do when they release the same stuff year after year with a different skin…

By the way, it didn’t cost me anything to upgrade from pretty to bland!

Cheers

Tim…

 

 

Yosemite : It’s like OS X, but more boring to look at! was first posted on October 17, 2014 at 6:19 pm.
©2012 "The ORACLE-BASE Blog". Use of this feed is for personal non-commercial use only. If you are not reading this article in your feed reader, then the site is guilty of copyright infringement.

Public, private health care systems possess security vulnerabilities

Chris Foot - Fri, 2014-10-17 11:12

System and database administrators from health care institutions are facing several challenges.

On one hand, many are obligated to migrate legacy applications to state-of-the-art electronic health record solutions. In addition, they need to ensure the information contained in those environments is protected.

Operating systems, network configurations and a wealth of other factors can either make or break security architectures. If these components are unable to receive frequent updates from vendor-certified developers, it can cause nightmares for database administration professionals. 

Windows XP no longer a valid option 
When Microsoft ceased to provide to support for Windows XP in early April, not as many businesses upgraded to Windows 7 or 8 as the software vendor's leaders had hoped. This means those using XP will not receive regular security updates, leaving them open to attacks as hackers work to find vulnerabilities with the OS.

Despite continuous warnings from Microsoft and the IT community, Information Security Buzz contributor Rebecca Herold believes that the a large percentage of medical devices currently in use are running XP. Her allegations are based on reports submitted by health care electronics producers that stated they leverage XP for the sensors' graphical user interfaces, as well as to create a connection to external databases. 

Because Microsoft has yet to release the source code of XP, health care companies using these implementations have no way of identifying vulnerabilities independently. Even if the source code was distributed, it's unlikely that the majority of medical providers could use in-house resources to search for security flaws. The only way to defend the servers linked with devices running XP is to employ database active monitoring. 

Public sector experiencing vulnerabilities 
Healthcare.gov apparently isn't picture-perfect, either. Fed Scoop reported that white hat hackers working for the U.S. Department of Health and Human Services' Office of the Inspector General discovered that personally identifiable information was secured, but some data controlled by the Centers for Medicare and Medicaid Services lacked adequate protection

After an assessment of CMS and databases was completed, the IG advised the organization to encode files with an algorithm approved by Federal Information Processing Standards 140-2. However, authorities at the CMS deduced this wasn't necessary. 

Although this wasn't the first audit of Healthcare.gov (and it likely won't be the last), the information held within its servers is too valuable for cybercriminals to ignore. Setting up an automated, yet sophisticated intrusion detection program to notify DBAs when user activity appears inconsistent is a step the CMS should strongly consider taking. 

The post Public, private health care systems possess security vulnerabilities appeared first on Remote DBA Experts.

JPMorgan hack joins list of largest data breaches in history [VIDEO]

Chris Foot - Fri, 2014-10-17 09:02

Transcript

Hi, welcome to RDX. With news about data breaches sweeping the Web on a regular basis, it's no surprise that the latest victim was a major U.S. bank.

According to Bloomberg, hackers gained access to a server operated by JPMorgan Chase, stealing data on 76 million homes and 7 million small businesses.

After further investigation, the FBI discovered the hackers gained access to a server lacking two-factor authentication. From there, the hackers found fractures in the bank's custom software, through which JPMorgan's security team unknowingly gave them access to data banks.

To prevent such attacks from occurring, firms should regularly assess their databases and solutions to find vulnerabilities.

Thanks for watching! Be sure to visit us next time for info on RDX's security services.

The post JPMorgan hack joins list of largest data breaches in history [VIDEO] appeared first on Remote DBA Experts.

Debugging High CPU Usage Using Perf Tool and vmcore Analysis

Pythian Group - Fri, 2014-10-17 08:08

There are several tools and technologies available to debug deeper into high CPU utilization in a system; perf, sysrq, oprofile, vmcore, and more. In this post, I will narrate the course of debugging a CPU utilization issue using technologies like perf and vmcore.

Following sar output is from a system which faces high %system usage.

[root@prod-smsgw1 ~]# sar 1 14
Linux 2.6.32-431.20.5.el6.x86_64 (xxxxx) 08/08/2014 _x86_64_ (8 CPU)05:04:57 PM CPU %user %nice %system %iowait %steal %idle
05:04:58 PM all 2.90 0.00 15.01 0.38 0.00 81.72
05:04:59 PM all 2.02 0.00 10.83 0.13 0.00 87.03
05:05:00 PM all 3.27 0.00 13.98 0.76 0.00 81.99
05:05:01 PM all 9.32 0.00 16.62 0.25 0.00 73.80

From ‘man sar’.

%system
Percentage of CPU utilization that occurred while executing at the system level (kernel). Note
that this field includes time spent servicing hardware and software interrupts.

This means that the system is spending considerable time on catering to kernel code. System runs a java application which is showing high CPU usage.

perf – Performance analysis tools for Linux, is a good place to start in these kind of scenarios.

‘perf record’ command would capture system state for all cpus in perf.data file. -g would allow call graph and -p allows profiling a process.

‘perf report’ command would show the report.

Samples: 18K of event ‘cpu-clock’, Event count (approx.): 18445, Thread: java(3284), DSO: [kernel.kallsyms]
58.66% java [k] _spin_lock ?
31.82% java [k] find_inode ?
2.66% java [k] _spin_unlock_irqrestore ?
2.44% java [k] mutex_spin_on_owner

Here we can see that considerable time is spend in spinlock and find_inode code for the java application..

While investigation was going on, system crashed and dumped a vmcore. Vmcore is a memory dump of the system captured by tools like kdump.

I downloaded the debuginfo file and extracted the vmlinux to analyse the vmcore.

# wget wget http://debuginfo.centos.org/6/x86_64/kernel-debuginfo-2.6.32-431.20.5.el6.x86_64.rpm
# rpm2cpio kernel-debuginfo-2.6.32-431.20.5.el6.x86_64.rpm |cpio -idv ./usr/lib/debug/lib/modules/2.6.32-431.20.5.el6.x86_64/vmlinux

Then ran following command.

# crash ./usr/lib/debug/lib/modules/2.6.32-431.20.5.el6.x86_64/vmlinux /var/crash/127.0.0.1-2014-08-07-17\:56\:19/vmcoreKERNEL: ./usr/lib/debug/lib/modules/2.6.32-431.20.5.el6.x86_64/vmlinux
DUMPFILE: /var/crash/127.0.0.1-2014-08-07-17:56:19/vmcore [PARTIAL DUMP]
CPUS: 8
DATE: Thu Aug 7 17:56:17 2014
UPTIME: 1 days, 13:08:01
LOAD AVERAGE: 91.11, 91.54, 98.02
TASKS: 1417
NODENAME: xxxxx
RELEASE: 2.6.32-431.20.5.el6.x86_64
VERSION: #1 SMP Fri Jul 25 08:34:44 UTC 2014
MACHINE: x86_64 (2000 Mhz)
MEMORY: 12 GB
PANIC: “Oops: 0010 [#1] SMP ” (check log for details)
PID: 11233
COMMAND: “java”
TASK: ffff88019706b540 [THREAD_INFO: ffff880037a90000]
CPU: 6
STATE: TASK_RUNNING (PANIC)

From the vmcore I see that dtracedrv module was loaded and unloaded (possibly for running dtrace), this resulted in several warnings (first warning from ftrace is expected) and then kernel panicked as memory got corrupted. Instruction pointer is corrupted, which points to memory corruption. Looks like Panic was triggered by dtrace module.

/tmp/dtrace/linux-master/build-2.6.32-431.20.5.el6.x86_64/driver/dtrace.c:dtrace_ioctl:16858: assertion failure buf->dtb_xamot != cached
Pid: 8442, comm: dtrace Tainted: P W ————— 2.6.32-431.20.5.el6.x86_64 #1
Pid: 3481, comm: java Tainted: P W ————— 2.6.32-431.20.5.el6.x86_64 #1
Call Trace:
[] ? dump_cpu_stack+0x3d/0×50 [dtracedrv]
[] ? generic_smp_call_function_interrupt+0×90/0x1b0
[] ? smp_call_function_interrupt+0×27/0×40
[] ? call_function_interrupt+0×13/0×20
[] ? _spin_lock+0x1e/0×30
[] ? __mark_inode_dirty+0x6c/0×160
[] ? __set_page_dirty_nobuffers+0xdd/0×160
[] ? nfs_mark_request_dirty+0x1a/0×40 [nfs]
[] ? nfs_updatepage+0x3d2/0×560 [nfs]
[] ? nfs_write_end+0×152/0x2b0 [nfs]
[] ? iov_iter_copy_from_user_atomic+0×92/0×130
[] ? generic_file_buffered_write+0x18a/0x2e0
[] ? nfs_refresh_inode_locked+0x3e1/0xbd0 [nfs]
[] ? __generic_file_aio_write+0×260/0×490
[] ? __put_nfs_open_context+0×58/0×110 [nfs]
[] ? dtrace_vcanload+0×20/0x1a0 [dtracedrv]
[..]
BUG: unable to handle kernel paging request at ffffc90014fb415e
IP: [] 0xffffc90014fb415e
PGD 33c2b5067 PUD 33c2b6067 PMD 3e688067 PTE 0
Oops: 0010 [#1] SMP
last sysfs file: /sys/devices/system/node/node0/meminfo
CPU 6
Modules linked in: cpufreq_stats freq_table nfs fscache nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs ipv6 ppdev parport_pc parport microcode vmware_balloon sg vmxnet3 i2c_piix4 i2c_core shpchp ext4 jbd2 mbcache sd_mod crc_t10dif vmw_pvscsi pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: dtracedrv]Pid: 11233, comm: java Tainted: P W ————— 2.6.32-431.20.5.el6.x86_64 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
RIP: 0010:[] [] 0xffffc90014fb415e
RSP: 0018:ffff880037a91f70 EFLAGS: 00010246
RAX: 0000000000000001 RBX: 0000000000000219 RCX: ffff880037a91d40
RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000000
RBP: 00007fba9a67f4c0 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 00000000000003ff R12: 000000000001d4c0
R13: 0000000000000219 R14: 00007fb96feb06e0 R15: 00007fb96feb06d8
FS: 00007fb96fec1700(0000) GS:ffff880028380000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffc90014fb415e CR3: 000000031e49e000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process java (pid: 11233, threadinfo ffff880037a90000, task ffff88019706b540)
Stack:
0000000000000000 0000000000002be1 ffffffff8100b072 0000000000000293
000000000000ebe6 0000000000002be1 0000000000000000 0000000000000007
00000030692df333 000000000001d4c0 0000000000000001 00007fb96feb06d8
Call Trace:
[] ? system_call_fastpath+0×16/0x1b
Code: Bad RIP value.
RIP [] 0xffffc90014fb415e
RSP
CR2: ffffc90014fb415e
crash>

This allowed me to have have a look at the CPU usage issue happening in the system. Other way to capture a vmcore is to manually panic the system using sysrq + c.

None of the runnable and uninterruptable_sleep processes are running for long time..

Looking at the oldest D state process..

crash> bt 4776
PID: 4776 TASK: ffff88027f3daaa0 CPU: 6 COMMAND: “java”
#0 [ffff88027f3dfd88] schedule at ffffffff815287f0
#1 [ffff88027f3dfe50] __mutex_lock_killable_slowpath at ffffffff8152a0ee
#2 [ffff88027f3dfec0] mutex_lock_killable at ffffffff8152a1f8
#3 [ffff88027f3dfee0] vfs_readdir at ffffffff8119f834
#4 [ffff88027f3dff30] sys_getdents at ffffffff8119f9f9
#5 [ffff88027f3dff80] system_call_fastpath at ffffffff8100b072
RIP: 00000030692a90e5 RSP: 00007fa0586c51e0 RFLAGS: 00000206
RAX: 000000000000004e RBX: ffffffff8100b072 RCX: 00007fa0cd2cf000
RDX: 0000000000008000 RSI: 00007fa0bc0de9a8 RDI: 00000000000001f6
RBP: 00007fa0bc004cd0 R8: 00007fa0bc0de9a8 R9: 00007fa0cd2fce58
R10: 00007fa0cd2fcaa8 R11: 0000000000000246 R12: 00007fa0bc004cd0
R13: 00007fa0586c5460 R14: 00007fa0cd2cf1c8 R15: 00007fa0bc0de980
ORIG_RAX: 000000000000004e CS: 0033 SS: 002b

Looking at its stack..

crash> bt -f 4776
PID: 4776 TASK: ffff88027f3daaa0 CPU: 6 COMMAND: “java”
[..]
#2 [ffff88027f3dfec0] mutex_lock_killable at ffffffff8152a1f8
ffff88027f3dfec8: ffff88027f3dfed8 ffff8801401e1600
ffff88027f3dfed8: ffff88027f3dff28 ffffffff8119f834
#3 [ffff88027f3dfee0] vfs_readdir at ffffffff8119f834
ffff88027f3dfee8: ffff88027f3dff08 ffffffff81196826
ffff88027f3dfef8: 00000000000001f6 00007fa0bc0de9a8
ffff88027f3dff08: ffff8801401e1600 0000000000008000
ffff88027f3dff18: 00007fa0bc004cd0 ffffffffffffffa8
ffff88027f3dff28: ffff88027f3dff78 ffffffff8119f9f9
#4 [ffff88027f3dff30] sys_getdents at ffffffff8119f9f9
ffff88027f3dff38: 00007fa0bc0de9a8 0000000000000000
ffff88027f3dff48: 0000000000008000 0000000000000000
ffff88027f3dff58: 00007fa0bc0de980 00007fa0cd2cf1c8
ffff88027f3dff68: 00007fa0586c5460 00007fa0bc004cd0
ffff88027f3dff78: 00007fa0bc004cd0 ffffffff8100b072crash> vfs_readdir
vfs_readdir = $4 =
{int (struct file *, filldir_t, void *)} 0xffffffff8119f7b0
crash>crash> struct file 0xffff8801401e1600
struct file {
f_u = {
fu_list = {
next = 0xffff88033213fce8,
prev = 0xffff88031823d740
},
fu_rcuhead = {
next = 0xffff88033213fce8,
func = 0xffff88031823d740
}
},
f_path = {
mnt = 0xffff880332368080,
dentry = 0xffff8802e2aaae00
},

[..]

crash> mount|grep ffff880332368080
ffff880332368080 ffff88033213fc00 nfs nanas1a.m-qube.com:/vol/test /scratch/test/test.deploy/test/test-internal

The process was waiting while reading from above nfs mount.

Following process seems to the culprit.

crash> bt 9104
PID: 9104 TASK: ffff8803323c8ae0 CPU: 0 COMMAND: “java”
#0 [ffff880028207e90] crash_nmi_callback at ffffffff8102fee6
#1 [ffff880028207ea0] notifier_call_chain at ffffffff8152e435
#2 [ffff880028207ee0] atomic_notifier_call_chain at ffffffff8152e49a
#3 [ffff880028207ef0] notify_die at ffffffff810a11ce
#4 [ffff880028207f20] do_nmi at ffffffff8152c0fb
#5 [ffff880028207f50] nmi at ffffffff8152b9c0
[exception RIP: _spin_lock+30]
RIP: ffffffff8152b22e RSP: ffff88001d209b88 RFLAGS: 00000206
RAX: 0000000000000004 RBX: ffff88005823dd90 RCX: ffff88005823dd78
RDX: 0000000000000000 RSI: ffffffff81fd0820 RDI: ffffffff81fd0820
RBP: ffff88001d209b88 R8: ffff88017b9cfa90 R9: dead000000200200
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88005823dd48
R13: ffff88001d209c68 R14: ffff8803374ba4f8 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
— —
#6 [ffff88001d209b88] _spin_lock at ffffffff8152b22e
#7 [ffff88001d209b90] _atomic_dec_and_lock at ffffffff81283095
#8 [ffff88001d209bc0] iput at ffffffff811a5aa0
#9 [ffff88001d209be0] dentry_iput at ffffffff811a26c0
#10 [ffff88001d209c00] d_kill at ffffffff811a2821
#11 [ffff88001d209c20] __shrink_dcache_sb at ffffffff811a2bb6
#12 [ffff88001d209cc0] shrink_dcache_parent at ffffffff811a2f64
#13 [ffff88001d209d30] proc_flush_task at ffffffff811f9195
#14 [ffff88001d209dd0] release_task at ffffffff81074ec8
#15 [ffff88001d209e10] wait_consider_task at ffffffff81075cc6
#16 [ffff88001d209e80] do_wait at ffffffff810760f6
#17 [ffff88001d209ee0] sys_wait4 at ffffffff810762e3
#18 [ffff88001d209f80] system_call_fastpath at ffffffff8100b072

From upstream kernel source..

/**
* iput – put an inode
* @inode: inode to put
*
* Puts an inode, dropping its usage count. If the inode use count hits
* zero, the inode is then freed and may also be destroyed.
*
* Consequently, iput() can sleep.
*/
void iput(struct inode *inode)
{
if (inode) {
BUG_ON(inode->i_state & I_CLEAR);if (atomic_dec_and_lock(&inode->i_count, &inode->i_lock))
iput_final(inode);
}
}
EXPORT_SYMBOL(iput);#include
/**
* atomic_dec_and_lock – lock on reaching reference count zero
* @atomic: the atomic counter
* @lock: the spinlock in question
*
* Decrements @atomic by 1. If the result is 0, returns true and locks
* @lock. Returns false for all other cases.
*/
extern int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock);
#define atomic_dec_and_lock(atomic, lock) \
__cond_lock(lock, _atomic_dec_and_lock(atomic, lock))

#endif /* __LINUX_SPINLOCK_H */

Looks like the process was trying to drop dentry cache and was holding to the spinlock while dropping an inode associated with it. This resulted in other processes waiting on spinlock, resulting in high %system utilization.

When the system again showed high %sys usage I checked and found large slab cache.

[root@xxxxx ~]# cat /proc/meminfo
[..]
Slab: 4505788 kB
SReclaimable: 4313672 kB
SUnreclaim: 192116 kB

Checking slab in a running system using slabtop, I saw that nfs_inode_cache is the top consumer.

ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
[..]
2793624 2519618 90% 0.65K 465604 6 1862416K nfs_inode_cache

I ran ‘sync’ and then ‘echo 2 > /proc/sys/vm/drop_caches’ to drop the dcache, which fixed the high %sys usage in the system.

[root@xxxxx ~]# sar 1 10
Linux 3.10.50-1.el6.elrepo.x86_64 (prod-smsgw4.sav.mqube.us) 08/12/2014 _x86_64_ (8 CPU)11:04:45 AM CPU %user %nice %system %iowait %steal %idle
11:04:46 AM all 1.51 0.00 13.22 0.50 0.00 84.76
11:04:47 AM all 1.25 0.00 12.55 0.13 0.00 86.07
11:04:48 AM all 1.26 0.00 8.83 0.25 0.00 89.66
11:04:49 AM all 1.63 0.00 11.93 0.63 0.00 85.80
^C
[root@xxxxx ~]# sync
[root@xxxxx ~]# sar 1 10
Linux 3.10.50-1.el6.elrepo.x86_64 (prod-smsgw4.sav.mqube.us) 08/12/2014 _x86_64_ (8 CPU)11:05:23 AM CPU %user %nice %system %iowait %steal %idle
11:05:24 AM all 1.50 0.00 13.03 0.75 0.00 84.71
11:05:25 AM all 1.76 0.00 9.69 0.25 0.00 88.30
11:05:26 AM all 1.51 0.00 9.80 0.25 0.00 88.44
11:05:27 AM all 1.13 0.00 10.03 0.25 0.00 88.60
^C
[root@xxxxx ~]# echo 2 > /proc/sys/vm/drop_caches
[root@xxxxx ~]# cat /proc/meminfo
[..]
Slab: 67660 kB

[root@prod-smsgw4 ~]# sar 1 10
Linux 3.10.50-1.el6.elrepo.x86_64 (prod-smsgw4.sav.mqube.us) 08/12/2014 _x86_64_ (8 CPU)

11:05:58 AM CPU %user %nice %system %iowait %steal %idle
11:05:59 AM all 1.64 0.00 1.38 0.13 0.00 96.86
11:06:00 AM all 2.64 0.00 1.38 0.38 0.00 95.60
11:06:01 AM all 2.02 0.00 1.89 0.25 0.00 95.84
11:06:02 AM all 2.03 0.00 1.39 4.68 0.00 91.90
11:06:03 AM all 8.21 0.00 2.27 2.65 0.00 86.87
11:06:04 AM all 1.63 0.00 1.38 0.13 0.00 96.86
11:06:05 AM all 2.64 0.00 1.51 0.25 0.00 95.60

From kernel documentation,

drop_cachesWriting to this will cause the kernel to drop clean caches, dentries and
inodes from memory, causing that memory to become free.To free pagecache:
echo 1 > /proc/sys/vm/drop_caches
To free dentries and inodes:
echo 2 > /proc/sys/vm/drop_caches
To free pagecache, dentries and inodes:
echo 3 > /proc/sys/vm/drop_caches

java application was traversing through nfs and was accessing large number of files, resulting in large number of nfs_inode_cache entries, resulting in in a large dcache.

Tuning vm.vfs_cache_pressure would be a persistent solution for this.

From kernel documentation,

vfs_cache_pressure
——————Controls the tendency of the kernel to reclaim the memory which is used for
caching of directory and inode objects.At the default value of vfs_cache_pressure=100 the kernel will attempt to
reclaim dentries and inodes at a “fair” rate with respect to pagecache and
swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel to prefer
to retain dentry and inode caches. When vfs_cache_pressure=0, the kernel will
never reclaim dentries and inodes due to memory pressure and this can easily
lead to out-of-memory conditions. Increasing vfs_cache_pressure beyond 100
causes the kernel to prefer to reclaim dentries and inodes.

Categories: DBA Blogs

NZOUG14 Beckons

Pythian Group - Fri, 2014-10-17 07:50

New Zealand is famous for Kiwis, pristine landscape, and the New Zealand Oracle User Group (NZOUG) conference.  The location of choice is New Zealand when it comes to making Lord of the Rings and making Oracle Lord of the Databases.

NZOUG 2014 will be held 19–21 November in the Owen G. Glenn Building at the University of Auckland. The main conference will be held on the 20th and 21st, preceded by a day of workshops on the 19th. It’s one of the premier Oracle conferences in Southern hemisphere.

Where there is Oracle, there is Pythian. Pythian will be present in full force in NZOUG 2014.

Following are Pythian sessions at NZOUG14:

12c Multi-Tenancy and Exadata IORM: An Ideal Cloud Based Resource Management
Fahd Mirza Chughtai

Everyone Talks About DR – But Why So Few Implement It
Francisco Munoz Alvarez

DBA 101: Calling All New Database Administrators
Gustavo Rene Antunez

My First 100 Days with an Exadata
Gustavo Rene Antunez

Do You Really Know the Index Structures?
Deiby Gómez

Oracle Exadata: Storage Indexes vs Conventional Indexes
Deiby Gómez

Oracle 12c Test Drive
Francisco Munoz Alvarez

Why Use OVM for Oracle Database
Francisco Munoz Alvarez

Please check the full agenda of NZOUG14 here.

Categories: DBA Blogs

Log Buffer #393, A Carnival of the Vanities for DBAs

Pythian Group - Fri, 2014-10-17 07:47

Bloggers get connected to both the databases and their readers through their blogs. Bloggers act like a bridge here. Log Buffer extends this nexus through the Log Buffer Edition.

Oracle:

MS Sharepoint and Oracle APEX integration.

Just a couple of screenshots of sqlplus+rlwrap+cygwin+console.

Say “Big Data” One More Time (I dare you!)

Update OEM Harvester after 12.1.0.4 Upgrade

Insight in the Roadmap for Oracle Cloud Platform Services.

SQL Server:

Troubleshoot SQL P2P replication doesn’t replicate DDL schema change.

Set-based Constraint Violation Reporting in SQL Server.

Where do you start fixing a SQL Server crash when there isn’t a single clue?

A permission gives a principal access to an object to perform certain actions on or with the object.

When you can’t get to your data because another application has it locked, a thorough knowledge of SQL Server concurrency will give you the confidence to decide what to do.

MySQL:

MySQL 5.7.5- More variables in replication performance_schema tables.

Multi-source replication for MySQL has been released as a part of 5.7.5-labs-preview downloadable from labs.mysql.com.

How to install multiple MySQL instances on a single host using MyEnv?

Percona Toolkit for MySQL with MySQL-SSL Connections.

InnoDB: Supporting Page Sizes of 32k and 64k.

Categories: DBA Blogs

sreadtim

Jonathan Lewis - Fri, 2014-10-17 06:22

Here’s a question that appeared in my email a few days ago:

 

Based on the formula: “sreadtim = ioseektim + db_block_size/iotrfrspeed” sreadtim should always bigger than ioseektim.

But I just did a query on my system, find it otherwise, get confused,

SQL> SELECT * FROM SYS.AUX_STATS$;<

SNAME                          PNAME                               PVAL1 PVAL2
------------------------------ ------------------------------ ---------- --------------------
SYSSTATS_INFO                  STATUS                                    COMPLETED
SYSSTATS_INFO                  DSTART                                    10-08-2014 10:45
SYSSTATS_INFO                  DSTOP                                     10-10-2014 10:42
SYSSTATS_INFO                  FLAGS                                   1
SYSSTATS_MAIN                  CPUSPEEDNW                     680.062427
SYSSTATS_MAIN                  IOSEEKTIM                              10
SYSSTATS_MAIN                  IOTFRSPEED                           4096
SYSSTATS_MAIN                  SREADTIM                            4.716
SYSSTATS_MAIN                  MREADTIM                            2.055
SYSSTATS_MAIN                  CPUSPEED                             1077
SYSSTATS_MAIN                  MBRC                                    4
SYSSTATS_MAIN                  MAXTHR                          956634112
SYSSTATS_MAIN                  SLAVETHR                           252928

How do we explain this ?

 

This question highlights two points – one important, the other only slightly less so.

The really important point is one of interpretation.  Broadly speaking we could reasonably say that the (typical) time required to perform a single block read is made up of the (typical) seek time plus the transfer time which, using the names of the statistics above, would indeed give us the relationship: sreadtim = ioseektim + db_block_size/iotfrspeed; but we have to remember that we are thinking of a simplified model of the world. The values that we capture for sreadtim include the time it takes for a request to get from Oracle to the O/S, through the various network software and hardware layers and back again, the formula ignores those components completely and, moreover, doesn’t allow for the fact that some “reads” could actually come from one of several caches without any physical disc access taking place; similarly we should be aware that the time for an actual I/O seek would vary dramatically with the current position  of the read head, the radial position of the target block, the speed and current direction of movement of the read head, and the rotational distance to the target block. The formula is not attempting to express a physical law, it is simply expressing an approximation that we might use in a first line estimate of performance.

In fact we can see in the figures above that multi-block reads (typically of 4 blocks)  were faster than single block reads on this hardware for the duration of the sampling period – and that clearly doesn’t fit with the simple view embedded in our formula of how disc drives work.  (It’s a fairly typical effect of SANs, of course, that large read requests make the SAN software start doing predictive read-ahead, so that the next read request from Oracle may find that the SAN has already loaded the required data into its cache.)

There is, however, the second point that these figures highlight – but you have to be in the know to spot the detail: whatever the complexities introduced by SAN caching, we’re notlooking at the right figures. The ioseektim and iotfrspeed shown here are the default values used by Oracle. It looks as if the user has called dbms_stats.gather_system_stats() with a 48 hour workload (8th Oct to 10th Oct), but hasn’t yet executed the procedure using the ‘noworkload’ option. Perhaps the ioseektim and iotfrspeed figures from a noworkload call would look a little more reasonable when compared with the 4.716 milliseconds of the workload single block read. There may still be a large gap between the model and the reality, but until the two sets of figures we’re using come from the same place we shouldn’t even think about comparing them.


Using rlwrap with Apache Hive beeline for improved readline functionality

Rittman Mead Consulting - Fri, 2014-10-17 06:18

rlwrap is a nice little wrapper in which you can invoke commandline utilities and get them to behave with full readline functionality just like you’d get at the bash prompt. For example, up/down arrow keys to move between commands, but also home/end to go to the start/finish of a line, and even ctrl-R to search through command history to rapidly find a command. It’s one of the standard config changes I’ll make to any system with Oracle’s sqlplus on, and it works just as nicely with Apache Hive’s commandline interface, beeline.

beeline comes with some of this functionality (up/down arrow) but not all (for me, it was ‘home’ and ‘end’ not working and printing 1~ and 5~ respectively instead that prompted me to setup rlwrap with it).

Installing rlwrap

To install rlwrap simply add the EPEL yum packages to your repository configuration:

sudo rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/`uname -p`/epel-release-6-8.noarch.rpm

and then install rlwrap from yum:

sudo yum install -y rlwrap

Use

Once rlwrap is installed you can invoke beeline through it manually, specifying all the standard beeline options as you would normally: (I’ve used the \ line continuation character here just to make the example nice and clear)

rlwrap -a beeline \
-u jdbc:hive2://bdanode1:10000 \
-n rmoffatt -p password \
-d org.apache.hive.jdbc.HiveDriver

Now I can connect to beeline, and as before I press up arrow to access commands from when I previously used the tool, but I can also hit ctrl-R to start typing part of a command to recall it, just like I would in bash. Some other useful shortcuts:

  • Ctrl-lclears the screen but with the current line still shown
  • Ctrl-kdeletes to the end of the line from the current cursor position
  • Ctrl-udeletes to the beginning of the line from the current cursor position
  • Esc-fmove forward one word
  • Esc-bmove backward one word
    (more here)

And most importantly, Home and End work just fine! (or, ctrl-a/ctrl-e if you prefer).

NB the -a argument for rlwrap is necessary because beeline already does some readline-esque functions, and we want rlwrap to forceable override them (otherwise neither work very well). Or more formally (from man rlwrap):

Always remain in “readline mode”, regardless of command’s terminal settings. Use this option if you want to use rlwrap with commands that already use readline.

Alias

A useful thing to do is to add an alias directly in your profile so that it is always available to launch beeline under rlwrap, in this case as the rlbeeline command:

# this sets up "rlbeeline" as the command to run beeline
# under rlwrap, you can call it what you want though. 
cat >> ~/.bashrc<<EOF
alias rlbeeline='rlwrap -a beeline'
EOF
# example usage:
# rlbeeline /
# -u jdbc:hive2://bdanode1:10000 /
# -n rmoffatt -p password /
# -d org.apache.hive.jdbc.HiveDriver

If you want this alias available for all users on a machine create the above as a standalone .sh file in /etc/profile.d/.

Autocomplete

One possible downside of using rlwrap with beeline is that you lose the native auto-complete option within beeline for the HiveQL statements. But never fear – we can have the best of both worlds, with the -f argument for rlwrap, specifying a list of custom auto-completes. So this is even a level-up for beeline, because we could populate it with our own schema objects and so on that we want auto-completed.

As a quick-start, run beeline without rlwrap, hit tab-twice and then ‘y’ to show all options and paste the resulting list into a text file (eg beeline_autocomplete.txt). Now call beeline, via rlwrap, passing that file as an argument to rlwrap:

rlwrap -a -f beeline_autocomplete.txt beeline

Once connected, use auto-complete just as you would normally (hit tab after typing a character or two of the word you’re going to match):

Connecting to jdbc:hive2://bdanode1:10000
Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
[...]
Beeline version 0.12.0-cdh5.0.1 by Apache Hive
0: jdbc:hive2://bdanode1:10000> SE
SECOND        SECTION       SELECT        SERIALIZABLE  SERVER_NAME   SESSION       SESSION_USER  SET
0: jdbc:hive2://bdanode1:10000> SELECT

Conclusion

rlwrap is the tool that keeps on giving; just as I was writing this article, I noticed that it also auto-highlights opening parentheses when typing the closing one. Nice!

Categories: BI & Warehousing