
Feed aggregator

Basic (newbie) install CoreOS on VirtualBox – Getting started with Docker

Marco Gralike - Sat, 2015-04-11 06:18
I got intrigued by this Dutch article mentioning Docker and CoreOS. So on this Saturday,…

LinkedIn: I Told You So (Sorta)

Michael Feldstein - Sat, 2015-04-11 05:54

By Michael Feldstein

In December 2012, I tweeted:

Let it be known that I was the first to predict that Coursera will be acquired by LinkedIn.

— Michael Feldstein (@mfeldstein67) December 5, 2012

At the time, Coursera was the darling of online ed startups. Since then, it has lost its way somewhat, while Lynda.com has taken off like a rocket. Which is probably one big reason why LinkedIn chose to acquire Lynda.com (rather than Coursera) for $1.5 billion. I still think it’s possible that they could acquire a MOOC provider as well, but Udacity seems like a better fit than Coursera at this point.

I’ve said it before and I’ll say it again: LinkedIn is the most interesting company in ed tech.

The post LinkedIn: I Told You So (Sorta) appeared first on e-Literate.

MariaDB and MaxScale

DBMS2 - Fri, 2015-04-10 10:48

I chatted with the MariaDB folks on Tuesday. Let me start by noting:

  • MariaDB, the product, is a MySQL fork.
  • MariaDB, product and company alike, are essentially a reaction to Oracle’s acquisition of MySQL. A lot of the key players are previously from MySQL.
  • MariaDB, the company, is the former SkySQL …
  • … which acquired or is the surviving entity of a merger with The Monty Program, which originated MariaDB. According to Wikipedia, something called the MariaDB Foundation is also in the mix.
  • I get the impression SkySQL mainly provided services around MySQL, especially remote DBA.
  • It appears that a lot of MariaDB’s technical differentiation going forward is planned to be in a companion product called MaxScale, which was released into Version 1.0 general availability earlier this year.

The numbers around MariaDB are a little vague. I was given the figure that there were ~500 customers total, but I couldn’t figure out what they were customers for. Remote DBA services? MariaDB support subscriptions? Something else? I presume there are some customers in each category, but I don’t know the mix. Other notes on MariaDB the company are:

  • ~80 people in ~15 countries.
  • 20-25 engineers, which hopefully doesn’t count a few field support people.
  • “Tiny” headquarters in Helsinki.
  • Business leadership growing in the US and especially the SF area.

MariaDB, the company, also has an OEM business. Part of their pitch is licensing for connectors — specifically LGPL — that hopefully gets around some of the legal headaches for MySQL engine suppliers.

MaxScale is a proxy, which starts out by intercepting and parsing MariaDB queries.

  • As you might guess, MaxScale has a sharding story.
    • All MaxScale sharding is transparent.
    • Right now MaxScale sharding is “schema-based”, which I interpret to mean that different tables can potentially be on different servers.
    • Planned to come soon is “key-based” sharding, which I interpret to mean the kind of sharding that lets you scale a table across multiple servers without the application needing to know that is happening.
    • I didn’t ask about join performance when tables are key-sharded.
  • MaxScale includes a firewall.
  • MaxScale has 5 “well-defined” APIs, which were described as:
    • Authentication.
    • Protocol.
    • Monitoring.
    • Routing.
    • Filtering/logging.
  • I think MaxScale’s development schedule is “asynchronous” from that of the MariaDB product.
  • Further, MaxScale has a “plug-in” architecture that is said to make it easy to extend.
  • One plug-in on the roadmap is replication into Hadoop-based tables. (I think “into” is correct.)

I had trouble figuring out the differences between MariaDB’s free and enterprise editions. Specifically, I thought I heard that there were no feature differences, but I also thought I heard examples of feature differences. Further, there are third-party products included, but there are plans to replace some of those with in-house developed products in the future.

A few more notes:

  • MariaDB’s optimizer is rewritten vs. MySQL.
  • Like other vendors before it, MariaDB has gotten bored with its old version numbering scheme and jumped to 10.0.
  • One of the storage engines MariaDB ships is TokuDB. Surprisingly, TokuDB’s most appreciated benefit seems to be compression, not performance.
  • As an example of significant outside code contributions, MariaDB cites Google contributing whole-database encryption into what will be MariaDB 10.1.
  • Online schema change is on the roadmap.
  • There’s ~$20 million of venture capital in the backstory.
  • Engineering is mainly in Germany, Eastern Europe, and the US.
  • MariaDB Power8 performance is reportedly great (2X Intel Sandy Bridge or a little better). Power8 sales are mainly in Europe.
Categories: Other

Counting

Jonathan Lewis - Fri, 2015-04-10 10:27

There’s a live example on OTN at the moment of an interesting class of problem that can require some imaginative thinking. It revolves around a design that uses a row in one table to hold the low and high values for a range of values in another table. The problem is then simply to count the number of rows in the second table that fall into the range given by the first table. There’s an obvious query you can write (a join with inequality) but if you have to join each row in the first table to several million rows in the second table, then aggregate to count them, that’s an expensive strategy.  Here’s the query (with numbers of rows involved) that showed up on OTN; it’s an insert statement, and the problem is that it takes 7 hours to insert 37,600 rows:


    INSERT INTO SA_REPORT_DATA
    (REPORT_ID, CUTOFF_DATE, COL_1, COL_2, COL_3)
    (
    SELECT 'ISRP-734', to_date('&DateTo', 'YYYY-MM-DD'),
           SNE.ID AS HLR
    ,      SNR.FROM_NUMBER||' - '||SNR.TO_NUMBER AS NUMBER_RANGE
    ,      COUNT(M.MSISDN) AS AVAILABLE_MSISDNS
    FROM
           SA_NUMBER_RANGES SNR          -- 10,000 rows
    ,      SA_SERVICE_SYSTEMS SSS        --  1,643 rows
    ,      SA_NETWORK_ELEMENTS SNE       --    200 rows
    ,      SA_MSISDNS M                  --    72M rows
    WHERE
           SSS.SEQ = SNR.SRVSYS_SEQ
    AND    SSS.SYSTYP_ID = 'OMC HLR'
    AND    SNE.SEQ = SSS.NE_SEQ
    AND    SNR.ID_TYPE = 'M'
    AND    M.MSISDN  >= SNR.FROM_NUMBER
    AND    M.MSISDN  <= SNR.TO_NUMBER
    AND    M.STATE  = 'AVL'
    GROUP BY
           SNE.ID,SNR.FROM_NUMBER||' - '||SNR.TO_NUMBER
    )  

The feature here is that we are counting ranges of MSISDN: we take 10,000 number ranges (SNR) and join with inequality to a 72M row table. It’s perfectly conceivable that at some point the data set expands (not necessarily all at once) to literally tens of billions of rows that are then aggregated down to the 37,600 that are finally inserted.

The execution plan shows the optimizer joining the first three tables before doing a merge join between that result set and the relevant subset of the MSISDNs table – which means the MSISDNs have to be sorted and buffered (with a probable spill to disc) before they can be used. It would be interesting to see the rowsource execution stats for the query – partly to see how large the generated set became, but also to see if the ranges involved were so large that most of the time went in constantly re-reading the sorted MSISDNs from the temporary tablespace.

As far as optimisation is concerned, there are a couple of trivial things around the edges we can examine: we have 10,000 number ranges but insert 37,600 results, and the last stages of the plan generated those results so we’ve scanned and aggregated the sorted MSISDNs 37,600 times. Clearly we could look for a better table ordering that eliminated any number ranges early, then did the minimal number of joins to MSISDN, aggregated, then scaled up to 37,600: with the best join order we might reduce the run time by a factor of 3 or more. (But that’s still a couple of hours of run time.)

What we really need to do to make a difference is change the infrastructure in some way – preferably invisibly to the rest of the application. There are a number of specific details relating to workload, read-consistency, timing, concurrency, etc. that will need to be considered, but broadly speaking, we need to take advantage of a table that effectively holds the “interesting” MSISDNs in sorted order. I’ve kept the approach simple here; it needs a few modifications for a production system. The important bit of the report is the bit that produces the count, so I’m only going to worry about a two-table join – number ranges and msisdns; here’s some model data:


execute dbms_random.seed(0)

create table msisdns
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(1e9,1e10))      msisdn
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

create table number_ranges
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        trunc(dbms_random.value(1e9,1e10))      from_number,
        trunc(dbms_random.value(1e9,1e10))      to_number
from
        generator       v1
where
        rownum  <= 1000
;

update number_ranges set
        from_number = to_number,
        to_number = from_number
where
        to_number < from_number
;

commit;

I’ve created a table of numbers with values between 1e9 and 1e10 to represent 1 million MSISDNs, and a list of 1,000 number ranges – making sure that the FROM number is not greater than the TO number. Now I need a “summary” table of the MSISDNs, which I’m going to create as an index-organized table:


create table tmp_msisdns (
        msisdn,
        counter,
        constraint tmp_pk primary key (msisdn, counter)
)
organization index
as
select
        msisdn,
        row_number() over(order by msisdn)      counter
from
        msisdns
;

This is only a demonstration so I haven’t bothered with production-like code to check that the MSISDNs I had generated were unique (they were); and I’ve casually included the row_number() as part of the primary key as a performance fiddle, even though it’s something that could, technically, allow some other program to introduce bad data if I made the table available for public use rather than keeping it task-specific.
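
For what it’s worth, a quick sanity check along those lines is trivial; the following is just a sketch, not production-like validation:

select count(*), count(distinct msisdn)
from   msisdns
;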

Finally we get down to the report. To find out how many MSISDN values there are between the FROM and TO numbers of a range I just have to find the lowest and highest MSISDNs from tmp_msisdns in that range, find the difference between their counter values, and add 1. And there’s a very fast way to find the lowest or highest value when you have the appropriate index – the min/max range scan – but you have to access the table twice, once for the low, once for the high. Here’s the necessary SQL, with execution plan from 12.1.0.2:


select
        nr.from_number, nr.to_number,
--      fr1.msisdn, fr1.counter,
--      to1.msisdn, to1.counter,
        1 + to1.counter - fr1.counter range_count
from
        number_ranges   nr,
        tmp_msisdns     fr1,
        tmp_msisdns     to1
where
        fr1.msisdn = (
                select min(msisdn) from tmp_msisdns where tmp_msisdns.msisdn >= nr.from_number
        )
and     to1.msisdn = (
                select max(msisdn) from tmp_msisdns where tmp_msisdns.msisdn <= nr.to_number
        )
;

-------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |               |       |       |  4008 (100)|          |
|   1 |  NESTED LOOPS                   |               |  1000 | 38000 |  4008   (1)| 00:00:01 |
|   2 |   NESTED LOOPS                  |               |  1000 | 26000 |  2005   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL            | NUMBER_RANGES |  1000 | 14000 |     2   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN             | TMP_PK        |     1 |    12 |     2   (0)| 00:00:01 |
|   5 |     SORT AGGREGATE              |               |     1 |     7 |            |          |
|   6 |      FIRST ROW                  |               |     1 |     7 |     3   (0)| 00:00:01 |
|*  7 |       INDEX RANGE SCAN (MIN/MAX)| TMP_PK        |     1 |     7 |     3   (0)| 00:00:01 |
|*  8 |   INDEX RANGE SCAN              | TMP_PK        |     1 |    12 |     2   (0)| 00:00:01 |
|   9 |    SORT AGGREGATE               |               |     1 |     7 |            |          |
|  10 |     FIRST ROW                   |               |     1 |     7 |     3   (0)| 00:00:01 |
|* 11 |      INDEX RANGE SCAN (MIN/MAX) | TMP_PK        |     1 |     7 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("FR1"."MSISDN"=)
   7 - access("TMP_MSISDNS"."MSISDN">=:B1)
   8 - access("TO1"."MSISDN"=)
  11 - access("TMP_MSISDNS"."MSISDN"<=:B1)

Execution time – with 1 million MSISDNs and 1,000 ranges: 0.11 seconds.

For comparative purposes, and to check that the code is producing the right answers, here’s the basic inequality join method:


select
        nr.from_number, nr.to_number, count(*) range_count
from
        number_ranges   nr,
        msisdns         ms
where
        ms.msisdn >= nr.from_number
and     ms.msisdn <= nr.to_number
group by
        nr.from_number, nr.to_number
order by
        nr.from_number
;

-----------------------------------------------------------------------------------------------
| Id  | Operation             | Name          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |               |       |       |       |   472K(100)|          |
|   1 |  HASH GROUP BY        |               |   707K|    14M|  6847M|   472K (17)| 00:00:19 |
|   2 |   MERGE JOIN          |               |   255M|  5107M|       | 13492  (77)| 00:00:01 |
|   3 |    SORT JOIN          |               |  1000 | 14000 |       |     3  (34)| 00:00:01 |
|   4 |     TABLE ACCESS FULL | NUMBER_RANGES |  1000 | 14000 |       |     2   (0)| 00:00:01 |
|*  5 |    FILTER             |               |       |       |       |            |          |
|*  6 |     SORT JOIN         |               |  1000K|  6835K|    30M|  3451   (7)| 00:00:01 |
|   7 |      TABLE ACCESS FULL| MSISDNS       |  1000K|  6835K|       |   245  (14)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - filter("MS"."MSISDN"<="NR"."TO_NUMBER")
   6 - access("MS"."MSISDN">="NR"."FROM_NUMBER")
       filter("MS"."MSISDN">="NR"."FROM_NUMBER")

The two queries produced the same results (apart from ordering); but the second query took 2 minutes 19.4 seconds to complete.

 

Update:

In a moment of idle curiosity I recreated the data with 40 million rows in the MSISDNs table to get some idea of how fast the entire report process could go when re-engineered (remember the OP has 72M rows, but selects only the subset flagged as ‘AVL’). It took 1 minute 46 seconds to create the IOT – after which the report for 1,000 number ranges still took less than 0.2 seconds.

 

 

 

 

 


Licensing Oracle in a public cloud: the CPU calculation impact

Pythian Group - Fri, 2015-04-10 09:18

First of all a disclaimer: I don’t work for Oracle nor do I speak for them. I believe this information to be correct, but for licensing questions, Oracle themselves have the final word.

With that out of the way, followers of this blog may have seen some of the results from my testing of actual CPU capacity with public clouds like Amazon Web Services, Microsoft Azure, and Google Compute Engine. In each of these cases, a CPU “core” was actually measured to be equivalent to an x86 HyperThread, or half a physical core. So when provisioning public cloud resources, it’s important to include twice as many CPU cores as the equivalent physical hardware. The low price and elasticity of public cloud infrastructure can however offset this differential, and still result in a cost savings over physical hardware.

One place this difference in CPU core calculation can have a significant impact, however, is software licensing. In this post I’ll look at Oracle database licensing in particular.

Oracle databases can be licensed using many metrics, including unlimited use agreements, embedded licenses, evaluation/developer licenses, partner licenses, and many more. But for those without a special agreement in place with Oracle, there are two ways to license products: Named User Plus (NUP) and processor licenses. NUP licenses are per-seat licenses which have a fixed cost per physical user or non-user device. The definition of a user is very broad, however. Quoting the Oracle Software Investment Guide:

Named User Plus includes both humans and non-human operated devices. All human users and non-human operated devices that are accessing the program must be licensed. A non-human operated device can be many things, such as a temperature-monitoring device. It is important to note that if the device is operated by a person, then this person must be licensed. As described in illustration #1, the 400 employees who are operating the 30 forklifts must be licensed because the forklift is not a “non-human operated device”.

So, if the application has any connection outside the organization (batch data feeds and public web users would be examples), it’s very difficult to fit the qualifications to count as NUP licenses.

That leaves per-processor licenses, which use the processor cores that can potentially run the database software as the licensing metric. When running in a public cloud, however, there is an immediate issue: your Oracle instance could presumably run on any of the thousands of servers owned by the cloud provider, so unique physical processors are virtually impossible to count. Fortunately, Oracle has provided a way to properly license Oracle software in public cloud environments: Licensing Oracle Software in the Cloud Computing Environment. It sets out a few requirements, including:

  • Amazon EC2, Amazon S3, and Microsoft Azure are covered under the policy.
  • There are limits to the counting of sockets and the number of cores per instance for Standard Edition and Standard Edition One.

Most important, though, is the phrase “customers are required to count each virtual core as equivalent to a physical core”. Knowing that each “virtual core” is actually half a physical core, this can shift the economics of running Oracle databases in a public cloud significantly.

Here’s an example of a general-purpose AWS configuration and a close equivalent on physical hardware. I’m excluding costs of external storage and datacenter costs (power, bandwidth, etc) from the comparison.

  • m3.2xlarge
  • 8 virtual / 4 physical CPU cores (from an E5-2670 processor at 2.6GHz)
  • 30GB RAM
  • 2x80GB local SSD storage
  • 3-year term

Total: $2989 upfront

A physical-hardware equivalent:

  • A single quad-core E5-2623 v3 processor at 3GHz
  • 32GB RAM
  • Oracle standard edition one
  • 2x120GB local SSD
  • 3-year 24×7 4hr on-site service

I priced this out at dell.com and came out with a total of $3761.

Now let’s add in an Oracle license. From the Oracle Price List, a socket of Standard Edition One costs $5800, with an additional $1276/year for support. Due to the counting of CPU cores, our AWS hardware requires two sockets of licensing. So instead of saving $772, we end up paying $9628 more.

[Table: Standard Edition One cost comparison]

If we were to use Oracle Enterprise edition (excluding any options or discounts), that becomes an extra $157,700. Not small change anymore.

[Table: Enterprise Edition cost comparison]
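
For reference, here is a back-of-the-envelope sketch of where those figures come from. The assumptions are mine rather than from the vendor quotes: list prices of $5,800 per SE One socket with $1,276/year support, $47,500 per EE processor with 22% annual support, a 0.5 Intel core factor, and a 3-year term.

-- Sketch: extra licensing cost of the AWS instance vs. the physical server
select
       -- SE One: 8 virtual cores count as 2 sockets, vs. 1 physical socket
       (2 - 1) * (5800 + 3 * 1276)            as se1_extra_license_cost,   -- 9,628
       -- EE: 8 virtual cores x 0.5 core factor = 4 licenses, vs. 4 x 0.5 = 2
       (4 - 2) * (47500 + 3 * 0.22 * 47500)   as ee_extra_license_cost     -- 157,700
from   dual;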

So before you make the jump to put your Oracle databases on a public cloud, check your CPU core counts to avoid unexpected licensing surprises.

Categories: DBA Blogs

SQLDeveloper VAR NUMBER

Dominic Brooks - Fri, 2015-04-10 08:59

Just been tearing my hair out on some unexpected SQL behaviour in SQL Developer.
Eventually I managed to get to the root cause of the problem as illustrated below.
There is a silent failure which can cause unexpected results from your SQL.

SQL*Plus:

SQL> var ps number
SQL> exec :ps := 4001644945;

PL/SQL procedure successfully completed.

SQL> select :ps from dual;

       :PS
----------
4001644945

SQL>

SQL Developer v3.2 and v4.0 via “Run As Script(F5)”:

var ps number
exec :ps := 4001644945;
select :ps from dual;

anonymous block completed
       :PS
----------
         0 

Looks like SQL Developer is running into an integer problem and silently “dealing” with it.

var ps number
exec :ps := 2147483647
select :ps from dual;
exec :ps := 2147483648
select :ps from dual;

anonymous block completed
       :PS
----------
2147483647 

anonymous block completed
       :PS
----------
         0 

No such problem when you do a Run (F9) and use the bind dialog.
One to watch out for if you do a lot of sql execution via var like me.
Think I’ve seen it before. Think I might even have blogged about it before!


Come See Integrigy at Collaborate 2015

Come see Integrigy's session at Collaborate 2015 in Las Vegas (http://collaborate.ioug.org/). Integrigy is presenting the following paper:

IOUG #763
Detecting and Stopping Cyber Attacks against Oracle Databases
Monday, April 13th, 9:15 - 11:30 am
North Convention, South Pacific J

If you are going to Collaborate 2015, we would also be more than happy to talk with you about Oracle security or any questions you may have. If you would like to talk with us while at Collaborate, please contact us at info@integrigy.com.

 

Tags: Conference
Categories: APPS Blogs, Security Blogs

Log Buffer #418: A Carnival of the Vanities for DBAs

Pythian Group - Fri, 2015-04-10 08:20

This Log Buffer edition has collected some of the valuable blog posts from different databases like Oracle, SQL Server and MySQL.

Oracle:

  • Accessing HDFS files on a local File system using mountable HDFS – FUSE
  • enq: TM – contention
  • The Four A’s of Data Management
  • ODI, Big Data SQL and Oracle NoSQL
  • Using the RIDC Client to Interface with Oracle Webcenter Content

SQL Server:

  • SQL Server 2014 has introduced a rebuilt Cardinality Estimator (CE) with new algorithms
  • Creating a multi-option parameter report for SQL Server Reporting Services
  • Re-factoring a database object can often cause unexpected behavior in the code that accesses that object
  • What is Database Continuous Integration?
  • Deleting Historical Data from a Large Highly Concurrent SQL Server Database Table

MySQL:

For years it was very easy to defend InnoDB’s advantage over competition. Covering index reads were saving I/O operations and CPU everywhere, table space and I/O management allowed focusing on database and not on file systems or virtual memory behaviors, and for the past few years InnoDB compression was the way to have highly efficient OLTP.

InnoDB locks and deadlocks with or without index for different isolation level.

pquery binaries with statically included client libs now available!

MySQL Group Replication – mysql-5.7.6-labs-group-replication.

MySQL 5.7 aims to be the most secure MySQL Server release ever, and that means some significant changes in SSL/TLS.

Categories: DBA Blogs

Log Buffer #417: A Carnival of the Vanities for DBAs

Pythian Group - Fri, 2015-04-10 08:06

This Log Buffer travels wide and deep to scour through the Internet to bring some of the most valuable and value-adding blog posts from Oracle, SQL Server and MySQL.

Oracle:

What is SQLcl? SQLcl is a new command-line interface, like SQL*Plus, coming along with SQL Developer 4.1 Early Adopter. It’s a lightweight tool (only 11MB) developed by the SQL Developer team, which is fully compatible with Windows and Unix/Linux. Also, you don’t need to install it, so it’s totally portable.

Find Users with DBA Roles.

Virtual Compute Appliance 2.0.2 Released.

In case you are not familiar with WLST (the WebLogic Scripting Tool), it is a powerful scripting runtime for administering WebLogic domains.

The following article gives some useful hints-and-tips Richard used recently in helping people customizing tables and lists-of-values using Page Composer.

SQL Server:

With the idea of a generic Dacpac defined by international standard, comes the potential for a Visual Studio developer. This uses SSDT to create a generic database model to a SQL-92 compliant standard that can then be deployed to any one of the major RDBMSs.

Using the APPLY operator to reduce repetition and make queries DRYer.

Imagine a situation where you use the SQL Server RAND() T-SQL function as a column in a SELECT statement, and the same value is returned for every row as shown below. In this tip, Dallas Snider explains how you can get differing random values on each row.

This article describes two ways to shred Unicode Japanese characters from xls files into a SQL Server table using SSIS.

Arshad Ali demonstrates how you can use the command line interface to tune SQL queries and how you can use SQL Server Profiler to capture the workload for tuning with Database Engine Tuning Advisor.

MySQL:

Postgres performance since 7.4.

The Ubuntu 12.04.3 LTS release only provides MySQL 5.1 and MySQL 5.5 using the default Ubuntu package manager.

As part of a MySQL 5.5 to MySQL 5.6 upgrade across several Ubuntu servers of varying distros an audit highlighted a trivial but interesting versioning identification error in Ubuntu’s packaging of MySQL.

MySQL 5.6 will now automatically recreate the InnoDB redo log files during a MySQL restart if the size (or number) of these logs changes, i.e. a change to innodb_log_file_size.

Mermaids have the same probability of fixing your permission problems, but people continue believing in the FLUSH PRIVILEGES myth.

Categories: DBA Blogs

Disable Lock Escalation in SQL Server

Pythian Group - Fri, 2015-04-10 07:55

If a lot of rows or pages are locked, SQL Server escalates to a table-level lock to save resources. Each single lock takes approximately 100 bytes, so if you have many locks it takes a lot of resources to manage them. (There is a great blog post about lock escalation if you want some more info: http://blogs.msdn.com/b/sqlserverstorageengine/archive/2006/05/17/lock-escalation.aspx)

 

Until SQL Server 2008, there was no way to change lock escalation for a single table. You could only deactivate escalation for the whole server by using trace flags:

  • 1211 - Disables lock escalation completely; locks may use up to 60% of the allocated memory, and if that threshold is reached and more locking is needed you will get an out-of-memory error.
  • 1224 - Disables lock escalation until the memory threshold of 40% of allocated memory is reached; after that, lock escalation is enabled.

 

But in most cases that was not a good choice, and it caused a lot of performance problems. In SQL Server 2008 and above there is a new table option (ALTER TABLE) that can be used to change the default lock escalation behavior. This helps if you have a table where you want to disable escalation, or if the table is partitioned.

 

On a partitioned table, activating the AUTO option can improve concurrency by escalating locks to the partition level rather than the table level.

ALTER TABLE – table option:

SET ( LOCK_ESCALATION = { AUTO | TABLE | DISABLE } )

  • AUTO (should be considered if you have a partitioned table)
    • If the table is partitioned, locks are escalated to the partition level
    • If the table is not partitioned, locks are escalated to the table level
  • TABLE
    • Default behavior
    • Locks are escalated to the table level
  • DISABLE
    • Lock escalation to the table level is deactivated in most cases
    • In some necessary cases it is still allowed to escalate to the table level
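
For illustration, here is a minimal sketch against a hypothetical table (dbo.Orders is an assumed name, not from this post); the lock_escalation_desc column of sys.tables confirms the current setting:

-- Switch a (hypothetical) partitioned table to partition-level escalation
ALTER TABLE dbo.Orders SET (LOCK_ESCALATION = AUTO);

-- Verify the setting (SQL Server 2008 and above)
SELECT name, lock_escalation_desc
FROM sys.tables
WHERE name = 'Orders';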

This is a cool feature that many developers are not aware of.

Thanks for Reading!

Categories: DBA Blogs

APEX 5.0: Declarative Definition of CSS and JavaScript Files to Load for a Plug-in

Patrick Wolf - Fri, 2015-04-10 04:58
A best practice for performant HTML pages is to load CSS files in the HTML head to reduce redrawing of the page when a CSS file gets loaded. But because the PL/SQL code of a Plug-in isn’t executed until a … Continue reading →
Categories: Development

Identify difference in CBO parameters across two executions of a SQL

Oracle in Action - Fri, 2015-04-10 01:16


Various initialization parameters influence the optimizer’s choice of execution plan. Multiple executions of the same SQL with different optimizer parameters may employ different execution plans and hence differ in performance. In this post, I will demonstrate how we can identify the CBO parameters which had different values during two executions of the same SQL.

I will explore two scenarios :

  • the cursors are still available in shared pool
  • the cursors have been flushed to AWR

Overview:

  •  Flush shared pool
  •  Execute the same statement in two sessions with different values of OPTIMIZER_MODE
  •  Verify that the cursors for the statement are still in the shared pool
  •  Find out the execution plans employed by two cursors
  •  Find out the optimizer parameters which had different values during two executions of the SQL
  •  Take snapshot so that cursors are flushed to AWR
  •  Verify that the cursors for the statement are in AWR
  •  Find out the execution plans employed by two cursors
  •  Find out the value of parameter OPTIMIZER_MODE during two executions of the SQL

Implementation

- Flush shared pool

SQL>conn / as sysdba

alter system flush shared_pool;

- Execute the same statement in two sessions with different values of OPTIMIZER_MODE

– Note that there is a difference in the elapsed time of the two executions of the same statement

– Session – I –

SQL>alter session set optimizer_mode = 'ALL_ROWS';

set timing on
select s.quantity_sold, s.amount_sold, p.prod_name from sh.sales s,
sh.products p where s.prod_id = p.prod_id;
set timing off

918843 rows selected.

Elapsed: 00:01:19.57

-- Session - II  --
SQL>alter session set optimizer_mode = 'FIRST_ROWS_1';

set timing on
select s.quantity_sold, s.amount_sold, p.prod_name from sh.sales s,
sh.products p where s.prod_id = p.prod_id;
set timing off

918843 rows selected.

Elapsed: 00:01:16.63

 – Verify that the cursors for the statement are still in the shared pool and find out SQL_ID of the statement

SQL>set pagesize 200
col sql_text for a50

select sql_id, child_number, sql_text
from v$sql
where sql_text like '%select  s.quantity_sold, s.amount_sold%';

SQL_ID CHILD_NUMBER SQL_TEXT
------------- ------------ ----------------------------------------------
394uuwwbyjdnh 0 select sql_id, child_number, sql_text from v$sql w
here sql_text like '%select s.quantity_sold, s.am
ount_sold%'

6y2xaw3asr6xv 0 select s.quantity_sold, s.amount_sold, p.prod_nam
e from sh.sales s, sh.products p where s.prod_id =
p.prod_id

6y2xaw3asr6xv 1 select s.quantity_sold, s.amount_sold, p.prod_nam
e from sh.sales s, sh.products p where s.prod_id =
p.prod_id

 – Find out the execution plans employed by two cursors

Note that

  • child 0 uses hash join whereas child 1 employs nested loops join
  • Outline Data shows that optimizer_mode was ALL_ROWS during first execution and FIRST_ROWS(1) during second execution.
SQL>select * from table (dbms_xplan.display_cursor('6y2xaw3asr6xv', 0, format => 'TYPICAL +OUTLINE'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
SQL_ID 6y2xaw3asr6xv, child number 0
-------------------------------------
select s.quantity_sold, s.amount_sold, p.prod_name from sh.sales s,
sh.products p where s.prod_id = p.prod_id

Plan hash value: 1019954709
-------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |       |       |   495 (100)|          |       |       |
|*  1 |  HASH JOIN           |          |   918K|    36M|   495   (3)| 00:00:06 |       |       |
|   2 |   TABLE ACCESS FULL  | PRODUCTS |    72 |  2160 |     3   (0)| 00:00:01 |       |       |
|   3 |   PARTITION RANGE ALL|          |   918K|    10M|   489   (2)| 00:00:06 |     1 |    28 |
|   4 |    TABLE ACCESS FULL | SALES    |   918K|    10M|   489   (2)| 00:00:06 |     1 |    28 |
-------------------------------------------------------------------------------------------------

Outline Data
-------------
 /*+
 BEGIN_OUTLINE_DATA
 IGNORE_OPTIM_EMBEDDED_HINTS
 OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
 DB_VERSION('11.2.0.1')
 ALL_ROWS
 OUTLINE_LEAF(@"SEL$1")
 FULL(@"SEL$1" "P"@"SEL$1")
 FULL(@"SEL$1" "S"@"SEL$1")
 LEADING(@"SEL$1" "P"@"SEL$1" "S"@"SEL$1")
 USE_HASH(@"SEL$1" "S"@"SEL$1")
 END_OUTLINE_DATA
 */

Predicate Information (identified by operation id):
---------------------------------------------------
 1 - access("S"."PROD_ID"="P"."PROD_ID")
select * from table (dbms_xplan.display_cursor('6y2xaw3asr6xv', 1, format => 'TYPICAL +OUTLINE'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
SQL_ID 6y2xaw3asr6xv, child number 1
-------------------------------------
select s.quantity_sold, s.amount_sold, p.prod_name from sh.sales s,
sh.products p where s.prod_id = p.prod_id

Plan hash value: 3603960078
------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |             |       |       |     3 (100)|          |       |       |
|   1 |  NESTED LOOPS                 |             |       |       |            |          |       |       |
|   2 |   NESTED LOOPS                |             |     1 |    42 |     3   (0)| 00:00:01 |       |       |
|   3 |    PARTITION RANGE ALL        |             |     1 |    12 |     2   (0)| 00:00:01 |     1 |    28 |
|   4 |     TABLE ACCESS FULL         | SALES       |     1 |    12 |     2   (0)| 00:00:01 |     1 |    28 |
|*  5 |    INDEX UNIQUE SCAN          | PRODUCTS_PK |     1 |       |     0   (0)|          |       |       |
|   6 |   TABLE ACCESS BY INDEX ROWID | PRODUCTS    |     1 |    30 |     1   (0)| 00:00:01 |       |       |
------------------------------------------------------------------------------------------------------------

Outline Data
-------------
 /*+
 BEGIN_OUTLINE_DATA
 IGNORE_OPTIM_EMBEDDED_HINTS
 OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
 DB_VERSION('11.2.0.1')
 FIRST_ROWS(1)
 OUTLINE_LEAF(@"SEL$1")
 FULL(@"SEL$1" "S"@"SEL$1")
 INDEX(@"SEL$1" "P"@"SEL$1" ("PRODUCTS"."PROD_ID"))
 LEADING(@"SEL$1" "S"@"SEL$1" "P"@"SEL$1")
 USE_NL(@"SEL$1" "P"@"SEL$1")
 NLJ_BATCHING(@"SEL$1" "P"@"SEL$1")
 END_OUTLINE_DATA
 */

Predicate Information (identified by operation id):
---------------------------------------------------
 5 - access("S"."PROD_ID"="P"."PROD_ID")

- It can be verified from v$sql_shared_cursor also that there was optimizer mode mismatch between two executions of the SQL

SQL>select SQL_ID, child_number, OPTIMIZER_MODE_MISMATCH
from v$sql_shared_cursor
where sql_id = '6y2xaw3asr6xv';

SQL_ID        CHILD_NUMBER O
------------- ------------ -
6y2xaw3asr6xv            0 N
6y2xaw3asr6xv            1 Y

 – We can also employ v$sql_optimizer_env to find out optimizer parameters which have different values during 2 executions

SQL>set line 500
col name for a15

select child1.name, child1.value child1_value, child2.value child2_value
from v$sql_optimizer_env child1, v$sql_optimizer_env child2
where child1.sql_id = '6y2xaw3asr6xv' and child1.name like 'optimizer%'
and child2.sql_id = '6y2xaw3asr6xv' and child2.name like 'optimizer%'
and child1.name = child2.name
and child1.value <> child2.value;

NAME            CHILD1_VALUE              CHILD2_VALUE
--------------- ------------------------- -------------------------
optimizer_mode  first_rows_1              all_rows
optimizer_mode  all_rows                  first_rows_1
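
Each mismatch appears twice above because the self-join matches the two child cursors in both orders. A slightly refined version (a sketch, not from the original post) reports the child numbers as well and lists each difference only once:

select child1.child_number child_no_1, child2.child_number child_no_2,
       child1.name, child1.value child1_value, child2.value child2_value
from v$sql_optimizer_env child1, v$sql_optimizer_env child2
where child1.sql_id = '6y2xaw3asr6xv'
and child2.sql_id = child1.sql_id
and child1.child_number < child2.child_number
and child2.name = child1.name
and child1.value <> child2.value;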

 – Take snapshot so that cursors are flushed to AWR

SQL>exec dbms_workload_repository.create_snapshot ('ALL');

 – Verify that the statement is  in AWR

SQL> set pagesize 200
SQL> col sql_text for a50
SQL>  select sql_id, sql_text from dba_hist_sqltext where sql_text like '%select s.quantity_sold, s.amount_sold%';
SQL_ID SQL_TEXT

------------- --------------------------------------------------
6y2xaw3asr6xv select s.quantity_sold, s.amount_sold, p.prod_nam
e from sh.sales s,
sh.products

9fr7ycjcfqydb select sql_id, sql_text from v$sql where sql_text
like '%select s.quantity_sold

 – Find out the execution plans that were  employed by various  cursors of the same statement

Note that

  • one cursor  uses hash join whereas other cursor employs nested loops join
  • The value of parameter OPTIMIZER_MODE was ALL_ROWS during one execution and FIRST_ROWS(1) during the second execution
SQL>select * from table(dbms_xplan.display_awr ('6y2xaw3asr6xv', format => 'TYPICAL +OUTLINE'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
SQL_ID 6y2xaw3asr6xv
--------------------
select s.quantity_sold, s.amount_sold, p.prod_name from sh.sales s,
sh.products p where s.prod_id = p.prod_id

Plan hash value: 1019954709
-------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |       |       |   495 (100)|          |       |       |
|   1 |  HASH JOIN           |          |   918K|    36M|   495   (3)| 00:00:06 |       |       |
|   2 |   TABLE ACCESS FULL  | PRODUCTS |    72 |  2160 |     3   (0)| 00:00:01 |       |       |
|   3 |   PARTITION RANGE ALL|          |   918K|    10M|   489   (2)| 00:00:06 |     1 |    28 |
|   4 |    TABLE ACCESS FULL | SALES    |   918K|    10M|   489   (2)| 00:00:06 |     1 |    28 |
-------------------------------------------------------------------------------------------------

Outline Data
-------------
 /*+
 BEGIN_OUTLINE_DATA
 IGNORE_OPTIM_EMBEDDED_HINTS
 OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
 DB_VERSION('11.2.0.1')
 ALL_ROWS
 OUTLINE_LEAF(@"SEL$1")
 FULL(@"SEL$1" "P"@"SEL$1")
 FULL(@"SEL$1" "S"@"SEL$1")
 LEADING(@"SEL$1" "P"@"SEL$1" "S"@"SEL$1")
 USE_HASH(@"SEL$1" "S"@"SEL$1")
 END_OUTLINE_DATA
 */

SQL_ID 6y2xaw3asr6xv
--------------------
select s.quantity_sold, s.amount_sold, p.prod_name from sh.sales s,
sh.products p where s.prod_id = p.prod_id

Plan hash value: 3603960078
------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |             |       |       |     3 (100)|          |       |       |
|   1 |  NESTED LOOPS                 |             |       |       |            |          |       |       |
|   2 |   NESTED LOOPS                |             |     1 |    42 |     3   (0)| 00:00:01 |       |       |
|   3 |    PARTITION RANGE ALL        |             |     1 |    12 |     2   (0)| 00:00:01 |     1 |    28 |
|   4 |     TABLE ACCESS FULL         | SALES       |     1 |    12 |     2   (0)| 00:00:01 |     1 |    28 |
|   5 |    INDEX UNIQUE SCAN          | PRODUCTS_PK |     1 |       |     0   (0)|          |       |       |
|   6 |   TABLE ACCESS BY INDEX ROWID | PRODUCTS    |     1 |    30 |     1   (0)| 00:00:01 |       |       |
------------------------------------------------------------------------------------------------------------

Outline Data
-------------
 /*+
 BEGIN_OUTLINE_DATA
 IGNORE_OPTIM_EMBEDDED_HINTS
 OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
 DB_VERSION('11.2.0.1')
 FIRST_ROWS(1)
 OUTLINE_LEAF(@"SEL$1")
 FULL(@"SEL$1" "S"@"SEL$1")
 INDEX(@"SEL$1" "P"@"SEL$1" ("PRODUCTS"."PROD_ID"))
 LEADING(@"SEL$1" "S"@"SEL$1" "P"@"SEL$1")
 USE_NL(@"SEL$1" "P"@"SEL$1")
 NLJ_BATCHING(@"SEL$1" "P"@"SEL$1")
 END_OUTLINE_DATA
 */

 — We can also use dba_hist_sqlstat to check the optimizer mode during two executions

SQL>select sql_text, optimizer_mode   
   from dba_hist_sqlstat h, dba_hist_sqltext t
     where h.sql_id = '6y2xaw3asr6xv'
       and t.sql_id = '6y2xaw3asr6xv';

SQL_TEXT OPTIMIZER_
-------------------------------------------------- ----------
select s.quantity_sold, s.amount_sold, p.prod_nam ALL_ROWS
e from sh.sales s,
sh.products

select s.quantity_sold, s.amount_sold, p.prod_nam FIRST_ROWS
e from sh.sales s,
sh.products
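
A small variation (again just a sketch) joins the two views on sql_id and adds the plan hash value, so that each optimizer mode is tied to the plan it produced:

select t.sql_text, h.plan_hash_value, h.optimizer_mode
from dba_hist_sqlstat h, dba_hist_sqltext t
where h.sql_id = '6y2xaw3asr6xv'
and t.sql_id = h.sql_id;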

Summary:

The difference in the values of various CBO parameters across executions of the same statement can be identified while the statement is still in the shared pool, or after it has been flushed to AWR.


The post Identify difference in CBO parameters across two executions of a SQL appeared first on ORACLE IN ACTION.

Categories: DBA Blogs

Video Tutorial: XPLAN_ASH Active Session History - Part 5

Randolf Geist - Fri, 2015-04-10 01:02
The next part of the video tutorial explaining the XPLAN_ASH Active Session History functionality continuing the actual walk-through of the script output.

More parts to follow.

Dbvisit replicate REDO_READ_METHOD

Yann Neuhaus - Thu, 2015-04-09 12:15

A frequent question about replication is the overhead on the source, because in a lot of cases the source is production. Dbvisit replicate comes with the possibility of doing the minimum on the source: only the FETCHER process is there to read the redo logs and send them to the MINE process, which can be on another server.

But maybe even that - reading those critical online redo logs - is worrying you. That's not a reason to avoid doing a PoC with production as the source. Let's see how we can use 'archived logs only'.

Oracle Priority Support Infogram for 09-APR-2015

Oracle Infogram - Thu, 2015-04-09 12:05

RDBMS
How to log on even when SYSDBA can't do so? from Tanel Poder at tech.e2sn.com.
How-to set the current database schema of an application using a global context, from the AMIS Technology Blog.
SQL Developer
Using the SQL Developer to Browse NoSQL Datastores, from Oracle University.
SQL
An article that will help navigate the uncertainties of the sea of functions ‘around the center’: MEDIAN: FOR WHEN YOU DON'T REALLY MEAN IT, from Johnny-G Incorporated.
Exalogic
Avoiding Memory Leaks in Exalogic when using the SDP Support from JDK 1.7, from WebLogic Partner Community EMEA.
Solaris
Getting fixes faster, from Solaris 11 Maintenance Lifecycle.
Ops Center
Adding an Asset to a Group During Discovery, from the Ops Center blog.
SOA
From the SOA & BPM Partner Community Blog: Top SOA articles at OTN –thanks to the Community
Fusion
Updates to the Fusion Middleware Lifetime Support Policy have been published.  Forms/Reports have been extended.
Mobile Computing
From The Oracle Mobile Platform Blog:
How to implement iBeacon in MAF 2.1.1
How to implement push notifications in MAF 2.1.1
Data Integration
ODI, Big Data SQL and Oracle NoSQL, from Data Integration.
Free Stuff
Good news! You can get free access to books. But…you have to live in California or move there in order to get your books: OT: Never buy an Oracle book again (if you live in California), from Toad World.
EBS
From the Oracle E-Business Suite Support blog:
Webcast: Implement & Customize Transaction Account Derivation Rules (TADR) in AGIS
Check Out the EBS Procurement Approval Analyzer
Release of the EBS Procurement Encumbrance Accounting Analyzer!
Eradicate FRM-40654 or FRM-4073 When Updating Your Items
Webcast: Installation and Setup of Manufacturing Operations Center
Webcast: Project Manufacturing Costing
Webcast: Demantra Debugging Using Proactive Services Tools Plus Latest Tips and Tricks
From the Oracle E-Business Suite Technology blog:
Third Recommended Patch Collection for EBS 12.1.3 Now Available
Reminder: Upgrade EBS 12.2 to FMW 11.1.1.7

E-Business Suite Technology Sessions at OAUG Collaborate 15

Four Weeks with the Basis Peak

Oracle AppsLab - Thu, 2015-04-09 10:53

Right after I finished wearing the Nike+ Fuelband for three weeks, I moved straight on to another wearable device, the Basis Peak.

The Peak falls into a category Ultan (@ultan, @usableapps) calls the “super watch,” a term that nicely differentiates watches like the Peak (and the Fitbit Surge) from fitness bands (e.g. Jawbone UP24, Nike+ Fuelband, Fitbit Flex), smartwatches (e.g. Android Wear, Apple Watch, Pebble), and serious athletic training gadgets (e.g. Garmin, Polar).

Look at that list and tell me it doesn’t need differentiation. Wearables are definitely a thing.

I’ve been curious about the Basis since before the company was acquired by Intel. Lab alumna, Joyce (@joybot12), had lots of good things to say about the Peak’s ancestor, the Basis B1, and the device collects a lot of data. And I love data, especially data about me.

Unfortunately, the company doesn’t offer any developer integrations, just an export feature.

Anyway, here are some real reviews by people who do reviews for a living; check those out before reading my ramblings: Engadget, PC Magazine, and this guy.

The watch

Basis bills the Peak as the “ultimate fitness and sleep tracker,” and the device packs an impressive array of technology into a small package. For sensors, it has:

  • Optical Heart Rate Sensor
  • Galvanic Skin Response
  • Skin Temperature
  • 3-Axis Accelerometer

Plus, the Peak has a nifty gray-scale touchscreen display and a backlight that I eventually discovered, which is nice, albeit not terribly intuitive. I found all the gestures a bit clunky OOTB, and I can’t be the only one because Basis sent an email of how-tos to me right after I created my account. But, like anything, once I learned them, all was good.

Fun fact, the little guy is water resistant up to 5 ATM or 50 meters of pressure, and I took it swimming without any leakage.

The optical heart rate sensor is pretty cool (lasers!), and it makes for some spooky lighting in the dark, which is a fun way to creep out your spouse a la Blair Witch/Cloverfield.

I read somewhere that this type of heart rate monitor isn’t real time; I did notice that it was frequently searching for a heart rate, but the charts would show a continuous number. So, if you’re into constant heart rate, this isn’t for you, but it was good enough for me.


After wearing the Fuelband, the Peak felt bulky, and its rubber band wasn’t terribly comfortable. Actually, it was uncomfortable, especially since to get the best sensor data, you’re supposed to keep it tight.

On Day 2, I was positive I wouldn’t make it a week, let alone three, but I got used to it. Plus, the data kept me going, more on that in a bit.

The housing on the underside of the band did leave a nice mark after a few days, but that disappeared shortly after I stopped wearing the Peak.


Like every other device, it uses Bluetooth Low Energy (BLE) to sync with a smartphone, and Basis has apps for both iOS and Android.

Syncing the Peak with its smartphone was often an adventure. The watch would frequently lose its BLE connection with the phone, and I learned quickly that trying to reset that connection was futile. I tried and tried and eventually had to remove and re-associate the watch with the app to get the data flowing.

Because syncing was such a chore, I missed the instant gratification after a run, quantifying the steps, calories, etc. At one point, I confused the watch accidentally and lost about a day’s worth of data. I changed the date on the phone (cough, Candy Crush), and during that one minute period, the watch synced.

The date change confused the watch, and I couldn’t reset it without dumping all its data and starting from scratch. Definitely my bad, but given how often the Peak wouldn’t sync, it seemed a bit ironic.

The Peak’s battery is solid, even with all the tech onboard; functionally, I saw about 5 days on the first charge, not too shabby.

And finally, the Peak gets attention. Maybe it was the white band, or more likely the Apple Watch buzz, but several people asked me about it, a few assuming it was the Watch itself. If nothing else, the rising interest tide in the Watch has raised the collective consciousness about technology on the wrist.

The app and data

The only reason I soldiered through the discomfort of wearing the Peak is because it produces an impressive array of data, and I love me some data about me.

I’ll start with the smartphone app, which I didn’t use much except for glance-scan type information because Basis doesn’t follow Jeremy’s (@jrwashley) 10-90-90 rule for their mobile app, i.e. they cram all the graphs and information into the small viewport.

For reference, 10-90-90 refers to 10% of the tasks that 90% of users need 90% of the time. This provides a baseline to scale experiences down to less-capable devices in a thoughtful way.

I get why Basis built their smartphone app this way though; it allows the user to get full information in mobile way. The My Basis web app provides all the data in a very appealing set of visualizations, and this is where I went to pour over the data I’d generated.


As with the other wearables we’ve tested (Misfit Shine, Nike+ Fuelband, Fitbit Surge), the Peak has game elements to encourage activity, called Habits. One of the first Habits that comes unlocked OOTB is called “Wear It,” which you can achieve by simply wearing the band for 12 hours.

[Screenshot: Habits in the Basis app]

This tells you a lot about the comfort of the device.

Unlocked Habits are pretty basic, burn 2,500 calories, take 10,000 steps, and as you achieve them, more difficult habits can be unlocked and added, e.g. run for 30 minutes, move every hour, get more sleep, etc.

The thresholds for these Habits are configurable, but none of them is overly challenging. As you progress, you’ll find yourself working on half a dozen or more Habits every day. Habits can be paused, which I found valuable when I went on vacation last week.

Overall, the game seems targeted at casual users vs. athletes, but oddly, the data collected seems like the detail that athletes would find valuable. Maybe I didn’t play long enough.

Ah the data.

The Peak collects the usual stuff, calories and steps, and also heart rate. Additionally, it measures skin temperature and perspiration level, although I’m not sure what to do with those.

[Screenshot: Activity Details in the Basis web app]

[Screenshot: Patterns in the Basis web app]

On the sleep side, the Peak measures, light sleep, REM sleep and deep sleep, and it tracks interruptions and tossing and turning.

[Screenshot: Sleep Details in the Basis web app]

While the Shine and Fuelband made me nutty-compulsive about activity data, the Peak turned my compulsion toward the sleep data. I found myself studying the numbers, questioning them and trying to sleep-hack.

Not that any of that mattered, sadly, I live a poor-sleep lifestyle.

In other news, I finally found my personal killer use case for smart/superwatches: glanceable phone and text notifications. Because I carry my phone in my back pocket, I often miss calls and texts, but not with the Peak on my wrist.

My wife especially loved this feature.

In the past with the Pebble and Samsung Gear Live, I had too many notifications turned on, email, calendar, etc., and I didn’t wear these devices long enough to modify the settings.

Finally, the Peak helped me realize what a sadly inactive life I lead. 10,000 steps was a challenge for me every day, unless I went to the gym for a run.

I felt a twinge when it came time to take off the Nike+ Fuelband, and despite the discomfort, I pondered wearing the Basis Peak for longer too, specifically for the data it collected.

Maybe I’ve stumbled onto something, like wearables-detachment disorder. These are very personal devices, and I wonder if people develop an emotional attachment to them.

We’ll see when I’m done testing the next wearable.

Find the comments.

Previewing Four Sessions at the Atlanta Rittman Mead BI Forum 2015

Rittman Mead Consulting - Thu, 2015-04-09 08:00

In a post earlier this week I previewed three sessions at the upcoming Brighton Rittman Mead BI Forum 2015; in this post I’m going to look at four particularly interesting sessions at the Atlanta Rittman Mead BI Forum 2015 event running the week after Brighton, on May 13th-15th 2015 at the Renaissance Atlanta Midtown Hotel, Atlanta GA. As well as an optional one-day masterclass on big data development by myself and Jordan Meyer on the 13th, the main event itself has keynotes and product update sessions from Oracle’s BI product management team, a data visualisation challenge and a guest talk by John Foreman, author of the book “Data Smart” and Chief Data Scientist at Mailchimp; in terms of the main sessions though there are four that I’m particularly interested in, starting with one by a speaker new to the BI Forum, Qualogy’s Hasso Schaap, who’ll be talking to us about their use of Oracle’s new BI Cloud Service in his session “Developing strategic analytics applications on OBICS PaaS”


“In this session I’ll tell how we use the Oracle BI Cloud Service in our development plans for a strategic analytics application. Focussing on Strategic HR Planning there’s so much you can do with your data that we decided to put it in a packaged app. I will discuss the important parts of the development process and show how we fixed the issues we came up with. Developing in the BI Cloud is different and expectations are also different. 
As an example there’s the part of prediction. How do we predict based on data in the BI Cloud and what are other possibilities. With prediction we were able to tell our customers a different story. A story that was different than before using old-school tools and techniques. In this session I will uncover some of the most appreciated functionality and will happily elaborate on the story behind ‘The present, the future, development and scenario planning’.”

My second featured session is by someone very-well known to previous BI Forum attendees, and to the wider Oracle BI+DW community: Stewart Bryson. Stewart of course used to head-up Rittman Mead in the US and then went-on to become our first Chief Innovation Officer, before leaving to start his own company Red Pill Analytics with Kevin McGinley, another old friend of Rittman Mead and the BI Forum. We’re very pleased to have both Stewart and Kevin delivering sessions at the Atlanta BI Forum, and for Stewart’s session he’s talking about something very close to his heart – “Supercharging BI Delivery with Continuous Integration”:


“One of the things I’ve never understood about the lifecycle features in most BI tools is why the designers feel the need to roll their own source control and DevOps features. Instead of focusing on deeper integration with tools and processes that exist in the other 90% of development paradigms, BI vendors instead start with a clean palette and create something completely siloed and desperately alone. 
In this presentation, we’ll take a look at how some of these other development paradigms approach DevOps — paying perhaps the closest attention to the world of Java development and other JVM languages. We’ll see how approaches such as continuous integration and continuous delivery play a part in rapid, iterative delivery, and how we can apply some of those approaches to the world of OBIEE development.”

My third session is by another speaker new to the BI Forum, but someone who’s well-known in the BI and data warehousing world and who I met in-person for the first time at last year’s Oracle Openworld: Sumit Sarkar. Sumit works for Progress Software, makers of the DataDirect ODBC drivers that power OBIEE’s connection to Hadoop, for example, as well as connectors to MongoDB, Salesforce, Oracle RightNow and Eloqua. As he’ll explain in his session “Make sense of NoSQL data using OBIEE”:


“NoSQL databases have stormed the top 10 db-engines rankings with MongoDB at #4 and Cassandra at #8.  It’s inevitable that these NoSQL databases, storing unstructured data without a standard query language, will have BI requirements for unarmed OBIEE teams.  Not even a complete Oracle stack can save you with the release of Oracle NoSQL. This will be the first session of its kind to tackle standards based NoSQL connectivity.  
So join me at BI Forum ’15  to take control of NoSQL data with your RPD and expand big data skills and thought leadership within your organization.  Learn how organizations are using SQL access to NoSQL databases for integration across existing business intelligence platforms. We’ll talk about common challenges and gotchas that shops are facing when exposing unstructured NoSQL data to OBIEE.  It can get out of hand pretty quickly otherwise …”

My final selection is from CERN, the European Organization for Nuclear Research and home of course of the Large Hadron Collider (and who announced on April 1st the first unequivocal evidence for The Force, almost upstaging our announcement of Oracle E-Business Suite being ported to Hadoop and MongoDB). There are several sessions at both the Brighton and Atlanta BI Forums on Oracle’s new Big Data Discovery tool, and in this session CERN’s Manuel Martin Marquez will be talking about their work in this area, in his session “Governed Information Discovery: Data-driven decisions for more efficient operations at CERN”.

“The European Centre for Nuclear Research, CERN, is running the world’s largest and most powerful particle accelerator complex in order to shed light on how the Universe works and what its main building blocks are.  CERN’s particle accelerator and detector infrastructure is comprehensively heterogeneous and complex. A number of critical subsystems, which represent cutting-edge technology in several engineering fields, need to be considered: cryogenics, power converters, magnet protection, etc. The historical monitoring and control data derived from these systems has been persisted mainly using Oracle database technologies, but also in other sorts of data formats such as JSON, XML and plain text files. All of these must be integrated and combined in order to provide a full picture and better understanding of the overall status of the accelerator complex.
Therefore, a key challenge is to facilitate easy access to, flexible interaction with, and dynamic visualization of heterogeneous data from different sources and domains.  In our session, we will share our experience with a potential solution for finding insights within our data, Oracle Endeca Data Discovery. In addition, we will feature practical examples relating to future possibilities for improving the control and monitoring of CERN’s accelerator complex, optimization results for accelerator operations and a demo of the implemented solution.”

Full agenda details on the Atlanta Rittman Mead BI Forum 2015 can be found on the event homepage, along with details of the optional one-day masterclass on Delivering the Oracle Information Management and Big Data Reference Architecture, and our first-ever Data Visualisation Bake-Off, using the DonorsChoose.org dataset. Registration is now open and the event takes place between May 13th and 15th 2015, at the Renaissance Atlanta Midtown Hotel, Atlanta GA.

 
Categories: BI & Warehousing

Tuning Database XQuery Statements (2)

Marco Gralike - Thu, 2015-04-09 06:45
So we concluded the post “Tuning Database XQuery Statements (1)” with the following SQL statement…

Which analytic technology problems are important to solve for whom?

DBMS2 - Thu, 2015-04-09 05:52

I hear much discussion of shortfalls in analytic technology, especially from companies that want to fill in the gaps. But how much do these gaps actually matter? In many cases, that depends on what the analytic technology is being used for. So let’s think about some different kinds of analytic task, and where they each might most stress today’s available technology.

In separating out the task areas, I’ll focus first on the spectrum “To what extent is this supposed to produce novel insights?” and second on the dimension “To what extent is this supposed to be integrated into a production/operational system?” Issues of latency, algorithmic novelty, etc. can follow after those. In particular, let’s consider the tasks:

  • Reporting for regulatory compliance (financial or otherwise). The purpose of this is to follow rules.
    • This is non-innovative almost by design.
    • Somebody probably originally issued the regulations for a reason, so the reports may be useful for monitoring purposes. Failing that, they probably are supported by the same infrastructure that also tries to do useful monitoring.
    • Data governance is crucial. Submitting incorrect data to regulators can have dire consequences. That said, when we hear about poor governance of poly-structured data, I question whether that data is being used in the applications where strong governance is actually needed.
  • Other routine, monitoring-oriented business intelligence. The purpose can be general monitoring or general communication. Sometimes the purpose is lost to history entirely. :) This is generally lame, at least technically, unless interesting requirements are added.
    • Displaying it on mobile devices makes it snazzier, and in some cases more convenient. Whoop-de-do.
    • Usually what makes it interesting these days is the desire to actually explore the data and gain new insights. More on that below.
    • BI for inherently non-tabular data is definitely an unsolved problem.
    • Integration of BI with enterprise apps continues to be an interesting subject, but one I haven’t learned anything new about recently.
    • All that said, this is an area for some of the most demanding classical data warehouse installations, specifically ones that are demanding along dimensions such as concurrency or schema complexity. (Recall that the most complicated data warehouses are often not the largest ones.) Data governance can be important here as well.
  • Investigation by business analysts or line-of-business executives. Much of the action is here, not least because …
    • … it’s something of a catch-all category.
      • “Business analyst” is a flexible job description, and business analysts can have a variety of goals.
      • Alleged line-of-business executives doing business-analyst work are commonly delegating it to fuller-time business analysts.
    • These folks can probably manage departmental analytic RDBMS if they need to (that was one of Netezza’s early value propositions), but a Hadoop cluster stretches them. So easy deployment and administration stories — e.g. “Hadoop with less strain”/”Spark with less strain” — can have merit. This could be true even if there’s a separate team of data wranglers pre-processing data that the analysts will then work with.
    • Further, when it comes to business intelligence:
      • Tableau and its predecessors have set a high bar for quality of user interface.
      • The non-tabular BI challenges are present in spades.
      • ETL reduction/elimination (Extract/Transform/Load) is a major need.
    • Predictive modeling by business analysts is problematic from beginning to end; much progress needs to be made here.
  • Investigation by data scientists. The “data scientist”/”business analyst” distinction is hardly precise. But for the purpose of this post, a business analyst may be presumed to excel at elementary mathematics — even stock analysts just use math at a high school level — and at using tabular databases, while data scientists (individuals or teams) have broader skill sets and address harder technical or mathematical problems.
    • The technology for “data science” is generally on the newish side. Management and performance at scale are still improving.
    • There’s a need and/or desire for more sophisticated analytic tools, in predictive modeling and graph.
  • Rapid-response trouble-shooting. There are some folks — for example network operators — whose job includes monitoring things moment to moment and, when there’s a problem, reacting quickly.
  • “Operationalization” of investigative results. This is a hot area, because doing something with insights — “insights” being a hot analytic buzzword these days — is more valuable than merely having them.
    • This is where short-request kinds of data stores — NoSQL or otherwise — are often stressed, especially in the low-latency analytics they need to support.
    • This is the big area for any kind of “closed loop” predictive modeling story, e.g. in experimentation.
    • At least in theory, this is another big area for streaming.
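
Picking up the “operationalization” and “closed loop” points in the item above, a minimal sketch of the pattern might look like the following, with the model weights, feature names and feedback log all invented for illustration: score with a pre-trained model in the request path, then record actual outcomes so the offline modeling process can retrain and redeploy.

    import json
    import math
    import time
    from pathlib import Path

    # Illustrative only: in practice the coefficients would come from an
    # offline training job and be loaded from a model store.
    MODEL = {"bias": -1.2,
             "weights": {"recency_days": -0.08, "visits_7d": 0.35, "cart_value": 0.01}}
    FEEDBACK_LOG = Path("feedback.jsonl")   # assumed feedback store for retraining

    def score(features):
        """Apply the pre-trained model inside the request path: the low-latency,
        operational half of the loop."""
        z = MODEL["bias"] + sum(MODEL["weights"][k] * features.get(k, 0.0)
                                for k in MODEL["weights"])
        return 1.0 / (1.0 + math.exp(-z))

    def record_outcome(features, prediction, outcome):
        """Log what actually happened so the offline process can retrain and
        redeploy: the closed-loop half."""
        with FEEDBACK_LOG.open("a") as f:
            f.write(json.dumps({"ts": time.time(), "features": features,
                                "prediction": prediction, "outcome": outcome}) + "\n")

    if __name__ == "__main__":
        visitor = {"recency_days": 3, "visits_7d": 5, "cart_value": 120.0}
        p = score(visitor)
        print("predicted conversion probability: %.3f" % p)
        # ...serve the offer or page variant chosen on the basis of p...
        record_outcome(visitor, p, outcome=1)   # logged later, once the outcome is known

A production version would of course sit behind a short-request data store or message queue rather than a flat file, which is exactly where the latency pressures mentioned above come from.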

And finally — across multiple kinds of user group and use case, there are some applications that will only be possible when sensors or other data sources improve.

Bottom line: Almost every interesting analytic technology problem is worth solving for some market, but please be careful about finding the right match.

Related links

Categories: Other