Yann Neuhaus

All blog entries from http://www.dbi-services.com/

testing postgresql development snapshots

Thu, 2015-05-28 03:00
If you want to test some of the features that will come with the next PostgreSQL release, even before an official beta version is released, you can use the daily development snapshot.
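Once a snapshot build is installed, a quick way to confirm that you are really running a development build is to check the version string (a minimal check; the exact string, e.g. something like "9.5devel", depends on the snapshot you downloaded):

-- shows the development version string of the running snapshot build
SELECT version();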

SQL Server 2016 CTP2 is out!

Wed, 2015-05-27 14:22

As announced by Satya Nadella at the Microsoft Ignite keynote in Chicago, SQL Server 2016 CTP2 is finally out and it follows a previous private CTP.

This CTP provides some interesting features to evaluate, such as:

  • Always Encrypted
  • Stretch databases
  • Operational analytics

However, this CTP does not stop there and incorporates a myriad of new features and improvements, such as:

  • Polybase
  • AlwaysOn enhancements
  • Row level security
  • Dynamic data masking
  • Native JSON support
  • Temporal data support
  • Query store
  • MDS enhancements
  • Enhanced hybrid backup to Azure

Installation updates are now also included in the setup!
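As a teaser for one of the features listed above, here is a minimal sketch of what a dynamic data masking definition could look like (the table, column and user names are hypothetical, and the exact syntax may still change between CTPs):

-- Hypothetical table: mask the email and credit card columns for non-privileged users
CREATE TABLE dbo.customer
(
    id INT IDENTITY(1,1) PRIMARY KEY,
    email VARCHAR(100) MASKED WITH (FUNCTION = 'email()') NULL,
    credit_card VARCHAR(20) MASKED WITH (FUNCTION = 'partial(0,"XXXX-XXXX-XXXX-",4)') NULL
);

-- Only users granted the UNMASK permission see the real values
GRANT UNMASK TO demo_user;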

 

Ok, just for fun, here is the first point that I noticed during my first SQL Server 2016 installation:

 

blog_46_-_2_-_install_sql2016_-_tempdb

 

Tempdb is now configurable during the installation process (the number of files is computed according to the number of CPUs available on the server). There will be other points for sure!
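To verify afterwards what the setup actually configured, a simple query against the system catalog (nothing specific to CTP2 here) shows the number, size and location of the tempdb files:

-- Number and location of the tempdb data and log files created by the setup
SELECT name, type_desc, physical_name, size * 8 / 1024 AS size_mb
FROM sys.master_files
WHERE database_id = DB_ID('tempdb');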

Ready to evaluate SQL Server 2016? At dbi services this is the case, and we'll have the opportunity to talk about the new features during this summer. We'll also have the opportunity to give you a quick overview of the new In-Memory features during our In-Memory event.

12c Dynamic Sampling and Standard Edition

Tue, 2015-05-26 14:47

12c comes with more dynamic sampling, now called dynamic statistics and using the new Adaptive Dynamic Sampling algorithm. The goal is to have better estimations, and better estimations give better execution plans. However, this new approach will increase parse time because dynamic sampling kicks in more often, reads more blocks, and runs more queries.

It's probably not a problem for applications that are well designed, using bind variables to avoid too many parses and having good statistics (histograms where they make sense, extended statistics for correlated columns). The SQL Plan Directives are there to trigger dynamic sampling only where misestimates have been observed. An OLTP application should not parse often, and should not have huge misestimates. A reporting use case can spend more time on parsing, and the few seconds spent on dynamic sampling will probably benefit the execution time.

In addition to that, in order to lower the dynamic sampling overhead, Oracle 12c Adaptive Dynamic Sampling runs its queries with the /*+ result_cache(snapshot=3600) */ hint. The result is cached in the result cache and is not invalidated by dependencies. So even when the underlying table is updated, the dynamic sampling result is still valid in the cache for 3600 seconds. This is why doing more dynamic sampling is not a big overhead, provided that:

  • your result cache is sized accordingly. The default (0.25% of MEMORY_TARGET or 0.5% of SGA_TARGET or 1% of SHARED_POOL_SIZE) is probably too low to fit all the dynamic sampling results for frequently parsed statements.
  • your result cache is enabled, meaning that you are in Enterprise Edition
So the question of the day is: is the RESULT_CACHE hint just ignored in Standard Edition, or is there a mechanism that allows it for Adaptive Dynamic Sampling?
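Before relying on that behaviour, it's worth checking how the result cache is currently sized on your instance; a quick sketch, to be run with DBA privileges:

SQL> -- current result cache size limit
SQL> select name,value from v$parameter where name='result_cache_max_size';
SQL> -- detailed memory report of the result cache
SQL> set serveroutput on
SQL> exec dbms_result_cache.memory_report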

If you have a bad application (not using bind variables, parse at each execution) and you are in Standard Edition, then there is a risk that the current parse contention you suffer from (CPU and latches) will be more problematic (more CPU and I/O). Let's try the following:

declare
 c sys_refcursor;
begin
 for i in 1..100
 loop
  open c for 'select count(*) COUNT'||i||' from DEMO_TABLE where a+b=c+d';
  dbms_sql.return_result(c);
  null;
 end loop;
end;
/
which runs the same statement 100 times without bind variables. So I'm parsing it each time, but it reads the same table with the same predicate, so the result of dynamic sampling should not change a lot.

I'll run it in Standard and Enterprise editions, with no dynamic sampling, and with the new AUTO level.
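To see whether those Adaptive Dynamic Sampling queries actually end up in the result cache, one way (a sketch, assuming you can query V$RESULT_CACHE_OBJECTS) is to look for the entries created by the dynamic sampling queries, which are tagged with the DS_SVC hint:

SQL> select id,type,status,name,creation_timestamp
     from v$result_cache_objects
     where name like '%DS_SVC%'
     order by creation_timestamp;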

A SharePoint - Business Analyst? What for?

Mon, 2015-05-25 08:04
Business analysis is becoming a widely accepted and valued position in organizations today, particularly in those that are looking for new ways to enhance their information systems.
Good business analysis allows us to deal effectively with complexity, which in turn enables creativity and innovation.

Quick reminder: what is SharePoint?

Microsoft SharePoint is a robust collaboration platform that empowers people to work together. It's a CMS to share, organize, discover, build and manage projects / data.
This is a Collaboration tool.

 

Challenges with SharePoint Implementations

A SharePoint Business Analyst has several challenges to address:

  • Because of the huge range of SharePoint capabilities, users think it can do anything; the BA has to prioritize the needs.
  • To achieve this, the BA has to involve the business and include the right people (departments, communication, SharePoint support team, sponsors, ...) to implement SharePoint successfully.
  • Understanding the user requirements is one of the biggest parts: the BA has to translate the information given into "IT - SharePoint words". Without this understanding, most of the time you will hear the statement: "SharePoint? Yes, we have it, but we don't use it".
  • Defining the business and stakeholder needs and perspectives is the key! Perspectives can be very different depending on the user role: an IT person thinks everyone wants wikis, records managers believe that everything should be classified as a record, a marketing person thinks that if it sparkles people will use it, and others will always use folders. It really depends on the user.
  • User adoption: each organization is different and, as they say, "change is hard", which creates complexity. The Business Analyst has to plan and communicate change in order to manage each step in the best way.
What is the SharePoint Business Analyst role?

The Business Analyst should be one of the most critical functions in your organization. The role of the business analyst is to:

  • Understand what the business does, how it operates
  • Examine existing business processes
  • Identify gaps in processes, opportunities for improvements and automation
  • Capture user requirements (URS), create mockups
  • Generate technical requirements, proof of concept solutions
  • Help implement the new processes, features and tools
  • Document improvements, measure, repeat the process

A Business Analyst must also understand the capabilities and constraints of SharePoint and be aware of the latest versions, products and features; this person must have solid business analysis skills.

CONCLUSION

By defining the problem, you will build the right solution, plain and simple. That's why the Business Analyst function is so important for SharePoint project management.

 

How to import SQL Plan Directives

Thu, 2015-05-21 10:01

Today I presented SQL Plan Directives at the SOUG Romandie event. I had a question about the import/export of directives with Data Pump. The idea is that a lot of testing has been done on QA in order to validate the upgrade to 12c. A few directives had bad consequences (see Ludovico Caldara's blog post for an example), so those directives have been disabled. When going to production, they want to start with those directives disabled. Yes, they can be imported. We have to pack them into a staging table, import that table, and unpack them. It's similar to what we do with SQL Plan Baselines.

testcase: a PERMANENT SPD

I'm using the state of the demo in my presentation where I have a PERMANENT directive that I have disabled because I don't want to do too much dynamic sampling.

SQL> show user
USER is "DEMO"
SQL> select table_name,num_rows,last_analyzed from user_tables;

TABLE_NAME             NUM_ROWS LAST_ANA
-------------------- ---------- --------
STGTAB
DEMO_TABLE                 1000 17:35:51

SQL> select table_name,column_name,num_distinct,last_analyzed from user_tab_col_statistics;

TABLE_NAME           COLUMN_NAME  NUM_DISTINCT LAST_ANA
-------------------- ------------ ------------ --------
DEMO_TABLE           A                       2 17:35:51
DEMO_TABLE           B                       2 17:35:51
DEMO_TABLE           C                       2 17:35:51
DEMO_TABLE           D                       2 17:35:51
DEMO_TABLE           SYS_STSPJNMI            2 17:35:51

SQL> select directive_id,state,created,last_modified,last_used,enabled,extract(notes,'//internal_state/text()') internal_state from dba_sql_plan_directives where directive_id in( select directive_id from dba_sql_plan_dir_objects where owner=user ) order by created;

           DIRECTIVE_ID STATE      CREATED  LAST_MOD LAST_USE ENABLED INTERNAL_
----------------------- ---------- -------- -------- -------- ------- ---------
    9456295843023884415 USABLE     17:35:45 18:07:16 18:07:16 NO      PERMANENT

Pack directive

Data Pump can export/import the table, the extended statistics and the statistics, but the SQL Plan Directives are not included. We have to pack them into a staging table in order to export/import that table and unpack it at the destination site.

SQL> exec dbms_spd.create_stgtab_directive('STGTAB');

PL/SQL procedure successfully completed.

SQL> select dbms_spd.pack_stgtab_directive('STGTAB',user,9456295843023884415) from dual;

DBMS_SPD.PACK_STGTAB_DIRECTIVE('STGTAB',USER,9456295843023884415)
-----------------------------------------------------------------
                                                                1

DataPump

Here I'll import the DEMO table, and the STGTAB table which contains the packed directives, through a loopback database link, and import them to another schema: DEMO_SITE2

SQL> create database link LOOPBACK connect to demo identified by demo using '&_CONNECT_IDENTIFIER';
Database link created.

SQL> host impdp demo/demo@&_CONNECT_IDENTIFIER network_link=LOOPBACK tables="DEMO.DEMO_TABLE,DEMO.STGTAB" remap_schema='DEMO:DEMO_&&1';

Import: Release 12.1.0.1.0 - Production on Thu May 21 18:07:42 2015

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
Starting "DEMO"."SYS_IMPORT_TABLE_01":  demo/********@//192.168.78.114/DEMO14 network_link=LOOPBACK tables=DEMO.DEMO_TABLE,DEMO.STGTAB remap_schema='DEMO:DEMO
_SITE2'
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 256 KB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . imported "DEMO_SITE2"."STGTAB"                            6 rows
. . imported "DEMO_SITE2"."DEMO_TABLE"                     1000 rows
Processing object type TABLE_EXPORT/TABLE/INDEX/INDEX
Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/MARKER
Job "DEMO"."SYS_IMPORT_TABLE_01" successfully completed at Thu May 21 18:08:18 2015 elapsed 0 00:00:37

Check what is imported:

SQL> show user
USER is "DEMO_SITE2"
SQL> select table_name,num_rows,last_analyzed from user_tables;

TABLE_NAME             NUM_ROWS LAST_ANA
-------------------- ---------- --------
DEMO_TABLE                 1000 17:35:51
STGTAB

SQL> select table_name,column_name,num_distinct,last_analyzed from user_tab_col_statistics;

TABLE_NAME           COLUMN_NAME  NUM_DISTINCT LAST_ANA
-------------------- ------------ ------------ --------
DEMO_TABLE           A                       2 17:35:51
DEMO_TABLE           B                       2 17:35:51
DEMO_TABLE           C                       2 17:35:51
DEMO_TABLE           D                       2 17:35:51
DEMO_TABLE           SYS_STSPJNMI            2 17:35:51

SQL> select directive_id,type,state,created,last_modified,last_used,enabled,notes from dba_sql_plan_directives where directive_id in( select directive_id from dba_sql_plan_dir_objects where owner=user ) order by created;

no rows selected

I have my statistics (columns and extended stats for the column group) but no directives. If I stop here, a new query will probably create a new SQL Plan Directive, which will become PERMANENT and will trigger Adaptive Dynamic Sampling. I want to avoid that and get my disabled directive from the previous environment.

Unpack directive

SQL> select dbms_spd.unpack_stgtab_directive('STGTAB',user,9456295843023884415) from dual;

DBMS_SPD.UNPACK_STGTAB_DIRECTIVE('STGTAB',USER,9456295843023884415)
-------------------------------------------------------------------
                                                                  0
Look at the returned number: no directive has been unpacked. Because I'm now in another schema, I have to update the owner. I don't know if there is a better way to do it, but here I update the STGTAB:
SQL> update stgtab set c1='DEMO_SITE2';

6 rows updated.

SQL> commit;

Commit complete.

Don't forget to commit: it doesn't work if you don't.
SQL> select dbms_spd.unpack_stgtab_directive('STGTAB',user,9456295843023884415) from dual;

DBMS_SPD.UNPACK_STGTAB_DIRECTIVE('STGTAB',USER,9456295843023884415)
-------------------------------------------------------------------
                                                                  1

SQL> select directive_id,state,created,last_modified,last_used,enabled,extract(notes,'//internal_state/text()') internal_state from dba_sql_plan_directives where directive_id in( select directive_id from dba_sql_plan_dir_objects where owner=user ) order by created;

           DIRECTIVE_ID STATE      CREATED  LAST_MOD LAST_USE ENABLED INTERNAL_
----------------------- ---------- -------- -------- -------- ------- ---------
   18397758907897483632 USABLE     18:09:12                   NO      PERMANENT

The directive is there. It's disabled, which is what I wanted. The creation date and last used have been reset. If you see a directive with a status different from NEW and without a LAST_USED, then it's an imported one. Note that the directive ID has changed.

Conclusion

When you have directives in HAS_STATS, then because Data Pump imports the statistics by default (including extended statistics), you probably don't need to import the SQL Plan Directives. But if you disabled some directives and want the same in another environment, you have to pack/import/unpack them.

This morning's event was a great one. Thanks Ludovico for the picture.

Live demos are best! @FranckPachot pic.twitter.com/aJfBO9cX5C

— Ludovico Caldara (@ludodba) May 21, 2015

DBA_SQL_PLAN_DIRECTIVE.LAST_USED

Thu, 2015-05-21 02:30

If you have read Matching SQL Plan Directives and queries using it then you know how to use the '+metrics' format of dbms_xplan.

21:49:54 SQL> explain plan for select distinct * from DEMO_TABLE 
              where a=0 and b=0 and c=0 and d=0;
Explained.

21:50:01 SQL> select * 
              from table(dbms_xplan.display(format=>'basic +rows +note +metrics'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------
Plan hash value: 3823449216

-------------------------------------------------
| Id  | Operation          | Name       | Rows  |
-------------------------------------------------
|   0 | SELECT STATEMENT   |            |     1 |
|   1 |  SORT UNIQUE NOSORT|            |     1 |
|   2 |   TABLE ACCESS FULL| DEMO_TABLE |   500 |
-------------------------------------------------

Sql Plan Directive information:
-------------------------------
  Used directive ids:
    2183573658076085153
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement
So, when I run this query with predicates on columns A,B,C,D I'm using the directive id 2183573658076085153. 'Using' means doing dynamic sampling in order to get good estimations, because the directive tells us that there is a misestimate when using only the static statistics. Then we can look at that directive:
21:50:11 SQL> select directive_id,state,created,last_modified,last_used 
              from dba_sql_plan_directives where directive_id=2183573658076085153;

           DIRECTIVE_ID STATE      CREATED  LAST_MODIFIED LAST_USED
----------------------- ---------- -------- ------------- -------------
    2183573658076085153 USABLE     21:41:50 21:41:55.0000 21:41:55.0000
Look at the timestamps. I'm at 21:50 and the directive, which was created 9 minutes ago at 21:41:50, has been used 5 seconds later at 21:41:55, and that is its last usage.

Let's run the query now. I know that it will use the directive:

21:50:40 SQL> select distinct * from DEMO_TABLE where a=0 and b=0 and c=0 and d=0;

         A          B          C          D
---------- ---------- ---------- ----------
         0          0          0          0
Let me check to be sure that the directive has been used:
21:50:55 SQL> select * from table(dbms_xplan.display_cursor(format=>'basic +rows +note'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:
------------------------
select distinct * from DEMO_TABLE where a=0 and b=0 and c=0 and d=0

Plan hash value: 3823449216

-------------------------------------------------
| Id  | Operation          | Name       | Rows  |
-------------------------------------------------
|   0 | SELECT STATEMENT   |            |       |
|   1 |  SORT UNIQUE NOSORT|            |     1 |
|   2 |   TABLE ACCESS FULL| DEMO_TABLE |   500 |
-------------------------------------------------

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement

Yes: SPD used, we have dynamic sampling and accurate cardinality estimation.
The SPD has been used by the query that I've run at 21:50:40

So I will check the LAST_USED timestamp, after making sure that what has been modified in memory is written to the dictionary:

21:50:59 SQL> exec dbms_spd.flush_sql_plan_directive;
PL/SQL procedure successfully completed.

21:51:07 SQL> select directive_id,state,created,last_modified,last_used 
              from dba_sql_plan_directives where directive_id=2183573658076085153;

           DIRECTIVE_ID STATE      CREATED  LAST_MODIFIED LAST_USED
----------------------- ---------- -------- ------------- -------------
    2183573658076085153 USABLE     21:41:50 21:41:55.0000 21:41:55.0000
The LAST_USED has not been updated.

LAST_USED

This is not a bug. Updating the dictionary for all directives used during the previous 15 minutes (the automatic flush frequency) would be too expensive, so it is not done at each flush. The goal of LAST_USED is to manage SPD retention - drop the directives that have not been used for a while (53 weeks by default) - and the retention is defined in weeks. So LAST_USED is updated only if it is more than one week later than the previous LAST_USED. Not exactly one week, but 6.5 days.

6.5 days

In fact, when LAST_USED is updated, it adds 6.5 days to the current timestamp. We can see that in the underlying table:

SQL> alter session set nls_date_format='dd-mon-yyyy hh24_mi:ss';
Session altered.

SQL> select type,state,created,last_modified,last_used from sys.opt_directive$ where dir_id=2183573658076085153;
      TYPE      STATE CREATE  LAST_MODIFI LAST_USED
---------- ---------- ------- ----------- --------------------
         1          3 19-may- 19-may-2015 26-may-2015 09:41:26
The last used date as displayed by DBA_SQL_PLAN_DIRECTIVES is 19-may-2015 at 21:41:26, but the internal table OPT_DIRECTIVE$ stores 26-may-2015 09:41:26, which is 6.5 days later. The view subtracts 6.5 days from it in order to expose it to us. But we must keep in mind that this timestamp will not be updated until 26-may-2015.

Conclusion: we cannot rely on LAST_USED when we want to know whether a directive has been used recently. We must use explain plan for that. Note that this implementation is not new: it's exactly the same as the LAST_EXECUTED column in DBA_SQL_PLAN_BASELINES. It's an optimization, because tracking each usage would be too expensive.

what that in-memory term is about, and what not

Thu, 2015-05-21 02:10

Everybody is talking about in-memory databases these days. And everybody is talking about columnar storage for sets of data because it can be a benefit for analytic queries. And a lot of people start mixing these terms, not realizing that they are not interchangeable.

 

Alfresco: video thumbnails and transformations

Thu, 2015-05-21 01:00


Alfresco supports, by default, the in-browser preview of some video formats like mp4 but doesn't support some others like mkv, avi or wmv. Even if Google Chrome can read mkv files for example, if you try to use the action "View in Browser" (this is not the same thing as the preview), Chrome will not try to play the mkv file but will download it instead. That's why, if you upload an mp4 video in Alfresco, you will certainly be able to watch this video directly in your browser. In this blog, I will try to explain how to configure an external tool to take care of the thumbnail creation and transformations for your videos. To achieve that, we will install and configure ffmpeg. In addition to that, you can also very easily configure Alfresco to embed an external video player like "FlowPlayer" that would take care of playing all video formats directly by streaming from Alfresco, not using your browser. Basically, this is done by replacing the preview page for some MIME types, but I will not describe it in detail here.


For this blog post, I will use /opt/alfresco-4.2.c as the $ALF_HOME environment variable: the folder where Alfresco has been installed. The current release of ffmpeg is version 2.6.3. This is at least working for Alfresco v4.2.x and v5.x.


I. Installation of ffmpeg


The ffmpeg binaries/executables are available on the ffmpeg website. From this website, download the "Static" build for Windows, Linux or Mac depending on the Operating System on which Alfresco has been installed. For this example, I will use a Linux host (RedHat 6.x, 64 bits) but I can assure you that it's working properly on Windows too. I have already installed ffmpeg with several versions of Alfresco from v4.2.x to 5.x.

[alfresco ~]$ cd /opt
[alfresco /opt]$ wget http://johnvansick...elease-64bit-static.tar.xz
[alfresco /opt]$ tar -xJf ffmpeg-release-64bit-static.tar.xz
[alfresco /opt]$ chmod -R 755 ffmpeg-2.6.3-64bit-static
[alfresco /opt]$ chown -R alfresco:alfresco ffmpeg-2.6.3-64bit-static
[alfresco /opt]$ mv ffmpeg-2.6.3-64bit-static /opt/alfresco-4.2.c/ffmpeg


These commands will download, extract, change the permissions/ownership and rename the folder to something more friendly (at a more proper location too).


II. Configuration of Alfresco


Once ffmpeg has been installed, the next step is to configure Alfresco to use it. The first thing to do is to add some parameters in the well-known alfresco-global.properties file. Don't hesitate to customize these parameters, remove the lines that aren't needed, and so on.

[alfresco /opt]$ cd /opt/alfresco-4.2.c/tomcat/shared/classes

[alfresco /classes]$ cat alfresco-global.properties
### Begin of the file with your custom parameters ###
### E.g.: dir.root, db.driver, db.username, aso... ###

### FFMPEG executable location ###
ffmpeg.exe=/opt/alfresco-4.2.c/ffmpeg/ffmpeg

### Video Thumbnails parameters ###
# ffmpeg.thumbnail
content.transformer.ffmpeg.thumbnail.priority=50
content.transformer.ffmpeg.thumbnail.extensions.3g2.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.3gp.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.asf.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.avi.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.avx.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.flv.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.mov.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.movie.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.mp4.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.mpeg2.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.mpg.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.ogv.jpg.supported=true
content.transformer.ffmpeg.thumbnail.extensions.wmv.jpg.supported=true

### Video Transformations parameters ###
# ffmpeg.flv
content.transformer.ffmpeg.flv.priority=50
content.transformer.ffmpeg.flv.extensions.3g2.flv.supported=true
content.transformer.ffmpeg.flv.extensions.3gp.flv.supported=true
content.transformer.ffmpeg.flv.extensions.asf.flv.supported=true
content.transformer.ffmpeg.flv.extensions.avi.flv.supported=true
content.transformer.ffmpeg.flv.extensions.avx.flv.supported=true
content.transformer.ffmpeg.flv.extensions.mov.flv.supported=true
content.transformer.ffmpeg.flv.extensions.movie.flv.supported=true
content.transformer.ffmpeg.flv.extensions.mp4.flv.supported=true
content.transformer.ffmpeg.flv.extensions.mpeg2.flv.supported=true
content.transformer.ffmpeg.flv.extensions.mpg.flv.supported=true
content.transformer.ffmpeg.flv.extensions.ogv.flv.supported=true
content.transformer.ffmpeg.flv.extensions.wmv.flv.supported=true
# ffmpeg.mp4
content.transformer.ffmpeg.mp4.priority=50
content.transformer.ffmpeg.mp4.extensions.3g2.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.3gp.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.asf.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.avx.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.mov.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.movie.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.mpeg2.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.mpg.mp4.supported=true
content.transformer.ffmpeg.mp4.extensions.wmv.mp4.supported=true
content.transformer.avi.mp4.extensions.mpg.mp4.supported=true
content.transformer.ogv.mp4.extensions.wmv.mp4.supported=true
# ffmpeg.mp3
content.transformer.ffmpeg.mp3.priority=50
content.transformer.ffmpeg.mp3.extensions.aiff.mp3.supported=true
content.transformer.ffmpeg.mp3.extensions.au.mp3.supported=true
content.transformer.ffmpeg.mp3.extensions.m4a.mp3.supported=true
content.transformer.ffmpeg.mp3.extensions.oga.mp3.supported=true
content.transformer.ffmpeg.mp3.extensions.wav.mp3.supported=true


In the above configuration, the parameter "ffmpeg.exe=/opt/alfresco-4.2.c/ffmpeg/ffmpeg" is the location of the binary file named ffmpeg. If you install Alfresco and ffmpeg on a Windows host, then this parameter looks like: "ffmpeg.exe=C:/Alfresco-4.2.c/ffmpeg/bin/ffmpeg.exe"


Once this is done, you need to enable the thumbnail & transformation extensions in Alfresco and this is done using the following steps:

[alfresco /classes]$ cd /opt/alfresco-4.2.c/tomcat/shared/classes/alfresco/extension
[alfresco /extension]$ mv video-thumbnail-context.xml.sample video-thumb-context.xml
[alfresco /extension]$ mv video-transformation-context.xml.sample video-transf-context.xml


If you want, you can parse these two files to understand what Alfresco will do with ffmpeg but basically these files define the commands and options that will be used by Alfresco.


And that's it, Alfresco has been configured to use ffmpeg for thumbnail creation and transformations of your videos. For Alfresco to take these changes into account, simply restart your Application Server using an Alfresco service:

[alfresco ~]$ service alfresco restart


Or using Alfresco default scripts:

[alfresco ~]$ /opt/alfresco-4.2.c/alfresco.sh stop
[alfresco ~]$ /opt/alfresco-4.2.c/alfresco.sh start



To check if ffmpeg is working properly, simply open Alfresco Share, browse your repository to find a video with an avi or wmv format and a thumbnail should now be present (if not, just refresh the page...). You can also try to upload a video and if a thumbnail is created for this video, then ffmpeg is working!



Row Store vs Column Store in SAP HANA

Wed, 2015-05-20 00:00

The SAP HANA database allows you to create your tables in Row or Column Store mode. In this blog, I will demonstrate that each method has its advantages and disadvantages and should be used for specific cases.

Thanks to two kinds of tests, I will show you that the Row Store mode should be used for simple SELECT SQL queries without aggregation, and the Column Store mode for complex SELECT queries containing aggregations.
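For reference, the store mode is simply chosen at table creation time. Here is a minimal sketch (the column list is hypothetical and not the exact definition of the SALES tables used below):

-- Row Store table: suited to simple, selective OLTP-style access
CREATE ROW TABLE SALES_ROW_DEMO (
   CUST_ID  INTEGER,
   PRODUCT  NVARCHAR(50),
   AMOUNT   DECIMAL(15,2)
);

-- Column Store table: suited to analytical queries with aggregations
CREATE COLUMN TABLE SALES_COLUMN_DEMO (
   CUST_ID  INTEGER,
   PRODUCT  NVARCHAR(50),
   AMOUNT   DECIMAL(15,2)
);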

If you want to have more information regarding the Column Store or the In-Memory technologies, don't hesitate to attend the next dbi services event:

http://www.dbi-services.com/index.php/newsroom-e/events/event-l-in-memory-r-boost-your-it-performance

Test 1: Simple SELECT query

Goal of the tests

This test will show you the difference of performance using a Row Store and a Column Store table in a simple SQL query.

Description of the test

A SELECT query will be sent to the database and we will check the server response time.

SQL Query

Using a Row Store table

The SQL is the following:

1_SQL_ROW_STORE.PNG

Using a Column Store table

The SQL is the following:

2_SQL_COLUMN_STORE.PNG

Tables

Row Store Table

You can find here information regarding the Row Store table used in the test.

Name:                 SALES_ROW

Table type:          Row Store

Row count:         10 309 873

Index:                1

Partition:            0 (SAP HANA does not allow partitioning of Row Store tables)

3_TABLE_ROW_STORE_1.PNG

 

4_TABLE_ROW_STORE_2.PNG

Column Store Table

You can find here information regarding the Column Store table used in the test.

Name:                  SALES_COLUMN

Table type:           Column Store

Row count:          10 309 873

Index:                 0 (SAP HANA automatically applies an index if it is needed)

Partition:             1 RANGE partition on CUST_ID

6_TABLE_COLUMN_STORE_2.PNG

Result of the test

Using the Row Store table

8_SQL_Q1_SELECT_ROW_RESULT.PNG

Using the Column Store table

9_SQL_Q1_SELECT_COLUMN_RESULT.PNG

Test 2: Complex SELECT query

Goal of the tests

This test will show you the difference of performance using a Row Store and a Column Store table in a complex SQL query.

Description of the test

A SELECT query will be sent to the database and we will check the server response time.

SQL Query

Using a Row Store table

The SQL is the following:

10_SQL_ROW_STORE.PNG

Using a Column Store table

The SQL is the following:

11_SQL_COLUMN_STORE.PNG

Tables

Row Store Fact Table

You can find here information regarding the Row Store table used in the test.

Name:                  SALES_ROW

Table type:          Row Store

Row count:         10 309 873

Index:                   2

Partition:             0 (SAP HANA does not allow partitioning of Row Store tables)

Column Store Fact Table

You can find here information regarding the Column Store table used in the test.

Name:                  SALES_COLUMN

Table type:          Column Store

Row count:         10 309 873

Index:                   0 (SAP HANA automatically applies an index if it is needed)

Partition:             1 RANGE partition on CUST_ID

Result of the test

Using the Row Store tables

12_SQL_Q2_SELECT_ROW_RESULT.PNG

Using the Column Store tables

13_SQL_Q2_SELECT_COLUMN_RESULT.PNG

Conclusion

Row and Column Store modes in SAP HANA should be used in two different contexts:

  • Tables in Row Store mode should be used for SELECT queries WITHOUT any aggregation functions

  • Tables in Column Store mode are powerful when they are used for analytical queries or views, using aggregation functions (GROUP BY, ...)

The performance can be highly optimized if the tables selected in the queries have the right store mode.
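Note that an existing table can also be converted from one store to the other with a simple ALTER; a quick sketch of the syntax (to be used carefully on large tables, as the conversion rewrites the data):

-- Convert an existing Row Store table to Column Store, and back
ALTER TABLE SALES_ROW COLUMN;
ALTER TABLE SALES_ROW ROW;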

 

 

 

List all RMAN backups that are needed to recover

Tue, 2015-05-19 09:49

This blog post is something I had in draft, and Laurent Schneider's blog post reminded me to publish it. With the right RMAN configuration, you should not have to manage backup files yourself. The RMAN catalog knows them and RMAN should be able to access them. If you want to keep a backup for a long time, you just tell RMAN to keep it.
But sometimes, RMAN is not connected to your tape backup software, or the backups are not shared on all sites, and you have to restore or copy the set of files that is needed for a restore database or a duplicate database.

A customer was in that case, identifying the required files from their names because they are all timestamped with the beginning of the backup job. It's our DMK default. In order to rely on that, 'backup database plus archivelog' was run. And in order to be sure to have all archived logs in those backup sets, any concurrent RMAN job was blocked during that database backup, because if a concurrent job backs up some archived logs, they will be timestamped differently.

RPO and availability

I don't like that. I don't want that anything can block the backup of archived logs.
They are critical for two reasons:

  • The Recovery Point Objective is not fulfilled if some archivelog backups are delayed
  • The frequency of archivelog backup is also defined to prevent a full FRA
But if we allow concurrent backups of archived logs, we need something else to be able to identify the whole set of files that is needed to restore the database at that point in time. So my suggestion was to generate the list of those files after each database backup, and keep that list. When we need to restore that backup, we can send the list to the backup team and ask them to restore those files.

The script

Here is my script, I'll explain later:

echo "restore controlfile preview; restore database preview;" | rman target / | awk '
/Finished restore at /{timestamp=$4}
/Recovery must be done beyond SCN /{if ($7>scn) scn=$7 }
/^ *(Piece )Name: / { sub(/^ *(Piece )Name: /,"") ; files[$0]=1 }
END{ for (i in files) print i > "files-"timestamp"-SCN-"scn".txt" }
' 
This script generates the following file:
files-20150519013910-SCN-47682382860.txt
which lists the files needed to do a RESTORE/RECOVER UNTIL SCN 47682382860.

the content of the file is:

oracle@dbzhorap01:/home/oracle/ [DB01PP1] sort files-20150519019910-SCN-47682382860.txt
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168278_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168279_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168280_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168281_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168282_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168283_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168284_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169462_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169463_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169464_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169465_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169466_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169467_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169468_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169469_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169470_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169471_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169472_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169481_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169482_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169473_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169474_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169475_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169476_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169477_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169478_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169479_p1.bck
and lists the backup pieces for the incremental 0, incremental 1 and archivelog backups needed to recover to a consistent state that can be opened. The script lists only backup sets, so we are supposed to have backed up the latest archived logs (with backup database plus archivelog for example).

You can add an 'until scn', but my primary goal was to run it just after a database backup in order to know which files have to be restored to get that backup (restore or duplicate).
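For reference, if you need the list for an older point in time rather than for the backup that just completed, the same preview accepts an UNTIL clause; a quick sketch using the SCN from my example:

RMAN> restore database until scn 47682382860 preview;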

Restore preview

The idea is to rely on RMAN to find the files that are needed to restore and recover rather than doing it ourselves from the recovery catalog. RMAN provides the PREVIEW restore for that:

RMAN> restore database preview
Starting restore at 20150501390436
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4

List of Backup Sets
===================

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
166388  Incr 0  10.53G     DISK        00:52:56     20150516031010
        BP Key: 166388   Status: AVAILABLE  Compressed: YES  Tag: WEEKLY
        Piece Name: /u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168278_p1.bck
  List of Datafiles in backup set 166388
  File LV Type Ckp SCN    Ckp Time       Name
  ---- -- ---- ---------- -------------- ----
  1    0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/system.329.835812499
  2    0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/undotbs1.525.835803187
  10   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.676.835815153
  17   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.347.835815677
  23   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.277.835814327
  25   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.342.835811161
...
BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
167586  Incr 1  216.09M    DISK        00:01:34     20150519012830
        BP Key: 167586   Status: AVAILABLE  Compressed: YES  Tag: DAYLY
        Piece Name: /u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169479_p1.bck
  List of Datafiles in backup set 167586
  File LV Type Ckp SCN    Ckp Time       Name
  ---- -- ---- ---------- -------------- ----
  43   1  Incr 47681921440 20150519012700 +U01/DB01Pp/datafile/cpy_idx.346.835815097

List of Backup Sets
===================

BS Key  Size       Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ ---------------
167594  105.34M    DISK        00:00:23     20150519015400
        BP Key: 167594   Status: AVAILABLE  Compressed: YES  Tag: DAYLY
        Piece Name: /u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169481_p1.bck

  List of Archived Logs in backup set 167594
  Thrd Seq     Low SCN    Low Time       Next SCN   Next Time
  ---- ------- ---------- -------------- ---------- ---------
  3    59406   47681333097 20150519010239 47682617820 20150519014652
  4    46800   47681333143 20150519010240 47682617836 20150519014652
  1    76382   47681333188 20150519010240 47682618254 20150519014655
  2    60967   47681333315 20150519010242 47682385651 20150519013711
...

Media recovery start SCN is 47681637369
Recovery must be done beyond SCN 47682382860 to clear datafile fuzziness
Finished restore at 20150501390440
You see the list of datafile backup sets and archivelog backup sets, and at the end you have information about SCNs. Let me explain what those SCNs are.

Recovery SCN

Because it is an online backup, the datafiles are fuzzy. We need to apply the redo generated during the backup.

The 'media recovery start SCN' is the beginning of the archived logs to be applied:

SQL> select scn_to_timestamp(47681637369) from dual;

SCN_TO_TIMESTAMP(47681637369)
---------------------------------------------------------------------------
19-MAY-15 01.10.38.000000000 AM

The 'recovery must be done beyond SCN' is the last redo that must be applied to have datafiles consistent:

SQL> select scn_to_timestamp(47682382860) from dual;

SCN_TO_TIMESTAMP(47682382860)
---------------------------------------------------------------------------
19-MAY-15 01.35.58.000000000 AM

In my example, the backup (incremental level 1 + archivelog) started at 01:00:00 and was completed at 01:35:00

Conclusion

And I have a file with the list of backups that are needed to restore or duplicate the database at that point in time. Why do I need that when RMAN is supposed to be able to retrieve them itself? Because sometimes we backup to disk and the disk is backed up to tape without RMAN knowing it. Of course RMAN can connect directly to the tape backup software but that is not for free. Or we want to duplicate to another site where backups are not shared. We need to know which files we have to bring there. And that sometimes requires a request to another team so it's better to have the list of all files we need.

As usual, don't hesitate to comment if you see something to improve in my small script.

SQL Server 2014: First Service Pack (SP1) is available

Tue, 2015-05-19 01:48

On May 14th, Microsoft released the first Service Pack (SP1) for SQL Server 2014, more than thirteen months after the RTM version.
SQL Server 2014 Service Pack 1 includes all of the Cumulative Updates from 1 to 5.

Which issues are fixed in this SP1

There are 29 hotfixes:

  • 19 for the Engine
  • 6 for SSRS
  • 3 for SSAS
  • 1 for SSIS

 

b2ap3_thumbnail_PieSQL2014SP1.jpg

Some improvements are:

  • Performance improvement of Column store with batch mode operators and a new Extended Event
  • Buffer pool extension improvement
  • New cardinality estimator to boost query performance

Historic of SQL Server 2014

The build version of SQL Server 2014 SP1 is 12.0.4100.1.
Here is a quick overview of the SQL Server 2014 builds since CTP1:

Date             SQL Server 2014 version                  Build
June 2013        Community Technology Preview 1 (CTP1)    11.00.9120
October 2013     Community Technology Preview 2 (CTP2)    12.00.1524
April 2014       RTM                                      12.00.2000
April 2014       Cumulative Update 1 (CU1)                12.00.2342
June 2014        Cumulative Update 2 (CU2)                12.00.2370
August 2014      Cumulative Update 3 (CU3)                12.00.2402
October 2014     Cumulative Update 4 (CU4)                12.00.2430
December 2014    Cumulative Update 5 (CU5)                12.00.2456
May 2015         Service Pack 1 (SP1)                     12.00.4100
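To check which build a given instance is actually running (and therefore whether SP1 is already applied), you can use SERVERPROPERTY:

-- Returns the build number (12.0.4100.1 once SP1 is installed) and the product level
SELECT SERVERPROPERTY('ProductVersion') AS product_version,
       SERVERPROPERTY('ProductLevel') AS product_level,
       SERVERPROPERTY('Edition') AS edition;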

If you need more information about SQL Server 2014 SP1 or to download it, click here.

As a reminder, Service Packs are very critical and important from a bug-fixing and product-upgrade point of view, so take care to install it quickly ;-)
See you.

SQL Saturday Lisbon: from Francesinha to Bacalhau

Mon, 2015-05-18 23:45

Last weekend, I was at SQL Saturday 369, which was held in Lisbon. If you take a look at the agenda, you'll probably see that there are a lot of interesting sessions with a lot of famous speakers. Unfortunately, I was not able to attend all the sessions, so I decided to focus only on those that have a direct correlation with my work.

First, 2 "headache" sessions given by Paul White (aka @SQL_Kiwi) about the query optimizer and some internal stuff. The QO is definitely a very interesting topic and I'm always willing to discover more and more with guys like Paul to improve my skills.

Then, 2 sessions about In-Memory features with SQL Server 2016. In fact, I'm already aware of the potential new features of the next SQL Server version, but attending a session given by Niko Neugebauer about columnstore and discussing upcoming features always adds a certain value for sure. Thanks Niko and Murilo Miranda for your sessions!

Finally, another "headache" session to finish the day, about batch mode and CPU architectures, given by Chris Adkin. We had a very deep-dive explanation about batch mode and how it improves performance with CPU savings.

 

Moreover, it was also the opportunity to meet some of my SQL Server MVP friends like Jean-Pierre Riehl and Florian Eiden ...

 

blog_45_-_1_-_french_team_sqlsat369

 

... and have a good dinner with the SQL Saturday staff and other speakers. A lot of countries represented here: Portugal, Germany, UK, New Zealand, France and probably others.

 

blog_45_-_2_-_dinner_sqlsat369

 

A beautiful city, a good weather, a lot of very good speakers and a very good staff ... maybe the secret sauce of a successful SQL Server event!

I'm pretty sure that it will be the same for the next SQL Saturday in Paris, and I will be there (maybe as a speaker this time).

Variations on 1M rows insert (3): using SQL Server In-Memory features

Mon, 2015-05-18 11:00

Let's continue with this series about inserting 1M rows, and let's perform the same test with a new variation by using SQL Server In-Memory features. For this blog post, I will still use a minimal configuration that consists of only 1 virtual Hyper-V machine with 1 processor and 512MB of memory. In addition, my storage includes VHDx disks placed on 2 separate SSDs (one INTEL SSDC2BW180A3L and one Samsung SSD 840 EVO). No special configuration has been performed on Hyper-V.

Let's begin with the creation script of my database DEMO:

 

CREATE DATABASE [demo]
ON PRIMARY
( NAME = N'demo_data', FILENAME = N'E:\SQLSERVER\demo_data.mdf',
  SIZE = 1048576KB, MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB ),
FILEGROUP [demo_hk_grp] CONTAINS MEMORY_OPTIMIZED_DATA DEFAULT
( NAME = N'demo_hk', FILENAME = N'E:\SQLSERVER\HK', MAXSIZE = UNLIMITED )
LOG ON
( NAME = N'demo_log', FILENAME = N'F:\SQLSERVER\demo_hk_log.ldf',
  SIZE = 1395776KB, MAXSIZE = 2048GB, FILEGROWTH = 10% )
GO

ALTER DATABASE [demo] SET COMPATIBILITY_LEVEL = 120
GO

ALTER DATABASE [demo] SET RECOVERY SIMPLE;
GO

 

Next, the creation script of all user objects, which includes:

  • 2 disk-based tables: DEMO_DB_PK (with a clustered primary key) and DEMO_DB_HP (a heap table)
  • 2 In-Memory optimized tables: DEMO_HK_SCH_DATA (data are persisted) and DEMO_HK_SCH (only the schema is persisted)

 

CREATE TABLE [dbo].[DEMO_DB_PK]
(
       [id] [int] NOT NULL PRIMARY KEY,
       [text] [varchar](15) COLLATE French_CI_AS NULL,
       [number] [int] NULL
)

CREATE TABLE [dbo].[DEMO_DB_HP]
(
       [id] [int] NOT NULL,
       [text] [varchar](15) COLLATE French_CI_AS NULL,
       [number] [int] NULL
)

CREATE TABLE [dbo].[DEMO_HK_SCH_DATA]
(
       [id] [int] NOT NULL,
       [text] [varchar](15) COLLATE French_CI_AS NULL,
       [number] [int] NULL,

       PRIMARY KEY NONCLUSTERED HASH ( [id] ) WITH ( BUCKET_COUNT = 2097152 )
) WITH ( MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA )
GO

CREATE TABLE [dbo].[DEMO_HK_SCH]
(
       [id] [int] NOT NULL,
       [text] [varchar](15) COLLATE French_CI_AS NULL,
       [number] [int] NULL,

       PRIMARY KEY NONCLUSTERED HASH ( [id] ) WITH ( BUCKET_COUNT = 2097152 )
) WITH ( MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY )
GO

 

Finally, the last script creates 7 stored procedures in order to test different cases:

  • sp_demo_insert_demo_db_hp: insert 1M rows inside a disk-based heap table
  • sp_demo_insert_demo_db_pk: insert 1M rows inside a disk-based clustered table
  • sp_demo_insert_demo_hk_sch_data: insert 1M rows inside an In-Memory optimized table in INTEROP mode with data persisted on disk
  • sp_demo_insert_demo_hk_sch: insert 1M rows inside an In-Memory optimized table in INTEROP mode with only the schema persisted on disk
  • sp_demo_insert_demo_hk_sch_data_cp: insert 1M rows inside an In-Memory optimized table in NATIVE mode with data persisted on disk and full durability
  • sp_demo_insert_demo_hk_sch_data_cp_d: insert 1M rows inside an In-Memory optimized table in NATIVE mode with data persisted on disk and delayed durability enabled
  • sp_demo_insert_demo_hk_sch_cp: insert 1M rows inside an In-Memory optimized table in NATIVE mode with only the schema persisted on disk

Just as a reminder, INTEROP procedures allow using both disk-based and In-Memory optimized tables, whereas NATIVE (or natively compiled) procedures don't support disk-based tables. However, using the latter is very interesting from a performance perspective because it drastically improves the execution time.

 

CREATE PROCEDURE [dbo].[sp_demo_insert_demo_db_hp]
AS
  SET NOCOUNT ON;

  DECLARE @i INT = 1;

  WHILE @i <= 1000000
  BEGIN
       INSERT INTO dbo.DEMO_DB_HP
       VALUES (@i,
               CASE ROUND(RAND() * 10, 0)
                    WHEN 1 THEN 'Marc' WHEN 2 THEN 'Bill' WHEN 3 THEN 'George'
                    WHEN 4 THEN 'Eliot' WHEN 5 THEN 'Matt' WHEN 6 THEN 'Trey'
                    ELSE 'Tracy'
               END,
               RAND() * 10000);

       SET @i += 1;
  END
GO

 

CREATE PROCEDURE [dbo].[sp_demo_insert_demo_db_pk]
AS
  SET NOCOUNT ON;

  DECLARE @i INT = 1;

  WHILE @i <= 1000000
  BEGIN
       INSERT INTO dbo.DEMO_DB_PK
       VALUES (@i,
               CASE ROUND(RAND() * 10, 0)
                    WHEN 1 THEN 'Marc' WHEN 2 THEN 'Bill' WHEN 3 THEN 'George'
                    WHEN 4 THEN 'Eliot' WHEN 5 THEN 'Matt' WHEN 6 THEN 'Trey'
                    ELSE 'Tracy'
               END,
               RAND() * 10000);

       SET @i += 1;
  END
GO

 

CREATE PROCEDURE [dbo].[sp_demo_insert_demo_hk_sch_data]
AS
  SET NOCOUNT ON;

  DECLARE @i INT = 1;

  WHILE @i <= 1000000
  BEGIN
       INSERT INTO dbo.DEMO_HK_SCH_DATA
       VALUES (@i,
               CASE ROUND(RAND() * 10, 0)
                    WHEN 1 THEN 'Marc' WHEN 2 THEN 'Bill' WHEN 3 THEN 'George'
                    WHEN 4 THEN 'Eliot' WHEN 5 THEN 'Matt' WHEN 6 THEN 'Trey'
                    ELSE 'Tracy'
               END,
               RAND() * 10000);

       SET @i += 1;
  END
GO

 

CREATE PROCEDURE [dbo].[sp_demo_insert_demo_hk_sch]
AS
  SET NOCOUNT ON;

  DECLARE @i INT = 1;

  WHILE @i <= 1000000
  BEGIN
       INSERT INTO dbo.DEMO_HK_SCH
       VALUES (@i,
               CASE ROUND(RAND() * 10, 0)
                    WHEN 1 THEN 'Marc' WHEN 2 THEN 'Bill' WHEN 3 THEN 'George'
                    WHEN 4 THEN 'Eliot' WHEN 5 THEN 'Matt' WHEN 6 THEN 'Trey'
                    ELSE 'Tracy'
               END,
               RAND() * 10000);

       SET @i += 1;
  END
GO

 

Note that for the following natively compiled stored procedures, I rewrote one portion of the code because the CASE expression is not supported in natively compiled modules with SQL Server 2014.

 

CREATE PROCEDURE [dbo].[sp_demo_insert_demo_hk_sch_data_cp]
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH ( TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english' )

       DECLARE @i INT = 1;
       DECLARE @test_case INT = RAND() * 10;
       DECLARE @name VARCHAR(20);

       IF @test_case = 1 SET @name = 'Marc'
       ELSE IF @test_case = 2 SET @name = 'Bill'
       ELSE IF @test_case = 3 SET @name = 'George'
       ELSE IF @test_case = 4 SET @name = 'Eliot'
       ELSE IF @test_case = 5 SET @name = 'Matt'
       ELSE IF @test_case = 6 SET @name = 'Trey'
       ELSE SET @name = 'Tracy';

       WHILE @i <= 1000000
       BEGIN
              INSERT INTO [dbo].[DEMO_HK_SCH_DATA] VALUES (@i, @name, RAND() * 10000);
              SET @i += 1;
       END
END
GO

 

CREATE PROCEDURE [dbo].[sp_demo_insert_demo_hk_sch_cp]
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH ( TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english' )

       DECLARE @i INT = 1;
       DECLARE @test_case INT = RAND() * 10;
       DECLARE @name VARCHAR(20);

       IF @test_case = 1 SET @name = 'Marc'
       ELSE IF @test_case = 2 SET @name = 'Bill'
       ELSE IF @test_case = 3 SET @name = 'George'
       ELSE IF @test_case = 4 SET @name = 'Eliot'
       ELSE IF @test_case = 5 SET @name = 'Matt'
       ELSE IF @test_case = 6 SET @name = 'Trey'
       ELSE SET @name = 'Tracy';

       WHILE @i <= 1000000
       BEGIN
              INSERT INTO [dbo].[DEMO_HK_SCH] VALUES (@i, @name, RAND() * 10000);
              SET @i += 1;
       END
END
GO

 

CREATE PROCEDURE [dbo].[sp_demo_insert_demo_hk_sch_data_cp_d]
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH ( TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english', DELAYED_DURABILITY = ON )

       DECLARE @i INT = 1;
       DECLARE @test_case INT = RAND() * 10;
       DECLARE @name VARCHAR(20);

       IF @test_case = 1 SET @name = 'Marc'
       ELSE IF @test_case = 2 SET @name = 'Bill'
       ELSE IF @test_case = 3 SET @name = 'George'
       ELSE IF @test_case = 4 SET @name = 'Eliot'
       ELSE IF @test_case = 5 SET @name = 'Matt'
       ELSE IF @test_case = 6 SET @name = 'Trey'
       ELSE SET @name = 'Tracy';

       WHILE @i <= 1000000
       BEGIN
              INSERT INTO [dbo].[DEMO_HK_SCH_DATA] VALUES (@i, @name, RAND() * 10000);
              SET @i += 1;
       END
END
GO

 

Ok it's time to run the different test scenarios. You will see below the different results and their related wait statistics that I found on my environment:

 

blog_44_-_1_-_results_

 

blog_44_-_2_-_wait_stats_

 

First of all, let's notice that using In-Memory optimized tables in INTEROP mode does not seem to improve the overall performance in all cases. Indeed, we still deal with pretty much the same duration and the same CPU time when writing to the transaction log and checkpoint files for the In-Memory optimized tables with data persisted on disk (SCHEMA_AND_DATA).

Next, using In-Memory tables with only the schema persisted on disk (SCHEMA_ONLY) contributes to better results (performance gain x2), as we may expect. Indeed, the WRITELOG wait type has completely disappeared because data are not persisted in this case, so we considerably minimize the number of records written to the transaction log.
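If you want to reproduce these measures on your own environment, such wait statistics can be captured with sys.dm_os_wait_stats; a minimal sketch (clearing the statistics resets them instance-wide, so do it only on a test box):

-- Clear the cumulative wait statistics before a test run (test instance only!)
DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);

-- After the run, check the waits related to the insert workload
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN ('WRITELOG', 'PAGEIOLATCH_EX', 'PREEMPTIVE_OS_CREATEFILE')
ORDER BY wait_time_ms DESC;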

Finally let’s have a look at the tests concerning natively compiled stored procedures. If we refer to the first tests (either for disk-based tables or for INTEROP), we can see that we reduce drastically the CPU consumption by using natively compiled stored procedures (roughly 97% in the best case). So, inserting 1M rows is very fast in this case.

 

Moreover, if we focus only on the results between In-Memory optimized tables with different durability (SCHEMA_AND_DATA and SCHEMA_ONLY), we may notice that using transaction delayed durability may help. Once again, persisting data by writing into the TLog and checkpoint files seems to slow down the insert process.

Some wait types still remain, such as PREEMPTIVE_OS_CREATEFILE, PREEMPTIVE_OS_CLOSEHANDLE and PREEMPTIVE_OS_FILEOPS, and I will probably focus on them later. At this point, I would suspect misconfigured storage or maybe my Hyper-V settings, but I have to verify this point. I already double-checked that I enabled instant file initialization according to the Microsoft documentation and also disabled some other features like 8.3 names, file indexing and last modification date tracking. So I will come back soon when I have more information.

But anyway, for the moment we get a new time reference here: 2''59' for disk-based tables against 778 ms for In-Memory optimized tables + natively compiled stored procedures in the best scenario, which tends to show that with In-Memory optimized tables we may get a huge performance improvement, depending on our scenario.

So stay connected and see you soon for the next story!

 

Sharepoint and Nintex Workflow: automate your Business Processes

Mon, 2015-05-18 03:58
 

Nintex_workflow_logo

 

What is a WORKFLOW?

A workflow is the representation of a sequence of operations, declared as the work of a person or group, an organization of staff, or one or more simple or complex mechanisms. These mechanisms can be automated using specific tools.

In other words, we could compare a "workflow" to a "recipe".
Imagine you have to cook something sweet: first you choose the kind, then you open the book, and you follow the steps one by one, with validation, to finally get a beautiful apple pie. Business processes are the same: step by step, each validation counts, in order to get a final "product".

processnogood (by NOT USING PROCESSES) ... precessgood (by following a WORKFLOW)

Nintex Workflow

Nintex is an independent software vendor and workflow company with customers in over 90 countries. Founded in 2004, the company's products include Nintex Workflow (recognized by Forrester analysts as the most utilized third-party tool for Microsoft SharePoint), Nintex Forms, Nintex Workflow for Project Server, Nintex Live and the Nintex Mobile apps.

SharePoint and Nintex Workflow: the best combination for advanced workflows

SharePoint already has an adequate built-in workflow feature.
As soon as users start to use it, they become familiar with the tool and, as usual, they want more and more... and this is where advanced needs start.


Nintex is the tool we recommend. It can be integrated into the SharePoint user interface, so it does not require the use of Visual Studio or SharePoint Designer. Nintex Workflow provides an intuitive visual interface and allows users and IT pros to create and maintain workflows more easily. Nintex Workflow 2007 and 2010 add a 'drag and drop' workflow designer, connectivity and advanced workflow features to the Microsoft SharePoint 2010 platform.

Why automation?

Workflow automation ensures seamless interaction between documents, users and management applications.


The advantages of Nintex Workflow

  • It is intuitive and easy to use.
  • You can view the real-time status of workflows.
  • It improves your business performance.
  • It offers reusable workflows.
  • It is easy to use and manage; no client software is needed.
CONCLUSION

Benefits of workflow automation

  • Optimum productivity and competitiveness through continuous workflow processing.
  • Shorter response times and better customer service through faster processing and full availability of information.
  • Optimized cash flow through a better customer invoice processing cycle.
  • Reduced financial and legal risks through audit processes and data retention over several years.
  • Reliability and consistency of the process to ensure compliance with procedures.
  • Transparent workflows with continuous tracking of documents, identification of bottlenecks and faster decision making.
  • A complete history of all changes made by users, available at any time.
  • Quick adaptability to changes in the environment or in customers' business needs, providing a decisive competitive advantage.

 

megadesk

Using Workflow Automation with Nintex on SharePoint makes life easier.

Variations on 1M rows insert (6): CPU Flame Graph

Sun, 2015-05-17 12:21

If you followed the 1M rows insert variations, you have seen how something simple brought me into an investigation with the latest tools available to see where the time is spent. When the time is CPU time, there is no wait event to get the detail of DB time. I've used perf events in the previous post and I'll now visualize them with a CPU Flame Graph. My goal is to understand why my test case is 2x longer in 11.2.0.4 vs 11.2.0.3.

The idea to use Flame Graphs came from Luca Canali:

Variations on 1M rows insert (5): using 'perf'

Sat, 2015-05-16 13:13

In the previous post I explained that I observed 2x CPU usage when running my insert test case in 11.2.0.4 vs. 11.2.0.2. I discussed that with @martinberx @OracleSK @TanelPoder @xtner @Ycolin @fritshoogland @lleturgez @LucaCanaliDB on Twitter and it became clear that:

  • 'perf' is the tool to use when investigating CPU usage.
  • we must use the hardware event 'cpu-cycles' and not 'cpu-clock', and hardware events are not available in my VirtualBox VM
So I installed a VMware VM with OEL7 and the following two versions of Oracle: 11.2.0.3 and 11.2.0.4.

Elapsed Time

On my VirtualBox VMs, the test case elapsed time was 77 seconds on 12.1.0.2 instead of 35 seconds on 11.2.0.2. I tested it in different situations (32-bit and 64-bit Linux, 11.2.0.1, 11.2.0.2, 11.2.0.4) and it was clear that the CPU time used to execute the test case is consistently about 35 seconds in version 11.2.0.2 and before, and about double that on 11.2.0.4 and 12c.
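For reference, here is a sketch (not part of the original tests) of how the session CPU time can be checked from the same session, before and after a run, to separate CPU time from elapsed time:

-- sketch: session-level statistics for the current session
-- ('CPU used by this session' is reported in centiseconds)
select n.name, s.value
  from v$mystat s join v$statname n on n.statistic# = s.statistic#
 where n.name in ('CPU used by this session','DB time');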

I tried perf, but it wasn't possible to get cpu-cycles samples even when setting SSE4 passthrough. So I installed a VMware environment (and enabled 'Virtualize CPU performance counters' in the VM settings) and tried the bulk insert version.

Here are a few runs on 11.2.0.4

Elapsed: 00:00:18.01
Elapsed: 00:00:16.40
Elapsed: 00:00:15.61
Elapsed: 00:00:15.54
Here is the elapsed time on 11.2.0.3
Elapsed: 00:00:08.34
Elapsed: 00:00:08.35
Elapsed: 00:00:09.87
Elapsed: 00:00:09.25
So I have two conclusions for the moment:
  • the timings are not the same in VMware and VirtualBox
  • I've reproduced my issue where the elapsed time is 2x longer in 11.2.0.4

cpuinfo

The performance difference probably comes from the CPU features that are enabled in the VMware VM:

[oracle@vmware home]$ grep ^flags /proc/cpuinfo
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt aes xsave avx f16c rdrand hypervisor lahf_lm ida arat epb xsaveopt pln pts dtherm fsgsbase smep
and that are not in the VBox VM:
[oracle@vbox home]$ grep ^flags /proc/cpuinfo
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good nopl pni monitor ssse3 lahf_lm

perf

In order to use perf, I've read Stefan Koehler's blog about it.

First, I get the process id for my session:

SQL> select spid from v$process 
where addr=(select paddr from v$session where sid=sys_context('userenv','sid'));
SPID
------------------------
55597
Then I record the sampling of cpu-cycles:
perf record -e cpu-cycles -o /tmp/perf.out -g -p 55597
Then I run my PL/SQL block inserting 1M rows, bulk insert, commit only at the end. And I stop the sampling with ^c.

I show the report with:

perf report -i /tmp/perf.out -g none -n --stdio

Perf report for run on 11.2.0.3

# ========
# captured on: Sat May 16 21:59:53 2015
# hostname : localhost.localdomain
# os release : 3.8.13-35.3.1.el7uek.x86_64
# perf version : 3.8.13-35.3.1.el7uek.x86_64
# arch : x86_64
# nrcpus online : 1
# nrcpus avail : 1
# cpudesc : Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz
# cpuid : GenuineIntel,6,58,9
# total memory : 1780608 kB
# cmdline : /usr/libexec/perf.3.8.13-35.3.1.el7uek.x86_64 record -e cpu-cycles -o /tmp/perf.out -g -p 55256
# event : name = cpu-cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0, excl_guest = 1, precise_ip = 0, id = { 10 }
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5
# ========
#
# Samples: 28K of event 'cpu-cycles'
# Event count (approx.): 18991352213
#
# Overhead      Samples  Command       Shared Object                                       Symbol
# ........  ...........  .......  ..................  ...........................................
#
    13.11%         3763   oracle  oracle              [.] kdiins0
     3.43%          958   oracle  oracle              [.] lnxadd
     2.73%          761   oracle  oracle              [.] pfrfd1_init_locals
     2.47%          688   oracle  oracle              [.] lnxmul
     2.23%          635   oracle  oracle              [.] __intel_new_memcpy
     2.18%          608   oracle  oracle              [.] pevm_ENTER
     1.90%          529   oracle  oracle              [.] lnxsub
     1.84%          519   oracle  oracle              [.] pmusgel_Get_Element
     1.75%          500   oracle  oracle              [.] kdkcmp1
     1.68%          467   oracle  oracle              [.] pfrrun_no_tool
     1.58%          440   oracle  oracle              [.] pfrust
     1.54%          429   oracle  oracle              [.] lnxmin
     1.22%          349   oracle  oracle              [.] kauxsin
     1.07%          297   oracle  oracle              [.] pfrinstr_INMDH
     0.94%          269   oracle  [kernel.kallsyms]   [k] native_apic_mem_write
     0.88%          244   oracle  oracle              [.] pevm_RET
     0.87%          248   oracle  oracle              [.] pfsabrc
     0.85%          237   oracle  oracle              [.] pfrinstr_ADDN
     0.83%          231   oracle  oracle              [.] pfrxca
     0.78%          217   oracle  oracle              [.] pfrinstr_INHFA1
     0.75%          209   oracle  oracle              [.] pfrinstr_SUBN
     0.73%          204   oracle  oracle              [.] kgiPinObject
     0.70%          200   oracle  oracle              [.] pfrb2_convert_var2aad
     0.69%          191   oracle  oracle              [.] pmusise_Insert_Element
     0.67%          187   oracle  oracle              [.] pfrinstr_INFR
     0.64%          179   oracle  libc-2.17.so        [.] __memmove_ssse3_back

Perf report for run on 11.2.0.4

# ========
# captured on: Sat May 16 19:46:51 2015
# hostname : localhost.localdomain
# os release : 3.8.13-35.3.1.el7uek.x86_64
# perf version : 3.8.13-35.3.1.el7uek.x86_64
# arch : x86_64
# nrcpus online : 1
# nrcpus avail : 1
# cpudesc : Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz
# cpuid : GenuineIntel,6,58,9
# total memory : 1780608 kB
# cmdline : /usr/libexec/perf.3.8.13-35.3.1.el7uek.x86_64 record -e cpu-cycles -o /tmp/perf.out -g -p 35344
# event : name = cpu-cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0, excl_guest = 1, precise_ip = 0, id = { 5 }
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5
# ========
#
# Samples: 25K of event 'cpu-cycles'
# Event count (approx.): 17296175886
#
# Overhead      Samples  Command      Shared Object                                       Symbol
# ........  ...........  .......  .................  ...........................................
#
    14.20%         3646   oracle  oracle             [.] kdiins0
     3.61%          918   oracle  oracle             [.] lnxadd
     3.20%          814   oracle  oracle             [.] pfrfd1_init_locals
     2.49%          637   oracle  oracle             [.] __intel_new_memcpy
     2.43%          618   oracle  oracle             [.] lnxmul
     2.34%          596   oracle  oracle             [.] pevm_ENTER
     2.05%          527   oracle  oracle             [.] kdkcmp1
     2.01%          513   oracle  oracle             [.] pmusgel_Get_Element
     2.00%          507   oracle  oracle             [.] lnxsub
     1.99%          505   oracle  oracle             [.] pfrrun_no_tool
     1.74%          443   oracle  oracle             [.] lnxmin
     1.74%          441   oracle  oracle             [.] pfrust
     1.72%          434   oracle  libc-2.17.so       [.] __memmove_ssse3_back
     1.25%          321   oracle  oracle             [.] kauxsin
     1.22%          311   oracle  oracle             [.] pfrfd_init_frame
     1.17%          299   oracle  oracle             [.] pfrinstr_INMDH
     1.09%          276   oracle  oracle             [.] kglpnp
     0.96%          243   oracle  oracle             [.] pfrinstr_ADDN
     0.94%          239   oracle  oracle             [.] pfrxca
     0.93%          239   oracle  oracle             [.] pfsabrc
     0.91%          230   oracle  oracle             [.] pmusise_Insert_Element
     0.90%          228   oracle  oracle             [.] __PGOSF347_pfrinstr_INHFA1
     0.81%          206   oracle  oracle             [.] kss_get_owner
     0.80%          204   oracle  oracle             [.] pfrinstr_XCAL
     0.72%          182   oracle  oracle             [.] pevm_RET
     0.70%          177   oracle  oracle             [.] pevm_ARGEIBBI
     0.70%          178   oracle  oracle             [.] pfrb2_convert_var2aad

I don't see many differences here, so it does not seem that an additional code path is being run.

So the question is still open... But 'perf' is really a great tool to get the detail about the database time that is not accounted in wait events. There are a lot of references about it at the end of Stefan Koehler's blog.

Update May 17th: It was not clear when I posted that yesterday, but all tests on that VMware environment are done using the bulk version with forall but no append_values hint.

Optimized Row Columnar (ORC) format in PostgreSQL

Sat, 2015-05-16 06:28

Nowadays everybody is talking about columnar storage formats. What can PostgreSQL do in this area? There is no native support for that in PostgreSQL, but thanks to the fact that PostgreSQL is highly extensible there is a foreign data wrapper called cstore_fdw. Let's take a look at what it can do.

Variations on 1M rows insert (4): IN MEMORY

Fri, 2015-05-15 14:09

The Oracle In-Memory option is for reporting. Oracle has chosen a hybrid approach: maintain the In-Memory Column Store in addition to the buffer cache. Don't forget our In-Memory event, but for the moment, let's try the 1M row insert when the target table is IN MEMORY.

12c

I've done the previous variations on 11g because I wanted to use the Developer Days VM that also has TimesTen installed. But now I want to test some 12c features, so I run the same test as the 1st variation: the one on a table with one index (the primary key), with conventional inserts committed only at the end. The one that takes 35 seconds on 11.2.0.2.
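As a reminder, the target is the same DEMO table as in the previous variations. A sketch of the DDL (the same as the one created later in this post, but without the INMEMORY clause):

SQL> create table DEMO ("id" number primary key, "text" varchar2(15), "number" number);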

Here is the same on an equivalent VM with 12.1.0.2:

SQL> set timing on
SQL> declare
  2   type people_array is varray(12) of varchar(15);
  3   people people_array := people_array( 'Marc', 'Bill', 'George', 'Eliot', 'Matt', 'Trey', 'Tracy','Greg', 'Steve', 'Kristina', 'Katie', 'Jeff' );
  4   people_count number :=people.COUNT;
  5   n number;
  6  begin
  7   for i in 0..1e6 loop
  8    n:=trunc(dbms_random.value(0,10000));
  9    insert into DEMO values( i , people( dbms_random.value(1,people_count) ) , n );
 10    -- 
 11   end loop;
 12   commit;
 13  end;
 14  /

PL/SQL procedure successfully completed.

Elapsed: 00:01:17.03
Yes, it's 2 times longer and I don't know why. I tried with different versions and for the moment it seems that the CPU consumption in 11.2.0.4 or 12c is 2x when compared to 11.2.0.3. I didn't investigate further and I'll probably come back to that after attending Luca Canali's session next week about modern Linux tools for Oracle troubleshooting.

Please don't draw any conclusions from it now. In real life, 12c is not slower than 11g. This is just one test case in one specific context and there is nothing to say about it before understanding what happens. This test is just there to set the baseline at 00:01:17 on that 12c database.

In-Memory

The question of the day is: how does this 1 million rows insert behave on an IN MEMORY table? We know that there is an overhead to maintain both the buffer cache and the In-Memory Column Store. And we know that this is probably not a problem, because In-Memory often lets us get rid of a few indexes, and the gain in index maintenance compensates for the overhead.

SQL> create table DEMO ("id" number primary key, "text" varchar2(15), "number" number) 
INMEMORY PRIORITY CRITICAL;
Table created.
I've created the table in-memory with on-demand population. The load duration is not higher than without in-memory:
PL/SQL procedure successfully completed.
Elapsed: 00:01:23.35	
However, the in-memory journal has been updated:
NAME                                                                  VALUE
---------------------------------------------------------------- ----------
IM space private journal bytes allocated                           25100288
IM space private journal bytes freed                               25100288
IM transactions rows journaled                                       394895
The 25MB is the size of my 1M rows but not all rows have been populated in memory:
SQL> select segment_name,inmemory_size,bytes,bytes_not_populated from v$im_segments;

SEGMENT_NA    INMEMORY_SIZE            BYTES      BYTES_NOT_POPULATED
---------- ---------------- ---------------- ------------------------
DEMO              1,179,648       23,068,672               11,354,112
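As an aside (not something I did in this test), population can also be requested explicitly instead of waiting for a scan. A sketch, assuming the DBMS_INMEMORY package shipped with 12.1.0.2 and a hypothetical owner name:

SQL> -- hypothetical owner; explicitly request population of DEMO into the column store
SQL> exec dbms_inmemory.populate('DEMO_OWNER','DEMO');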
If I query the table we still read some blocks from buffer cache:
SQL> set autotrace on 
SQL> select count(*) from demo;

  COUNT(*)
----------
   1000002

Execution Plan
----------------------------------------------------------
Plan hash value: 2180342005

----------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     1 |    28   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |      |     1 |            |          |
|   2 |   TABLE ACCESS INMEMORY FULL| DEMO |  1025K|    28   (0)| 00:00:01 |
----------------------------------------------------------------------------

Statistics
----------------------------------------------------------
         13  recursive calls
          1  db block gets
       4681  consistent gets
       1795  physical reads
     145188  redo size
        545  bytes sent via SQL*Net to client
        551  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

And then the whole table is now populated in memory:
SQL> select segment_name,inmemory_size,bytes,bytes_not_populated from v$im_segments;

SEGMENT_NA    INMEMORY_SIZE            BYTES      BYTES_NOT_POPULATED
---------- ---------------- ---------------- ------------------------
DEMO              8,585,216       23,068,672                        0

But even then, we need to read some blocks from buffer cache:
SQL> set autotrace on
SQL> select count(*) from demo;

  COUNT(*)
----------
   1000002

Execution Plan
----------------------------------------------------------
Plan hash value: 2180342005

----------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     1 |    28   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |      |     1 |            |          |
|   2 |   TABLE ACCESS INMEMORY FULL| DEMO |  1025K|    28   (0)| 00:00:01 |
----------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1381  consistent gets
          0  physical reads
          0  redo size
        545  bytes sent via SQL*Net to client
        551  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Conclusion

The update of In-Memory data is not very well documented. From this simple test, it seems that updating the In-Memory Column Store has very limited overhead when storing the modifications into the in-memory transactional journal. However, reading the rows just after the insert does not seem very optimal: not all rows had been journaled into the transaction log, and even when everything is populated, the buffer cache is still read. If you want to know a bit more, you can check Mahmoud Hatem's investigations about that, with deep tracing.

In this post there are more questions than answers, but the fact is that maintaining the In-Memory Column Store is not a big overhead, which makes it possible on our OLTP databases.

Migration of QlikView Services

Fri, 2015-05-15 12:00


QlikView is a product of the software company Qlik (previously known as QlikTech), designed to provide business intelligence & visualization capabilities using reports or dashboards. The QlikView solution is in fact composed of several components, just like a Documentum environment (Content Server, WebServer, JMS, and so on). QlikView has three main components:

  • QlikView Server: the core of the QlikView solution, which provides access to the QlikView Documents for the users
  • QlikView Publisher: an entity used to automatically reload and manage the QlikView Documents more efficiently
  • QlikView Desktop: a development tool that can be used to build your reports/dashboards


Based on these descriptions, the only element that is needed in all environments is the QlikView Server. Of course, at a certain point of your project you may also need QlikView Desktop to build your reports/dashboards, but once that is done you don't really need it anymore. The QlikView Publisher isn't mandatory, but it will definitely make your life easier.

 

I. Considerations


To be more precise, QlikView doesn't just provide some components, it provides Services. In fact, each Service is represented by a Windows Service and can be seen as a specific role with dedicated features. In the QlikView world, an upgrade and a migration are almost the same thing. The main difference is that some elements may change between two main releases of QlikView: the path of a folder, the .NET Framework used, and so on. So if you plan to upgrade or migrate your installation (or a part of it), the most important thing to understand is probably that you need to take care of each Service, one Service at a time.


To improve performance, the QlikView Server is designed to mainly use RAM to store QlikView Documents, because accessing memory is much faster than accessing the hard drive. For example, when a user opens a 1GB QlikView Document (using a browser), 4GB of RAM are used to store the document in memory. Each additional user accessing this QlikView Document will increase this amount by 40% of the document's size (+400 MB of RAM per user in this example). On the other side, the QlikView Publisher is designed to use CPUs for its calculations, and so on.
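To put numbers on the memory side (a rough sketch based on the figures above): a 1GB QlikView Document opened by 10 concurrent users would need roughly 4GB + 9 x 0.4GB = 7.6GB of RAM for that single document.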


When using QlikView in a small company or with a small number of users, installing all QlikView Services on one Windows Server is often sufficient. A Windows Server with 64, 128 or 256GB of RAM and 16, 32 or 64 CPUs is something quite "normal" for QlikView. However, if your QlikView environment starts to show some weaknesses (job failures, locked tasks, QMC not responding, and so on), then it's probably time to do something... Because the QlikView Server and Publisher handle memory and CPU consumption in very different ways, a best practice is always to separate them, but for small companies it may not be necessary.


II. Migration of Services - Theory


Because of this last consideration, in the two remaining parts of this post I will try to explain how to separate the QlikView Services as it is recommended. So what are the Services provided by QlikView?

  • QlikView Server
  • QlikView WebServer (when using the QlikView WebServer)
  • QlikView Settings Service (when using IIS)
  • QlikView Distribution Service
  • QlikView Management Service
  • QlikView Directory Service Connector


You can also have some additional Services according to what has been installed on the QlikView environment, like the QlikView Offline Service (offline access via a mobile/tablet) or the QlikView SAP Network Server. The best practice is generally to do the following:

Server 1 - focus on RAM

  • QlikView Server
  • QlikView WebServer


Server 2 - focus on CPU

  • QlikView Management Service
  • QlikView Directory Service Connector
  • QlikView Distribution Service


III. Migration of Services


The general procedure to migrate a QlikView Service from Server 1 to Server 2 is always the same, but some steps differ a little bit for specific Services. Remember that the best thing to do is always to migrate one Service at a time and to check that QlikView is still working properly after each migration. An overview of this procedure would be:

  1. Installation of the QlikView Service on Server 2
  2. Configuration
  3. Uninstallation of the QlikView Service on Server 1



Installation of the QlikView Service on Server 2


The installation of a Service on a separate server during a migration is quite simple:

  1. Stop the QlikView Service on Server 1
  2. Run the QlikView Server installer: QlikViewServer_x64Setup.exe (64 bits Windows Server 2k8) or QlikViewServer_Win2012andUp.exe (64 bits Windows Server 2012)
  3. On the screen where you specify what should be installed, always choose the "Custom" option and then select the QlikView Service(s) that should be installed.
  4. Reboot the Server 2



Configuration


The configuration part is quite simple since there is only one task that should be executed: change a URL. This task is the same for all QlikView Services except one: the QlikView Management Service. For all other QlikView Services, here is what should be done:

  1. Open the QlikView Management Console using a browser. The default URL is something like: http://##Server_1_Hostname##:4780/qmc/SystemSetup.htm
  2. Expand the folder that corresponds to the QlikView Service (e.g. "Distribution Services" for the QlikView Distribution Service)
  3. Click on the element inside this folder (e.g. "QDS@##Server_1_Hostname##" for the QlikView Distribution Service)
  4. Click on the "General" tab
  5. Change the URL value replacing ##Server_1_Hostname## with ##Server_2_Hostname##
  6. Click on "Apply" to save your changes


And that should be sufficient for QlikView to know that you installed the QlikView Service on another server.



So as said above, the configuration part is different for the QlikView Management Service. This Service is the one that takes care of the configuration on the QlikView Management Console. That's why it doesn't make much sense to change something on the QlikView Management Console that was just installed on another server (the installer knows on which server it was executed)...


So what should be done in case this QlikView Service has been installed on Server 2? Well, it's also quite simple: almost all QlikView configurations are stored in something called the "QVPR Database" (QVPR for QlikView Repository). By default this QVPR Database is just an XML Repository, but it can also be a SQL Server database. There is a setting on the QlikView Management Console (System > Setup > Management Service > Repository tab) that controls whether the QVPR Database should be backed up or not (never, daily, every X minutes...) and there is even a button to "Backup Now" the configuration. The location of these backups is defined on this page too, but if you want to open the real location of the XML Repository, you should take a look at "C:/ProgramData/QlikTech/ManagementService/QVPR/". So the configuration part for the QlikView Management Service consists of:

  1. Stop the QlikView Management Service on Server 2
  2. Copy the QVPR backup from Server 1 to Server 2
  3. Restore the QVPR backup to the Server 2 (replace existing files with the ones from the backup)
  4. Start the QlikView Management Service on Server 2
  5. Check if all configurations are OK



Uninstallation of the QlikView Service on Server 1


The uninstallation step is quite simple too... Everything is simple with QlikView, isn't it? ;)

  1. Run the QlikView Server installer: QlikViewServer_x64Setup.exe (64 bits Windows Server 2k8) or QlikViewServer_Win2012andUp.exe (64 bits Windows Server 2012)
  2. QlikView will detect that some components are already installed
  3. On the second screen, select "Modify"
  4. On the next screen, click on the QlikView Service and disable it ("This feature will not be available")
  5. Complete the "installation" process to uninstall the QlikView Service

 

Once the three QlikView Services have been migrated from Server 1 to Server 2, you should see an improvement in performance, or at least fewer issues with the QlikView Server & Publisher! ;)

 

 

tmux - an alternative to screen

Fri, 2015-05-15 00:37

You may already use screen for multiplexing a terminal. This is especially useful when you want to start long running processes on a server and do not want to lose the connection because of an ssh connection timeout, firewall settings or other reasons. With screen, the session keeps running even if you get disconnected somehow, and you may re-attach to the screen session at any point later.