
Feed aggregator

Variations on 1M rows insert (2): write commit

Yann Neuhaus - Mon, 2015-05-11 22:29

In this blog post, I will try to do the same as my colleagues did for Oracle and PostgreSQL. As a reminder, we saw in my previous blog post that SQL Server is designed to commit transactions implicitly by default, and inserting 1M rows in this way may have a huge impact on the transaction log throughput. Each transaction is synchronously committed to the transaction log. In this blog post, we'll see a variation of the previous test.

Indeed, since SQL Server 2014, it is possible to change this behaviour a little to improve the overall performance of our test by using a feature called delayed transaction durability. This is definitely a performance feature, but you will have to trade durability for performance. As explained here, SQL Server uses a write-ahead logging (WAL) protocol, and using this new feature will temporarily suspend this requirement.

So, let’s perform the same test, but this time I will favour overall throughput by using the delayed durability option.

 

alter database demo set delayed_durability = allowed;
go

DECLARE @i INT = 1;

WHILE @i <= 1000000
BEGIN
       begin tran;

       INSERT INTO DEMO VALUES (@i, CASE ROUND(RAND() * 10, 0)
                                         WHEN 1 THEN 'Marc'
                                         WHEN 2 THEN 'Bill'
                                         WHEN 3 THEN 'George'
                                         WHEN 4 THEN 'Eliot'
                                         WHEN 5 THEN 'Matt'
                                         WHEN 6 THEN 'Trey'
                                         ELSE 'Tracy'
                                    END, RAND() * 10000);

       commit tran with (delayed_durability = on);

       SET @i += 1;
END

 

-- 00:00:20 – Heap table
-- 00:00:19 – table with clustered index

 

If I refer to my first test results with the implicit commit behaviour, I notice a big performance improvement (86%). You may also note that I used this option at the transaction level after enabling delayed durability at the database level, but you have other possibilities: depending on your context, you may prefer to allow or to force this option directly at the database level, as shown below.
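
A minimal sketch of the database-level alternatives (syntax per the ALTER DATABASE DELAYED_DURABILITY option; the demo database name comes from the test above):

-- DISABLED (default): every commit is fully durable, whatever the commit syntax says
-- ALLOWED: each transaction can ask for delayed durability (as in the test above)
-- FORCED: delayed durability applies to every transaction, with no code change needed
alter database demo set delayed_durability = forced;
go
-- and back to the fully durable default
alter database demo set delayed_durability = disabled;
go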

Do I have to enable it? If your business is comfortable trading durability for throughput and this option improves the overall performance, go ahead, but keep in mind that you also have other ways to improve your transaction log throughput before enabling this option (please read Paul Randal’s blog posts).

Want to meet our experts from all technologies in one day? Come to our event In-Memory: boost your IT performance! where we talk about SQL Server, Oracle and SAP HANA.

 

 

Efficiency

Jonathan Lewis - Mon, 2015-05-11 14:24

Here’s a question to which I don’t know the answer, and for which I don’t think anyone is likely to need the answer; but it might be an entertaining little puzzle for the curious.

Assume you’re doing a full tablescan against a simple heap table of no more than 255 columns (and not using RAC, Exadata, In-memory, etc. etc. etc.), and the query is something like:


select  {columns 200 to 250}
from    t1
where   column_255 = {constant}
;

To test the predicate Oracle has to count its way along each row column by column to find column 255. Will it:

  1. transfer columns 200 to 250 to local memory as it goes, then check column_255 — which seems to be a waste of CPU if the predicate doesn’t evaluate to TRUE
  2. evaluate the predicate, then walk the row again to transfer columns 200 to 250 to local memory if the predicate evaluates to TRUE — which also seems to be a waste of CPU
  3. one or other of the above depending on which columns are needed, how long they are, and the selectivity of the predicate

How would you test your hypothesis?
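One possible starting point – purely a sketch with hypothetical object names, not necessarily how Jonathan would do it – is to generate a 255-column table, load it, and then compare the CPU figures (10046 trace or session statistics) for queries that project different column ranges against the same last-column predicate:

declare
    l_sql varchar2(32767) := 'create table t1 (';
begin
    -- columns c1 .. c255, all short strings for simplicity
    for i in 1 .. 255 loop
        l_sql := l_sql || 'c' || i || ' varchar2(10),';
    end loop;
    execute immediate rtrim(l_sql, ',') || ')';
end;
/

-- (load a few hundred thousand rows where c255 rarely matches, then:)
alter session set events '10046 trace name context forever, level 8';

select c200, c210, c220, c230, c240, c250 from t1 where c255 = 'XXX';
select c1 from t1 where c255 = 'XXX';

alter session set events '10046 trace name context off';
-- compare CPU per execution for the two statements in the tkprof output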


Worth Reading: Use of adjuncts and one challenge of online education

Michael Feldstein - Mon, 2015-05-11 12:36

By Phil Hill

There is a fascinating essay today at Inside Higher Ed giving an inside, first-person view of being an adjunct professor.

2015 is my 25th year of adjunct teaching. In the fall I will teach my 500th three-credit college course. I have put in many 14- to 16-hour days, with many 70- to 80-hour weeks. My record is 27 courses in one year, although I could not do that now.

I want to share my thoughts on adjunct teaching. I write anonymously to not jeopardize my precarious positions. How typical is my situation?

The whole essay is worth reading, as it gives a great view into the modern university and the implications of using adjuncts. But I want to highlight one paragraph in particular that captures the challenge of understanding online education.

I have taught many online courses. We have tapped about 10 percent of the potential of online courses for teaching. But rather than exploring the untapped 90 percent, the college where I taught online wanted to standardize every course with a template designed by tech people with no input from instructors.

I want to design amazing online courses: courses so intriguing and intuitive and so easy to follow no one would ever need a tutorial. I want to design courses that got students eager to explore new things. Let me be clear, I am not talking about gimmicks and entertainment; I am talking about real learning. Is anyone interested in this?

It is naive to frame the debate over online education as solely, or primarily, an issue of faculty resistance. Yes, there are faculty members who are against online education, but one reason for this resistance is a legitimate concern for the quality of courses. What the essay reminds us is that part of the quality issue arises from structural issues within the university and not from the actual potential of well-designed and well-taught online courses.

David Dickens at Google+ had an interesting comment based on the “tech people” reference that points to the other side of the same coin.

As a tech guy I can tell you, we’d love to have the time and tools to work with motivated adjuncts (or anyone else), but often times we have to put out something that will work for everyone, will scale, and will be complete and tested before the end of the week.

It is endlessly frustrating to know that there is so much more that could be done. After all, we tech folks are completely submerged in our personal lives with much more awesome tech than we can include in these sorts of “products” as we are constrained to publish them.

There is an immense difference between A) the quality of online education and B) the quality of well-designed and well-taught online education, and that is even different than C) the potential of online education. It is a mistake to conflate A), B), and C).

Update: David is on a roll while in discussion with George Station. This clarification builds on the theme of this post.

My point is that IT isn’t the barrier, but rather we are the mask behind which all the real barriers like to hide. We’d love to do more but can’t, and we get put in the position of taking the blows that should be directed towards the underlying issues.

The post Worth Reading: Use of adjuncts and one challenge of online education appeared first on e-Literate.

Get SQL Server Network Configuration with PowerShell

Yann Neuhaus - Mon, 2015-05-11 09:29

This blog is part of a series which tries to automate SQL Server administration with PowerShell. It explains how to retrieve the network configuration of a SQL Server instance with PowerShell.

I will refer to my previous blog, Get SQL Server services with PowerShell, where I presented how to get the SQL Server Engine service. I will assume the service has already been retrieved.

 

Disclaimer: I am not a developer but a SQL Server dba. If you find errors or some ways of improvement, I will be glad to read your comments!

 

SQL Engine service

I use the SMO objects to access the SQL Server Engine service. The object obtained has several properties.

This object can retrieve important information concerning the SQL Server Engine service, such as:

 

service_properties.png

 

TCP/IP information

This time, we will use the WMI objects provided by SMO. To retrieve the corresponding TCP/IP object for a specific instance, proceed as follows:

wmi_object_tcp_ip.png

 

The “Microsoft.SqlServer.Management.Smo.Wmi” namespace contains all the classes that represent the SQL Server WMI.

 

Now, to display the TCP/IP information of your instance, proceed as follows:

tcp_ip_information.png
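
Since the original post illustrates these steps with screenshots, here is a minimal PowerShell sketch of the same idea (the instance name "MSSQLSERVER" and the local computer name are assumptions, adapt them to your environment):

# Load the SMO WMI management assembly
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.SqlWmiManagement") | Out-Null

# WMI managed computer object from the "Microsoft.SqlServer.Management.Smo.Wmi" namespace
$wmi = New-Object Microsoft.SqlServer.Management.Smo.Wmi.ManagedComputer $env:COMPUTERNAME

# TCP/IP protocol of the instance ("Np" = Named Pipes and "Sm" = Shared Memory work the same way)
$tcp = $wmi.ServerInstances["MSSQLSERVER"].ServerProtocols["Tcp"]
$tcp | Select-Object DisplayName, IsEnabled

# Port configuration per IP address
$tcp.IPAddresses | ForEach-Object {
    "{0} : {1}" -f $_.Name, $_.IPAddress
    $_.IPAddressProperties | Select-Object Name, Value
}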

 

Named Pipe information

As for TCP/IP, we build the WMI object for the Named Pipe with:

wmi_object_named_pipe.png

 

Now to display the Named Pipe information, proceed as follows:

named_pipe_information.png

 

Shared Memory information

As for the previous ones, we build the WMI object for the Shared Memory with:

wmi_object_shared_memory_20150512-063808_1.png

 

Now to display the Shared Memory information, proceed as follows:

shared_memory_information.png

 

Next steps

With these commands, you can generate a dashboard for a specific instance. Here is an example from our Database Management Kit (DMK):

dmk_service_20150512-064356_1.png

 

I hope this blog will help you in your work ;)

Mobile My Oracle Support: Knowledge Search

Joshua Solomin - Mon, 2015-05-11 08:41
Mobile Knowledge Search

Parallel Execution

Jonathan Lewis - Mon, 2015-05-11 03:16

This is another little reference list I should have created some time ago. It covers a series of posts on interpreting parallel execution plans and understanding where the work happens.

I may add further links to this page in the future relating to other aspects of parallel execution.

 


WebSocket Accelerated Live Data Synchronization for MAF

Andrejus Baranovski - Mon, 2015-05-11 02:42
New generation Mobile and Web applications are going to use WebSockets. This is one of the fastest and most convenient ways to transfer JSON formatted data from the server to the client and back. A huge advantage: the server can initiate a data synchronisation request and deliver JSON messages directly to the client. You should read my previous post for detailed information on how to configure MAF with WebSocket - Oracle MAF and WebSockets Integration - Live Twitter Stream.

In this post I would like to describe how to integrate the information received through the WebSocket with MAF components. I will use the MAF Tree Map UI component to display live data for Tweet locations.

Below you can see a screen recording - the MAF Tree Map is refreshed automatically each time an update is received through the WebSocket channel. It is amazing how fast the data arrives through the WebSocket channel. In this test, MAF is handling around one update per second received from the WebSocket:


Let's look at how it is implemented behind the scenes. There is the MAF Tree Map UI component, a regular MAF component that renders a graph visualisation:


The UI component is based on Data Bindings; the collection is initialized from a Data Control method:


The method handling data updates from the WebSocket is responsible for updating the data collection - it invokes a helper method where the data collection is re-constructed. The standard Refresh Provider helps to repaint the data displayed on the UI:


Simple and effective. Download sample application - AltaMobileApp_v2.zip.

Final Agenda for the Rittman Mead BI Forum 2015 Atlanta – Running this Week!

Rittman Mead Consulting - Mon, 2015-05-11 02:17

It’s the Monday before the Atlanta Rittman Mead BI Forum 2015, and delegates will start to arrive in Atlanta tomorrow and during the rest of the week. The first event in Brighton was excellent, and we’re hoping for something even better in Atlanta this week!

Safe travels for everyone coming to Atlanta, the official Twitter hashtag for the event is #biforum2015, and in the meantime here’s the final agenda for Atlanta’s BI Forum 2015:

Rittman Mead BI Forum 2015
Renaissance Atlanta Midtown Hotel, Atlanta, GA, USA 

Wednesday 13th May 2015

  • 9.00am – 9.30am Registration for Masterclass attendees
  • 9.30am – 12.30pm Masterclass Part 1
  • 12.30pm – 1.30pm Lunch
  • 1.30pm – 4.30pm Masterclass Part 2
  • 5.30pm – 6.30pm Drinks Reception in Renaissance Midtown Hotel Atlanta
  • 6.30pm – 7.30pm Oracle Keynote – Chris Lynskey
  • 7.30pm – 9.30pm Dinner at Renaissance Midtown Hotel Atlanta

Thursday 14th May 2015

  • 8.45am – 9.00am Welcome and Opening Comments
  • 9.00am – 9.45am Chris Lynskey (Oracle Corporation) – Looking Ahead to Oracle BI 12c and Visual Analyzer
  • 9.45am – 10.30am Robin Moffatt (Rittman Mead) – Smarter Regression Testing for OBIEE
  • 10.30am – 11.00am Coffee
  • 11.00am – 11.45am Chris Lynskey (Oracle Corporation) – Big Data Discovery
  • 11.45am – 12.30pm Mark Rittman (Rittman Mead) and Tim Vlamis (Vlamis Software Solutions) – Big Data Discovery – Examples from the Field
  • 12.30pm – 1.30pm Lunch
  • 1.30pm – 2.30pm Day 1 Debate – “Self-Service BI – The Answer to Users’ Prayers, or the Path to Madness?”
  • 2.30pm – 3.15pm Tim German / Cameron Lackpour – Hybrid Mode – An Essbase Revolution
  • 3.15pm – 3.45pm Coffee
  • 3.45pm – 4.30pm Kevin McGinley (Red Pill Analytics) – Agile BI Applications: A Case Study
  • 6.00pm – 6.45pm Guest Speaker/Keynote – John Foreman – How Mailchimp used qualitative and quantitative analysis to build their next product
  • 7.00pm – 7.45pm Depart for dinner at restaurant
  • 8.00pm – 10.00pm Dinner at external venue

Friday 15th May 2015

  • 09.00am – 09.45am Stewart Bryson (Red Pill Analytics) – Supercharge BI Delivery with Continuous Integration
  • 09.45am – 10.30am Gabby Rubin (Oracle Corporation) – Solid Standing for Analytics in the Cloud
  • 10.30am – 11.15am Hasso Schaap (Qualogy) – Developing strategic analytics applications on OBICS PaaS
  • 11.15am – 11.30am Coffee
  • 11.30am – 12.15pm Andy Rocha and Pete Tamisin (Rittman Mead) – OBIEE Can Help You Achieve Your GOOOOOOOOOALS!
  • 12.15pm – 1.00pm Christian Screen (Sierra-Cedar) – 10 Tenets for Making Your Oracle BI Applications Project Succeed Like a Boss
  • 1.00pm – 1.30pm Short Lunch
  • 1.30pm – 2.30pm Data Visualization Bake-off
  • 2.30pm – 3.15pm Sumit Sarkar (Progress Software) – NoSQL and OBIEE
  • 3.15pm – 3.45pm Closing Remarks, and Best Speaker Award

If you’re interested in coming along to the Rittman Mead BI Forum 2015 in Atlanta, GA, there are still spaces available with details of the event here, and the registration form here – otherwise we’ll look forward to seeing you all at the Renaissance Atlanta Midtown Hotel later this week.

Categories: BI & Warehousing

SharePoint 2016: What’s new? And What to expect?

Yann Neuhaus - Mon, 2015-05-11 02:11

The Ignite event took place in Chicago last week. Based on what was presented there, here is the first news about the new SharePoint 2016!

SharePoint 2016 looks like an enhancement of the existing SharePoint 2013 version.
Most of the SP2013 features will continue to work but in a more powerful way.

SharePoint 2016 will be focused on the following points:
  • Security Compliance
  • Advanced Analytics and Insights (data analysis and reporting)
  • Cloud experience
The new features for this release include:
  • Links: All URLs will be “Resource ID” based URLs. That means that if one of your documents is renamed, the document link won’t be broken.
  • User Profile Service Application won’t be available anymore.
  • In order to use the User Profile Services, you will need to use Forefront Identity Manager separately from the SharePoint farm (outside). The Active Directory will still be there, but there is no sync anymore
  • SharePoint 2016 is a Cloud Inspired Infrastructure. The emphasis here is on building a Hybrid platform to let users benefit from the enhanced cloud innovations that Microsoft regularly releases for Office 365. One of the most important experiences that Microsoft will be introducing to On-Premises clients is the use of Cloud based Search. The power of Delve & Office Graph can now also be applied to On-Premises content and the integrated cloud of course.
  • Add On-Premises Content to online Delve board
  • Zero downtime Patching: all upgrades of SharePoint can be done online.

  • New Service applications will have an integration option for Delve
  • The New Hybrid Extranet will allow many scenarios
  • New Repair button to restore Server Services depending on its role
  • Cloud subscription services, such as Azure Rights Management Services, can be added to an on-premises setup
  • The InfoPath 2013 application will work with SharePoint Server 2016
Analytics and Insights:

There is a huge improvement in how SharePoint records and displays analytics and reporting data. SharePoint 2016 will provide monitoring for:

  • Services
  • Actions
  • Usage
  • Engagement
  • Diagnostics
  • In-Memory Analytics (SQL Server 2016 release). Let's see in the next few months what will be announced. Maybe some enhancement of the integration between BI and SP2016

Social:
  • Social improvements start with better Yammer integration to avoid social silos: better integration with the micro-blogging platform, with new capabilities.
Content Management capabilities:
  • Improvements in content limits: the character limit in URLs is removed, the maximum file size is increased to 10GB, and the 5,000-item list view threshold has been raised.

  • Encryption enhancements
  • Multi-factor authentication
  • Data loss prevention
  • eDiscovery
  • Metadata management
  • Policy control
  • Mobile device management
Installation & Configuration Enhancements:
  • OS – Windows Server 2012 R2, Windows Server 2010
  • The new Server Installation Wizard will allow users to perform Role-based installations
  • Database – 64-bit SQL Server 2014
A brief summary of SharePoint 2016 using one word would be: CLOUD

 

Links for 2015-05-10 [del.icio.us]

Categories: DBA Blogs

Change Password in Oracle WebLogic

Have you ever forgotten the password of a WebLogic domain? It is a recurring question I receive now and again: "I’ve forgotten my password, how can I reset the AdminServer password in a WebLogic domain?"...

We share our skills to maximize your revenue!
Categories: DBA Blogs

what will be coming with the next major version of postgresql ?

Yann Neuhaus - Sun, 2015-05-10 13:14

A PostgreSQL major version is usually released once a year. For the next major version, which is 9.5, this probably will be in the second half of 2015.

Variations on 1M rows insert (4): APPEND_VALUES

Yann Neuhaus - Sun, 2015-05-10 12:39

In the first variation I did a bulk insert with the PL/SQL FORALL statement. A comment suggested adding the APPEND_VALUES hint that appeared in 11gR2. The APPEND_VALUES hint does the same as APPEND, but for inserts with a VALUES clause: it inserts in direct-path mode - directly into the datafiles, bypassing the buffer cache. Of course, this makes sense only with bulk inserts because you don't want to do that for a single row.

APPEND_VALUES

So, from the previous test, inserting 1 million rows in only one transaction (no intermediate commits) into a table with only one index (primary key) took 46 seconds. With bulk insert, the time is only 7 seconds. Let's do the same bulk insert, but with the APPEND_VALUES hint. I changed only the insert line:

SQL> create table DEMO ("id" number primary key, "text" varchar2(15), "number" number) ;
Table created.

SQL> set timing on
SQL> declare
  2   type people_array is varray(12) of varchar(15);
  3   people people_array := people_array( 'Marc', 'Bill', 'George', 'Eliot', 'Matt', 'Trey', 'Tracy','Greg', 'Steve', 'Kristina', 'Katie', 'Jeff' );
  4   people_count number :=people.COUNT;
  5   type DEMO_record_type is record ("user_id" number , "name" varchar2(15), "number" number) ;
  6   type DEMO_collection_type is table of DEMO_record_type index by binary_integer;
  7   DEMO_collection DEMO_collection_type;
  8  begin
  9   -- fill the collection
 10   for i in 1..1e6 loop
 11    DEMO_collection(i)."user_id":=i;
 12    DEMO_collection(i)."name":=people( dbms_random.value(1,people.count) );
 13    DEMO_collection(i)."number":=trunc(dbms_random.value(0,10000));
 14    null;
 15   end loop;
 16   forall i in 1..DEMO_collection.count insert /*+ append_values */ into DEMO values DEMO_collection(i);
 17   commit;
 18  end;
 19  /
and here is the execution time:
PL/SQL procedure successfully completed.
Elapsed: 00:00:05.28
It's a bit faster. However this testcase is not very well suited to show the benefit.

First, you must know that inserting in direct-path locks the table. We bypass the buffer cache and it's the buffer cache that manages concurrency when updating the blocks. Only one session can modify a block, and that's available also in RAC. So, when you bypass the buffer cache you must lock the table (or the partition if you insert into a partition) to be sure you are the only one that inserts after the high water mark.

Now about the testcase. Here I insert into an empty table. The direct-path insert is very efficient for index maintenance because instead of maintaining the index row by row, the index maintenance is done afterwards (the data inserted after the high water mark is read and sorted) and merged into the existing index. This optimization gives better performance on an index that has a higher depth.

APPEND

If you want an idea of how fast direct-path insert is, let's see how long it takes to insert the 1 million rows into another table - identical, with a primary key. This is easy with CTAS:

SQL> set timing on
SQL> create table DEMO2 ("id" primary key , "text" , "number" ) as select * from DEMO;
Table created.
Elapsed: 00:00:01.01
One second. That's the fastest way to insert 1 million rows in batch when you have them in a table (or file through external table), when you can lock the whole table and you don't need to do intermediate commits.
We can also do the same with INSERT /*+ APPEND */
SQL> truncate table DEMO2 reuse storage;
Table truncated.
Elapsed: 00:00:00.17

SQL> insert /*+ append */ into DEMO2 select * from DEMO;
1000000 rows created.
Elapsed: 00:00:01.36
And it's the same time (I'll not compare when the difference is less than one second).

The index maintenance is fast, but you can go further by disabling the indexes (ALTER INDEX ... UNUSABLE) before and rebuild them afterwards.
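As a quick illustration of that last point - a hypothetical sketch with an assumed non-unique index DEMO2_IX (a unique index backing the primary key cannot simply be skipped; its constraint would have to be disabled instead):

alter index DEMO2_IX unusable;
alter session set skip_unusable_indexes = true;
insert /*+ append */ into DEMO2 select * from DEMO;
commit;
alter index DEMO2_IX rebuild;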

However, my initial testcase was made thinking of OLTP - concurrent inserts from different sessions. So direct-path writes are definitely not for that, as they lock the table. But I wanted to show how small 1 million rows is when we are doing things in bulk from one session.

Kerberos SSO for Documentum CS 6.7 SP1

Yann Neuhaus - Sun, 2015-05-10 11:00


In a previous post, I shared some tips to configure Kerberos SSO with Documentum D2 3.1 SP1. Since then, I have worked on different projects to also set up Kerberos SSO on some other components of Documentum. In this post I will try to explain in detail what needs to be done to configure Kerberos SSO for the Content Server. Actually it's not that hard to do, but you may face some issues if you try to follow the official EMC documentation.


So what are the pre-requisites to setup the Kerberos SSO for the Content Server? Well in fact you just need a Content Server of course and an Active Directory to generate the keytab(s). Just to let you know, I used a Content Server 6.7 SP1 and an Active Directory on a Windows Server 2008 R2. Let's define the following properties:

  • Active Directory - user = cskrb
  • Active Directory - password = ##cskrb_pwd##
  • Active Directory - domain = DOMAIN.COM
  • Active Directory - hostname1 = adsrv1.domain.com
  • Active Directory - hostname2 = adsrv2.domain.com
  • Documentum - repository (docbase) = REPO


I. Active Directory prerequisites


As always when working with Kerberos on an Active Directory, the first thing to do is to create a user. So let's create this user with the following properties:

  • User name: cskrb
  • Support AES 128 bits encryption
  • WARNING: This account MUST NOT support AES 256 bits encryption
  • Trust for Delegation to any service (Kerberos Only)
  • Password never expires
  • Account never expires
  • Account not locked


Once the user has been created, you can proceed with the keytab creation using the command prompt on the Active Directory host:

cs_keytab.png
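
The original screenshot shows the ktpass command used to generate the keytab. As a hedged sketch (the SPN name CS/REPO and the exact options are assumptions based on the properties defined above, adapt them to your environment), it typically looks like this:

ktpass /out REPO.keytab /princ CS/REPO@DOMAIN.COM /mapuser cskrb@DOMAIN.COM /pass ##cskrb_pwd## /crypto AES128-SHA1 /ptype KRB5_NT_PRINCIPAL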


According to the documentation of EMC, you can create one keytab with several keys inside for the Documentum repositories. Actually, that's wrong! It's not possible in the Microsoft world to generate a keytab with more than one Service Principal Name (SPN) in it, only the Linux implementations of Kerberos allow that. If you try to do so, your Active Directory may loop forever trying to add a second SPN to the keytab. That will considerably slow down your Active Directory and it may even crash...


If you want to setup the Kerberos SSO for more than one repository, you will have to create one user per repository and generate one keytab per user. So just repeat these two steps above for each repository, replacing the user name, user password and repository name... What is possible with an Active Directory is to map more than one SPN to a user. That can be useful for a Load Balancer setting for example but the keytab will always contain one SPN and therefore it seems that this solution isn't suitable for the Content Server.
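
For example (the SPN alias below is purely hypothetical, for illustration only), an additional SPN can be mapped to the same user with setspn:

setspn -S CS/myalias cskrb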


The second remark here is that the EMC documentation often uses only DES encryption for the keytab but, as shown above, you can of course specify the encryption to use or simply specify "ALL" to add all possible encryptions to this keytab. By default Kerberos will always use the strongest encryption available. In our case, as the Content Server doesn't support AES 256 bits encryption, the AES 128 bits encryption will be used instead.


II. Configuration of the Content Server side


So let's start the configuration of Kerberos SSO for the Content Server. The first thing to do is of course to transfer the keytab created previously (REPO.keytab) from the Active Directory to the Content Server's host. The CS host can be a Windows or a Linux server; it doesn't matter as long as the server is part of your enterprise network (and properly configured). In this post, I will use a Linux server because we usually install Documentum on Linux.


During the installation of the Content Server, the installer creates some default authentication folders, some security elements, and so on. Therefore, you have to put the newly created keytab in a specific location for the Content Server to automatically recognize it. Please make sure that the keytab belongs to the Documentum installation owner (user and group) on the file system with the appropriate permissions (640). The correct location is:

$DOCUMENTUM/dba/auth/kerberos/
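
To double-check what actually ended up in the keytab (assuming the MIT Kerberos client tools are installed on the host), you can list its entries and encryption types:

klist -kte $DOCUMENTUM/dba/auth/kerberos/REPO.keytab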



Then create the file "/etc/krb5.conf" with the following content:

[libdefaults]
noaddresses = true
udp_preference_limit = 1
default_realm = DOMAIN.COM
default_tgs_enctypes = aes128-cts arcfour-hmac-md5 des-cbc-md5 des-cbc-crc rc4-hmac
default_tkt_enctypes = aes128-cts arcfour-hmac-md5 des-cbc-md5 des-cbc-crc rc4-hmac
permitted_enctypes = aes128-cts arcfour-hmac-md5 des-cbc-md5 des-cbc-crc rc4-hmac
dns_lookup_realm = true
dns_lookup_kdc = true
passwd_check_s_address = false
ccache_type = 3
kdc_timesync = 0
forwardable = true
ticket_lifetime = 24h
clockskew = 72000

[domain_realm]
.domain.com = DOMAIN.COM
domain.com = DOMAIN.COM
adsrv1.domain.com = DOMAIN.COM
adsrv2.domain.com = DOMAIN.COM

[realms]
DOMAIN.COM = {
master_kdc = adsrv1.domain.com:88
kdc = adsrv1.domain.com:88
kpasswd = adsrv1.domain.com:464
kpasswd_server = adsrv1.domain.com:464
kdc = adsrv2.domain.com:88
kpasswd = adsrv2.domain.com:464
kpasswd_server = adsrv2.domain.com:464
}

[logging]
default = /var/log/kerberos/kdc.log
kdc = /var/log/kerberos/kdc.log

[appdefaults]
autologin = true
forward = true
forwardable = true
encrypt = true



You can of course customize this content with what is suitable for you, depending on your environment. Moreover, if the file "/etc/krb5.conf" already exists and you don't want to modify it, or if you can't modify it, then you can still create this file wherever you want. For example, create the folder "$DOCUMENTUM/kerberos/" and put this file inside. Then edit the file "~/.bash_profile" and add the following line to it to reference this new location:

export KRB5_CONFIG=$DOCUMENTUM/kerberos/krb5.conf



Once done, simply restart your ssh session or source the file "~/.bash_profile" for the new environment variable to be available.


The last thing to do is to refresh the Kerberos configuration for the Content Server to know that a keytab is available on the File System (and therefore enable the dm_krb authentication plugin). This process is known as the re-initialization of the Content Server. There are two main ways to do it: with or without Documentum Administrator (DA). When using DA, you can simply click on a button while the Content Server is running and that's the main advantage of this method. If you don't have a DA installed, then I guess you will have to reboot the Content Server for the changes to take effect.


To re-initialize the Content Server using DA, here is what needs to be done:

  • Open DA in Internet Explorer
  • Log in to the repository "REPO" with the account of the Documentum installation owner or any account with sufficient permissions
  • Expand the "Basic Configuration item"
  • Click on "Content Servers"
  • Right-click on the repository you are connected to
  • Click on "Properties"
  • Check "Re-Initialize Server"
  • Click on "Ok"


Once done, you should be able to confirm that the reload of the dm_krb plugin was successful by checking the log file of the repository:

$DOCUMENTUM/dba/log/REPO.log



If everything goes well, you will see some lines showing that the Content Server was able to parse the keytab successfully. In a future blog post, I will certainly explain how to configure Kerberos SSO for the DFS part. Stay tuned!


bulk loading semi structured data in postgresql

Yann Neuhaus - Sun, 2015-05-10 10:00

The last post took a look at how to efficiently load 1m rows into a table in PostgreSQL. In this post I'll take a look at how to do the same with semi structured data.

D2 performance issues due to KB3038314 IE patch

Yann Neuhaus - Sun, 2015-05-10 08:58

I ran into a strange issue at a customer site. When trying to open a huge VD in D2’s right panel, the browser freezes.

It seems to be due to an Internet Explorer security patch which introduces huge performance issues. So if you run into strange issues concerning your web browser, check the patch level of IE. The security patch which causes the issue is KB3038314.

Alfresco: some useful database queries

Yann Neuhaus - Sun, 2015-05-10 04:50


In my previous post, I talked about the Lifecycle of Alfresco Nodes. You may have noticed that I tried to insert in my explanations some elements that are specific to databases (tables, fields, and so on). These elements are quite essential to prepare a post like this one, which is more database oriented. I already explained what exactly the consequences are on the database side when a node is removed, and in this post I will try to share some useful queries regarding these points, but not only!


For this post, I used my local Alfresco Community 4.2.c installation with a PostgreSQL database. For your information, it just takes 30 minutes to get this test environment ready with the Alfresco installer (Windows, Mac or Unix). Of course, using the database directly for your daily administration work is certainly not the best idea, but in some cases it can really be faster and easier to just run some SQL commands at the DB level...


I. Document information


So let's start this post with some generic queries that can be used to retrieve some information about documents. In this part, all columns of the results will be the same because I just pick the same fields in my queries, but the filter part (the WHERE clause) changes a little bit to retrieve information from different elements.

The first command I would like to show you is how to retrieve some information about documents based on the size of the content. Here, I just uploaded the document "Test_Lifecycle.docx" with a size of 52MB. So based on that, let's say that I want to retrieve all elements in my Alfresco installation with a content that is bigger than 40MB. In the same way, you can select all elements with a content that is smaller than XX MB, or between XX and YY MB. The conversion to MB is done using the round() function. Therefore, if you want this value to be in KB instead, just remove one division by 1024 in each round() function:

All documents bigger than 40MB
SELECT n.id AS "Node ID",
       n.store_id AS "Store ID",
       round(u.content_size/1024/1024,2) AS "Size (MB)",
       n.uuid AS "Document ID (UUID)",
       n.audit_creator AS "Creator",
       n.audit_created AS "Creation Date",
       n.audit_modifier AS "Modifier",
       n.audit_modified AS "Modification Date",
       p1.string_value AS "Document Name",
       u.content_url AS "Location"
FROM alf_node AS n,
     alf_node_properties AS p,
     alf_node_properties AS p1,
     alf_namespace AS ns,
     alf_qname AS q,
     alf_content_data AS d,
     alf_content_url AS u
WHERE n.id=p.node_id
  AND ns.id=q.ns_id
  AND p.qname_id=q.id
  AND p.long_value=d.id
  AND d.content_url_id=u.id
  AND p1.node_id=n.id
  AND p1.qname_id IN (SELECT id FROM alf_qname WHERE local_name='name')
  AND round(u.content_size/1024/1024,2)>40
ORDER BY u.content_size DESC;



I will just put it once but here is the result of this command in this case:

 Node ID | Store ID | Size (MB) |          Document ID (UUID)          | Creator |         Creation Date         | Modifier |       Modification Date       |    Document Name    |                            Location                            
---------+----------+-----------+--------------------------------------+---------+-------------------------------+----------+-------------------------------+---------------------+----------------------------------------------------------------
  131856 |        6 |     52.00 | eb267742-c018-4ba5-8ca4-75ca23c860f0 | Morgan  | 2015-04-30T12:05:50.613+02:00 | Morgan   | 2015-04-30T12:05:50.613+02:00 | Test_Lifecycle.docx | store://2015/4/30/12/5/0e111f05-7fcf-4a44-b719-b94cd04dd5ab.bin


So why did I select these fields?!

  • Node ID: can be useful to join different tables
  • Store ID: a Store ID of 6 means that your document is in its active life. A Store ID of 5 means that this document has been deleted by a user and is now in the global trashcan
  • Size (MB): what we are searching for...
  • Document ID (UUID): the unique identifier of this document. The simplest way to preview this document is just to open the following url in any browser: http://HOSTNAME:PORT/share/page/document-details?nodeRef=workspace://SpacesStore/eb267742-c018-4ba5-8ca4-75ca23c860f0 (workspace://SpacesStore for store_id=6)
  • Creator, Modifier, Dates: well...
  • Document Name: can be useful to know the type of document without opening an URL (file extension)
  • Location: the actual location of the content's file on the File System. The "store://" refers to $ALF_DATA/contentstore/


The second command I would like to show you is how to retrieve some information based on the actual UUID of a document. As explained above, the UUID of a document can be found in the URL of its details page:

A document using its UUID
SELECT n.id AS "Node ID",
       n.store_id AS "Store ID",
       round(u.content_size/1024/1024,2) AS "Size (MB)",
       n.uuid AS "Document ID (UUID)",
       n.audit_creator AS "Creator",
       n.audit_created AS "Creation Date",
       n.audit_modifier AS "Modifier",
       n.audit_modified AS "Modification Date",
       p1.string_value AS "Document Name",
       u.content_url AS "Location"
FROM alf_node AS n,
     alf_node_properties AS p,
     alf_node_properties AS p1,
     alf_namespace AS ns,
     alf_qname AS q,
     alf_content_data AS d,
     alf_content_url AS u
WHERE n.id=p.node_id
  AND ns.id=q.ns_id
  AND p.qname_id=q.id
  AND p.long_value=d.id
  AND d.content_url_id=u.id
  AND p1.node_id=n.id
  AND p1.qname_id IN (SELECT id FROM alf_qname WHERE local_name='name')
  AND n.uuid='eb267742-c018-4ba5-8ca4-75ca23c860f0';



Another possible command would be to find some information based on the File System location. That can be useful for example if there is a big document on the File System and you want to know the type of this document with the extension, its name or maybe some other information about the creator/modifier:

A document using its path on the File System
SELECT n.id AS "Node ID",
       n.store_id AS "Store ID",
       round(u.content_size/1024/1024,2) AS "Size (MB)",
       n.uuid AS "Document ID (UUID)",
       n.audit_creator AS "Creator",
       n.audit_created AS "Creation Date",
       n.audit_modifier AS "Modifier",
       n.audit_modified AS "Modification Date",
       p1.string_value AS "Document Name",
       u.content_url AS "Location"
FROM alf_node AS n,
     alf_node_properties AS p,
     alf_node_properties AS p1,
     alf_namespace AS ns,
     alf_qname AS q,
     alf_content_data AS d,
     alf_content_url AS u
WHERE n.id=p.node_id
  AND ns.id=q.ns_id
  AND p.qname_id=q.id
  AND p.long_value=d.id
  AND d.content_url_id=u.id
  AND p1.node_id=n.id
  AND p1.qname_id IN (SELECT id FROM alf_qname WHERE local_name='name')
  AND u.content_url='store://2015/4/30/12/5/0e111f05-7fcf-4a44-b719-b94cd04dd5ab.bin';



II. Number of...


From a reporting point of view, let's say that you need some information regarding the number of... something. In this case, if you want to use your DB directly, then there is a really simple solution (simple, but is it the best?) because Alfresco provides a database schema that is quite simple to understand and to use to get what you need. Indeed, if you take a look at the "alf_qname" table, you will see that every element that is part of Alfresco has its QName listed here. A QName is the Qualified Name of a repository item. This can be seen as a kind of "Super-Type":

alfresco=> SELECT * FROM alf_qname;
 id | version | ns_id | local_name
----+---------+-------+------------
  1 |       0 |     1 | store_root
  2 |       0 |     1 | aspect_root
  3 |       0 |     1 | container
  4 |       0 |     1 | children
  5 |       0 |     2 | user
...
 24 |       0 |     6 | folder
...
 51 |       0 |     6 | content
...
133 |       0 |     6 | thumbnail
134 |       0 |    13 | rendition
...


As you can see above, if you are searching for something that has a content, it can be done quite easily using the id or the local_name that corresponds to it. So based on this table, here are some queries that can be useful:

Retrieve the number of users in the Repository
SELECT count(*)
FROM alf_node AS n,
     alf_qname AS q
WHERE n.type_qname_id=q.id
  AND q.local_name='user';

 

Retrieve the number of elements with a content in the Repository (include system's documents)
SELECT count(*)
FROM alf_node AS n,
     alf_qname AS q
WHERE n.type_qname_id=q.id
  AND q.local_name='content';

 

Retrieve the number of thumbnails in the Repository
SELECT count(*)
FROM alf_node AS n,
     alf_qname AS q
WHERE n.type_qname_id=q.id
  AND q.local_name='thumbnail';

 

Retrieve the number of renditions in the Repository
SELECT count(*)
FROM alf_node AS n,
     alf_qname AS q
WHERE n.type_qname_id=q.id
  AND q.local_name='rendition';



Of course you can do that for all QNames but you can also be more precise! So based on the query to retrieve the number of elements with a content, if you only want the number of documents of a specific type, then you can simply complete your query:

Retrieve the number of XML documents in the Repository
SELECT count(*)
FROM alf_node AS n,
     alf_qname AS q,
     alf_node_properties AS p
WHERE n.type_qname_id=q.id
  AND p.node_id=n.id
  AND p.qname_id IN
    (SELECT id
     FROM alf_qname
     WHERE local_name='name')
  AND q.local_name='content'
  AND p.string_value LIKE '%.xml';

 

Retrieve the number of PDF documents in the Repository
SELECT count(*)
FROM alf_node AS n,
     alf_qname AS q,
     alf_node_properties AS p
WHERE n.type_qname_id=q.id
  AND p.node_id=n.id
  AND p.qname_id IN
    (SELECT id
     FROM alf_qname
     WHERE local_name='name')
  AND q.local_name='content'
  AND p.string_value LIKE '%.pdf';


As the creation date, creator, modification date and modifier information are also stored in the "alf_node" table, you can also very easily filter your queries based on the creation/update date of an Alfresco node, as shown below. That's pretty cool, right?! ;)
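
For example, building on the previous query (the cut-off date is just an illustration; the audit_created column stores ISO formatted strings, so a plain string comparison works):

Retrieve the number of PDF documents created since the 1st of May 2015
SELECT count(*)
FROM alf_node AS n,
     alf_qname AS q,
     alf_node_properties AS p
WHERE n.type_qname_id=q.id
  AND p.node_id=n.id
  AND p.qname_id IN
    (SELECT id
     FROM alf_qname
     WHERE local_name='name')
  AND q.local_name='content'
  AND p.string_value LIKE '%.pdf'
  AND n.audit_created > '2015-05-01';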


III. Lifecycle specific


To complete the relation between this blog post and the previous one, I wanted to share some queries that can be used to identify the current state of a document. As explained in my previous post, a document that is not yet deleted will be in the store named "workspace://SpacesStore". A document that has been deleted by a user will be in the store named "archive://SpacesStore", and when this document is removed from the global trashcan, the orphan_time is set to the current timestamp. With all this information and with the "alf_node" and "alf_content_url" tables, we can easily build our own queries to find what is needed.

alfresco=> SELECT * FROM alf_store;
 id | version | protocol  |       identifier        | root_node_id
----+---------+-----------+-------------------------+--------------
  1 |       1 | user      | alfrescoUserStore       |            1
  2 |       1 | system    | system                  |            5
  3 |       1 | workspace | lightWeightVersionStore |            9
  4 |       1 | workspace | version2Store           |           10
  5 |       1 | archive   | SpacesStore             |           11
  6 |       1 | workspace | SpacesStore             |           12
(6 rows)


So let's find all documents that have been created and aren't deleted yet:

All documents created in their active life
SELECT *
FROM alf_node
WHERE store_id=6
  AND type_qname_id=51;


The next step on the lifecycle is when the documents have been deleted by a user but aren't deleted from the global trashcan (orphan_time is still NULL):

All documents created that are in the global trashcan (deleted by users)
SELECT *
FROM alf_node
WHERE store_id=5
  AND type_qname_id=51;


Finally, when the documents are removed from the global trashcan, some references/fields are removed, the QName of these documents changes from "content" (51) to "deleted" (140) in the "alf_node" table, and the orphan_time is set to the current timestamp in the "alf_content_url" table:

All elements that have been removed from the global trashcan and that are now orphaned
SELECT *
FROM alf_content_url
WHERE orphan_time IS NOT NULL;



I hope you enjoyed this blog post because it was quite hard for me to write something about database queries without giving up my soul to the DB world! See you soon ;).


Oracle things that piss me off (pt 2) - No Direction

Gary Myers - Sun, 2015-05-10 00:33
The SQL Developer team has been chugging forward with its SQL Command Line (sqlcl) tool.

As a developer, I understand where they are coming from. SQL Developer benefited from being able to run scripts built for the SQL*Plus command line tool. Then there's the temptation to add a few more useful titbits to the tool. And if it is built 'properly', then it would be relatively easy to decouple it from the GUI and have it as a stand-alone. 

BUT.....

where's the big picture ?

I'm pretty sure (but happy to be corrected) that "SQL Developer" is part of the 12.1 database installation. It is certainly referenced in the guides. So I'd assume that the next 12.2 release will have "SQL Developer", the "sqlcl" command line tool and SQL Plus. I couldn't guess whether sqlplus will be offered as a last gasp, "to be deprecated" option or whether the long term plan is to supply two SQL command line tools.

Unix/Linux users are probably used to something similar, as they generally have the options of different shells, such as bash, ksh, csh etc. But to remedy any confusion, scripts are generally written with a shebang so it can automatically work out which of the available shells it should use.

What DBAs are most likely to end up with is a script for which they'll have to guess whether it is aimed at sqlplus or sqlcl (or, if they are lucky, a comment at the start of the code).

Having the clients "sort of" compatible makes it worse. It is harder to tell what it is aimed at, and what might go wrong if the incorrect client is used. Plus opting for compatibility perpetuates some of the dumb crud that has accumulated in sqlplus over the decades.

For example:
This is an SQL statement:
SET ROLE ALL;
This is a directive to the SQLPlus client
SET TIMING ON
You could tell the subtle difference between the SET as SQL statement and SET as sqlplus directive by the semi-colon at the end. Except that both sqlplus and sqlcl will happily accept a semicolon on the end of a 'local' SET command.

If you think it is hard keeping track of what commands are processed by the database, and what are processed by the client, we also have commands that do both.



16:01:49 SQL> select sysdate, to_char(sysdate), cast(sys_context('USERENV','NLS_DATE_FORMAT') as varchar2(20)) dt_fmt,
  2          cast(sys_context('USERENV','NLS_CALENDAR') as varchar2(20)) cal
  3          from dual;

SYSDATE                 TO_CHAR(SYSDATE)   DT_FMT               CAL
----------------------- ------------------ -------------------- --------------------
10/MAY/15               10/MAY/15          DD/MON/RR            GREGORIAN


16:02:35 SQL> alter session set nls_date_format = 'DD/Mon/YYYY';

Session altered.

16:02:40 SQL> select sysdate, to_char(sysdate), cast(sys_context('USERENV','NLS_DATE_FORMAT') as varchar2(20)) dt_fmt,
  2          cast(sys_context('USERENV','NLS_CALENDAR') as varchar2(20)) cal
  3          from dual;

SYSDATE            TO_CHAR(SYSDATE)     DT_FMT               CAL
------------------ -------------------- -------------------- --------------------
10/May/2015        10/May/2015          DD/Mon/YYYY          GREGORIAN

To clarify this, the statement returns one column as a DATE, which will be converted to a string by the client according to its set of rules, and one column as a string converted from a DATE by the database's set of rules.
The ALTER SESSION has been interpreted by both the client AND the server.
This becomes obvious when we do this:
16:02:44 SQL> alter session set nls_calendar='Persian';
Session altered.
16:06:22 SQL> select sysdate, to_char(sysdate),
  2          cast(sys_context('USERENV','NLS_DATE_FORMAT') as varchar2(20)) dt_fmt,
  3          cast(sys_context('USERENV','NLS_CALENDAR') as varchar2(20)) cal
  4          from dual;

SYSDATE                 TO_CHAR(SYSDATE)       DT_FMT               CAL
----------------------- ---------------------- -------------------- ----------
10 May       2015       20 Ordibehesht 1394    DD Month YYYY        Persian
The database knows what to do with the Persian calendar, but the sqlcl client didn't bother. SQLPlus copes with this without a problem, and can also detect when the NLS_DATE_FORMAT is changed in a stored procedure in the database rather than via ALTER SESSION. I assume some NLS values are available/fed back to the client via OCI.
If I was going for a brand-new SQL client, I'd draw a VERY strong line between commands meant for the client and commands intended for the database (maybe a : prefix, reminiscent of vi). I'd also consider that some years down the track, I might be using the same client to extract data from the regular Oracle RDBMS, their mySQL database, a cloud service.... 
To be honest, I'd want one tool that is aimed at deploying DDL to databases (procedures, new columns etc) and maybe data changes (perhaps through creating and executing a procedure). A lot of the rest would be better off supplied as a collection of libraries to be used with a programming language, rather than as a client tool. That way you'd get first class support for error/exception handling, looping, conditions....
PS. When it comes to naming this tool, bear in mind this is how the XE install refers to the SQL Plus client:


YouTube Sunday : Troubleshoot Fusion Middleware Pre-Requisite Failure : Kernel Setting

Online Apps DBA - Sat, 2015-05-09 19:13
We’ve started our YouTube channel covering videos related to Oracle Apps, Fusion Middleware, Fusion Applications, and database (subscribe to our channel by clicking the link above to get the latest videos). We’ll be posting videos every Sunday, and this week’s video is on how to fix an Oracle Fusion Middleware installation pre-requisite failure related to a kernel setting. […] The post YouTube Sunday :...

This is a content summary only. Visit my website http://onlineAppsDBA.com for full links, other content, and more!
Categories: APPS Blogs