Pythian Group
Perl and Server Management Objects (SMO)?
No, I do not have squiggly worms in my head, and no, I haven’t gone over to the dark-side. It’s just that I had an opportunity over the past few days to attempt to use Microsoft’s Server Management Objects (SMO) with Perl to manage a SQL Server 2005 DB.
To make a long story into a short post, I blundered into the Win32::CLR module on CPAN, a little gem from Toshiyuki Yamato.
Here is all you need to get started.
- Perl 5.10 (Activestate and Strawberry Perl both work fine.)
- The Win32::CLR module.
- MS Visual C++ Express 2005 or 2008 (it is free and you can get it here: www.microsoft.com/express/download/#webInstall)
- Open a DOS prompt from Visual C++ and Makefile and the nmake install Win32::CLR.
Once you have a working version of Win32::CLR in your Perl installation, you should check that you have the SMO objects installed on you computer. These are normally stored in c:\Program Files\Microsoft SQL Server\100\SDK\Assemblies. The DLL we are interested in is called Microsoft.SqlServer.Smo.dll. If it is not there, you will most likely have to get the latest Service Pack from Microsoft.
Finally, so that Perl can find it, you will have to make sure that you put the directory of the DLL in the path.
Once you have jumped these little hurdles, you can then get down to coding. As good Perl programmers, we start with:
use strict;
and then use our CLR mod:
use Win32::CLR; use utf8; binmode STDOUT, ":encoding(Shift_JIS)"; # japanese character set binmode STDERR, ":encoding(sjis)"; # if japanese windows
(I am not sure why we have to use the binmode here, so if you want an answer you will have to ask Toshi.)
We then load the SMO DLL . . .
my $smo= Win32::CLR->load_from("Microsoft.SqlServer.Smo.dll");
. . . and create a server object that we can play with.
my $svr = Win32::CLR->create_instance("Microsoft.SqlServer.Management.Smo.Server");
my $conn=$svr->get_property("ConnectionContext");
$conn->set_property("LoginSecure",0);
$conn->set_property("DatabaseName","xxx");
$conn->set_property("Login","xxx") ;
$conn->set_property("Password","xxx");
The above will set up the connection for us and then . . .
print "Edition=".$svr->get_property("Edition")."\n";
. . . will connect to the server and print out the Edition details. So if you run it you will see:
Edition=Enterprise Evaluation Edition (64-bit)
Now for the big question: Why?
For one thing, ‘SQL-DMO’, which I have been using with Win32::OLE, is on the way out for any SQL Server edition after 2005.
Second, I though I would give PowerShell a try as a replacement for my Perl Scripts. But, after a few hours trying to get my head around it, I found it was just going to be easier to use Perl for what I had to do.
I think that anyone using PowerShell might find that the above Perl solution will pay off in the long run, as any program you write in Perl can be fully modular or even object oriented in design, whereas PowerShell’s OO is more limited.
As a bonus, I found that the monitoring footprint (disk-space, memory usage, CPU usage) on my server was smaller with Perl than with PowerShell, and very much smaller than a similar C# .Net app.
Of course there are some limitations in using Win32::CLR and SMO, but nothing insurmountable. You will have to write a number of little helper subroutines because Win32::CLR exposes only the Base Class Library of objects, and not the many wrappers that one is more familiar with.
But that is another story—one that I’ll tell in an upcoming post.
BXB9HQNVFAAR
Blogrotate #7: The Weekly Roundup of News for System Administrators
Is it Friday already? Where has the week gone? Whatever, we’ve got lots of good news tidbits for you this week, including several follow-ups to previous stories. Enough jaw-jacking, let’s get to the news.
Operating SystemsThis week we got an early alpha of Google Chrome OS, which is slated for full release sometime in Q4 of 2010. ZDNet blogs and Ars technica have three good first looks at Chrome. First up is Adrian Kingsley-Hughes article Chrome OS – The good, the bad and the ugly, and how it fits in with Windows, Mac and Linux.
For a more security related view, Ryan Naraine has an early look into Chrome OS security with Inside the Google Chrome OS security model.
Lastly, Jon Stokes at Ars Technica has his own first look with screenshots in Chrome OS: Internet failing at PC > PC failing at Internet.
There’s a new Fedora in town. The popular Linux desktop put out by Red Hat released its newest version. Bill has gone crazy for Fedora articles today, so here’s a point form list of what he found.
- Fedora 12 Release Announcement. The announcement has links to the feature list, release notes, download page, etc.
- The Red Hat press release about Fedora 12 called Fedora 12 Unites Latest Features and Usability Into Compelling Free Distribution.
- Tip of the hat: Fedora 12 a strong update by Ryan Paul goes through some initial impressions of Fedora 12 and what it brings to the table.
- Christopher Smart at Linux Magazine has his own first look in Fedora, Still Pushing The Envelope.
- And finally the first major bug has been reported and fixed in Fedora 12. Fedora 12 lets unprivileged users install packages has the initial information about the bug, and PackageKit change has information on the fix.
Have you ever wondered just how much you can take out of Windows and still have a usable system? The Minwin project set out to find that out for you. Warren Rumak discusses Minwin and what it’s all about in Inside “MinWin”: the Windows 7 kernel slims down.
Data CentersScuttlemonkey at SlashDot posted a question about how to evaluate a datacenter. This question has elicited a flurry of discussion on the topic with some good (and bad) stories, but buried inside are many good thoughts on criteria that would be useful in any evaluation. Read more in “How Do You Evaluate a Data Center?“.
Data Center Knowledge has an interesting article by Kevin Normandeau. It’s all about how a greener datacenter can pay off in the long run. “…Amazon, Toyota, and Nike, have realized that focusing on limiting energy calories in the datacenter and elsewhere pays profitability dividends on the financial side” says the article, which draws on a whitepaper from IDC on the subject.
Also at Data Center Knowledge, Rick Miller has a note about Rackspace expanding its headquarters with a new 120,000 square foot expansion. See Rackspace Expands Its Headquarters for more, and a video tour of one of their offices.
HardwareCnet news has an article about the recently released Square Trade survey of laptop reliability. Square Trade is a warranty provider who offer coverage for many brands of laptops so they should know. See Who makes the most reliable laptops for some excerpts from the report, and the full report can be found in PDF format from the Square Trade website.
SecurityA follow up from last week regarding the Microsoft “sudo” patent. According to Ryan Paul this patent does not cover sudo at all. Read more about it in Microsoft’s pseudo sudo patent doesn’t really cover sudo.
A follow up to a post from a couple of weeks ago—there is now an exploit for the mentioned SSL/TLS attack vector. Dan Goodin at The Register has more in his article Researcher busts into Twitter via SSL reneg hole. More technical details on the exploit can be found in Understanding the TLS Renegotiation Attack.
SoftwareA follow up to last week’s article, Microsoft confirms the Windows 7 USB/DVD download tool contained GPLv2 code and has indicated that they will provide the source/binaries for the tool under the GPL. Peter Galli from Microsoft’s Open Source division has more in Update on the Windows 7 USB/DVD Tool.
InternetThe UN-backed Internet Governance Forum was on last week in Egypt. One of the hot topics was ownership of the DNS root domain, which is currently controlled by the US. This made sense when the internet was only in the US universities, but with the global nature of the ‘Net, there is no good reason for a single country to have control of something that controls the basic functionality of the internet. Janna Quitney Anderson has more in IGF attendees: America, surrender the root zone file!
VirtualizationA follow-up review on VMWare Fusion 3, running Windows 7 in OSX. Dave Girard has put Fusion through its paces; read about his results in Running Windows 7 under OS X: Ars reviews VMware Fusion 3.
That’ll do it for this week’s edition. As always feel free to add your own news or perspective in the comments. See y’all next week!
Log Buffer #170: a Carnival of the Vanities for DBAs
This is the 170th edition of Log Buffer, the weekly review of database blogs. Welcome. Let’s kick off this week with a double-helping of . . .
SQL ServerThere are lots of good technical posts this week. The SSIS Junkie has some observations and a straw poll on sort transform arbitration. He writes, “This post was prompted by a thread on the MSDN SSIS forum today where the poster was asking how he could replicate the behaviour of SSIS’s Sort transform using T-SQL, specifically he wanted to know how the Sort transform chooses what data to pass through when the ‘Remove Duplicates’ option is checked.”
Another poll, courtesy of Tibor Karaszi: do you perform log backup for the model database?
Eric Johnson has a lesson in looping through rows in a table in SSIS 2008, which begins, “When writing code against a SQL Server, as we usually are doing in SSIS Packages, you often need to iterate over all the rows in a table. This can be done using an SSIS Foreach Loop Container, but the how is not obvious.”
Simple-Talk’s Tony Davis wonders, Do Scalar UDFs give SQL Server a Bad Name? “Many developers seem to regard SQL Server as if it were a science-fiction alien planet where unsuspecting crew-members in blue jumpers occasionally die horribly; everything is suddenly unsafe, and potentially malicious: nothing really works properly and so any serious code should be kept well away from it. Is this developer ignorance, or is their fear justified?”
From Merrill Aldrich comes a trick question — part quattro. Spoiler: TPH is an evil trap. As a commenter says: “Very interesting, I’ve never heard this vehement an argument against TPH before.”
Thinking outside the box lets us in on how to tell if you are running on a virtual environment, with a handy little bit of code.
Kalen Delaney elucidates UPDATE Locks, “ . . . a hybrid of SHARED and EXCLUSIVE locks. [Contrary] to what you might think, UPDATE locks are not just acquired for UPDATE operations.”
Stephen Forte shows how SQL Server R2 Does SQL Azure.
Now that PASS 2009 in Seattle, Washington has passed, it’s time to fondly look back on it. Kendal Van Dyke shares his experiences and some photos in Looking Back – PASS Summit 2009 Day 4.
The Rambling DBA, Jonathan Kehayias does so with the benefits of attending PASS realized: “ . . . as a testament to the value of attending this conference the very first session I attended, diagnosed and provided information for a problem that has existed in one of my servers for many weeks but was impossible to diagnose unless you knew what you were looking at.”
Thomas LaRock, SQL Rockstar concurs: “If you want to grow your skills, then you need to connect, learn, and share with others. And there is no better place to do that than at PASS.”
Not that PASS was the only game in town. Just down the coast a bit, Baron Schwartz gives his recap of Portland OpenSQL Camp 2009.
Selena Marie Deckelmann reports that OpenSQLCamp was awesome!
PostgreSQLPavel Stehule has some news of a longtime plpgsql misfeature removed. He writes, “plpgsql is good language – simple, robust with good error diagnostic. But it had one bizarre behave. plpgsql connects two worlds – procedural ADA like code and SQL statements. Usually there are not problems. But there are one exception – collision of identifiers.”
In the latest in his Waiting for 8.5 series, Hubert Lubacziewski introduces and tests something new: TRIGGERS on columns.
MySQLOn code.openark.org, Shlomi Noach is surprised by questions or queries. He writes, “I was used to checking for the ‘questions’ global status variables . . . So, for example, I could run com_select/questions to learn the SELECT ratio out of all queries. . . . Apparently, as of 5.0.72-5.0.76 & 5.1.31 this has changed. A new status variable was introduced, called ‘queries’.” What’s the difference? Is this good or bad? Shlomi and his readers kick it around.
Venu Anuganti also was surprised, in his case by InnoDB Tablespace Corruption: “When . . . InnoDB crashes, it automatically recovers during the next start by rolling back/forward based on what was pending and un-flushed/un-committed changes at the time of crash. . . . [On] one of the servers; we ran out of disk space . . . on data directory . . . and server was running for few hours in this mode . . . [It] became un-available and not responding after a while. Only option left was to kill the server process and its PID along with cleaning the stuff to get the space back. After I (re)started the server . . . the tablespace is corrupted.”
From Arnold Daniels comes a version of versioning MySQL data, which Arnold introduces thus: “ . . . You’re probably using a versioning control system . . . to safeguard your data. Advantages of using a VCS are that you can walk to the individual changes for a document, see who made each change and revert back to specific revision if needed. These are features which would also be nice for data stored in a database. With the use of triggers we can implement versioning for data stored in a MySQL db.” Example code follows.
On the MySQL Perfomance Blog, Morgan Tocker qualifies his earlier piece on why you don’t want to shard: “What I didn’t mention was that if you’ve established that you will need to eventually shard, is it better to just get it out of the way early? My answer is almost always no. That is to say I disagree with a statement I’ve been hearing recently; ’shard early, shard often’”
Morgan also mentions that interviews for InfiniDB and TokuDB are next: “I’d like to announce that Robert Dempsey (InfiniDB storage engine) and Bradley C. Kuszmaul (TokuDB storage engine) have also accepted an interview. If you have any questions about either storage engine, please post them here by Friday 20th November.”
Here’s a new blog to watch, particularly if you’re new to MySQL or Drizzle: Kent Bozlinski’s Learning Drizzle. Kent says, “I’m not really scared of rain after living in Seattle for seven years. I am a little scared of sticking my neck out and writing about something which (for the moment) I know almost nothing about.” The post is Drizzle is Scary (A Little).
OracleMaybe you’ve already heard about the fabulous unpopularity of the new My Oracle Support. Daniel Fink comes back with some data and commentary on just that, with his My Oracle Support Survey Results.
Richard Foote asks, An index only performs how much work???, the result of looking into exactly why index rebuilds can improve performance so significantly.
Kerry Osborne gives a lesson on fixing bad index hints in SQL Profiles (automatically). He says, “With 10g and 11g, it appears the goal [or Outlines] has swung away from the “locking” concept and towards allowing the optimizer more flexibility. . . . I must say that I find this decision to be irritating at best. . . . One of the main offenders in this regard is the use of a new format available for index hints as of 10g.”
Inside the Oracle Optimizer covers similar turf in answering to the question, What should I do with old hints in my workload?, or more specifically, “When moving from 10g to 11g, should hints in existing SQL be removed?”
Tom Kyte wants your opinions on comparative window functions: “ . . . they could be getting better in the near future. . . . analytics [could be allowed] to access the current row value to be compared against any other row value in a defined window. . . . I’ve already supplied them with my feedback (which started with “this is an awesome idea”) – and you can too – by posting it here. They’ll be checking back to see what you say.”
If you like staying on top of fresh things, perhaps John Piwowar’s method of retrieving Oracle patches with wget would also appeal to you.
And that is all for now. If you think I’ve missed a worthwhile DB blog from this week, please mention it in a comment.
Log Buffer will be back in a week’s time. See you then!
Upgrading to Fedora 12? You might need more /boot space!
Today, I had a spare Fedora 11 machine sitting next to me, so I thought I’d try the upgrade to the newly-released Fedora 12, aka “Constantine.” Fedora support cycles are rather short compared to Ubuntu, so Fedora 11 will likely be de-supported in 6 to 7 months. Normally I’d wait a little longer into the Fedora 12 cycle for others to find the fun upgrade bugs and have them fixed for me, but I didn’t mind having to re-install from scratch on this machine if I needed to.
Following the Fedora documentation, I decided to use the “preupgrade” tool. Everything was going smoothly until the machine restarted to begin installation of the new packages. I got a message that there wasn’t enough space in my /boot partition. Specifically, the message claimed that there was insufficient disk space in /mnt/sysimage/boot. I found this rather odd and troubling, since I had let the Fedora installer determine the /boot partition size when I originally installed Fedora 11.
Turns out that this is a known problem with the preupgrade tool. A kind soul in #fedora on IRC directed me to the list of common Fedora 12 bugs, in particular the preupgrade free space check. I installed the updated preupgrade package as directed, but again got the error. That’s when I followed the next link for additional tips to free up space in /boot. The first was to remove obsolete kernels, which I had already done. The next was to run tune2fs on /boot filesystem to free up reserved blocks, which aren’t needed for /boot. I strongly suggest you visit the links provided for helpful screenshots and commands to follow.
After making these changes, the upgrade worked and am I’m the proud owner of a Fedora 12 Constantine laptop, with a slightly brighter shade of blue desktop than that crusty old Fedora 11. ;)
New Oracle ACE at Pythian
I recently nominated my colleague and good friend Christo Kutrovsky for the Oracle ACE program. Today, I’m excited to share that he has been accepted and is now part of the community of 200+ Oracle ACEs and Oracle ACE Directors.
Christo has been an active blogger on our blog and has made numerous top-class presentations at Oracle Conference world-wide. He has also been an active contributor on the Oracle-L list.
Christo has been always representing an independent voice in the community and it’s a pleasure to have him on board. I’m looking forward to his contributions to the Oracle ACE program and the Oracle community at large.
Welcome Christo!
Oracle Parallel Query Sorting and Index Creation Performance Problems
Ever wondered why recreating certain indexes takes forever, even when you do so in parallel? Ever wondered why certain PQ queries just don’t run that fast?
Here’s a serious performance bug that’s been in Oracle for a while, and finally there are hints of it been fixed, but only partially.
The bug happens when performing sorting operations in parallel, and the source data is already well sorted. The “ranger” doesn’t do a good job of properly assigning row ranges to sorter processes, and ~90% of the rows end up being sent to the same parallel process, regardless of the level of parallelism. So even if you have 256 CPUs, only about 10% performance improvement is achieve, instead of a factor of your parallelism when running the query in parallel.
For example, if the non parallel sort/index creation took 45 minutes, running with parallel 32 will take 41 minutes, instead of the possible 1.4 minutes (assuming you have sufficient horsepower).
When running a “Sort” operation in parallel, there are 2 sets of parallel processes. Producers and consumers. The number of producer/consumer pairs depends on your parallelism settings. So in the case of parallel 32 there are 32 producers and 32 consumers. This is well illustrated in the Oracle documentation HERE (Oracle web based documentation down, will update later). As each ‘producer’ is reading data it is sending it to the appropriate consumer for that “range”. For example, consumer 1 takes A-B, consumer 2 takes C-F, consumer 3 G-L and etc. The exact split is dynamically calculated by the “Ranger” process. Unfortunately it doesn’t work so good with sorted data.
The same applies to index creation. Index creation is basically a big sort, followed by writing out the result set into a B-Tree structure. Index creation suffers from the exact same ranging issues, at least until 11.2.
Here’s an example:
-- Create mini-sample table
create table mytest_s as select rownum r from dual connect by level <=400000;
-- Fetch only 1 row, no need to fetch all
begin
for c in ( select /*+PARALLEL(t,4)*/ * from mytest_s t order by 1) loop
exit;
end loop;
end;
/
select dfo_number "d", tq_id as "t", server_type, num_rows,rpad('x',round(num_rows*10/nullif(max(num_rows) over (partition by dfo_number, tq_id, server_type),0)),'x') as "pr", round(bytes/1024/1024) mb, process, instance i,round(ratio_to_report (num_rows) over (partition by dfo_number, tq_id, server_type)*100) as "%", open_time, avg_latency, waits, timeouts,round(bytes/nullif(num_rows,0)) as "b/r"
from v$pq_tqstat order by dfo_number, tq_id, server_type desc, process;
d t SERVER_TYPE NUM_ROWS pr MB PROCESS I % OPEN_TIME AVG_LATENCY WAITS TIMEOUTS b/r
- - ----------- ---------- ----------- --- ------- - ---- ---------- ----------- ------- ---------- ----
1 0 Ranger 372 xxxxxxxxxx 0 QC 1 100 0 0 0 0 11
1 0 Producer 126144 xxxxxxxxxx 1 P012 1 32 0 0 19 2 6
1 0 Producer 47304 xxxx 0 P013 1 12 0 0 8 1 6
1 0 Producer 110376 xxxxxxxxx 1 P014 1 28 0 0 17 2 6
1 0 Producer 116176 xxxxxxxxx 1 P015 1 29 0 0 16 0 6
1 0 Consumer 7885 0 P008 1 2 0 0 6 1 5
1 0 Consumer 7884 0 P009 1 2 0 0 6 1 6
1 0 Consumer 7884 0 P010 1 2 0 0 7 2 6
1 0 Consumer 376347 xxxxxxxxxx 2 P011 1 94 0 0 16 4 6
1 1 Producer 5536 xxxxxxxxxx 0 P008 1 55 0 0 577 568 3
1 1 Producer 4508 xxxxxxxx 0 P011 1 45 0 0 587 571 4
1 1 Consumer 100 xxxxxxxxxx 0 QC 1 100 0 0 1 0 161
As you can see from this test case, the sorter processes (4 consumers) had a very uneven split, with 94% of the rows been sent to only one consumer. I tested this case with parallel 64, and in that case 90% gets sent to 1 consumer, with the other 10% evenly distributed on the remaining ones.
This essentially reduces your execution time by at most 10%.
A very similar thing happens if you create an index. I did, however, catch an anomaly in the test case. With parallel 4, the distribution is 66/34/0/0, while with parallel 8, it’s 100/0/0/0/0/0/0/0. I.e., terrible as all the work will be performed by 1 process, absolutely the same as Serial, only a little worse as there will be inter-process communication.
create index on mytest_s (r) parallel 4;
select dfo_number "d", tq_id as "t", server_type, num_rows,rpad('x',round(num_rows*10/nullif(max(num_rows) over (partition by dfo_number, tq_id, server_type),0)),'x') as "pr", round(bytes/1024/1024) mb, process, instance i,round(ratio_to_report (num_rows) over (partition by dfo_number, tq_id, server_type)*100) as "%", open_time, avg_latency, waits, timeouts,round(bytes/nullif(num_rows,0)) as "b/r"
from v$pq_tqstat order by dfo_number, tq_id, server_type desc, process;
SERVER_TYPE NUM_ROWS pr MB PROCESS I % OPEN_TIME AVG_LATENCY WAITS TIMEOUTS b/r
----------- ---------- ----------- -- -------- - --- ---------- ----------- ---------- ---------- ----------
Ranger 12 xxxxxxxxxx 0 QC 1 100 0 0 1 0 3974
Producer 132601 xxxxxxxxxx 2 P004 1 33 0 0 19 1 18
Producer 95265 xxxxxxx 2 P005 1 24 0 0 14 2 18
Producer 95265 xxxxxxx 2 P006 1 24 0 0 14 1 18
Producer 79497 xxxxxx 1 P007 1 20 0 0 11 0 18
Consumer 262308 xxxxxxxxxx 4 P000 1 66 0 0 76 73 18
Consumer 164 0 P001 1 0 0 0 77 74 19
Consumer 164 0 P002 1 0 0 0 76 73 19
Consumer 137364 xxxxx 2 P003 1 34 0 0 76 73 18
Producer 1 xxxxxxxxxx 0 P000 1 25 0 0 0 0 322
Producer 1 xxxxxxxxxx 0 P001 1 25 0 0 0 0 322
Producer 1 xxxxxxxxxx 0 P002 1 25 0 0 0 0 322
Producer 1 xxxxxxxxxx 0 P003 1 25 0 0 0 0 322
Consumer 4 xxxxxxxxxx 0 QC 1 100 0 0 1 0 322
create index on mytest_s (r) parallel 8;
SERVER_TYPE NUM_ROWS pr MB PROCESS I % OPEN_TIME AVG_LATENCY WAITS TIMEOUTS b/r
----------- ---------- ------------- -- -------- - --- ---------- ----------- ---------- ---------- ----------
Ranger 0 0 QC 1 0 0 6 2
Producer 47304 xxxxxxxx 1 P008 1 12 0 0 14 2 18
Producer 51246 xxxxxxxxx 1 P009 1 13 0 0 15 1 18
Producer 47304 xxxxxxxx 1 P010 1 12 0 0 15 2 18
Producer 45220 xxxxxxxx 1 P011 1 11 0 0 15 2 18
Producer 59130 xxxxxxxxxx 1 P012 1 15 0 0 17 3 18
Producer 35478 xxxxxx 1 P013 1 9 0 0 10 3 18
Producer 55188 xxxxxxxxx 1 P014 1 14 0 0 16 1 18
Producer 59130 xxxxxxxxxx 1 P015 1 15 0 0 17 1 18
Consumer 400000 xxxxxxxxxx 7 P000 1 100 0 0 52 49 18
Consumer 0 0 P001 1 0 0 0 52 49
Consumer 0 0 P002 1 0 0 0 52 49
Consumer 0 0 P003 1 0 0 0 52 49
Consumer 0 0 P004 1 0 0 0 52 49
Consumer 0 0 P005 1 0 0 0 52 49
Consumer 0 0 P006 1 0 0 0 52 49
Consumer 0 0 P007 1 0 0 0 52 49
Producer 1 xxxxxxxxxx 0 P000 1 13 0 0 0 0 322
Producer 1 xxxxxxxxxx 0 P001 1 13 0 0 0 0 322
Producer 1 xxxxxxxxxx 0 P002 1 13 0 0 0 0 322
Producer 1 xxxxxxxxxx 0 P003 1 13 0 0 0 0 322
Producer 1 xxxxxxxxxx 0 P004 1 13 0 0 0 0 322
Producer 1 xxxxxxxxxx 0 P005 1 13 0 0 0 0 322
Producer 1 xxxxxxxxxx 0 P006 1 13 0 0 0 0 322
Producer 1 xxxxxxxxxx 0 P007 1 13 0 0 0 0 322
Consumer 8 xxxxxxxxxx 0 QC 1 100 0 0 2 1 322
To further explore the implications of this bug, I created a more elaborate test case. I created several types of data, and tested ordering against each “class.”
create table mytest as
select rownum pk, trunc(dbms_random.value(0,400000)) rnd,
floor(rownum/1000) type_1000, floor(rownum/5) type_5,
mod(rownum,1000) type_1000mod,mod(rownum,5) type_5mod,mod(rownum,10009) type_10000mod,
sysdate-rownum/100 as dt,
rpad('x',10,'x') pad from dual connect by level <=400000;
To spare you some of the testing, here are my results. Query ran:
select /*+PARALLEL(t,[DEGREE])*/* from mytest order by [ORDER BY];
A few quick conclusions:
- The first column of order matters for distribution.
- If the Order by column has repeated values, PQ sort will be limited by number of distinct values, but only if they are not grouped together.
- Index creation on time series (log table, stock table) is slow, “type” indexes are slow.
- If you reorder the keys of an index, you may affect the time it takes to create it.
- Follow up on the preceding, especially true if you try to put low cardinality columns first to improve compression.
One could dispute in which cases data in the table is ordered, and it’s amazing how many cases this is:
- Time series data — ever-growing data. The PK is ordered; the “insert date” is somewhat ordered.
- Data warehouses — bulk load files are often ordered via some conditions.
- Sometimes it is good to reorder a table, to improve data locality and compression in data warehouses. This can, however, have negative effects on index build time.
- Sometimes one of the intermediate steps will return an ordered set for the final processing.
One example of the last type, is analytics. But that’s for a separate blog post.
And finally, to end on an optimistic note, it appears that 11.2 has the index creation issue resolved, but the order by in queries is still bad.
Blogrotate #6: The Weekly Roundup of News for System Administrators
Like good day and welcome to issue 6 eh. This is another light one as it’s been another busy week. I have not had any time of my own to crawl through the news, but luckily my cohort Bill has flagged some good stuff for this week. Let’s get to it.
InternetHappy Birthday FireFox! The browser turns 5 years old today. Ryan Paul has more in his article
Firefox turns five: half a decade of Web liberation.
The Google Chromium blog has an interesting post called A 2x Faster Web about their attempts to speed up your surfing. “The initial results are very encouraging: when we download the top 25 websites over simulated home network connections, we see a significant improvement in performance – pages loaded up to 55% faster”, says the blog. This is still in early testing phases but sounds very promising.
Steven J. Vaughan-Nichols at Computerworld reflects on his time using linux on the desktop in Opinion: Linux desktop turns 10; world yawns. The article needs to be taken with a grain of salt. He states in the preamble that “The credit for creating and marketing the first Linux desktop designed for ordinary users goes to Corel Corp., which launched Corel Linux OS 10 years ago, in November 1999″. I take umbrage with this statement because by 1999 Caldera and Mandrake were already making great strides with linux on the desktop, Caldera even had a pretty boxed edition out in stores 2 years ahead of Corel, and Mandrake had a box out in 1998 (if memory serves). Sure Corel had a big name and some marketing power, but they were not the first and certainly not the best (Corel Linux sucked IMHO).
SecurityThis next article is a combination of software, security, legal, patents and what the hell. Microsoft Patents Sudo?!! is a story with it all. Microsoft has taken something that has been in use in the unix community since the 80s (and on Mac since OSX) and patented it. I wonder if RBAC is also covered by this? The link to the patent and information on sudo are included in the article.
Help Net Security has a post about a recent Cenzic report on web server and browser vulnerabilities. Read all about it in Firefox most vulnerable browser, Safari close second.
SoftwareAccording to Ars Technica, Novell is “offering a new commercial add-in for Visual Studio that will allow software developers to test and package .NET applications for Linux without having to leave their Windows development environment”. The new add-in, called MonoVS is available now for a free 30 day trial.
According to the WithinWindows blog Microsoft may have used GPL code in it’s released Windows 7 USB/DVD Download tool without contacting the author or releasing the source code back to the community. The article has some evidence to support this claim. Microsoft has pulled the tool pending further investigation.
VirtualizationSlashdot has a note about The Remus project. This project adds live migration/failover functionality to the Xen hypervisor allowing virtual machines to (reportedly) seamlessly move to another host without losing data or active connections. This was one of the last things that VMWare ESX offered that I could not find in the open source (read free) world so I’ll be sure to check it out myself.
BusinessHP to Acquire 3Com for $2.7 Billion is a new release at the HP site. There’s a lot in there, read it for yourself. Hopefully this will be a good boost for 3Com which has not made anything outstanding since the 3c501.
A couple of items on the AMD front. Firstly AMD and Intel Announce Settlement of All Antitrust and IP Disputes which I hope will bury the hatchet once and for all and allow both companies to get back to innovation. Secondly, AMD has release some more news about their future roadmap and the “fusion” of CPU with GPU in AMD avoiding Larrabee route on road to CPU/GPU “Fusion” by Jon Stokes.
Well there was a couple more items on the list but I am out of time. Have a great weekend!
So like get lost eh! Hoser!
Brad
Log Buffer #169: a Carnival of the Vanities for DBAs
The 169th edition of Log Buffer, the weekly review of database blogs, has been published on Pakistan’s First Oracle Blog by Fahd Mirza.
This is the first Log Buffer published outside Pythian in quite a while, and we’d love to have more. Log Buffer has a regular readership, and so makes a great way to present yourself and your blog to the DBA community at large. To get started, just send an email to the Log Buffer coordinator.
And now, here is Log Buffer #169.
SQL Server: More light shed on “non-yielding scheduler” and Indexes stats
SQL server folks may find the “non-yielding scheduler” warning familiar and it is usually (if not always) associated with a stack dump.
The “non-yielding scheduler” message comes with warnings like the following:
Process %ld:%ld:%ld (0x%lx) Worker 0x%p appears to be non-yielding on Scheduler %ld. Thread creation time: %time. Approx Thread CPU Used: kernel 0x ms, user 0x ms. Process Utilization 0x%. System Idle 0x%. Interval: 0x ms.
This means that there is a background thread that keeps running and making other worker threads yielding back, in time.
There are many causes of the “non-yielding scheduler” error. Memory dump analysis is necessary to determine the cause. Look up the error in MS Support Advanced Search and you will find plenty of KBs dedicated to it.
You may also find (not all cases) errors 17883, 17884 logged in the event viewer, containing information like this:
Event Type: Information
Event Source: MSSQLSERVER
Event Category: (2)
Event ID: 17884
Date:
Time:
User:
Computer:
Description:
New queries assigned to process on Node 0x have not been picked
up by a worker thread in the last 0x seconds.
Blocking or long-running queries can contribute to this condition,
and may degrade client response time. Use the “max worker threads” configuration option to increase number
of allowable threads, or optimize current running queries. SQL Process Utilization: 0x%. System Idle: ox%%.
Event Type: Information
Event Source: MSSQLSERVER
Event Category: (2)
Event ID: 17883
Date:
Time:
User:
Computer:
Description:
Process Worker appears to be non-yielding on Scheduler .
Thread creation time: . Approx Thread CPU Used: kernel ms, user ms.
Process Utilization %. System Idle %. Interval: ms.
There is a must-read whitepaper describing SQL Server Scheduling and how to interpret and diagnose Errors 17883, 17884, 17887, and 17888; please look here How To Diagnose and Correct Errors 17883, 17884, 17887, and 17888
Recently, we had a client having same issue. The client runs a busy online business with thousands of connected users; databases sometimes include more than half a million tables.
We had a healthy counters and server didn’t seem to suffer from memory or I/O bottlenecks. We did suspect, however, that it is something to do with SQL Server internals, as we also received fatal exceptions pertaining to SPIDs <50 (system processes).
We looked at any Hardware Fault, outdated drivers, tempdb contention (there were nearly 0.5 datafiles/core).
We opened a case with MS support after the dumps continued to happen and their reply was:
After analyzing the dumps, we see that the thread has non-yielded while updating index usage stats. These stats are returned through DMV sys.dm_db_index_usage_stats.
According to MS, the issue is a known bug that has been corrected in SQL Server 2008 but isn’t expected to be resolved in SQL Server 2005.
WorkaroundEnable trace flag 2330 to prevent the update of data in DMV in sys.dm_db_index_usage_stats.
This DMV retrieves information that about index usage; it does not, however, generate this information itself, but rather just returns this info from a cache inside SQL Server. This cache is empty when the server instance beside it doesn’t persist between instance restarts.
The impact of disabling this job was nothing to the server operations. There was a selective indexes maintenance job that depended completely on sys.dm_db_index_physical_stats, which is a different DMV.
Also, the query optimizer should be only affected if we stopped updating statistics, or are no longer optimizing the indexes which we were not doing here.
The affected DMV helps with auditing the indexes usage when we opt to diagnose instance performance, as it helps tracking the number of scan operations, the number of seek operations, the number of lookup operations, and some more.
It is worthwhile to note that sys.dm_db_index_operational_stats can report similar statistics (there are some differences though) as sys.dm_db_index_physical_stats.
Craig Freedman wrote a good blog: What is the difference between sys.dm_db_index_usage_stats and sys.dm_db_index_operational_stats?
Two days ago—and that’s the reason I’m writing this blog—Microsoft released a “FAST PUBLISH” KB: (Query performance decreases when sys.dm_db_index_usage_stats has large number of rows) about this issue and it mentions the following scenario:
- In Microsoft SQL Server 2005, you frequently perform DDL operations that involve dropping and recreating of lot of tables (especially temporary tables in tempdb database).
- You have a large number of entries (100,000 or more) in the sys.dm_db_index_usage_stats dynamic management view (DMV).
This aligns with what we had with the affected server (more than half a million tables can easily generate that). The KB mentions these workarounds:
- Reduce the rate at which you create or drop temporary tables. (This may need extensive code change.)
- Upgrade to SQL Server 2008. (Unfortunately, not everyone can afford that.)
- Configure your SQL Server to use trace flag T2330 as a startup parameter. (This is what we have done and adieu la problème.)
Note: make sure that the trace is enabled by running DBCC TRACESTATUS(-1)
You can also look in the SQL Server error log and note whether the trace flags are mentioned, something like this:
Time Server -T1222
Time Server -T845
Time Server -T2330
One more KB that was released recently: (FIX: Error message in the SQL Server 2005 Errorlog file after the SQL Server service stops responding: “Timeout occurred while waiting for latch”); mentions similar symptoms that are related to SQL Server processing missing index information for a dynamic management view (DMV). This was fixed in SQL server 2005 Sp3 CU6
If you face similar issues, you should install the latest service pack and cumulative update, and check if you have any resource bottlenecks. You can try trace T2330 as well, and if you still get errors or dumps, then you had better open a case with Microsoft Support to analyze the dumps and provide a resolution.
Hope this helps!
Free 10-day trial of Safari Books Online
That’s right — get your free 10-day trial! All the information I know is here:
But the basics are: No access to Rough Cuts or Downloads, for new subscribers only. It’s one of those “sign up and if you do not cancel after 10 days, we bill you” — and at $42.99 a month, that’s not a mistake you want to make. Must sign up by Nov. 24th.
To sign up now: https://ssl.safaribooksonline.com/tryitfree
I was asked to send this information along, so I am…Now’s your chance to skim High Performance MySQL, among other high quality books!
SSIS tip: Set variables’ values interactively in SSIS packages
This is a little tip for those who develop or debug SSIS packages.
In SQL Server Integration services, User namespace variables are assigned values that are used across the package.
When developing, testing or debugging packages, we assign multiple values to those variables to test different scenarios. This is done using the variables window.
However, SSIS Script tasks can allow us to key in values for selected variables in run time. This looks more fun and keeps us from taking chances when we forget assigning variables’ values.
A tiny example:
- Return list of processes running on SQL server instance from SYS.SYSPROCESSES (I know it is legacy) where process ID (Spid) is larger than or equal to a value ( >50 for example to return user processes).
- Dump the returned list of SPIDs and associated info (cmd, waittype, loginame,…etc) into a flat file.
I’m using a script task and a Data flow task. The Data Flow task contains an OLE DB Source and a flat file destination.

Package
The script task idea is just simple :
- Pass a variable to the script indicating that we need to set some variables(s) value(s) interactively (Interactive mode).
- In the script body, we will check for interactive mode and if it is true then we will use an InputBox to receive variables values.
- Set the package variable(s) value(s) to the value(s) returned from inputbox(s).
I have attached two DSTX files, one for SS2005 and one for SS2008; the 2005 file can be easily upgraded with a tiny change in the function declaration, but I opted to attach a 2008 file for simplicity.
Interactive_Variables_value_2005.dtsx
Interactive_Variables_value_2008.dtsx
Here is also the code for the script task (2008). I hope this helps.
' Microsoft SQL Server Integration Services Script Task ' Write scripts using Microsoft Visual Basic ' The ScriptMain class is the entry point of the Script Task. Imports System Imports System.Data Imports System.Math Imports Microsoft.SqlServer.Dts.Runtime Imports System.Text Imports Microsoft.SqlServer.Dts.Tasks.ScriptTask.ScriptObjectModel Partial Public Class ScriptMain Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase Enum ScriptResults Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure End Enum Dim PackageName As String Public Sub Main() PackageName = Dts.Variables("PackageName").Value.ToString() If Dts.Variables("interactive_values").Value.Equals(True) Then Dts.TaskResult = PromptForParams() Else Dts.TaskResult = ScriptResults.Success End If End Sub Public Function PromptForParams() As ScriptResults Dim Response As String Dim spidint As Integer ' Default Task output to failure Dts.TaskResult = ScriptResults.Failure Response = InputBox("Get a list of SQl server processes that begin with SPID X " " Enter SQL server SPID, must be integer and > 0:", PackageName) If Response = String.Empty Then MsgBox("Cancelled! ", MsgBoxStyle.OkOnly, PackageName) Return ScriptResults.Failure End If If Integer.TryParse(Response, spidint) And spidint > 0 Then Dts.Variables("spid").Value = spidint Else MsgBox("Invalid value spid, must be integer and larger than Zero ", MsgBoxStyle.Critical, PackageName) Return ScriptResults.Failure End If Return ScriptResults.Success End Function End Class
Testing TokuDB – Faster and smaller for large tables
For the past two months, I have been running tests on TokuDB in my free time. TokuDB is a storage engine put out by Tokutek. TokuDB uses fractal tree indexes instead of B-tree indexes to improve performance, which is dramatically noticeable when dealing with large tables (over 100 million rows).
For those that like the information “above the fold”, here is a table with results from a test comparing InnoDB and TokuDB. All the steps are explained in the post below, if you want more details, but here’s the table:
ActionInnoDBTokuDB
Importing ~40 million rows119 min 20.596 sec69 min 1.982 sec
INSERTing again, ~80 million rows total5 hours 13 min 52.58 sec56 min 44.56 sec
INSERTing again, ~160 million rows total20 hours 10 min 32.35 sec2 hours 2 min 11.95 sec
Size of table on disk42 Gb15 Gb
COUNT(*) query with GROUP BY58 min 10.11 sec5 min 3.21 sec
DELETE query2 hours 46 min 18.13 sec1 hour 14 min 57.75 sec
Size of table on disk42 Gb12 Gb
OPTIMIZE TABLE1 day 2 hours 19 min 21.96 sec21 min 4.41 sec
Size of table on disk41 Gb12 Gb
TRUNCATE TABLE1 min 0.13 sec0.27 sec
Size of table on disk41 Gb193 Mb (after waiting 60 seconds before doing an ls -l)
OPTIMIZE TABLE23.88 sec0.03 sec
Size of table on disk176 Kb193 Mb
Installing TokuDB is not quite as easy as plugging in a storage engine. TokuDB requires a patch to the MySQL source code, so you can either patch the source code yourself or download an already-patched version from Tokutek that contains TokuDB as well. I used the already-patched version of MySQL from Tokutek, and it was no different than setting up a regular MySQL install — install, configure and go.
On disk, a table using the TokuDB storage engine is different from both InnoDB and MyISAM. It has a .frm file, as all MySQL tables do. In addition, there is a directory which contains a main.tokudb file with the data, a status.tokudb file (I believe with an action queue), and a key-KEYNAME.tokudb file for each index:
# ls -1
testtoku.frm
testtoku.tokudb
# ls -1 */*
testtoku.tokudb/key-DATE_COLLECTED.tokudb
testtoku.tokudb/key-HASHCODE.tokudb
testtoku.tokudb/main.tokudb
testtoku.tokudb/status.tokudb
A bit of playing around, and we see that we cannot get much from the file — with MyISAM tables, you can see the data in the table by doing a “strings” command on it:
# cd testtoku.tokudb
# file *
key-DATE_COLLECTED.tokudb data
key-HASHCODE.tokudb data
main.tokudb data
status.tokudb data
# strings *
tokudata
tokuleaf
x^cd
fdac
tokudata
tokuleaf
x^cd
bN fda
For a basic test I compared bulk insertion, simple querying and deletes with TokuDB and InnoDB. I did not use any special features of TokuDB. I started with an sql file produced by mysqldump that was 2.0 Gb in size, which had 19 million rows, and performed some simple tests on it. The table has a signed INT as a primary key, and the goal of this test was to see how easy it would be to delete test data. “Test data” is defined as anything that had a particular field (HASHCODE, defined as VARCHAR(32)) in common with more than 10,000 rows.
0) imported 19,425,235 rows
1) SELECT COUNT(*),HASHCODE FROM test[engine] GROUP BY HASHCODE HAVING COUNT(*)>10000;
2) DELETE FROM PRIMARY_KEY_HASH WHERE HASHCODE IN ([list of id's]); This deleted about 3.3% of the records in the table (647,732 rows)
3) OPTIMIZE TABLE test[engine] – to defragment
Tests were done on an Amazon EC2 instance — AMI ID:ami-2547a34c which is a Fedora 64-bit machine, using the m1.xlarge size (16 Gb RAM).
ActionInnoDBTokuDB
Importing over 19 million rows33 min 2.107 sec31 min 24.793 sec
Size of table on disk4.4 Gb2.4 Gb
COUNT(*) query with GROUP BY8.64 sec29.28 sec
DELETE query26.06 sec2 min 19.51 sec
Size of table on disk4.4 Gb1.9 Gb
OPTIMIZE TABLE35 min 15.04 sec1 min 20.42 sec
Size of table on disk4.3 Gb1.2 Gb
InnoDB performed exceedingly well because the InnoDB buffer pool was sized larger than the data (12 Gb buffer pool vs. 4.4 Gb table), and the data import caused the buffer pool to have all the data and indexes already cached when the queries were run. Even so, TokuDB only fared slightly worse than InnoDB in overall performance.
The most interesting part of the table, for me, is the fact that there is no need to defragment the table — Even though the size on disk does decrease after the OPTIMIZE TABLE, the Tokutek folks explained that there’s a queue of work to be done (such as defragmentation) that is done automatically, and OPTIMIZE TABLE processes the rest of the queue. This is why the size of the table on disk was already reduced even before teh OPTIMIZE TABLE was done, and if I had waited a minute or so before performing the OPTIMIZE TABLE it would have automatically been done and I would have seen no results with the OPTIMIZE TABLE.
(specifically, I was told “The fractal tree is a dynamic data structure which may rearrange itself when queries run. In addition, since the fractal tree is periodically checkpointed, there may be more than one version of the data changed since the last check point was taken in the underlying file.” and pointed to a blog post this post about quantifying fragmentation effects.
The table shows that for smaller amounts of data (fewer than 100 million rows), TokuDB is about 9% faster for inserts, but somewhat slower for even simple queries and deletes. There is no need to defragment TokuDB, which saves a lot of time in the long run.
As TokuDB is recommended for tables larger than 100 million rows, let’s see this same test with a large amount of data. This time we started with an import of 39,334,901 rows, a 4.0 Gb file produced by mysqldump. However, since we want more than 100 million rows, after the import we did 2 inserts to produce almost 160 million records:
INSERT INTO test[engine] (HASHCODE, [other non-primary key fields]) SELECT (HASHCODE, [other non-primary key fields]) FROM test[engine];
# after this there are almost 80 million records (78,669,802)
INSERT INTO test[engine] (HASHCODE, [other non-primary key fields]) SELECT (HASHCODE, [other non-primary key fields]) FROM test[engine];
# after this there are almost 160 million records (157,339,604)
ActionInnoDBTokuDB
Importing ~40 million rows119 min 20.596 sec69 min 1.982 sec
INSERTing again, ~80 million rows total5 hours 13 min 52.58 sec56 min 44.56 sec
INSERTing again, ~160 million rows total20 hours 10 min 32.35 sec2 hours 2 min 11.95 sec
Size of table on disk42 Gb15 Gb
COUNT(*) query with GROUP BY58 min 10.11 sec5 min 3.21 sec
DELETE query2 hours 46 min 18.13 sec1 hour 14 min 57.75 sec
Size of table on disk42 Gb12 Gb
OPTIMIZE TABLE1 day 2 hours 19 min 21.96 sec21 min 4.41 sec
Size of table on disk41 Gb12 Gb
TRUNCATE TABLE1 min 0.13 sec0.27 sec
Size of table on disk41 Gb193 Mb (after waiting 60 seconds before doing an ls -l)
OPTIMIZE TABLE23.88 sec0.03 sec
Size of table on disk176 Kb193 Mb
Clearly, TokuDB is better than InnoDB for all these values. And I did not even use any of the special features of TokuDB — no extra indexes were added!
One great aspect about TokuDB is that it gives you approximate statistics on how many rows have been inserted:
mysql> show processlist\G
*************************** 1. row ***************************
Id: 3
User: root
Host: localhost
db: test
Command: Query
Time: 1
State: Inserted about 4000 rows
Info: INSERT INTO `testtoku` VALUES (14600817,NULL,’c40325fb0406ccf2ad3e3c91aa95a6f2′,’000bxi504′,
The “State” value is per query, so in doing a bulk insert with many rows, I saw this number go up and down. However, it’s very useful nonetheless.
The SHOW TABLE STATUS shows that the statistics are exact (like MyISAM — InnoDB metadata is approximate):
mysql> select count(*) from testtoku;
+———-+
| count(*) |
+———-+
| 18431319 |
+———-+
1 row in set (15.01 sec)
mysql> SHOW TABLE STATUS\G
*************************** 1. row ***************************
Name: testtoku
Engine: TokuDB
Version: 10
Row_format: Dynamic
Rows: 18431319
Avg_row_length: 0
Data_length: 0
Max_data_length: 0
Index_length: 0
Data_free: 0
Auto_increment: 19425236
Create_time: NULL
Update_time: NULL
Check_time: NULL
Collation: latin1_swedish_ci
Checksum: NULL
Create_options:
Comment:
1 row in set (0.00 sec)
TokuDB does have that “action queue” I mentioned before — the statistics are exact, but may not be up-to-date if there are still actions to be performed. However, any statement that touches every record will perform all the actions left in the queue — so after statements like OPTIMIZE TABLE and even SELECT COUNT(*) FROM tbl, the statistics are up-to-date.
Just in case anyone wants it, here is the my.cnf used for both InnoDB and TokuDB tests:
[mysqld]
datadir = /mnt/mysql/data
port = 3306
socket = /tmp/mysql.sock
skip-locking
key_buffer_size = 384M
max_allowed_packet = 1M
table_open_cache = 512
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 8M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 32M
thread_concurrency = 8
server-id = 1
innodb_data_home_dir = /mnt/mysql/data/
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /mnt/mysql/data/
innodb_buffer_pool_size = 12G
innodb_additional_mem_pool_size = 20M
innodb_log_file_size = 100M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 2
innodb_lock_wait_timeout = 50
innodb_file_per_table
MySQL on Debian and Ubuntu
Debian Linux (the underlying foundation of Ubuntu) manages the startup/shutdown of MySQL quite differently from the ways I am used to. I am a long-time user of both the MySQL binary provided by Red Hat/SuSE (along with Fedora and clones like CentOS and Oracle Enterprise Linux) and the official binary from mysql.com. After the successful restore of a cold backup, I started mysqld using the Debian provided init script. The script said that mysqld failed to start up, but in reality it did start up. Similarly, stopping mysqld fails.
The output below demonstrates the outputs and the behavior seen on a Ubuntu 8.04 LTS Server:
root:~# ps -ef | grep mysqld root 20165 19926 0 15:12 pts/4 00:00:00 grep mysqld root:~# /etc/init.d/mysql start * Starting MySQL database server mysqld [fail] root:~# ps -ef | grep mysqld root 9291 1 0 13:19 pts/0 00:00:00 /bin/sh /usr/bin/mysqld_safe mysql 9333 9291 0 13:19 pts/0 00:00:00 /usr/sbin/mysqld --basedir=/usr \ --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --port=3306 \ --socket=/var/run/mysqld/mysqld.sock root 9334 9291 0 13:19 pts/0 00:00:00 logger -p daemon.err -t mysqld_safe -i -t mysqld root 9805 9115 0 13:23 pts/0 00:00:00 grep mysqld root:~# root:~# /etc/init.d/mysql stop * Stopping MySQL database server mysqld [fail] root:~# ps -ef | grep mysqld root 9291 1 0 13:19 pts/0 00:00:00 /bin/sh /usr/bin/mysqld_safe mysql 9333 9291 0 13:19 pts/0 00:00:00 /usr/sbin/mysqld --basedir=/usr \ --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --port=3306 \ --socket=/var/run/mysqld/mysqld.sock root 9334 9291 0 13:19 pts/0 00:00:00 logger -p daemon.err -t mysqld_safe -i -t mysqld root 9805 9115 0 13:23 pts/0 00:00:00 grep mysqld root:~#
With the MySQL server provided by Red Hat/SuSE and MySQL.com, when using /etc/init.d/mysqld start, the init script invokes either mysqld_safe (or the older safe_mysqld under MySQL 4.x) or the newer mysqlmanager (which is starting to replace mysqld_safe in the newer 5.1 releases). As part of the start up procedure, the process ID (PID) of the mysqld process is written to a pidfile under /var/run which is later used to terminate mysqld. When /etc/init.d/mysqld stop is invoked, the init script reads the PID of mysqld process from the pidfile, and execute a kill on the process ID. The mysqld process receives the kill signal and shutdowns cleanly.
Side note: This means that you can run kill `cat /var/run/mysql.pid` to shutdown mysqld the way that the startup script does, though the script adds some cleanup work.
Under the Debian/Ubuntu distribution of the MySQL server, things are done differently. The package relies on a special mysql user for the init scripts called debian-sys-maint. The password for the user, which I have replaced with X in this blog post, is stored as a root-owned plain text file in /etc/mysql/debian-my.cnf:
root:~$ ls -l /etc/mysql/debian.cnf -rw------- 1 root root 312 2009-01-02 17:41 /etc/mysql/debian.cnf root:~# cat /etc/mysql/debian.cnf # Automatically generated for Debian scripts. DO NOT TOUCH! [client] host = localhost user = debian-sys-maint password = X socket = /var/run/mysqld/mysqld.sock [mysql_upgrade] user = debian-sys-maint password = X socket = /var/run/mysqld/mysqld.sock basedir = /usr root:~#
For the security minded, the password is generated randomly at the time the package is installed, and is unique (and in our example above, replaced with X). Therefore, each installation of the MySQL server on Ubuntu/Debian will have a different randomly generated password — even if you re-install on the same server. The mysql user debian-sys-maint@localhost is added by the Debian package, and has the following privileges:
GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD,
SHUTDOWN, PROCESS, FILE, REFERENCES, INDEX, ALTER, SHOW DATABASES,
SUPER, CREATE TEMPORARY TABLES, LOCK TABLES, EXECUTE, REPLICATION
SLAVE, REPLICATION CLIENT
ON *.* TO 'debian-sys-maint'@'localhost' IDENTIFIED BY PASSWORD
'PASSWORD_HASH_OF_X' WITH GRANT OPTION
When MySQL is started using /etc/init.d/mysql start, mysqld_safe is invoked as usual but adds three extra checks on the MySQL database. These checks being done are unique to Debian/Ubuntu:
- upgrade_system_tables_if_necessary – This check runs the
mysql_upgradescript with the username and password provided by the/etc/mysql/debian.cnf. Runningmysql_upgrademultiple times does not harm anything since running it on an “upgraded” database does nothing. However, it does add to the time it takes to start upmysqld. - check_root_accounts – The script checks for the number of users with the username
rootthat have no passwords. If there are such accounts, then an warning is written to the system log. This warns about open root accounts which may be exploited. - check_for_crashed_tables – This check is somewhat misleading in that while it suggests that it checks for crashed tables, in fact it only does it for MyISAM tables. It gets a list of all MyISAM tables from the
information_schemaand for each table does aselect count(*) into @discard from database.table. If any tables are corrupt, the select will output an error; these are captured and emailed to the root user. In theory, this should be a relatively simple operation since MyISAM maintains row counts in the meta-data for each table, but in fact, if there are many tables, this may cause some I/O spikes on the server.
Stopping mysqld using the script provided by Ubuntu (and Debian) does not involve issuing a kill command to the mysqld process, even though the process ID is recorded in the start-up phase. Instead, the script runs the following command to shutdown MySQL.
mysqladmin --defaults-file=/etc/mysql/debian.cnf shutdown
When a backup is restored to a Ubuntu/Debian server, a problem can occur due to one of two possible scenarios. The first is that the the backup comes from another Ubuntu/Debian server that contains the debian-sys-maint user but has a different password from the current server. The second scenario is that the backup is from a distribution of MySQL which do not use the debian-sys-maint user. This assumes that the backups include the mysql database.
When MySQL is first installed on Ubuntu/Debian, this problem never occurs because the package as part of the post installation process randomly creates a password for the debian-sys-maint user, creates the user in MySQL (during the initial installation the root MySQL user has no password so is able to login as root), and creates the /etc/mysql/debian-my.cnf file on the system. When upgrading MySQL using the Ubuntu/Debian packages, the debian-sys-maint user is used, with the password already stored in /etc/mysql/debian-my.cnf. If the password is incorrect, the upgrade will succeed with some warnings but the user is not recreated automatically.
So what’s the solution? Well, there are ways to deal with it. One is to just ignore the error when using the init script to start MySQL and use mysqladmin to shutdown MYSQL using a user with the proper privileges. Another is to create the necessary debian-sys-maint user (or update the password for the user) in MySQL using the information in /etc/mysql/debian-my.cnf file.
I did not report this as a bug since is not one. Debian (and hence, Ubuntu) does things a little differently then others, and has repackaged the MySQL source differently.
Where’s Everyone Gone?
Now that our new web-site is live I’d like to do the first blog post on it!
You shouldn’t be surprised if you notice that it’s difficult to locate your DBAs in the first week of December (i.e. in just two weeks from now)! And you know why? Because you need to be looking in the right place to find them.

You are likely to know already that I’ll be going to the UKOUG Conference Technology & E-Business Suite 2009 and present few sessions. Few more people from Pythian are actually coming as well:
- Christo Kutrovsky is coming to present with a smashing topic — “RAC + ASM: 3 years in production – Stories to share”. It’s on Wednesday at 12:50pm.
- Paul Vallee is coming to cut some slack. Last year it’s been his first UKOUG conference and you bet he liked it so much that he couldn’t afford to miss it this year. It happens to many… Did I tell you that I didn’t miss a single one since I first came in 2006?
That’s all for now folks. Make sure you say hello to us if you are at the UKOUG. The simplest way to reach us is Twitter — @alexgorbachev, @paulvallee and @kutrovsky.
Video: Building a MySQL Slave and Keeping it in Sync
Last night at the Boston MySQL User Group I presented on how to get a consistent snapshot to build a slave, how to use mk-table-checksum to check for differences between masters and slaves on an ongoing basis, and how to use tools such as mk-table-sync and mysqldump to sync the data if there are any discrepancies.
The slides are online at http://technocation.org/files/doc/slave_sync.pdf.
The video can be watched on youtube at http://www.youtube.com/watch?v=Un0wqYKmbWY or directly in your browser with the embedded player below:
Blogrotate #5: The Weekly Roundup of News for System Administrators
Hi all, and welcome back to blogrotate. It’s been a busy week here at Pythian which reduced the amount of time I had for cruising the news, so this weeks edition will be a short one. Here’s a few of the stories that tweaked our interest this week.
Operating Systems
The Machine SID Duplication Myth is an article on the Microsoft Technet blog by Mark Russinovich. It goes through an in depth explanation of what SID’s are used for, and notes that Sysinternals has officially retired the NewSID utility as of Nov 03, 2009. This is of particular interest to anyone who created desktops and laptops via saved images as NewSID was a staple utility after the machine was imaged to ensure it did not conflict with other machines on the network.
Michael Larabel published some CentOS 5.4 vs. OpenSuSE 11.2 vs. Ubuntu 9.10 Benchmarks on the Phoronix site. I was suprised to see that CentOS beat the others on the majority of tests run, at least in part due to issues with the ext4 filesystem that both SuSE and Ubuntu use as their defaults.
Over at the Computerworld blog, Steven J. Vaughan-Nichols writes about 5 Reasons why Ubuntu 9.10 is better than Windows 7. I agree with most of what he says, but generally Linux is still not an easy conversion for a die hard Windows user. It sure did spark a huge amount of debate in the commentary.
Mandriva Linux 2010 is out. Check out the release information and feature set at the Mandriva blog site. I’ll have a closer look at this if I ever get the time.
HardwareIn the world of hard drives the trend has always been to bump up the amount of bits the drive can hold to combat the constant increase in data size. Another way of dealing with this could be deduplication of data which should reduce the amount of storage required for the same information. Could a hard drive dedupe data has more on this subject.
SecurityRyan Paul writes HTTPS, SSL attack vector discovered; fix is on the way. This vulnerability was discovered by Marsh Ray and Steve Dispensa from security company PhoneFactor but not publicized pending a fix. There is a temporary workaround from the OpenSSL team, hopefully it’ll be resolved quickly.
Not long after Windows 7 was released, John Leydon at The Register writes that Naked Win 7 still vulnerable to most viruses. He’s reporting on testing done by the Sophos security firm which showed that 7 out of 10 of the malware tested still managed to run in the default configuration. So even if you upgraded to Windows 7 you still need to run that anti virus.
Even Linux is not safe from security threats (nothing ever is IMHO). Bug in latest Linux gives untrusted users root access by Dan Goodin gives you the details. Patches for RedHat linux are already out, keep your system up to date to make sure you get the patch as soon as it’s available.
VirtualizationRed Hat takes on VMware with server virtualization solution by Ryan Paul discusses RedHat’s newest foray into the virtualization market with their solution called Enterprise Virtualization for Servers. This solution uses RedHat’s recently acquired KVM and is prominent in the recent RedHat Enterprise Linux 5.4 release.
So as not to be left out, Cisco, EMC, and VMware join hands and plunge into cloud with their new joint venture called Acadia. You can also read more about this in Cisco, EMC, VMware & Intel Form Acadia.
In the “I totally called it” departmentI mentioned in a recent version of this blog that Microsoft was backing a Family Guy episode. I said at the time that I did not see how they could funny it up, apparently Microsoft could not see it either. Joe Fay gives us the skinny in Microsoft drops Family Guy like a hot deaf guy joke. Apparently the humour was not in keeping with the clean, family friendly image that Microsoft wants to convey. Seriously? I suggest someone at Microsoft watch any of the Seth McFarlane shows before signing on with him. I am guessing it was not a matter of foul language however, expletives are an occupational hazard when using Windows.
Til next time, keep your cache full and your swap empty.
Brad
Questions you always wanted to ask about Flashback Database…
Last Friday in the Sydney Oracle Meetup I talked about Oracle Flashback technology and how it helps to reduce downtime. The session generated great interest among the attendees, which led to interesting discussions and many questions about the subject.
Some of the questions couldn’t be answered during the meeting so I’ve followed up on them and I’m posting the answers here since they may be of interest for many others.
Q: Is there a separate background process for writing flashback logs?A: Yes. RVWR (Recovery Writer, a.k.a Flashback Writer) was introduced in Oracle 10g to write flashback data from the Flashback Buffer in the SGA to the flashback database logs on disk.
Q: Do I need to shutdown and mount the database to turn flashback on?A: ALTER DATABASE FLASHBACK ON is an online operation in 11g Release 2. In 10g Release 2 the database must be mounted, but not open. I haven’t tested it in 11.1.
Once flashback is turned on Oracle starts to save the before-images of the database blocks before they are changed. This guarantees that the database can be flashbacked to the exact point-in-time when the flashback mode was turned on.
Q: What happens if RVWR cannot write to disk?A: It depends on the context where the write error occurs:
- If there’s a Guaranteed Restore Point, the database crashes to ensure the restore point guarantee is not voided.
- If there isn’t a Guaranteed Restore Point and it’s a primary database, the Flashback Mode will be automatically turned off for the database, which will continued to operate normally.
- If there isn’t a Guaranteed Restore Point and it’s a standby database, the database will hang until the cause of the write failure is fixed.
A: Yes, but indirectly. The size of the Flashback Buffer is set to 2 * LOG_BUFFER.
For performance reasons, it’s recommended to set LOG_BUFFER to at least 8MB for
large databases running in Flashback Mode.
A: No. Flashback Logs are not backed up. Even if the command BACKUP RECOVERY AREA is used to backup the contents of the FRA to tape only the following file types are backed up: full and incremental backup sets, control file autobackups, datafile copies, and archived redo logs.
Flashback Logs are considered to be transient files and cannot be backed up by RMAN. They are not needed for media recovery.
Q: When are the flashback logs deleted?A: Flashback logs are managed by Oracle only. Oracle will try to keep as much Flashback logs as needed to satisfy the DB_FLASHBACK_RETENTION_TARGET parameter. However, if there’s space pressure in the Flash Recovery Area (FRA), flashback logs may be deleted to make room for other things, like backups and archived logs, for example.
- If the fast recovery area has enough space, then a flashback log is created whenever necessary to satisfy the flashback retention target.
- If a flashback log is old enough that it is no longer needed to satisfy the flashback retention target, then a flashback log is reused.
- If the database must create a new flashback log and the fast recovery area is full or there is no disk space, then the oldest flashback log is reused instead.
- If the fast recovery area is full, then an archived redo log that is reclaimable according to the FRA rules may be automatically deleted by the fast recovery area to make space for other files. In this case, any flashback logs that would require the use of that redo log file for the use of
FLASHBACK DATABASEare also deleted. - No file in the fast recovery area is eligible for deletion if it is required to satisfy a guaranteed restore point. Thus, retention of flashback logs and other files required to satisfy the guaranteed restore point, in addition to files required to satisfy the backup retention policy, can cause the fast recovery area to fill completely.
Other than that flashback logs are deleted according to the below:
- When flashback mode is turned off all flashback logs are deleted ONLY if there’s no guaranteed restore points. If there’s at least one guaranteed restore point, no flashback logs are deleted.
- When the oldest guaranteed restore point is deleted and flashback mode is off, all flashback logs older than the second oldest guaranteed restore point are deleted. If flashback mode is on for the database OR the guaranteed restore point is not the oldest no flashback logs are deleted.
A: You can use either the RC_RESTORE_POINT view in the recovery catalog or the command
LIST RESTORE POINT [ALL|restore_point_name] in RMAN.
RESETLOGS operation is it possible to flash forward to the incarnation after the RESETLOGS?
A: Yes, it’s perfectly possible.
Q: Can you see the progress of aFLASHBACK DATABASE operation?
A: Yes, you can. During a FLASHBACK DATABASE operation you can query V$SESSION_LONGOPS from another session to see the progress of the flashback.
The FLASHBACK DATABASE operation has two distinct phases: the actual flashback and the media recovery that happens afterwards to bring the database to a consistent state.
While the actual flashback is running you’ll see the following message in V$SESSION_LONGOPS:
Flashback Database: Flashback Data Applied : 238 out of 282 Megabytes done
During the media recovery, the following messages will be seen:
Media Recovery: Redo Applied : 263 out of 0 Megabytes done
Media Recovery: Average Apply Rate : 1164 out of 0 KB/sec done
Media Recovery: Last Applied Redo : 626540 out of 0 SCN+Time done
Media Recovery: Elapsed Time : 232 out of 0 Seconds done
Media Recovery: Active Time : 116 out of 0 Seconds done
Media Recovery: Active Apply Rate : 1859 out of 0 KB/sec done
Media Recovery: Maximum Apply Rate : 1859 out of 0 KB/sec done
Media Recovery: Log Files : 15 out of 0 Files done
Media Recovery: Apply Time per Log : 7 out of 0 Seconds done
A: Oracle’s recommendations are:
- Use a fast file system for your flash recovery area, preferably without operating system file caching. It is recommended to use a file system that avoids operating system file caching, such as ASM.
- Configure enough disk spindles for the file system that will hold the flash recovery area. For large production databases, multiple disk spindles may be needed to support the required disk throughput for the database to write the flashback logs effectively.
- If the storage system used to hold the flash recovery area does not have non-volatile RAM, try to configure the file system on top of striped storage volumes, with a relatively small stripe size such as 128K. This will allow each write to the flashback logs to be spread across multiple spindles, improving performance
- For large, production databases, set the init.ora parameter
LOG_BUFFERto be at least 8MB. This makes sure the database allocates maximum memory (typically 16MB) for writing flashback database logs.
You CAN execute DTS packages from SQL Server 2005 64-bit
All over the web I’d read that DTS packages could be stored on SQL Server 2005 64-bit, but not executed on this server. Workarounds I’ve seen range from creating SSIS packages with Execute DTS tasks, migrations to SSIS using the wizard or third party tools, and running the DTS Packages from a 32-bit server against the 64-bit target.
Recently (and much to my embarrassment after making that statement), a colleague demonstrated that this is not correct. DTSRun.exe can be found on SQL Server 2005 64-bit installations (although it might not be supported by Microsoft).
Upon investigating and testing on my own, I found that on the servers I checked, DTSrun.exe is located in: C:\Program Files (x86)\Microsoft SQL Server\80\Tools\Binn\.
I mocked up a simple DTS package that picks up a flat file, creates a table, and then imports the data into it with a transformation step. I saved it as a structured storage file, which I copied to the 64-bit server, generated a DTS Run line (F.Y.I. – DTSRunUI.exe is not located on these servers), then executed it from a command prompt on the 64-bit server. And it worked!
SSIS is a good platform with many noticeable improvements over DTS (also some new quirks), but if you need a workaround or just can’t justify a complete re-write of your DTS packages to your manager, this may be an option for you.
I tested some simple commands, (not the entire functionality) so I strongly recommend you test your packages before relying on this in a production environment.
Please note that DTSRun.exe is not included in SQL Server 2008 installations.
Log Buffer #168: a Carnival of the Vanities for DBAs
This is the 168th edition of Log Buffer, the weekly review of database blogs. Let’s give the wheel a spin and see who comes first . . .
MySQLBrian “Krow” Aker has something to say about Drizzle, InfiniDB, and column-oriented storage: “I have been asked a number of times ‘do you think there is a need for a column oriented database in the open source world?’ The answer has been yes! . . . I was very happy to see Calpont do their release of Infinidb last week.”
Vadim of the MySQL Performance Blog said, “As Calpont announced availability of InfiniDB I surely couldn’t miss a chance to compare it with previously tested databases in the same environment.” And he didn’t, as shows his post Air traffic queries in InfiniDB: early alpha. Bob Dempsey and Jim Tommaney of InfiniDB are in on the discussion.
Back to Drizzle for a moment, and Jay Pipes’ item, The Great Escape. “This week, I am working on putting together test cases which validate the Drizzle transaction log’s handling of BLOB columns. . . . I ran into an interesting set of problems and am wondering how to go about handling them. Perhaps the LazyWeb will have some solutions. . . . The problem, in short, is inconsistency in the way that the NUL character is escaped (or not escaped) in both the MySQL/Drizzle protocol and the MySQL/Drizzle client tools.”
Baron Schwartz has been catching erroneous queries, without MySQL proxy, having been inspired by Chris Calender’s post, Capturing Erroneous Queries with MySQL Proxy.
Nick Goodman promises instant relief from slow MySQL reporting queries using dynamoDB. And no gooey applicator!
Robert Hodges of the Scale-Out Blog looks at replicating from MySQL to Drizzle and beyond. “I am . . . delighted that Marcus Erikkson has published a patch to Tungsten that allows replication from MySQL to Drizzle. He’s also working on implementing Drizzle-to-Drizzle support, which will be very exciting. . . . This brings up a question–what about replicating from MySQL to PostgreSQL? What about other databases?”
And back to xaprb, where Baron confesses, I’m a Postgres user, as it turns out.
PostgreSQLBruce Momjian has published a new security talk, Securing PostgreSQL From External Attack.
Andrew Dunstan delves into recursion in Recursion, n. See recursion. “Never,” he says, “underestimate the usefulness of silly demos (this is written for a talk next week) to teach things worth knowing.”
Bernd Helmle shares a walk-through of cloning Slony nodes. “The new stable branch 2.0 of Slony-I is out for a while now. Time to blog about one of my favorite new features there, cloning an existing node without doing an initial SUBSCRIBE command.”
SQL ServerIt was PASS Summit this week. On Home of the Scary DBA, Grant Fritchey covers the event with several good posts, including PASS Summit 2009 Key Note 3. (Grant is also Geek of the Week! Congratulations, Grant! I guess. Quote: “I think most DBA’s have adminhood thrust upon them. I think the ‘accidental’ DBA is the most prevalent path into becoming a DBA. I became a full time Admin by opening my mouth once too often.”)
Aaron Bertrand also has his summary Blogging from the PASS Keynote: 2009-11-03. (Grant and Aaron both have to specify which keynote they mean, because there’s more than one keynote. This, I guess, is “keynote redundancy”, but I still think PASS needs to normalize.)
Greg Low announces the launch at the PASS Summit of a new book, SQL Server MVP Deep Dives. “This is no ordinary book,” he writes. “Paul Nielsen took up Steve Ballmer’s challenge at a recent MVP summit to do something notable to give back to the community. He organised a large group of SQL Server MVPs to create a unique book and worked with Manning to get it published. The money made on the book was to go directly to a charity and the charity chosen was WarChild.”
Ben Nevarez asks, Are You Using Scalable Shared Databases? “Did you know that you can share read-only databases between several instances of SQL Server? . . . Scalable Shared Databases is a very interesting SQL Server feature that many of us seem to almost have forgotten about . . . ”
Here’s Roman Rehak reporting an issue with restoring 2000 backups on 2008. He writes, “Recently we’ve been experiencing a lot of headaches with SQL Server 2008 crashing while restoring a backup taken on a SQL Server 2000 production server. The crash resulted in a stack dump but SQL Server would continue running, although less stable, and sooner or later needed a reboot.”
Meanwhile, Adam Machanic reports on SQL Server 2008: lock escalation, INSERTs, and a potential bug. Adam says, “Lock escalation is a funny thing. I’ve found myself on numerous occasions waging war against its concurrency-sapping existence, and rarely have I found myself wishing that it would work more aggressively. But there is a time and place for everything, and yesterday I discovered that a major change has occurred with regard to lock escalation in SQL Server 2008.”
OracleMohammed Mawla on the Pythian Blog bridges the gap with his item on running the same query against multiple SQL Server AND Oracle instances.
Surachart Opun shares his HOWTO on using DUPLICATE without a connection to target database: “ . . . that’s a 11gR2 Feature. DUPLICATE can be performed without connecting to a target database. This requires connecting to a catalog and auxiliary database.”
Here’s another HOWTO, this one from the great grandson of Husnu Sensoy: How to Install Oracle 11g Release 2 on OEL 5.4 on VirtualBox: Installing Grid Infrastructure. He begins, “In Oracle 11g Release 2 you will find that things have changed even for single instance database installation. I will try to illustrate in this series of posts how to install a single instance Oracle 11g Release 2 database to your Linux machines.”
But let’s step back a bit. Ronny Egners says, Oracle on linux – yes of course – but what linux?. “There is a discussion from December 2008 what Linux (SLES vs. Red vHat vs. Oracle Enterprise Linux) to use for running oracle on Linux by Yann Neuhaus. . . . After nearly one year i wanted to catch up the article and check if the pros and cons are still valid or if there changed anything.”
Chen Shapira offers The Senile DBA Guide to Troubleshooting Sudden Growth in Redo Generation, which begins, “I just troubleshooted a server where the amounts of redo generated suddenly exploded to the point of running out of disk space. . . . The problem was found and the storage manager pacified, I decided to save the queries I used. . . . It was very embarrassing to discover that I actually have 4 similar but not identical scripts . . . Now I have 5.”
Embarrassing, Chen! But do you have guilty feelings like Martin Widlake does? He makes a guilty confession. The sin? “I use the Buffer Cache Hit Ratio.”
Last, Tyler Muth introduces Logger, A PL/SQL Logging and Debugging Utility.
That is all for now. Please let’s hear your favourite database blogs in the comments. Until next time!
Installing SQL Server 2005 Reporting Services 32-bit on a Windows Server 2003 64-bit
Installing a 32-bit version of SQL Server 2005 Reporting Services on a Windows Server 2003 64-bit could be a bit of a challenge, as it requires IIS. I have seen several customers who purchased servers with Windows Server 2003 64-bit pre-loaded while they only have a license for a 32-bit SQL Server 2005.
It is always recommended to have a 64-bit application running on a 64-bit OS to take full advantage of the 64-bit platform. IIS, by default, runs 64-bit on a 64-bit Windows Server system. SQL Server Reporting Services requires ASP.NET which can be manually installed on top of IIS.
When you install the .NET Framework on a 64-bit machine, you have both the 32- and 64-bit versions. Nonetheless, you will not be able to install ASP.NET 32-bit version on a 64-bit IIS, which will be needed by SQL Server Reporting Services.
To be able to install ASP.NET 32-bit on a 64-bit IIS, you need to configure IIS to run 32-bit web applications. With Windows Server 2003 Service Pack 1, IIS can be enabled to run 32-bit applications on a 64-bit Windows using the Windows32-On-Windows64(WoW64) compatibility layer. This makes it possible to run ASP.NET 32-bit and other 32-bit web applications as well as allow creation of 32-bit worker processes.
To enable IIS 6.0 to run 32-bit web applications on a 64-bit Windows, navigate to the %windir%\Inetpub\AdminScripts directory. Run the adsutil.vbs script with the following parameters:
csript.exe adsutil.vbs set W3SVC/AppPools/Enable32BitAppOnWin64 "true"
This will enable IIS to run 32-bit web applications. You can then install SQL Server 2005 Reporting Services 32-bit which will install the corresponding ASP.NET version as part of the .NET Framework.
If, however, you already have the .NET Framework installed on the server, you will need to manually install ASP.NET 2.0 32-bit on IIS after it has been configured. Doing so before configuring IIS will throw an exception stating that you cannot run 32-bit ASP.NET on 64-bit IIS. Navigate to the %WINDIR%\Microsoft.NET\Framework\v2.0.50727 folder (the 64-bit version of the .NET Framework will be at the %WINDIR%\Microsoft.NET\Framework64\ folder). Run the aspnet_regiis.exe utility on the command line.
aspnet_regiis -i
Now that all the groundwork has been prepared, you are ready to run SQL Server 2005 Reporting Services 32-bit on a Windows Server 2003 64-bit.
If, however, you decide to install it on a Windows Server 2008 64-bit, there’s more groundwork to do. By default, IIS 7 as well as ASP.NET are not installed. You have to add the web server role and further configure IIS. SQL Server 2005 Reporting Services is dependent on the IIS 6 or earlier metabase, and thus requires installation of the IIS 6 Metabase Compatibility and IIS 6 WMI Compatibility. Installing these two would also configure the IIS Default Application Pool to run on 32-bit similar to running the adsutil.vbs script for Windows Server 2003.
A Microsoft KB outlines the detailed steps in preparing IIS7 to run SQL Server 2005 Reporting Services on Windows Vista, but the steps work pretty well in Windows Server 2008. In case you still see the installation error on required 32-bit ASP.NET, you can switch the configuration of the Default Application Pool to Enable 32-bit applications. You don’t really need to do this once all the IIS 7 groundwork has been completed but, then again, you’ll never know. It might come in handy.


