Edward Whalen
Oracle VM
Enabling the 11gR2 Grid Infrastructure on Windows
With this new information in mind I immediately recommended that the 11gR2 Grid Infrastructure be installed. This provides the most stable and robust clusterware and ASM software and at the same time extending the lifetime of these components before they need to be upgraded.
In addition, by using the 11gR2 Grid Infrastructure both 10g and 11g RAC databases can be created. This will allow you to create an 11gR2 database for testing. Since the architecture of RAC has changed slightly in 11gR2 it is best to move to this structure for any new installations, even if a 10g database is the end result.
So, what has changed? In Oracle 10g and 11gR1 there were three separate components and/or Oracle homes that were used for RAC; the CRS, the ASM home and the database home. In 11gR2 the clusterware and ASM homes are combined into the Grid Infrastructure and have one home. In addition, the confusion as to which listener to use is clarified by having the cluster, ASM and master listener run out of the Grid Infrastructure. Now, instead of three Oracle homes, there are two; Grid Infrastructure and database.
So, if you are planning any new RAC installations on Windows, do yourself a favor and use the 11gR2 Grid Infrastructure, even if you are installing a 10g database.
Using Oracle Parallel Query in a RAC Environment
Let’s say that you have a table scan that is going to read 1,000,000 blocks. Furthermore, ignoring all disk caching, driver optimizations, etc. let’s assume that each physical I/O takes 2ms. So, in order to read this entire table into the SGA it will take 1,000,000 I/Os x 0.002 sec/I/O = 2,000 sec which is 33.3 minutes. So, it will take a half an hour (plus overhead) just to read the data into the SGA. Most of the time which is spent idle waiting on the I/O to return.
Now let’s turn on parallel query with a parallelism of 10. Oracle will internally divide this 1,000,000 block table into multiple pieces and pass those pieces off to the 10 parallel query processes. If you are using anything other than a single SATA drive for your database the I/O subsystem can easily handle the parallelism and will now return that same 1,000,000 blocks in 1/10th of the time, or 3.33 minutes.
Note: I prefer to have control over my parallelism, so I will use the parallel hint to invoke parallel query as such; SELECT /*+ PARALLEL(table_alias, degree) */ FROM table;
Taking it up a notch, now let’s consider that the operations are consuming all of the CPU in the system. I can now additionally add nodes to use in a RAC cluster for parallel query, thus taking advantage of parallelism and RAC. This can be done by extending the hint to include the number of nodes to use in a parallel query operation as such; /*+ PARALLEL(table_alias, degree, instances) */ FROM table; This will allow me to specify not only the number of query slaves to use, but the number of instances to use as well.
This is only part of the story. Now it is necessary to specify whether we are talking about 10g or 11g. Let’s start with Oracle 10g.
Parallel Query and Oracle 10g RACWith Oracle 10g, the number of RAC nodes that will participate in a parallel query operation depends on a couple of parameters; instance_groups and parallel_instance_group. The instance_groups parameter is used to set up instance_groups and the parallel_instance_group parameter is used to specify which group you belong to.
First, set up the instance groups. In this example I have a four node RAC cluster:
INSTANCE_GROUPS
Assign the node a set of instance groups that it participates in
- node1.instance_groups = node1,first_two,all_nodes
- node2.instance_groups = node2,first_two,all_nodes
- node3.instance_groups = node3,last_two,all_nodes
- node4.instance_groups = node4,last_two,all_nodes
Setting the parallel_instance_group parameter will define which group you belong to, thus defining where your parallel queries can run.
For example, if I set my parallel_instance_group parameter to Node1, my parallel queries will only run on node 1 as shown here:
Setting the parameter parallel_instance_group in the init.ora file will specify the default value for each user who logs into that instance. An alter session command can be used to modify that parameter as shown here: SQL> alter session set parallel_instance_group=all_nodes;
This will allow the parallel query to span all nodes in the RAC cluster.
Note: You don’t have to be on the node that parallel query is configured to run on. If you happen to log onto node 3 and your parallel_instance_group is Node1 non-parallel queries will run on node 3, but any parallel queries will run on node 1.
Parallel Query and Oracle 11g RACSo, what about Oracle 11g? If you are running on Oracle 11g forget everything I just told you. One of the best features about Oracle 11g is that it now uses services to specify which nodes your parallel query runs on. So, depending on which service you connect to, the preferred nodes for that service are the nodes that parallel queries will use.
ConclusionsDepending on the operation that you are doing, you might be able to achieve huge performance gains by parallelizing across multiple nodes in a RAC cluster. Of course your mileage might vary, and if you aren’t careful you might increase the load on nodes to a point where you are affecting other users. I have a saying that “If you aren’t careful with parallelism, you might paralyze your system.”
Using Oracle in a Windows VM
With Windows Server 2008, everything needed to support database server virtualization is available as an integral part of the operating system – Windows Server 2008 Hyper-v. This whitepaper demonstrates the performance capabilities of Oracle 10g on Hyper-v. It also has provides several best practices and resources essential to virtualization of Oracle database workloads.
Please visit our website and take a look.
What Do a Few Bits Matter
Oracle on Microsoft Windows is unlike any other platform that Oracle supports. When Oracle was originally ported to windows, the decision was made to take advantage of Windows threads. A thread, sometimes known as a lightweight process shares the same context as the calling process, thus the context switch is replaced by a thread switch (which uses much less CPU resources). With Oracle for windows, all traditional processes (background and server) are implemented as threads. This provides for more efficient processing, but causes another problem.
Within the thread model, all threads within a process share the same virtual memory address space. With Windows 32-bit, this virtual address space is 3 GB in size (with /3GB flag, 2 GB otherwise). Thus all of the Oracle processes share the same 3 GB virtual memory limit. In a process environment such as Unix, each process has its own 3 GB virtual memory address space. This limitation can cause the virtual memory space to be completely used, thus causing failures. The symptom of this in 32-bit Oracle is a failure to allocate memory for new dedicated server processes.
This problem has been solved with the 64-bit version of Oracle for Windows. The 64-bit version of Oracle for Windows supports a 16 Terabyte virtual memory limit. In addition, memory above 4 GB can be accessed directly, indirect data buffers need not be used. This makes the use of memory above 4 GB much more efficient and faster.
When implementing Oracle with an SGA that is more than 4 GB in size I always recommend the use of large memory pages. Large memory pages are much more efficient for large SGAs and makes a huge performance difference, but that’s going to be the subject of my next blog. For information on how to use large memory pages in Oracle on Windows see Metalink note 422844.1.
The limitations in this article lead me to strongly recommend the use of 64-bit Oracle on 64-bit Windows. So, "Just say no, to Oracle on 32-bit Windows".
IOUG Collaborate 2008
What hardware should I buy?
I recently had one of my blog readers ask me for some advice on some new hardware that he was going purchase for running Oracle on Windows. It is difficult to give specific advice on what to purchase, but I can provide a few general guidelines.
- Get something that is expandable. If you don’t need 4 CPUs now you can get a system capable of supporting 4 CPUs, but only purchase one or two. Make sure that you can add sufficient memory as necessary. Start with 2 or 4 GB but make sure that there are free slots in case you need to add more.
- If you will be running Oracle 10g, absolutely go 64-bit. Any recent Xeon or Opteron processor supports 64-bit Windows. The 64-bit version of Windows 2003 works great and is priced similar to the 32-bit version.
- Get a name brand. HP, IBM, Dell, etc. Get something that is supported by the manufacturer.
- If possible, separate the application tier from the database tier.
- Get sufficient disk drives. I'm not saying that you need to start with 8, but you need enough so that IO performance is not a problem. How many do you need? I can't tell you without knowing more about the database and application, etc.
- Absolutely use a RAID controller and disk mirroring (RAID 1). If you lose your data, you are out of business.
- In larger systems I recommend separating the log drives and data drives, since the loss of one of the two is a recoverable failure. The loss of both is catastrophic.
- If you don't already have one, get something to back up your database with. This can be tape, DVD, etc..
- Get trained, get some books, etc.. I'm trying to convince Oracle Press to let me do an Oracle on Windows book.
I hope that this is helpful. If you have any comments or a suggestion for a future topic, drop me an email at ewhalen@perftuning.com.
Oracle on Windows vs. Oracle on Linux
In addition, I feel that the way Oracle had developed the Oracle Database Server for Windows using the threading model would turn into an asset once 64-bit Windows is adopted. Remember that 32-bit Oracle on Windows suffers from virtual memory issues that are solved with 64-bit Windows. Thus the liability of using the thread model has turned into an asset (see previous blog).
In order to dispel any rumors or conjecture on whether Windows or Linux works better on the same hardware we recently ran a comparison. This comparison was done using the SwingBench tool. The result of this comparison is provided in a white paper which we have just published on our website. In order to get to this whitepaper follow this link.
Choosing the best OS for your environment involves more than just the performance of the database server. The Oracle Database Server on Windows provides compatibility with Active Directory and your entire integrated environment.
Oracle on Windows and NAS Storage
I have been asked the question "Does Oracle support NAS on Windows". Officially Oracle does not support native NAS devices per-se on Windows. I am speaking specifically of the CIFS protocol. This is not true for all platforms; Oracle does support NAS on Unix with NFS, but not on Windows. However, with the introduction of iSCSI Oracle does now support NAS on Windows.
The iSCSI protocol uses embedded SCSI commands within IP packets. This allows for storage to be accessed over the network, but at the same time take advantage of a proven storage protocol. The main benefits of iSCSI is that it is cheaper and potentially easier to use than the fibre channel equivalent.
In June of 2003 Microsoft released the iSCSI initiator and driver for Windows 2003. In addition, Microsoft has qualified a number of storage vendorÂs products for use with Windows 2003. At the same time, Microsoft has introduced Windows Storage Server 2003, which allows a Microsoft Server to become the iSCSI NAS.
With iSCSI Oracle on Windows is completely supported as both a stand-alone Oracle database server or an Oracle RAC server. Yes, Oracle does support NAS on Windows.
Microsoft and Oracle Work Together
In all the years that I have worked with both Microsoft and Oracle I have never seen so much interoperability and cooperation. This week Oracle announced support for People Tools on SQL Server 2005. See the announcement at http://www.oracle.com/corporate/press/2006_jul/peopletools-848.html. This is just one in many examples of how the interoperability between Oracle and Microsoft products is getting better. Other examples include the ability to integrate Oracle logins with Windows Active Directory and the extensive support of Oracle products on Microsoft Windows.
Oracle was the first database software available for the Windows NT platform and was also the first database software available for the 64-bit editions of the Windows platform. Oracle supports not only their database software, including RAC, on Windows, but also People Tools, Oracle eBusiness Suite software, and the Oracle Fusion Middleware products. Oracle even supports Microsoft clustering with Oracle Fail Safe.
Recently a new type of interactivity has started occurring. Oracle Grid Control monitors the Windows OS, but now does much more. Oracle is now providing Grid Control Plug-ins for the following Microsoft products:
- Active Directory
- BizTalk Server
- Commerce Server
- IIS
- Internet Security and Acceleration (ISA)
- Microsoft .NET Framework
- Microsoft SQL Server
These Plug-ins allow Oracle Grid Control to monitor and provide alerting functions, so that in the event of a failure, you will be notified via email, pager, etc. Oracle has expanded Grid Control in order to provide a full service management and monitoring utility.
On the other side of the fence, Microsoft does not explicitly create Microsoft Operations Manager (MOM) management packs for non-Microsoft products, but has opened this up to the 3rd party vendors. There are several management packs for Oracle products available today. This allows the extensibility of MOM to be used with Oracle and any OS that you happen to run Oracle on.
This week I happened to be working on my upcoming “SQL Server 2005 Administrator’s Companion” book and was going through the editorial process on replication. With SQL Server 2005 replication, you can replicate data into a SQL Server 2005 database from an Oracle publisher. This allows SQL Server reporting servers and batch servers to get their data directly from Oracle using replication. In addition, it is possible to set up an Oracle subscriber to use a SQL Server publisher, thus using Oracle for reporting and SQL Server for OLTP. It is also possible to perform distributed queries between Oracle and SQL Server databases by configuring the Oracle database as a linked server within SQL Server.
There has always been cooperation between Oracle and Microsoft for using Oracle on the Windows platform, but the management cooperation has just begun. In the next few years you will most likely shop around and decide whether you like the Microsoft tools or the Oracle tools better and then implement that tool throughout your entire enterprise.
Oracle for Windows Hardware Guide
I’d like to stray a little bit from the pure software aspects of Oracle on Windows to focus on the hardware and OS perspective in this blog. The Oracle software is very important, and the version of Windows 2003 is also very important, but if you do not have the proper hardware running this software, you might be headed for trouble. In this installment I will be describing the hardware components necessary to run Oracle and Oracle RAC on Windows as well as discussing some of the new technologies that are available today.
When putting together a system to run Oracle, you must consider not only the amount and type of hardware, but the brand of hardware. In addition, it is important to properly size the system. All of these tasks lead you to putting together an optimal system for your environment.
I’m not going to recommend that you purchase a particular brand of hardware, but I would recommend that you purchase your hardware from a company that you recognize. Not only is the quality of the hardware important, but the quality of the support as well. Many vendors go the extra step and test their products with Oracle on Windows in order to provide an optimal solution. It is especially important when implementing Oracle RAC that you choose a vendor that can provide a tested solution. These vendors will often supply you with white papers or other docs on how to optimally configure their system for Oracle RAC on Windows.
With Oracle RAC, the storage selection is as important as the selection of the actual servers. The storage system must provide the ability to be shared, be robust and be able to support a sufficient number of disk drives to achieve the performance that you need. In addition, the storage should be configured in order to avoid a single point of failure. In this installment of my blog I will discuss the servers, the storage and the infrastructure.
Servers
When choosing a Windows server to run Oracle there are several features that you should look for. The CPU is the brains of the computer and should be selected for the type of processor, the cache size and its speed. Because of the memory requirements of the dedicated and shared server processes, the larger the better is usually the case with the cache on the CPUs. The more CPUs there are in the system, the more important the CPU cache is. If you have a choice between a faster CPU and more cache, with Oracle it is usually better to go with more cache.
The type of CPU offers quite a selection now. AMD has an excellent processor with the Opteron line, and Intel has done a great job with the EM64T processor. They are both available with single or dual cores and both are x64 processors. I only recommend the Itanium 2 processors for specific applications where a large system with many processors is required due to the cost differential between the x64 systems and the Itanium systems. In more cases, the x64 (Opteron or EM64T) processors will work great for you.
Storage
Storage is one of the most crucial components of your Oracle system. The storage subsystem must be robust, high performing and expandable. Today there are many options to choose from. In the past I would have only recommended SAN storage for an Oracle RAC system, but recently I have had very good experience with both NAS and iSCSI solutions. This in conjunction with Microsoft’s release of its iSCSI support for Microsoft Storage Server 2003 R2, iSCSI is not much more interesting.
When selecting a storage subsystem make sure to minimize the single points of failure. This is done by creating as many redundant components as possible. In addition, not only the size of the disk drive is important, but the number of disk drives is crucial. A state of the art disk drive should be configured to do no more than 125 IOs Per Second (IOPS). Running disk drives hotter than this will cause latencies to increase, thus causing system performance problems.
Infrastructure
The network infrastructure is an important piece of your entire system, especially with network storage. If you are using Network Attached Storage (NAS) or iSCSI you should dedicate a network specifically for storage. By sharing the storage network with other network functions such as the RAC interconnect or general network connectivity you could create higher latencies and thus reduced performance. Since the speed of the network is critical to your storage performance (in a NAS or iSCSI environment) you should not use anything less than Gigabit speed networks.
Network storage should always be available locally, that is, there should not be any active components between the database server and its storage; such as routers, firewalls, etc. The only components that should reside between the database server and its storage is one or more network switches. Active components such as routers and firewalls can add too much latency.
Summary
In summary, it is important to think about the hardware that you will be deploying in terms of both stability (supportability) and performance. Many of the performance engagements that I am brought in on are caused by an undersized system. If you are using network attached storage, make sure that you have a sufficient network infrastructure to handle it.Oracle Services and the Windows Registry
For those of you who have used Oracle on Windows you either think of the Windows registry as a blessing or a curse. The Windows registry is critical for the setup and maintenance of services as well as to maintain parameters relating to Oracle Home, etc. Without the registry, or with an incorrectly configured registry, Oracle on Windows will not function. Thus it is unwise to modify the registry own your own. Let the Oracle installer and DBCA modify the registry.
One of the main differences between Oracle on Windows and other platforms (besides the registry) is the use of services. With the Windows operating system any process started by the login user will exit when the user exits. This would not be good for something like the Oracle database, thus the Oracle instance is started as a Windows Service.
A Microsoft Windows service allows you to create long-running executable applications that run in their own Windows sessions. These services can be automatically started when the computer boots, can be manually started, stopped and paused. Because they run in their own session, they do not contain a user interface, but they can take parameters or use registry parameters. Services are used for long-running functionality that does not interact with other users, thus is good for server applications such as the Oracle Database Server and Oracle Listener. In addition, services can be invoked under the user context of a domain user, local user or the system account. Thus the Oracle service can be started by a user other than the Oracle database owner, but will run under the context of the Oracle user.
By default, the Oracle database server must create a minimum of two services:
- OracleServiceSID is the Oracle bootstrap service that is used as the service that the Oracle instance runs under. The Oracle bootstrap service must be started in order for the Oracle instance to start, however, the Oracle instance can be started up or shutdown manually once the bootstrap instance is running.
- OracleHOME_NAMETNSListener is the service that controls the Oracle listener.
These are the only two services that are critical to the Oracle instance (unless you are using ASM). Other services that you might encounter include:
- OracleDBConsoleSID is the Oracle Enterprise manager DB Console service.
- OracleJobSchedulerSID is the Oracle job scheduler process.
- OracleHOME_NAMEiSQL*Plus is obviously the iSQL service.
- OracleServiceASM_SID is the ASM instance service.
As I mentioned earlier, the OracleServiceSID is really just a service to bootstrap the Oracle instance. This service is required before the Oracle instance can be started, but if this service is running, you can still use SQL*Plus to start and stop the Oracle instance. The registry parameters HKEY_LOCAL_MACHINE->SOFTWARE->ORACLE->KEY_HOME_NAME->ORA_SID_AUTOSTART and ORA_SID_SHUTDOWN controls whether the Oracle instance automatically starts and stops when the OracleServiceSID service does. In addition to these parameters, the registry location HKEY_LOCAL_MACHINE->SOFTWARE->ORACLE->KEY_HOME_NAME holds some other very useful entries:
- ORACLE_BASE is the location of the Oracle base directory for this Oracle home.
- ORACLE_HOME is the location of the Oracle home directory for this Oracle home.
- ORACLE_SID is the Oracle System Identifier or SID.
- SQLPATH specifies the location of SQL scripts.
- RDBMS_CONTROL specifies the location of control files. This is where Oracle will look for the spfile or the pfile for this instance.
There are other registry parameters, but I have touched on some of the most important. Beware, registry parameters must be exact. Spelling and capitalization are extremely important. If you do not have the parameter exactly right, it will be ignored and you will be debugging an illusive problem.
The registry is very useful in that it centralizes the location of parameters and settings for the Oracle installation. However, if something goes wrong, it can sometimes be very difficult to fix, so be careful.
Services are an important part of Windows 2003 and the entire Windows family. Without services it would be very difficult to implement Oracle without requiring the Oracle user to stay logged in at all times. Oracle on Windows is different from other platforms, and this is what distinguishes it.
Oracle Performance Monitoring on Windows
Performance monitoring is a regular task for most Oracle DBAs. For Oracle on Windows this is no exception. The only difference are the tools that are available to you. With Oracle on Windows you still have all of the tools available to you that are available on other platforms, such as the Automatic Workload Repository, but in addition, you can view Oracle statistics via the Windows Performance Monitor (perfmon).
There are several advantages to monitoring Oracle via perfmon. By using perfmon, you can collect Oracle data as well as OS data that might be important to your overall performance monitoring plan. By consolidating your data collection, you will find it easier to analyze data. In addition, perfmon data can be collected and analyzed via the Microsoft Operations Manager (MOM) product. MOM allows you to collect and analyze long term data as well as to configure alerting functions that can let you know that there is a problem.
By default, Oracle statistics are not collected for use by perfmon. By following the steps outlined here you can quickly and easily configure Oracle for perfmon data collection. By default, you still have many ways of monitoring Oracle. Among these are the following:
- Oracle Automatic Workload Repository. The Automatic Workload Repository (AWR) is performance data that is automatically collected by Oracle for performance and trouble analysis.
- Event Viewer Alerts. The event viewer alerts are events that are automatically logged directly into the Windows alert log.
- Oracle Alert Log. The Oracle alert log is the log where Oracle logs status and error messages.
- Trace Analyzer. Oracle sessions can be traced and that trace data analyzed via the Oracle trace analyzer.
The Oracle Counters for Windows Performance Monitor package is not installed by default. In order to install them when you install Oracle, select the custom install option. You can also install this option later via the Oracle installer. Select custom installation and select the Oracle for Windows Performance option. This will install this package.
Once Oracle Counters for Windows Performance Monitor has been installed, you must perform one more piece of setup. The Oracle performance counters are set up to monitor one Oracle instance. Information about this instance must be configured in the registry. In order to do this, from a command prompt run orafcfg.exe with a username, password and Oracle net service name as follows:
operfcfg –U system –P password –D orcl
This will update the registry. You should now be able to monitor Oracle via perfmon. Some of the things that you can monitor are:
- The Oracle Buffer Cache. Here you can see the cache miss ratio.
- Shared Pool Stats. This collection includes the data dictionary cache, and the library cache.
Log Buffer. Provides information on log space requests. - Database Data Files. This object provides physical read and write per second counters.
- DBWR stats. Provides information on the DB Writer processes.
- Miscellaneous. Other statistics include dynamic space management, free lists and dynamic sorts.
By taking advantage of Oracle Counters for Windows Performance Monitor you can easily and efficiently monitor Oracle along with monitoring the OS. As I mentioned earlier, perfmon provides valuable performance data that is easily collected and analyzed. Some of the most important and first counters that I look at when performance monitoring a system are:
- Processor: %Processor Time. This gives me a quick look at how busy the system is.
- Physical Disk: Avg. Disk sec/Read, Avg. Disk sec/Write. This provides me with an overview of how well the I/O subsystem is doing.
When first looking at a system I am actually more interested in disk latencies than throughput. The Avg. Disk sec/Read and Avg. Disk sec/Write should be in the range of 5-15 ms (0.005 – 0.015). Anything higher than this indicates a problem. Once I have detected a problem, I can try to determine its cause.
Of all of the platforms I work on; Linux, AIX, HP/UX, and Solaris, I find that Windows provides the best performance monitoring tool. Perfmon is easy to use, quite intuitive and easy to view and save the results in a number of formats. Oracle Counters for Windows Performance Monitor is not an end-all, but in conjunction with AWR stats and tracing will help to provide a complete picture.
I/O Performance and Oracle
One of the biggest problems that performance tuning consultants have encountered in the last few years is the introduction of larger and larger disk drives. This has led to database systems being created with larger and larger disk drives, but fewer of them. This has led to an epidemic of I/O performance problems. Let me explain further.
Most I/O in Oracle is random in nature. This is due to the fact that you have a system where many people are looking for many diverse pieces of data. In addition, structures such as indexes work by retrieving one block that points to another block and finally to the data block itself. Thus almost all I/O can be considered random.
The basic structure of the disk drive has not changed significantly in the last few years. Since disk drives have gone from 2 GB in size to 300 GB in size, the average seek time has dropped from 9 ms to 4.5 ms. This represents a 150 times increase in size, with a 2 times increase in performance. This has caused a problem where large databases are being configured on fewer and fewer disk drives.
To make things worse, some vendors are now selling Serial-ATA drives that have an average seek time of 8 ms or higher. This represents a 250 times larger drive with essentially no increase in performance. SATA drives should be avoided for Oracle storage if Fibre Channel or SCSI drives could be used.
At the same time, with Oracle on Windows (32-bit) databases are getting larger and larger with memory not increasing significantly. This has caused an upturn in I/O problems occurring with Oracle systems on Windows. The good news is that the solution to this problem is simple, add more disk drives.
How do you determine if you are having I/O problems? There are several methods. The Windows performance monitor (Perfmon) provides a wealth of I/O performance data. To me, the most significant I/O counters are Avg. Disk sec/Read and Avg. Disk sec/Write. These represent the disk latencies and are your first indication that you are having a problem. Disk latencies should be in the range of 5 ms – 15 ms for reads and 0 – 10 ms for writes (if a write cache is used). The performance counters for Windows are very accurate and very useful in determining if you have a performance problem.
In addition to the Windows Perfmon counters, Oracle AWR reports can also be quite useful in determining if you are having an I/O problem. Look for the top 5 wait events. If you see one or more of the Top 5 Timed Events in the System I/O wait class, this is an indication of an I/O problem. In addition, the I/O statistics also report average read ms and average write ms per data file and per tablespace. This information is very useful.
In summary, I/O is critical to the performance of your Oracle Database system. Don’t confuse size with performance. The performance capacity of your I/O subsystem depends on the number of disk drives, not the size of the drives. Performance and sizing are critically linked. An undersized I/O subsystem will perform poorly no mater how much CPU capacity and memory you have. If you are experiencing performance problems, check the I/O subsystem first.
Oracle High Availability on Windows using Fail Safe
Oracle on Windows supports the same Oracle Disaster Recovery (DR) and High Availability (HA) features as other platforms, such as Oracle RAC and Data Guard. In addition to these, Oracle on Windows supports Oracle Fail Safe. These are all excellent solutions with different usages, costs and performance.
Oracle RAC is a high availability solution that allows multiple Oracle instances to access the same database. This provides for both high availability and additional performance, since all nodes are active. Oracle RAC is an excellent product, but somewhat complicated to install and manage and can be expensive.
In addition to Oracle RAC is Oracle Data Guard. Data Guard is really more of a Disaster Recovery (DR) solution than a High Availability solution, but it can be very useful in the event of a system failure. Data Guard is a fantastic product which I have installed many times. Data Guard is one of those products where the base installation can be fairly straight forward, but there are so many options and modes of operation that it can quite quickly become very complicated.
The final solution in this discussion is Oracle Fail Safe. Fail Safe is a Windows only product that makes Oracle Microsoft Cluster Services (MSCS) aware. Failsafe is available for all Windows platforms (32-bit, x64 and Itanium) and is downloadable from the Oracle web site. Fail Safe is a plug in to MSCS to allow the Oracle instance and listener to be managed under the control of MSCS. This allows the Oracle instance to fail over in the event of a failure of one of the nodes.
The MSCS architecture is very straight forward. Two systems share the same disk, of which only one system controls at a time. In the event of a failure (determined by the heartbeat mechanism), the instance currently running the Oracle instance and controlling the storage is replaced by the standby system. This allows the Oracle database to be back up and running in a matter of minutes, thus improving availability.
Oracle Fail Safe has the advantage over RAC in that it is much easier to implement and administer. Most Windows shops already have implemented MSCS and are very familiar with it. In addition, Oracle Fail Safe is a core feature of Oracle9i and Oracle10g for Windows and no additional license is needed.
With Oracle Fail Safe you can fail over the following Oracle components:
- Oracle Database Server.
- Oracle Listener.
- Oracle Intelligent Agent.
The failover is seamless and works very well. In addition, the management tools allow you to configure Fail Safe to automatically fail back a node at a scheduled time, thus allowing you to configure different hardware resources for the primary and standby servers.
I have worked on multiple Oracle Fail Safe installations. In a Windows environment where you want to have assurance of High Availability at a low entry point cost, Oracle Fail Safe might be for you.
Configuring Oracle Fail Safe is easy once the Microsoft Cluster Services have been installed. The following steps are done in order to install Oracle Fail Safe.
- Install Windows on both servers.
- Install MSCS (if applicable) and create the cluster.
- Install Oracle Database Software on each node’s local disks, using the same Oracle Home directory.
- Create the Oracle database on one node using the shared disk for storage.
- Install Oracle Fail Safe on each node in the cluster.
- Configure the cluster using the Fail Safe manager.
You will find that Fail Safe is an easy to use, inexpensive way to provide high availability in a Windows environment.
Oracle RAC on Windows
For those of you who are not familiar with Oracle RAC, let me give you a brief overview. Oracle RAC (Real Application Clusters) has been around since Oracle9i. It is a follow-on product to Oracle Parallel Server (but much better). In essence, Oracle RAC is a clustering product that allows Oracle on multiple servers to access the same database.
This is unlike a typical failover cluster in that all nodes are active and accessing the same data. In a failover cluster only one node can access the database at a time. With RAC it is all nodes all the time.
Oracle defines an instance to be the memory and processes necessary to access the Oracle database. In a RAC cluster, there are multiple instances accessing the same database. RAC accomplishes this via a sophisticated locking mechanism that keeps data consistent among the nodes. Because of the way RAC does its locking and cache consistency, the performance of the cluster interconnect is critical. The cluster interconnect must be Gigabit Ethernet as a minimum.
Setting up RAC on Windows is pretty much the same as setting up RAC on other platforms with a few exceptions. Because of the way Windows does its networking, it is important to have the Public (talks to the world) interface listed first in the advanced network properties.
All nodes in the cluster must be able to access all other nodes without having to type in a password. This can be accomplished via several methods. The preferred method is to install and configure Oracle and RAC using a domain account. This account must be available on all nodes in the cluster and it must have local administrator rights. The second method is to use the local Administrator account, but make sure that the password is identical on all nodes in the cluster.
Because all nodes in the cluster access the same database, they must all be able to see the same storage. There are several methods supported under Windows for sharing storage. They include the use of raw devices, Oracle ASM (Automatic Storage Management) and the Oracle Cluster Filesystem (OCFS). Oracle ASM and raw devices are available on all platforms that support RAC, but OCFS is only available on Windows and Linux. I have implemented RAC clusters on Windows using all of these methods, and they all work well, but raw devices can be very difficult to administer. I prefer to either use ASM or OCFS. With OCFS and OCFS-2, Oracle supports the use of a shared Oracle Home directory so that there is only one copy of the Oracle binaries.
Oracle RAC is an important feature to Oracle since it provides a scalable solution that allows smaller systems to provide greater overall performance to the cluster. This opens the door for commodity hardware and operating systems such as Microsoft Windows to provide higher performance systems by combining the power of multiple systems into one cluster. Oracle currently supports up to an 8 node cluster with Oracle 10g R2 on Windows, however, the largest RAC cluster on Windows that I have been involved with has been 6 nodes. The most common clusters that I have seen are 4 or 2 node clusters.
I recommend using the largest system that you can, so that as you need more nodes you can add them in one at a time. It is easier to add a node than to add more resources to an existing node. Don’t forget to allocate plenty of memory for Oracle. The RAC cluster is scalable, but you will eventually run out of bandwidth on the interconnect if you add too many nodes (depending on what you are doing).
In general, Oracle RAC is a great product that provides for both scalability and failover capabilities. Microsoft Windows is a great platform to run Oracle RAC on.
Edward Whalen is the Chief Technologist and Founder of Performance Tuning® Corporation and has been working with Oracle for over 15 years. He is currently working on his 9th book; “SQL Server 2005 Administrator’s Companion”. He holds the distinction of being one of the few people to write books on both Oracle and MS SQL Server and writes for both Oracle Press and Microsoft Press.
Oracle on Windows Server 2003 x64 Edition
The introduction of Oracle on Windows Server x64 has overcome a limitation that has existed since the introduction of Oracle on Windows. The number of user sessions is limited with Oracle on Windows due to the way Oracle is architected on Windows and the basic architecture 32-bit chipset. Before I get into that, let’s review the 32-bit architecture and Oracle architecture.
The Intel x86 architecture was introduced in 1978 as a 16-bit processor. This was replaced in 1986 with a 32-bit version. I don’t think that anyone expected the 32-bit architecture to hang around this long. The 32-bit architecture allowed for up to 4 GB of virtual address space and 4 GB of RAM. This was later improved to allow 64 GB of RAM with PAE, but the virtual memory (process address) limits remain at 4 GB.
This remained in effect (for the PC server platform) until the introduction of the AMD Opteron and Intel EM64T processors. The EM64T and Opteron processors can access up to 256 Terabytes of virtual memory ( 2^48 bytes). The architecture allows this limit to be increased to a maximum of 16 exabytes (2^64 bytes). The EM64T and Opteron processors can currently address up to 1 Terabyte of physical memory ( 2^40 bytes). The architecture allows this limit to be increased to a maximum of 4 petabytes (2^52 bytes).
Oracle is made up of many independent processes (or threads) that perform independent functions. The server processes are created when a user connects into the Oracle instance and does work on behalf of the user such as retrieving data from the SGA, or reading data from the data files, modifying data in the SGA and presenting data back to the user. The background processes do work on behalf of the Oracle instance and consist of things such as the DB writer, Log writer, archiver, system and process monitors and many others. These are independent entities that make the Oracle RDBMS work.
In most operating systems the above mentioned entities are processes. In Oracle for Windows these entities are threads. The difference between a process and a thread is how memory is used. In essence, a thread is a subcomponent of a process. A process has its own memory and registers and acts somewhat independently. With the Intel x86 architecture a process can address up to 2 GB of memory, or 3 GB if the /3GB flag is used upon Windows startup. The operating system loads the process’s memory and registers into the CPU and start the process running. Whenever the process’s time slice is up or it puts itself to sleep waiting on an I/O or other resources the CPU unloads the process, loads another process’s memory and registers and starts it up. This is known as a process or context switch. The number of context switches per second can be seen in perfmon under the system object. A context switch is a very expensive operation.
In order to make this more efficient, Oracle was developed to use the Windows thread architecture. A thread is a subcomponent of a process. With a thread, multiple entities can use the same process, thus avoiding context switches. The process allocates the memory and each thread has a pointer into the same memory. The downside of using threads is that all Oracle entities, such as the server processes and background processes share the same 3 GB address space. This has caused an issue with memory, since each Oracle server process and background process can consume significant memory. The end result is that the number of Oracle user sessions can be limited on Windows. Our experiments have shown connection failures at around 300 – 400 users.
If you were to look into the task manager on a Windows system running Oracle you would see a process called oracle.exe. If you expand the columns to include thread count you would see that oracle.exe is made up of a number of threads (20-30, depending on what options you are using).
So, as you can see, the decision to use Windows threads for Oracle has turned out to be somewhat of a liability, since the virtual memory address space limitation has caused the number of supported sessions to be somewhat limited. However, the upside is that the reduction of contexts switches provides a performance improvement.
With the switch from Oracle 32-bit to Oracle 64-bit the virtual memory limitation is no longer an issue. Thus the liability has been turned into an asset. With no more session limitations and the additional advantages that you get from Oracle on Windows such as Active Directory integration, ease of use and stability it has become a great platform for Oracle. Our tests in the lab have seen no connectivity problems well over 1,000 sessions.
I am often asked by customers what platform they should choose for Oracle. I feel that there are many viable platforms. You should look at what you are comfortable with. If you are a Windows shop and are comfortable with Windows and have significant Windows expertise, then this is the platform that you should choose. Adding a foreign OS into a Windows shop just to run Oracle can be a costly mistake.
In the next few weeks I will be elaborating on some of the additional features that makes Oracle on Windows a great platform to use.