Skip Headers
Oracle® Database High Availability Overview
11g Release 1 (11.1)

Part Number B28281-01
Go to Documentation Home
Home
Go to Book List
Book List
Go to Table of Contents
Contents
Go to Index
Index
Go to Master Index
Master Index
Go to Feedback page
Contact Us

Go to previous page
Previous
Go to next page
Next
View PDF

2 Oracle Database High Availability Features and Products

Oracle Database offers an integrated suite of high availability solutions that increase availability and eliminate or minimize both planned and unplanned downtime. These solutions help enterprises maintain business continuity 24 hours a day, seven days a week. However, the Oracle high availability solutions go beyond reducing downtime by providing solutions to increase system utilization on the primary and secondary systems and helping to improve overall performance, scalability, and manageability.

The chapter contains the following sections that outline the key impacts of the Oracle high availability features on businesses and applications:

2.1 Oracle High Availability Features and Solutions for Unplanned Downtime

Oracle provides the following features for high availability:

Also, the Oracle HA Solutions and Recovery Times for Unplanned Downtime section provides a summary of the key HA solutions that address different types of unplanned downtime along with the recovery time that can be attained with each solution.

See Also:

2.1.1 Fast-Start Fault Recovery

Oracle provides fast and predictable recovery from system faults and database failures. The Fast-Start Fault Recovery technology included in Oracle Database automatically bounds database recovery time upon startup by using its self-tuned checkpoint processing. This makes recovery time fast and predictable, and improves the ability to meet service level objectives. Oracle's Fast-Start Fault Recovery can reduce recovery time on a heavily-laden database from tens of minutes to a few seconds.

Fast-Start Fault Recovery features include:

  • Predictable, bounded recovery from instance, database, and computer failures

  • Database checkpointing is self-tuning to maintain desired recovery time objective

2.1.2 Oracle Real Application Clusters and Oracle Clusterware

Oracle Real Application Clusters (Oracle RAC) and Oracle Clusterware allow the Oracle Database to run any packaged or custom application across a set of clustered servers. This capability provides the highest levels of availability and the most flexible scalability. If a clustered server fails, Oracle Database continues running on the surviving servers. When more processing power is needed, you can add another server without interrupting access to data.

Oracle RAC enables multiple instances that are linked by an interconnect to share access to an Oracle database. In an Oracle RAC environment, the Oracle Database runs on two or more systems in a cluster while concurrently accessing a single shared database. The result is a single database system that spans multiple hardware systems. This enables Oracle RAC to provide high availability and redundancy during failures in the cluster. Oracle RAC accommodates all system types, from read-only data warehouse (DSS) systems to update-intensive online transaction processing (OLTP) systems.

Oracle Clusterware is software that, when installed on servers running the same operating system, enables the servers to be bound together to operate as if they are one server, and manages the availability of user applications and Oracle databases. Oracle Clusterware also provides all of the features required for cluster management, including node membership, group services, global resource management, and high availability functions:

  • For high availability, you can place Oracle databases (single-instance or Oracle RAC databases), and user applications (Oracle and non Oracle) under the management and protection of Oracle Clusterware so that they restart on process failure or fail over to another node on node failure.

  • For cluster management, Oracle Clusterware presents multiple independent servers as if they are a single-system image or one virtual server. This single virtual server is preserved across the cluster for all management operations, enabling administrators to perform installations, configurations, backups, upgrades, and monitoring functions once. Then, Oracle Clusterware automatically distributes the execution of these management functions to the appropriate nodes in the cluster.

Oracle Clusterware is a requirement for using Oracle RAC and it is the only clusterware that you need for most platforms on which Oracle RAC operates. Although the Oracle Database continues to support select third-party clusterware products on specified platforms, using Oracle Clusterware provides the benefit of dispensing with proprietary vendor clusterware and using an integrated software stack from Oracle that provides disk management with Oracle ASM to data management with the Oracle Database and Oracle RAC. In addition, Oracle Database features, such as Oracle Services, use the underlying Oracle Clusterware mechanisms to provide their capabilities.

Oracle Clusterware requires two clusterware components: a voting disk to record node membership information and the Oracle Cluster Registry (OCR) to record cluster configuration information. The voting disk and the OCR must reside on shared storage. The Oracle Clusterware requires that each node be connected to a private network by way of a private interconnect.

Together, Oracle RAC and Oracle Clusterware provide the following benefits:

  • Ability to tolerate and quickly recover from computer and instance failures

  • Ability to apply Oracle Clusterware upgrades, patch sets, and interim patches in a rolling fashion

    For example, upgrading Oracle Clusterware from Oracle 10g to Oracle 11g, patching Oracle Clusterware from Oracle 10.2.0.3 to 10.2.0.4, and patching Oracle Clusterware from Oracle 10.2.0.2 Bundle 1 to Oracle 10.2.0.2 Bundle 2

  • Rolling upgrades for system and hardware changes

  • Rolling patch upgrades for some interim patches

  • Service relocation and, when you perform additional Fast Application Notification (FAN) and client configuration, applications can receive FAN events and react immediately to achieve fast, automatic, and intelligent connection and failover

  • Load balancing advisory

  • Runtime connection load balancing with Oracle JDBC, OCI, and ODP.NET

  • Flexibility to increase processing capacity using commodity hardware without downtime or changes to the application

  • Comprehensive manageability integrating database and cluster features

  • Scalability across database instances

Specifically, Oracle Clusterware provides the following benefits

  • Automatically restarts failed Oracle processes

  • Automatically manages and fails over Oracle Virtual IP (VIP) on another node in the cluster on node failures

  • Automatically restarts resources from failed nodes on surviving nodes

  • For Oracle RAC databases, all Oracle processes are under the control of Oracle Clusterware by default; for Oracle single-instance databases, you can configure the Oracle processes into a resource group that is under the control of Oracle Clusterware

  • For Oracle and non Oracle applications, Oracle Clusterware provides an application programming interface (API) that enables you to control other Oracle processes with Oracle Clusterware, such as restart or react to failures and certain rules

  • Manages node membership and prevents split brain syndrome in which two or more instances attempt to control the database

  • Provides rolling release upgrades of Oracle Clusterware, with no downtime for applications

2.1.3 Oracle Data Guard

Oracle Data Guard ensures high availability, data protection, and disaster recovery for enterprise data. Data Guard provides a comprehensive set of services that create, maintain, manage, and monitor one or more standby databases to enable Oracle databases to survive disasters and data corruptions. Data Guard maintains standby databases as transactionally consistent copies of the primary (production) database. Then, if the primary database becomes unavailable because of a planned or an unplanned outage, Data Guard can switch any standby database to the primary role, minimizing the downtime associated with the outage. Data Guard can be used with traditional backup, restoration, and cluster techniques to provide a high level of data protection and data availability.With Data Guard, administrators can optionally improve primary database performance by offloading resource-intensive backup and reporting operations to standby systems.

An Oracle Data Guard configuration consists of one primary database and one or more standby databases. Using a backup copy of the primary database, you can create up to nine standby databases and incorporate them in a Data Guard configuration. Once created, Data Guard automatically maintains each standby database by transmitting redo data from the primary database and then applying the redo to the standby database.

Similar to a primary database, a standby database can be either a single-instance Oracle database or an Oracle RAC database.

A standby database can be a physical standby database, a snapshot standby database, or a logical standby database, and a Data Guard configuration can include any combination of these types of standby databases:

  • Physical standby database

    Provides a physically identical copy of the primary database, with datafiles that are identical to the primary database. The database schema, including indexes, are the same. A physical standby database is kept synchronized with the primary database, though Redo Apply, which recovers the redo data received from the primary database and applies the redo to the physical standby database.

    A physical standby database can be used for business purposes other than disaster recovery. Starting in Oracle Database 11g Release 1, the physical standby database can be opened with read-only access while redo is being applied to the standby database. This mode, referred to as real-time query standby, allows users to access an up-to-date physical standby database for queries at any time. Also, you can convert a physical standby database into a snapshot standby database.

  • Snapshot standby database

    A snapshot standby database is an updatable standby database that you create from a physical standby database. A snapshot standby database receives and archives redo data received from the primary database, but the snapshot standby database does not apply redo data from the primary database while the standby is open read/write. Because of this, the snapshot standby typically diverges from the primary database over time. Moreover, local updates to the snapshot standby database cause additional divergence.

    Redo data is not applied until you convert the snapshot standby database back into a physical standby database, and after all local updates to the snapshot standby database are discarded. With a single command, you can revert a snapshot standby back to a physical standby database, at which time the changes made to the snapshot standby state are discarded, and Redo Apply automatically resynchronizes the physical standby database with the primary database using the redo that was archived.

  • Logical standby database

    Contains the same logical information as the primary database, although the physical organization and structure of the data can be different. The logical standby database is kept synchronized with the primary database though SQL Apply, which transforms the data in the redo received from the primary database into SQL statements and then executing the SQL statements on the standby database.

    A key benefit of a logical standby database is that you can create significant auxiliary structures to optimize the reporting workload, including structures that could have a prohibitive effect on the primary database's transactional response time. A logical standby database can have its data physically reorganized into a different storage type with different partitioning, having many different indexes, having on-demand refresh materialized views created and maintained, and it can be used to drive the creation of data cubes and other OLAP data views.

    A logical standby database can be used for other business purposes in addition to disaster recovery requirements. This allows users to access a logical standby database for queries and reporting purposes at any time. Also, using a logical standby database, you can upgrade Oracle Database software and patch sets with almost no downtime. Thus, a logical standby database can be used concurrently for data protection, reporting, and database upgrades.

Oracle Data Guard provides the following benefits:

  • Maintenance of real-time, transactionally consistent database copies to provide protection against unplanned downtime and disaster.

  • Data protection and fast repair against computer failures, human errors, data corruption, lost writes, and site failures.

  • Automatic failover with flexible data protection levels to support all network configurations and business requirements.

  • Faster redo application, redo transport, and role transitions with various enhancements.

  • Reduction of planned downtime for system changes, some platform migrations, hardware and system upgrades, and Oracle patch set and database upgrades (see also Table 2-1, "Outage Types and Oracle High Availability Solutions for Unplanned Downtime").

  • Multiple levels of data protection and performance to balance data availability against system performance requirements.

  • Support for the real-time query capability of physical standby databases provides better primary database performance and better use of standby resources, because real-time query:

    • Diverts read-only queries and transactions from the primary database to standby databases.

    • Diverts backup operations from the primary database to standby databases.

  • Support for both physical standby databases (real-time query) and logical standby databases to provide more efficient use of system resources by diverting more querying and reporting functions from the primary database to standby databases (with the logical standby databases providing greater flexibility for any activity that requires access to a standby database that is open read/write).

  • Support for snapshot standby database for reporting or testing (cloning) purposes and automatic resynchronization with the primary database once reporting or testing has completed.

  • Managed and automatic role transition and application notification to minimize planned and unplanned downtime.

  • Automatic or automated resynchronization of a failed primary database following a failover.

  • Management of all systems as a single configuration for simplified administration.

  • Increased flexibility for Data Guard configurations where the primary and standby systems may have different CPU architectures, operating systems (for example, Windows and Linux), operating system binaries (32-bit and 64-bit), and Oracle database binaries (32-bit and 64-bit); this is subject to restrictions that are defined in MetaLink note 413484.1.

    See Also:

    Metalink note 413484.1 at https://metalink.oracle.com/ for the latest capabilities and restrictions

2.1.4 Oracle Streams

Oracle Streams is a very flexible and powerful database feature to implement fine-grained replication, multimaster replication, many-to-one replication, data transformation, hub and spoke replication, and message queuing.

Comparing Streams and Data Guard

Streams is designed from the ground up for information sharing. It enables highly customized replication strategies to satisfy the many varied uses of data replicated to a target database. These same capabilities can also make Streams a useful technology for addressing HA/DR requirements and for minimizing planned downtime during upgrades to new database releases and patch sets.Data Guard is designed for simple, one-way replication of an entire database expressly for maintaining a synchronized copy that can assume the primary role in the event of a failure. Data Guard Redo Apply (physical standby) best exemplifies this notion of simplicity, as a disaster recovery solution that is both datatype and application agnostic, and able to scale to very high levels of performance. While Data Guard also provides capabilities that enable a standby database to offload the primary database of the overhead of performing backups, queries, and reports, these capabilities are ancillary to Data Guard's primary mission, and are provided to increase the ROI of customer HA/DR investments. You can use Data Guard SQL Apply (logical standby) to get additional value from a Data Guard configuration by minimizing planned downtime during upgrades to new database releases and patch sets.

Streams Messaging and Information Flow

Oracle Streams enables information sharing. Using Streams, each unit of shared information is called a message, and you can share these messages in a stream. The stream can propagate information within a database or from one database to another.

For example, Figure 2-1 shows a multimaster configuration where all sites are directly connected to all other sites participating in the replication environment. This enables data to be replicated between all locations in a near realtime manner.

Figure 2-1 Oracle Streams Multimaster Configuration

Description of Figure 2-1 follows
Description of "Figure 2-1 Oracle Streams Multimaster Configuration"

Another example is the Oracle Streams 1-N, or hub-and-spoke configuration in which changes made at the primary or hub location can be propagated to the remote or spoke locations in a near real-time manner.

Although it is possible to configure a hub-and-spoke configuration for bidirectional replication, you may prefer to restrict updates to a single location, the hub, as shown in Figure 2-2. In query intensive environments, you can still balance the load between multiple locations, with fast local access, while updates are restricted to the hub. By offloading reporting to the spoke locations, you improve performance at the hub, or primary OLTP location. This type of configuration is easier to implement than multimaster replication because it is not necessary to establish connectivity between all locations in the replication environment and it is not necessary to implement a conflict resolution strategy.

Figure 2-2 Information Dissemination with Oracle Streams (1-N configuration)

Description of Figure 2-2 follows
Description of "Figure 2-2 Information Dissemination with Oracle Streams (1-N configuration)"

The stream routes specific information to specific destinations. The result is a feature that provides greater functionality and flexibility than traditional solutions for capturing and managing messages, and sharing the messages with other databases and applications. Streams provides the capabilities needed to build and operate distributed enterprises and applications, data warehouses, and high availability solutions. You can use all of the capabilities of Oracle Streams at the same time. If your business requirements change, then you can implement a new capability of Streams without sacrificing existing capabilities.

As with any Oracle Streams configuration, there are three phases: capture, stage (propagate), and consume (apply). Using Oracle Streams, you control what information is put into a stream, how the stream flows or is routed from database to database, what happens to messages in the stream as they flow into each database, and how the stream terminates. By configuring specific capabilities of Streams, you can address specific requirements. Based on your specifications, Streams can capture, stage, and manage messages in the database automatically, including, but not limited to, data manipulation language (DML) changes and data definition language (DDL) changes. You can also put user-defined messages into a stream, and Streams can propagate the information to other databases or applications automatically. When messages reach a destination, Streams can consume them based on your specifications.

The following figure shows the Oracle Streams information flow.

Figure 2-3 Oracle Streams Information Flow

Description of Figure 2-3 follows
Description of "Figure 2-3 Oracle Streams Information Flow"

With Streams, you can create a local or remote copy of a production database. In the event of human error or a catastrophe, the copy can be used to resume processing. You can use Streams to configure flexible high availability environments.You can use the features of Oracle Streams to achieve little or no database down time during database upgrade and maintenance operations. Maintenance operations include migrating a database to a different platform, migrating a database to a different character set, modifying database schema objects to support upgrades to user-created applications, and applying an Oracle software patch.

Figure 2-4 shows an application that explicitly enqueues and dequeues messages through Oracle Streams Advanced Queuing as a method of sharing information with business partners or customers with different messaging systems. Once enqueued, messages can be transformed and propagated as desired, before being dequeued to a business partner's application that is a non-database oriented messaging systems.

Figure 2-4 Oracle Streams Message Queuing

Description of Figure 2-4 follows
Description of "Figure 2-4 Oracle Streams Message Queuing"

Benefits of Using Oracle Streams

Oracle Streams provides the following benefits:

  • Data protection by maintaining a full or partial remote copy of the database

  • Achieves little or no downtime during database upgrade or maintenance operations such as migrating a database to a different platform or character set, modifying database objects to support upgrades to applications, and applying an Oracle software patch

  • Data replication by capturing DML and DDL changes made to database objects and replicating these changes to one or more other databases; bidirectional replication environment, in which exactly two databases share the replicated database objects and data, is possible.

  • Event management and notification by enqueuing messages or capturing events, propagating the messages and events through queues, and dequeuing and applying or acting upon the message or event (as shown in Figure 2-4)

  • Supports heterogeneous platforms across databases within the configuration

  • Allows character sets to differ between replicas

  • Permits fine-grained control of data sharing

Note:

Although Oracle Streams requires some initial implementation investment, it is well worth the effort because of the high flexibility you get with Streams.

2.1.5 Transportable Technologies

Transportable technologies provides transportable database and transportable tablespace:

  • Transportable database is used to transport an entire database (user data and the Oracle dictionary) to a new platform with the same endian format. Transportable database permits a minimal downtime migration to a new platform by avoiding the time-consuming method of unloading all user data from the source database and loading it into the target database.

  • Transportable tablespaces is a feature designed to move a subset of one database into another, even among platforms that differ in endian format. The cross-platform capability of transportable tablespaces can be used to migrate all user data within a database to a new platform with a different endian format. Leveraging transportable tablespaces in this manner permits a minimal downtime migration to a new platform by avoiding the time-consuming method of unloading all user data from the source database and loading it into the target database.

See Also:

Oracle Database Administrator's Guide for details about how to move or copy tablespaces to another database, including details about transporting tablespaces across platforms

2.1.6 Oracle Flashback Technology

Flashback technology provides a set of features to switch between views of the data as it existed at different points in time. Using flashback features you can query past versions of schema objects and historical data. You can also perform change analysis and self-service repair to recover from logical corruption while the database is online.

Flashback technology provides a SQL interface to quickly analyze and repair human errors. Flashback provides fine-grained analysis and repair for localized damage such as deleting the wrong customer order. Flashback technology also enables correction of more widespread damage, yet does it quickly to avoid long downtime. Flashback technology is unique to Oracle Database and supports recovery at all levels including row, transaction, table, tablespace, and database.

Most of the flashback features use undo data while other features, such as Flashback Database and Block Media Recovery, use flashback logs:

  • Undo tablespace—A dedicated tablespace that stores only undo information when the database is run in automatic undo management mode.

  • Flashback data archive—An archive that is stored in a tablespace and contains transactional changes to every record in a table for the duration of the record's lifetime. The archived data can be retained for much longer duration than the retention period offered by an undo tablespace.

  • Flashback logs—Oracle-generated logs used to perform flashback operations. The database can only write flashback logs to the flash recovery area. Flashback logs are written sequentially and are not archived. They cannot be backed up to disk.

The following list describes the Flashback features:

  • Oracle Flashback Query

    Oracle Flashback Query provides the ability to view the data as it existed in the past by using the Automatic Undo Management system to obtain metadata and historical data for transactions. Undo data is persistent and survives a database malfunction or shutdown. The unique features of Flashback Query not only provide the ability to query previous versions of tables, they also provide a powerful mechanism to recover from erroneous operations.

    Uses of Flashback Query include:

    • Recovering lost data or undoing incorrect, committed changes. For example, rows that have been deleted or updated can be immediately repaired even after they have been committed.

    • Comparing current data with the corresponding data at some time in the past. For example, using a daily report that shows the changes in data from yesterday, it is possible to compare individual rows of table data, or find intersections or unions of sets of rows.

    • Checking the state of transactional data at a particular time, such as verifying the account balance on a certain day.

    • Simplifying application design by removing the need to store certain types of temporal data. Using a Flashback Query, it is possible to retrieve past data directly from the database.

    • Applying packaged applications, such as report generation tools, to past data.

    • Providing self-service error correction for an application, enabling users to undo and correct their errors.

  • Oracle Flashback Versions Query

    Oracle Flashback Versions Query is an extension to SQL that you can use to retrieve the versions of rows in a given table that existed in a specific time interval. Oracle Flashback Versions Query returns a row for each version of the row that existed in the specified time interval. For any given table, a new row version is created each time the COMMIT statement is executed.

    Flashback Versions Query is a powerful tool for the DBA to run analysis to determine the sources of problems. Additionally, application developers can use Flashback Versions Query to build customized applications for auditing purposes.

  • Oracle Flashback Transaction

    Oracle Flashback Transaction is a new feature in Oracle Database 11g Release 1 that can easily back out a transaction and its dependent transactions. The DBMS_FLASHBACK.TRANSACTION_BACKOUT() procedure rolls back a transaction and its dependent transactions while the database remains online. This recovery operation uses undo data to create and execute the compensating transactions that return the affected data to its original state. You can query the DBA_FLASHBACK_TRANSACTION_STATE view to see the current state of a transaction with respect to whether the transaction has been backed out using dependency rules or forced out by either:

    • Backing out nonconflicting rows

    • Applying undo SQL

    Oracle Flashback Transaction increases availability during logical recovery by easily and quickly backing out a specific transaction or set of transactions and their dependent transactions, with one command while the database remains online.

  • Oracle Flashback Transaction Query

    Oracle Flashback Transaction Query provides a mechanism to view all changes made to the database at the transaction level. When used in conjunction with Flashback Versions Query, it offers a fast and efficient means to recover from a human or application error. Flashback Transaction Query increases the ability to perform online diagnosis of problems in the database by returning the database user that changed the row, and performs analysis and audits on transactions.

  • Oracle Flashback Table

    Oracle Flashback Table enables you to recover a table to a previous point in time. It provides a fast, online solution for recovering a table or set of tables that has been modified by a human or application error. In most cases, Flashback Table alleviates the need for administrators to perform more complicated point-in-time recovery operations. Even after a flashback, the data in the original table is not lost; it can later be reverted back to the original state.

  • Oracle Flashback Drop

    Dropping objects by accident is a problem for database users and DBAs alike. While there is no easy way to recover dropped tables, indexes, constraints, or triggers, Oracle Flashback Drop provides a safety net when dropping objects. When you drop a table, Oracle automatically places it into the Recycle Bin. The Recycle Bin is a virtual container where all dropped objects reside. You can continue to query data in a dropped table.

  • Oracle Flashback Restore Points

    When an Oracle Flashback recovery operation is performed on the database, the DBA must determine the point in time—identified by the System Change Number (SCN) or timestamp—to which the data can later be flashed back. Oracle Flashback restore points are labels you can define that can be substituted for the SCN or transaction time used in Flashback Database, Flashback Table, and Recovery Manager (RMAN) operations. Furthermore, a database can be flashed back through a previous database recovery and open resetlogs by using guaranteed restore points. Guaranteed restore points allow major database changes—such as database batch jobs, upgrade, or patch—to be quickly undone by ensuring that the undo required to rewind the database is retained.

    Using the Oracle Flashback restore points feature provides the following benefits:

    • Provides the ability to quickly restore to a consistent state, to a point in time that was prior to a planned operation that has gone awry (for example, a failed batch job, an Oracle software upgrade, or an application upgrade.

    • Allows the snapshot standby to be resynchronized with the production database.

    • Allows for a quick mechanism to restore a test or cloned database back to its original state.

  • Oracle Flashback Database

    Oracle Flashback Database provides a more efficient alternative to database point-in-time recovery. With Oracle Flashback Database, current datafiles can be reverted to their contents at a past time. The result is much like restoring data from data file backups and executing point-in-time database recovery. However, Flashback Database skips the data file restoration and most of the application of redo data.

    Enabling Oracle Flashback Database provides the following benefits:

    • Eliminates the time to restore a backup when fixing human error that has a database-wide impact.

    • Because human errors can be quickly undone, it allows standby databases to use real-time apply to synchronize with the primary database.

    • Allows quick standby database reinstantiation after a database failover.

  • Block Recovery Using Flashback Logs

    Starting with Oracle Database release 11.1, block recovery can optionally retrieve a more recent copy of a data block from the flashback logs to reduce recovery time. Furthermore, a corrupted block encountered during instance recovery does not result in instance recovery failing. The block is automatically marked as corrupt and added to the RMAN corruption list in the V$DATABASE_BLOCK_CORRUPTION table. You can subsequently issue the RMAN RECOVER BLOCK command to fix the associated block.

  • Flashback Data Archive

    An archive that is stored in a tablespace and contains transactional changes to every record in a table for the duration of the record's lifetime. The archived data can be retained for much longer duration than the retention period offered by an undo tablespace.

2.1.7 Automatic Storage Management

Automatic Storage Management (ASM) provides a vertically integrated file system and volume manager directly in the Oracle kernel, resulting in:

  • Significantly less work to provision database storage

  • Higher level of availability

  • Elimination of the expense, installation, and maintenance of specialized storage products

  • Unique capabilities for database applications

For optimal performance, ASM spreads files across all available storage. To protect against data loss, ASM extends the concept of SAME (stripe and mirror everything) and adds more flexibility in that it can mirror at the database file level rather than the entire disk level.

More importantly, ASM simplifies the processes of setting up mirroring, adding disks, and removing disks. Instead of managing hundreds and possibly thousands of files (as in a large data warehouse), DBAs using ASM create and administer a larger-grained object called a disk group. The disk group identifies the set of disks that are managed as a logical unit. Automation of file naming and placement of the underlying database files save DBAs time and ensures adherence to standard best practices.

The ASM native mirroring mechanism (2-way or 3-way) is an option that protects against storage failures. With ASM mirroring, an additional level of data protection can be provided with the use of failure groups. A failure group is a set of disks sharing a common resource (disk controller or an entire disk array) whose failure can be tolerated. Once defined, an ASM failure group intelligently places redundant copies of the data in separate failure groups. This ensures that the data is available and transparently protected against the failure of any component in the storage subsystem.

ASM provides the following benefits:

  • Provides the ability to mirror and stripe across drives and storage arrays

  • Automatically re-mirrors from a failed drive to remaining drives

  • Automatically rebalances stored data when disks are added or removed while the database remains online

  • Allows for operational simplicity in managing database storage

  • Manages OCR and voting disks

  • Provides local read capability, which gives better performance in an extended cluster

  • Supports very large databases

  • Supports ASM rolling upgrades

  • Supports finer granularity in tuning and security

  • ASM Fast Mirror Resync, which provides fast repair after a temporary disk failure

2.1.8 Recovery Manager

Recovery Manager (RMAN) is an Oracle utility to manage the backup and, more importantly, the recovery of the database. It eliminates operational complexity while providing superior performance and availability of the database.

Recovery Manager determines the most efficient method of executing the requested backup, restoration, or recovery operation and then submits these operations to the Oracle Database server for processing. Recovery Manager and the server automatically identify modifications to the structure of the database and dynamically adjust the required operation to adapt to the changes.

RMAN provides the following benefits:

  • Automated channel failover on backup and restore operations

  • Automatic failover to a previous backup when the restore operation discovers a missing or corrupt backup

  • Automated creation of new database and temporary files during recovery

  • Automated recovery through a previous point-in-time recovery—recovery through resetlogs

  • Block media recovery enables the datafile to remain online while fixing the block corruption

  • Fast incremental backups using block change tracking

  • Fast backup and restore operations with intrafile and interfile parallelism

  • Enhanced security with Virtual Private Catalog

  • Lower space consumption when creating a database over the network by eliminating staging areas

  • Merge incremental backups into image copies in the background providing up-to-date recoverability

  • Optimized backup and restore of required files only

  • Retention policy ensures that relevant backups are retained

  • Ability to resume backup and restore of previously failed operations

  • Automatic backup of the control file and the server parameter file ensuring that backup metadata is available in times of database structural changes as well as media failure and disasters

  • Online backup does not require the database to be placed into hot backup mode

2.1.9 Oracle Secure Backup

Oracle Secure Backup (OSB) is a centralized tape backup management solution providing performant, heterogeneous data protection in distributed UNIX, Linux, Windows, and Network Attached Storage (NAS) environments. By protecting file system and Oracle database data, OSB provides a complete tape backup solution for your IT environment.

OSB is tightly integrated with Recovery Manager (RMAN) to provide the media management layer for RMAN, supporting releases since Oracle9i. With optimized integration points, OSB and RMAN provide the fastest most efficient tape backup for the Oracle database.

You can backup distributed servers to local and remote tape devices from a central OSB administrative server using backup policies, calendar based scheduling for lights out operations, or on-demand backup for immediate requirements. With its highly scalable client/server architecture, OSB provides local and remote data protection, leveraging SSL for secure intradomain communication and two-way server authentication.

The following list describes the key benefits of OSB:

  • Optimized tape backup for the Oracle database by backing up only the currently used blocks and increasing backup performance by 10% to 25%.

  • Policy based management allows backup administrators to exercise precise control over the backup domain.

  • Dynamic drive sharing for increased tape resource use.

  • Heterogeneous storage area network (SAN) support allowing NAS, UNIX, Windows, and Linux to share tape drives and media.

  • File system backup at the file, directory, file system or raw partition level with full, incremental and offsite backup scheduling.

  • Integrated with Oracle Enterprise Manager, providing an intuitive, familiar interface.

  • Backup encryption to tape.

  • Broad tape-device support for new and legacy tape devices in SAN and SCSI environments.

  • Network Data Management Protocol (NDMP) support for highly efficient backup of NAS filers.

  • Scalable, low-cost licensing model reduces IT costs and operational considerations.

2.1.10 Data Recovery Advisor

Data Recovery Advisor, a new feature in Oracle Database 11g release 1 (11.1), automatically diagnoses persistent (on disk) data failures, presents appropriate repair options, and runs repair operations at your request.

Note:

Note that the initial release of Data Recovery Advisor does not support Oracle RAC. In addition, while you can use Data Recovery Advisor when managing a primary database in a Data Guard configuration, you cannot use Data Recovery Advisor to troubleshoot a physical standby database. Data Recovery Advisor only takes the presence of a standby database into account when recommending repair strategies if you are using Enterprise Manager 11g Grid Control.

Data Recovery Advisor includes the following functionality:

  • Failure Diagnosis

    The first symptoms of database failure are usually error messages, alarms, trace files and dumps, and failed health checks. Assessing these symptoms can be complicated, error-prone and time-consuming. Data Recovery Advisor automatically diagnoses data failures and informs you about them.

  • Failure Impact Assessment

    After a failure is diagnosed, you must understand its extent and assess its impact on applications before devising a repair strategy. Data Recovery Advisor automatically assesses the impact of a failure and displays it in a easily understood format.

  • Repair Generation

    You typically have several repair options available, offering trade-offs in recovery time and potential data loss. If there are multiple failures present, you must also determine the best sequence of repair steps. In some situations it can be advantageous to consolidate repairs. Data Recovery Advisor does all this for you, automatically determining the best repair options.

  • Repair Feasibility Checks

    Before presenting repair options, Data Recovery Advisor validates them with respect to the specific environment and availability of media components required to complete the proposed repair. The feasibility check is fast and validates if the required backups are available. The actual contents of these backups will be validated during repair.

  • Repair Automation

    If you accept the suggested repair option, Data Recovery Advisor automatically performs the repairs, verifies that the repair was successful, and closes the appropriate failures.

  • Validation of Data Consistency and Database Recoverability

    Data Recovery Advisor can validate the consistency of your data, as well as backups and redo stream, whenever you choose.

  • Early Detection of Corruption

    Through Health Monitor, you can schedule periodic runs of Data Recovery Advisor diagnostic checks to detect, analyze, and repair data failures before a database process executing a transaction discovers the corruption and signals an error. Early warnings can limit the damage caused by corruption.

  • Integration of Data Validation and Repair

    Data Recovery Advisor is a single tool for data validation and repair.

See Also:

ÒDiagnosing and Repairing Failures with the Data Recovery AdvisorÓ in Oracle Database Backup and Recovery User's Guide

2.1.11 Flash Recovery Area

The flash recovery area is a unified storage location for all recovery-related files and activities in Oracle Database. After this feature is enabled, all RMAN backups, archive logs, control file autobackups, and datafile copies are automatically written to a specified file system or automatic storage management disk group, and the management of this disk space is handled by RMAN and the database server.

Making a backup to disk is faster because using the flash recovery area eliminates the bottleneck of writing to tape. More importantly, if database media recovery is required, then datafile backups are readily available. Restoration and recovery time is reduced because you do not need to find a tape and a free tape device to restore the needed datafiles and archive logs.

The flash recovery area provides:

  • Unified storage location of related recovery files

  • Management of the disk space allocated for recovery files, which simplifies database administration tasks

  • Fast, reliable disk-based backup and restoration

  • Ability to backup and restore the entire flash recovery area

  • Ability to tolerate failures to the flash recovery area

2.1.12 Oracle Security Features

The best protection against human errors is to prevent their occurrence. The best way to prevent human errors is to restrict user access to only those data and services truly needed to perform business functions. Oracle provides a wide range of security tools to control access to application data by authenticating database users and then enabling administrators to grant them only those privileges required to perform their duties.

In addition, the Oracle Database security model provides the ability to restrict data access at a row level using Virtual Private Database, thereby further isolating database users from data that they do not need to access.

Oracle security features include:

  • Authentication control to validate the identities of entities using networks, databases, and applications. Network sessions between databases, such as redo transport sessions, are also authenticated.

  • Authorization control to provide limits to access and actions linked by database user identities and roles.

  • Access control to objects, providing protection regardless of the entity seeking to access or alter them.

  • Auditing control to monitor and gather data about specific database activities, investigate suspicious activity, deter users (or others) from inappropriate activities, and detect problems with authorization or access control implementation.

  • Security policy management using profiles.

  • Encryption of data residing within the database and backups, or transferred to and from databases.

2.1.13 LogMiner

Oracle log files contain useful information about the activities and history of the Oracle database. Log files contain all data necessary to perform database recovery, and also record all changes made to the data and metadata within the database.

LogMiner is a fully relational tool that allows redo log files to be read, analyzed, and interpreted using SQL. Analysis of the log files with LogMiner can be used to:

  • Track or audit changes to data

  • Provide supplemental information for tuning and capacity planning

  • Retrieve critical information for debugging complex applications

  • Recover deleted data

  • Provide additional browser-based simplification to help troubleshoot and resolve logical failures

LogMiner features include:

  • Pinpoint when a logical corruption to the database—such as errors made at the application level—may have occurred

  • Determine the necessary actions to perform fine-grained recovery at the transaction level

  • Performance tuning and capacity planning through trend analysis

  • Perform post auditing

2.1.14 Hardware Assisted Resilient Data (HARD) Initiative

The Hardware Assisted Resilient Data (HARD) Initiative is a joint initiative between Oracle and hardware vendors to prevent data corruptions from being written out to disk. Data corruption is very rare, but when it happens, it can have a catastrophic effect on a database, and therefore a business.

Under the HARD Initiative, Oracle works with selected system and storage vendors to build operating system and storage components that can detect corruption early and prevent corrupted data from being written to disk. The key approach is block checking where the storage subsystem validates the Oracle block contents.

Any datafiles and log files located on HARD-compliant storage is protected. You must also enable the HARD validation feature on the storage, using the vendor-provided interface. When Oracle writes data to the storage, the storage system validates the data. If the data appears to be corrupted, then the write is either rejected with an error, or it is accepted with an error logged by the storage in the internal logs.

Figure 2-5 Oracle Data Validation

Description of Figure 2-5 follows
Description of "Figure 2-5 Oracle Data Validation"

Storage vendors may choose to implement some or all of the checks in their implementation. Also, each vendor's implementation is unique and their control interfaces may have different features.

See Also:

The HARD initiative page for the latest vendor and implementation information at http://www.oracle.com/technology/deploy/availability/htdocs/HARD.html

2.1.15 Hang Manager

The Hang Manager, a new feature in Oracle Database 11g Release 1, simplifies database management by enabling you to more quickly resolve database hangs. It automatically detects, analyzes, and dumps diagnostic information for hangs in Oracle Database environments, including Oracle RAC and ASM databases.

The Hang Manager is enabled by default in Oracle RAC databases and ASM instances. Thus, you can use Hang Manager to follow a hang from the database instance to the underlying ASM instance.

Active entities that attempt to obtain restrictive access to shared resources or request services from other Oracle Database processes, sessions, and transactions are in danger of hanging. A hang chain is a chain of processes with each one waiting on a resource held by the next, with a single process serving as the root of the hang.

Hangs in Oracle Database can cost a great deal in terms of system unavailability. Specifically, hangs lead to the following problems:

  • Extended system outages. These outages may occur frequently before a fix is found, which adds to the total downtime.

  • Difficult analysis: Analyzing the hang to determine where the problem lies can be lengthy, complex, and prone to error.

See Also:

Oracle Database Reference for more information about the views used to determine hangs

2.1.16 Data Block Corruption Prevention and Detection Parameters

Prior to Oracle Database 11g, RMAN-detected block corruptions were recorded in V$DATABASE_BLOCK_CORRUPTION. In Oracle Database 11g, several database components and utilities, including RMAN, can now detect a corrupt block and record it in that view. Oracle Database automatically updates this view when block corruptions are detected or repaired (for example, using block media recovery or data file recovery). The benefit is that the time it takes to discover block corruptions is shortened.

In addition, you can use the DB_ULTRA_SAFE initialization parameter to automatically configure the appropriate data protection block checking level in the database. The performance impact may vary depending on the application and available system resources, but the effect can vary from 1% to 10%.

The DB_ULTRA_SAFE initialization parameter:

  • Controls the setting of other related initialization parameters, including DB_BLOCK_CHECKING, DB_BLOCK_CHECKSUM, and DB_LOST_WRITE_PROTECT

  • Controls other data protection behavior in the Oracle Database, such as requiring ASM to perform sequential mirror writes

By making it possible to detect data corruptions in a timely manner, these features provide critical high availability benefits for the Oracle database.

See Also:

Oracle Database Reference for more information about these views and initialization parameters

2.1.17 Oracle HA Solutions and Recovery Times for Unplanned Downtime

Oracle provides high availability solutions to prevent, tolerate and reduce downtime for all types of unplanned failures.

Table 2-1 describes the various Oracle high availability solutions for unplanned downtime along with the recovery time that can be attained with each solution. The table shows how the features discussed in Section 2.1.1 through Section 2.1.16 can be used to address various causes of unplanned downtime. Also, see Table 4-4 for a summary of the attainable recovery times for all types of unplanned downtime for each Oracle high-availability architecture.

Table 2-1 Outage Types and Oracle High Availability Solutions for Unplanned Downtime

Outage Type Oracle Solution Benefits

Computer Failures

Oracle Real Application Clusters and Oracle Clusterware


  • Automatic recovery of failed nodes and instances.

  • Fast application notification with integrated Oracle client failover.

Computer Failures

Fast-Start Fault Recovery


  • Tunable and predictable cache recovery from computer failures.

Computer Failures

Oracle Data Guard


  • Fast-start failover and fast application notification with integrated Oracle clients.

Computer Failures

Oracle Streams


  • Online replica database resume processing.

Storage Failures

Automatic Storage Management


  • Mirroring and online automatic rebalance places redundant copies of the data in separate failure groups.

Storage Failures

Oracle Data Guard


  • Fast-start failover and fast application notification with integrated Oracle clients

Storage Failures

Recovery Manager with Flash Recovery Area

  • Fully managed database recovery and managed disk-based backups

Storage Failures

Oracle Streams


  • Online replica database resumes processing.

Human Errors

Oracle Security Features


  • Restrict access as prevention

Human Errors

Oracle Flashback Technology


  • Fine-grained and database-wide rewind capability

Human Errors

LogMiner


  • Redo log analysis

Data Corruption

Hardware Assisted Resilient Data (HARD) Initiative


  • Corruption prevention within a storage array.

Data Corruption

Data Block Corruption Prevention and Detection Parameters

Database initialization settings such as DB_ULTRA_SETTINGS, DB_BLOCK_CHECKING, DB_BLOCK_CHECKSUM

  • Different levels of block corruption prevention and detection at the database level.

Data Corruption

Data Recovery Advisor and Recovery Manager with Flash Recovery Area

  • Data Recovery Advisor automatically detects data corruptions and advises you about the best recovery plan.

  • RMAN online block-media recovery time in Oracle Database 11g is faster, because RMAN can now use flashback logs to restore a more current copy of the data block for recovery.

Data Corruption

Oracle Data Guard


  • Fast-start failover and fast application notification with integrated Oracle clients.

Data Corruption

Oracle Streams


  • Online replica database resumes processing.

Lost writes

Oracle Data Guard, Recovery Manager, and the DB_LOST_WRITE_PROTECT initialization parameter

Also, setting DB_ULTRA_SETTINGS to DATA_ONLY or DATA_AND_INDEX automatically enables DB_LOST_WRITE_PROTECT

  • DB_LOST_WRITE_PROTECT initialization parameter provides lost write detection.

  • If a lost write is detected on the primary database either by the physical standby database or during media recovery of the primary database, recovery is stopped to preserve the consistency of the database. However, data loss is expected.

  • If a lost write is detected on the standby database, you can restore the affected file and restart managed recovery if the lost write is isolated and the hardware problem is corrected.

    Note: Lost writes can corrupt the entire database. In many cases, you need to rebuild the affected database after resolving the hardware issue.

Lost writes

Hardware Assisted Resilient Data (HARD) Initiative


  • Detection and prevention of lost writes resulting from stray or misdirected writes. Some vendors in the HARD program have not implemented this additional protection. For the most comprehensive lost write protection, use Oracle Data Guard.

  • For the most comprehensive lost write protection, use Oracle Data Guard and set either the DB_ULTRA_SETTING parameter (to DATA_ONLY or DATA_AND_INDEX) or set the DB_LOST_WRITE_PROTECT parameter (to TYPICAL or FULL) on both the primary and standby databases.

Hangs or slow down

Hang Manager and Oracle Enterprise Manager

  • Hang Manager monitors for database hangs and attempts to resolve them automatically.

  • Oracle Enterprise Manager can be configured to detect application or response time slowdown and react to these SLA breaches. For example, you can configure the Enterprise Manager Beacon to monitor and detect application response times.

Site Failures

Oracle Data Guard


  • Fast-start failover and fast application notification with integrated Oracle clients

Site Failures

Oracle Streams


  • Online replica database resumes processing.

Site Failures

Recovery Manager



2.2 Oracle High Availability Features and Solutions for Planned Downtime

Planned downtime can be just as disruptive to operations as unplanned downtime. This holds especially true for global enterprises that need to support users in multiple time zones, or for those that need to provide Internet access to customers 24 hours a day, seven days a week.

In the past, planned downtime became necessary when performing periodic maintenance or when migrating to new deployments. Periodic maintenance—such as patching or reconfiguring the system—may be necessary to update the database, application, operating system, middleware, or network. New deployments include major upgrades or new rollouts of the hardware, database, application, operating system, middleware, or network.

Oracle provides the following high availability solutions to eliminate or reduce planned downtime for system and database changes, data changes, and application changes:

2.2.1 Dynamic Resource Provisioning

This section describes dynamic resource provisioning under the following topics:

2.2.1.1 Dynamic Reconfiguration of the Database

Oracle continues to broaden support for dynamic reconfiguration of the database, enabling it to adapt to changes in hardware demands without any service interruptions. Oracle Database dynamically accommodates various changes to hardware and database configurations:

  • Add and remove processors from an SMP server

  • Add and remove nodes and instances in an Oracle RAC environment

  • Dynamically grow and shrink its shared memory allocation and automatically tune memory online using Automatic Shared Memory Management

  • Add and remove database disks online without disturbing database activities using Automatic Storage Management (ASM)

  • Add and remove storage arrays online without disturbing database activities using ASM

  • Automatically rebalance the I/O load across the database storage using ASM

  • Move datafiles online when adding or dropping disks using ASM, which automatically rebalances database storage whenever the storage configuration is changed

  • Change almost all initialization parameters without shutting down the instance by using the SQL*Plus ALTER SESSION statement to change the value of a parameter during a session, or the ALTER SYSTEM statement to change the value of a parameter in all sessions of an instance for the duration of the instance

These capabilities provide no-cost system changes and capacity on-demand provisioning, both of which are fundamental requirements of enterprise Grid computing.

2.2.1.2 Autotuning Memory Management

Oracle Database 11g release 1 (11.1) introduces two new memory management initialization parameters, MEMORY_TARGET and MEMORY_MAX_TARGET that enable automatic management of the system global area (SGA), program global area (PGA), and other memory required to run Oracle Database.

Note:

MEMORY_MAX_TARGET is the value up to which MEMORY_TARGET can grow dynamically. If these initialization parameters are left at their default values (0), then Oracle Database does not autotune memory. If one parameter is set to a nonzero value and other is not set, then Oracle Database internally sets both parameters to the nonzero value.

Oracle Database uses a noncentralized policy to free and acquire memory in each subcomponent of the SGA and the PGA. Oracle Database autotunes memory by prompting the operating system to transfer granules of memory from less needy to more needy components. The granularity of the memory transfer is dependent on the current free memory and the amount of memory the operating system needs to maintain a basic level of service.

Note:

Automatic memory management with the MEMORY_TARGET and MEMORY_MAX_TARGET initialization parameters is supported on Linux, Windows, Solaris, HP-UX, and AIX. Check the Oracle Database Concepts and Oracle Database Administrator's Guide for more information about all supported platforms.

2.2.1.3 Automated Distribution of Datafiles, Control Files, and Log Files

ASM automates and simplifies the layout of datafiles, control files, and log files. Database files are automatically distributed across all available disks. Database storage is rebalanced whenever the storage configuration changes, including adding and removing disks or storage arrays. ASM provides redundancy through the mirroring of database files, and provides optimal performance by automatically striping database files across available disks.

See Also:

Oracle Database Concepts and Oracle Database Storage Administrator's Guide for more information about Automatic Storage Management

2.2.2 Oracle High Availability Solutions and Recovery Times for Planned Downtime

Oracle provides high availability solutions to prevent, tolerate and reduce downtime for all types of unplanned failures.

Table 2-2 describes the various Oracle high availability solutions for planned downtime along with the outage time that can be attained with each solution and their known considerations. In all cases, Oracle recommends you perform extensive testing prior to performing any rolling upgrade. Also, see Table 4-5 for a summary of the attainable recovery times for all types of planned downtime for each Oracle high-availability architecture.

Table 2-2 Oracle High Availability Solutions for Planned Downtime

Maintenance Type Oracle Recommended Solution Solution Description Outage Time

Operating system and hardware upgrades

Oracle Real Application Clusters and Oracle Clusterware


Section 2.2.2.1


No downtime

Oracle interim patches

Oracle Real Application Clusters (Oracle RAC)

Section 2.2.2.3


No downtimeFoot 1 

Online Patches

Online Patching

Section 2.2.2.4


No downtime

Oracle Clusterware upgrades and patches

Cluster Ready Services (CRS)

Section 2.2.2.5


No downtime

ASM upgrades

Automatic Storage Management


Section 2.2.2.6


No downtime

Storage migrationFoot 2 

Automatic Storage Management


Section 2.2.2.7


No downtime

Migrating to ASM or migrating a single-instance database to Oracle RAC

Oracle Data Guard


Section 2.2.2.2


Seconds to minutes

Patch set and database upgrades

Oracle Data Guard using SQL Apply and logical standby databases

Section 2.2.2.8


Seconds to minutes

Platform Migration Across Windows and Linux Platforms

Oracle Data Guard


Section 2.2.2.8


Seconds to minutes

Platform Migration across same endian format platforms

Transportable Database

Section 2.2.2.9


Minutes to hours

Platform migration across different endian format platforms

Transportable Tablespace

Section 2.2.2.10


Minutes to hours

Application upgrades

Online Application Maintenance and Upgrades


Section 2.2.4




Footnote 1 Patches that cannot be applied by performing a rolling upgrade can be applied with the MINIMIZE_DOWNTIME option of the OPatch utility to reduce the availability impact of the patch application.

Footnote 2 An example is migration from traditional storage to low-cost storage

See Also:

2.2.2.1 Avoiding Downtime During Operating System and Hardware Upgrades

Using Oracle RAC is the recommended solution for avoiding downtime during system and hardware upgrades.

If you cannot perform the upgrade using Oracle RAC, then Oracle Data Guard and physical standby databases are the recommended solution. See Section 2.2.2.2 for more information.

Oracle RAC Solution Description

Perform the following steps:

  1. Stop the application service.

    This implicitly redirects connections off of the target instance when using FAN.

  2. Shut down target instance or instances with the IMMEDIATE option.

  3. Shut down and disable Oracle Clusterware.

    Disable is used to prevent startup at boot time.

  4. Perform maintenance.

  5. Enable and start Oracle Clusterware.

    This step implicitly starts the database instances.

  6. Start the application service.

    This step implicitly redirects connections back on to the target instance when using FAN.

  7. Repeat all steps on the next node.

Additional Considerations

In addition, verify the following:

  • Ensure the planned maintenance can be done in a rolling fashion from an operating system perspective.

  • Ensure the database and clusterware versions are certified with the new system and hardware changes.

    See Also:

    Your operating system-specific Oracle Real Application Clusters Installation Guide

2.2.2.2 Using Oracle Data Guard for System and Cluster Upgrades and Migrations

Oracle Data Guard and physical standby databases are the recommended solution for performing system and cluster upgrades that are not upgradeable using Oracle RAC rolling upgrades. Oracle Data Guard is also recommended for migrations to ASM, Oracle RAC, 64-bit systems, Windows to Linux or Linux to Windows, or same processor architecture platforms.

For example, Oracle Data Guard can be used for system upgrades that cannot be upgraded using Oracle RAC rolling upgrades due to system restrictions or if migrating to ASM, from a noncluster environment to Oracle RAC, to a different platform with the same endian format or to a different platform with the same processor architecture.

In general, you first upgrade the physical standby database and then perform a Data Guard switchover to the physical standby database, as follows:

  1. Upgrade the system or change the physical standby database system to your target environment.

    For example, you can convert the standby database from a single-instance database to an Oracle RAC database by using ASM, without any impact on the primary database. Then, restart the standby database, ensure that it matches your target environment, and wait for Redo Apply to finish applying all redo data to the standby database.

  2. Perform a Data Guard switchover—optimally the switchover should take only seconds to minutes.

  3. Shut down the original primary database (now the standby database).

  4. Upgrade or make system changes to the original primary database.

  5. Restart it as a standby database and allow recovery to synchronize the databases.

  6. Optionally, perform a Data Guard switchover to return the standby database to the primary database role.

Additional Considerations

  • For fastest switchover, the standby database should be using real-time apply and synchronized prior to the switchover operation.

  • This is the best approach if Oracle RAC rolling upgrade or online patching is not possible.

  • The conversion from 32 to 64 bit is automatic if you are applying an Oracle Database patch set or doing an Oracle Database upgrade at the same time. If you are upgrading only the operating system, you may need to perform additional post-upgrade steps that are described in Metalink note 414043.1. Also, see the Oracle Database Upgrade Guide for more information about upgrades.

2.2.2.3 Oracle Interim Database Patches

Using Oracle RAC is the recommended solution for avoiding downtime when applying Oracle interim database patches. If you cannot apply patches using Oracle RAC, then Oracle Data Guard and physical standby databases are the recommended solution. See Section 2.2.2.2 for more information.

Solution Description

Oracle interim (one-off) patches to database software are usually applied to implement known fixes for software problems, or to apply diagnostic patches to gather information about a problem. Applying patches is often performed during a schedule maintenance outage.

Oracle provides the capability to do rolling patch upgrades with Oracle RAC with little or no database downtime using the opatch command-line utility.

An Oracle RAC rolling upgrade enables all but one of the instances of the Oracle RAC installation to be available during the scheduled outage. This means that the impact on the application downtime required for scheduled outages is further reduced. The Oracle opatch utility enables you to apply the patch successively to the different instances in an Oracle RAC installation.

Additional Considerations

Performing a rolling upgrade is possible only for patches that are certified for rolling upgrades. Typically, patches that can be installed in a rolling upgrade include:

  • Patches that do not affect the contents of the database, such as the data dictionary

  • Patches not related to Oracle RAC internode communication

  • Patches related to client-side tools such as SQL*Plus, Oracle utilities, development libraries, and Oracle Net

  • Patches that do not change shared database resources, such as datafile headers, control files, and common header definitions of kernel modules

Do not use Oracle RAC to perform rolling upgrades of patch sets.

See Also:

Your operating system-specific Oracle Real Application Clusters Installation Guide

2.2.2.4 Online Patching

Using Online Patching is the recommended solution for avoiding downtime when an online patch is available.

Solution Description

Online patches are a special type of interim patch that can be applied while the instance remains online.

Oracle provides the capability to do online patching with any Oracle database using the opatch command-line utility.

Additional Considerations

  • Oracle provides online patches when the changed code is small in scope and complexity, such as with diagnostic patches or small bug fixes.

  • Oracle provides online patches when the patch does not change shared memory structures in the System Global Area (SGA), or other critical internal code structures.

  • Applying an online patch increases memory consumption on the system because each Oracle process uses more memory from the Program Global Area (PGA) during the patch application. You need to take your memory requirements into consideration before you begin applying an online patch. Each online patch is unique and the memory requirements are patch specific. As is always the case, the best practice is to apply the patch on your test system first. Doing so also enables you to assess the effect of the online patch on your production system and estimate any additional memory usage.

See Also:

Oracle Universal Installer and OPatch User's Guide for information about online patching and OPatch, and see Oracle Database Upgrade Guide for an overview of rolling upgrades and rolling patches

2.2.2.5 Upgrading Oracle Clusterware

Performing rolling upgrades of the Oracle Clusterware using Cluster Ready Services (CRS) software is the recommended solution for avoiding downtime when upgrading Oracle Clusterware.

Solution Description

All upgrades to Oracle Clusterware can be performed in a rolling fashion.

See Also:

Your operating system-specific Oracle Clusterware installation guide

2.2.2.6 Upgrading Automatic Storage Management (ASM)

Upgrades ASM in a rolling fashion is the recommended solution for upgrading ASM.

Solution Description

All upgrades starting with Oracle Database 11g (and later releases) can be performed in a rolling fashion.

2.2.2.7 Storage Migration

Using ASM is the recommended solution for performing storage migrations.

Solution Description

ASM enables you to add all disks in one storage array and subsequently drop all disks from another array. ASM automatically rebalances and migrates data to the new storage while the database remains operational.

Additional Considerations

Before removing the source storage array, ensure that the rebalancing is complete.

See Also:

The chapter about performing ASM Data Migration in the Oracle Database Backup and Recovery User's Guide

2.2.2.8 Patch Set and Database Upgrades

Oracle Data Guard using SQL Apply is the recommended solution for performing patch set and database upgrades. Section 2.2.2.8.1 describes this solution.

Oracle Streams is the recommended solution for performing database upgrades when there are data type restrictions that prevent you from performing a rolling upgrade with SQL Apply. Section 2.2.2.8.2 describes this solution.

2.2.2.8.1 Solution Description for Database Upgrades Using Data Guard and SQL Apply

Follow these steps to leverage Data Guard using SQL Apply to upgrade an Oracle database:

  1. Upgrade logical standby database to the new release and evaluate the change.

  2. Ensure that SQL Apply has applied all redo data to the logical standby database

  3. Disconnect applications.

  4. Perform Data Guard switchover.

  5. Reconnect applications to the new primary database.

  6. Shut down the original primary database (now the logical standby database).

  7. Execute database software upgrade steps on the new standby database.

  8. Restart the standby database and allow recovery to synchronize.

  9. Optionally perform a Data Guard Switchover to return to the original database.

Additional Considerations


SQL Apply rolling upgrades are only supported for Oracle Database versions 10.1.0.3 and higher. For complete information, see the chapter about using SQL Apply to upgrade the Oracle Database in Oracle Data Guard Concepts and Administration.

SQL Apply has some data type restrictions. For more information, see the appendix about data type and DDL support on a logical standby database in Oracle Data Guard Concepts and Administration. If there are data type restrictions, consider using Oracle Streams. See Section 2.2.2.8.2, "Solution Description for Database Upgrades That Cannot Use SQL Apply" for more information.

Oracle Data Guard is the best approach if performing a Oracle RAC rolling upgrade is not possible and there are no data type restrictions.

See Also:

The ÒRolling Database Upgrades Using Data Guard SQL ApplyÓ white paper available at http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm
2.2.2.8.2 Solution Description for Database Upgrades That Cannot Use SQL Apply

Streams is similar in function to Data Guard SQL Apply but provides added flexibility if your database includes data types that SQL Apply does not support. Streams enables a method that can be used to work around the existence of unsupported data types and still effect a rolling database upgrade with minimal downtime.

The following high-level steps describe how to perform a database upgrade:

  1. Before you begin the upgrade process, see Oracle Streams Concepts and Administration for information about how to perform a database upgrade on a database that has user-defined types.

  2. Create a duplicate database. (Ideally the replica will start out as a physical standby database that is up to date.)

  3. Activate and upgrade the database to the later version.

  4. Enable Oracle Streams replication.

  5. During the upgrade of the replica, the source database continues ahead. Once the replica is caught up, perform a switchover.

See Also:

Oracle Streams Concepts and Administration for complete information about online database upgrade with Oracle Streams

2.2.2.9 Platform Migration Across Same Endian Format Platforms

Consider that following approaches when performing platform migrations across same endian format platforms

2.2.2.9.1 Solution Description for Platform Migration Using Transportable Database

Transportable database should be used for platform migration only when cross-platform physical standby database or logical standby database is not supported for the platform combination in questionFoot 1 .

For example, if you want to move from Windows x86-64 to Linux x86-64, it is best to use cross-platform standby database instead of transportable database. There is less downtime (simply the time it takes to switchover) and it is possible to run the standby database on the new platform for a period of time to ensure that everything is working as planned.

The high-level steps (with target system conversion) are as follows:

  1. Place the source database in read/only mode

  2. Run RMAN CONVERT DATABASE command

  3. Move files to the target system

  4. Run RMAN generated script to convert datafiles to target platform format

  5. Run RMAN generated script to complete the migration

When using transportable database, the downtime required for a platform migration is determined by the time needed to:

  • Place the source database in read-only mode

  • Convert all data files to the new platform format

  • Transfer all data files from the source system to the target system

    You can significantly minimize this time by using a storage infrastructure that can make the data files available to the target system without the need to physically move the files.

  • Invalidate and recompile all PL/SQL using SQL scripts utlirp.sql and utlrp.sql.

See Also:

The ÒPlatform Migration using Transportable DatabaseÓ white paper available at http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm
2.2.2.9.2 Solution Description for Platform Migration Using Oracle Streams

Oracle Streams enables updates on the multiple masters and provides support for heterogeneous platforms with different database releases. Therefore, Oracle Streams may provide the fastest approach for database upgrades and platform migration.

Oracle Streams has data type limitations and restrictions, such as for advanced queue and object types. But in some cases you can work around them by creating shadow tables on the source database. You can create a trigger on tables with unsupported data types to capture and propagate changes to tables with supported data types. Those changes are replicated by way of Streams to the target database. You can customize apply to apply the changes to the original tables in the target database.

Oracle Streams implementations require additional administrative effort for testing, setup, and configuration because Streams is designed to be a more flexible architecture.

The following high-level steps describe how to perform a platform migration with Oracle Streams:

  1. Set up the Streams environment on the source database.

  2. Instantiate the replica database (aka target database) using the new target version or on the target platform.

  3. Setup the Streams environment on the target database.

  4. Enable Streams to propagate all changes made on the source database to the target database to completely synchronize the target database with the source.

  5. Connect users to target database and shutdown source database.

  6. Remove Streams configuration.

2.2.2.10 Platform Migration Across Different Endian Format Platforms

Consider that following approaches when performing platform migrations on different endian format platforms:

Solution Description for Transportable Tablespace

Migrating a database to a new platform using a different endian format with transportable tablespaces requires the following high level steps:

  1. Create a new, empty database on the target platform.

  2. Import objects required for transport operations from the source database into the target database.

  3. Export transportable metadata for all user tablespaces from the source database.

  4. Transfer datafiles for user tablespaces to the target system.

  5. Use RMAN to convert the datafiles to the target system's endian format.

  6. Import transportable metadata for all user tablespaces into the target database.

  7. Import the remaining database objects and metadata (that were not moved by the transport operation) from the source database into the target database.

If the target database is being moved to a new location (for example, to a new data center) during the migration, then create a physical standby database from the original primary database co-located with the target database. After a Data Guard switchover, transport the tablespaces from the source to the target without incurring the file transfer time as part of the downtime.

Additional Considerations

Transportable tablespace has limitations and restrictions in regard to character sets, opaque types, and system tablespace objects. Unlike previous solutions, the steps are not automated.

Perform a platform migration using transportable tablespaces if all of the following are true:

  • The source and target platforms have different endian formats.

  • The time required to perform a full Data Pump Export and Import does not fit in the maintenance window.

See Also:

The best practices white paper titled: ÒOracle Database 10g Release 2 Best Practices: Platform Migration using Transportable TablespacesÓ available at http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm

2.2.3 Online Reorganization and Redefinition

One way to enhance availability and manageability is to allow customary user access to the database during a data reorganization operation. The Online Reorganization and Redefinition feature in Oracle Database offers administrators significant flexibility to modify the physical attributes of a table and transform both data and table structure while allowing customary user access to the database. This capability improves data availability, query performance, response time, and disk space utilization. All of these are important in a mission-critical environment and make the application upgrade process easier, safer, and faster.

This online architecture provides the following benefits:

  • Tables can be reorganized and redefined online

    • Any physical attribute of the table can be changed online. The table can be moved to a new location, partitioned, and converted from one organization (such as heap-organized) to another (such as index-organized).

    • Many logical attributes can also be changed. Column names, types, and sizes can be changed. Columns can be added, deleted, or merged. One restriction is that the primary key of the table cannot be modified.

  • All index operation can be performed online

    • Indexes can be created online and analyzed simultaneously. Online repair of the physical guess component of logical rowids (used in secondary indexes and in the mapping table for index-organized tables) can also be used.

    • An index-organized table and secondary indexes can be reorganized online to eliminate the reorganization maintenance window. Secondary indexes support efficient use of block hints (physical guesses). Invalid physical guesses of logical rowids stored in secondary indexes on index-organized table can also be repaired online.

    • An index-organized table or table partition can be reorganized without rebuilding its secondary indexes, resulting in a short reorganization maintenance window.

  • Online move of a partitioned table

  • Online reorganization support for advanced queues, clustered tables, materialized views, and abstract data types (objects)

  • Fast ADD COLUMN with default value (does not need to update all rows to default value)

  • Invisible Indexes speed application migration and testing:

    • Speeds up migration with explicit hints, then drops when finished

    • Prevents premature use of newly created indexes

    • Tests effects of DROP INDEX, making the index visible if needed, thus there is no need for an index rebuild

  • Online index build with no pause to DML (no exclusive DML locks are required)

  • No recompilation of dependent objects when online redefinition does not logically affect objects (for example, when columns are added to tables, or when procedures are added to packages)

  • Easier to execute table DDL operations online (there is an option to wait for active DML operations instead of aborting)

  • Support for redefinition of tables that have materialized views or materialized view logs

Depending on the type of online reorganization that is required, you can perform the following types of data reorganization using the DBMS_REDEFINITION package or the SQL CREATE/ALTER TABLE and INDEX commands:

  • Modify table storage parameters

  • Move table to a different tablespace

  • Add support for parallel queries

  • Add or drop partitioning support

  • Re-create table to avoid fragmentation

  • Change from table to IOT or from IOT to table

  • Add or drop a column

  • Transform a column using a function

  • Create indexes online

  • Rebuild indexes online

  • Coalesce indexes online

  • Move index-organized tables online

  • Copy dependent objects (such as triggers, constraints, and indexes)

  • Convert LONG and LONG RAW columns to a LOB

  • Use a unique key as an alternative to a primary key or rowid

  • Specify columns to order data by

  • Change a table without recompiling stored procedures

  • Online segment shrink

  • Reorganize a single partition

  • Reorganize advanced queue and clustered tables

  • Reorganize a table containing an ADT

  • Retain and clone statistics

  • Copies check and not null constraints

  • Copies dependent objects for nested tables

2.2.4 Online Application Maintenance and Upgrades

The following sections describe features that can significantly reduce (or eliminate) the application downtime required to make changes to an application's database objects.

2.2.4.1 Oracle Streams for Rolling Upgrades

Consider using Oracle Streams for fast rolling upgrades. However, note that while Oracle Streams upgrades can achieve little or no database down time, your ability to configure this solution will require some operational investment. See Section 2.1.4, "Oracle Streams" and Oracle Streams Concepts and Administration for more information.

2.2.4.2 DDL With the WAIT Option

Data definition language (DDL) commands require exclusive locks on internal structures. If DDL commands are issued, these locks may not be available causing the statement to immediately fail even though the DDL could have possibly succeeded sub-seconds later. DDL specified with the WAIT option resolves this issue.

DDL with the WAIT option is the new DEFAULT. The wait time is specified instance-wide (in the initialization parameter file) and can be modified on a session level.

DDL commands specified with the WAIT option gives you more flexibility to define grace periods for such commands to succeed instead of raising an error right away, thus requiring additional application logic to handle such errors.

2.2.4.3 ENABLE, DISABLE and FOLLOWS Clauses for CREATE TRIGGER

New states (ENABLE and DISABLE) and ordering (FOLLOWS) are introduced for triggers to control the firing of triggers. With this feature, the CREATE TRIGGER statement can be created in a disabled state to validate successful compilation before enabling. In addition, the trigger order can be controlled with the FOLLOWS clause.

These additional states allow greater administrative control for triggers.

2.2.4.4 Enhanced ADD COLUMN Functionality

Default values of columns are maintained in the data dictionary for columns specified as NOT NULL.

Adding new columns with DEFAULT values and NOT NULL constraint no longer requires the default value to be stored in all existing records. This not only enables a schema modification in sub-seconds and independent of the existing data volume, it also consumes no space.

2.2.4.5 Finer Grained Dependencies

In releases prior to Oracle Database 11g, metadata records mutual dependencies between objects with the granularity of the whole object. For example, PL/SQL unit P depends on PL/SQL unit Q or that view V depends on table T. This means that dependent objects were sometimes invalidated when there was no logical requirement to do so. For example, if view V depends only on columns C1, C2, and C3 in table T and a new column, C99, is added, the validity of view V is not logically affected. Nevertheless, in earlier releases, V was invalidated by the addition of column C99.

Oracle Database 11g records dependency metadata at a finer level of granularity so that the addition of C99 does not invalidate view V. Similarly, if procedure P depends only on elements E1 and E2 in package PKG, then if element E99 is added to PKG, procedure P is not invalidated. (In Oracle Database 10g, this change to PKG would invalidate procedure P.)

By reducing the consequential invalidation of dependent objects in response to changes in the objects they depend upon, application availability is increased. The benefit is felt both in the development environment and when a live application is parsed or upgraded. The benefit occurs when an Oracle Database patch set is applied because changes to schema objects are required to be compatible and, therefore, not cause consequential invalidations.

2.2.4.6 Invisible Indexes

An invisible index is an alternative to making an index unusable or even to drop it. An invisible index is maintained for any DML operation but is not used by the optimizer unless you explicitly specify the index with a hint.

Applications often have to be modified without being able to bring the complete application offline. Invisible indexes enable you to leverage temporary index structures for certain operations or modules of an application without affecting the overall application. Furthermore, invisible indexes can be used to test the removal of an index without dropping it right away, thus enabling a grace period for testing in production environments.

2.2.4.7 Materialized View Logging Control

Oracle Database 11g Release 1 has added session-level control for materialized view logs. The capture of changes for materialized views (materialized view logs) can be disabled for an individual session while logging continues for changes made by other sessions. This feature reduces Application patching downtime.

2.2.4.8 Dependent PL/SQL Recompilation After Online Table Redefinition

This feature minimizes the need to recompile dependent PL/SQL packages after an online table redefinition. If the redefinition does not logically affect the PL/SQL packages, recompilation is not needed. This optimization is turned on by default.

This feature reduces the time and effort to manually recompile dependent PL/SQL after an online table redefinition. This also includes views, synonyms, and other table dependent objects (with the exception of triggers) that are not logically affected by the redefinition.

2.3 Optimizing Grid Computing and Disaster Recovery Solutions

With Grid Computing and new standby database capabilities, you can leverage and scale your existing system infrastructure. For the primary database, this implies that all hardware resources are leveraged for performance and scalability. For secondary or disaster recovery systems, system and database resources can be used for real-time query or reporting serving a production purpose while in standby database role. With Oracle Database 11g Release 1 (11.1), Oracle Data Guard can be integral part of your IT operations and application business.

This section covers the following topics:

2.3.1 Grid Computing

Grid computing is a computing architecture that effectively pools large numbers of servers and storage into a flexible, on-demand computing resource for all enterprise computing needs.

Oracle Database captures the cost advantages of Grid enterprise computing without sacrificing performance, scalability, security, manageability, functionality, or system availability. A Database Server Grid is a collection of commodity servers connected together to run on one or more databases. A Database Storage Grid is a collection of low-cost modular storage arrays combined together and accessed by the computers in the Database Server Grid.

With the Database Server and Storage Grid, you can build standby and testing Hubs to leverage a pool of system resources. The system resources can be dynamically allocated and deallocated depending on various priorities. For example, if the production database fails over to one of the standby databases in the standby hub, it will acquire more system and storage resources while the testing resources may be temporarily starved. With Grid technologies, you can enable high level of utilization and low TCO without sacrificing business requirements.

Figure 2-6 illustrates the Database Server Grid and Database Storage Grid in a Grid enterprise computing environment.

Figure 2-6 Grid Computing Environment

Description of Figure 2-6 follows
Description of "Figure 2-6 Grid Computing Environment"

2.3.2 Database Server Grid

The availability of low-cost and reliable blade servers, small multiprocessor servers, and inexpensive open-source operating systems such as Linux, has made it possible to build a Database Server Grid that is highly available, scalable, flexible, and manageable.

Oracle RAC is the technology that enables a Database Server Grid by providing a single database that spans multiple low-cost servers yet appears to the application as a single, unified database system. Oracle RAC provides flexibility to dynamically provision resources and services within the Grid as computing needs change, and to add systems to the Grid as capacity demands increase. In addition, Oracle RAC provides protection from system failures by automatically recovering the processing of a failed node by any of the surviving systems running the database, and facilitating the reconnection of clients and redistribution of load affected by the failed system.

2.3.3 Database Storage Grid

The availability of low-cost ATA disk-based storage arrays and low-cost storage networks has made it possible to use a Database Storage Grid with Oracle Database at very low cost. A DBA can use the Automatic Storage Management (ASM) interface to specify the disks within the Database Storage Grid that ASM should manage across all server and storage platforms. ASM partitions the disk space and evenly distributes the data storage throughout the entire storage array. Additionally, ASM automatically redistributes the data storage as storage arrays are added or removed from the Database Storage Grid.

2.3.4 Disaster Recovery Solutions with Better Standby Database Usage

Beginning in Oracle Database 11g, standby databases can be used for dynamic IT and application requirements in addition to providing disaster recovery. The real-time query feature in Oracle Data Guard enables you to use physical standby databases for other useful work during normal operations, in addition to providing a disaster-recovery solution.

The following sections describe the Oracle Data Guard features that help you to leverage physical standby databases for additional business purposes:

2.3.4.1 Real-Time Query Physical Standby Databases

Data Guard Redo Apply (physical standby database) has proven to be a popular solution for disaster recovery due to its relative simplicity, high performance, and superior level of data protection. Beginning with Oracle Database 11g, a physical standby database can be open read-only while redo apply is active. This means that you can run queries and reports against an up-to-date physical standby database without compromising data protection or extending recovery time in the event a failover is required. This makes every physical standby database able to support productive uses even while in standby role.

To enable real-time query, open the database in read-only mode and then issue the ALTER DATABASE RECOVER MANAGED STANDBY statement. Note that the COMPATIBLE parameter must be set to 11.0.0 on both the primary and physical standby databases. Using this feature is totally transparent to applications.

Real-time query provides an ultimate high availability solution because it:

  • Supports Oracle RAC on the primary and standby databases

    Real-time query works on both single-instance and Oracle RAC physical standby databases. Although Redo Apply can be running on only one Oracle RAC instance, you can have all of the instances running in read-only mode while Redo Apply is running on one instance.

  • Queries return transactionally consistent results that are very close to being up-to-date wish the primary database

    Depending on any delay settings or apply rates, the standby database can be lagging seconds behind the primary database. The queries will always be transactionally consistent and will represent a consistent view of the last committed transaction at that time.

  • Allows fast switchovers or failovers because the redo generated by the primary database while the standby database was open read-only has already been applied to the standby database, making it immediately available to assume the primary database role

  • Enables you to use fast-start failover to allow for automatic fast failover in the case the primary database fails

Note:

Transactions that attempt to modify a physical standby database running with real-time query enabled will fail with an error.

See Also:

Oracle Data Guard Concepts and Administration for complete information about using real-time query

2.3.4.2 Web Scale Using Standby Reader Farms

Beginning with 11gR1, you can use both physical standby databases (using real-time query) and logical standby databases to deploy a reader farm. An example of such a configuration is provided Figure 2-7, complete with the use of Data Guard fast-start failover to automatically fail over should the primary database fail. Note that all standby databases in the reader farm automatically recognize the new primary database after such a fast-start failover occurs.

Because a Data Guard configuration can support multiple standby databases, customers have used this capability to boost read performance of the most demanding web applications beyond what the underlying system and storage architecture can support. This provides a relatively low-cost method of scaling out using a Grid architecture where I/O is the driving factor.

The concept is straightforward—a single primary database that supports read/write transactions, and multiple standby databases that provide read-only access to web users. Such an approach scales read performance linearly as additional standby databases are added. It is also an effective way to isolate faults, because problems that impact one standby database are isolated from the other standby databases in the configuration.

The benefits of creating a reader farm of physical standby databases include the following:

  • Fault isolation

  • High performance with physical standby databases and Redo Apply

  • Seamless support for all DDL and data types using Redo Apply

  • All reader databases are kept up-to-date with changes made to the primary database

  • Automatic, zero or minimal data loss failover capability

  • Management as a unified configuration through Grid Control

  • Scale-out using single writer database and n reader databases.

Figure 2-7 shows a good example of how you can leverage Oracle Data Guard, physical standby databases and real-time query to provide the flexibility you need to grow your business quickly, while still providing disaster recovery. In the configuration, the primary database transmits redo data to multiple standby database, one of which is also enabled for fast-start failover for automatic, zero or minimal data loss failover.

Figure 2-7 Standby Database Reader Farms

Description of Figure 2-7 follows
Description of "Figure 2-7 Standby Database Reader Farms"

If a fast-start failover is triggered in the Data Guard configuration in Figure 2-7:

  • Automatic failover occurs to the designated standby database

  • All standby databases accept data from new primary database

  • You can perform a switchover at a convenient time in the future to return all databases to their original roles

2.4 Optimizing Manageability

Complex environments demand coordinated configuration changes, system upgrades and new application roll-outs. Manageability in Oracle Database 11g has improved dramatically to automate and simplify operations in high availability architectures, and represents a major milestone in the drive toward self-managing Oracle databases.

This section contains these topics:

2.4.1 Intelligent Infrastructure

Oracle Database has a sophisticated self-management infrastructure that allows the database to learn about itself and use this information to adapt to workload variations or to automatically remedy any potential problem. The self-management infrastructure includes the following:

  • Automatic Workload Repository

    The Automatic Workload Repository (AWR) is a built-in repository that contains performance statistics used by Oracle Database for problem detection and self-tuning purposes. At regular intervals, Oracle Database makes a snapshot of vital statistics and workload information and stores them in the AWR. The data contained in the snapshots is then analyzed by the Automatic Database Diagnostic Monitor (ADDM). See the Oracle Database Performance Tuning Guide for information about the AWR.

  • Automatic Maintenance Tasks

    By analyzing the information stored in the AWR, the database can identify the need to perform routine maintenance tasks. The automated maintenance tasks infrastructure (known as ÒAutoTaskÓ) enables Oracle Database to automatically schedule such operations. AutoTask schedules automatic maintenance tasks to run in a set of Oracle Scheduler windows known as maintenance windows. Maintenance windows are those windows that are members of the Oracle Scheduler window group MAINTENANCE_WINDOW_GROUP. See the Oracle Database Administrator's Guide and the Oracle Database 2 Day DBA for more information.

  • Fault diagnosability infrastructure

    Oracle Database includes an advanced fault diagnosability infrastructure for preventing, detecting, diagnosing, and resolving problems. The problems that are targeted are critical errors such as those caused by database code bugs, metadata corruption, and customer data corruption. This includes:

    • The automatic diagnostic repository (ADR), which is a file-based repository for database diagnostic data such as traces, the alert log, health monitor reports, and more. It has a unified directory structure across multiple instances and multiple products.

    • The incident packaging services that a DBA can use to automatically and easily gather all diagnostic data (traces, health check reports, SQL test cases, and more) pertaining to a critical error and package the data into a zip file suitable for transmission to Oracle Support.

    See the Oracle Database Administrator's Guide for more information about these components.

  • Server generated alerts

    For problems that cannot be resolved automatically and require administrators to be notified (such as running out of space) the Oracle Database provides server-generated alerts. Oracle Database can monitor itself and send out alerts to notify you of any problem and provide recommendations on how the reported problem can be resolved. This ensures quick problem resolution and helps prevent potential failures.

  • Advisor framework

    Oracle Database includes a number of advisors for different subsystems in the database to automatically determine how the operation of the corresponding subcomponents could be further optimized. The SQL Tuning Advisor and the SQL Access Advisor, for example, provide recommendations for running SQL statements faster. Memory advisors help size the various memory components without resorting to trial-and-error techniques. The Segment Advisor handles space-related issues, such as recommending wasted-space reclamation and analyzing growth trends, while the Undo Advisor guides you in sizing the undo tablespace correctly. See the Oracle Database 2 Day DBA for more information about using advisors.

  • Hang Manager

    The Hang Manager is an Oracle Database infrastructure that can detect hangs, analyze them, and then obtain the required diagnostic data from Oracle. The Hang Manager is enabled by default in single-instance databases, Oracle RAC databases, and Automatic Storage Management (ASM) instances. Thus, you can use Hang Manager to follow a hang from the database instance to the underlying ASM instance.

2.4.2 Change Assurance

Oracle Database 11g Release 1 introduces automatic capture and replay of workloads before and after changes so that you can analyze the impact of a database or a SQL change:

  • Database Replay

    The Database Replay feature addresses enables you to perform real-world testing by capturing the actual database workload on the production system and replaying it on the test system. It also provides analysis and reporting to highlight potential problems (for example, errors encountered and divergence in performance) and recommend ways to remedy the problems.

  • SQL Performance Analyzer

    SQL performance regression is always a concern during system changes such as database upgrades, initialization parameter changes, and addition or dropping of indexes. The SQL Performance Analyzer feature alleviates this concern by providing an easy way to assess the impact of a change on the performance of SQL statements by comparing and contrasting their response times before and after the change. SQL Replay enables you to capture the SQL workload from the source system, such as the production database, and to replay it on the test system where the change has been applied.

2.4.3 Oracle Enterprise Manager Grid Control

By reducing the amount of human intervention required to execute routine and repetitive tasks, services become more stable, reliable, and available. This is particularly important when administrators need to manage very large numbers of systems as efficiently as possible.

Oracle Enterprise Manager Grid Control is an HTML-based interface that provides the administrator with complete monitoring across the entire Oracle technology stack—business applications, application servers, databases, and the E-Business Suite—as well as non Oracle components. If a component within the fast application notification becomes unavailable or experiences performance problems, an alert is automatically generated to Grid Control to inform the administrator so appropriate action can be taken.

The components of Grid Control include:

  • Oracle Management Service (OMS)

    The OMS is now a set of J2EE applications that renders the interface for Grid Control, works with all Management Agents to process monitoring information, and uses the Management Repository as its persistent data store.

  • Oracle Management Agents

    These are processes deployed on each monitored host to monitor all targets on the host, communicate that information to OMS, and maintain the host and its targets.

  • Oracle Management Repository

    This is a schema in Oracle Database that contains all available information about administrators, targets, and applications managed by Grid Control.

Communication between Grid Control, the OMS, and Oracle Management Agents is done through HTTP. Also, you can enable SSL to allow secure communications between tiers within firewall-protected environments. The Management Agent uploads collected monitoring data to the OMS, which in turn loads the data into the Management Repository. Changes in a target state (such as an availability state change) result in an alert being generated to Grid Control.

Using Grid Control, an administrator can:

  • Monitor architecture components and be alerted when a failure occurs

  • View overall system status, such as the number of nodes in the database cluster and their current status

  • View alerts aggregated across all instances

  • Set thresholds for alert generation on a database cluster-wide basis

  • Monitor performance metric across all instances

  • Perform database cluster-wide operations such as backup and recovery

  • Interconnect monitoring of cluster databases

See Also:



Footnote Legend

Footnote 1: Beginning with Oracle Database release 11g, the primary and standby systems in a Data Guard configuration can have different CPU architectures, operating systems (for example, Windows and Linux), operating system binaries (32-bit and 64-bit), and Oracle database binaries (32-bit and 64-bit). For the latest capabilities and restrictions, see Metalink note 413484.1 at https://metalink.oracle.com/.