High availability systems typically have redundant hardware and software that keep the system available despite failures. Well-designed high availability systems avoid having single points-of-failure. Any hardware or software component that can fail has a redundant component of the same type. When failures occur, the failover process moves processing performed by the failed component to the backup component. This process remasters systemwide resources, recovers partial or failed transactions, and restores the system to normal, preferably within a matter of microseconds. The more transparent that failover is to users, the higher the availability of the system.

Oracle has a number of products and features that provide high availability. These include Real Application Clusters, Oracle Real Application Clusters Guard, Oracle Replication, and Oracle9i Data Guard. These can be used in various combinations to meet your specific high availability needs.

This chapter describes high availability within the context of Real Application Clusters. Real Application Clusters are inherently high availability systems. Clusters typical of Real Application Clusters environments, as described in Chapter 3, can provide continuous service for both planned and unplanned outages. More on the topic of high availability is included in Chapter 11 and Chapter 12. These chapters describe high availability within the context of Oracle Real Application Clusters Guard.

You can classify systems and evaluate their expected availability by system type. Mission-critical and business-critical applications such as e-mail and Internet servers probably require significantly greater availability than do applications that have a smaller number of users. In addition, some systems could have a continuous (24 hours a day, seven days a week) uptime requirement, while others, such as a stock market tracking system, will have nearly continuous uptime requirements for specific time frames, such as when a stock market is open.

The software industry generally measures availability by using two types of metrics (measurements): mean time to recover (MTTR) and mean time between failures (MTBF). For most failure scenarios, the industry focuses on MTTR issues and investigates how to optimally design systems to reduce these. MTBF is generally more applicable to hardware availability metrics; this chapter does not go into detail about MTBF. However, given that you can design Real Application Clusters to avoid single points-of-failure, component failures might not necessarily result in application unavailability. Hence, Real Application Clusters can greatly increase the MTBF from an application availability standpoint.

Another metric that is generally used is number of nines. For example, 526 minutes of system unavailability for each year results in 99.9% or three nines availability. Five minutes of system unavailability for each year results in 99.999% or five nines availability. It is difficult to consider several nines of availability without also describing the ground rules and strict processes for managing application environments, testing methodologies, and change management procedures.

Because failover shifts processing from a failed component to its redundant counterpart, Real Application Clusters can significantly reduce MTTR during failures. This inevitably contributes toward a more favorable availability for an entire system. As mentioned, a well-designed Real Application Clusters system has redundant components that protect against most failures and that provide an environment without single points-of-failure. Working with your hardware vendor is key to building fully redundant cluster environments for Real Application Clusters.

Downtime can be either planned or unplanned. Figure 10-1 shows many causes of unplanned downtime. They can be broadly classified as system faults, data and media errors, and site outages. Unplanned downtime is disruptive because it is difficult to predict its timing.

However, planned downtime can be just as disruptive to operations, especially in global enterprises that support users in multiple time zones. Figure 10-2 shows several kinds of planned downtime. They can be generally classified as routine operations, periodic maintenance, and upgrades.
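The "number of nines" figures quoted in this section (about 526 minutes of downtime a year for 99.9%, about five minutes for 99.999%) can be checked with a little arithmetic. The Python sketch below is only an illustration: the function names are hypothetical, a 365-day year is assumed, and the MTBF/MTTR relation it includes is the commonly used steady-state approximation, not a formula taken from this chapter.

```python
# Illustrative sketch only: names are hypothetical and not part of any Oracle tool.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def availability_percent(downtime_minutes: float) -> float:
    """Return the availability percentage implied by yearly downtime in minutes."""
    return 100 * (1 - downtime_minutes / MINUTES_PER_YEAR)


def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Common approximation (an assumption here): MTBF / (MTBF + MTTR).

    A shorter MTTR or a longer MTBF both push the result toward 1.0.
    """
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        budget = allowed_downtime_minutes(target)
        print(f"{target}% availability allows about {budget:.1f} minutes per year")
```

Under that approximation, reducing MTTR through fast failover raises availability even when the underlying component failure rate stays the same, which is the effect this section attributes to Real Application Clusters.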