Industry-leading availability with minimal business interruption
Redundancy means that multiple components can perform the same task. The problem of a single point of failure is eliminated because redundant components can take over a task performed by a component that has failed.
Monitoring means checking whether or not a component is working properly.
Failover is the process by which a secondary component becomes primary when the primary component fails.
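The interplay of the three elements can be illustrated with a minimal sketch. It is not tied to any particular cloud service: the `Component` and `RedundantGroup` classes, the `probe` callback, and the node names are all hypothetical names chosen for illustration, and the health check is a simple flag standing in for a real probe.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Component:
    name: str
    healthy: bool = True  # hypothetical stand-in for a real health probe result


class RedundantGroup:
    """Redundancy: several components that can all perform the same task."""

    def __init__(self, components: List[Component],
                 probe: Callable[[Component], bool]):
        if not components:
            raise ValueError("at least one component is required")
        self.components = components
        self.probe = probe              # Monitoring: how health is checked
        self.primary = components[0]

    def check_and_failover(self) -> Optional[Component]:
        """Check the primary; promote the first healthy secondary if it failed.

        Returns the newly promoted component, or None if no failover occurred.
        """
        if self.probe(self.primary):
            return None
        for candidate in self.components:
            if candidate is not self.primary and self.probe(candidate):
                self.primary = candidate  # Failover: secondary becomes primary
                return candidate
        raise RuntimeError("no healthy component left to take over")


group = RedundantGroup(
    [Component("node-a"), Component("node-b")],
    probe=lambda c: c.healthy,
)
group.components[0].healthy = False     # simulate a failure of the primary
promoted = group.check_and_failover()
print(promoted.name)                    # prints "node-b"
```

In a real deployment the probe would typically be a network or application-level health check run on a schedule, and promotion would involve redirecting traffic rather than reassigning a variable.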
The best practices introduced here focus on these three key elements. High availability can be achieved at many different levels; here we focus on the application level and the cloud infrastructure level. Our Cloud Infrastructure region is a localized geographic area composed of one or more availability domains, each of which contains three fault domains. Within an availability domain, high availability is achieved by placing redundant resources in different fault domains.
An availability domain is one or more data centres located within a region. Availability domains are isolated from each other, fault tolerant, and unlikely to fail simultaneously. Because availability domains do not share physical infrastructure, such as power or cooling, or the internal availability domain network, a failure that impacts one availability domain is unlikely to impact the availability of others.
A fault domain is a grouping of hardware and infrastructure within an availability domain. Each availability domain contains three fault domains. Fault domains let you distribute your instances so that they are not on the same physical hardware within a single availability domain. As a result, an unexpected hardware failure or planned hardware maintenance that affects one fault domain does not affect instances in other fault domains. You can optionally specify the fault domain for a new instance at launch time, or you can let the system select one for you.
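As a rough illustration of distributing instances across fault domains, the sketch below assigns instances round-robin to the three fault domains of one availability domain. The fault domain labels, the instance names, and the `spread_across_fault_domains` helper are hypothetical; the sketch only computes a placement plan and does not call any cloud API.

```python
from collections import Counter
from itertools import cycle
from typing import Dict, List

# Assumption: three fault domains per availability domain, as described above.
# The label strings are hypothetical placeholders.
FAULT_DOMAINS = ["FAULT-DOMAIN-1", "FAULT-DOMAIN-2", "FAULT-DOMAIN-3"]


def spread_across_fault_domains(
    instance_names: List[str],
    fault_domains: List[str] = FAULT_DOMAINS,
) -> Dict[str, str]:
    """Assign each instance to a fault domain in round-robin order."""
    return dict(zip(instance_names, cycle(fault_domains)))


placement = spread_across_fault_domains(["web-1", "web-2", "web-3", "web-4"])
per_domain = Counter(placement.values())

for name, fault_domain in placement.items():
    print(f"{name} -> {fault_domain}")
```

With four instances, no fault domain holds more than two, so a failure or maintenance event in any single fault domain leaves at least two instances running.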
All the availability domains in a region are connected to each other by a low-latency, high-bandwidth network. This predictable, encrypted interconnection between availability domains provides the building blocks for both high availability and disaster recovery.
Our Cloud Infrastructure resources are either specific to a region, such as a virtual cloud network, or specific to an availability domain, such as a Compute instance. When we configure cloud services that are specific to an availability domain, it is important to use multiple availability domains or fault domains to ensure high availability and to protect against resource failure. By creating redundant Compute instances in other availability domains or fault domains, you can keep an issue that affects the primary Compute instance or its domain from impacting your applications. We design solutions to span multiple regions, multiple availability domains, or multiple fault domains, depending on the class of failures you want to protect against.
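The relationship between where replicas are placed and which class of failure they survive can be sketched as follows. The `Placement` type, the `survivable_failures` function, and the region and domain labels are hypothetical and for illustration only; the sketch assumes that a failure takes out exactly one fault domain, availability domain, or region at a time.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Placement:
    region: str
    availability_domain: str
    fault_domain: str


def survivable_failures(placements: List[Placement]) -> Set[str]:
    """Return the failure classes that leave at least one replica running."""
    regions = {p.region for p in placements}
    ads = {(p.region, p.availability_domain) for p in placements}
    fds = {(p.region, p.availability_domain, p.fault_domain) for p in placements}

    survivable = set()
    if len(fds) > 1:
        survivable.add("fault domain")         # replicas on separate hardware
    if len(ads) > 1:
        survivable.add("availability domain")  # replicas in separate data centres
    if len(regions) > 1:
        survivable.add("region")               # replicas in separate regions
    return survivable


# Hypothetical labels: two replicas in one availability domain, different fault domains.
same_ad = [
    Placement("region-1", "AD-1", "FD-1"),
    Placement("region-1", "AD-1", "FD-2"),
]
# Two replicas in different availability domains of the same region.
cross_ad = [
    Placement("region-1", "AD-1", "FD-1"),
    Placement("region-1", "AD-2", "FD-1"),
]

print(sorted(survivable_failures(same_ad)))
print(sorted(survivable_failures(cross_ad)))
```

Spreading replicas over a wider scope protects against a larger class of failures, generally at the cost of more resources and higher latency between replicas.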