Business Continuity for Mission-Critical Applications
Data protection and availability tools enable business continuity for mission-critical applications.
A range of solutions grows in cost and complexity as availability requirements increase.
Protection and availability options range from basic backup to wide area replication with failover.

6.1 Assessing Business Continuity Objectives
Steps in the assessment process include the following:
List all applications.
Identify dependencies between applications.
Create application groups based on dependencies.
Prioritize application groups based on importance.
Assign availability metrics to application groups and to applications within groups.
Recovery time objective (RTO) is a measure of the acceptable time allowed for recovery. RTO poses the question: how long can the application be down?
Recovery point objective (RPO) measures how far back in time the recovered application and data may be. Simply put, how much data can be lost?
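
The assessment steps above can be sketched in a few lines of Python. This is only an illustration: the application names, dependency map, and importance scores are hypothetical, and treating any dependency link as grounds for putting two applications in the same group is an assumption rather than a rule from the text.

```python
from collections import defaultdict

# Hypothetical inventory: application -> applications it depends on.
DEPENDENCIES = {
    "web_store": ["order_db", "payment_gateway"],
    "order_db": [],
    "payment_gateway": ["order_db"],
    "email": ["directory"],
    "directory": [],
    "wiki": [],
}

# Hypothetical importance scores (higher means more important).
IMPORTANCE = {
    "web_store": 10, "order_db": 9, "payment_gateway": 9,
    "email": 6, "directory": 5, "wiki": 2,
}


def group_by_dependency(deps):
    """Group applications that are linked, directly or indirectly, by a dependency."""
    neighbors = defaultdict(set)
    for app, needs in deps.items():
        neighbors[app]  # touch the key so applications with no links still appear
        for other in needs:
            neighbors[app].add(other)
            neighbors[other].add(app)

    seen, groups = set(), []
    for app in sorted(neighbors):
        if app in seen:
            continue
        # Depth-first walk collects everything reachable from this application.
        stack, group = [app], set()
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbors[current] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


def prioritize(groups, importance):
    """Order groups by their most important member, highest first."""
    return sorted(groups, key=lambda g: max(importance[a] for a in g), reverse=True)


ranked_groups = prioritize(group_by_dependency(DEPENDENCIES), IMPORTANCE)
for rank, group in enumerate(ranked_groups, start=1):
    print(rank, group)
```

Each ranked group can then be given an RTO and an RPO, with the strictest values going to the groups at the top of the list.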
Data protection and availability solutions can be mapped out by RTO and RPO and matched to appropriate application requirements.
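
One way to picture this mapping is a small catalog lookup. The sketch below assumes a hypothetical catalog in which every RTO, RPO, and cost figure is invented for illustration; real figures depend on the products, data volumes, and networks involved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Solution:
    name: str
    rto_hours: float    # assumed worst-case time to restore service
    rpo_hours: float    # assumed worst-case window of lost data
    relative_cost: int  # 1 = cheapest; a stand-in for TCO


# Illustrative figures only, not measurements or vendor data.
CATALOG = [
    Solution("tape backup", rto_hours=72.0, rpo_hours=24.0, relative_cost=1),
    Solution("point-in-time copies", rto_hours=4.0, rpo_hours=4.0, relative_cost=2),
    Solution("asynchronous replication with failover", rto_hours=1.0, rpo_hours=0.25, relative_cost=3),
    Solution("synchronous replication with failover", rto_hours=0.5, rpo_hours=0.0, relative_cost=4),
]


def cheapest_solution(required_rto: float, required_rpo: float) -> Optional[Solution]:
    """Return the lowest-cost solution that meets both objectives, or None."""
    candidates = [
        s for s in CATALOG
        if s.rto_hours <= required_rto and s.rpo_hours <= required_rpo
    ]
    return min(candidates, key=lambda s: s.relative_cost, default=None)


choice = cheapest_solution(required_rto=2.0, required_rpo=1.0)
print(choice.name if choice else "no solution in the catalog meets these objectives")
```

Choosing the cheapest qualifying entry reflects only one of the selection factors; complexity, security, or scalability could be added as further filters in the same way.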
Factors affecting the choice of solution include:
Appropriate level of availability
Total cost of ownership (TCO)
Complexity
Vendor viability in the long run
Performance impact
Security
Scalability

6.2 Availability within the Data Center
Backup: Backing up the data on a regular schedule to a tape device provides a fundamental level of availability.
Disk Redundancy (RAID): Since storage disks have some of the lowest mean time between failures (MTBF) in a computer system, mirroring the boot disk and using RAID storage provides resilience in case of disk failure.
Quick Recovery File Systems: In the event of a system crash and reboot, a quick recovery file system speeds up the reboot process, which in turn minimizes the time the application remains unavailable.
Point-in-Time Copies: RAID storage and high-availability software do not protect applications from logical data corruption. Online point-in-time copies of data ensure a more timely recovery.
High Availability and Clustering: Deploying high-availability and clustering software with redundant server hardware enables automated detection of server failure and provides transparent failover of the application to a second server with minimal disruption to end users.

6.3 Availability Across Geographies: Disaster Tolerance and Recovery
Offsite Media Vaulting and Recovery: Offsite tape backups provide regular archived copies for disaster recovery. Full recovery from tape can take a week or more.
Remote Mirroring and Failover: Replicating data over a WAN to a server at a remote site ensures that the application can be made available again in a relatively short time after a data center outage. Most methods of remote data replication provide a passive secondary standby system. In the event of a primary data center outage, wide-area failover software, which provides failure notification and automated application recovery at the secondary site, ensures a relatively quick and error-free resumption of application service at that site.
Data Replication: Replication modes may be synchronous, asynchronous, or periodic.
Methods of Replication: Redundancy layers exist across applications, databases, file systems, logical volumes, and storage devices or LUNs.
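
The practical difference between the three replication modes is how much recent data can be missing at the secondary site when the primary fails. The following sketch makes simplifying assumptions that are not from the text: timestamps are in seconds, asynchronous replication trails the primary by a fixed lag, and periodic replication ships everything instantly at fixed intervals starting at time zero. The function and parameter names are hypothetical.

```python
import math


def unreplicated_writes(mode, write_times, failure_time, lag=0.0, interval=1.0):
    """Return the timestamps of writes that had not reached the secondary at failure."""
    if mode == "synchronous":
        # A write completes only after the secondary has it, so nothing is behind.
        cutoff = failure_time
    elif mode == "asynchronous":
        # The secondary trails the primary by `lag` seconds (assumed constant).
        cutoff = failure_time - lag
    elif mode == "periodic":
        # Everything up to the most recent scheduled transfer has been shipped.
        cutoff = math.floor(failure_time / interval) * interval
    else:
        raise ValueError(f"unknown replication mode: {mode}")

    committed = [t for t in write_times if t <= failure_time]
    return [t for t in committed if t > cutoff]


writes = [1, 12, 25, 58, 61, 64]  # hypothetical write timestamps
failure = 65                       # hypothetical moment the primary site fails

for mode, options in [
    ("synchronous", {}),
    ("asynchronous", {"lag": 2}),
    ("periodic", {"interval": 30}),
]:
    lost = unreplicated_writes(mode, writes, failure, **options)
    print(f"{mode}: {len(lost)} write(s) missing at the secondary")
```

Under these assumptions the number of missing writes grows from synchronous to asynchronous to periodic, which is the RPO trade-off to weigh against the cost and performance impact of each mode.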
Secondary sites require special consideration and may be used for protection and for optimized deployments, such as failover and easier systems maintenance.

6.4 Corporate Systems
Corporate systems span load balancers, Web servers, application servers, database servers, storage area networks, and storage subsystems.
Applications include external Web sites, internal Web sites, e-commerce, email, enterprise applications (ERP, CRM, supply chain, etc.), and call centers.
Each application has specific requirements that need to be mapped across redundancy layers to determine the optimal availability and protection measures.