Understanding RPO and RTO: Their Roles in a Disaster Recovery Plan and Business Impact Analysis
By Joseph Giella, Vice President - Data Protection Strategy, Marcum Technology
A disaster recovery plan (DRP) is a set of documented processes and procedures containing detailed instructions about how to respond to and recover from disruptive events. These can include cyber-attacks, natural disasters, electrical blackouts, hazmat exposure, or something as drastic as the terrorist attacks of 9/11. The plan lays out the actions to be executed following a disaster, with a focus on resuming work quickly and reducing further interruptions. Increasingly, disasters have been attributed to cybersecurity-related events.
Dr. Stephen Covey, in his bestselling book, “The 7 Habits of Highly Effective People,” offers as one of those habits: “Begin with the End in Mind.” Nothing could be more applicable to properly executed disaster recovery planning. The “begin” is the business impact analysis (BIA), and the “end” is the successful completion, execution and maintenance of the DRP.
Disaster recovery planning often focuses on two critical objectives that define the limits of acceptable data loss and downtime — recovery point objective (RPO) and recovery time objective (RTO). RPO and RTO are the management-defined objectives for moving from the current state to the desired state of preparedness.
RPO is the point in time (prior to the outage) to which systems and data must be restored.
RTO is the period of time (after an outage) within which systems and data must be restored to the predetermined RPO in order to avoid significant damage to the business. It includes the time spent restoring the application and its data.
These two objectives represent the organization’s defined requirements for the amount of downtime and permanent data loss management has determined the organization can tolerate.
Figure: a timeline of recovery points and recovery times, in which a lightning bolt represents the occurrence of a disaster or unplanned downtime. Source: Symantec Corp.
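As a rough illustration of how the two objectives are measured against an actual incident, the following Python sketch compares the data loss window and the downtime with the RPO and RTO. The function name, the timestamps and the four-hour objectives are all hypothetical values chosen for the example, not figures from any real plan.

```python
from datetime import datetime, timedelta

def meets_objectives(last_good_backup, outage_start, service_restored, rpo, rto):
    """Compare one incident against its recovery objectives (illustrative only)."""
    # Work performed after the last recovery point is lost.
    data_loss_window = outage_start - last_good_backup
    # Time from the start of the outage until service resumed.
    downtime = service_restored - outage_start
    return {
        "data_loss_window": data_loss_window,
        "downtime": downtime,
        "rpo_met": data_loss_window <= rpo,
        "rto_met": downtime <= rto,
    }

# Hypothetical incident: nightly backup at 02:00, outage at 09:30, restored at 12:30.
result = meets_objectives(
    last_good_backup=datetime(2024, 3, 1, 2, 0),
    outage_start=datetime(2024, 3, 1, 9, 30),
    service_restored=datetime(2024, 3, 1, 12, 30),
    rpo=timedelta(hours=4),  # assumed objective
    rto=timedelta(hours=4),  # assumed objective
)
print(result)
```

In this made-up case the three hours of downtime fall within the RTO, but the seven and a half hours since the last backup exceed the RPO, which would suggest that more frequent recovery points are needed for that system.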
A goal of disaster recovery planning is to achieve the quickest resumption of operations at the lowest cost. This is a challenge because the technologies that deliver the quickest recovery also cost the most. Prioritizing which functions must be recovered first ensures that the most mission-critical systems are available as soon as possible.
The disaster recovery plan begins with the business impact analysis. The BIA identifies each business function and, after interviews with key personnel and management are completed, assigns RTO and RPO values to it. For example, the RTO for email services might be one hour and the RPO might be zero. An RPO of zero would be necessary for SEC-regulated firms, where a court could impose large fines for each day that an evidentiary email message could not be produced, such as in a case of insider trading.
When RPO and RTO are known for all systems, workloads and applications, as well as the cost of downtime for the business they support, the right decisions can be made to protect data. IT leadership is empowered to select the right technologies and build a suitable strategy around data protection and disaster recovery.
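One possible way to record BIA results and derive a recovery order from them is sketched below in Python. The `BusinessFunction` class, the `recovery_order` helper, the tie-breaking rule and every figure except the email RTO of one hour and RPO of zero mentioned above are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta

@dataclass
class BusinessFunction:
    name: str
    rto: timedelta
    rpo: timedelta
    downtime_cost_per_hour: float  # hypothetical figure gathered in BIA interviews

def recovery_order(functions):
    """Shortest RTO first; ties go to the function that costs more per hour of downtime."""
    return sorted(functions, key=lambda f: (f.rto, -f.downtime_cost_per_hour))

# Example inventory; only the email RTO/RPO come from the article.
functions = [
    BusinessFunction("payroll", timedelta(hours=24), timedelta(hours=12), 500.0),
    BusinessFunction("email", timedelta(hours=1), timedelta(0), 2000.0),
    BusinessFunction("order entry", timedelta(hours=1), timedelta(minutes=15), 8000.0),
]

order = [f.name for f in recovery_order(functions)]
print(order)
```

Sorting by RTO and then by cost of downtime is only one reasonable policy. An organization might instead weight regulatory exposure or dependencies between systems.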
Business Impact Analysis
A business impact analysis can help assess and weigh the impact and consequences, both financial and non-financial, of an interruption in business operations. These findings can help organizations determine their availability service level agreements (SLAs), which define the level of service the customer expects from the entity providing the service. Most often, multiple SLAs are defined to match the various levels of criticality determined during the BIA.
For example, the following SLAs for uptime are commonly utilized:
- 99%, or two 9s, corresponds to 3 days 15 hours and 36 minutes of downtime per year.
- 99.9%, or three 9s, corresponds to 8 hours 45 minutes and 36 seconds of downtime per year.
- 99.99%, or four 9s, corresponds to 52 minutes and 34 seconds of downtime per year.
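The downtime figures above follow from multiplying the unavailable fraction of the year by its length. A minimal Python sketch of that arithmetic, assuming a 365-day year (the function name is hypothetical):

```python
from datetime import timedelta

def allowed_downtime_per_year(availability_percent, days_per_year=365):
    """Return the yearly downtime permitted by an availability percentage."""
    unavailable_fraction = 1 - availability_percent / 100
    return timedelta(days=days_per_year) * unavailable_fraction

for sla in (99.0, 99.9, 99.99):
    print(f"{sla}% uptime allows about {allowed_downtime_per_year(sla)} of downtime per year")
```

Results are subject to small floating-point rounding, and using a 365.25-day year would shift each figure slightly.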
In DRP, it is crucial to prioritize recovery services relative to their contribution to the business. Not all services are created equal, nor should the investment in recovery efforts be the same.
Steps after BIA
While no two DRPs are alike, subsequent steps revolve around a central framework to support the organization’s objectives as defined by the RPOs and RTOs identified.
An example of subsequent steps would resemble the following:
- Set Clear Recovery Objectives
- Identify Professionals and Stakeholders (internal and external)
- Document Existing Network Infrastructure
- Identify and Select Recovery Processes
- Define an Incident Criteria Checklist
- Document Disaster Recovery Procedures
- Identify Intervals for Testing and Test
- Continuously Update DRP
Because the world and the business environment keep changing, the DRP needs to be updated and tested regularly. Most IT managers view technology initiatives as projects with milestones and deliverables that have start and completion dates. In contrast, DRP is not a project; rather, it is a program that must be maintained and revisited often to stay ahead of emerging threats, address evolving business targets, and leverage technology advances.
Although there is no standard frequency for reviewing and updating your disaster recovery plan, you should review, test and update your DRP at least annually to make sure everything is functioning as expected. Quarterly updates and reviews are even better.
Regular testing reveals limitations in your existing DRP. Correct each flaw as it is found so that the revised plan stays aligned with your company’s requirements.
Data processing operations are volatile in nature, resulting in frequent changes to equipment, programs and documentation. These actions make it critical to consider the plan as a living, breathing and ever-changing document.
Disaster recovery planning should also serve to streamline technology processes, identify and refresh hardware, and reduce the risk of human error. You are not just preparing to recover from a disaster. You are working to make your business as resilient, efficient and profitable as possible.