[University of Arkansas][Computing Services]

Disaster Recovery Plan
Objectives and Overview
(DRPDR001)

Last update: Tuesday, 21-Mar-2000 10:25:18 CST

Over the years, dependence upon the use of computers in the day-to-day business activities of many organizations has become the norm. The University of Arkansas certainly is no exception to this trend. Today you can find very powerful computers in every department on campus. These machines are linked together by a sophisticated network that provides communications with other machines across campus and around the world. Vital functions of the University depend on the availability of this network of computers.

Consider for a moment the impact of a disaster that prevents the use of the system to process Student Registration, Payroll, Accounting, or any other vital application for weeks. Students and faculty rely upon our systems for instruction and research purposes, all of which are important to the well-being of the University. It is hard to estimate the damage to the University that such an event might cause. One tornado properly placed could easily cause enough damage to disrupt these and other vital functions of the University. Without adequate planning and preparation to deal with such an event, the University's central computer systems could be unavailable for many weeks.


Primary FOCUS of the Plan

The primary focus of this document is to provide a plan to respond to a disaster that destroys or severely cripples the University's central computer systems operated by the Computing Services Department. The intent is to restore operations as quickly as possible with the latest and most up-to-date data available.

IMPORTANT NOTE!

All disaster recovery plans assume a certain amount of risk, the primary one being how much data is lost in the event of a disaster. Disaster recovery planning is much like the insurance business in many ways. There are compromises between the amount of time, effort, and money spent in planning and preparing for a disaster and the amount of data loss you can sustain and still remain operational following a disaster. Time enters the equation, too. Many organizations simply cannot function without the computers they need to stay in business, so their recovery efforts may focus on quick recovery, or even zero down time, by duplicating and maintaining their computer systems in separate facilities.

The techniques for backup and recovery used in this plan do NOT guarantee zero data loss. The University administration is willing to assume the risk of data loss and do without computing for a period of time in a disaster situation. To put it in a more fiscal sense, the University is saving dollars in up-front disaster preparation costs, and then relying upon business interruption and recovery insurance to help restore computer operations after a disaster.

Data recovery efforts in this plan are targeted at getting the systems up and running with the last available off-site backup tapes. Significant effort will be required after system operation is restored to (1) restore data integrity to the point of the disaster and (2) synchronize that data with any new data collected from the point of the disaster forward.

This plan does not attempt to cover either of these two important aspects of data recovery. Instead, individual users and departments will need to develop their own disaster recovery plans to cope with the unavailability of the computer systems during the restoration phase of this plan and to cope with potential data loss and synchronization problems.


Primary OBJECTIVES of the Plan

This disaster recovery plan has the following primary objectives:

  1. Present an orderly course of action for restoring critical computing capability to the UofA campus within 14 days of initiation of the plan.

  2. Set criteria for making the decision to recover at a cold site or repair the affected site.

  3. Describe an organizational structure for carrying out the plan.

  4. Provide information concerning personnel that will be required to carry out the plan and the computing expertise required.

  5. Identify the equipment, floor plan, procedures, and other items necessary for the recovery.


OVERVIEW of the Plan

This plan uses a "cookbook" approach to recovery from a disaster that destroys or severely cripples the computing resources at the Administrative Services Building at 155 Razorback Road in Fayetteville and possibly at other critical campus facilities.

Personnel
Immediately following the disaster, a planned sequence of events begins. Key personnel are notified and recovery teams are grouped to implement the plan. Personnel currently employed are listed in the plan. However, the plan has been designed to be usable even if some or all of the personnel are unavailable.

In a disaster it must be remembered that PEOPLE are your most valuable resource. The recovery personnel working to restore the computing systems will likely be working at great personal sacrifice, especially in the early hours and days following the disaster. They may have injuries hampering their physical abilities. The loss or injury of a loved one or coworker may affect their emotional ability. They will have physical needs for food, shelter, and sleep.

The University must take special pains to ensure that the recovery workers are provided with resources to meet their physical and emotional needs. This plan calls for the appointment of a person in the Administrative Support Team whose job will be to secure these resources so that the recovery workers can concentrate on the task at hand.

Salvage Operations at Disaster Site
Early efforts are targeted at protecting and preserving the computer equipment. In particular, any magnetic storage media (hard drives, magnetic tapes, diskettes) are identified and either protected from the elements or removed to a clean, dry environment away from the disaster site.

Designate Recovery Site
At the same time, a survey of the disaster scene is done by appropriate personnel to estimate the amount of time required to put the facility (in this case, the building and utilities) back into working order. A decision is then made whether to use the Cold Site, a location some distance away from the scene of the disaster where computing and networking capabilities can be temporarily restored until the primary site is ready. Work on repairing or rebuilding the primary site begins almost immediately. That work may take months, and its details are beyond the scope of this document.
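
The survey-then-decide step above can be sketched in a few lines of Python. This is only an illustration, not the plan's actual decision procedure: the SiteSurvey fields, the function name, and the use of the 14-day figure from the plan's first objective as a threshold are all assumptions made for the example. The real criteria are set elsewhere in the plan.

```python
from dataclasses import dataclass

# Assumption: the 14-day figure is borrowed from the plan's first objective
# purely as an illustrative threshold; the real criteria live in the plan.
RECOVERY_TARGET_DAYS = 14


@dataclass
class SiteSurvey:
    """Hypothetical summary of the disaster-scene survey."""
    building_repair_days: int   # estimated days until the building is usable
    utilities_repair_days: int  # estimated days until utilities are restored


def choose_recovery_site(survey: SiteSurvey,
                         target_days: int = RECOVERY_TARGET_DAYS) -> str:
    """Return "primary" if the facility can be ready in time, else "cold"."""
    # The facility is usable only once both building and utilities are ready,
    # so the slower of the two estimates governs.
    facility_ready_days = max(survey.building_repair_days,
                              survey.utilities_repair_days)
    return "primary" if facility_ready_days <= target_days else "cold"
```

In practice the threshold would need to leave room for equipment reassembly and data restoration, so a survey estimate close to the target would likely still favor the Cold Site.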

Purchase New Equipment
The recovery process relies heavily upon vendors to quickly provide replacements for the resources that cannot be salvaged. The University will rely upon emergency procurement procedures documented in this plan and approved by the University's purchasing office and the Office of State Purchasing to quickly place orders for equipment, supplies, software, and any other needs.

Begin Reassembly at Recovery Site
Salvaged and new components are reassembled at the recovery site according to the instructions contained in this plan. Since all plans of this type are subject to the inherent changes that occur in the computer industry, it may become necessary for recovery personnel to deviate from the plan, especially if the plan has not been kept up to date. If vendors cannot provide a certain piece of equipment on a timely basis, it may be necessary for the recovery personnel to make last-minute substitutions. After the equipment reassembly phase is complete, the work turns to the data recovery procedures.

Restore Data from Backups
Data recovery relies entirely upon the use of backups stored in locations off-site from the Administrative Services Building. Backups can take the form of magnetic tape, CD-ROMs, disk drives, and other storage media. Early data recovery efforts focus on restoring the operating system(s) for each computer system. Next, first-line recovery of application and user data from the backup tapes is done. Individual application owners may need to be involved at this point, so teams are assigned for each major application area to ensure that data is restored properly.

Restore Applications Data
It is at this point that the disaster recovery plans for users and departments (i.e., the application owners) must merge with the completion of the Computing Services plan. Since some time may have elapsed between the time that the off-site backups were made and the time of the disaster, application owners must have means for restoring each running application database to the point of the disaster. They must also take all new data collected since that point and input it into the application databases. When this process is complete, the University computer systems can reopen for business. Some applications may be available only to a few key personnel, while others may be available to anyone who can access the computer systems.
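
The two catch-up tasks described above (bringing data forward from the last off-site backup to the point of the disaster, then entering data collected afterward) depend on two timestamps. The following Python sketch shows one way an application owner might compute the gap and sort records into the matching buckets. The function names and bucket labels are hypothetical and are not part of this plan.

```python
from datetime import datetime, timedelta


def data_gap(last_offsite_backup: datetime, disaster: datetime) -> timedelta:
    """Time span of data that the off-site backup does not cover."""
    if disaster < last_offsite_backup:
        raise ValueError("backup timestamp is later than the disaster")
    return disaster - last_offsite_backup


def classify_record(recorded_at: datetime,
                    last_offsite_backup: datetime,
                    disaster: datetime) -> str:
    """Sort a record into one of three hypothetical recovery buckets."""
    if recorded_at <= last_offsite_backup:
        return "in_backup"      # already restored from the backup media
    if recorded_at <= disaster:
        return "reconstruct"    # lost in the disaster; rebuild from other sources
    return "enter_new"          # collected after the disaster; key in afresh
```

For example, with a backup taken at 02:00 on one day and a disaster at 10:00 two days later, data_gap reports a window of 2 days and 8 hours that must be reconstructed from paper records or other sources.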

Move Back to Restored Permanent Facility
If the recovery process has taken place at the Cold Site, physical restoration of the Administrative Services Building (or an alternate facility) will have begun. When that facility is ready for occupancy, the systems assembled at the Cold Site are to be moved back to their permanent home. This plan does not attempt to address the logistics of this move, which should be vastly less complicated than the recovery work done at the Cold Site.


Copyright © 1997 University of Arkansas
All rights reserved