Tribune: IT supplement of The Tribune, Chandigarh, India. Guest Speak

Monday, September 22, 2003		Guest Speak

Data recovery tools very essential
V. Vivekanand

Business Development Manager, Hitachi Data Systems, India.

COMPANIES that survive on data and information consider them their most important assets. For these companies, data and information are not merely by-products of business processes. This calls for close attention to storage resources and their protection. Keeping data and information safe is critical to competitiveness and business success. Consequently, backup and restore become the most important component of protecting information.

It is well known that data storage requirements are skyrocketing, and so are the complexity and cost of managing them. In such a scenario, it is important that companies choose their backup and restore solutions carefully. In practice, backup and restore have become the largest component of the total cost of storage ownership. According to Gartner estimates, backup and restore costs account for 32 to 54 per cent of the total cost. These costs will increase further for companies that work on a 24x7x365 basis.

Besides the increasing use of collaborative computing applications such as Lotus Notes and MS Exchange, data warehousing, decision support systems and business automation applications (like ERP/CRM) drastically increase the need for disaster recovery solutions. Each of these systems needs backup and restore of varying degrees of complexity. For instance, a business process automation system runs on relational databases that provide a single view of the entire organisation, from the shop floor to the boardroom. In such cases, backup and restore solutions command high value because they protect the operational environment of the business.

Storage environment

While discussing this issue, it is important to understand what kind of storage environments are most suitable for implementing effective backup and restore functions. Analysts from IDC suggest that enterprise storage systems are the most apt for deploying backup and restore functions. Cost is an important consideration for organisations deploying an enterprise storage solution. The term ‘enterprise storage’ implies a consolidated storage system that is capable of supporting multiple operating systems and multiple mainframe, Unix and NT servers on a single storage system. IDC analysts also note that cost comparisons between enterprise storage systems and distributed storage systems indicate that the total cost of operation is lower in the case of the former. This may surprise many, as enterprise storage systems incur higher hardware and labour costs. However, it is important to note that the lowering of storage management costs offsets the hardware and manpower costs. IDC further observes that justifying storage on a mere $/MB basis is not enough. The justification comes from improvement in storage management efficiencies, avoidance of costly losses and enhancement of business value. This kind of approach makes storage backup and restore more important. Other IDC studies have also revealed that restore is 99 per cent successful in enterprise storage, as compared to 25-75 per cent success in distributed and desktop servers.

In an enterprise storage system, the complexity and cost of the backup/restore function vary, depending upon whether it is conducted at the volume, file, database or application level. The lower levels of backup may be the easiest to execute from a hardware standpoint but may be the most expensive in terms of management costs and downtime.

Backup

File level backup/restore is more difficult to perform, as it requires the backup software to know and communicate information on the locations and extents of a dataset or file. Further, file level backup between platforms requires communication between a backup agent on the client system that can interrogate the file system and a backup agent on the server that records the file control information that can be used for retrieval. Since this is done over a network, it has to be ensured that network speed does not become a constraint. Backup time is minimised when data is transferred at channel/bus speeds rather than at network speeds.
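The client-side agent's job of interrogating the file system can be sketched in a few lines of Python. This is only an illustration: the function name build_catalog and the fields recorded (relative path, size, modification time) are assumptions made here, not the format of any particular backup product.

```python
import os
from pathlib import Path


def build_catalog(root):
    """Walk a directory tree and collect per-file control information.

    Hypothetical client-agent step: a real agent would send these
    entries to the backup server, which stores them for later retrieval.
    """
    catalog = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = Path(dirpath) / name
            info = path.stat()
            catalog.append(
                {
                    "path": str(path.relative_to(root)),  # location within the backup set
                    "size": info.st_size,                 # size in bytes
                    "mtime": info.st_mtime,               # last modification time
                }
            )
    return catalog
```

The server-side agent would keep such a catalog alongside the backed-up data, so that a single file can be located and restored without reading the whole backup.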

Database level backups are usually performed in one of two modes: ‘cold’ and ‘hot’. In ‘cold’ backups of databases, applications running against the databases must be stopped and all in-flight data must be flushed to the disk before the backup is started. The applications must remain suspended until the backup is complete, resulting in downtime. In order to reduce this downtime, companies use database utilities to do ‘hot’ backups. Online or ‘hot’ backups still require some downtime. However, once the applications are stopped, data is flushed back to the disk and a synch point is established, applications can resume in a read-access mode. Online database backups require speed: the longer the backup takes, the more new transactions accumulate in the logs and the longer it takes to re-synch the database after the backup. In order to reduce downtimes of this nature, many storage vendors offer solutions that take snapshot copies for backup of mainframe data.
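The relationship between backup duration and re-synch time can be made concrete with a rough model. The sketch below assumes a constant rate of incoming transactions and a constant log replay rate; both rates, and the function name resync_seconds, are illustrative assumptions rather than measured figures.

```python
def resync_seconds(backup_seconds, tx_per_second, replay_tx_per_second):
    """Estimate how long re-synching takes after an online backup.

    Simplified model: transactions logged during the backup form a
    backlog that is replayed afterwards at a fixed rate.
    """
    if replay_tx_per_second <= 0:
        raise ValueError("replay rate must be positive")
    backlog = backup_seconds * tx_per_second  # transactions accumulated in the logs
    return backlog / replay_tx_per_second


# A one-hour backup versus a one-minute snapshot, same workload.
print(resync_seconds(3600, 50, 500))  # 360.0 seconds
print(resync_seconds(60, 50, 500))    # 6.0 seconds
```

Under these assumptions, re-synch time grows in direct proportion to backup time, which is why a fast snapshot copy shortens the whole window.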

Application level backup/restore, however, is quite different. Application backup and recovery involve the backup and recovery of logically related groups of database objects across multiple databases. Backups must be synchronised to generate a consistent recovery point across multiple servers. Application backup requires a central management server to control and coordinate recovery.
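One way to picture the coordinating role of the central management server is as a search for the latest synch point that every server shares. The following sketch assumes each server reports the identifiers of the synch points it holds; the function name and the data layout are hypothetical.

```python
def consistent_recovery_point(sync_points_by_server):
    """Return the most recent synch point held by every server.

    sync_points_by_server maps a server name to the synch point
    identifiers (larger means later) it has backed up. Returns None
    when the servers have no synch point in common.
    """
    if not sync_points_by_server:
        raise ValueError("at least one server is required")
    common = set.intersection(
        *(set(points) for points in sync_points_by_server.values())
    )
    return max(common) if common else None


points = {"db1": [1, 2, 3, 4], "db2": [1, 2, 4], "app": [2, 3]}
print(consistent_recovery_point(points))  # 2
```

In this example, synch point 4 is the latest on two servers, but only synch point 2 exists on all three, so it is the only point to which the whole application can be restored consistently.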

End-to-end recovery

Having looked at the complexities involved in backup and restore functions across different levels, let us now take a look at the issues involved in the end-to-end recovery process. The process of recovery is initiated once a failure has been detected in the system. High availability middleware can detect a failure automatically, and the hardware connections for data services automatically switch over to a standby system. However, once the hardware is recovered, a number of steps must be followed.

If a file system was in use, it must be recovered. Further, if a database instance has failed, it must be restarted and the configuration files must be read. The recovery of a database begins with a roll forward, which involves reading through the redo log files and rebuilding and reapplying each applicable transaction. After roll-forward recovery is completed, a roll-back recovery is done to undo any uncommitted or incomplete transactions that belonged to the failed system. If disk mirroring was in use, the consistency of the mirrors must be checked. If they are not in sync, they must be reconstructed from logs. This step can be avoided by using disk arrays. Finally, once the database is recovered, the application must go through its own recovery, and this depends upon the application itself.
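The roll-forward and roll-back phases described above can be illustrated with a toy example. The sketch below assumes a very simple log in which every write records both the old and the new value and a separate record marks each commit; real database engines use far more elaborate log formats, and the names LogRecord and recover are invented for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogRecord:
    txid: int
    kind: str                   # "write" or "commit"
    key: Optional[str] = None
    old: Optional[int] = None   # value before the write (None = key absent)
    new: Optional[int] = None   # value after the write


def recover(snapshot, log):
    """Rebuild a key-value store from a backup snapshot and a log."""
    db = dict(snapshot)
    committed = set()

    # Roll forward: reapply every logged write in order.
    for rec in log:
        if rec.kind == "write":
            db[rec.key] = rec.new
        elif rec.kind == "commit":
            committed.add(rec.txid)

    # Roll back: undo writes of uncommitted transactions, newest first.
    for rec in reversed(log):
        if rec.kind == "write" and rec.txid not in committed:
            if rec.old is None:
                db.pop(rec.key, None)
            else:
                db[rec.key] = rec.old
    return db


log = [
    LogRecord(1, "write", "a", old=1, new=5),
    LogRecord(2, "write", "b", old=None, new=7),
    LogRecord(1, "commit"),
    LogRecord(2, "write", "a", old=5, new=9),  # transaction 2 never commits
]
print(recover({"a": 1}, log))  # {'a': 5}
```

Transaction 1 committed before the failure, so its change to "a" survives; transaction 2 did not, so both of its writes are undone during roll back.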