Backups have two distinct purposes. The primary purpose is to recover data after its loss, whether through deletion or corruption. Data loss is a very common experience for computer users.

Backing up personal computers and servers is important for preventing permanent data loss. Getting to know different backup tools is therefore valuable, especially for system administrators who work with large amounts of enterprise-level data, but also for users of personal computers. It is good practice to back up data regularly; this can be done manually or configured to run automatically.

In information technology, a backup, or data backup, is a copy of computer data taken and stored elsewhere so that it may be used to restore the original after a data loss event. The verb form, referring to the process of doing so, is "back up", whereas the noun and adjective form is "backup". A backup system contains at least one copy of all data considered worth saving. The data storage requirements can be large. An information repository model may be used to provide structure to this storage.

There are different types of data storage devices used for copying backups of data that is already in secondary storage onto archive files. Data is selected, extracted, and manipulated for storage. The process can include methods for dealing with live data, including open files, as well as compression, encryption, and de-duplication. Additional techniques apply to enterprise client-server backup. Backup schemes may include dry runs that validate the reliability of the data being backed up.
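As an illustration of the de-duplication step mentioned above, the following is a minimal sketch of a content-addressed chunk store. It assumes fixed-size chunks and SHA-256 digests; the class name `DedupStore` and its methods are hypothetical and do not correspond to any particular backup product.

```python
import hashlib


class DedupStore:
    """Content-addressed chunk store: identical chunks are kept only once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size  # assumed fixed chunk size
        self.chunks = {}              # digest -> chunk bytes

    def put(self, data):
        """Store data and return its recipe (ordered list of chunk digests)."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Only the first occurrence of a chunk consumes storage.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Reassemble the original data from a recipe."""
        return b"".join(self.chunks[digest] for digest in recipe)

    def stored_bytes(self):
        """Total bytes actually held after de-duplication."""
        return sum(len(chunk) for chunk in self.chunks.values())
```

Two backups that share most of their chunks then cost little more than one, because only the differing chunks are stored a second time. Real systems often use variable-size chunking so that an insertion near the start of a file does not shift every later chunk.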

There are limitations [5] and human factors involved in any backup scheme. A backup strategy requires an information repository, "a secondary storage space for data" [6] that aggregates backups of data "sources". The repository could be as simple as a list of all backup media (DVDs, etc.).

The backup data needs to be stored, requiring a backup rotation scheme, [4] which is a system of backing up data to computer media that limits the number of backups of different dates retained separately by re-using the storage media, overwriting backups that are no longer needed.
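A rotation scheme of the simplest kind can be sketched as a round-robin over a fixed set of media, where each new backup overwrites the oldest one. This is only an illustrative sketch; the class `RoundRobinRotation` and the media labels are hypothetical, and practical schemes are usually more elaborate.

```python
from collections import deque


class RoundRobinRotation:
    """Reuse a fixed set of media, always overwriting the oldest backup."""

    def __init__(self, media_labels):
        if not media_labels:
            raise ValueError("at least one medium is required")
        self.queue = deque(media_labels)  # leftmost = next medium to overwrite
        self.contents = {}                # label -> date of the backup it holds

    def record_backup(self, date):
        """Write the backup for `date` and return the medium that was used."""
        label = self.queue.popleft()   # medium holding the oldest (or no) backup
        self.contents[label] = date    # the old backup on it is overwritten
        self.queue.append(label)       # this medium now holds the newest backup
        return label

    def retained_dates(self):
        """Dates of the backups currently retained, oldest first."""
        return sorted(self.contents.values())
```

With three media, the fourth backup reuses the first medium, so at most three backup dates are retained at any time.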

The scheme determines how and when each piece of removable storage is used for a backup operation and how long it is retained once it has backup data stored on it. The 3-2-1 rule can aid in the backup process. It states that there should be at least 3 copies of the data, stored on 2 different types of storage media, and that one copy should be kept offsite, in a remote location (this can include cloud storage).
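The rule above, commonly known as the 3-2-1 rule, is easy to check mechanically. The sketch below assumes each copy is described by a media type and an offsite flag; the names `Copy` and `satisfies_3_2_1` are hypothetical, and whether the live original counts as one of the three copies is left to the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    media_type: str  # e.g. "hdd", "tape", "cloud" (labels are illustrative)
    offsite: bool    # True if the copy is kept in a remote location


def satisfies_3_2_1(copies):
    """Return True if the given copies meet the 3-2-1 rule."""
    enough_copies = len(copies) >= 3
    enough_media_types = len({c.media_type for c in copies}) >= 2
    has_offsite_copy = any(c.offsite for c in copies)
    return enough_copies and enough_media_types and has_offsite_copy
```

For example, a local disk copy, a local tape copy, and a cloud copy satisfy the rule, whereas three disk copies do not, even if one of them is offsite.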

An unstructured repository may simply be a stack of tapes, DVD-Rs or external HDDs with minimal information about what was backed up and when. This method is the easiest to implement, but unlikely to achieve a high level of recoverability as it lacks automation.

A repository using the full backup (system imaging) method contains complete source data copies taken at one or more specific points in time. However, imaging [9] is generally more useful as a way of deploying a standard configuration to many systems than as a tool for making ongoing backups of diverse systems.

An incremental backup stores data changed since a reference point in time. Subsequently, a number of incremental backups are made after successive time periods. Restores begin with the last full backup and then apply the incrementals. Continuous Data Protection (CDP) refers to a backup that instantly saves a copy of every change made to the data.

This allows restoration of data to any point in time and is the most comprehensive and advanced form of data protection. Near-CDP backup applications, by contrast, take incremental backups at set intervals and can therefore only allow restores to an interval boundary. Near-CDP (except for Apple Time Machine) [15] intent-logs every change on the host system, [16] often by saving byte or block-level differences rather than file-level differences.

Intent-logging allows precautions for the consistency of live data, protecting self-consistent files but requiring applications "be quiesced and made ready for backup." Near-CDP is more practicable for ordinary personal backup applications, as opposed to true CDP, which must be run in conjunction with a virtual machine [17] [18] or equivalent [19] and is therefore generally used in enterprise client-server backups.

A reverse incremental backup method stores a recent archive file "mirror" of the source data and a series of differences between the "mirror" in its current state and its previous states. A reverse incremental backup method starts with a non-image full backup. After the full backup is performed, the system periodically synchronizes the full backup with the live copy, while storing the data necessary to reconstruct older versions. This can be done either using hard links, as Apple Time Machine does, or using binary diffs.
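The hard-link variant can be sketched as follows: every snapshot directory looks like a full copy, but files that are unchanged since the previous snapshot are hard-linked to it and so consume no extra space. This is a simplified sketch, not a description of how any specific product is implemented; the function name `snapshot` is hypothetical, and symbolic links, deletions of empty directories, and file metadata changes are ignored.

```python
import filecmp
import os
import shutil


def snapshot(source, previous, target):
    """Create `target` as a full-looking copy of the `source` tree.

    Files whose content matches the copy in `previous` (a prior snapshot
    directory, or None for the first run) are hard-linked, not copied.
    """
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        os.makedirs(os.path.normpath(os.path.join(target, rel)), exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.normpath(os.path.join(target, rel, name))
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share storage with the older snapshot
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy
```

Because linked files share storage, snapshots must be treated as read-only: modifying a linked file in place would alter every snapshot that references it. Both snapshot directories must also reside on the same filesystem for hard links to work.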

A differential backup saves only the data that has changed since the last full backup. This means a maximum of two backups from the repository are used to restore the data.

However, as the time since the last full backup increases, and with it the accumulated changes in data, so does the time to perform the differential backup. Restoring an entire system requires starting from the most recent full backup and then applying just the last differential backup. A differential backup copies files that have been created or changed since the last full backup, regardless of whether any other differential backups have been made since, whereas an incremental backup copies files that have been created or changed since the most recent backup of any type (full or incremental).
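The difference between the two methods comes down to the reference point used for selecting files and to the number of backups needed for a restore. The sketch below makes both explicit. It assumes integer timestamps, file modification times as the change signal, and a history that uses a single method after each full backup; the names `Backup`, `reference_time`, `files_to_copy`, and `restore_chain` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    taken_at: int  # time the backup ran (arbitrary integer ticks)
    kind: str      # "full", "incremental", or "differential"


def reference_time(history, kind):
    """Time against which changes are measured for the next backup of `kind`."""
    if kind == "differential":
        relevant = [b for b in history if b.kind == "full"]
    elif kind == "incremental":
        relevant = [b for b in history if b.kind in ("full", "incremental")]
    else:
        raise ValueError(f"unsupported kind: {kind}")
    if not relevant:
        raise ValueError("a full backup must exist first")
    return max(b.taken_at for b in relevant)


def files_to_copy(mtimes, history, kind):
    """Paths modified after the reference time (mtimes maps path -> mtime)."""
    ref = reference_time(history, kind)
    return sorted(path for path, mtime in mtimes.items() if mtime > ref)


def restore_chain(history):
    """Backups to apply, oldest first, to restore the most recent state."""
    ordered = sorted(history, key=lambda b: b.taken_at)
    full_positions = [i for i, b in enumerate(ordered) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup to restore from")
    tail = ordered[full_positions[-1]:]
    differentials = [b for b in tail if b.kind == "differential"]
    if differentials:
        # Only the newest differential is needed on top of the full backup.
        return [tail[0], differentials[-1]]
    return tail  # the full backup followed by every later incremental
```

A differential restore therefore never touches more than two backups, while an incremental restore grows by one backup for every interval since the last full backup.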

Other variations of incremental backup include multi-level incrementals and block-level incrementals that compare parts of files instead of just entire files. Regardless of the repository model that is used, the data has to be copied onto an archive file data storage medium.
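The block-level incrementals mentioned above can be illustrated with a short sketch that hashes fixed-size blocks of two versions of a file and stores only the blocks that differ. The block size, the use of SHA-256, and the function names `block_hashes`, `changed_blocks`, and `apply_delta` are assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def block_hashes(data, block_size=BLOCK_SIZE):
    """SHA-256 digest of each fixed-size block of `data`."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return {block_index: bytes} for blocks of `new` that differ from `old`."""
    old_hashes = block_hashes(old, block_size)
    delta = {}
    for index, digest in enumerate(block_hashes(new, block_size)):
        if index >= len(old_hashes) or old_hashes[index] != digest:
            start = index * block_size
            delta[index] = new[start:start + block_size]
    return delta


def apply_delta(old, delta, new_length, block_size=BLOCK_SIZE):
    """Rebuild the new version from the old version plus the stored delta."""
    # Truncate or zero-pad the old data to the new length, then patch blocks.
    result = bytearray(old[:new_length].ljust(new_length, b"\0"))
    for index, block in delta.items():
        start = index * block_size
        result[start:start + len(block)] = block
    return bytes(result)
```

Only the changed blocks and the new file length need to be stored for each incremental, which is why this approach suits large files that change in small regions, such as disk images or database files.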

The medium used is also referred to as the type of backup destination. Magnetic tape was for a long time the most commonly used medium for bulk data storage, backup, archiving, and interchange. It was previously a less expensive option, but this is no longer the case for smaller amounts of data. While tape media itself has a low cost per unit of capacity, tape drives are typically dozens of times as expensive as hard disk drives and optical drives.

Many tape formats have been proprietary or specific to certain markets like mainframes or a particular brand of personal computer. LTO later became the primary tape technology, and the Oracle StorageTek T was discontinued.

The use of hard disk storage has increased over time as it has become progressively cheaper. Hard disks are usually easy to use, widely available, and can be accessed quickly.

Some disk-based backup systems, via Virtual Tape Libraries or otherwise, support data deduplication, which can reduce the amount of disk storage capacity consumed by daily and weekly backup data. Optical storage uses lasers to store and retrieve data.

In the past, the capacities and speeds of these discs have been lower than those of hard disks or tapes, although advances in optical media are slowly shrinking that gap. Potential future data losses caused by gradual media degradation can be predicted by measuring the rate of correctable minor data errors; too many consecutive errors of this kind increase the risk of uncorrectable sectors. Support for error scanning varies among optical drive vendors. Many optical disc formats are of the WORM type, which makes them useful for archival purposes since the data cannot be changed.

Moreover, optical discs are not vulnerable to head crashes, magnetism, imminent water ingress or power surges, and a fault of the drive typically just halts the spinning. However, recordable media may degrade earlier under long-term exposure to light. Some optical storage systems allow for cataloged data backups without human contact with the discs, allowing for longer data integrity.

A French study indicated that the lifespan of typically sold CD-Rs was 2 to 10 years, [37] but one manufacturer later estimated a considerably longer lifespan for its CD-Rs with a gold-sputtered layer.

Solid-state drives (SSDs) use integrated circuit assemblies to store data.

Flash memory, thumb drives, USB flash drives, CompactFlash, SmartMedia, Memory Sticks, and Secure Digital card devices are relatively expensive for their low capacity, but convenient for backing up relatively low data volumes.

Available SSDs have become more capacious and cheaper. Remote backup services or cloud backups involve service providers storing data offsite. This has been used to protect against events such as fires, floods, or earthquakes which could destroy locally stored backups.

Because speed and availability are limited by a user's online connection, [23] users with large amounts of data may need to use cloud seeding and large-scale recovery. Various methods can be used to manage backup media, striking a balance between accessibility, security and cost. These media management methods are not mutually exclusive and are frequently combined to meet the user's needs.

Using on-line disks for staging data before it is sent to a near-line tape library is a common example. Online backup storage is typically the most accessible type of data storage, and can begin a restore in milliseconds. An internal hard disk or a disk array (possibly connected to a SAN) is an example of online backup storage.

This type of storage is convenient and speedy, but is vulnerable to being deleted or overwritten, either by accident, by malevolent action, or in the wake of a data-deleting virus payload. Nearline storage is typically less accessible and less expensive than online storage, but still useful for backup data storage. A mechanical device is usually used to move media units from storage into a drive where the data can be read or written.

Generally it has safety properties similar to on-line storage. An example is a tape library with restore times ranging from seconds to a few minutes. Off-line storage requires some direct action to provide access to the storage media: for example, inserting a tape into a tape drive or plugging in a cable.

Because the data is not accessible via any computer except during the limited periods in which it is written or read back, it is largely immune to on-line backup failure modes. Access time varies depending on whether the media are on-site or off-site.

Backup media may be sent to an off-site vault to protect against a disaster or other site-specific problem. The vault can be as simple as a system administrator's home office or as sophisticated as a disaster-hardened, temperature-controlled, high-security bunker with facilities for backup media storage. A data replica can be off-site but also on-line. Such a replica has fairly limited value as a backup.

A backup site or disaster recovery center is used to store data that can enable computer systems and networks to be restored and properly configured in the event of a disaster.

Some organisations have their own data recovery centres, while others contract this out to a third party. Due to high costs, backing up is rarely considered the preferred method of moving data to a DR site. A more typical way would be remote disk mirroring, which keeps the DR data as up to date as possible. A backup operation starts with selecting and extracting coherent units of data.

Most data on modern computer systems is stored in discrete units, known as files. These files are organized into filesystems. Deciding what to back up at any given time involves tradeoffs.

If too much redundant data is backed up, the information repository will fill up too quickly. Backing up an insufficient amount of data can eventually lead to the loss of critical information. Files that are actively being updated present a challenge to back up.

One way to back up live data is to temporarily quiesce it, take a snapshot, and then resume live operations. At this point the snapshot can be backed up through normal methods. Snapshotting a file while it is being changed results in a corrupted file that is unusable. This is also the case across interrelated files, as may be found in a conventional database or in applications such as Microsoft Exchange Server. Backup options also exist for data files that cannot be or are not quiesced. [50]

For Ultrium tape drives on HP Unix servers running the HP-UX operating system, the aim is to maximise tape performance, thereby reducing backup and recovery time.

