Backup & Recovery: Inexpensive Backup Solutions for Open Systems

8.6. Simultaneous Backup of Many Clients to One Drive

There are now backup drives that can write much faster than many disk drives. Backup drives now use Fibre Channel and large buffers, and sometimes write multiple streams of data to the drive at once. The result is that many backup drives can write at 100 MB/s or faster. As mentioned in Chapter 9, the difficulty with most of these drives, though, is that they are streaming tape drives. This means that you must supply them with a steady stream of incoming data equivalent to their maximum throughput. If you don't, the drive does not stream, resulting in a significantly slower transfer rate than it is capable of. This problem, of course, affects only streaming tape drives. It does not affect backup drives that simulate a disk drive, such as magneto-optical, CD, DVD, or virtual tape drives.

All streaming tape drives are designed to write most effectively at their optimum speed. If they are supplied with data at a slower rate than that, the result may be surprising. What begins to happen is referred to as repositioning. See the section "Tape Drives" in Chapter 9 for more information on this concept. The drive spends most of its time repositioning and has less time to actually write data. The result is that it will write at a small fraction of its maximum data rate.

When would this happen? Consider a single-threaded backup process such as dump or tar. It encounters many potential bottlenecks before getting to the drive. The first obstacle is the disk itself, which may not be able to supply data at a sufficient rate. Another is the SCSI bus and system to which the drive is connected. It may be busy satisfying other I/O requests, hence impacting the data rate that can be transferred across the bus. The next, of course, is the network. If it's a 100 MB network, the best possible data rate is less than 10 MB/s.[§] If it's a gigabit network, any individual server can typically handle only about 6075 MB/s. (Hopefully this limit will go away soon.) The final possible bottleneck is the system to which the backup drive is connected. Any one of these bottlenecks could slow the data rate to less than the optimal speed for the backup drive. If that happens, a tape drive stops streaming and begins repositioning in order to keep pace with the incoming data. When this happens, it becomes the new bottleneck.

[§] Dividing megabits by 8 gives us megabytes, so 100 Mb/s translates into about 12 MB/s, but is usually closer to 10 MB/s in reality. (And the math is easier this way.)

What can be done about this? Most commercial backup products solve this problem by implementing multiplexing. Multiplexed backups read data from several systems and several disks at one time and supply the data from all of these threads to the backup drive simultaneously. The data from these different sources is then interleaved onto the volume. The backup drive is therefore always being supplied with a sufficient stream of data so that it can write at its maximum speed. A cautious administrator learning of this feature for the first time might be worried. Some administrators don't like the fact that their data is spread out all over the volume. However, this feature is really the only way to stream a backup drive that can read or write at speeds faster than a disk. This is why most commercial backup products implement multiplexing.

Different backup software vendors use different terms for this feature. It is sometimes referred to as parallelism or interleaving.

The downside to multiplexing is that it slows down the speed of restores. Therefore, multiplexing was considered by many to be a necessary evil. The feature that has replaced it is disk-to-disk-to-tape backup, where backups are first sent to disk, then sent to tape.

Категории