By Dan Kusnetzky
Storage and retrieval of information is at the heart of almost every application. This makes storage software an important topic for system managers and developers. Unfortunately, storage software is a very broad topic, which creates an ideal opportunity for confusion to set in. Thanks to industry watchers, such as IDC, we have elaborate definitions of each type of storage software in use today. According to IDC, storage software breaks down into the following categories: Data protection and recovery software, Storage replication software, Archive and hierarchical storage management (HSM) software, Storage management software, Storage device management software, Storage infrastructure software, and File system software. Whew! I get exhausted just reading through that list. Let’s take a stroll through each of these types of software to see how and where they might be used when Linux is the target server environment.
The first type of storage software we’ll examine is “Data protection and recovery software,” which is also known by the acronym DPRS. What? You’re not familiar with this category of software? Sure you are. You’ve just heard it described as backup software. DPRS makes copies of important files, directories and other storage entities, allowing the environment to be recovered in case of an “oopsie”: an accidental deletion of something important, a storage subsystem failure or some other catastrophic event, such as senior management walking into the data center unannounced. As with all types of software, system administrators must consider what types of data protection are needed in order to select the best option for their own environment. Here are a few products that would fall into this category: Acronis True Image 9.1 Server for Linux, Arkeia Network Backup, BakBone NetVault: Backup, HP's Data Protector Express Backup and Recovery, IBM Storage Manager for Linux, Novastor NovaNET, Veritas Backup Software for Linux, and, of course, many utility programs, such as BAR, BRU, cdbackup, CDTAR, FileBackup and the like. If you’re dealing with a relatively straightforward configuration, it may be possible to use open source backup and recovery tools rather than purchasing a more elaborate solution from a software vendor.
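For a sense of what the simplest form of this job looks like, here is a minimal sketch of a copy-and-restore cycle using only Python's standard library. It is not how any of the products named above work; the function names `back_up` and `restore` are hypothetical, and a real tool would add scheduling, incremental copies, verification and media handling.

```python
import tarfile
from pathlib import Path


def back_up(source_dir: Path, archive_path: Path) -> None:
    """Write a gzip-compressed tar archive containing source_dir."""
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname stores paths relative to the directory's own name,
        # so the archive does not embed the absolute path of the source.
        archive.add(source_dir, arcname=source_dir.name)


def restore(archive_path: Path, target_dir: Path) -> None:
    """Unpack a previously written archive under target_dir."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        # Only unpack archives you created yourself: extracting an
        # untrusted archive can write outside target_dir.
        archive.extractall(target_dir)
```

Calling `back_up(Path("/srv/data"), Path("/backup/data.tar.gz"))` and later `restore(Path("/backup/data.tar.gz"), Path("/srv/recovered"))` would recreate the tree under `/srv/recovered/data`; both paths are placeholders.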
Storage replication software (SRS) is another form of backup software. I would bet that you would be familiar with this category if I called it “storage mirroring software.” This software is a different animal than database replication software, by the way. Database replication software has specialized knowledge of the internal structure of database files, allowing it to replicate tables or do translations from one database system to another during the replication process. SRS, on the other hand, just has specialized knowledge of the on-disk structure of various types of filesystems. As with DPRS, this is software that makes copies of important files, directories and other storage entities, allowing the environment to be recovered in case of an “oopsie.” The primary differences between DPRS and SRS are that SRS typically copies information across the network to a remote system or storage server, and that it may do so in real time. The benefit here is that a more traditional backup process may not be needed, or may be needed less often, when SRS is on the case. When used as a form of real-time protection, SRS may be part of a high availability/failover configuration. This type of software may also be utilized to support a distributed processing solution. SRS can be used to move key data files from a remote office, regional data center or cell in a manufacturing line to a central processing center. As with DPRS, system administrators must consider what types of data protection are needed in order to select the best SRS option for their own environment. SRS may be overkill for a small network or a single-system solution.
Here are a few products that would fall into this category: BakBone NetVault: Replicator, EMC SRDF, IBM Tivoli Storage Manager, Ipsilon SyncIQ, Lake Technologies' MIMIX Echostream for Linux, and Symantec Corp’s VERITAS Storage Foundation. Of course, some open source solutions, such as Plasmid Replication Engine, Scot Cate File Replication System, and Web Synchronizer, are also available on the network. Which is right for you? Well, if you don’t have an EMC storage system, EMC SRDF may be of less interest than other potential solutions.
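The mirroring idea can be illustrated with a small one-way, file-level sketch in Python. This is a simplification under stated assumptions: the hypothetical `mirror` function copies between two directories that are both visible to the same machine (for example, a replica on a network mount), it runs on demand rather than in real time, and it never deletes anything from the replica. The products listed above work quite differently and at much larger scale.

```python
import filecmp
import shutil
from pathlib import Path


def mirror(source: Path, replica: Path) -> list[str]:
    """Copy new or changed files from source to replica.

    Returns the relative paths that were copied, so a second run with
    no changes in between returns an empty list.
    """
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        relative = src_file.relative_to(source)
        dst_file = replica / relative
        # Skip files whose contents already match byte for byte.
        if dst_file.exists() and filecmp.cmp(src_file, dst_file, shallow=False):
            continue
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also preserves timestamps
        copied.append(relative.as_posix())
    return copied
```

Comparing full contents keeps the sketch easy to reason about; it would be far too slow for large data sets, where replication tools track changes as they happen instead.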
Archive and hierarchical storage management (HSM) software is a policy-based variation of DPRS. The key difference here is that HSM tries to match how and when files are used to the storage devices available. Frequently used files will tend to be stored on local, high speed storage. Files that are seldom used will migrate towards low speed, low cost media, such as tape. Users do not need to know where files are located. The applications they are using can request access to a file and the HSM software will find it and bring it online for use. Note to administrators: HSM may be overkill on a single, standalone system. If you have a more complex environment, this software might make it possible to make greater use of lower cost storage devices and still get the job done. Once the guidelines have been set up telling the software where certain types of files should be placed, the HSM software manages everything automatically. Some products in this category are CA BrightStor HSM, IBM Tivoli Storage Manager for Space Management, and Symantec Corp.'s VERITAS NetBackup Storage Migrator. As before, some open source solutions are available, including Hierarchical storage file system, Enterprise Volume Management System (EVMS), and fILM information lifecycle manager.
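The guidelines an administrator sets up can be pictured as a small table of rules. The Python sketch below is one hypothetical way to express "frequently used files stay on fast storage, seldom used files drift toward tape." The tier names and the 30-day and 180-day thresholds are assumptions made up for illustration and are not taken from any product named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    max_idle_days: float  # files idle longer than this move to a slower tier


# Hypothetical policy, ordered from fastest and costliest to slowest and cheapest.
TIERS = (
    Tier("local-disk", 30),
    Tier("nearline-disk", 180),
    Tier("tape", float("inf")),
)


def choose_tier(idle_days: float, tiers: tuple = TIERS) -> str:
    """Return the name of the first tier whose idle limit covers the file."""
    for tier in tiers:
        if idle_days <= tier.max_idle_days:
            return tier.name
    return tiers[-1].name  # fall back to the cheapest tier
```

With this policy, a file last touched 3 days ago stays on `local-disk`, one idle for 90 days belongs on `nearline-disk`, and one idle for 1,000 days is a candidate for `tape`. The idle time could be derived from a file's last-access timestamp where the filesystem records it; actually moving the data and recalling it transparently is the hard part that HSM products take care of.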
Storage management software (SMS) is one of the types of software that seems to cause confusion. SMS manages and orchestrates the use of other storage software rather than actually being responsible for backing up or moving files around. Sophisticated SMS frameworks are typically implemented in complex computing environments, not in a single-system environment. Here are some products that fit into this category: EMC ControlCenter, IBM Tivoli Storage Manager, Symantec Corp’s VERITAS Storage Exec, and some open source solutions such as Amanda, Aperi Project, or CleverSafe. While more straightforward environments may not need this type of software at all, if it is needed, one of these tools can certainly reduce the workload of a system administrator.
Storage device management software (SDMS) is software that manages specific storage devices rather than software that actually does the work of backing up files, moving files around or determining the most efficient place to store files. As with SMS, some system administrators are confused by the name of this category of software and think that it actually delivers those functions rather than managing them. Typically, this type of software is provided with the storage device. It can be used by itself when low-level device management is required, or be orchestrated by SMS in a larger, more complex environment.
Storage infrastructure software (SIS) is software that provides the services needed for a reliable storage framework. As with SDMS, this software is often provided as part of a larger solution. The internal communication framework that allows storage management software to communicate with and control DPRS, SRS or HSM is called storage infrastructure software. For the most part, this software is not available separately. Some vendors, however, sell their storage solutions as separate components, allowing system administrators to tailor the solution to their organization’s needs. Some vendors may say that this is their intention when what they’re really trying to do is sell each component separately to increase their overall revenues.
File system software, more traditionally called filesystem software, provides a method for storing and organizing computer files and applications. This software keeps track of the actual physical location of each component of a computer file or application and makes it available to the operating system upon request. Although some distributed file systems have been made available as commercial products, most Linux users rely on the filesystem choices made by their Linux distribution. AFS and NFS are examples of distributed file system software.
As with most areas of computing, storage software offers a multifaceted web of products and tools, each designed to support some aspect of system operations. Some of these functions are only needed by the largest and most complex shops. Others make system operations more reliable. System administrators must carefully review the needs of the organization before making a choice. Many times, the proper choice is to pass rather than purchase an expensive solution to what is a minor issue. Other times, purchasing one of these products can make the life of an administrator quite a bit easier.
Dan Kusnetzky is a partner in the Kusnetzky Group, an Acuity Group, LLC, company. In a former life, he was Vice President of System Software Research for IDC and the Executive Vice President of Corporate and Marketing Strategy for Open-Xchange, Inc.