Advanced Maryland Automatic Network Disk Archiver
Traditionally, backups have been scheduled as a Full dump of everything on the weekend, then incremental backups throughout the week. There are several disadvantages to this strategy:
- The backup time on the weekend is much longer than during the week.
- The tape usage during the week is very small, but because the backups on the weekend are very large you must have very large tapes. This means wasted tape during the week.
- To recover an entire directory from Friday requires Sunday's, Monday's, Tuesday's, Wednesday's and Thursday's tapes.
Amanda changes that paradigm. The primary goal for Amanda is to balance tape usage each night. To do this, it varies the days on which certain backup "entries" (more on this later) are dumped in full, so that each night the amount of data that is backed up is more or less equal. This means:
- The backup time is approximately the same every night.
- The tape usage is uniform, so a smaller drive can be used and no tape has to be wasted.
- Recovering an entire directory from Friday may require only one tape at best, and a dumpcycle's worth of tapes at worst (more on this later as well).
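The arithmetic behind the two strategies can be sketched with a few lines of shell. The figures here are purely hypothetical (seven backup entries, 10 GB per full dump, 1 GB per incremental), and the model assumes exactly one entry is dumped in full each night under the balanced scheme; real sizes and schedules will differ.

```shell
# Hypothetical sizes, chosen only to illustrate the comparison.
dles=7        # number of backup entries
full_gb=10    # size of one full dump, in GB
incr_gb=1     # size of one incremental dump, in GB

# Traditional: every entry in full on the weekend, incrementals on the other six nights.
traditional_weekend=$((dles * full_gb))
traditional_weeknight=$((dles * incr_gb))
traditional_week=$((traditional_weekend + 6 * traditional_weeknight))

# Balanced: one full dump per night, incrementals for the remaining entries.
balanced_nightly=$((full_gb + (dles - 1) * incr_gb))
balanced_week=$((7 * balanced_nightly))

echo "traditional: weekend ${traditional_weekend} GB, weeknight ${traditional_weeknight} GB, week ${traditional_week} GB"
echo "balanced:    every night ${balanced_nightly} GB, week ${balanced_week} GB"
```

Under these assumptions both strategies write the same total per week, but the largest single night drops from 70 GB to 16 GB, which is what allows a smaller tape to be used.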
Throughout this document, you will see references to "tape" quite frequently. Know that this does not necessarily mean magnetic tape. Amanda supports backing up to hard disk (via the file driver) or even a RAIT feature, which could be multiple tapes, multiple hard disks, or a combination thereof.
When talking about any subject, it's good to have a common vocabulary. Below are some important terms that should be understood.
- Level 0
A full dump, or backup, of a set of files. This means that every file is saved for later retrieval.
- Level 1
The first incremental backup of a set of files. This means that only the files that have changed since the last Level 0 have been stored for later retrieval.
- Level n
The incremental backup of files since the last Level n-1 backup. Note: Levels can sometimes be difficult to understand, so there is an Example available.
- Disklist Entry (DLE)
Amanda stores the items to be backed up in the disklist file, so each item becomes a Disklist Entry.
- dumpcycle
The maximum length of time between Level 0 backups of a DLE (usually 1 week).
- runspercycle
The number of times during a dumpcycle that Amanda will be run (usually 7 for every night of the week or 5 for every weeknight, assuming a dumpcycle of 1 week).
- tapecycle
The number of tapes Amanda must use before "recycling" tapes. Ideally this should be at minimum 2*runspercycle+1. That way you have two complete sets of backups, plus an extra tape if something goes wrong.
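To make the level definitions above more concrete, here is a small sketch that works out which dumps are needed to restore the state as of the most recent run. The function name restore_chain is hypothetical and not part of Amanda. It assumes the usual dump convention that a dump at a given level captures changes since the most recent dump at any lower level, and that the list of levels starts with a Level 0.

```shell
# restore_chain LEVEL...
# Given the dump level of each run, oldest first, print the 1-based
# positions of the runs needed to restore the state as of the last run.
restore_chain() {
    chain=""   # space-separated "position:level" pairs still needed
    pos=0
    for level in "$@"; do
        pos=$((pos + 1))
        kept=""
        for entry in $chain; do
            # A new dump supersedes earlier dumps at the same or a higher level,
            # so keep only earlier dumps taken at a strictly lower level.
            if [ "${entry#*:}" -lt "$level" ]; then
                kept="$kept $entry"
            fi
        done
        chain="$kept $pos:$level"
    done
    out=""
    for entry in $chain; do
        out="$out ${entry%%:*}"
    done
    echo "${out# }"
}

restore_chain 0 1 2 3 4   # prints: 1 2 3 4 5
restore_chain 0 1 1 1 1   # prints: 1 5
restore_chain 0 1 2 1 2   # prints: 1 4 5
```

The first call corresponds to a week where the level rises every night, so every tape is needed. The second repeats Level 1 each night, so only the full dump and the latest incremental are needed.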
Amanda is a scheduler
Often the question is asked: "Will Amanda support my ACME Tape Drive 2000 Ultra Professional?" The answer is: "Does your operating system support it?"
The reason for this is that Amanda doesn't actually talk directly to the tape drive. It depends on Operating System utilities such as dd, tar and the many flavors of dump to do that. Amanda merely calculates the appropriate data to be backed up based on the parameters set and schedules it appropriately. How are things scheduled? The definitions above should provide a hint, but we'll explain it a little more in this section.
Amanda schedules backups mostly around the three variables listed above: dumpcycle, runspercycle, and tapecycle. Amanda ensures (to the best of its ability) that each individual DLE receives at least one Level 0 backup per dumpcycle. It is "at least one" because Amanda may "promote" a DLE to receive a Level 0 ahead of its regularly scheduled time to help balance the size of the data (remember, the primary goal of Amanda is to balance tape usage). The runspercycle variable lets Amanda know how many chances it has to fit everything in during a dumpcycle. tapecycle is a safeguard to make sure you don't overwrite data that you shouldn't overwrite yet.
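As a rough illustration of how the three variables relate, the sketch below derives the suggested minimum tapecycle (2*runspercycle+1) and prints the settings in the form they would typically take in amanda.conf. The function names are hypothetical helpers, and the values passed in (a 7-day dumpcycle with 5 runs) are only an example.

```shell
# min_tapecycle RUNSPERCYCLE
# Suggested minimum number of tapes: two full sets of runs plus one spare.
min_tapecycle() {
    echo $((2 * $1 + 1))
}

# print_cycle_settings DUMPCYCLE_DAYS RUNSPERCYCLE
# Print the three scheduling settings as amanda.conf-style lines.
print_cycle_settings() {
    dumpcycle_days=$1
    runspercycle=$2
    printf 'dumpcycle %s days\n' "$dumpcycle_days"
    printf 'runspercycle %s\n' "$runspercycle"
    printf 'tapecycle %s tapes\n' "$(min_tapecycle "$runspercycle")"
}

# Example: weekly dumpcycle, one run per weeknight.
print_cycle_settings 7 5
```

With 5 runs per cycle this suggests 11 tapes; with 7 runs per cycle it would suggest 15.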
Amanda on CentOS
A long time ago, a decision was made to not require any configuration to be done on Amanda clients. This is good if you back up several hundred machines; however, it has the drawback that certain options must be compiled into the client binaries. Among these are the address/hostname of your tape and index servers. This presents an interesting problem for those who package binaries for Amanda (e.g. just about every Linux distribution): they can't know these settings at compile time, so some assumptions must be made. The most troublesome assumption is using localhost as the server name. (See the Top Ten List for details on why this is a problem.) Because of this, it is recommended to rebuild the Amanda RPMs for your specific environment.
Recent releases of Amanda (2.5.1 and above) have broken with tradition and now can use a configuration file on the client. See http://wiki.zmanda.com/index.php/Amanda-client.conf for details on this file.
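As a hedged sketch, a client configuration file could be generated along these lines. The helper name client_conf is hypothetical, and the parameter names conf, index_server and tape_server are assumptions based on the amanda-client.conf format; check them against the page linked above for your Amanda version. The host name fqdn.com and the configuration name Dailies are placeholders.

```shell
# client_conf CONFIG_NAME INDEX_SERVER TAPE_SERVER
# Print a minimal amanda-client.conf-style fragment to standard output.
# The parameter names below are assumptions; verify them for your version.
client_conf() {
    printf 'conf "%s"\n' "$1"
    printf 'index_server "%s"\n' "$2"
    printf 'tape_server "%s"\n' "$3"
}

# Example with placeholder values; redirect the output to a file to keep it.
client_conf Dailies fqdn.com fqdn.com
```

Printing to standard output rather than writing the file directly makes it easy to review the result before installing it on a client.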
For CentOS 3 and 4, changing the defaults requires editing the .spec file. In CentOS 5 and later, changes to the .spec file have been made upstream to ease this problem. Specifically, the variables %defconfig, %tapeserver and %indexserver can now be set through a define when rebuilding the source package (e.g. rpmbuild --rebuild --define "defconfig Dailies" --define "tapeserver fqdn.com" amanda-2.5.0p2-4.src.rpm).
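The following sketch assembles a rebuild command that sets all three variables, including %indexserver. It only prints the command so that it can be reviewed before being run. The helper name rebuild_command is hypothetical, the values are placeholders, and the file name assumes the source package is called amanda-2.5.0p2-4.src.rpm, since rpmbuild --rebuild operates on source packages.

```shell
# rebuild_command DEFCONFIG TAPESERVER INDEXSERVER SRPM
# Print (but do not run) an rpmbuild command that overrides the three defaults.
rebuild_command() {
    printf 'rpmbuild --rebuild --define "defconfig %s" --define "tapeserver %s" --define "indexserver %s" %s\n' \
        "$1" "$2" "$3" "$4"
}

# Example with placeholder values.
rebuild_command Dailies fqdn.com fqdn.com amanda-2.5.0p2-4.src.rpm
```

Once the printed command looks right, it can be copied into a shell on the build machine.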
For those of you who don't wish to rebuild a 2.4.5 or later version of Amanda on your CentOS 3 or 4 system, here's how you can rebuild the existing packages for your environment (using CentOS 4 as an example):
- Download the amanda-2.4.4p3-1.src.rpm file.
- Install the source rpm: rpm -i amanda-2.4.4p3-1.src.rpm. This will extract the contents into your rpm directory (see YumAndRPM for help on setting up your build system).
- Edit the SPECS/amanda.spec file to reflect the appropriate servers, and any other changes you might want to make. It is also suggested that you change the Release: tag to indicate you've made changes (using initials or something similar).
- Rebuild the rpms: rpmbuild -ba SPECS/amanda.spec
- Enjoy your freshly customized RPMs
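The steps above can be sketched as a few shell functions. The function names and the suffix are hypothetical, and the sketch assumes the commands are run from the top of your rpm build directory. Setting the server names in the spec file is left as a manual edit, because the exact lines to change depend on the spec file you have.

```shell
# install_srpm SRPM
# Unpack the source RPM into the rpm build tree.
install_srpm() {
    rpm -i "$1"
}

# tag_release SPEC SUFFIX
# Append ".SUFFIX" to the Release: tag so the package is recognisably local.
tag_release() {
    spec=$1
    suffix=$2
    sed "s/^\(Release:[[:space:]]*[^[:space:]]*\)/\1.${suffix}/" "$spec" > "${spec}.new" &&
        mv "${spec}.new" "$spec"
}

# build_rpms SPEC
# Build both binary and source packages from the edited spec file.
build_rpms() {
    rpmbuild -ba "$1"
}
```

A typical sequence would be install_srpm amanda-2.4.4p3-1.src.rpm, then editing SPECS/amanda.spec by hand to set the servers, then tag_release SPECS/amanda.spec abc (with your own initials in place of abc), and finally build_rpms SPECS/amanda.spec.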
The current stable release is available at http://www.zmanda.com/download-amanda.php.