backing up a Drupal site.

Feb 22, 2010 | Linux, Technology | 3 comments

I host a number of Drupal sites, as well as WordPress and custom-made ones.

When you host a site, one of the first questions you’re asked is: do you have the ability to back up and restore my site if something breaks?

For obvious reasons, that’s an important question. But it’s a balancing act: you need to back up regularly, but you don’t want to overdo it and use up all your bandwidth copying those backups off the server.

So, for backups, you need to separate them into four parts.

  • Nightly full server backups.
    If the server goes down, I want to be able to bring it back within 5 minutes.
  • Monthly full site backups.
    These are compressed archives that contain everything from the site, including content and databases (a sketch of the idea follows this list).
  • Weekly differential site backups.
    These are stored on a server that mirrors the configuration of the primary one. That server is used for testing new server configs before they go live on the production server.
  • Daily site backups.
    This is a backup of important site files that can become damaged by errors during an upgrade or configuration change. It does not contain a database backup, but it is very useful for very quick restores.
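
A monthly full-site backup of that kind boils down to a database dump plus an archive of the whole docroot. Here is a minimal sketch; the account name, database name, and credentials are placeholders rather than my actual setup:

#!/bin/bash
# Monthly full-site backup sketch: dump the database, then archive the
# entire docroot together with that dump. All names here are hypothetical.
thisdate=$(date +%Y%m%d)
mysqldump --user=DBUser --password=DBPass DBName > /home/UserName/backups/DBName-$thisdate.sql
tar -zcf /home/UserName/backups/UserName-full-$thisdate.tar.gz /home/UserName/public_html /home/UserName/backups/DBName-$thisdate.sql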

With that in mind, I have created the final part of this puzzle: a daily backup script that archives the important directories in a Drupal installation so they’re ready to be copied by the remote server. I have these scripts saved in the home folder of a very restricted account that is used solely for this task, and a symbolic link in /etc/cron.daily points back to each of them.
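
For reference, the cron hookup is just a link. The setup looks something like this; the script name and paths are hypothetical, not my exact layout:

# Make the script executable, then link it into cron.daily.
chmod 755 /home/UserName/scripts/drupal-daily-backup.sh
# run-parts, which processes /etc/cron.daily on Debian-style systems, skips
# file names that contain dots, so the link name drops the .sh extension.
ln -s /home/UserName/scripts/drupal-daily-backup.sh /etc/cron.daily/drupal-daily-backup

Here is the daily backup script itself: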

#!/bin/bash
# Daily Drupal backup: archive the site's code and configuration, skipping
# the large media files, so the remote server can copy the result down.
thisdate=$(date +%Y%m%d)
backupstatus=false
tar -zcvf /home/UserName/backups/UserName.tar.gz \
    /home/UserName/public_html/sites/all /home/UserName/public_html/sites/default/settings.php \
    /home/UserName/public_html/sites/default/files/playlists \
    /home/UserName/public_html/sites/default/files/js /home/UserName/public_html/sites/default/files/css \
    /home/UserName/public_html/cron.php /home/UserName/public_html/includes /home/UserName/public_html/index.php \
    /home/UserName/public_html/install.php /home/UserName/public_html/misc /home/UserName/public_html/modules \
    /home/UserName/public_html/profiles /home/UserName/public_html/scripts /home/UserName/public_html/themes \
    /home/UserName/public_html/update.php /home/UserName/public_html/xmlrpc.php && backupstatus=true
# Log the result; failure lines start with "Error" rather than a date.
if [ "$backupstatus" = false ]; then
    echo "Error $thisdate Backup failed." >> /home/UserName/backups/UserName.log
else
    echo "$thisdate Backup completed without errors." >> /home/UserName/backups/UserName.log
fi
# Clean up and give the restricted copy account ownership of the archive.
backupstatus=
thisdate=
chown RestrictedAccount /home/UserName/backups/UserName.tar.gz

So, what am I doing there?

  • First, I declare a variable to hold the date.
  • Second, I declare a variable that holds the value false. If the archive command doesn’t work, this will never be set to true.
  • Next, I archive very specific folders. Notice I’m not archiving /home/UserName/public_html/sites/default/files, because that contains audio, pictures and videos, and I really don’t want or need to include them in every day’s backup file because it would be far too large.
  • Notice that there’s a change to the backupstatus variable at the end of the archive command. Because it is joined on with &&, it will not run unless the archive command is successful.
  • Next, I use an if statement. If the backup status is false, I write an error line to the log file. Notice that I put Error at the start of the line. This just makes things a bit easier, because I can scan the log for a line that doesn’t start with a date (see the one-liner after this list).
  • Of course, if the variable comes back true, then the log file is updated to reflect that the archival job was successful.
  • Finally, I do some clean-up. I set both variables back to blank values and make sure that the restricted account, which has only very few access privileges, can get at the file.
  • I don’t doubt that there may be a better way of doing that, but this way works very well.
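
That log check is a one-liner, because a failed run is the only kind of line that doesn’t begin with a date:

# Failure lines are the ones that don't start with a digit.
grep -v '^[0-9]' /home/UserName/backups/UserName.log
# Or match the Error prefix directly:
grep '^Error' /home/UserName/backups/UserName.log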

On the other machine, a cron job is set to run very early in the morning to copy down these archives. With every archive it copies, it logs the copy on the remote server. That way, if what I call the copy job fails, I can see it and take any required action. A rough sketch of that job follows.
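
I haven’t shown the actual copy job, but the idea is roughly this, assuming SSH key access for the restricted account; the host name and destination paths are hypothetical:

#!/bin/bash
# Runs from cron on the backup machine: pull each site's daily archive
# and log every copy, reusing the Error-prefix convention from above.
thisdate=$(date +%Y%m%d)
for site in SiteOne SiteTwo; do
    if scp RestrictedAccount@primary.example.com:/home/$site/backups/$site.tar.gz /srv/site-backups/$site-$thisdate.tar.gz; then
        echo "$thisdate Copied $site archive." >> /srv/site-backups/copy.log
    else
        echo "Error $thisdate Copy failed for $site." >> /srv/site-backups/copy.log
    fi
done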

I may be doing too many backups at the moment. With any process like this, it will take a few weeks of analysis to determine whether I can reduce the frequency of backups, depending on the number of updates made to each site. Because I don’t host a huge amount, I can even tailor the backup schedule per site, so that sites which are updated frequently are backed up more often.
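
Tailoring the schedule per site would just mean moving a site’s symbolic link from /etc/cron.daily to /etc/cron.weekly, or giving each script its own crontab entry. For example, in /etc/crontab (site names hypothetical):

# A busy site backed up daily at 02:10; a quiet one on Sundays at 03:10.
10 2 * * * root /home/BusySite/scripts/drupal-daily-backup.sh
10 3 * * 0 root /home/QuietSite/scripts/drupal-daily-backup.sh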

3 Comments

  1. shaun everiss

    Well, I obviously don’t have this sort of backup solution here.
    I have 3 systems: one laptop and 2 desktops.
    Most updates are done via flash drive with Windows XP.
    The main system that does a lot of critical stuff is behind ZoneAlarm, which blocks basically everything.
    Mostly it’s been backing up my system to a single hard drive, a 640 GB Elements, and getting a 1 TB MyBook to back the data up to, so I have a backup in case the other drive should fail.
    However, since my Elements, the working drive, is only half the size of the main drive, the MyBook, which is not used much apart from the 6-month backups, is the only drive that has everything.
    Other important stuff is on flash drives or other systems.
    Yesterday I had a scare, with a couple of systems reporting faults in SpeedFan; one still does.
    However, chkdsk reported that some indexes were faulty.
    These kept going on, so I did a backup.
    Interestingly, I was able to get a faster response uploading to the network at large and then putting it back to another system drive on my laptop, external drives included, rather than trying to run directly to the drive.
    It’s not strictly necessary that I back up all the data, since most users back up the critical stuff, but I have never needed to keep a copy of data for people.
    They don’t have much, as I have about 60 GB total for 2 users.
    One has about 50 and the other does not have much.
    At any rate, I am happy I did the backup. I cleaned the dust out of the system last night and replaced a faulty DVD drive; this seems to have fixed the issue, though there was still a common space error, which I fixed.
    The system seems to run better.

    The lesser of the 2 systems probably has a slight fault with its cooling system, though that’s strictly SMART info from SpeedFan and nothing wrong with the system itself.
    I have noticed the fan runs a bit high when it’s humid, though.
    Anyway, that’s for another night’s job.
