System statistics at a glance.

It's important to check regularly on how your servers are doing so you can take a proactive approach to resolving problems.

But when you're using a Linux server with no GUI, it can be easy to become complacent. After all, they so rarely go down, and it can take time to type all the commands required to get the information you need.

That's where Bash scripting comes in. Creating a script to provide you with memory, CPU, disk and bandwidth usage is very useful. Having this available to you on a day-to-day basis is even better again.

I've been looking for something that would give me a tailored report of what's happening on a server I run, but nothing gave me what I needed. Everything was either too complicated, too process intensive, too graphical or had too many dependencies. So, after looking around for a while, I decided it would take less time to throw something together myself.

As I say with all these scripts, this is only one way of doing things. It's not necessarily the best way, but it works nonetheless.

The first script obtains all the information I need and simply writes it to a text file. This can be sent by email or just stored somewhere for analysis.

#!/bin/bash
logfile=/YourDirectory/output.log
echo "Logged in users:" >> $logfile
w >> $logfile
echo >> $logfile
echo "Processor stats:" >> $logfile
mpstat >> $logfile
echo >> $logfile
echo "Virtual memory stats:" >> $logfile
vmstat >> $logfile
echo >> $logfile
echo "Top twenty memory hog applications:" >> $logfile
ps -A -o pid,pcpu,pmem,start_time,state,time,comm | perl -e '($_ = join "",<>) =~ s/(\t)/ /g; print;' | sort -g -k 3 -r | head -20 >> $logfile
echo >> $logfile
echo "Top twenty CPU hogging applications:" >> $logfile
ps -A -o pid,pcpu,pmem,start_time,state,time,comm | perl -e '($_ = join "",<>) =~ s/(\t)/ /g; print;' | sort -g -k 2 -r | head -20 >> $logfile
echo >> $logfile
echo "Free memory:" >> $logfile
free -m >> $logfile
echo >> $logfile
echo "Processor information:" >> $logfile
procinfo >> $logfile
echo >> $logfile
echo "Established connections:" >> $logfile
netstat -na | grep -i esta | grep -v 127.0.0.1 | sort -n -t. -k2 >> $logfile
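If you would rather have the report emailed than left on disk, a short addition along these lines should do it. This is only a sketch: it assumes a mail command such as the one from bsd-mailx or mailutils is installed, and the recipient address is a placeholder.

# Hypothetical extra step: email the finished report. The recipient address is a placeholder.
mail -s "Server stats for $(hostname) on $(date +%Y%m%d)" you@example.com < /YourDirectory/output.log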

This next script, though, is where it gets interesting. I've created a virtual host and all the files are written to its directory. There's an index and a separate HTML file for each day of the week. When the script is called, it populates the day's HTML file with the data I want and adds a line to the index pointing to the new file that has just been created. So, instead of getting email every day from a server that is ordinarily very reliable (touch wood), I can visit this page every so often to check that everything is ok.

You'll notice that in this script I've also included checks for mail, Apache2, MySQL and Icecast errors, as these are the most important services running on this server.

#!/bin/bash
logfile=/YourDirectory/logs/$(date +%Y%m%d).html
WHO=$(w)
MPSTAT=$(mpstat)
VMSTAT=$(vmstat)
PS_MEM=$(ps -A -o pid,pcpu,pmem,start_time,state,time,comm | perl -e '($_ = join "",<>) =~ s/(\t)/ /g; print;' | sort -g -k 3 -r | head -20)
PS_CPU=$(ps -A -o pid,pcpu,pmem,start_time,state,time,comm | perl -e '($_ = join "",<>) =~ s/(\t)/ /g; print;' | sort -g -k 2 -r | head -20)
FREE=$(free -m)
PROCINFO=$(procinfo)
NETSTAT=$(netstat -na | grep -i esta | grep -v 127.0.0.1 | sort -n -t. -k2)
APACHE2LOGS=$(tail /var/log/apache2/error.log)
MYSQLLOGS=$(tail /var/log/mysql.err)
MAILLOGS=$(tail /var/log/mail.err)
ICECASTLOGS=$(tail /var/log/icecast2/error.log)
cat <<- _EOF_ > $logfile
<html>
<head><title>Server stats</title></head>
<body>
<h1>Server statistics for $HOSTNAME</h1>
<p>Updated on $(date +"%x %r %Z") by $USER</p>
<h2>Logged in users:</h2>
<pre>$WHO</pre>
<h2>Processor stats:</h2>
<pre>$MPSTAT</pre>
<h2>Virtual memory stats:</h2>
<pre>$VMSTAT</pre>
<h2>Top twenty memory hog applications:</h2>
<pre>$PS_MEM</pre>
<h2>Top twenty CPU hogging applications:</h2>
<pre>$PS_CPU</pre>
<h2>Free memory:</h2>
<pre>$FREE</pre>
<h2>Processor information:</h2>
<pre>$PROCINFO</pre>
<h2>Established connections:</h2>
<pre>$NETSTAT</pre>
<h1>Errors</h1>
<h2>In the mail logs:</h2>
<pre>$MAILLOGS</pre>
<h2>MySQL logs</h2>
<pre>$MYSQLLOGS</pre>
<h2>Apache2 logs</h2>
<pre>$APACHE2LOGS</pre>
<h2>Icecast2 logs</h2>
<pre>$ICECASTLOGS</pre>
<p><a href="index.html">Return to the index</a></p>
</body>
</html>
_EOF_
echo "<a href=\"$(date +%Y%m%d).html\">$(date +%Y%m%d)</a><br>" >> /YourDirectory/logs/index.html
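To have the page rebuilt automatically each day, a crontab entry along these lines would do the trick. The script path and the 7AM schedule are only examples, not part of the script above.

# Hypothetical crontab entry: rebuild the stats page every morning at 07:00.
0 7 * * * /YourDirectory/serverstats.sh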

Internet connectivity problems for the past four weeks.

I really must write about this in more detail.

I have mentioned it once or twice before on the blog. The last time, I thought I had fixed it by changing the DNS. I've since reversed those changes as they did not make a lasting difference.

Put really simply, and without going into it too much right now: I can access Irish websites but I cannot access websites in other countries. Hence, I can write to this blog but I can't access twitter.com at the moment.

I want to see what the difference is between a tracert on this connection and a tracert on another Vodafone connection. I know they're not static routes, but still, it will be interesting to see what the difference is.

Here is the output of the tracert on my machine.

Tracing route to twitter.com [128.121.146.228]
over a maximum of 30 hops:
  1    <1 ms    <1 ms    <1 ms  192.168.2.1
  2    33 ms    33 ms    33 ms  89.19.64.2
  3    34 ms    35 ms    37 ms  89.19.64.73
  4    35 ms    36 ms    37 ms  213.233.129.93
  5   138 ms   107 ms   108 ms  193.95.147.61
  6    53 ms    51 ms    52 ms  vlan72.rt001.cwt.esat.net [193.95.130.209]
  7    51 ms    52 ms    50 ms  ge3-0.br003.cwt.esat.net [193.95.129.6]
  8    54 ms    54 ms    55 ms  xe-0-1-0-104.dub20.ip4.tinet.net [213.200.67.253]
  9   135 ms   138 ms   138 ms  xe-5-1-0.was12.ip4.tinet.net [89.149.184.34]
 10     *        *        *     Request timed out.
 11   191 ms   190 ms   191 ms  as-3.r20.snjsca04.us.bb.gin.ntt.net [129.250.2.167]
 12   191 ms   191 ms   192 ms  xe-1-1-0.r20.mlpsca01.us.bb.gin.ntt.net [129.250.5.61]
 13   192 ms   193 ms   189 ms  mg-1.c20.mlpsca01.us.da.verio.net [129.250.28.81]
 14   190 ms   192 ms   189 ms  128.121.150.245
 15   293 ms   188 ms   186 ms  128.121.146.213
 16   261 ms   191 ms   191 ms  128.121.146.228

Trace complete.

Interestingly, I cannot access Twitter from this machine. I can try on any one of the computers here; not one will load the Twitter page. If I ping it, I get a response. If I tracert it, I get a response also. But it just will not load http://www.twitter.com. It's not confined to Twitter either, as I said earlier. I was trying to install Winamp today from http://www.winamp.com and I couldn't access that either. Again, I could tracert to it without any problems.

Here, however, is the output of tracert on another Vodafone connection. Note, this check was done within two seconds of the first on my machine. This connection has no problems accessing any website. The configuration of this connection almost completely mirrors my own; the internal IP address structure is the only noticeable difference.

Tracing route to twitter.com [128.242.240.20]
over a maximum of 30 hops:
  1    <1 ms    <1 ms    <1 ms  192.168.1.254
  2     9 ms    10 ms    11 ms  89.19.64.129
  3    13 ms    12 ms    12 ms  89.19.64.181
  4    12 ms    13 ms    13 ms  193.95.147.65
  5    11 ms    11 ms    11 ms  vlan73.sw002.cwt.esat.net [193.95.130.217]
  6    11 ms    11 ms    10 ms  ge1-1.br003.cwt.esat.net [193.95.131.57]
  7    11 ms    11 ms    12 ms  xe-0-1-0-104.dub20.ip4.tinet.net [213.200.67.253]
  8   124 ms    98 ms    97 ms  xe-5-0-0.was12.ip4.tinet.net [89.149.185.17]
  9     *        *        *     Request timed out.
 10   165 ms   168 ms   169 ms  as-3.r20.snjsca04.us.bb.gin.ntt.net [129.250.2.167]
 11   166 ms   164 ms   168 ms  xe-1-1-0.r20.mlpsca01.us.bb.gin.ntt.net [129.250.5.61]
 12   168 ms   166 ms   171 ms  mg-1.c20.mlpsca01.us.da.verio.net [129.250.28.81]
 13   170 ms   170 ms   167 ms  128.241.122.197
 14   163 ms   165 ms   168 ms  128.242.240.5
 15   175 ms   202 ms   174 ms  128.242.240.20

Trace complete.

Again, I'll vent at another stage about my frustrations while trying to get this fixed. Not now though; it's late.

Update on Monday 1st March 2010 at 10:37PM.
Around twenty-four hours after writing this blog post, internet connectivity seems to be back to normal. I expect it will stay this way for a day or so, but unless the underlying problem is resolved, the issue described earlier will return. To demonstrate the change in the connection tonight, here is the output of the tracert command, again with Twitter as the destination.

Tracing route to twitter.com [168.143.162.116]
over a maximum of 30 hops:
  1    <1 ms    <1 ms    <1 ms  192.168.2.1
  2    34 ms    34 ms    34 ms  89.19.64.129
  3    35 ms    34 ms    34 ms  89.19.64.181
  4    34 ms    34 ms    34 ms  193.95.147.65
  5    35 ms    37 ms    35 ms  vlan73.sw002.cwt.esat.net [193.95.130.217]
  6    35 ms    34 ms    34 ms  ge1-1.br003.cwt.esat.net [193.95.131.57]
  7    35 ms    35 ms    34 ms  xe-0-1-0-104.dub20.ip4.tinet.net [213.200.67.253]
  8   121 ms   124 ms   123 ms  xe-5-1-0.was12.ip4.tinet.net [89.149.184.34]
  9     *        *        *     Request timed out.
 10   191 ms   193 ms   190 ms  as-3.r20.snjsca04.us.bb.gin.ntt.net [129.250.2.167]
 11   191 ms   220 ms   191 ms  xe-1-1-0.r20.mlpsca01.us.bb.gin.ntt.net [129.250.5.61]
 12   194 ms   193 ms   192 ms  mg-1.c20.mlpsca01.us.da.verio.net [129.250.28.81]
 13   194 ms   192 ms   191 ms  128.121.150.133
 14   202 ms   189 ms   191 ms  168.143.162.85
 15   191 ms   187 ms   189 ms  168.143.162.116

Trace complete.

Interestingly, the destination address of Twitter has changed and hop 9 timed out.

I can't explain the change; I just know that it exists.

All I can say is I really hope that the people responsible for providing this connection can do something.

Backing up to a remote server using scp and checking your results.

As promised, here is the next part of my series on backing up a remote Linux server.

This script is still quite straightforward, but on the upside, the more straightforward it is, the easier it is to troubleshoot if something goes wrong down the line.

It does a few things. It downloads all the archives in the backup directory, checks that they downloaded, and if that check is successful it runs a further check to make sure there are no problems with the archives themselves. If something has gone wrong, it is logged to a file matching that date with an extension of .err.

#!/bin/sh
thisdate=$(date +%Y%m%d)
backupstatus=failed
logdir=/home/YourUserName/backups/logs
backupdir=/home/YourUserName/backups
mkdir $backupdir/$thisdate
scp YourRemoteUserName@IPAddressOfServer:backups/*.gz $backupdir/$thisdate/ && echo $thisdate files downloaded from server into $backupdir >> $logdir/$thisdate.log && backupstatus=success
if [ "$backupstatus" = "success" ]; then
  ls $backupdir/$thisdate/ && echo $thisdate files are in $backupdir/$thisdate >> $logdir/$thisdate.log
  tar ztvf $backupdir/$thisdate/*.gz && echo $thisdate archives checked and decompress correctly. >> $logdir/$thisdate.log && backupstatus=success
  ls $backupdir/$thisdate/ || backupstatus=failed1
  if [ "$backupstatus" = "failed1" ]; then
    echo $thisdate The files did not download >> $logdir/$thisdate.err
  else
    tar ztvf $backupdir/$thisdate/*.gz 2> $logdir/$thisdate.err
  fi
fi
thisdate=
backupstatus=
logdir=
backupdir=

As always, I like to clean up my variables. They're declared at the top of the script and cleared again at the bottom.

In the middle is where the interesting stuff is.

As in the last script, the command after the && will only run if the first command completes successfully. Therefore, it's a great way of easily checking for the right exit status.
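As a throwaway illustration (the file names here are made up and not part of the backup script):

# The echo only runs if the copy succeeded, so the log line doubles as a success check.
cp /tmp/example.tar.gz /home/YourUserName/backups/ && echo example copied >> /tmp/example.log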

So, when I run ls on the directory that should hold that night's backups, I'm validating the check done above that the download was indeed successful.

The next check is much more important. It makes sure that the downloaded archives are readable. Notice the t switch after the tar command: "tar -ztvf". Again, if this check is not successful, that log line won't be written.

Of course, if things fail, I want to know why! That's where the next if block comes in. Instead of just writing success or failure status messages to the logs, it puts something meaningful into the error log. By capturing the errors from the tar command, we'll see what has happened: is the file missing, or is the archive corrupt?

Of course, there's one drawback to this. What happens if not all the archives are generated on the server side? Well, that's where the logs on the server come into play. It would be nice to have them all together in one place, but that's an easy enough job using a few other commands.
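For example, assuming the server keeps its own log under backups/logs (a guess on my part, not something the scripts above rely on), a couple of extra lines like these would pull it down and append it to the local log for the same day:

# Hypothetical sketch: fetch the server-side log and merge it into the local one.
scp YourRemoteUserName@IPAddressOfServer:backups/logs/$(date +%Y%m%d).log /tmp/serverside.log && cat /tmp/serverside.log >> /home/YourUserName/backups/logs/$(date +%Y%m%d).log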

In the next part of this, I will look at backing up individual MySQL databases.

Using RSA or DSA for authentication to a Linux server via SSH or SCP.

Following on from my post yesterday about backups, I thought I'd give a further explanation of how to copy down the archives created by that script.

For this, I'm using SCP. However, when using SCP you ordinarily need to log on.

If you're prompted for a username and password every time your script runs an scp command, it's kind of pointless having cron run the script at all.

So, to get around the requirement to log in, while at the same time keeping the setup secure, we use an RSA or DSA key.

For the rest of this post, I'm going to call the machines backup and server. The backup is the machine I am copying the backup files to.

On the backup machine, type the following commands to generate the keys and copy the public key across to the server. I suggest you use a very restricted account on both the backup machine and the server for this.

ssh-keygen -t rsa
Hit enter for the first question to accept the default location of /home/YourUserName/.ssh/id_rsa.
Hit enter without typing anything for the second and third questions, as we don't want a passphrase on this particular key. Note, this is usually not recommended, but it should be ok for this type of situation.
It will tell you that a public and private key have been created, and it will show you the fingerprint of the newly created key as well.

Next, you will want to copy the public key across to your server. Note, the server is the machine that hosts your backup scripts.
scp .ssh/id_rsa.pub YourUserName@ServerName:.ssh/

If this is the first time you've used a public key, then use the following command instead, as it will make things easier for you.
scp .ssh/id_rsa.pub YourUserName@ServerName:.ssh/authorized_keys

If however you have used other keys, do the following:
ssh YourUserName@ServerAddress

Type your password to log in.

Now, type the following to append the contents of id_rsa.pub to the authorized_keys file.
cat .ssh/id_rsa.pub >> .ssh/authorized_keys

Now, leave the ssh session by typing exit.

From the backup machine, you can now log in via ssh without providing a password.
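A quick way to prove that is to run a single harmless command over ssh. If the key is set up correctly, this prints the server's hostname without ever prompting for a password:

ssh YourUserName@ServerAddress hostname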

Note!!!

You might want to secure your keys. The private key on the backup machine has no passphrase, so if it goes missing this could go very, very badly for you. It's also worth tightening up the authorized_keys file on the server.

Log into the server by typing:
ssh YourUserName@ServerAddress

Now, change the permissions so that this restricted user account is the only one with read and write access to the authorized_keys file.
chmod 600 .ssh/authorized_keys

Now, get out of the ssh session by typing exit.

The next step will be running scp to download your backups and verify that they're readable. If they're not, we'll want to log the failure.

Backing up a Drupal site.

I host a number of Drupal sites, as well as WordPress and custom-made ones.

When you host a site, one of the first questions you're asked is: do you have the ability to back up and restore my site if something breaks?

For obvious reasons, that's an important question. But it's a balancing act. It's important to make sure you back up regularly, but you don't want to overdo it and use up all your bandwidth copying said backups off the server.

So, for backups, you need to separate them into four parts.

  • Nightly full server backups.
    If the server goes down, I want to be able to bring it back within 5 minutes.
  • Monthly full site backups.
    These are compressed archives that contain everything from the site, including content and databases.
  • Weekly differential site backups.
    These are stored on a server that mirrors the configuration of the primary. It is used for testing new server configs before they go live on the production server.
  • Daily site backups.
    This is a backup of important site files that can become damaged as a result of errors during an upgrade or configuration change. It does not contain a database backup but is very useful for very quick restores.

With that in mind, I have created the final part of this puzzle. The following daily backup script archives the important directories in a Drupal installation so they're ready to be copied down by the remote server. I have these scripts saved to a location in the home folder of a very restricted account that is used solely for this task. A symbolic link in /etc/cron.daily points back to each of these scripts.
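For reference, creating one of those symbolic links might look something like the line below. The script name and path are only examples; on Debian-based systems the link name should contain no dot, because run-parts skips file names that do.

# Hypothetical example: expose the daily backup script to cron.daily via a symlink.
ln -s /home/RestrictedAccount/scripts/backup-UserName.sh /etc/cron.daily/backup-UserName

The daily script itself follows.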

#!/bin/bash
thisdate=$(date +%Y%m%d)
backupstatus=false
tar -zcvf /home/UserName/backups/UserName.tar.gz \
/home/UserName/public_html/sites/all /home/UserName/public_html/sites/default/settings.php \
/home/UserName/public_html/sites/default/files/playlists /home/UserName/public_html/sites/default/files/js \
/home/UserName/public_html/sites/default/files/css /home/UserName/public_html/cron.php \
/home/UserName/public_html/includes /home/UserName/public_html/index.php /home/UserName/public_html/install.php \
/home/UserName/public_html/misc /home/UserName/public_html/modules /home/UserName/public_html/profiles \
/home/UserName/public_html/scripts /home/UserName/public_html/themes /home/UserName/public_html/update.php \
/home/UserName/public_html/xmlrpc.php \
&& backupstatus=true
if [ "$backupstatus" = false ]; then
  echo Error $thisdate Backup failed. >> /home/UserName/backups/UserName.log
else
  echo $thisdate Backup completed without errors. >> /home/UserName/backups/UserName.log
fi
backupstatus=
thisdate=
chown RestrictedAccount /home/UserName/backups/UserName.tar.gz

So, what am I doing there?

  • First, I declare a variable to hold the date.
  • Second, I declare a variable that holds the value false. If the archive command doesn't work, this will never be set to true.
  • Next, I archive very specific folders. Notice that I'm not archiving /home/UserName/public_html/sites/default/files, because it contains audio, pictures and videos, and I really don't want or need to include those in every day's backup file; it would be far too large. (See the sketch after this list for an alternative way of skipping that directory.)
  • Notice that there's a change to the backupstatus variable at the end of the archive command. Because it comes after an &&, it will not run unless the archive command is successful.
  • Next, I use an if statement. If the backup status is false, I write to the log file, putting Error at the start of the line. This makes things a bit easier because I can scan the log for a line that doesn't start with a date.
  • Of course, if the variable comes back true, then the log file is updated to reflect that the archival job was successful.
  • Finally, I do some clean-up. I set both variables to blank values and make sure that the restricted user, who has very few access privileges, can get the file.
  • I don't doubt that there may be a better way of doing that, but this way works very well.

    On the other machine, a cron job is set to run very early in the morning to copy down these archives. With every archive it copies, it logs it on the remote server. That way, if what I call the copy job fails, I can see it and take any required action.

    I may be doing too many backups at the moment. With any process like this, it will take a few weeks of analysis to determine whether I can reduce the frequency of backups, depending on the number of updates made to each site. Because I don't host a huge amount, I can even tailor the backup schedule per site so that sites that are updated frequently are backed up more often.
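If listing every path feels unwieldy, a rough alternative would be to archive all of public_html and exclude the large media directory instead. This is only a sketch, and note that it would also skip the playlists, js and css subdirectories that the script above deliberately keeps, so it isn't a drop-in replacement:

# Hypothetical alternative: archive everything under public_html except the bulky files directory.
tar -zcvf /home/UserName/backups/UserName.tar.gz --exclude=/home/UserName/public_html/sites/default/files /home/UserName/public_html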

Mad for trad is back.

Due to technical problems outside my control, Mad for trad had to be canceled last Saturday.

This week however, it’s back!

So, tune in to http://digitaldarragh.com:8000/madfortrad at 7PM GMT, 2PM Eastern and 11AM Pacific.

Don’t forget, www.digitaldarragh.com/madfortrad is the address for the page.

The history of DigitalDarragh

This blog is around 8 years old. It was a blog before the word blog was even used in everyday language!

It was of course previously hosted at http://digitaldarragh.blogspot.com/ but since 2002 the digitaldarragh.com website has been serving up blog posts of one type or another.

But, because it’s so old, and I’ve done so much with it, content has been lost over time. Content that tells of memories from years ago when life was very different.

But there's a way you can see all this interesting, disturbing and mildly entertaining content. Visit http://web.archive.org/web/*/http://www.digitaldarragh.com and look through the site. Some links obviously won't work because, although archive.org is a great site, it can't save everything. But mostly everything important is there.

Oh, and then there was this shocking and terrible attempt at web development, Digytek. Spelled badly, written badly and just ….. bad! http://web.archive.org/web/*/http://www.digytek.com

Aside from doing websites for myself, I did a few others. But please, give me a break! I hadn't a notion of what I was doing! The designs, markup, coding and, in fact, everything about them were just horrid! http://web.archive.org/web/*/http://www.alpinefurniture.ie

But, it’s all just a bit of fun isn’t it?

What’s next then…. 🙂

Mad For Trad tomorrow is going to be brilliant!

This week’s mad for trad is going to be the best ever!

Ok, like every other station and program out there, I'll focus on Valentine's Day, but obviously I'll play the best Irish traditional music about love lost, love gained, too much love, too little love, loving other people, having no love and all that kind of good stuff.

I also want to hear about your most embarrassing valentine’s moment. Don’t worry; I’ll share the mother and father of humiliating moments too.

But wait. This is a Mad for Trad with a difference. I won’t be in the usual studio and I will have live audience participation. This is going to be the most interactive show yet!

So, come listen at 7PM GMT, 2PM Eastern and 11AM Pacific. The address for the show page is www.digitaldarragh.com/madfortrad. To listen to the stream live, type the following address into your media player: http://digitaldarragh.com:8000/madfortrad Note, this doesn't go live until around 6:45PM, so if you tune in now, your media player is likely to scream and complain because it isn't receiving anything.

Tomorrow is going to be a big show. You’d be mad to miss it.

GTK v QT.

This is a response to a comment on my blog post about Linux accessibility. It turned into a bit of a long one though, and the information is applicable to a lot of things. So, have a read of this:

KDE is based on QT and Gnome is based on GTK. It’s important to recognise the differences in these environments from the start. QT doesn’t support AT-SPI. It was planned to support this but it never happened. This is down to a decision by the QT developers.
Here’s an article that discusses how Gnome communicates with Orca. http://accessibility.kde.org/developer/atk.php

There’s information about the accessibility of QT packages at http://doc.trolltech.com/4.5/qt4-accessibility.html

GTK is not the only toolkit used for graphical desktops in Linux; QT is just as much of a player. That's where the problem comes in. Any application that is written for Gnome and follows the Gnome development guidelines will communicate the necessary information to Orca. The problem arises when you launch a QT-based application, such as Acrobat Reader, because Orca doesn't get the required information from it.

I understand your point about the limited progress in some areas due to developers not doing enough to comply with the accessibility guidelines, but tell me one platform where that isn't an issue. It's nothing to do with communication. It's a problem with bad coding. Simple.

In relation to the problems you have encountered accessing applications that run as root in Ubuntu, there are a number of well documented fixes for this. Basically, Orca, like every other Linux application, cannot obtain information from a process running as a more privileged user. This is part of the reason why Linux is such a secure platform. However, there are workarounds, as I said earlier. In fact, listen to one of the recordings in the Linux section of www.lalrecordings.com and you will hear one of the ways of getting around this. In Vinux, the environment is preconfigured for optimal accessibility. Perhaps you should try it out? You might find that with the environment configured to provide the most accessibility, most of your issues are resolved. If nothing else, it will illustrate that on most distributions of Linux running Gnome, problems can be ironed out if you're willing to spend some time on configuration.

I'd also suggest you read http://live.gnome.org/Orca, as I'm almost certain you'll find an alternative and accessible PDF reader there. As for Skype, use the plug-in for Pidgin. I would bet that you will not find an application that doesn't have a GTK counterpart in Linux. There are very few things in Linux that you will not be able to access if you keep at it, just as the same can be said of Windows, or even OSX if you're that way inclined.

Lists

digitaldarragh: The IrlGuideDogs and IrelandVIPNews lists should now be up to date and new members have been added to the Vics website as well. [Twitter updates]

Administering and moderating lists is a lot of work.

I’ve also a few websites to manage as well.

So, if I don't get around to your request right away, please send it again in a few days. I'll get to it.

Unfortunately, sometimes I'm actually out. You know, doing that stuff they call living?

Oh, and if you're not happy with this, tough! I'm giving my time to moderate these websites and email lists. Unless you're willing to do it yourself, don't annoy me when it takes a few days to get something done.

Right. Rant over.

I just got an email that made me mad enough to send that.