Category Archives: Network Administration

Troubleshooting Disk Failures on a Linux Software RAID with LVM

The following describes a drive failure I had on Ubuntu Linux with a Linux software RAID 5 volume with LVM, how I diagnosed it, and how I went about fixing it. The server had four 2TB drives in software RAID 5.

When checking kernel messages, here is an example of the bad sectors:

# dmesg -T
[Sun Jul 21 13:36:30 2013] ata4.00: status: { DRDY ERR }
[Sun Jul 21 13:36:30 2013] ata4.00: error: { UNC }
[Sun Jul 21 13:36:30 2013] ata4.00: configured for UDMA/133
[Sun Jul 21 13:36:30 2013] ata4: EH complete
[Sun Jul 21 13:36:32 2013] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[Sun Jul 21 13:36:32 2013] ata4.00: irq_stat 0x40000008
[Sun Jul 21 13:36:32 2013] ata4.00: failed command: READ FPDMA QUEUED
[Sun Jul 21 13:36:32 2013] ata4.00: cmd 60/20:00:2c:eb:7e/00:00:08:00:00/40 tag 0 ncq 16384 in
[Sun Jul 21 13:36:32 2013] res 41/40:00:2f:eb:7e/00:00:08:00:00/40 Emask 0x409 (media error) <F>
[Sun Jul 21 13:36:32 2013] ata4.00: status: { DRDY ERR }
[Sun Jul 21 13:36:32 2013] ata4.00: error: { UNC }
[Sun Jul 21 13:36:32 2013] ata4.00: configured for UDMA/133
[Sun Jul 21 13:36:32 2013] sd 3:0:0:0: [sdd] Unhandled sense code
[Sun Jul 21 13:36:32 2013] sd 3:0:0:0: [sdd] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Sun Jul 21 13:36:32 2013] sd 3:0:0:0: [sdd] Sense Key : Medium Error [current] [descriptor]
[Sun Jul 21 13:36:32 2013] Descriptor sense data with sense descriptors (in hex):
[Sun Jul 21 13:36:32 2013] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[Sun Jul 21 13:36:32 2013] 08 7e eb 2f
[Sun Jul 21 13:36:32 2013] sd 3:0:0:0: [sdd] Add. Sense: Unrecovered read error - auto reallocate failed
[Sun Jul 21 13:36:32 2013] sd 3:0:0:0: [sdd] CDB: Read(10): 28 00 08 7e eb 2c 00 00 20 00
[Sun Jul 21 13:36:32 2013] end_request: I/O error, dev sdd, sector 142535471
[Sun Jul 21 13:36:32 2013] ata4: EH complete

It’s clear from those messages that the problem was with sdd. In some cases, though, it’s not obvious which ata# in dmesg matches up with which drive, so I followed this askubuntu guide to confirm that ata4.00 really was /dev/sdd (it was).
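
One way to do that mapping yourself is to pull the ataN token out of each disk's sysfs device path. This is only a sketch (the sysfs layout can vary between kernels, and ata_of is just a helper name I made up):

```shell
# Print each SCSI disk together with the ATA port found in its sysfs path.
ata_of() {
  readlink -f "$1" | grep -o 'ata[0-9]\+' | head -n1
}

for dev in /sys/block/sd?; do
  [ -e "$dev" ] || continue   # skip if no such disks exist
  echo "${dev##*/} -> $(ata_of "$dev")"
done
```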

I have my RAID configured to email me if there’s a problem with the array, but no mail ever arrived, even though I confirmed it was configured to do so.
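
If you hit the same silence, it’s worth confirming that mdadm monitoring can actually send mail at all. Assuming MAILADDR is set in /etc/mdadm/mdadm.conf (the standard setup on Ubuntu), this fires a test alert, and thus a test email, for each array and then exits:

```shell
# mdadm --monitor --scan --test --oneshot
```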

I checked to make sure the RAID looked normal:

cat /proc/mdstat

looked normal (all U’s on the drives)

The output of mdadm looked fine as well (note that --detail takes the md device, not a member drive):

# mdadm --detail /dev/md0

The problem, according to dmesg, was that the kernel was having trouble reading specific sectors on drive [sdd]. So I knew I needed to start checking that drive specifically.

There are some good guides available about dealing with badblocks, such as the “bad block HOWTO”

Also, the FAQ for smartmontools (which contains the smartctl program):

The problem with the bad block HOWTO was that it didn’t cover my case, RAID with LVM. You definitely don’t want to get the dd command wrong, and I was nervous about getting it right on my system. A better approach may be to use hdparm as described here (forcing a hard disk to reallocate bad sectors). However, I didn’t do this, for reasons explained below.

The crux of that last page is finding the bad sector in dmesg and then confirming it with this command (substituting your own sector and device; mine, from the dmesg output above, were sector 142535471 on /dev/sdd):

# hdparm --read-sector 142535471 /dev/sdd

You should get:

/dev/sdd: Input/Output error

Note that the drive can’t be part of the array when you do this, so you have to fail and remove it first, then run:

# hdparm --write-sector 142535471 --yes-i-know-what-i-am-doing /dev/sdd

followed by a forced assemble to get the drive back in.
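
For completeness, the bracketing steps around that write would look roughly like this (a sketch using this post’s device names; the page I followed used a forced assemble, and mdadm’s --fail/--remove/--add is my assumption of the equivalent for an array that is still running):

```shell
# Take the failing member out of the array before touching sectors directly:
# mdadm /dev/md0 --fail /dev/sdd2 --remove /dev/sdd2

# ...run the hdparm read-sector / write-sector steps here...

# Re-add the member and let md rebuild onto it:
# mdadm /dev/md0 --add /dev/sdd2
```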

However, I didn’t want to do all that, for two reasons: I didn’t know the extent of the bad sectors (whether they were isolated or the whole drive was going bad), and the command writes 0’s into the sector, which means you will lose data if you’re not careful (since the array was otherwise still functioning, I wanted to be sure not to write 0’s into the good drives). So I decided to try something else instead.

I started with smartctl, which reported that the individual drive was healthy. It turns out a drive can be on the verge of failure while SMART still reports it as healthy; SMART health is more an indicator that things may go bad than a definitive verdict:

# smartctl -Hc /dev/sdd
SMART overall-health self-assessment test result: PASSED

Then I checked the vendor-specific SMART attributes using -a. The thing that stands out here is “197 Current_Pending_Sector” being higher than 0 (in my case 40), meaning there are sectors pending reallocation and, very likely, bad sectors.

# smartctl -a /dev/sdd
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 253 253 021 Pre-fail Always - 6200
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 23
5 Reallocated_Sector_Ct 0x0033 188 188 140 Pre-fail Always - 89
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 093 093 000 Old_age Always - 5693
10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 21
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 18
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 4
194 Temperature_Celsius 0x0022 080 074 000 Old_age Always - 72
196 Reallocated_Event_Count 0x0032 113 113 000 Old_age Always - 87
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 40
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
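
If you’d rather script that check than eyeball the table, a small filter over the smartctl output works. A sketch (pending_check is a helper name I made up; it assumes smartctl’s standard attribute-table column layout):

```shell
# pending_check reads `smartctl -A` output on stdin and warns when the raw
# value (last column) of Current_Pending_Sector is non-zero.
pending_check() {
  awk '$2 == "Current_Pending_Sector" && $NF > 0 {
         print "WARNING: " $NF " sector(s) pending reallocation" }'
}

# Usage (assumes smartctl is installed):
# smartctl -A /dev/sdd | pending_check
```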

I decided to run a short test:

# smartctl -t short /dev/sdd

And then 60 seconds later I checked the status:

# smartctl -l selftest /dev/sdd
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 2 Short offline Completed: read failure 90% 5691 71314612

That confirmed there was a bad sector. At this point, I needed to make a judgment on whether the drive was going bad, or whether there were just some bad sectors I needed to mark as such and move on. You can mark sectors as bad by writing 0’s to the bad area, and the disk firmware should then automatically reallocate them. The reason for this is explained in the earlier mentioned smartmontools FAQ:

If the disk can read the sector of data a single time, and the damage is permanent, not transient, then the disk firmware will mark the sector as ‘bad’ and allocate a spare sector to replace it. But if the disk can’t read the sector even once, then it won’t reallocate the sector, in hopes of being able, at some time in the future, to read the data from it. A write to an unreadable (corrupted) sector will fix the problem. If the damage is transient, then new consistent data will be written to the sector. If the damage is permanent, then the write will force sector reallocation. Please see Bad block HOWTO for instructions about how to force this sector to reallocate (Linux only).

The disk still has passing health status because the firmware has not found other signs of trouble, such as a failing servo.

Such disks can often be repaired by using the disk manufacturer’s ‘disk evaluation and repair’ utility. Beware: this may force reallocation of the lost sector and thus corrupt or destroy any file system on the disk. See Bad block HOWTO for generic Linux instructions.

The problem, again, is that the “bad block HOWTO” guide mentioned earlier doesn’t cover my scenario, RAID with LVM. I’m sure you could dig in and find exactly the right sector and mark it, but I didn’t want to risk it. So I was about to track down a Western Digital disk evaluation and repair utility when I ran across a post suggesting I could just do a RAID sync (called a “repair” on older kernels). To initiate one, you run:

# echo 'check' > /sys/block/md0/md/sync_action

Then check the RAID check status with:

# cat /proc/mdstat
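
Besides the progress bar in /proc/mdstat, the md sysfs files give the raw numbers. A sketch (sync_pct is a name I made up; /sys/block/md0/md/sync_completed reads like “123456 / 7814037168” while a check is running, or “none” when idle):

```shell
# sync_pct turns md's "done / total" sector counts into a percentage.
sync_pct() {
  awk '$3 > 0 { printf "%.1f%% complete\n", 100 * $1 / $3 }'
}

# Usage:
# sync_pct < /sys/block/md0/md/sync_completed
```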

In my case, it was going really slowly, so I first did what I could to shut down unnecessary activity on the drive, and then ran through the suggestions from here.

The main thing that sped things up was setting the stripe cache size higher than the default of 256:

# echo 32768 > /sys/block/md0/md/stripe_cache_size

As the check ran, lots of errors about the drive were being thrown in dmesg, so I knew this wasn’t an isolated incident I could fix by using a drive utility to mark bad sectors; the whole drive would need to be replaced.

As I was monitoring the RAID status, it got through about 5% and then md removed the drive from the array and stopped its work. Here’s what dmesg said:

[Sun Jul 21 17:14:29 2013] md/raid:md0: Disk failure on sdd2, disabling device.
[Sun Jul 21 17:14:29 2013] md/raid:md0: Operation continuing on 3 devices.
[Sun Jul 21 17:14:29 2013] md: md0: data-check done.

So now I had a degraded array, and needed a new drive ASAP to replace it and rebuild the array.

To replace it, I had to make sure the serial number on the physical drive I pulled matched the failed device. Also, since the drive was no longer showing up in my system, I had to issue one of the following commands:

mdadm /dev/md0 -r failed


mdadm /dev/md0 -r detached

The man page says:

“The first causes all failed devices to be removed. The second causes any device which is no longer connected to the system (i.e. an ‘open’ returns ENXIO) to be removed. This will only succeed for devices that are spares or have already been marked as failed.”

Find out what Subversion commits haven’t been merged to stable

We frequently use Subversion for version control: all code in active development is committed to /trunk, and code is merged to a stable branch that represents what is currently on a production server.

Some commits need to go out right away, so you merge them immediately; others can wait until you do a push. After a while of regular commits to trunk, you end up in a state where some trunk commits are already in the stable branch while others aren’t. When you’re ready to push a group of commits to stable, or do a full release of everything from trunk, it’s helpful to know which commits haven’t been merged to your stable branch yet.

Here’s a bash script that will show you the details of everything in trunk that hasn’t been merged to stable:

for i in `svn mergeinfo --show-revs eligible svn://server/project/trunk svn://server/project/branches/stable | cut -c 2-`; do
  svn log -c $i svn://server/project/trunk
done | less

The magic here is “svn mergeinfo --show-revs eligible”.  Very useful.
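
From there, cherry-picking one of those revisions into stable is straightforward (r1234 is a placeholder revision number):

```shell
# In a working copy of the stable branch:
# svn merge -c 1234 svn://server/project/trunk
# svn commit -m "Merge r1234 from trunk"
```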

Automatic and Secure rsync over SSH

Let’s say you want to set up an automatic rsync over ssh to a remote server, but you want to do it in a secure way.

Using rsync over ssh is a convenient way to have all of the power of rsync for synchronizing files, comparing differences, doing backups, without having to set up an rsync server.

Here is an example of using rsync in this manner, which will make the destination match the source exactly:

 rsync -a --delete -e "ssh" SOURCE DESTINATION

The problem is this won’t be automatic, because ssh will prompt you for your password. That means you can’t use rsync in a script, such as a cron job. To get around this, you can set up an ssh key with no passphrase and copy it to the remote server. This is done by running ssh-keygen and leaving the passphrase field empty when it prompts you:

username@local-server:~$ ssh-keygen 
Generating public/private rsa key pair.
Enter file in which to save the key (/home/username/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/username/.ssh/id_rsa.
Your public key has been saved in /home/username/.ssh/
The key fingerprint is:
bc:ff:a0:c6:f3:33:36:c2:3a:9b:19:20:22:a4:a1:f8 username@local-server
The key's randomart image is:

Then you normally copy the contents of the /home/username/.ssh/ file to the /home/username/.ssh/authorized_keys file on the remote server you want your rsync to be able to access.

You can test this by then using ssh with your new key:

 ssh username@remote -i /home/username/.ssh/id_rsa

If that works, your ssh key is working. (If you leave off the -i PATH_TO_KEY option, it will still work as long as the key uses the default name.)

This method works fine, except it isn’t secure: if anyone gets hold of your private key, they will have full access to your account on the remote server, with no password required. To make it more secure, you can take advantage of the ssh feature that limits a key to one command.

The best way to do this is to find out exactly what command rsync runs on the remote server. To see it, run your rsync with -vv, turning on very verbose mode:

 rsync -a -vv --delete -e "ssh -i /home/USERNAME/.ssh/rsync_id_rsa" SOURCE DESTINATION

Then look at the first line rsync returns. It should be something like:

opening connection using: ssh -p 4022 -l username remote-server rsync --server --sender -vvlogDtpre.iLsf . /var/www

What you are interested in is the rsync command and everything after it. In this case:

rsync --server --sender -vvlogDtpre.iLsf . /var/www

On the remote server, find the line where you copied your ssh key into the /home/username/.ssh/authorized_keys file and prepend this specific command to it in a command="..." option. For good measure, you can include other restrictions. Here is an example:

command="rsync --server --sender -vvlogDtpre.iLsf . /var/www",no-port-forwarding,no-pty,no-agent-forwarding ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDs/DqwIOWrf6K8yUPOMx22jx2vDTnXa9HvAobK1gw5I0Dx/z/HJdr7s2Iopcb7kdEBRJ9xQKWvc6lvdtdxDmSXc7a5WWjV9/2IaZGpJC0GDw79 username@local-server

Keep in mind that the whole entry needs to be on a single line; authorized_keys does not support splitting an entry across multiple lines.

That makes it so only that specific rsync command can run, thereby securing your rsync connection.
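
To confirm the lock-down took effect, try running an arbitrary command over that key; sshd should execute the forced rsync command instead of what you asked for, so you should never see your command’s real output:

```shell
# The "whoami" here is ignored; the command="..." from authorized_keys runs instead.
# ssh -i /home/USERNAME/.ssh/rsync_id_rsa username@remote-server whoami
```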