Disk Pooling and Failure tolerance for Ubuntu NAS – mhddfs and SnapRAID

One of the key goals for my Ubuntu NAS was the ability to create a single storage pool out of all the disks in the NAS (the OS runs off a USB thumb drive) and to provide some fault tolerance, so that a single disk failure doesn't lose data outright. Pooling and fault tolerance are often conflated, largely because of the way RAID (hardware or software) is typically used, but they are really two separate problems, so I'll deal with them separately.

tl;dr: Use SnapRAID and mhddfs to get a simple pool of disks with your choice of failure tolerance.

Failure tolerance

The traditional way of making an array of disks tolerant to hardware failures is RAID (Redundant Array of Independent Disks). Instead of reading up on every RAID level and driving yourself nuts, the question to ask is: what kinds of events do you want to protect against? Think of it like insurance – it protects you, but comes with riders that limit where the protection applies. For me, there are two things I most want to protect against:

  • Single hard disk failure – by far the most common and most likely failure event in today's systems
  • Bit rot – a close second. As disk sizes increase, so does the likelihood of silent read failures: even with no overt signs of trouble, a drive can return garbage for certain read requests, leaving your data corrupt or lost.

When it comes to RAID, the first choice is hardware RAID or software RAID. Almost all modern motherboards ship with a RAID controller that lets you configure your disks into a limited set of RAID levels, some of which include redundancy. However, hardware RAID is simply a bad idea in my opinion: you cannot access the data on the individual drives outside of that setup, so if your motherboard dies, so does your data, unless you can find the exact same RAID controller to replace it with. So for me at least, this choice was the easiest one – software RAID, which offers maximum flexibility and portability for your data.

The next choice is what kind of protection you want: do you need real-time protection, where every bit you change on disk is immediately parity-checked and protected? In my case (and, in my opinion, for most home NAS scenarios), real-time RAID is overkill. For the most part, I'm willing to tolerate some level of data loss (say, up to the last 24 hours) in return for a simpler, more performant system.

Once I decided that real-time RAID was unnecessary, I looked at the following candidates for software RAID on my Ubuntu NAS: ZFS (overkill for home systems, with prohibitive resource requirements), FlexRAID (commercial), unRAID (commercial) and SnapRAID (free and open source). SnapRAID has a great comparison page that covers these options in more detail. The more I read about SnapRAID, the more I fell in love with it – it is exactly what my home NAS needs, given a large amount of slowly or rarely changing data, and it protects against bit rot as well. I decided to use SnapRAID as my fault tolerance solution.

Disk Pooling

Disk pooling was the second part of my storage and server reorganization. Disk pooling is the ability to take a bunch of potentially disparate disks (different sizes, manufacturers and even types [SSD or HDD]) and operate them as a single namespace for storing data. There are plenty of disk pooling solutions available for Linux; in my case, I wanted something very simple that wouldn't interfere with my preferred choice for failure tolerance (most disk pooling solutions are tied to a failure tolerance scheme – hardware RAID, for example). The choices I looked at were: AuFS (doesn't look actively maintained and is about to be dropped from the Linux kernel), mdadm (Linux software RAID; tries to be too hardware-RAID-like for my needs) and mhddfs (a simple, FUSE-based disk pooling solution). After looking at the details, I picked mhddfs: it is a simple, lightweight solution that focuses on disk pooling without trying to shove failure tolerance into the mix.

Setting up SnapRAID and mhddfs

Before getting into the setup of SnapRAID and mhddfs, it is useful to describe the storage available on my server: five hard disks, three 1 TB and two 2 TB drives. The first thing to realize with SnapRAID is that you have to dedicate at least one disk to parity in order to tolerate the failure of a single disk, and that parity disk must be the largest disk in your pool if the protection is to remain meaningful as the data disks fill up. So, in my case, one 2 TB disk is dedicated to SnapRAID parity. SnapRAID content files are simply a table of files, checksums and other bookkeeping data; I decided to keep a copy of the content file on each of my data drives. Here's my snapraid.conf:

parity /mnt/sde1/parity
content /mnt/sda1/content
content /mnt/sdb1/content
content /mnt/sdc1/content
content /mnt/sdd1/content
disk d1 /mnt/sda1/
disk d2 /mnt/sdb1/
disk d3 /mnt/sdc1/
disk d4 /mnt/sdd1/
exclude /mnt/sda1/tmp/
exclude lost+found/
exclude tmp/

Basically, I mount my five drives at /mnt/sd[abcde]1: sde1 is the parity drive and sd[abcd]1 are the data drives, each of which also hosts a copy of the content file. Once the configuration is done, simply run

snapraid sync

to populate the parity information from your pre-existing content. From that point on, all it takes to maintain a system with periodic snapshots is a couple of cron jobs, scheduled according to your tolerance for data loss balanced against maintenance activity. In my case, I decided to run a sync daily and a scrub weekly (a scrub checks a specified percentage of your data for bit rot). A quick manual pass is sketched below, followed by the cron scripts that I use for maintaining SnapRAID automatically.
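
For reference, the scrub and status commands can also be run by hand; this is roughly what a manual check looks like (the 10% figure here is purely illustrative – my cron job below scrubs 20%):

# re-read roughly 10% of the array and verify it against the stored checksums
snapraid -p 10 scrub

# print a summary of the array state
snapraid status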

SnapRAID Sync Script

#!/bin/bash
# Daily SnapRAID sync: run "snapraid diff" first and only sync if the number of
# deleted files is below a safety threshold; mail the output on any failure.

# mailuser <recipient> <subject> <body>: mail the given body (a file or a string) via ssmtp
function mailuser () {
    mailto=$1
    mailsubject=$2
    mailbody=$3
    /bin/cat <<EOF | /usr/sbin/ssmtp $mailto
To: $mailto
Subject: $mailsubject

$([[ -f "$mailbody" ]] && tail -7 "$mailbody" || echo "$mailbody")

EOF

}

DEL_THRESHOLD=250                       # refuse to sync if more than this many files were deleted
SNAPRAID_BIN="/usr/local/bin/snapraid"

mailto="user@example.com"               # replace with your email address

output_folder=/tmp
diff_output="$output_folder/snapraid.diff"
sync_output="$output_folder/snapraid.sync"

# capture both stdout and stderr so failures can be mailed
$SNAPRAID_BIN diff > "$diff_output" 2>&1
diffreturn=$?

if [[ $diffreturn -ne 0 ]]; then
    mailsubject="Snapraid diff failed, $diffreturn"
    mailuser "$mailto" "$mailsubject" "$diff_output"
    exit 1
fi

# parse the summary counts from the end of the diff output
changedcount=$(awk '/^ +[0-9]+ +changed$/ {print $1}' "$diff_output")
deletedcount=$(awk '/^ +[0-9]+ +removed$/ {print $1}' "$diff_output")
addedcount=$(awk '/^ +[0-9]+ +added$/ {print $1}' "$diff_output")
movedcount=$(awk '/^ +[0-9]+ +moved$/ {print $1}' "$diff_output")
modificationcount=$((changedcount + deletedcount + addedcount + movedcount))

if [[ $modificationcount -gt 0 ]]; then
  if [[ $deletedcount -gt $DEL_THRESHOLD ]]; then
    mailsubject="Delete threshold exceeded, $deletedcount deleted files"
    mailuser "$mailto" "$mailsubject" "$diff_output"
    exit 1
  else
    $SNAPRAID_BIN sync > "$sync_output" 2>&1
    syncreturn=$?
    if [[ $syncreturn -ne 0 ]]; then
        mailsubject="Snapraid sync failed, Status: $syncreturn"
        mailuser "$mailto" "$mailsubject" "$sync_output"
        exit 1
    fi
  fi
fi

[ -f "$diff_output" ] && rm "$diff_output"
[ -f "$sync_output" ] && rm "$sync_output"
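
The scripts send their notifications through /usr/sbin/ssmtp, so ssmtp needs to be installed and pointed at a working mail relay before any of this is useful. A minimal /etc/ssmtp/ssmtp.conf looks something like the following; every value here is a placeholder rather than my actual configuration:

# where mail addressed to root and other local users ends up
root=user@example.com
# the SMTP relay that actually delivers the mail
mailhub=smtp.example.com:587
# credentials for the relay, if it requires authentication
AuthUser=user@example.com
AuthPass=changeme
UseSTARTTLS=YES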

SnapRAID Scrub Script

#!/bin/bash
# Weekly SnapRAID scrub: verify a portion of the array against its checksums and
# mail out the resulting status report.

# mailuser <recipient> <subject> <body>: mail the given body (a file or a string) via ssmtp
function mailuser () {
    mailto=$1
    mailsubject=$2
    mailbody=$3
    echo "$mailbody"    # also log the body (usually a file name) to stdout for the cron log
    /bin/cat <<EOF | /usr/sbin/ssmtp $mailto
To: $mailto
Subject: $mailsubject

$([[ -f "$mailbody" ]] && tail -7 "$mailbody" || echo "$mailbody")

EOF

}


mailto="user@example.com"               # replace with your email address
SNAPRAID_BIN=/usr/local/bin/snapraid
output_dir=/tmp
scrub_output="$output_dir/snapraid.scrub"
status_output="$output_dir/snapraid.status"

# scrub 20% of the array, capturing both stdout and stderr for mailing
$SNAPRAID_BIN -p 20 scrub > "$scrub_output" 2>&1
scrubreturn=$?

if [[ $scrubreturn -ne 0 ]]; then
    mailsubject="Snapraid scrub failed, $scrubreturn"
    mailuser "$mailto" "$mailsubject" "$scrub_output"
    exit 1
fi

$SNAPRAID_BIN status > "$status_output" 2>&1
statusreturn=$?

if [[ $statusreturn -ne 0 ]]; then
    mailsubject="Snapraid status failed, $statusreturn"
else
    mailsubject="Snapraid status"
fi

mailuser "$mailto" "$mailsubject" "$status_output"

[[ -f "$scrub_output" ]] && rm "$scrub_output"
[[ -f "$status_output" ]] && rm "$status_output"

With these two scripts, I simply set up cron jobs to run them at my desired frequency and time to keep the array updated. To add the cron jobs, put the two scripts in /etc/cron.d (as snapraidsync.sh and snapraidscrub.sh) and add a file named snapraid to /etc/cron.d with the following contents:

PATH=/usr/local/sbin:/usr/sbin:/usr/bin:/sbin:/bin:/etc/cron.d:/usr/local/bin

# Check for and run a sync every day at 05:00
00 05 * * * root /etc/cron.d/snapraidsync.sh > /tmp/snapraid.log 2>&1

# Scrub 20% of the array every Thursday at 06:22
22 06 * * 4 root /etc/cron.d/snapraidscrub.sh > /tmp/snapraidscrub.log 2>&1
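
One small gotcha: the scripts have to be executable by root, and it's worth running the sync script once by hand to confirm that mail delivery works before handing it over to cron. Something along these lines (paths match the crontab above):

sudo chmod 755 /etc/cron.d/snapraidsync.sh /etc/cron.d/snapraidscrub.sh
sudo /etc/cron.d/snapraidsync.sh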

Setting up mhddfs
Setting up mhddfs is very straightforward – simply add a line similar to the one below to /etc/fstab to create the pooled storage access point:

/mnt/sda1;/mnt/sdb1;/mnt/sdc1;/mnt/sdd1 /storage fuse.mhddfs rw,allow_other 0 0

Note that my parity disk is not part of this storage pool – only the data disks belong here. The allow_other option is essential; without it there are permission issues (especially in conjunction with Samba) that I couldn't find an easy way around.
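
If you want to try the pool before committing it to /etc/fstab, mhddfs can also be mounted by hand. A rough sketch (the mount points are from my setup; the package should be available in the standard Ubuntu repositories, but check yours):

# install mhddfs
sudo apt-get install mhddfs

# mount the four data disks as a single pool, with the same option used in the fstab line above
sudo mhddfs /mnt/sda1,/mnt/sdb1,/mnt/sdc1,/mnt/sdd1 /storage -o allow_other

# the pool should now report the combined free space of all four data disks
df -h /storage

# undo the test mount when done
sudo fusermount -u /storage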
