From Mark Furneaux's Wiki
Revision as of 11:32, 13 May 2017 by Mark Furneaux (Talk | contribs) (Installation)


ZFS is a combined filesystem and logical volume manager for UNIX and Linux systems.

This walkthrough assumes that we are building a RAIDZ1 pool named "tank" with 4x 1TB drives, a single hot spare, a single SSD for the L2ARC, and a single SSD partition for the ZIL.
We will then add a dataset called "storage".


Your system should contain drives all of the same type and size. It should have at least 1GB of free RAM and must be running a 64-bit OS. You should make a list of all drives, including their model, serial number, location in the chassis, and the motherboard/HBA port each is plugged into. This will make replacing dead disks much safer. Almost any modern CPU will be fine for most applications; even Intel Atoms now have enough power for a dedicated file server, even with compression enabled.


For Ubuntu Linux 14.04 and earlier, add the repository[1] by running:
# apt-get install python-software-properties
# add-apt-repository ppa:zfs-native/stable
# apt-get update

Install the metapackage by running:
# apt-get install ubuntu-zfs

For Ubuntu 16.04 and later, simply install the ZFS utilities by running:
# apt install zfsutils-linux

For Debian Jessie, add the jessie-backports repository by appending the following to /etc/apt/sources.list:
deb http://ftp.debian.org/debian jessie-backports main contrib
Then update the package lists:
# apt update
Then install the requisite packages:
# apt install linux-headers-$(uname -r) zfs-dkms zfs-initramfs

Setting Up A Pool

We are going to assume that you have recorded the drive models and serial numbers before they were installed in the server.

Partition the SSD for the ZIL. The ZIL rarely needs to be more than 4GB in size; you can use the remaining space for a root filesystem or an L2ARC.
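As a sketch, assuming the SSD appears as /dev/sdx (a placeholder; substitute your actual device) and using sgdisk (one of several suitable partitioning tools), the layout could be created like this:

```shell
# WARNING: this destroys all data on the target disk. /dev/sdx is a placeholder.
sgdisk --zap-all /dev/sdx                          # wipe any existing partition tables
sgdisk --new=1:0:+4G --typecode=1:bf01 /dev/sdx    # partition 1: 4GiB for the ZIL
sgdisk --new=2:0:0   --typecode=2:bf01 /dev/sdx    # partition 2: remainder for the L2ARC
sgdisk --print /dev/sdx                            # verify the resulting layout
```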

Find the disk IDs for both the SSD partition and HDDs by running
$ ls -l /dev/disk/by-id/
to get IDs in the form of drive model/drive serial number.
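To cross-check these IDs against kernel device names when filling in your chassis map, the by-id symlinks can be resolved with readlink; a minimal sketch:

```shell
# Print each by-id name alongside the kernel device it points to.
for link in /dev/disk/by-id/*; do
    [ -e "$link" ] || continue                    # skip if the glob matched nothing
    printf '%-60s %s\n' "${link##*/}" "$(readlink -f "$link")"
done
```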

Create the pool by running:
# zpool create -f -o ashift=12 -o autoreplace=on -o autoexpand=on tank raidz1 \
/dev/disk/by-id/ata-1000-1234-AAAA \
/dev/disk/by-id/ata-1000-1234-BBBB \
/dev/disk/by-id/ata-1000-1234-CCCC \
/dev/disk/by-id/ata-1000-1234-DDDD

-f forces the creation of the pool; it is needed if the disks contain remnants of a previous filesystem or pool label.
ashift=12 forces 4KiB sectors for newer Advanced Format disks. Use ashift=9 for older 512-byte-sector drives.
autoreplace=on allows ZFS to automatically switch to an available hot spare if it detects hardware errors on an online disk.
autoexpand=on allows the pool to grow once all the drives in a VDEV have been replaced with larger ones. This must be set before any drives are replaced, so it is best to set it now.

Setting Up Automatic Scrub

It is best practice to scrub consumer-grade SATA disks on a weekly or biweekly basis.


A weekly scrub can be achieved by running a scrub via cron by adding the following to /etc/crontab:
0 2 * * 3 root zpool scrub tank
This scrubs the pool "tank" every Wednesday at 2:00am.


Unfortunately cron does not have a method for biweekly execution. One elegant solution is to create the following script in /usr/local/bin/scrubzfs:


#!/bin/sh

cd /root

if [ -e ran_zfs_scrub_last_week ]; then
        rm -f ran_zfs_scrub_last_week
        exit 0
fi

touch ran_zfs_scrub_last_week

zpool scrub tank

exit 0

and be sure to make it executable.

Then add the following to /etc/crontab:
0 2 * * 3 root /usr/local/bin/scrubzfs
This scrubs the pool "tank" every other Wednesday at 2:00am.
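An alternative to the marker-file approach (a sketch; note that ISO week numbering can produce two consecutive odd-numbered weeks across a year boundary, so an occasional scrub may land one week early or late) is to key off the parity of the week number:

```shell
#!/bin/sh
# Scrub only on even-numbered ISO weeks. date +%V prints 01..53.
week=$(date +%V)
week=${week#0}                 # strip a leading zero so 08/09 aren't read as octal
if [ $((week % 2)) -eq 0 ]; then
    zpool scrub tank
fi
```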

Adding Spares

Spare disks can be added simply by running:
# zpool add -f tank spare /dev/disk/by-id/ata-1000-1234-EEEE

Adding The ZIL

The ZIL partition created earlier can be added by running:
# zpool add -f tank log /dev/disk/by-id/ata-60-1234-AAAA-part2

Adding The L2ARC

The L2ARC drive can be added by running:
# zpool add -f tank cache /dev/disk/by-id/ata-60-5678-AAAA

Creating Datasets

Datasets can be created dynamically by running:
# zfs create tank/storage
# zfs set compression=lz4 tank/storage
# zfs set xattr=sa tank/storage
compression=lz4 turns on dynamic data compression.
xattr=sa allows small system attributes to be stored in inodes rather than hidden directories. This can speed up many operations by 3x.
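The property assignments can also be folded into creation with -o, which guarantees the dataset never exists without them:

```shell
# Equivalent one-step creation with both properties set at creation time.
zfs create -o compression=lz4 -o xattr=sa tank/storage
```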

Since the dataset will be owned by root, ownership must be changed for users to have write access. To do that run:
# chown -R mark:root /tank/storage


Once ZFS is installed and working, it is beneficial to tune some parameters to improve scrub, resilver, and general performance. Changes can be temporarily made by echoing values to /sys/module/zfs/parameters/ and then made permanent by adding lines to /etc/modprobe.d/zfs.conf in the form shown below.

options zfs zfs_arc_max=21474836480
options zfs zfs_top_maxinflight=64
options zfs zfs_scrub_delay=0
options zfs zfs_scan_idle=0
options zfs zfs_resilver_delay=0
options zfs zfs_vdev_scrub_max_active=10
options zfs zfs_scan_min_time_ms=5000
options zfs zfs_resilver_min_time_ms=5000
options zfs zfs_vdev_sync_read_max_active=100
options zfs zfs_vdev_sync_write_max_active=100
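For example, to try a value before committing it to /etc/modprobe.d/zfs.conf (requires root and the zfs module loaded; the change is lost on reboot):

```shell
# Temporarily allow 64 in-flight scrub I/Os per top-level vdev.
echo 64 > /sys/module/zfs/parameters/zfs_top_maxinflight
# Confirm the running value.
cat /sys/module/zfs/parameters/zfs_top_maxinflight
```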


ZFS Not Mounting

Some older systems not running systemd boot too fast for ZFS to properly initialize all pools, causing mountall to run before the pools are available.
A quick fix is simply to run mountall again by adding it to the end of /etc/rc.local.
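A sketch of the resulting /etc/rc.local (the mountall call must come before the final exit 0):

```shell
#!/bin/sh -e
# ... existing rc.local contents ...
mountall    # remount anything skipped because ZFS was not yet ready
exit 0
```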

ZFS Module Fails to Build

Sometimes during kernel updates the ZFS module will fail to build. This is usually because ZFS depends on the SPL kernel module which may not have been built yet. To remedy this, make sure ubuntu-zfs is installed and updated by running:
# apt-get install ubuntu-zfs

Then reconfigure the SPL kernel module by running:
# dpkg-reconfigure spl-dkms
followed by the ZFS module:
# dpkg-reconfigure zfs-dkms

ARM Installation

The kmod packages cannot be built on ARM and thus you must install with regular make install. After installing, the binaries will not run due to bad library paths.
To fix this, add the following to /etc/ld.so.conf:


And then run:
# ldconfig

Custom Packages

If installing a custom kmod package, several steps must be performed after installation to ensure a bootable system.

To fix update-grub, add the following to /etc/ld.so.conf:


And then run:
# ldconfig

To fix an unbootable RAIDZ pool on Debian, edit the following line in /usr/sbin/grub-mkconfig:

GRUB_DEVICE="`${grub_probe} --target=device /`"

changing it to:

GRUB_DEVICE="`${grub_probe} --target=device / | head -n1`"

Add the ZFS and SPL modules to the module tree:
# depmod -a <kernel version>

Generate a new initramfs:
# update-initramfs -u -k <kernel version>

Update the grub menu list:
# update-grub


Many storage engineers and sysadmins claim that RAID5, and by extension RAIDZ1, is "broken". This stems from the fact that, given the high capacity of modern drives and their intrinsic error rates, it is very unlikely that a pool can recover from a failure without suffering bit damage. With RAID5 this is for the most part true. ZFS adds another level of checksumming and healing, but once a disk has failed and no redundancy remains, it will in most cases only detect that corruption has taken place, not recover from it. It is advised that large pools use RAIDZ2 at a minimum. The exact definition of "large" is up to the creator of the pool and depends on the type of data stored on it.


  1. ZFS on Linux Repository