After two years of using my R630 as an expensive paperweight, I finally got around to reconfiguring it for what I hope will be an effective and useful purpose. I did a bunch of work to get here for which I will have to back-date posts — re-flashing the H330 card, new hardware, why I chose TrueNAS — but today I wanted to cover how the pools are configured and why.
First, some background on ZFS. There are lots and lots and lots of references for ZFS (in general) and TrueNAS (specifically). Here’s my distillation after about a week’s worth of research and reading.
In ZFS, a pool is composed of one or more virtual devices (vdevs). Depending on the documentation, ZFS stripes/distributes data across the pool; in either case, the takeaway is the same: the failure of any vdev results in a failure of the entire pool (i.e. complete data loss). Honestly, this isn’t that different from traditional RAID layouts, where depending on the RAID level the same would be true (i.e. if you exceeded the failure threshold you would lose all data in the RAID array). But it’s scary enough that multiple sources reiterate this point to ensure you get it.
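To make the "lose one vdev, lose the pool" rule concrete, here is a minimal Python sketch of the failure model. It is a toy model, not anything ZFS itself exposes: the `Vdev` class, the `pool_is_alive` helper, and the `mirror-0`/`mirror-1` names are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Vdev:
    """Toy model of a vdev (hypothetical, for illustration only)."""
    name: str
    disks: int            # number of disks in the vdev
    fault_tolerance: int  # disk failures the vdev can absorb (1 for a 2-wide mirror)
    failed: int = 0       # disks currently failed

    def is_alive(self) -> bool:
        # A vdev survives as long as failures stay within its tolerance.
        return self.failed <= self.fault_tolerance


def pool_is_alive(vdevs: Sequence[Vdev]) -> bool:
    # Data is spread across every vdev, so the pool needs all of them.
    return all(v.is_alive() for v in vdevs)


pool = [Vdev("mirror-0", disks=2, fault_tolerance=1),
        Vdev("mirror-1", disks=2, fault_tolerance=1)]

pool[0].failed = 1
one_failure = pool_is_alive(pool)             # one disk down: pool survives

pool[0].failed = 2
two_failures_same_vdev = pool_is_alive(pool)  # whole vdev down: pool is lost

print(one_failure, two_failures_same_vdev)
```

The point of the `all(...)` is that redundancy lives inside each vdev, not across vdevs: a healthy `mirror-1` does nothing to save the pool once `mirror-0` is gone.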
Like any redundant system, you’re trading resiliency against size, weight, power, performance, etc. In ZFS, that manifests as resiliency (i.e. how many concurrent failures you can tolerate at a pool level) vs. storage efficiency (i.e. what percentage of raw storage is available for use). Virtual devices (vdevs) can have one of the following topologies:
| Topology | Min Disks | Redundancy | Rebuild | Read | Write | Efficiency | Add/Remove Disks? |
There are a few other factors that play into the layout decision:
- vdevs in a pool must have the same topology; you can’t mix raidz with mirrors (or stripes)
- raidz topologies cannot be modified after a vdev has been created (i.e. you can’t add drives or change the RAIDZ level, e.g. to RAIDZ2); they’re working on the first constraint but there are limitations even in the proposed feature; I don’t believe there is any plan to allow for changing the RAIDZ level
- raidz topologies require all disks (within a vdev) to be identical. When combined with the previous note, this implies that you need to upgrade all the drives in a vdev at the same time
- raidz rebuild times scale with disk size and […] RAIDZ1), so a vdev with 1 TB drives could take ~1 day/drive to rebuild.
My R630 has 8 x 2.5” drive bays, loaded with a mix of 600 GB and 900 GB HDDs; assuming no cold spares, I have three potential topologies for my pool:
1. 1 vdev (8-wide)
2. 2 vdevs (4-wide/vdev)
3. 4 vdevs (2-wide/vdev)
Option #3 implies mirrors (since there aren’t enough disks for a RAIDZ topology). The other two options imply RAIDZ topologies: Option #2 could be a RAIDZ1 or RAIDZ2; Option #1 could be RAIDZ1, RAIDZ2, or RAIDZ3. Each of those topologies trades storage efficiency vs. resiliency. Here are the options I evaluated over the weekend:
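As a rough way to compare these layouts, the sketch below computes nominal storage efficiency and failure tolerance for each candidate on 8 bays. It assumes equal-sized disks and ignores ZFS metadata and padding overhead, so the numbers are upper bounds; the `layout` helper and the labels are mine, not anything from TrueNAS.

```python
def layout(vdevs: int, width: int, parity: int) -> dict:
    """Nominal figures for a pool of `vdevs` identical vdevs.

    parity = redundant disks per vdev (1 for a 2-wide mirror or RAIDZ1,
    2 for RAIDZ2, 3 for RAIDZ3). Assumes equal-sized disks and ignores
    metadata/padding overhead.
    """
    total_disks = vdevs * width
    data_disks = vdevs * (width - parity)
    return {
        "efficiency": data_disks / total_disks,
        # Worst case: every failure lands in the same vdev.
        "guaranteed_failures": parity,
        # Best case: failures are spread evenly across vdevs.
        "best_case_failures": vdevs * parity,
    }


candidates = [
    ("1 x 8-wide RAIDZ1", 1, 8, 1),
    ("1 x 8-wide RAIDZ2", 1, 8, 2),
    ("1 x 8-wide RAIDZ3", 1, 8, 3),
    ("2 x 4-wide RAIDZ1", 2, 4, 1),
    ("2 x 4-wide RAIDZ2", 2, 4, 2),
    ("4 x 2-wide mirror", 4, 2, 1),
]

for label, vdevs, width, parity in candidates:
    r = layout(vdevs, width, parity)
    print(f"{label}: {r['efficiency']:.1%} usable, "
          f"survives {r['guaranteed_failures']} failure(s) guaranteed, "
          f"up to {r['best_case_failures']} if they land in different vdevs")
```

Note that this only captures capacity and failure counts; it says nothing about read/write performance or rebuild time, which also factored into my decision.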
tl;dr: I went with Option “A” (4 vdevs x 2-wide mirror/vdev) because it gave me the best performance and resiliency (the things I care about most) as well as the flexibility to grow the storage 1-2 drives at a time. I can always buy more storage but this allows me to use up my old 600 GB disks and swap in my new 900 GB disks as they fail.
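Here is a small sketch of what swapping 900 GB disks into a pool of mirrors does to capacity. It relies on the fact that a mirror is only as big as its smallest disk, works in raw GB with no filesystem overhead, and uses a hypothetical starting point of four 600 GB pairs rather than my exact disk mix.

```python
from typing import Sequence, Tuple


def mirror_capacity_gb(disks: Sequence[int]) -> int:
    # A mirror can only hold as much as its smallest member.
    return min(disks)


def pool_capacity_gb(mirrors: Sequence[Tuple[int, ...]]) -> int:
    # Pool capacity is the sum of its vdevs (raw GB, overhead ignored).
    return sum(mirror_capacity_gb(m) for m in mirrors)


# Hypothetical starting point: four 2-wide mirrors of 600 GB disks.
before = [(600, 600), (600, 600), (600, 600), (600, 600)]

# One 600 GB disk fails and is replaced with a 900 GB disk.
one_swapped = [(900, 600), (600, 600), (600, 600), (600, 600)]

# The second disk in that mirror is later replaced as well.
pair_swapped = [(900, 900), (600, 600), (600, 600), (600, 600)]

print(pool_capacity_gb(before))        # 2400
print(pool_capacity_gb(one_swapped))   # 2400
print(pool_capacity_gb(pair_swapped))  # 2700
```

So the extra space only shows up once both halves of a mirror have been upgraded, which is still a much smaller step than replacing every disk in a wide RAIDZ vdev.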
I also have a pair of 500 GB NVMe drives configured as a mirror in a second pool for application data (e.g. Docker containers).
UPDATE: three of my old 600 GB disks already failed! So it’s been fun to see how the pool handles the failures and the rebuilds.