
Both the ZIL and the L2ARC, plus a third "special" class of vdev which aggregates small blocks and can hold the dedupe table.

The ZIL is the "ZFS Intent Log", a log-structured ordered stream of file operations to be performed on the ZFS volume.

If power goes out, or the disk controller goes away unexpectedly, this ZIL is the log that will get replayed to bring the volume back to a consistent state. I think.

Usually the ZIL lives on the same storage devices as the rest of the data, so a write to the ZIL has to wait in the same queue as everything else. It can improve performance to give the ZIL its own dedicated storage device (a separate log device, or "SLOG"). NVMe is great here; the lower the latency, the better.

Since the ZFS Intent Log gets flushed to disk every five seconds or so, a dedicated ZIL device doesn't have to be very big. But it has to be reliable and durable.
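
(That "every five seconds" is the transaction group commit interval; on Linux OpenZFS it's a visible tunable, assuming default settings:)

    # transaction group commit interval in seconds (default: 5)
    cat /sys/module/zfs/parameters/zfs_txg_timeout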

Windows made small, durable NVMe cache drives a mainstream item for a while, back when most laptops still used rotating hard drives. Optane NVMe at 16GB is cheap, like twenty bucks; buy three of them and use two as a mirrored pair for your ZIL.
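
Attaching the mirrored pair to an existing pool is a one-liner (pool name and device paths below are placeholders):

    # add a mirrored SLOG; sync writes land here instead of the in-pool ZIL
    zpool add tank log mirror /dev/nvme1n1 /dev/nvme2n1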

----

Then there's the second-level read cache, the L2ARC (the first level, the ARC, lives in RAM). I use a pair of 1TB NVMe devices for that.

Finally, there's a "special" vdev that can be designated for metadata-heavy things like small blocks and the dedupe table (which Fast Dedupe is making smaller!).
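
Roughly how those get attached, as a sketch ("tank" and the device paths are placeholders):

    # L2ARC: cache vdevs hold no pool data and can't be mirrored
    zpool add tank cache /dev/nvme0n1
    # special vdev holds pool metadata, so it should be redundant
    zpool add tank special mirror /dev/nvme1n1 /dev/nvme2n1
    # optionally route small file blocks to the special vdev too
    zfs set special_small_blocks=32K tank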



A couple things:

- The ZIL is used exclusively for synchronous writes - critical for VMs, databases, NFS shares, and other applications requiring strict write consistency. Many conventional workloads won't benefit. Use `zilstat` to monitor.

- The cheap 16GB Optane devices are indeed great in terms of latency, but they were designed primarily for read caching and have significantly limited write speeds. If you need better throughput, look for the larger Optane models which don't have these limitations.

- SLOG doesn't need to be mirrored - the only risk is if your SLOG device fails at the exact moment your system crashes. While mirroring is reasonable for production systems, with these cheap 16GB Optanes you're just guaranteeing they'll wear out at the same time. You could kill one at a time instead. :)

- As for those 1TB NVMe devices for read cache (L2ARC) - that's probably overkill unless you have a very specific use case. L2ARC actually consumes RAM to track what's in the cache, and that RAM might be better used for ARC (the main memory cache). L2ARC only makes sense when you have well-understood workload patterns and your ARC is consistently under pressure - like in a busy database server or similar high-traffic scenario. Use `arcstat` to monitor your cache hit ratios before deciding if you need L2ARC; there's a quick sketch of the monitoring commands after this list.
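
A minimal monitoring sketch, assuming an OpenZFS install where these utilities are on the PATH ("tank" is a placeholder pool name):

    # per-interval ZIL activity - shows whether sync writes are even happening
    zilstat 5
    # ARC reads, hits, and hit percentage, sampled every 5 seconds
    arcstat 5
    # per-vdev I/O breakdown, including any log and cache devices
    zpool iostat -v tank 5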


>Optane NVMe at 16GB is cheap, like twenty bucks; buy three of them and use two as a mirrored pair for your ZIL.

I've been building a home media server lately and have thought about doing something like this. However, there's a big problem: these little 16GB Optane drives are NVMe. My main boot drive, which also holds the apps, is NVMe too (not mirrored yet; for now I'm just regularly copying to the spinning disks for backup, but a mirror would be better). So ideally that's 4 NVMe drives, and that's with me "cheating" and making the boot drive a partition on the main NVMe drive instead of a separate drive as normally recommended.

So where are you supposed to plug all these things in? My pretty-typical motherboard has only 2 NVMe slots, one that connects directly to the CPU (PCIe 4.0) and one that connects through the chipset (slower PCIe 3.0). Is the normal method to use some kind of PCIe-to-NVMe adapter card and plug that into the PCIe x16 video slot?


Why are you looking at 16GB Optane drives? You probably don't need a SLOG device for your media server.

I think you're pretty far into XY-problem territory here. I'd recommend hanging out in r/homelab and r/zfs, reading the FAQs, and then, if you still have questions, maybe starting with a post explaining your high-level goals and challenges.


I'm not using them yet; I've already built my server without one, but I was wondering if it would be beneficial to add one for ZIL. Again, this is a home media server, so the main uses are pretty standard for a "home server" these days I think: NFS share, backups (of our PCs), video/music/photo storage, Jellyfin server, Immich server. I've read tons of FAQs and /r/homelab and /r/homeserver (honestly, /r/homelab isn't very useful, it's overkill for this kind of thing, with people building ridiculous rack-mount mega-systems; /r/homeserver is a lot better but it seems like a lot of people are just cobbling together a bunch of old junk, not building a single storage/media server).

My main question here was just what I asked about NVMe drives. In the research you recommended, people often suggest using multiple NVMe drives. But even a mirror is going to be problematic: on a typical motherboard (I'm using an AMD B550 chipset), there are only 2 slots, and they're connected very differently, with one slot being much faster (PCIe 4.0) than the other (PCIe 3.0) and having very different latency, since the fast one connects to the CPU and the slow one goes through the chipset.


Ok, understood. The part I'm confused about is the focus on NVMe devices - do you also have a bunch of SATA/SAS SSDs, or even conventional disks for your media? If not, I'd definitely start there. Maybe something like six spinners in RAIDZ2, which would let you lose up to two drives without any data loss.

If NVMe is your only option, I'd try to find a couple of used 1.92TB enterprise-class drives on ebay, and go ahead and mirror those without worrying about the different performance characteristics (the pool will simply perform as fast as the slowest device) - but 1.92TB isn't much for a media server.
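
For concreteness, a sketch of both layouts (pool and device names are placeholders, and zpool create is destructive, so double-check the devices first):

    # six spinners with two-disk redundancy
    zpool create tank raidz2 sda sdb sdc sdd sde sdf
    # or: two NVMe devices as a simple mirror
    zpool create tank mirror /dev/nvme0n1 /dev/nvme1n1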

In general, I'd say consumer-class SSDs aren't worth the time it'll take you to install them. I'd happily deploy almost any enterprise-class SSD with 50% of its life beat out of it over almost any brand-new consumer-class drive. The difference is stark: enterprise drives offer superior performance through power-loss protection (PLP, which lets them safely acknowledge sync writes from cache) and better sustained writes (thanks to higher-quality NAND and over-provisioning), while also delivering much better longevity.


>The part I'm confused about is the focus on NVMe devices - do you also have a bunch of SATA/SAS SSDs

I do have 4 regular SATA spinning disks (enterprise-class), for bulk data storage, in a RAIDZ1 array. I know it's not as safe as RAIDZ2, but I thought it'd be safe enough with only 4 disks, and I want to keep power usage down if possible.

I'm using (right now) a single 512GB NVMe drive for both booting and app storage, since it's so much faster. The main data will live on the spinners, but the apps themselves are on the NVMe, which should improve performance a lot. It's not mirrored, obviously, so that's one big reason I'm asking about the NVMe slots; sticking a 2nd NVMe drive in this system would actually slow it down, since the 2nd slot is only PCIe 3.0 and connected through the chipset, so I'm wondering if people do something different, like using some kind of adapter card for the x16 video slot. I just haven't seen any good recommendations online in this regard. For now, I'm just doing daily syncs to the raid array, so if the NVMe drive suddenly dies somehow, it won't be that hard to recover, though obviously not nearly as easy as with a mirror. This obviously isn't some kind of mission-critical system, so I'm ok with this setup for now; some downtime is OK, but data loss is not.
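
(The daily sync can be as simple as a cron'd rsync - the paths below are made up for illustration:)

    # hypothetical nightly copy of the NVMe app data onto the RAIDZ1 pool
    rsync -a --delete /apps/ /tank/backup/apps/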

Thanks for the advice!


Yeah, RAIDZ1 is a reasonable trade-off for the four disks.

Move your NVMe to the other slot; I bet you can't tell the difference without synthetic benchmarks.
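
(If you do want to check, a quick fio run before and after the move would settle it - the test-file path is a placeholder, and this creates a 1G file:)

    # 4k random reads for 30s at queue depth 32; compare results between slots
    fio --name=slottest --filename=/mnt/nvme/fio.test --size=1G \
        --rw=randread --bs=4k --iodepth=32 --ioengine=libaio \
        --direct=1 --runtime=30 --time_based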



