I've used BTRFS exclusively for over a decade now on all my personal laptops, se...

KennyBlanken · on Dec 1, 2024

"Flagship"? I don't know a single person who uses it in production systems. It's the only filesystem I've lost data to. Ditto for friends.

Please go look up survivor bias. That's what all you btrfs fanboys don't seem to understand. It doesn't matter how well it has worked for 99.9% of you. Filesystems have to be the most reliable component in an operating system.

It's a flagship whose fsck requires you to contact developers to seek advice on how to use it because otherwise it might destroy your filesystem.

It's a flagship whose userspace tools, fifteen years in, are still seeing major changes.

It's a flagship whose design is so poor that fifteen years in the developers are making major changes to its structure and depreciating old features in ways that do not trigger an automatic upgrade or informative error to upgrade, but cause the filesystem to panic with error messages for which there is no documentation and little clue what the problem is.

No other filesystem has these issues.

jchw · on Dec 1, 2024

Btrfs is in production all over the damn place, at big corporations and all kinds of different deployments. Synology has their own btrfs setup that they ship to customers with their NAS software for example.

I found it incredibly annoying the first time I ran out of disk space on btrfs, but many of these points are hyperbolic and honestly just silly. For example, btrfs doesn't really do offline fsck. fsck.btrfs has a zero percent chance of destroying your volume because it does nothing. As for the user space utilities changing... I'm not sure how that demonstrates the filesystem is not production ready.

Personally I usually use either XFS or btrfs as my root filesystem. While I've caught some snags with btrfs, I've never lost any data. I don't actually know anyone who has, I've merely just heard about it.

And it's not like other well-regarded filesystems have never ran into data loss situations: even OpenZFS recently (about a year ago) uncovered a data-eating bug that called its reliability into question.

I'm sure some people will angrily tell me that actually btrfs is shit and the worst thing to ever be created and honestly whatever. I am not passionate about filesystems. Wake me up when there's a better one and it's mainlined. Maybe it will eventually be bcachefs. (Edit: and just to be clear, I do realize bcachefs is mainline and Kent Overstreet considers it to be stable and safe. However, it's still young and it's upstream future has been called into question. For non-technical reasons, but still; it does make me less confident.)

aaronmdjones · on Dec 1, 2024

    For example, btrfs doesn't really do offline fsck. fsck.btrfs has a
    zero percent chance of destroying your volume because it does nothing.

fsck.btrfs does indeed do nothing, but that's not the tool they were complaining about. From the btrfs-check(8) manpage:

    Warning

    Do not use --repair unless you are advised to do so by a
    developer or an experienced user, and then only after having
    accepted that no fsck can successfully repair all types of
    filesystem corruption. E.g. some other software or hardware
    bugs can fatally damage a volume.
    
    [...]
    
    DANGEROUS OPTIONS
    
    --repair
        enable the repair mode and attempt to fix problems where possible

        Note there’s a warning and 10 second delay when this option is
        run without --force to give users a chance to think twice
        before running repair, the warnings in documentation have
        shown to be insufficient

jchw · on Dec 1, 2024

Yes, but that doesn't do the job that a fsck implementation does. fsck is something you stuff into your initrd to do some quick checks/repairs prior to mounting, but btrfs intentionally doesn't need those.

If you need btrfs-check, you have probably hit either a catastrophic bug or hardware failure. This is not the same as fsck for some other filesystems. However, ZFS is designed the same way and also has no fsck utility.

So whatever point was intended to be made was not, in any case.

shiroiushi · on Dec 2, 2024

>I don't actually know anyone who has, I've merely just heard about it.

Well "yarg", a few comments up in this conversation, says he lost all his data to it with the last year.

I've seen enough comments like that that I don't see it as a trustworthy filesystem. I never see comments like that about ext4 or ZFS.

jchw · on Dec 2, 2024

Contrary to popular belief, people on a forum you happen to participate in are still just strangers. In line with popular belief, anecdotal evidence is not a good basis to form an opinion.

shiroiushi · on Dec 2, 2024

Exactly how do you propose to form an opinion on filesystem reliability then? Do my own testing with thousands of computers over the course of 15 years?

jchw · on Dec 2, 2024

You don't determine what CPUs are fast or reliable by reading forum comments and guessing, why would filesystems be any different?

That said, you make a good point. It's actually pretty hard to quantify how "stable" a filesystem is meaningfully. It's not like anyone is doing Jepsen-style analysis of filesystems right now, so the best thing we can go off of is testimony. And right now for btrfs, the two types of data-points are essentially, companies that have been using it in production successfully, and people on the internet saying it sucks. I'm not saying either of those is great, and I am not trying to tell anyone that btrfs is some subjective measure of good. I'm just here to tell people it's apparently stable enough to be used in production... because, well, it's being used in production.

Would I argue it is a particularly stable filesystem? No, in large part because it's huge. It's a filesystem with an integrated volume manager, snapshots, transparent compression and much more. Something vastly simpler with a lower surface area and more time in the oven is simply less likely to run into bugs.

Would I argue it is perfectly reasonable to use btrfs for your PC? Without question. A home use case with a simple volume setup is exceedingly unlikely to be challenging for btrfs. It has some rough edges, but I don't expect to be any more likely to lose data to btrfs bugs as I expect to lose data from hardware failures. The bottom line is, if you absolutely must not lose data, having proper redundancy and backups is probably a much bigger concern than btrfs bugs for most people.

shiroiushi · on Dec 2, 2024

>You don't determine what CPUs are fast or reliable by reading forum comments and guessing, why would filesystems be any different?

Your premise is entirely wrong. How else would I determine what CPUs are fast or reliable? Buy dozens of them and stress-test them all? No, I use online sites like cpu-monkey.com that compare different CPUs' features and performance according to various benchmarks, for the performance part at least. For reliability, what way can you possibly think of other than simply aggregating user ratings (i.e. anecdotes)? If you aren't running a datacenter or something, you have no practical alternative.

At least for spinning-rust HDDs, the helpful folks at Backblaze have made a treasure trove of long-term data available to us. But this isn't available for most other things.

> It's not like anyone is doing Jepsen-style analysis of filesystems right now, so the best thing we can go off of is testimony.

This is exactly my point. We have nothing better, for most of this stuff.

>companies that have been using it in production successfully, and people on the internet saying it sucks

Companies using something doesn't always mean it's any good, especially for individual/consumer use. Companies can afford teams of professionals to manage stuff, and they can also make their own custom versions of things (esp. true with OSS code). They're also using things in ways that aren't comparable to individuals. These companies may be using btrfs in a highly feature-restricted way that they've found, through testing, is safe and reliable for their use case.

> It's a filesystem with an integrated volume manager, snapshots, transparent compression and much more. Something vastly simpler with a lower surface area and more time in the oven is simply less likely to run into bugs.

This is all true, but ZFS has generally all the same features, yet I don't see remotely as many testimonials from people saying "ZFS ate my data!" as I have with btrfs over the years. Maybe btrfs has gotten better over time, but as the American car manufacturers found out, it takes very little time to ruin your reputation for reliability, and a very long time to repair that reputation.

jchw · on Dec 2, 2024

> Your premise is entirely wrong. How else would I determine what CPUs are fast or reliable? Buy dozens of them and stress-test them all? No, I use online sites like cpu-monkey.com that compare different CPUs' features and performance according to various benchmarks, for the performance part at least. For reliability, what way can you possibly think of other than simply aggregating user ratings (i.e. anecdotes)? If you aren't running a datacenter or something, you have no practical alternative.

My point is just that anecdotes alone don't tell you much. I'm not suggesting that everyone needs to conduct studies on how reliable something is, but if nobody has done the groundwork then the best thing we can really say is we're not sure how stable it is because the best evidence is not very good and it conflicts.

> Companies using something doesn't always mean it's any good, especially for individual/consumer use. Companies can afford teams of professionals to manage stuff, and they can also make their own custom versions of things (esp. true with OSS code). They're also using things in ways that aren't comparable to individuals. These companies may be using btrfs in a highly feature-restricted way that they've found, through testing, is safe and reliable for their use case.

For Synology you can take a look at what they're shipping since they're shipping it to consumers. It does seem like they're not using many of the volume management features, instead using some proprietary volume management scheme on the block layer. However otherwise there's nothing particularly special that I can see, it's just btrfs. Other advanced features like transparent compression are available and exposed in the UI.

(edit: Small correction. While I'm still pretty sure Synology has custom volume management for RAID which works on the block level, as it turns out, they are actually using btrfs subvolumes as well.)

I think the Synology case is an especially interesting bit of evidence because it's gotta be one of the worst cases of shipping a filesystem, since you're shipping it to customer machines you don't control and can't easily inspect later. It's not the only case of shipping btrfs to the customer either, I believe ChromeOS does this and even uses subvolumes, though I didn't actually look for myself when I was using it so I'm not actually 100% sure on that one.

> This is all true, but ZFS has generally all the same features, yet I don't see remotely as many testimonials from people saying "ZFS ate my data!" as I have with btrfs over the years. Maybe btrfs has gotten better over time, but as the American car manufacturers found out, it takes very little time to ruin your reputation for reliability, and a very long time to repair that reputation.

In my opinion, ZFS and other Solaris technologies that came out around that time period set a very high bar for reliable, genuinely innovative system features. I think we're going to have to live with the fact that just having a production-ready filesystem dropped onto the world is not going to be the common case, especially in the open source world: the filesystem will need to go through its growing pains in the open.

Btrfs has earned a reputation as the perpetually-unfinished filesystem. Maybe it's tainted and it will simply never approach the degree of stability that ZFS has. Or, maybe it already has, and it will just take a while for people to acknowledge it. It's hard to be sure.

My favorite option would be if I just simply don't have to find out, because an option arrives that quickly proves itself to be much better. bcachefs is a prime contender since it not only seems to have better bones but it's also faster than btrfs in benchmarks anyways (which is not saying much because btrfs is actually quite slow.) But for me, I'm still waiting. And until then, ZFS is not in mainline Linux, and it never will be. So for now, I'm using btrfs and generally OK recommending it for users that want more advanced features than ext4 can offer, with the simple caveat that you should always keep sufficient backups of your important data at all times.

I only joined in on this discussion because I think that the btrfs hysteria train has gone off the rails. Btrfs is a flawed filesystem, but it continues to be vastly overstated every time it comes up. It's just, simply put, not that bad. It does generally work as expected.

anonfordays · on Dec 2, 2024

>Synology has their own btrfs setup that they ship to customers with their NAS software for example.

Synology infamously/hilariously does not use btrfs as the underlying file system because even they don't trust btrfs's RAID subsystem. Synology uses LVM RAID that is presented to btrfs as a single drive. btrfs isn't managing any of the volumes/disks.

jchw · on Dec 3, 2024

Their reason for not using btrfs as a multi-device volume manager is not specified, though it's reasonable to infer that it is because btrfs's own built-in volume management/RAID wasn't suitable. That's not really very surprising: back in ~2016 when Synology started using btrfs, these features were still somewhat nascent even though other parts of the filesystem were starting to become more mature. To this day, btrfs RAID is still pretty limited, and I wouldn't recommend it. (As far as I know, btrfs RAID5/6 is even still considered incomplete upstream.) On the other hand, btrfs subvolumes as a whole are relatively stable, and that and other features are used in Synology DSM and ChromeOS.

That said, there's really nothing particularly wrong with using btrfs with another block-level volume manager. I'm sure it seems silly since it's something btrfs ostensibly supports, but filesystem-level redundancy is still one of those things that I think I would generally be afraid to lean on too hard. More traditional RAID at the block level is simply going to be less susceptible to bugs, and it might even be a bit easier to manage. (I've used ZFS raidz before and ran into issues/confusion when trying to manage the zpool. I have nothing but respect for the developers of ZFS but I think the degree to which people portray ZFS as an impeccable specimen of filesystem perfection is a little bit unrealistic, it can be confusing, limited, and even, at least very occasionally, buggy too.)

anonfordays · on Dec 3, 2024

>That's not really very surprising: back in ~2016 when Synology started using btrfs, these features were still somewhat nascent even though other parts of the filesystem were starting to become more mature.

btrfs was seven years old at that point and declared "stable" three years before that.

ZFS is an example of amazingly written code by awesome engineers. It's simple to manage, scales well, and easy to grok. btrfs sadly will go the wayside once bcachefs reaches maturity. I wouldn't trust btrfs for important data, and neither should you. If you experience data loss on a Synology box, the answer you'll get from them is "tough shit, hope you have backups, and here's a coupon for a new Synology unit."

jchw · on Dec 3, 2024

> btrfs was seven years old at that point and declared "stable" three years before that.

The on-disk format was declared stable in 2013[1]. That just meant that barring an act of God, they were not going to break the on-disk format, e.g. a filesystem created at that point would continue to be mountable for the foreseeable future. It was not a declaration that the filesystem was itself now stable necessarily, but especially was not suggesting that all of the features were stable. (As far as I know, many features still carried warning labels.)

Furthermore, the "it's been X years!" thing referring to open source projects has to stop. This is the same non-sense that happens with every other thing that is developed in the open. Who cares? What matters isn't how long it took to get here. What matters is where it's at. I know there's going to be some attempt at rationalizing this bit, but it's wasted on me because I'm tired of hearing this.

> ZFS is an example of amazingly written code by awesome engineers. It's simple to manage, scales well, and easy to grok.

Agreed. But ZFS was written by developers at Sun Microsystems for their commercial UNIX. We should all be gracious to live in a world where Sun Microsystems existed. We should also accept that Sun Microsystems is not the standard any more than Bell Labs was the standard, they are extreme outliers. If we measure everything based on whether it's as good as what Sun Microsystems was doing in the 2000s, we're going to have a bad time.

As an example, DTrace is still better than LTTng is right now. I hope that sinks in for everyone.

However, OpenZFS is not backed by Sun Microsystems, because Sun Microsystems is dead. Thankfully and graciously at that, it has been maintained for many years by volunteers, including at least one person who worked on ZFS at Sun. (Probably more, but I only know of one.)

Now if OpenZFS eats your data, there is no big entity to go to anymore than there is for btrfs. As far as I know, there's no big entity funding development, improvements, or maintenance. That's fine, that's how many filesystems are. But still, that's not what propelled ZFS to where it stood when Sun was murdered.

> btrfs sadly will go the wayside once bcachefs reaches maturity.

I doubt it will disappear quickly: it will probably continue to see ongoing development. Open Source is generally pretty good at keeping things alive in a zombie state. That's pretty important since it is typically non-trivial to do online conversion of filesystems. (Of course, we're in a thread about a tool that does seamless offline conversion of filesystems, which is pretty awesome and impressive in and of itself.)

But for what it's worth, I am fine with bcachefs supplanting btrfs eventually. It seems like it had a better start, it benchmarks faster, and it's maturing nicely. Is it safer today? Depends on who you ask. But it's hard to deny that it doesn't seem like the point at which bcachefs will be considered stable by most will take more than a year or two tops, assuming kernel drama doesn't hold back upstream.

Should users trust bcachefs with their data? I think you probably can right now with decent safety, if you're using mainline kernels, but bcachefs is still pretty new. Not aware of anyone using it in production yet. It really could use a bit more time before recommending people jump over to it.

> I wouldn't trust btrfs for important data, and neither should you.

I stand by my statement: you should always ensure you have sufficient backups for important data, but most users should absolutely fear hardware failures more than btrfs bugs. Hardware failures are an if, not a when. Hardware will always fail eventually. Data-eating btrfs bugs have certainly existed, but it's not like they just appear left and right. When such a bug appears, it is often newsworthy, and usually has to do with some unforeseen case that you are not so likely to run into by accident.

Rather than lose data, btrfs is instead more likely to just piss you off by being weird. There are known quirks that probably won't lose you any data, but that are horribly annoying. It is still possible, to my knowledge, to get stuck in a state where the filesystem is too full to delete files and the only way out is in recovery. This is pretty stupid.

It's also not particularly fast, so if someone isn't looking for a feature-rich CoW filesystem with checksums, I strongly recommend just going with XFS instead. But if you run Linux and you do want that, btrfs is the only mainline game in town. ZFS is out-of-tree and holds back your kernel version, not to mention you can never really ship products using it (with Linux) because of silly licensing issues.

> If you experience data loss on a Synology box, the answer you'll get from them is "tough shit, hope you have backups, and here's a coupon for a new Synology unit."

That suggests that their brand image somewhat depends on the rarity of btrfs bugs in their implementation, but Synology has a somewhat good reputation actually. If anything really hurts their reputation, it's mainly the usual stuff (enshittification.) The fact that DSM defaults to using btrfs is one of the more boring things at this point.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

anonfordays · on Dec 15, 2024

This is clearly gish gallop, but I'll reply.

>The on-disk format was declared stable in 2013

And the mkfs.btrfs tool was labeled stable, and the experimental tag was removed from mkfs.btrfs warnings. It was also mainlined into the stable branch at that point. It wasn't just the on-disk format that was declared stable in 2013.

>It was not a declaration that the filesystem was itself now stable necessarily,

Yes it was, as explained above.

>Furthermore, the "it's been X years!" thing referring to open source projects has to stop.

This isn't some "open source project," this is the operating system that underpins modern computing. It's developed by full time engineers paid by megacorps. It's not some solo developer's side project, not in the slightest.

>If we measure everything based on whether it's as good as what Sun Microsystems was doing in the 2000s, we're going to have a bad time.

Not at all, we're going to have a great time. Great engineering stands the test of time.

>As an example, DTrace is still better than LTTng is right now. I hope that sinks in for everyone.

Did you get ChatGPT to write this? It's clear you have zero experience in the tracing and observability space. eBPF is the next evolution, not LTTng. It's never been and LTTng, LTTng has minuscule usage. I hope you're ignorance is clear for everyone reading this.

>However, OpenZFS is not backed by Sun Microsystems, because Sun Microsystems is dead.

No, it's maintained by dozens of engineers from multiple different companies and governmental orgaizations.

>Now if OpenZFS eats your data, there is no big entity to go to anymore than there is for btrfs.

Wrong, Canonical supports ZFS.

>As far as I know, there's no big entity funding development, improvements, or maintenance.

LLNL (the US government), among others, are funding the development of OpenZFS.

>That suggests that their brand image somewhat depends on the rarity of btrfs bugs in their implementation, but Synology has a somewhat good reputation actually.

No, Synology has a terrible reputation. They removed btrfs functionality from budget models and left customers in the dust: https://news.ycombinator.com/item?id=26800062

danudey · on Dec 2, 2024

I agree with what you say, and I would never trust btrfs with my data because of issues that I've seen in the past, My last job I installed my Ubuntu desktop with btrfs and within three days it had been corrupted so badly because of a power outage that I had to completely wipe and reinstall the system.

That said:

> but cause the filesystem to panic with error messages for which there is no documentation and little clue what the problem is.

The one and only time I experimented with ZFS as a root filesystem I got bit in the ass because the zfs tools one day added a new feature flag to the filesystem that the boot loader (grub) didn't understand and therefore it refused to read the filesystem, even read-only. Real kick in the teeth, that one, especially since the feature flag was completely irrelevant to just reading enough of the filesystem for the boot loader to load the kernel and there was no way to override it without patching grub's zfs module on another system then porting it over.

Aside from that, ZFS has been fantastic, and now that we're all using UEFI and our kernels and initrds are on FAT32 filesystems I'm much less worried, but I'm still a bit gunshy. Not as much as with BTRFS, mind you, but somewhat.

cmurf · on Dec 3, 2024

Meta (Facebook) has millions of instances of Btrfs in production. More than any other filesystem by far. A few years ago when Fedora desktop variants started using Btrfs by default, Meta’s experience showed it was no less reliable than ext4 or XFS.

stuaxo · on Dec 2, 2024

I list data on btrfs on a raspberry pi with a slightly dodgy PSU.

We need more testing of filesystems and pulling the power.

I switched to a NAS with battery backup and it's been better.

So that was inconclusive, before that the last time I lost data like that was to Reiserfs in the early 2000s.

eru · on Dec 3, 2024

> Please go look up survivor bias. That's what all you btrfs fanboys don't seem to understand. It doesn't matter how well it has worked for 99.9% of you. Filesystems have to be the most reliable component in an operating system.

Not sure. It's useful if they are reliable, but they only need to be roughly as reliable as your storage media. If your storage media breaks down once in a thousand years (or once a year for a thousand disks), then it doesn't matter much if your filesystem breaks down once in a million years or once in a trillion years.

That being said, I had some trouble with BTRFS.