i’ve installed opensuse tumbleweed a bunch of times in the last few years, but i always used ext4 instead of btrfs because of bad experiences with btrfs nearly a decade ago: every time, with no exceptions, the partition would crap itself into an irrecoverable state

this time around i figured that, since so many years had passed since i last tried btrfs, the filesystem would be in a more reliable state, so i decided to try it again on a new opensuse installation. already, right after installation, os-prober failed to set up opensuse’s entry in grub, but maybe that’s on me, since my main system is debian (turns out the problem was due to btrfs snapshots)

anyway, after a little more than a week, the partition turned read-only in the middle of a large compilation and then, after i rebooted, the partition died and was irrecoverable. could be due to some bad block or read failure from the hdd (it is supposedly brand new, but i guess it could be busted), but shit like this never happens to me on extfs, even if the hdd is literally dying. also, i have an ext4 and a ufs partition on the same hdd without any issues.

even if we suppose this is the hardware’s fault and not btrfs’s, shouldn’t a file system be a little bit more resilient than that? at this rate, i feel like a cosmic ray could set off a btrfs corruption. i hear people claim all the time how mature btrfs is and that it no longer makes sense to create new ext4 partitions, but either i’m extremely unlucky with btrfs or the system is in a fucking perpetual beta state and it will never change, because it is just good enough for companies who, in the case of a partition failure, can just quickly swap the old hdd for a new one and copy the nightly backup over to it

in any case, i am never going to touch btrfs ever again and i’m always going to advise people to choose ext4 instead of btrfs

  • Markaos@discuss.tchncs.de
    1 month ago

    My two cents: the only time I had an issue with Btrfs, it refused to mount without using a FS repair tool (and was fine afterwards, and I knew which files needed to be checked for possible corruption). When I had an issue with ext4, I didn’t know about it until I tried to access an old file and it was 0 bytes - a completely silent corruption I found out probably months after it actually happened.

    Both filesystems failed, but one at least notified me about it, while the second just “pretended” everything was fine while it ate my data.

  • ikidd@lemmy.world
    1 month ago

    I have had small issues with btrfs over the years, but nothing like the dataloss issues people reported a few years back that the devs supposedly fixed. Its scrubbing mechanism doesn’t work great, and the failure modes on RAID are fucking goofy. I wouldn’t trust it for raid at all, and they’ve never really fixed the bugs that have been exposed over the years.

    Frankly, it does everything worse than ZFS except for being in the kernel. DKMS isn’t that hard and I’ve never had a ZFS build hook fail. The only thing I use btrfs for is cattle computers that I can nuke and pave at will, and most of those could use ext4 just fine, but that’s what Fedora uses by default and I can’t be arsed to partition manually.

  • Kualk@lemm.ee
    24 days ago

    I switched to XFS.

    The most important feature to me is support for file deduplication, which XFS supports through reflinks. BTRFS supports reflinks as well.

    Snapshot in BTRFS seems like the most desirable feature, but in real life I ended up not using it.

    I usually prefer drive mirror setup, but it can give its own headaches.

    These days I simply have 2 disks, and a nightly rsync job copies the content of one drive to the other. This protects against drive failure.

    An rclone job sends the most important data to an offsite backup.

    The biggest loss is the lack of data checksums, but that is a feature few filesystems besides BTRFS offer, and most manage without it.

    I don’t have a setup to expand a partition beyond one drive. It comes with its own headaches. I simply use large enough disks.
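
As a rough illustration of the reflink mechanism mentioned above: on Linux, a reflink copy can be requested with the `FICLONE` ioctl, which asks the filesystem to share the source file's extents with the destination instead of duplicating the data. The sketch below is a minimal example, not how any particular tool does it; the function name `reflink_or_copy` is made up for illustration, and the fallback to a plain copy is an assumption about what you would want on filesystems without reflink support (such as ext4).

```python
import fcntl
import shutil

# FICLONE ioctl request number from <linux/fs.h>: _IOW(0x94, 9, int).
FICLONE = 0x40049409


def reflink_or_copy(src, dst):
    """Clone src to dst by sharing extents; fall back to a full copy.

    Returns True if the filesystem made a reflink, False if data was copied.
    (Hypothetical helper for illustration only.)
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # Ask the filesystem to make dst share src's data blocks.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return True
        except OSError:
            # No reflink support here (or src and dst are on different
            # filesystems): copy the bytes the ordinary way instead.
            shutil.copyfileobj(fsrc, fdst)
            return False
```

On Btrfs, or on XFS formatted with reflink support, the clone should return almost immediately and use no extra space until one of the two files is modified.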

  • BCsven@lemmy.ca
    1 month ago

    My system has been btrfs since 2017. No issues. Maybe you have random power loss?

    • dwt@feddit.org
      1 month ago

      You know, protecting against power loss was the major feature of filesystems in a time gone by…

      • Atemu@lemmy.ml
        1 month ago

        It only works if the hardware doesn’t lie about write barriers. If it says it’s written some sectors, btrfs assumes that reading any of those sectors will return the written data rather than the data that was there before. What’s important here isn’t that the data will forever stay intact but the ordering. Once a metadata generation has been written to disk, btrfs waits on the write barrier and only updates the superblock (the final metadata “root”) afterwards.

        If the system loses power while the metadata generation is being written, all is well because the superblock still points at the old generation as the write barrier hasn’t passed yet. On the next boot, btrfs will simply continue with the previous generation referenced in the superblock which is fully committed.
        If the hardware lied about the write barrier before the superblock update though (e.g. for performance reasons) and has only written e.g. half of the sectors containing the metadata generation but did write the superblock, that would be an inconsistent state which btrfs cannot trivially recover from.

        If that promise is broken, there’s nothing btrfs (or ZFS for that matter) can do. Software cannot reliably protect against this failure mode.
        You could mitigate it by waiting some amount of time which would reduce (but not eliminate) the risk of the data before the barrier not being written yet but that would also make every commit take that much longer which would kill performance.

        It can reliably protect against power loss (bugs notwithstanding) but only if the hardware doesn’t lie about some basic guarantees.
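
The ordering argument above can be played through with a toy model. This is only a sketch under stated assumptions: `Disk`, `commit`, `mount` and `crash_after_superblock_write` are hypothetical names invented for this example and have nothing to do with btrfs's actual code. The model assumes a drive with a volatile write cache, a flush that acts as the write barrier, and a `survivors` list standing in for whichever cached sectors the drive happened to write out on its own before power was cut.

```python
N_NODES = 4  # metadata blocks per generation (arbitrary number for the demo)


class Disk:
    """Toy disk with a volatile write cache (hypothetical model, not a real API)."""

    def __init__(self, honest_flush=True):
        self.persistent = {}  # sectors that survive power loss
        self.cache = {}       # sectors acknowledged but held only in the cache
        self.honest_flush = honest_flush

    def write(self, sector, data):
        self.cache[sector] = data

    def flush(self):
        """Write barrier: an honest disk makes all cached writes durable."""
        if self.honest_flush:
            self.persistent.update(self.cache)
            self.cache.clear()
        # A lying disk reports success here without persisting anything.

    def power_loss(self, survivors=()):
        """Drop the cache, except sectors the drive happened to write out itself."""
        for sector in survivors:
            if sector in self.cache:
                self.persistent[sector] = self.cache[sector]
        self.cache.clear()


def write_generation(disk, gen):
    # Copy on write: a new generation goes to fresh sectors, the old one stays.
    for i in range(N_NODES):
        disk.write(("meta", gen, i), f"node {i} of generation {gen}")


def commit(disk, gen):
    write_generation(disk, gen)
    disk.flush()                   # barrier: metadata must be durable first
    disk.write("superblock", gen)  # only then repoint the superblock
    disk.flush()


def mount(disk):
    gen = disk.persistent["superblock"]
    missing = [i for i in range(N_NODES) if ("meta", gen, i) not in disk.persistent]
    if missing:
        raise RuntimeError(f"generation {gen} is missing metadata nodes {missing}")
    return gen


def crash_after_superblock_write(honest_flush, survivors):
    disk = Disk(honest_flush=True)
    commit(disk, 1)                # generation 1 is fully on disk
    disk.honest_flush = honest_flush
    write_generation(disk, 2)
    disk.flush()                   # the barrier the filesystem waits on
    disk.write("superblock", 2)
    disk.power_loss(survivors)     # power fails before the final flush
    return mount(disk)
```

With an honest flush, the mount ends up on generation 1 or generation 2 depending on whether the superblock write survived, and both are complete. With a lying flush, a surviving superblock plus only some of the generation 2 nodes leaves the superblock pointing at a half-written tree, and `mount` raises.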

        • FuckBigTech347@lemmygrad.ml
          1 month ago

          I had a drive where data would get silently corrupted after some time no matter what filesystem was on it. Machine’s RAM tested fine. Turned out the write cache on the drive was bad! I was able to “fix” it by disabling the cache via hdparm until I was able to replace that drive.

  • Atemu@lemmy.ml
    1 month ago

    “could be due to some bad block or read failure from the hdd (it is supposedly brand new, but i guess it could be busted)”

    I’d suspect the controller or cable first.

    “shit like this never happens to me on extfs, even if the hdd is literally dying”

    You say that as if it’s a good thing. If your HDD is “literally dying”, you want the filesystem to fail safe to make you (and applications) aware and not continue as if nothing happened. extfs doesn’t fail here because it cannot even detect that something is wrong.

    btrfs has its own share of bugs but, in theory, this is actually a feature.

    “i have an ext4 and an ufs partition in the same hdd without any issues.”

    Not any issue that you know of. For all extfs (and, by extension, you) knows, the disk/cable/controller/whatever could have mangled your most precious files and it would be none the wiser; happily passing mangled data to applications.

    You have backups of course (right?), so that’s not an issue, you might say. But if the filesystem’s integrity is compromised, that can propagate to your backups because the backup tool reading those files is none the wiser too; it relies on the filesystem to return the correct data. If you don’t manually verify each and every file on a higher level (e.g. manual inspection or hashing) and you prune old backups, this has potential for actual data loss.

    If your hardware isn’t handling the storage of data as it should, you want to know.

    “even if we suppose this is the hardware’s fault and not btrfs’s, should a file system be a little bit more resilient than that? at this rate, i feel like a cosmic ray could set off a btrfs corruption.”

    While the behaviour upon encountering an issue is in theory correct, btrfs is quite fragile. Hardware issues shouldn’t happen but when they happen, you’re quite doomed because btrfs doesn’t have the option to continue despite the integrity of a part of it being compromised.
    btrfs-restore disables btrfs’ integrity checks, emulating extfs’s failure mode, but it’s only for extracting files from the raw disks, not for continuing to use it as a filesystem.

    I don’t know enough about btrfs to know whether this is feasible but perhaps it could be made a bit more log-structured such that old data is overwritten first which would allow you to simply roll back the filesystem state to a wide range of previous generations, of which some are hopefully not corrupted. You’d then discard the newer generations which would allow you to keep using the filesystem.
    You’d risk losing data that was written since that generation of course but that’s often a much lesser evil. This isn’t applicable to all kinds of corruption because older generations can become corrupted retroactively of course but at least a good amount of them I suspect.
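
The "verify each and every file on a higher level" point above can be made concrete with a hash manifest. What follows is a minimal sketch, assuming SHA-256 and a plain dictionary as the manifest format; the function names are made up for illustration. Note that it cannot tell silent corruption from a legitimate edit, so it is only meaningful for files that are not supposed to change between runs.

```python
import hashlib
import os


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_sha256(full)
    return manifest


def mismatches(root, manifest):
    """Return sorted relative paths that are missing or whose content changed."""
    bad = []
    for rel, expected in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or file_sha256(full) != expected:
            bad.append(rel)
    return sorted(bad)
```

One way to use it would be to build the manifest once, store it somewhere other than the disk being checked, and run `mismatches` before each backup so that a mangled file is noticed before old backups are pruned.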

    • beleza pura@lemmy.eco.brOP
      1 month ago

      as i said, maybe that’s the ideal for industrial/business applications (e.g. servers, remote storage) where the cost of replacing disks due to failure is already accounted for and the company has a process ready and pristine data integrity is of utmost importance, but for home use, reliability of the hardware you do have right now is more important than perfect data integrity, because i want to be as confident as possible that my system is going to boot up next time i turn it on. in my experience, i’ve never had any major data loss in ext4 due to hardware malfunction. also, most files on a filesystem are replaceable anyway (especially the system files), so it makes even less sense to install your system on a btrfs drive from that perspective.

      what you’re telling me is basically “btrfs should never be advised for home use”

      • Ephera@lemmy.ml
        1 month ago

        I mean, as someone who hasn’t encountered these same issues as you, I found btrfs really useful for home use. The snapshotting functionality is what gives me a safe feeling that I’ll be able to boot my system. On ext4, any OS update could break your system and you’d have to resort to backups or a reinstall to fix it.

        But yeah, it’s quite possible that my hard drives were never old/bad enough that I ran into major issues…

  • dingdongitsabear@lemmy.ml
    1 month ago

    I realize this is a rant but you coulda included hardware details.

    I’m gonna contrast your experience with about 300 or so installs I did in the last couple of years, all on btrfs, 90% fedora, 9% ubuntu and the rest debian and mint and other stragglers, nothing but the cheapest and trashiest SSDs money can buy, the users are predominantly linux illiterate. I also run all my stuff (5 workstations and laptops) exclusively on btrfs and have done so for 5+ years. not one of those manifested anything close to what you’re describing.

    so I hope the people that get your recommendations also take into consideration your sample size.

  • sadTruth@lemmy.hogru.ch
    1 month ago

    I’ve been running BTRFS on multiple PCs and laptops for about 8-10 years, and i’ve had 2 incidents:

    1. Cheap SSD: BTRFS reported errors, half a year later the SSD failed and never worked again.
    2. Unstable RAM: BTRFS reported errors, i did a memtest and found RAM was unstable.

    I’ve been using BTRFS RAID0 for about 6 years. Even there, i had 0 issues. In all those years BTRFS snapshotting has saved me countless hours when i accidentally misconfigured a program or did an accidental rm -r ~/xyz.

    For me the real risk in BTRFS comes from snapper, which takes snapshots even when the disk is almost full. This has resulted in multiple systems not booting because there was no space left. That’s why i prefer Timeshift for anything but my main PC.
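
One possible guard against the disk-full scenario described above is to check free space before a snapshot is taken. The sketch below is an assumption about how such a check might look, not something snapper or Timeshift is known to do; the function names and the 15% threshold are arbitrary choices for illustration. Free-space figures reported for btrfs are estimates, so the threshold should be treated as a rough safety margin.

```python
import shutil


def has_headroom(total_bytes, free_bytes, min_free_fraction=0.15):
    """True if at least min_free_fraction of the filesystem is still free."""
    if total_bytes <= 0:
        return False
    return free_bytes / total_bytes >= min_free_fraction


def safe_to_snapshot(mount_point="/", min_free_fraction=0.15):
    """Check the filesystem holding mount_point before taking a snapshot."""
    usage = shutil.disk_usage(mount_point)
    return has_headroom(usage.total, usage.free, min_free_fraction)
```

A wrapper script could call `safe_to_snapshot()` and skip the snapshot (or clean up old ones first) when it returns False.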

  • unknowing8343@discuss.tchncs.de
    1 month ago

    Been using BTRFS since I learned I could squeeze more data on my cheap-ass drive and… It’s been 3 years, no problem at all, and I have backups anyway.

  • Yozul@beehaw.org
    1 month ago

    I mean, unless you really like one of the weird bells and whistles btrfs supports, ext4 is just faster and more reliable. If you don’t have weird power user needs then anything else is a downgrade. Even ZFS really only makes a significant difference if you’re moving around gigabytes of data on a daily basis. If you’re on a BSD anyway feel free to go for it, but for most people there is no real benefit. Every other fancy new file system is just worse for typical desktop use cases. People desperately want to replace ext4 because it’s old, but there’s just really nothing to gain from it. Sometimes simple and reliable is good.

      • Yozul@beehaw.org
        1 month ago

        Copy on write is pretty overrated for most use cases. It’d be nice to have, but I don’t find it’s worth the bother. Disk compression and snapshots have had solutions for longer than btrfs has existed, so I don’t understand why I’d want to cram them into an otherwise worse file system and call it an improvement. I will admit that copy on write and snapshots do at least have a little synergy together, but storage has gotten to be one of the cheapest parts of a computer. I’d rather just have a real backup.

        • ProgrammingSocks@pawb.social
          1 month ago

          Myself and many others have found lots of use in these features. If it’s not important to you that’s fine, but there ARE reasons many of us default to btrfs now.

          • Yozul@beehaw.org
            30 days ago

            Sure, if it’s making your life easier or making you happy or whatever, then have it. Don’t let me yuck your yum. I just think it doesn’t provide any real benefit for most people. Am I not allowed to talk about my opinion?

    • ReversalHatchery@beehaw.org
      1 month ago

      being able to revert a failed upgrade by restoring a snapshot is not a power user need but a very basic feature for everyday users who do not want to debug every little problem that can go wrong, but just want to use their computer.

      ext4 does not allow that.

      • Yozul@beehaw.org
        1 month ago

        You know file systems are not the only way to do that, right? Heck, Timeshift is explicitly designed to do that easily and automatically without ever even having to look at a command line. Backup before upgrade is a weird thing to cram into a file system.

        • ReversalHatchery@beehaw.org
          1 month ago

          “Timeshift is explicitly designed to do that easily and automatically”

          by consuming much more space. but you’re right, I did not think about it

          “Backup before upgrade is a weird thing to cram into a file system.”

          I agree, but these are not really backups, but snapshots, which are stored more efficiently, without duplicating data. of course it does not replace an off site backup, but I think it has its use cases.

      • Unmapped@lemmy.ml
        1 month ago

        By using NixOS I can do this on ext4. Just reboot back to the previous image before the update. Not saying everyday users should be running NixOS, but there are other immutable distros that can do the same.

        • ReversalHatchery@beehaw.org
          1 month ago

          will that also restore your data? what happens when a program updates its database structure with the update, and the old version you restore won’t understand it anymore?

          • Unmapped@lemmy.ml
            1 month ago

            That is a good point. I’ve only had to roll back twice and neither time had any issues. But from my understanding of how it works, you are correct, the data wouldn’t roll back.

            • ReversalHatchery@beehaw.org
              1 month ago

              I’ve learned this lesson with my Android phone a few years ago. There it was actually about sqlite databases of a system app (contacts I think?), but this can happen with other formats too. Worst is if it doesn’t just error out, but tries to work with “garbage”, because it’ll possibly take much more time to debug it, or even realize that your data is corrupt.

      • WalnutLum@lemmy.ml
        1 month ago

        Typically when there are “can’t mount” issues with btrfs it’s because the write log got corrupted, and memory errors are usually the cause.

        BTRFS needs a clean write log to guarantee the state of the blocks to put the filesystem overlay on top of, so if it’s corrupted btrfs usually chooses to not mount until you do some manual remediations.

        If the data verification stuff seems more of a pain in the ass than it’s worth you can turn most of those features off with mount options.

        • beleza pura@lemmy.eco.brOP
          1 month ago

          oh wow, that’s crazy. thanks for the info, but it’s a little fucked up that btrfs can make a memory failure cause a filesystem corruption

          • Atemu@lemmy.ml
            1 month ago

            It’s the other way around: The memory failure causes the corruption.

            Btrfs is merely able to detect it while e.g. extfs is not.

  • yum13241@lemm.ee
    1 month ago

    I literally daily drive btrfs. Just don’t use a crappy drive or raid5/raid6.

    • FuckBigTech347@lemmygrad.ml
      1 month ago

      BTRFS RAID5/6 is fine as long as you don’t run into a scenario where your machine crashes and there was still unwritten data in the cache. Also write performance sucks and scrubbing takes an eternity.

      • yum13241@lemm.ee
        18 days ago

        Just do a search on your favorite search engine for “btrfs raid5/6 write hole bug” and you’ll see. If power gets cut, any file on the set of disks could be missing, or just contain a bunch of garbage.

        • FuckBigTech347@lemmygrad.ml
          16 days ago

          That’s literally what I’m saying: it’s fine as long as there isn’t any unwritten data in the cache when the machine crashes or suddenly loses power. RAID controllers have a battery-backed write cache for this reason, because traditional RAID5/6 has the same issue.
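
The write hole discussed in this subthread comes down to XOR parity going stale. The sketch below is a simplified model, assuming a single stripe with two data blocks and one parity block; the function names are invented for illustration and this is not how btrfs or any RAID controller is implemented. It shows why a block that was never written (`d1`) can come back as garbage: if power is cut after the new `d0` lands but before the matching parity does, rebuilding `d1` from `d0` and the stale parity gives the wrong bytes.

```python
def xor_blocks(a, b):
    """Byte-wise XOR of two equal-length blocks (RAID5-style parity)."""
    return bytes(x ^ y for x, y in zip(a, b))


def rebuild_after_interrupted_write(update_parity):
    """Update d0 in a two-data-disk stripe, then lose the disk holding d1.

    Returns (original d1, d1 as rebuilt from the surviving d0 and parity).
    """
    d0, d1 = b"AAAA", b"BBBB"
    parity = xor_blocks(d0, d1)

    d0 = b"CCCC"                     # the new data block reaches its disk
    if update_parity:
        parity = xor_blocks(d0, d1)  # the matching parity write also lands
    # Otherwise: crash or power cut here, so parity still describes the old d0.

    rebuilt_d1 = xor_blocks(d0, parity)  # disk with d1 fails; rebuild it
    return d1, rebuilt_d1
```

With `update_parity=True` the rebuilt block matches the original; with `update_parity=False` it does not, even though `d1` itself was never touched. A battery-backed write cache avoids this by making sure the pending parity write can still complete after power returns.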

  • blackstrat@lemmy.fwgx.uk
    1 month ago

    You’re right to give up on btrfs. It’s been so long in development and it just isn’t ready. Ext4 or ZFS are mature and excellent file systems. There’s no need for btrfs these days. It always has and always will disappoint.

    Everyone singing its praises is the sysadmin equivalent of the software engineer yelling ‘it works on my machine’ when a user finds an issue.

    • Liam Mayfair@lemmy.sdf.org
      1 month ago

      I can’t comment on its server use cases or exotic workstation setups with RAID, NAS, etc. but I’ve been running Fedora on Btrfs for quite a few years now and I’ve had zero issues with it. Am I deliberately using all of its features like CoW, compression, snapshots…? No, but neither would your average Linux user who just wants something that works, like ext4.

      I don’t miss ext4, Btrfs worked for me since day 1.

        • cool_pebble@aussie.zone
          1 month ago

          Aren’t we all? Aren’t Ext4 and ZFS considered mature because so many people have said “it works on my machine”?

          I agree this person’s experience may contrast with your own, but I don’t think the fact that something has worked well for some people, and perhaps not for yourself, is a reason to discount it entirely.