FYI: ZAP HOSTING - Irreparable Raid Failure of Host System (DE)

Ympker (OG, Content Writer)
edited June 28 in General

Not sure if anyone else is affected by this, but I just received this message:

Irreparable raid failure of the host system
28.06.24
Unfortunately, we have to inform you that the raid system of the host system has failed. All attempts to rebuild and restore the raid have failed. We hope that you have used our free backup storage for backups or made local backups.
As compensation for the inconvenience, we will credit you with 15 EUR ZAP credit.

I've already got 27€ in credits for system failures/downtimes lol. Since I wasn't running anything important there, I'll just have to rebuild the VPS sometime.

Comments

  • Out of curiosity, is this on your lifetime system?

  • Ympker (OG, Content Writer)

    @Wonder_Woman said:
    Out of curiosity, is this on your lifetime system?

    Not on Finland location or webhosting

  • @Ympker said:
    All attempts to rebuild and restore the raid have failed

    Also curious: do you happen to know which RAID level they were using?
    Unrepairable most likely means either RAID 0 or multiple drive failure (beyond mirror or parity protection).
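
    To put rough numbers on that, here's a minimal sketch of how many simultaneous drive failures each common RAID level is guaranteed to survive. These are textbook worst-case figures only; the function names are made up for illustration, and nothing here says which level ZAP actually used.

```python
# Worst-case number of simultaneous drive failures an array is guaranteed
# to survive. Textbook figures for standard levels; controller faults,
# firmware bugs, and correlated failures are not modelled.

MIN_DRIVES = {"raid0": 2, "raid1": 2, "raid5": 3, "raid6": 4, "raid10": 4}


def guaranteed_tolerance(level: str, drives: int) -> int:
    """Return how many drive failures the array survives in the worst case."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unknown RAID level: {level}")
    if drives < MIN_DRIVES[level]:
        raise ValueError(f"{level} needs at least {MIN_DRIVES[level]} drives")
    if level == "raid10" and drives % 2:
        raise ValueError("raid10 with two-way mirrors needs an even drive count")

    if level == "raid0":
        return 0            # striping only: any single failure loses the array
    if level == "raid1":
        return drives - 1   # every drive holds a full copy
    if level == "raid5":
        return 1            # single parity
    if level == "raid6":
        return 2            # double parity
    return 1                # raid10: worst case is losing both halves of one mirror


def survives_worst_case(level: str, drives: int, failed: int) -> bool:
    """True if the array is guaranteed to survive `failed` dead drives."""
    return failed <= guaranteed_tolerance(level, drives)


if __name__ == "__main__":
    for level, drives, failed in [
        ("raid0", 2, 1),
        ("raid1", 2, 1),
        ("raid1", 2, 2),
        ("raid5", 4, 2),
    ]:
        ok = survives_worst_case(level, drives, failed)
        print(f"{level} with {drives} drives, {failed} failed: "
              f"{'recoverable' if ok else 'not guaranteed recoverable'}")
```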

    As compensation for the inconvenience, we will credit you with 15 EUR ZAP credit.

    Noooice! The clients' data is clearly considered high-value.

    Thanked by (1)Ympker
  • MikeA (Hosting Provider, OG)

    @DataRecovery said:

    @Ympker said:
    All attempts to rebuild and restore the raid have failed

    Also curious: do you happen to know which RAID level they were using?
    Unrepairable most likely means either RAID 0 or multiple drive failure (beyond mirror or parity protection).

    If they were using Samsung drives I wouldn't be surprised. I've had two or three servers in the last two months that had RAID-1 and both Samsung NVMe drives died at the same time in both instances, particularly during power loss or soft reboots. Almost as if the memory controller died. Also had another Samsung drive (unrelated) that wasn't in any important server. All 980 Pro. Since then I've removed and replaced any Samsung drives that aren't recently manufactured 990s or 980 Pros that are well tested and on the latest firmware.

    If they were running RAID-0 that's bad but I kinda doubt they would be, especially with how affordable both NVMe M.2 and U.2 drives have become.

  • Ympker (OG, Content Writer)

    @DataRecovery said:

    @Ympker said:
    All attempts to rebuild and restore the raid have failed

    Also curious: do you happen to know which RAID level they were using?
    Unrepairable most likely means either RAID 0 or multiple drive failure (beyond mirror or parity protection).

    As compensation for the inconvenience, we will credit you with 15 EUR ZAP credit.

    Noooice! The clients' data is clearly considered high-value.

    I don't know which RAID level they were using, unfortunately. I do think 15€ is quite a generous amount in this case, given that it's a lifetime service. What's less generous, and a bit troublesome, is that those 15€ (just like any ZAP credit) can only be spent on products with a recurring cost (e.g. monthly/yearly) or on prepaid products (but not lifetime ones), iirc. I'd argue that, normally, you should be free to spend your credit "balance" any way you want (e.g. on lifetime product upgrades/purchases, or even by paying out the balance to a bank account).

  • @MikeA said: Also had another Samsung drive (unrelated) that wasn't in any important server. All 980 Pro.

    Did you update the 980 Pro firmware?
    There is a bug that makes them read-only after some hours https://www.tomshardware.com/news/samsung-980-pro-ssd-failures-firmware-update but that was "fixed" in a 2023 firmware update...

    Thanked by (1)Ympker


  • Advin (Hosting Provider)

    @Jab said:

    @MikeA said: Also had another Samsung drive (unrelated) that wasn't in any important server. All 980 Pro.

    Did you update the 980 Pro firmware?
    There is a bug that makes them read-only after some hours https://www.tomshardware.com/news/samsung-980-pro-ssd-failures-firmware-update but that was "fixed" in a 2023 firmware update...

    It’s still a complete mess; even the 990 Pro had some health abnormalities. Recently produced drives with updated firmware have been okay in our experience, but many years ago we ran into so many failures on Samsung consumer NVMe.

    Almost every hypervisor we have for our VPS lineup uses enterprise NVMe. The performance is so much better and more stable, and the endurance is crazy. I can write 1 PB and the wearout will still be between 0 and 1%. Never had a disk failure on a U.2/U.3 enterprise disk either :)
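
    For a rough sense of the endurance gap, here's a back-of-the-envelope sketch. TBW = DWPD × capacity × 365 × warranty years is the usual conversion between the two rating styles. The two drives below are made-up examples, not the specs of any particular model, and the wearout value a drive reports comes from its own firmware, so it won't necessarily match this linear estimate.

```python
def rated_tbw(dwpd: float, capacity_tb: float, warranty_years: float = 5.0) -> float:
    """Convert a drive-writes-per-day rating into total terabytes written."""
    return dwpd * capacity_tb * 365 * warranty_years


def percent_of_rating_used(written_tb: float, rating_tbw: float) -> float:
    """Share of the rated endurance consumed, as a simple linear estimate."""
    if rating_tbw <= 0:
        raise ValueError("rating_tbw must be positive")
    return 100.0 * written_tb / rating_tbw


if __name__ == "__main__":
    written_tb = 1000.0  # 1 PB written, as in the post above

    # Hypothetical consumer drive: assumed to be rated for 600 TBW.
    consumer_rating = 600.0

    # Hypothetical enterprise drive: assumed 7.68 TB, 3 DWPD, 5-year warranty.
    enterprise_rating = rated_tbw(dwpd=3, capacity_tb=7.68)

    for name, rating in [("consumer", consumer_rating),
                         ("enterprise", enterprise_rating)]:
        used = percent_of_rating_used(written_tb, rating)
        print(f"{name}: rated {rating:.0f} TBW, {used:.1f}% of rating used")
```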

    Thanked by (1)MikeA

    I am a representative of Advin Servers

  • Advin (Hosting Provider)
    edited June 29

    @DataRecovery said:

    @Ympker said:
    All attempts to rebuild and restore the raid have failed

    Also curious: do you happen to know which RAID level they were using?
    Unrepairable most likely means either RAID 0 or multiple drive failure (beyond mirror or parity protection).

    As compensation for the inconvenience, we will credit you with 15 EUR ZAP credit.

    Noooice! The clients' data is clearly considered high-value.

    RAID is never a backup; there are lots of things that can go wrong, even in a RAID 1/10 configuration.

    Thanked by (1)Ympker


  • Ympker (OG, Content Writer)

    @Advin said:

    @DataRecovery said:

    @Ympker said:
    All attempts to rebuild and restore the raid have failed

    Also curious: do you happen to know which RAID level they were using?
    Unrepairable most likely means either RAID 0 or multiple drive failure (beyond mirror or parity protection).

    As compensation for the inconvenience, we will credit you with 15 EUR ZAP credit.

    Noooice! The clients' data is clearly considered high-value.

    RAID is never a backup; there are lots of things that can go wrong, even in a RAID 1/10 configuration.

    The OVH datacenter incident showed that RAID can never be the only safeguard for crucial data.

  • That's why you don't mess with that hoster in the first place.

  • Well, I'm not affected so I don't get 15 EUR ZAP credit.

    Anyway @Ympker, does your lifetime LXC show high CPU load when viewed in htop or HetrixTools?

    Thanked by (1)Ympker
  • Ympker (OG, Content Writer)

    @skizio said:
    Well, I'm not affected so I don't get 15 EUR ZAP credit.

    Anyway @Ympker, does your lifetime LXC show high CPU load when viewed in htop or HetrixTools?

    You can check the CPU load for the Finland LXC here: https://hetrixtools.com/report/uptime/f409b9c69d3f363d8ef67455d5e6d4bc/

  • MikeA (Hosting Provider, OG)

    @Jab said:

    @MikeA said: Also had another Samsung drive (unrelated) that wasn't in any important server. All 980 Pro.

    Did you update the 980 Pro firmware?
    There is a bug that makes them read-only after some hours https://www.tomshardware.com/news/samsung-980-pro-ssd-failures-firmware-update but that was "fixed" in a 2023 firmware update...

    Nah, I'm almost certain it's unrelated. None of the drives were detected by any hardware, as if a power event killed them. I also had an issue with a newer 990 Pro. I'm just not touching them anymore.

  • @MikeA said: I'm just not touching them anymore.

    A lot of hardware reviewers and such moved to Sabrent as their main drives after those issues.

  • @dgc1980 said:

    @MikeA said: I'm just not touching them anymore.

    A lot of hardware reviewers and such moved to Sabrent as their main drives after those issues.

    Sabrent isn't that good, they're just sponsorship heavy. Only ever buy drives from people who make the NAND themselves.

    Thanked by (1)_MS_