We love big disks (the storage thread)

imok OG

There is a lot of interesting information spread across multiple threads. Please continue the conversation here.

@AuroraZero said:

@host_c said:

@MS said: I just thought that the RAM to Storage ratio, like 4.55 TiB to 4.12 PiB (or ~1 GB RAM per 1000 GB of storage), is low.

That math is for ZFS. =)

If you go HW RAID (cache on the controller), you only have to worry about the VMs running plus the overhead they add on the node.

With ZFS you get flexibility (expand the pool on the go as you change/upgrade drives), but it will eat RAM and CPU and induce latency.
With HW RAID you get the lowest latency and no CPU or RAM usage on the node (all that assuming a modern HW RAID controller, not something from two decades ago), but you cannot expand the pool on the go (you need to empty the node, decommission the RAID array, put in the new drives, and wait a few days for them to initialize).

It is a trade-off; we picked our poison. (I will go for low latency any day of the week.)

Until ZFS crashes, then let the cluster begin!!!!

I use Proxmox and ZFS. Don't let my cluster explode!
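
On the "ZFS eats RAM" point: most of that is the ARC cache, which on Linux will by default grow to roughly half of system RAM. If a node is tight on memory, a common sketch is to cap it via the OpenZFS module parameter (the 8 GiB here is only an example value; tune it to your box):

    # check the current ARC cap (bytes; 0 means the built-in default)
    cat /sys/module/zfs/parameters/zfs_arc_max

    # cap the ARC at 8 GiB for the running system
    echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max

    # persist it across reboots (Debian/Proxmox style)
    echo "options zfs zfs_arc_max=8589934592" >> /etc/modprobe.d/zfs.conf
    update-initramfs -u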

Comments

  • host_c Hosting Provider
    edited May 19

    @imok said: I use Proxmox and ZFS. Don't let my cluster explode!

    =)

    Do you have 1 GB of RAM per 1 TB of data??? If you have a 4-bay setup, did you use RAID 10? Or RAID 5, as you wished for "as much space as I can have"? If the latter, you might run into a problem at some point =)

    Jokes aside, this will also make a good read; I did not wish to double post:

    https://lowendspirit.com/discussion/9570/whats-the-difference-between-software-raid-vs-hardware-raid-now-in-2025#latest

    PS: That is a node. A cluster implies a few nodes (2 at minimum) managed/linked together/clustered.

    You are fine :3

    Regardless of whether you use HW or SW RAID, please do not use RAID5/Z1.
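
    If you do go ZFS in a 4-bay box, the RAID 10-style layout is striped mirrors rather than raidz1. A minimal sketch (pool and device names are just examples; use /dev/disk/by-id paths in practice):

    # two mirrored pairs, striped: ~50% usable, and a rebuild only re-copies one disk
    zpool create tank mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd

    # and this layout does expand on the go, one mirror pair at a time
    zpool add tank mirror /dev/sde /dev/sdf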

    PS: @imok

    If I can find a picture, I have a tale that is precisely "seconds from disaster".

    I will search for it and share what happened on a Sunday that almost led to 108 TB of usable data going out the door, on a pretty new server.

    Give me till tomorrow to search the pic.

  • legendary

    Yes! Double D's are my most loved parts of the computer. Double Disks in RAID 1 for the champions.

  • AuroraZero Hosting Provider, Retired

    @legendary said:
    Yes! Double D's are my most loved parts of the computer. Double Disks in RAID 1 for the champions.

    Real men go commando: RAID 0, no backups!!!

  • @host_c said:
    PS: That is a node. A cluster implies a few nodes (2 at minimum) managed/linked together/clustered.

    Yeah, I have a Proxmox cluster with 3 nodes, but no shared storage yet, nor HA. I wish I could afford it, but I don't think it's possible over a 1 Gbps link, right?

    Also, I would have to set up switches and stuff like that, right? Seems a bit complicated, and if it fails... Oh man, I don't want to think about it. ZX Host comes to mind.
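
    For what it's worth, the cluster itself can be sanity-checked from any node with the stock Proxmox CLI, no shared storage needed for this part:

    # membership and quorum state of the cluster
    pvecm status

    # list of known nodes
    pvecm nodes

    Corosync itself is generally fine on a dedicated 1 Gbps link for a small cluster; it's Ceph/shared storage that really wants 10 Gbps and up.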

  • Mason Administrator, OG

    I like big disks and I cannot lie

  • @Mason said:
    I like big disks and I cannot lie

    Any failures?

  • AuroraZero Hosting Provider, Retired

    @Mason said:
    I like big disks and I cannot lie

    Sexy shit right there!!!!!

  • @host_c said:
    Do you have 1 GB of RAM per 1 TB of data??? If you have a 4-bay setup, did you use RAID 10? Or RAID 5, as you wished for "as much space as I can have"? If the latter, you might run into a problem at some point =)

    What if I have 1 TB of RAM and 1 GB of data? What happens to me then?

    On a side note, can anyone mail me 8 sticks of 128 GB DDR4 ECC RAM? :lol:

  • cybertech OG, Benchmark King

    @somik said:

    What if I have 1 TB of RAM and 1 GB of data? What happens to me then?

    On a side note, can anyone mail me 8 sticks of 128 GB DDR4 ECC RAM? :lol:

    what speed?

  • @cybertech said:

    what speed?

    Only 2400 MHz

  • @Mason said:
    I like big disks and I cannot lie

    Large and girthy di(s)ks indeed! Congrats.

  • bikegremlin Moderator, OG, Content Writer

    All my life, I've been consoling myself that size doesn't matter...

  • @bikegremlin said:
    All my life, I've been consoling myself that size doesn't matter...

    Eh... we are still talking about "disk", with an "S", right? Right?!?

  • AuroraZero Hosting Provider, Retired

    @somik said:

    Eh... we are still talking about "disk", with an "S", right? Right?!?

    Maybe with him ya never know anymore

  • bikegremlin Moderator, OG, Content Writer

    @AuroraZero said:

    Maybe with him ya never know anymore

    What do you mean - you know I'm all about desks!

  • Wolveix OG
    edited May 20

    My home server :D

    CPU: Ryzen 9 5900X
    RAM: 64GB

  • AuroraZero Hosting Provider, Retired

    @bikegremlin said:

    What do you mean - you know I'm all about desks!

    You are about Pidgeons!!!!

  • @Wolveix said:
    My home server :D

    What are you using the 100 TB of data for? And what kind of backup do you use for it?

    @AuroraZero said:

    You are about Pidgeons!!!!

    How many pigeons can you fit in a TB?

  • AuroraZero Hosting Provider, Retired

    @somik said:

    How many pigeons can you fit in a TB?

    3 bigguns and one petite one, pidgies take up a lot of room

  • @somik said:

    What are you using the 100 TB of data for? And what kind of backup do you use for it?

    Linux ISOs, family data archiving, some data processing for work stuff (need a few TB fairly regularly for research), and local cloud storage.

    Nothing on there is too important, so no real backups for it. Anything that is important is backed up with borg elsewhere 🙏
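
    For anyone curious, the borg flow is short enough to sketch here (the repo URL and paths are placeholders, not my real setup):

    # one-time: create an encrypted repo on the remote box
    borg init --encryption=repokey ssh://user@backuphost/./backups/homeserver

    # per run: archive the important bits, then thin out old archives
    borg create --stats ssh://user@backuphost/./backups/homeserver::{hostname}-{now} /srv/important
    borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 ssh://user@backuphost/./backups/homeserver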

  • @AuroraZero said: 3 bigguns and one petite one

  • havoc OG, Content Writer, Senpai

    @Wolveix said: Anything that is important is backed up with borg elsewhere

    Same. Must admit I'm a little puzzled by storage VPS in general.

    Mission-critical stuff goes to big providers (Hetzner and up). Linux ISOs & friends I'll keep as two copies on the local LAN - acceptable risk profile, and I'd rather not have Linux ISOs with my name on them floating about the internet.

    Clearly there is demand, so someone has a use case for it; I just don't get it.

  • edited May 22


    big boi
    ZFS array:
    zpool status
      pool: main-pool
     state: ONLINE
      scan: resilvered 462M in 00:01:13 with 0 errors on Thu Apr 10 18:09:48 2025
    config:
    
            NAME                                      STATE     READ WRITE CKSUM
            main-pool                                 ONLINE       0     0     0
              raidz2-0                                ONLINE       0     0     0
                ata-TOSHIBA_MG07ACA12TE_X960A0MRFDUG  ONLINE       0     0     0
                wwn-0x5000039998c95dd7                ONLINE       0     0     0
                wwn-0x5000039998c94b4e                ONLINE       0     0     0
                wwn-0x50000399b8c8d37c                ONLINE       0     0     0
                wwn-0x50000399b8c92782                ONLINE       0     0     0
                scsi-350000399b8c91246                ONLINE       0     0     0
                wwn-0x5000039998c93afe                ONLINE       0     0     0
                wwn-0x5000039998c9657c                ONLINE       0     0     0
    
    errors: No known data errors
    

    Totals:

    df -h | grep pool
    ssd-pool               923G  542G  382G  59% /ssd-pool
    main-pool               62T   27T   36T  43% /main-pool
    

    I have another server @ home with ~10 TB of SSD space (same raidz2 config), but can’t find my SSH key for that right now (on my phone, anyway).

    Mainly use “big boi” for storing ISOs, local backups, etc. CPU is a 5600X, as my 5800X was about to turn the stock cooler into molten aluminum.

    Edit: Ought to get more backups going. Only have my important stuff copied to two places ATM, with one being dirtbox-tier (questionable reliability) lol
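
    Related note: raidz2 only saves you from errors it actually gets to see, so periodic scrubs are worth having (many distros ship a scrub timer out of the box; pool name taken from the paste above):

    # read-verify everything and repair from redundancy where needed
    zpool scrub main-pool

    # watch progress / results
    zpool status -v main-pool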

  • This 2TB NVMe is dying?

    smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.8.8-1-pve] (local build)
    Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
    
    === START OF INFORMATION SECTION ===
    Model Family:     Samsung based SSDs
    Device Model:     SAMSUNG MZ7LM1T9HMJP-00005
    Serial Number:    S2TVNX0K336393
    LU WWN Device Id: 5 002538 c40a0e81a
    Firmware Version: GXT5204Q
    User Capacity:    1,920,383,410,176 bytes [1.92 TB]
    Sector Size:      512 bytes logical/physical
    Rotation Rate:    Solid State Device
    Form Factor:      2.5 inches
    TRIM Command:     Available, deterministic, zeroed
    Device is:        In smartctl database 7.3/5319
    ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4c
    SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
    Local Time is:    Tue Jul 29 23:14:29 2025 CDT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    General SMART Values:
    Offline data collection status:  (0x02) Offline data collection activity
                        was completed without error.
                        Auto Offline Data Collection: Disabled.
    Self-test execution status:      (   0) The previous self-test routine completed
                        without error or no self-test has ever 
                        been run.
    Total time to complete Offline 
    data collection:        ( 6000) seconds.
    Offline data collection
    capabilities:            (0x53) SMART execute Offline immediate.
                        Auto Offline data collection on/off support.
                        Suspend Offline collection upon new
                        command.
                        No Offline surface scan supported.
                        Self-test supported.
                        No Conveyance Self-test supported.
                        Selective Self-test supported.
    SMART capabilities:            (0x0003) Saves SMART data before entering
                        power-saving mode.
                        Supports SMART auto save timer.
    Error logging capability:        (0x01) Error logging supported.
                        General Purpose Logging supported.
    Short self-test routine 
    recommended polling time:    (   2) minutes.
    Extended self-test routine
    recommended polling time:    ( 100) minutes.
    SCT capabilities:          (0x003d) SCT Status supported.
                        SCT Error Recovery Control supported.
                        SCT Feature Control supported.
                        SCT Data Table supported.
    
    SMART Attributes Data Structure revision number: 1
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      5 Reallocated_Sector_Ct   0x0033   099   099   010    Pre-fail  Always       -       14
      9 Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       52043
     12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       80
    177 Wear_Leveling_Count     0x0013   095   095   005    Pre-fail  Always       -       351
    179 Used_Rsvd_Blk_Cnt_Tot   0x0013   099   099   010    Pre-fail  Always       -       14
    180 Unused_Rsvd_Blk_Cnt_Tot 0x0013   099   099   010    Pre-fail  Always       -       6530
    181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
    182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
    183 Runtime_Bad_Block       0x0013   099   099   010    Pre-fail  Always       -       14
    184 End-to-End_Error        0x0033   100   100   097    Pre-fail  Always       -       0
    187 Uncorrectable_Error_Cnt 0x0032   099   099   000    Old_age   Always       -       141
    190 Airflow_Temperature_Cel 0x0032   055   041   000    Old_age   Always       -       45
    194 Temperature_Celsius     0x0022   055   041   000    Old_age   Always       -       45 (Min/Max 19/59)
    195 ECC_Error_Rate          0x001a   199   199   000    Old_age   Always       -       141
    197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
    199 CRC_Error_Count         0x003e   099   099   000    Old_age   Always       -       1
    202 Exception_Mode_Status   0x0033   100   100   010    Pre-fail  Always       -       0
    235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       13
    241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       1257841068317
    242 Total_LBAs_Read         0x0032   099   099   000    Old_age   Always       -       972987617606
    243 SATA_Downshift_Ct       0x0032   100   100   000    Old_age   Always       -       0
    244 Thermal_Throttle_St     0x0032   100   100   000    Old_age   Always       -       0
    245 Timed_Workld_Media_Wear 0x0032   100   100   000    Old_age   Always       -       65535
    246 Timed_Workld_RdWr_Ratio 0x0032   100   100   000    Old_age   Always       -       65535
    247 Timed_Workld_Timer      0x0032   100   100   000    Old_age   Always       -       65535
    251 NAND_Writes             0x0032   100   100   000    Old_age   Always       -       1501990630400
    
    SMART Error Log Version: 1
    ATA Error Count: 141 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
    Powered_Up_Time is measured from power on, and printed as
    DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
    SS=sec, and sss=millisec. It "wraps" after 49.710 days.
    
    Error 141 occurred at disk power-on lifetime: 51900 hours (2162 days + 12 hours)
      When the command that caused the error occurred, the device was active or idle.
    
      After command completion occurred, registers were:
      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      40 51 00 88 39 82 40  Error: UNC at LBA = 0x00823988 = 8534408
    
      Commands leading to the command that caused the error were:
      CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
      -- -- -- -- -- -- -- --  ----------------  --------------------
      60 80 00 88 39 82 40 00      09:59:53.490  READ FPDMA QUEUED
      60 00 f0 88 31 82 40 1e      09:59:53.490  READ FPDMA QUEUED
      60 80 e8 00 4d 49 40 1d      09:59:53.490  READ FPDMA QUEUED
      60 00 e0 00 f0 aa 40 1c      09:59:53.490  READ FPDMA QUEUED
      60 80 d8 80 9f 10 40 1b      09:59:53.490  READ FPDMA QUEUED
    
    Error 140 occurred at disk power-on lifetime: 51889 hours (2162 days + 1 hours)
      When the command that caused the error occurred, the device was active or idle.
    
      After command completion occurred, registers were:
      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      40 51 18 20 79 d2 40  Error: UNC at LBA = 0x00d27920 = 13793568
    
      Commands leading to the command that caused the error were:
      CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
      -- -- -- -- -- -- -- --  ----------------  --------------------
      60 20 18 20 79 d2 40 03      09:59:14.186  READ FPDMA QUEUED
      60 20 10 c0 0d 49 40 02      09:59:14.186  READ FPDMA QUEUED
      60 20 58 00 79 d2 40 0b      09:59:14.186  READ FPDMA QUEUED
      2f 00 01 13 00 00 40 01      09:59:14.186  READ LOG EXT
      2f 00 01 00 00 00 40 01      09:59:14.186  READ LOG EXT
    
    Error 139 occurred at disk power-on lifetime: 51889 hours (2162 days + 1 hours)
      When the command that caused the error occurred, the device was active or idle.
    
      After command completion occurred, registers were:
      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      40 51 08 00 79 d2 40  Error: UNC at LBA = 0x00d27900 = 13793536
    
      Commands leading to the command that caused the error were:
      CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
      -- -- -- -- -- -- -- --  ----------------  --------------------
      60 20 08 00 79 d2 40 01      09:59:14.186  READ FPDMA QUEUED
      60 20 00 c0 0d 49 40 00      09:59:14.186  READ FPDMA QUEUED
      60 20 38 e0 78 d2 40 07      09:59:14.186  READ FPDMA QUEUED
      2f 00 01 13 00 00 40 1f      09:59:14.186  READ LOG EXT
      2f 00 01 00 00 00 40 1f      09:59:14.186  READ LOG EXT
    
    Error 138 occurred at disk power-on lifetime: 51889 hours (2162 days + 1 hours)
      When the command that caused the error occurred, the device was active or idle.
    
      After command completion occurred, registers were:
      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      40 51 f8 e0 78 d2 40  Error: UNC at LBA = 0x00d278e0 = 13793504
    
      Commands leading to the command that caused the error were:
      CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
      -- -- -- -- -- -- -- --  ----------------  --------------------
      60 20 f8 e0 78 d2 40 1f      09:59:14.186  READ FPDMA QUEUED
      60 20 b8 c0 0d 49 40 17      09:59:14.186  READ FPDMA QUEUED
      60 20 18 c0 78 d2 40 03      09:59:14.186  READ FPDMA QUEUED
      2f 00 01 13 00 00 40 16      09:59:14.186  READ LOG EXT
      2f 00 01 00 00 00 40 16      09:59:14.186  READ LOG EXT
    
    Error 137 occurred at disk power-on lifetime: 51889 hours (2162 days + 1 hours)
      When the command that caused the error occurred, the device was active or idle.
    
      After command completion occurred, registers were:
      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      40 51 b0 c0 78 d2 40  Error: UNC at LBA = 0x00d278c0 = 13793472
    
      Commands leading to the command that caused the error were:
      CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
      -- -- -- -- -- -- -- --  ----------------  --------------------
      60 20 b0 c0 78 d2 40 16      09:59:14.186  READ FPDMA QUEUED
      60 20 a8 c0 0d 49 40 15      09:59:14.186  READ FPDMA QUEUED
      60 20 b8 a0 78 d2 40 17      09:59:14.186  READ FPDMA QUEUED
      2f 00 01 13 00 00 40 14      09:59:14.186  READ LOG EXT
      2f 00 01 00 00 00 40 14      09:59:14.186  READ LOG EXT
    
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline       Completed without error       00%     41122         -
    # 2  Short offline       Completed without error       00%     40285         -
    # 3  Short offline       Completed without error       00%     40285         -
    # 4  Short offline       Completed without error       00%     40285         -
    # 5  Short offline       Completed without error       00%     40285         -
    # 6  Short offline       Completed without error       00%     40267         -
    # 7  Short offline       Completed without error       00%     30221         -
    # 8  Short offline       Completed without error       00%     29789         -
    # 9  Short offline       Completed without error       00%     29358         -
    #10  Short offline       Completed without error       00%     28768         -
    
    SMART Selective self-test log data structure revision number 1
     SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
        1        0        0  Not_testing
        2        0        0  Not_testing
        3        0        0  Not_testing
        4        0        0  Not_testing
        5        0        0  Not_testing
      255        0    65535  Read_scanning was completed without error
    Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
    If Selective self-test is pending on power-up, resume after 0 minute delay.
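
    Worth noting: the self-test log above contains only short tests. The extended test actually read-scans the media and is the better "is it dying" signal; a sketch, assuming the drive sits at /dev/sda:

    # kick off the extended self-test (~100 min per the output above), then read the log
    smartctl -t long /dev/sda
    smartctl -l selftest /dev/sda

    # full attribute/error dump, same as the paste above
    smartctl -a /dev/sda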
    
  • Something is wrong with this forum. Every time I read this thread on the front page, I read it as "We love big dicks (the storage thread)".

    Please stop the planet! I wish to get off!

  • @root said:
    I read it as "We love big dicks (the storage thread)".

    It's fine if you are into that. I don't judge people.

    #nohomo

  • Falzo Senpai
    edited July 30

    @imok said:
    This 2TB NVMe is dying?

    yes and no.

    it heavily depends on how you use it right now and how it has been used in the past. yes, it has blocks dying, but it also still has a lot of reserve.
    while it is quite aged already, this can be normal, especially if it is filled up quite well with stuff that rarely moves and the daily write load always hits the same few sectors.

    background: for an ssd/nvme to last long, it needs to balance its writes across all available blocks as much as possible.
    so if you have a lot of its capacity occupied by stuff that is only read and never written/removed, then obviously these "blocked" sectors cannot be used in the balancing scheme and only the free blocks have to handle the load.

    if this is the case here, simply copying the long-existing data into the now-free area and deleting the original data afterwards can help extend the nvme's lifespan, because other sectors, which haven't been written to in a long time, will take the daily load in the future.

    if your data across the whole disk changes a lot, then this does not apply and the disk probably is indeed dying.
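
    a rough sketch of that shuffle, if anyone wants to try it (paths are just examples; check free space and have a backup first):

    # rewrite a mostly-static directory so its data lands on fresh cells
    cp -a /data/archive /data/archive.new
    rm -rf /data/archive
    mv /data/archive.new /data/archive

    # then let the drive reclaim the freed blocks
    fstrim -v /data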

  • edited July 30

    Wait, woot, 2TB SSD dying? Or just NVMe?

  • I had a VM that kept running, but I couldn’t take backups. I’m not sure if the issue was related to writing the backup or reading the data.

    It might have been unable to write data, because after freeing up some space, the backup succeeded.

  • Falzo Senpai

    @imok said:
    I had a VM that kept running, but I couldn’t take backups. I’m not sure if the issue was related to writing the backup or reading the data.

    It might have been unable to write data, because after freeing up some space, the backup succeeded.

    If you have VMs running on disk image files, try running fstrim regularly within the VMs, or resparse the images from time to time. This should help free up space and with that increase the available blocks for the nvme to spread its writes better.
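
    A sketch of both options (the image path is just an example; virt-sparsify needs the VM shut down, and the trim only reaches the host image if the virtual disk has the discard option enabled):

    # inside the VM: trim all mounted filesystems now, and enable the weekly timer
    fstrim -av
    systemctl enable --now fstrim.timer

    # on the host: re-sparsify an image offline (libguestfs tools)
    virt-sparsify --in-place /var/lib/vz/images/100/vm-100-disk-0.qcow2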
