Need help troubleshooting slow I/O on Synology NAS device

edited January 2021 in Help

Perhaps a bit of an unusual topic for this forum, but since a Synology NAS basically is a ‘stripped down’ Linux machine, I figured it wouldn’t hurt to try.

I’ve received a Synology DS218Play NAS from a friend of mine to fix. It died completely from one day to the other. Lights would blink, but he was unable to reach it via the web interface most of the time and if it worked, it was painfully slow.

The device has 2 HDDs (spinning rust) in RAID and I found one of these to be defective. It would basically hang up the entire system, so as soon as I removed that drive, it seemed to work fine. Until I tried to do some I/O intensive tasks (like downloading a file); I/O would cap out at 8 megabytes per second. Even for spinning rust on an ARM based machine that’s slow. So, I swapped the drive for a WD RED Plus (CMR) HDD. Still slow… now what?

I reset the NAS to factory settings, formatted the drive and started off fresh but still; as soon it needs to do some I/O intensive tasks (like restore a backup for example) it craps out. What could this be causing?

I ran some disk benchmarks using a script I found here: https://gist.github.com/gxfxyz/0d81429fc48a82ef6c2402b178a82566 but I’m having trouble interpreting the results:


sudo hdparm -Tt /dev/md3

/dev/md3: Timing cached reads: 1114 MB in 2.00 seconds = 556.39 MB/sec Timing buffered disk reads: 432 MB in 3.00 seconds = 143.82 MB/sec

# create a 1 GiB test file with random data (it can take minutes)
head -c 1G </dev/urandom >test
# clear cache
sync; echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null
# read from test file, read 1 MiB each time, read 1024 times, total read 1 GiB
dd if=test of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 6.85157 s, 157 MB/s

# write 0 to test file, write 1 GiB each time, write 2 times, total write 2 GiB
dd if=/dev/zero of=test bs=1G count=2 oflag=dsync
2+0 records in
2+0 records out
2147483648 bytes (2.1 GB) copied, 67.5115 s, 31.8 MB/s

# write 0 to test file, write 128 MiB each time, write 8 times, total write 1 GiB
dd if=/dev/zero of=test bs=128M count=8 oflag=dsync
8+0 records in
8+0 records out
1073741824 bytes (1.1 GB) copied, 8.26623 s, 130 MB/s

# write 0 to test file, write 1 MiB each time, write 1024 times, total write 1 GiB
dd if=/dev/zero of=test bs=1M count=1024 oflag=dsync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 22.2873 s, 48.2 MB/s

# write 0 to test file, write 128 KiB each time, write 1024 times, total write 128 MiB
dd if=/dev/zero of=test bs=128k count=1024 oflag=dsync
1024+0 records in
1024+0 records out
134217728 bytes (134 MB) copied, 15.573 s, 8.6 MB/s

# write 0 to test file, write 4 KiB each time, write 1024 times, total write 4 MiB
dd if=/dev/zero of=test bs=4k count=1024 oflag=dsync
1024+0 records in
1024+0 records out
4194304 bytes (4.2 MB) copied, 15.5282 s, 270 kB/s

# write 0 to test file, write 512 bytes each time, write 1024 times, total write 512 KiB
dd if=/dev/zero of=test bs=512 count=1024 oflag=dsync
1024+0 records in
1024+0 records out
524288 bytes (524 kB) copied, 17.2119 s, 30.5 kB/s

fio xfio.conf
seq-read-1m: (g=0): rw=read, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1
seq-write-1m: (g=1): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1
seq-read-4k: (g=2): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
seq-write-4k: (g=3): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
rand-read-1m: (g=4): rw=randread, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1
rand-write-1m: (g=5): rw=randwrite, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1
rand-read-4k: (g=6): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
rand-write-4k: (g=7): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
fio-2.13
Starting 8 processes
seq-read-1m: Laying out IO file(s) (1 file(s) / 5120MB)

seq-read-1m: (groupid=0, jobs=1): err= 0: pid=32244: Tue Jan  5 13:44:52 2021
  read : io=5120.0MB, bw=162974KB/s, iops=159, runt= 32170msec
    clat (msec): min=2, max=69, avg= 6.28, stdev= 2.14
     lat (msec): min=2, max=69, avg= 6.28, stdev= 2.14
    clat percentiles (usec):
     |  1.00th=[ 5408],  5.00th=[ 5472], 10.00th=[ 5600], 20.00th=[ 5664],
     | 30.00th=[ 5792], 40.00th=[ 6048], 50.00th=[ 6240], 60.00th=[ 6304],
     | 70.00th=[ 6368], 80.00th=[ 6496], 90.00th=[ 6688], 95.00th=[ 6944],
     | 99.00th=[ 8160], 99.50th=[12096], 99.90th=[47872], 99.95th=[50944],
     | 99.99th=[69120]
    bw (KB  /s): min=117375, max=174384, per=98.83%, avg=161061.86, stdev=11020.53
    lat (msec) : 4=0.16%, 10=99.34%, 20=0.20%, 50=0.25%, 100=0.06%
  cpu          : usr=0.09%, sys=4.48%, ctx=5128, majf=0, minf=274
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=5120/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1
seq-write-1m: (groupid=1, jobs=1): err= 0: pid=32321: Tue Jan  5 13:44:52 2021
  write: io=5120.0MB, bw=164819KB/s, iops=160, runt= 31810msec
    clat (msec): min=2, max=48, avg= 6.05, stdev= 1.38
     lat (msec): min=2, max=48, avg= 6.21, stdev= 1.38
    clat percentiles (usec):
     |  1.00th=[ 5280],  5.00th=[ 5344], 10.00th=[ 5408], 20.00th=[ 5472],
     | 30.00th=[ 5600], 40.00th=[ 5920], 50.00th=[ 6048], 60.00th=[ 6176],
     | 70.00th=[ 6240], 80.00th=[ 6368], 90.00th=[ 6560], 95.00th=[ 6816],
     | 99.00th=[ 6944], 99.50th=[ 8256], 99.90th=[21632], 99.95th=[47872],
     | 99.99th=[48384]
    bw (KB  /s): min=122529, max=164259, per=85.70%, avg=141244.06, stdev=6802.38
    lat (msec) : 4=0.23%, 10=99.43%, 20=0.21%, 50=0.12%
  cpu          : usr=2.96%, sys=1.82%, ctx=5154, majf=0, minf=17
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=5120/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1
seq-read-4k: (groupid=2, jobs=1): err= 0: pid=32375: Tue Jan  5 13:44:52 2021
  read : io=1024.0MB, bw=76316KB/s, iops=19078, runt= 13740msec
    clat (usec): min=45, max=41219, avg=48.55, stdev=159.47
     lat (usec): min=45, max=41220, avg=49.08, stdev=159.47
    clat percentiles (usec):
     |  1.00th=[   45],  5.00th=[   45], 10.00th=[   46], 20.00th=[   46],
     | 30.00th=[   46], 40.00th=[   46], 50.00th=[   46], 60.00th=[   46],
     | 70.00th=[   46], 80.00th=[   47], 90.00th=[   47], 95.00th=[   49],
     | 99.00th=[   76], 99.50th=[   78], 99.90th=[   89], 99.95th=[   95],
     | 99.99th=[ 1512]
    bw (KB  /s): min=42861, max=74722, per=91.51%, avg=69832.15, stdev=6667.91
    lat (usec) : 50=95.49%, 100=4.47%, 250=0.02%, 500=0.01%, 750=0.01%
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  cpu          : usr=9.10%, sys=39.67%, ctx=262179, majf=0, minf=19
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=262144/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1
seq-write-4k: (groupid=3, jobs=1): err= 0: pid=32406: Tue Jan  5 13:44:52 2021
  write: io=1024.0MB, bw=70097KB/s, iops=17524, runt= 14959msec
    clat (usec): min=50, max=39594, avg=53.15, stdev=119.00
     lat (usec): min=50, max=39596, avg=53.78, stdev=119.01
    clat percentiles (usec):
     |  1.00th=[   50],  5.00th=[   51], 10.00th=[   51], 20.00th=[   51],
     | 30.00th=[   51], 40.00th=[   51], 50.00th=[   51], 60.00th=[   51],
     | 70.00th=[   52], 80.00th=[   52], 90.00th=[   53], 95.00th=[   55],
     | 99.00th=[   81], 99.50th=[   89], 99.90th=[  131], 99.95th=[  145],
     | 99.99th=[ 1288]
    bw (KB  /s): min=53700, max=67940, per=93.76%, avg=65722.48, stdev=3553.70
    lat (usec) : 100=99.83%, 250=0.15%, 500=0.01%, 750=0.01%, 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  cpu          : usr=9.09%, sys=26.41%, ctx=262191, majf=0, minf=23
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=262144/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1
rand-read-1m: (groupid=4, jobs=1): err= 0: pid=32439: Tue Jan  5 13:44:52 2021
  read : io=1024.0MB, bw=76869KB/s, iops=75, runt= 13641msec
    clat (msec): min=2, max=83, avg=13.31, stdev= 5.13
     lat (msec): min=2, max=83, avg=13.31, stdev= 5.13
    clat percentiles (usec):
     |  1.00th=[ 2160],  5.00th=[ 6368], 10.00th=[ 8512], 20.00th=[ 9792],
     | 30.00th=[11072], 40.00th=[12224], 50.00th=[13376], 60.00th=[14272],
     | 70.00th=[15552], 80.00th=[16768], 90.00th=[17792], 95.00th=[18304],
     | 99.00th=[25728], 99.50th=[36608], 99.90th=[58624], 99.95th=[83456],
     | 99.99th=[83456]
    bw (KB  /s): min=61188, max=86383, per=88.10%, avg=67719.48, stdev=5093.78
    lat (msec) : 4=2.83%, 10=19.53%, 20=76.17%, 50=1.17%, 100=0.29%
  cpu          : usr=0.22%, sys=1.91%, ctx=1027, majf=0, minf=272
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=1024/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1
rand-write-1m: (groupid=5, jobs=1): err= 0: pid=32465: Tue Jan  5 13:44:52 2021
  write: io=1024.0MB, bw=119523KB/s, iops=116, runt=  8773msec
    clat (msec): min=2, max=49, avg= 8.39, stdev= 5.32
     lat (msec): min=2, max=49, avg= 8.56, stdev= 5.32
    clat percentiles (usec):
     |  1.00th=[ 2288],  5.00th=[ 2288], 10.00th=[ 2288], 20.00th=[ 2384],
     | 30.00th=[ 6048], 40.00th=[ 7520], 50.00th=[ 7968], 60.00th=[ 8512],
     | 70.00th=[ 9664], 80.00th=[11456], 90.00th=[15040], 95.00th=[17792],
     | 99.00th=[24448], 99.50th=[27520], 99.90th=[48896], 99.95th=[49408],
     | 99.99th=[49408]
    bw (KB  /s): min=76487, max=176390, per=77.51%, avg=92647.88, stdev=22157.86
    lat (msec) : 4=24.32%, 10=48.05%, 20=25.00%, 50=2.64%
  cpu          : usr=2.05%, sys=1.60%, ctx=1029, majf=0, minf=17
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=1024/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1
rand-read-4k: (groupid=6, jobs=1): err= 0: pid=32489: Tue Jan  5 13:44:52 2021
  read : io=102400KB, bw=1140.2KB/s, iops=285, runt= 89812msec
    clat (usec): min=70, max=327343, avg=3502.41, stdev=4495.53
     lat (usec): min=70, max=327345, avg=3503.04, stdev=4495.54
    clat percentiles (usec):
     |  1.00th=[   74],  5.00th=[   83], 10.00th=[   87], 20.00th=[   90],
     | 30.00th=[   92], 40.00th=[   94], 50.00th=[ 1768], 60.00th=[ 3856],
     | 70.00th=[ 5792], 80.00th=[ 7840], 90.00th=[ 9664], 95.00th=[10816],
     | 99.00th=[11584], 99.50th=[11712], 99.90th=[12352], 99.95th=[36096],
     | 99.99th=[52992]
    bw (KB  /s): min=  333, max= 4192, per=75.94%, avg=865.74, stdev=313.03
    lat (usec) : 100=43.69%, 250=3.53%, 500=0.46%, 750=0.04%, 1000=0.04%
    lat (msec) : 2=3.30%, 4=9.80%, 10=30.63%, 20=8.44%, 50=0.06%
    lat (msec) : 100=0.01%, 500=0.01%
  cpu          : usr=0.21%, sys=0.80%, ctx=25613, majf=0, minf=20
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=25600/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1
rand-write-4k: (groupid=7, jobs=1): err= 0: pid=32698: Tue Jan  5 13:44:52 2021
  write: io=102400KB, bw=2853.5KB/s, iops=713, runt= 35887msec
    clat (usec): min=77, max=66798, avg=1396.62, stdev=2198.08
     lat (usec): min=78, max=66799, avg=1397.36, stdev=2198.08
    clat percentiles (usec):
     |  1.00th=[   86],  5.00th=[   94], 10.00th=[   97], 20.00th=[  104],
     | 30.00th=[  109], 40.00th=[  118], 50.00th=[ 1128], 60.00th=[ 1288],
     | 70.00th=[ 1496], 80.00th=[ 2008], 90.00th=[ 3248], 95.00th=[ 4512],
     | 99.00th=[11456], 99.50th=[13888], 99.90th=[22400], 99.95th=[23936],
     | 99.99th=[41728]
    bw (KB  /s): min= 2232, max= 5128, per=98.43%, avg=2808.11, stdev=400.27
    lat (usec) : 100=14.48%, 250=30.50%, 500=0.16%, 750=0.06%, 1000=0.03%
    lat (msec) : 2=34.76%, 4=13.58%, 10=5.06%, 20=1.18%, 50=0.18%
    lat (msec) : 100=0.01%
  cpu          : usr=0.47%, sys=1.31%, ctx=25615, majf=0, minf=16
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=25600/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=5120.0MB, aggrb=162974KB/s, minb=162974KB/s, maxb=162974KB/s, mint=32170msec, maxt=32170msec

Run status group 1 (all jobs):
  WRITE: io=5120.0MB, aggrb=164818KB/s, minb=164818KB/s, maxb=164818KB/s, mint=31810msec, maxt=31810msec

Run status group 2 (all jobs):
   READ: io=1024.0MB, aggrb=76315KB/s, minb=76315KB/s, maxb=76315KB/s, mint=13740msec, maxt=13740msec

Run status group 3 (all jobs):
  WRITE: io=1024.0MB, aggrb=70096KB/s, minb=70096KB/s, maxb=70096KB/s, mint=14959msec, maxt=14959msec

Run status group 4 (all jobs):
   READ: io=1024.0MB, aggrb=76869KB/s, minb=76869KB/s, maxb=76869KB/s, mint=13641msec, maxt=13641msec

Run status group 5 (all jobs):
  WRITE: io=1024.0MB, aggrb=119523KB/s, minb=119523KB/s, maxb=119523KB/s, mint=8773msec, maxt=8773msec

Run status group 6 (all jobs):
   READ: io=102400KB, aggrb=1140KB/s, minb=1140KB/s, maxb=1140KB/s, mint=89812msec, maxt=89812msec

Run status group 7 (all jobs):
  WRITE: io=102400KB, aggrb=2853KB/s, minb=2853KB/s, maxb=2853KB/s, mint=35887msec, maxt=35887msec

Disk stats (read/write):
    md3: ios=300032/299963, merge=0/0, ticks=177520/109410, in_queue=285050, util=93.63%, aggrios=300056/300101, aggrmerge=4/21, aggrticks=178200/108570, aggrin_queue=286380, aggrutil=93.39%
  sda: ios=300056/300101, merge=4/21, ticks=178200/108570, in_queue=286380, util=93.39%

For me, as a some-what noob, these look ‘fine’. Am I missing the bigger picture?
All help and pointers are welcome ?

Thanks!

Comments

  • Are you running ZFS? This has considerable overhead, especially if you are running it with compression and de-duplication.

    Cheap dedis are my drug, and I'm too far gone to turn back.

  • DreamDream OGServices Provider

    Hey, What are the Temperatures of the Subject in pain?

    I had one time, that the Dumb thing ran hot and down clock itself to a death machine.

    After I blasted it with a compressor it ran like a charm again. It may or may not help it in this case.

  • @CamoYoshi said:
    Are you running ZFS? This has considerable overhead, especially if you are running it with compression and de-duplication.

    No, I'm afraid not. I'm running ext4.

    @Dream said:
    Hey, What are the Temperatures of the Subject in pain?

    I had one time, that the Dumb thing ran hot and down clock itself to a death machine.

    After I blasted it with a compressor it ran like a charm again. It may or may not help it in this case.

    Good one! But according to DSM the temps are 'Normal' (40 degrees celcius).

Sign In or Register to comment.