Performance issue with in-memory computing for containers
Hello all,
I'm trying to configure an environment for in-memory computing where the containers run completely in memory and only the persistent storage volumes are attached. My workload consists of many small I/O queries and a lot of computing (model rendering, relation computing, and then simulation tests), so my hope is to speed up the build cycle this way. Since no actual data is produced (or rather, it can easily be reproduced by re-rendering the predefined model), the possibility of data loss on a server crash is negligible.
I've got a testing machine with two EPYC 7543 CPUs and 512 GB RAM. So far my approach has been to create a ramdisk (with tmpfs) and create the containers within that directory. This works, and I can see the impact on RAM usage with "free -m" when starting the containers, but the build cycle reports a lower I/O speed than I get when I run this on my software RAID-10 of NVMe Gen4 drives (2 GBit/s vs. 5 GBit/s). I was expecting the in-memory containers to be much faster than the NVMe ones.
To simplify my setup for testing, I replaced the containerized setup with a Proxmox installation, again created a ramdisk, added it as a storage directory to Proxmox, and spawned a Debian 12 container to test. Same result: the ramdisk container isn't as performant as the NVMe one. OK, no big surprise that changing the setup didn't help, but I thought that from this point on you could help me better, because more users are familiar with Proxmox than with Rancher.
Did I miss something about why in-memory can't be faster than NVMe, or is there an error in my setup?
Appreciate any thoughts and help on this.
Comments
Nobody here doing in-memory computing? Not even for databases or game servers? I'd appreciate any experience, thoughts, or hints about this issue that you can share.
Did you dd your ramdisk? Are the results the same?
Yes, I ran some dd and fio tests. The result is the same: the ramdisk is slower than the NVMe.
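For anyone who wants to cross-check this outside of dd and fio, here is a minimal sketch of a raw sequential-write test that can be pointed at both the tmpfs mount and a directory on the NVMe array. The file path, 4 GiB total size, and 1 MiB chunk size below are example values, not taken from the setup above.

```c
/* Minimal sequential-write throughput check (sketch).
 * Usage: ./wbench /mnt/ramdisk/testfile   (or a file on the NVMe array)
 * Writes 4 GiB in 1 MiB chunks and prints the rate in MB/s.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define CHUNK (1 << 20)      /* 1 MiB per write() */
#define TOTAL (4ULL << 30)   /* 4 GiB in total    */

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <testfile>\n", argv[0]);
        return 1;
    }

    char *buf = malloc(CHUNK);
    if (!buf) { perror("malloc"); return 1; }
    memset(buf, 0xA5, CHUNK);

    int fd = open(argv[1], O_CREAT | O_TRUNC | O_WRONLY, 0644);
    if (fd < 0) { perror("open"); return 1; }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    for (unsigned long long done = 0; done < TOTAL; done += CHUNK) {
        if (write(fd, buf, CHUNK) != CHUNK) { perror("write"); return 1; }
    }
    fsync(fd);               /* forces data to the drives; effectively a no-op on tmpfs */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%.0f MB/s\n", (TOTAL / 1e6) / secs);

    close(fd);
    unlink(argv[1]);
    free(buf);
    return 0;
}
```

On tmpfs this mostly measures memory copies into the page cache, while on the NVMe path the final fsync() also pushes the data out to the drives, so it gives a rough apples-to-apples view of the write path in both cases.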
Access to a ramdisk still goes through the kernel filesystem layer.
The app would be copying data back and forth between memory and the ramdisk.
The ramdisk competes with normal memory access for RAM bandwidth.
If those copies aren't using the DMA controller for some reason, they may even compete for CPU cycles.
To maximize performance, you should build the app to allocate memory in hugepages and store data in that memory.
The app should not use the filesystem API.
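For illustration, here is a minimal sketch of that idea, assuming explicit 2 MiB hugepages have been reserved beforehand (e.g. with sysctl -w vm.nr_hugepages=1024); the 1 GiB pool size is a placeholder. The working set lives in anonymous hugepage-backed memory, and no filesystem calls are involved.

```c
/* Sketch: keep the working set in anonymous 2 MiB hugepages instead of
 * files on a ramdisk. Requires reserved hugepages, e.g.
 *   sysctl -w vm.nr_hugepages=1024
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define HUGE_2MB  (2UL * 1024 * 1024)
#define POOL_SIZE (512 * HUGE_2MB)   /* 1 GiB working set, as an example */

int main(void)
{
    void *pool = mmap(NULL, POOL_SIZE,
                      PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
                      -1, 0);
    if (pool == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");  /* usually means no hugepages reserved */
        return 1;
    }

    /* The app's data structures live directly in this region; there is no
     * write()/read() round trip through the kernel filesystem layer. */
    memset(pool, 0, POOL_SIZE);       /* touch the pages once */

    munmap(pool, POOL_SIZE);
    return 0;
}
```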
Yes, I agree with @yoursunny. It seems to me that many small I/O operations in parallel with computations (which I assume need memory bandwidth) are causing memory bus saturation.
On the other hand, NVMe (which is very fast too) is on its own bus, thus offering better parallelism between I/O and computing operations for this particular workload.
Allocating in hugepages may well help; otherwise, stick with NVMe for the reason above.
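Related to the bandwidth point: on a dual-socket EPYC board, memory traffic also gets more expensive whenever data sits on the remote NUMA node. Below is a small sketch using libnuma (build with -lnuma; the node number and buffer size are just examples) that keeps a buffer and the thread touching it on the same node.

```c
/* Sketch: allocate the working buffer on a specific NUMA node so the
 * compute threads on that socket don't pull data across sockets.
 * Build with: gcc -O2 numa_local.c -lnuma
 */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return 1;
    }

    size_t size = 256UL * 1024 * 1024;   /* 256 MiB, as an example     */
    int node = 0;                        /* keep data on socket/node 0 */

    void *buf = numa_alloc_onnode(size, node);
    if (!buf) {
        fprintf(stderr, "numa_alloc_onnode failed\n");
        return 1;
    }

    /* Run the calling thread on the same node, then use the buffer there. */
    numa_run_on_node(node);
    memset(buf, 0, size);                /* first touch happens locally */

    numa_free(buf, size);
    return 0;
}
```

Whether this helps depends on how the build jobs are spread across the two sockets, so treat it as something to experiment with rather than a fix.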
I echo yoursunny's thoughts: you should be loading your data into RAM directly, without the ancient tmpfs layer. NVMe's architecture is almost direct-mapped to the CPU/chipset, much like RAM, and Gen4 is crazy fast already.
With current filesystem and kernel caching optimizations, I can see it being faster than tmpfs, which isn't used that much.
Just for kicks, a quick and dirty test with my notebook's DDR4 RAM vs. my local NVMe disk (Gen1, I think):
--
Cheers
Thank you for this explanation. Rancher doesn't really have hugepages support, so I won't try it out, but from what I've read this does seem to be a way to optimize the speed. Having checked my mainboard configuration, it seems the RAM bandwidth is the bottleneck. I hadn't considered that, so it looks more practical (and more performant) to keep using my NVMes and wait a little longer for the process to finish.
Thanks for the interesting discussion. Quite fascinating that we now have workloads where ("next-gen") storage is faster than RAM!