Is nested virtualisation as terrible as professionals say?

edited March 31 in General

I've always heard that nested virtualisation is neat for testing, but should never be used for production because of performance and unintelligible mumbling "reasons"

What are people's actual experiences?

Is anyone running nested virt in production?

Is it performant?

Poll
  1. Do you use nested virtualisation? (33 votes)
    1. Never tried it
      30.30%
    2. Tried it, worked poorly
      24.24%
    3. I use it for testing
      27.27%
    4. I use it for production
      12.12%
    5. Other (post in thread)
      6.06%

Comments

  • Why would you want to use it? Constrained resources? Sandboxed sandboxing?

  • Does running containers in a VM count? I do, to keep docker-supplied functionality separate from full OS LXC containers. It's all quite low volume.

    There's probably a better way to reach the same result :-)

  • @wankel said:
    Does running containers in a VM count?

    I had in mind specifically KVM-in-KVM, for example running a hypervisor like Proxmox or XCP-ng in a virtual machine.

    Searching for "nested" here nets many people asking vendors for the feature - I figured there's some experience floating around :)

  • For the "worked poorly" votes, it would be nice to have some insight into which combinations worked poorly - and especially, for those running production, which combination that is and whether it runs satisfactorily.

    Thanked by (1)IAmNix
  • @IAmNix said:
    Searching for "nested" here nets many people asking vendors for the feature - I figured there's some experience floating around :)

    I think most ask this because they need the AMD-V or Intel VT-x CPU flags enabled for containerization, not necessarily for KVM-in-KVM. Nested KVM-in-KVM has very poor disk performance inside the nested box and a slightly unstable network - that's what I experienced myself when running a nested test server under high load. Another issue for production setups is that you add another layer, which increases the trusted computing base.
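
    A quick way to check what a provider actually exposes to a guest, assuming a Linux guest (a minimal sketch, not webcraft's setup):

        # Inside the guest: a non-zero count means vmx (Intel) or svm (AMD) is
        # exposed, so KVM acceleration is available for nesting or containers.
        grep -cEw 'vmx|svm' /proc/cpuinfo

        # /dev/kvm should also show up once the kvm module is loaded.
        ls -l /dev/kvm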

    Thanked by (2)IAmNix SashkaPro
  • edited April 1

    @webcraft said:

    @IAmNix said:
    Searching for "nested" here nets many people asking vendors for the feature - I figured there's some experience floating around :)

    I think most ask this because they need the AMD-V or Intel VT-x CPU flags enabled for containerization, not necessarily for KVM-in-KVM. Nested KVM-in-KVM has very poor disk performance inside the nested box and a slightly unstable network - that's what I experienced myself when running a nested test server under high load. Another issue for production setups is that you add another layer, which increases the trusted computing base.

    Gotcha, thank you!

    Hmm, I guess it's just an unoptimized use case then. I can imagine that all kinds of queues and timings in networking/disk IO/CPU scheduling break down when you recurse them - like how TCP-in-TCP tunnels sound like they should work, but just don't. Thanks!

    Now I'm curious if there are any workarounds :). Like having a network passthrough driver, or a non-journalling filesystem on one of the levels. It would need an active community to figure out a setup like that :).

    Edit: Thinking some more - I guess a performant system would either need massive tuning and tweaks on the scale it took to implement containers on Linux, or the bare-metal host would need to be aware of all recursive guests. And at that point you might as well simplify the system down to a single hypervisor with multiple tenants. Huh.
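
    On the disk side, one commonly suggested tweak is to stop the layers from double-caching the same writes. A rough sketch, with hypothetical paths and assuming QEMU/KVM on the outer layer (not a tested recipe):

        # Start the L1 guest with the host page cache bypassed and native AIO,
        # so the outer layer doesn't buffer writes the nested guest already buffers.
        qemu-system-x86_64 -enable-kvm -m 4096 -cpu host \
          -drive file=/dev/vg0/nested-vm,format=raw,if=virtio,cache=none,aio=native \
          -nic user,model=virtio-net-pci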

  • Used it for testing:

    • Ran docker containers in LXC container. No noticeable performance hit.
    • Ran qemu VM in VM in KVM VM (3 layers). Significant performance hit.
    • Ran docker container in VM. No noticeable performance hit.
    • Did not figure out how to run a VM in an LXC container without passing in the host's /dev/kvm, which is the same as running it on the host (a sketch of that passthrough is below).
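
    For that last point, "passing in the host's KVM" to a plain LXC container roughly looks like this (LXC 4.x-style config keys; an assumption about the setup, not the poster's exact config):

        # /var/lib/lxc/<container>/config
        # 10:232 is the usual char-device major:minor for /dev/kvm.
        lxc.cgroup2.devices.allow = c 10:232 rwm
        lxc.mount.entry = /dev/kvm dev/kvm none bind,optional,create=file

    VMs started inside the container then use the host's KVM directly, which is why it ends up equivalent to running them on the host.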
    Thanked by (1)IAmNix

    Artificial intelligence is no match for our natural stupidity.

    Time flies like an arrow; fruit flies like a banana.

  • Terrible performance compared to what? If it's just some bullshit MBA talk, ignore them.

    This argument is missing a lot of information. For example:

    • If you use your own dedicated hardware, nested virtualization might not make sense if you want to squeeze out every last bit of available hardware performance.
    • If you've already rented a KVM-based VM and still have to add two or more virtualization layers on top of it, that's poor deployment planning. Have you considered just splitting things up and seeing if the cost is more efficient? (e.g. 2x 2 vCPU / 2 GB RAM could be cheaper than 1x 4 vCPU / 8 GB RAM)

    Since this is mostly about resources, you should weigh the cost efficiency: does the decision to use nested virtualization cause more trouble than it's worth? (Like it just increasing customer support cost because your users notice something "weird" with the performance.)

    Sure, it might be slow, but if you can make sure it has better cost efficiency overall in production, support and maintenance terms, then just run it - why not? (Looking at ye, weird legacy systems.)


    From my experience, I only have this level of nested virtualization:
    1. own hardware
    2. install XCP-ng on it
    3. from XCP-ng, create VMs
    4. inside the VMs, deploy the software using docker / kubernetes

    There is no significant performance hit, and this setup can run just fine in a production environment.
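
    If you want to sanity-check which layer you're on in a stack like that, systemd-detect-virt is a quick probe (assuming systemd-based guests; output names are examples):

        # On the XCP-ng VM: reports the hypervisor, e.g. "xen".
        systemd-detect-virt --vm

        # Inside a container on that VM: reports the runtime, e.g. "docker".
        systemd-detect-virt --container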

    Fuck this 24/7 internet spew of trivia and celebrity bullshit.

  • I'm using nested virtualization in production (KVM-in-KVM)
    I don't see any performance issues.
    I haven't tested it for applications that are very, very sensitive to the slightest latency.
    But it seems to work well for many uses.
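
    For anyone wanting to try a KVM-in-KVM setup like this, the usual host-side steps are roughly the following (a sketch for an Intel host; use kvm_amd on AMD, and note this is not remy's exact configuration):

        # On the bare-metal host: check and persistently enable nested support.
        cat /sys/module/kvm_intel/parameters/nested            # "Y" or "1" = enabled
        echo 'options kvm_intel nested=1' | sudo tee /etc/modprobe.d/kvm-nested.conf
        sudo modprobe -r kvm_intel && sudo modprobe kvm_intel  # reload with no VMs running

        # Then expose the virtualization extensions to the L1 guest, e.g. with libvirt:
        virsh edit l1-guest    # placeholder domain name; set <cpu mode='host-passthrough'/>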

    Thanked by (2)IAmNix SashkaPro
  • Nearly all NAT services either start or run on nested virt. Hell, even BuyVM ran regular VPS on it for a while.

    As long as the passthrough is set correctly, you will not even notice a difference performance-wise.

    URL Shortener | YetiNode | Come join us on the MetalVPS IRC channel!!! | Don't be a jerk, let YetiNode do the work.

  • I think there can be cases where you will see an impact, but most likely that is due to missing awareness between the layers.

    For IO, think of stripe and block sizes that don't match, or cache settings that compete with or even conflict with each other (barriers etc.). For the network, think of overhead in MTUs and similar stuff.
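
    The MTU point in practice is usually just clamping the inner interface, e.g. (hypothetical interface name; the 1450 figure assumes roughly 50 bytes of encapsulation overhead on the outer layer):

        # Inside the nested guest: leave room for the outer layer's encapsulation
        # so inner frames still fit the physical 1500-byte MTU.
        ip link set dev eth0 mtu 1450
        ip link show dev eth0 | grep -o 'mtu [0-9]*'   # verify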

    In the end I have to side with @remy here. If your layers match on the critical settings, there should not be much of an issue or a big performance hit after all - even for the often-blamed KVM-in-KVM.

    Thanked by (1)IAmNix