RESOLVED: ebtables eats 100% CPU on proxmox node

miu
edited April 2021 in Help

Good day.

I have installed Proxmox VE on a Debian 10 host node on which firewalld, MariaDB, Postfix and BIND were already configured and running.

It is a fresh Proxmox installation, and the host is currently idle: no load, no traffic. (The host node is an OVH KVM VPS with 2 vCPUs and nested virtualization enabled.)

Host node net config /etc/network/interfaces:

    auto lo
    iface lo inet loopback

    # The primary network interface for PROXMOX host node
    auto ens3
    iface ens3 inet static
            address 152.228.91.75/32
            gateway 152.228.90.1
            pointopoint 152.228.90.1
            dns-nameservers 127.0.0.1 8.8.8.8 8.8.4.4
            post-up echo 1 > /proc/sys/net/ipv4/ip_forward
            post-up echo 1 > /proc/sys/net/ipv4/conf/ens3/proxy_arp

    # for KVM VM-IPs
    auto vmbr0
    iface vmbr0 inet static
      address 152.228.91.75/32
      bridge_ports none
      bridge_stp off
      bridge_fd 0
    #  up ip route for additional-IPs/32 on vmbr0
      up ip route add 178.32.100.221/32 dev vmbr0

Firewalld uses 2 zones:

PUBLIC for host node (for ens3):

        <?xml version="1.0" encoding="utf-8"?>
        <zone>
          <short>Public</short>
          <interface name="ens3"/>
          <service name="smtp"/>
          <service name="smtps"/>
          <service name="pop3"/>
          <service name="pop3s"/>
          <service name="imap"/>
          <service name="imaps"/>
          <port port="587" protocol="tcp"/>
          <port port="53" protocol="tcp"/>
          <port port="2222" protocol="tcp"/>
          <port port="3306" protocol="tcp"/>
          <port port="53" protocol="udp"/>
          <port port="22" protocol="tcp"/>
          <port port="8006" protocol="tcp"/>
        </zone>

TRUSTED for the KVM instance (for vmbr0; all traffic to the vmbr0 IPs passes through the host system without being filtered by the host's firewall):

        <?xml version="1.0" encoding="utf-8"?>
        <zone target="ACCEPT">
          <short>Trusted</short>
          <description>All network connections are accepted.</description>
          <interface name="vmbr0"/>
        </zone>

ISSUE:

Immediately after setup (or after a host node reboot) everything works fine:

The host node OS has low CPU load (0% to 8 or 9% at most; always under 10%).
The KVM instance (using the vmbr0 interface) is also idle (CPU permanently below 4-5%; its LAMP stack, MTA and all internet connections work fine).
Neither has any traffic. ALL FINE.

BUT: after a few hours (by the next morning) I always find:
- The host node is at 100% CPU load: the ebtables-restor process permanently consumes all available CPU, while traffic is still idle.
- (The KVM instance remains OK: idle CPU and no traffic either, no problem there.)
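
A quick way to confirm which process is eating the CPU is to filter the `ps` listing by CPU share. This is only a sketch: the helper name `cpu_hogs` and the 90% threshold are made up for illustration and do not come from the original post.

```shell
# Hypothetical helper: print processes at or above a given %CPU.
# stdin: lines of "pcpu pid comm", e.g. from `ps -eo pcpu=,pid=,comm=`
# $1:    minimum %CPU (illustrative threshold)
cpu_hogs() {
    # "$1 = $1" makes awk rebuild the line with single spaces before printing
    awk -v min="$1" '($1 + 0) >= (min + 0) { $1 = $1; print }'
}
```

On the host node this could be used as `ps -eo pcpu=,pid=,comm= | cpu_hogs 90`. Note that `ps` reports CPU time averaged over the lifetime of the process, so `top` is the better tool for short spikes.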

BTW: I also tried to achieve the same thing (letting all vmbr0 traffic go directly to the VM without being affected by the host firewall) with the only other option I know of, a different firewall config using FORWARD rules:

    firewall-cmd --permanent --direct --passthrough ipv4 -I FORWARD -i vmbr0 -j ACCEPT
    firewall-cmd --permanent --direct --passthrough ipv4 -I FORWARD -o vmbr0 -j ACCEPT
    firewall-cmd --reload; service firewalld restart

Unfortunately, the result after several hours is exactly the same ISSUE as in the first case with vmbr0 added to the TRUSTED zone (after a few hours the host node's CPU is 100% busy, consumed by ebtables-restor).
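
To check whether the two passthrough rules actually ended up in the FORWARD chain, one could scan the `iptables -S FORWARD` listing for them. This is a minimal sketch, assuming the rules show up verbatim as `-A FORWARD -i <bridge> -j ACCEPT` and `-A FORWARD -o <bridge> -j ACCEPT`; the function name `bridge_forward_accepts` is hypothetical.

```shell
# Hypothetical helper: succeed only if both ACCEPT rules for a bridge exist.
# stdin: output of `iptables -S FORWARD`
# $1:    bridge interface name, e.g. vmbr0
bridge_forward_accepts() {
    awk -v br="$1" '
        $0 == ("-A FORWARD -i " br " -j ACCEPT") { in_ok = 1 }
        $0 == ("-A FORWARD -o " br " -j ACCEPT") { out_ok = 1 }
        END { exit !(in_ok && out_ok) }
    '
}
```

Run as root, for example: `iptables -S FORWARD | bridge_forward_accepts vmbr0 && echo "both rules present"`.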

I am already totally desperate: I have lost 2 whole days on this issue (I reinstalled everything from scratch several times, but exactly the same problem always occurred, with ebtables consuming 100% CPU on an idle host node and servers). I also cannot find a single similar case when searching the internet, nor any useful solutions or suggestions, nothing... It is an absolutely strange thing to me; I am not able to resolve it and move on.

BTW#2: I know it is best to run nothing but Proxmox on the host, but for several reasons I need to keep the current setup (keeping firewalld and the software mentioned above running on the host node).

Thank you very much for any effort and help. I would be really grateful if someone could help me resolve this crazy and (to me) strange issue. All attempts are highly appreciated.

Good day and Goodbye

Comments

  • Additionally, here is the iptables output in case it is useful:

    # iptables -S
    
    -P INPUT ACCEPT
    -P FORWARD ACCEPT
    -P OUTPUT ACCEPT
    -N FORWARD_IN_ZONES
    -N FORWARD_IN_ZONES_SOURCE
    -N FORWARD_OUT_ZONES
    -N FORWARD_OUT_ZONES_SOURCE
    -N FORWARD_direct
    -N FWDI_public
    -N FWDI_public_allow
    -N FWDI_public_deny
    -N FWDI_public_log
    -N FWDI_trusted
    -N FWDI_trusted_allow
    -N FWDI_trusted_deny
    -N FWDI_trusted_log
    -N FWDO_public
    -N FWDO_public_allow
    -N FWDO_public_deny
    -N FWDO_public_log
    -N FWDO_trusted
    -N FWDO_trusted_allow
    -N FWDO_trusted_deny
    -N FWDO_trusted_log
    -N INPUT_ZONES
    -N INPUT_ZONES_SOURCE
    -N INPUT_direct
    -N IN_public
    -N IN_public_allow
    -N IN_public_deny
    -N IN_public_log
    -N IN_trusted
    -N IN_trusted_allow
    -N IN_trusted_deny
    -N IN_trusted_log
    -N OUTPUT_direct
    -A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
    -A INPUT -i lo -j ACCEPT
    -A INPUT -j INPUT_direct
    -A INPUT -j INPUT_ZONES_SOURCE
    -A INPUT -j INPUT_ZONES
    -A INPUT -m conntrack --ctstate INVALID -j DROP
    -A INPUT -j REJECT --reject-with icmp-host-prohibited
    -A FORWARD -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
    -A FORWARD -i lo -j ACCEPT
    -A FORWARD -j FORWARD_direct
    -A FORWARD -j FORWARD_IN_ZONES_SOURCE
    -A FORWARD -j FORWARD_IN_ZONES
    -A FORWARD -j FORWARD_OUT_ZONES_SOURCE
    -A FORWARD -j FORWARD_OUT_ZONES
    -A FORWARD -m conntrack --ctstate INVALID -j DROP
    -A FORWARD -j REJECT --reject-with icmp-host-prohibited
    -A OUTPUT -j OUTPUT_direct
    -A FORWARD_IN_ZONES -i vmbr0 -j FWDI_trusted
    -A FORWARD_IN_ZONES -i ens3 -g FWDI_public
    -A FORWARD_IN_ZONES -g FWDI_public
    -A FORWARD_OUT_ZONES -o vmbr0 -j FWDO_trusted
    -A FORWARD_OUT_ZONES -o ens3 -g FWDO_public
    -A FORWARD_OUT_ZONES -g FWDO_public
    -A FWDI_public -j FWDI_public_log
    -A FWDI_public -j FWDI_public_deny
    -A FWDI_public -j FWDI_public_allow
    -A FWDI_public -p icmp -j ACCEPT
    -A FWDI_trusted -j FWDI_trusted_log
    -A FWDI_trusted -j FWDI_trusted_deny
    -A FWDI_trusted -j FWDI_trusted_allow
    -A FWDI_trusted -j ACCEPT
    -A FWDO_public -j FWDO_public_log
    -A FWDO_public -j FWDO_public_deny
    -A FWDO_public -j FWDO_public_allow
    -A FWDO_trusted -j FWDO_trusted_log
    -A FWDO_trusted -j FWDO_trusted_deny
    -A FWDO_trusted -j FWDO_trusted_allow
    -A FWDO_trusted -j ACCEPT
    -A INPUT_ZONES -i vmbr0 -j IN_trusted
    -A INPUT_ZONES -i ens3 -g IN_public
    -A INPUT_ZONES -g IN_public
    -A INPUT_direct -p tcp -m multiport --dports 22 -m set --match-set f2b-sshd src -j REJECT --reject-with icmp-port-unreachable
    -A IN_public -j IN_public_log
    -A IN_public -j IN_public_deny
    -A IN_public -j IN_public_allow
    -A IN_public -p icmp -j ACCEPT
    -A IN_public_allow -p tcp -m tcp --dport 25 -m conntrack --ctstate NEW,UNTRACKED -j ACCEPT
    -A IN_public_allow -p tcp -m tcp --dport 465 -m conntrack --ctstate NEW,UNTRACKED -j ACCEPT
    -A IN_public_allow -p tcp -m tcp --dport 21 -m conntrack --ctstate NEW,UNTRACKED -j ACCEPT
    -A IN_public_allow -p tcp -m tcp --dport 110 -m conntrack --ctstate NEW,UNTRACKED -j ACCEPT
    -A IN_public_allow -p tcp -m tcp --dport 995 -m conntrack --ctstate NEW,UNTRACKED -j ACCEPT
    -A IN_public_allow -p tcp -m tcp --dport 143 -m conntrack --ctstate NEW,UNTRACKED -j ACCEPT
    -A IN_public_allow -p tcp -m tcp --dport 993 -m conntrack --ctstate NEW,UNTRACKED -j ACCEPT
    -A IN_public_allow -p tcp -m tcp --dport 587 -m conntrack --ctstate NEW,UNTRACKED -j ACCEPT
    -A IN_public_allow -p tcp -m tcp --dport 53 -m conntrack --ctstate NEW,UNTRACKED -j ACCEPT
    -A IN_public_allow -p tcp -m tcp --dport 2222 -m conntrack --ctstate NEW,UNTRACKED -j ACCEPT
    -A IN_public_allow -p udp -m udp --dport 53 -m conntrack --ctstate NEW,UNTRACKED -j ACCEPT
    -A IN_public_allow -p tcp -m tcp --dport 22 -m conntrack --ctstate NEW,UNTRACKED -j ACCEPT
    -A IN_public_allow -p tcp -m tcp --dport 3306 -m conntrack --ctstate NEW,UNTRACKED -j ACCEPT
    -A IN_public_allow -p tcp -m tcp --dport 8006 -m conntrack --ctstate NEW,UNTRACKED -j ACCEPT
    -A IN_trusted -j IN_trusted_log
    -A IN_trusted -j IN_trusted_deny
    -A IN_trusted -j IN_trusted_allow
    -A IN_trusted -j ACCEPT
    

    Thanks for all help!


  • Falzo Senpai

    get rid of firewalld. it's probably not a good idea to run it in parallel to proxmox.

  • miu
    edited April 2021

    @Falzo said:
    get rid of firewalld. it's probably not a good idea to run it in parallel to proxmox.

    Yes, I share the opinion that it is better to use plain iptables instead of firewalld in combination with Proxmox.
    The problem is that other software used on the host node requires firewalld (it works only with firewalld).
    And exactly these applications are important: they run significantly better and faster directly on the host than inside a nested virtualized environment (that is, if I placed them inside a nested KVM instance instead of on the Proxmox node), because the host always gets preferential access to system resources when competing for them with nested VMs.


  • Falzo Senpai

    yeah but most likely that will continue causing problems. proxmox is not designed to run in parallel to anything else. while you can have a small nginx or the likes to use a reverse proxy, that firewalld shit is not gonna fly if it messes with the network or proxmox keeps on changing things because it can't work otherwise. put your app in a guest instead...

  • miu
    edited April 2021

    @Falzo said:
    yeah but most likely that will continue causing problems. proxmox is not designed to run in parallel to anything else. while you can have a small nginx or the likes to use a reverse proxy, that firewalld shit is not gonna fly if it messes with the network or proxmox keeps on changing things because it can't work otherwise. put your app in a guest instead...

    And exactly these applications are important: they run significantly better and faster directly on the host than inside a nested virtualized environment (that is, if I placed them inside a nested KVM instance instead of on the Proxmox node), because the host always gets preferential access to system resources when competing for them with nested VMs.

    IMO: if I do NOT FIND A SOLUTION, or a direct cause that could be fixed or eliminated (i.e. the two will never be able to work well together), then I will have to move everything to a more powerful VPS (where a nested VM would still have enough resources).
    (It is possible, as you wrote, that Proxmox and the current software incl. firewalld will constantly fight and sabotage each other... :'( )

    In any case, thanks for your opinions!


  • @Falzo and everybody who is interested in this issue

    UPDATE:

    After another day of searching, reading and investigating, I have probably found the likely cause of the problem and also a usable solution (not quite clean, but so far everything seems to work well).

    CAUSE:

    It seems that when firewalld and Proxmox VE meet on one host/OS, they compete for control of ebtables and conflict. At the beginning, ebtables is empty (all connections allowed), but after some time I found the ebtables ruleset full of nonsensical rules and many repeated rows; after a few hours there were several thousand of them. Then ebtables-restor tries (unsuccessfully, over and over) to maintain ebtables, which permanently eats a crazy amount of CPU and causes my issue.
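
The growth described above could be measured by counting rules and repeated rules in a dump of the ruleset. This is a rough sketch, assuming the usual `ebtables-save` format (`*table` section headers and `-A CHAIN ...` rule lines); the function name `ebt_rule_stats` is hypothetical, and the actual rules from the affected host are not shown in the thread.

```shell
# Hypothetical helper: count rules and repeated rules in a ruleset dump.
# stdin: output of `ebtables-save`
ebt_rule_stats() {
    awk '
        # "*filter" and similar lines start a new table section
        /^\*/  { table = substr($0, 2); next }
        # every later copy of an identical rule in the same table is a duplicate
        /^-A / { total++; if (seen[table " " $0]++) dup++ }
        END    { printf "rules=%d duplicates=%d\n", total, dup }
    '
}
```

Run as root with `ebtables-save | ebt_rule_stats`, ideally a few times over several hours to see whether the numbers keep climbing.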

    MY PARTIAL SOLUTION (which so far seems to work well):

    As I never need the Ethernet bridge firewall (because each IP is filtered by its own OS firewall), I decided to try disabling ebtables. As I did not find an option for this in the firewalld docs, I disabled ebtables in the kernel (as a blacklisted module in modprobe.d). Since then everything works well: after more than 24h with ebtables off, the load is still normal and very low (example CPU load averages: 0.14 (1 min), 0.21 (5 mins), 0.18 (15 mins)).
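
The post does not show the actual modprobe.d file, so the following is only a guess at what such a blacklist could look like. The file name `blacklist-ebtables.conf`, the helper `write_ebtables_blacklist`, and the exact module list (`ebtables` plus the `ebtable_*` table modules) are assumptions.

```shell
# Hypothetical helper: write a modprobe.d fragment that blacklists ebtables.
# $1: target directory (e.g. /etc/modprobe.d on a real host, as root)
write_ebtables_blacklist() {
    dir="$1"
    # Assumed module list: the ebtables core plus its three table modules.
    printf '%s\n' \
        '# Do not auto-load the Ethernet bridge firewall (ebtables) modules' \
        'blacklist ebtables' \
        'blacklist ebtable_filter' \
        'blacklist ebtable_nat' \
        'blacklist ebtable_broute' \
        '# Stricter variant (assumption, not from the post):' \
        '# install ebtables /bin/false' \
        > "$dir/blacklist-ebtables.conf"
}
```

Something like `write_ebtables_blacklist /etc/modprobe.d` followed by a reboot would apply it. `blacklist` only stops alias-based automatic loading, so if the module still gets loaded, the commented-out `install` line is the usual stricter alternative.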


  • Just FYI, I've been running CSF (iptables) fine on Proxmox nodes for years (once set up with rules) and don't use the supplied Proxmox firewall at all.

