Using part of the IPv6 /64 block to provide public IPs to WireGuard clients

Thanks to @MaxKVM for providing an awesome hosting service. I have a ticket with them that they and their upstream provider have not been able to resolve, and I would like to get a second opinion here.

I get a /64 block of IPv6 addresses, of which one is allocated to the eth0 interface on my VPS. I then allocate a /112 block that does not contain the eth0 address to WireGuard, and statically assign IPv6 addresses from this block to WireGuard clients.

MaxKVM does not do routed IPv6, but uses on-link IPv6, so I have to enable proxy_ndp on my VPS so that the eth0 interface responds to neighbor solicitation (NS) messages with a neighbor advertisement (NA) for addresses in the /112 block.

sudo sysctl -w net.ipv6.conf.all.proxy_ndp=1
sudo ip -6 neigh add proxy 2402:xxxx:xxxx:xxxx::200:4 dev eth0

When I try to ping an external IPv6 address from my wireguard client, the upstream router of the VPS asks who has the 2402:xxxx:xxxx:xxxx::200:4 address so that it knows where to send the response. The issue is that the upstream router sends its NS messages from a link-local address, fe80::xxxx:xxxx:xxxx:fdc0 (EUI-64 derived), and expects the reply to be sent back to that fe80 address. See the tcpdump output below.

jon@max1 /etc: sudo tcpdump -i eth0 -v 'icmp6[icmp6type]=icmp6-neighborsolicit or icmp6[icmp6type]=icmp6-neighboradvert'
04:32:07.414482 IP6 (class 0xc0, hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::xxxx:xxxx:xxxx:fdc0 > ff02::1:ff00:4: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has 2402:xxxx:xxxx:xxxx::200:4
      source link-address option (1), length 8 (1): xx:xx:xx:xx:fd:c0
04:32:07.482930 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::yyyy:yyyy:yyyy:2d51 > fe80::xxxx:xxxx:xxxx:fdc0: [icmp6 sum ok] ICMP6, neighbor advertisement, length 32, tgt is 2402:xxxx:xxxx:xxxx::200:4, Flags [solicited]
      destination link-address option (2), length 8 (1): xx:xx:xx:xx:2d:51
04:32:07.550926 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::yyyy:yyyy:yyyy:2d51 > fe80::xxxx:xxxx:xxxx:fdc0: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::xxxx:xxx:xxxx:fdc0
      source link-address option (1), length 8 (1): xx:xx:xx:xx:2d:51

Since I enabled NDP proxying, my VPS tries to send an NA back to the router's fe80 address, but it cannot resolve that address, so it sends its own NS asking who has it. That NS goes unanswered, and as a result my wireguard client gets a host unreachable error because it gets no response.

However, if I ping the global IPv6 address that is the IPv6 gateway (which is also the router) from the wireguard client, I will see a NS coming from that global IPv6 address. And because it is the gateway, my VPS has no problems with responding with a NA and IPv6 starts working on my wireguard client.

04:39:34.124527 IP6 (class 0xc0, hlim 255, next-header ICMPv6 (58) payload length: 32) 2402:zzzz:zzzz::1 > ff02::1:ff00:4: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has 2402:xxxx:xxxx:xxxx::200:4
      source link-address option (1), length 8 (1): xx:xx:xx:xx:fd:c0
04:39:34.718943 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) <My Public IPv6 address> > 2402:zzzz:zzzz::1: [icmp6 sum ok] ICMP6, neighbor advertisement, length 32, tgt is 2402:xxxx:xxxx:xxxx::200:4, Flags [solicited]
      destination link-address option (2), length 8 (1): xx:xx:xx:xx:2d:51

Is it normal to block ICMPv6 access to the fe80 address of the upstream router?

For now I have switched to NATed IPv6 for my wireguard clients, but what a waste of the /64 block.

Thanks
Jonathan

Comments

  • This is a really interesting idea

  • Any guides pls

  • A few points:

    1. I presume you are forwarding between eth0 and your wg* interface (and that you have the relevant forwarding sysctl's set to 1)
    2. I think for a forwarding scenario, you need to have accept_ra=2 on eth0
    3. What is your gateway setup for the primary wg* interface?

    I'm familiar with something like this in the Proxmox context (eth0 external, non-routed IPv6 /64, internal vmbr1, IPv6 addresses from the /64), and except that you're using WG there's no other difference, and it works. There's an IPv6 address set up on vmbr1 (and that is the gateway for the rest of the internal VMs sitting on vmbr1).

  • @nullnothere said:
    A few points:

    1. I presume you are forwarding between eth0 and your wg* interface (and that you have the relevant forwarding sysctl's set to 1)
    2. I think for a forwarding scenario, you need to have accept_ra=2 on eth0
    3. What is your gateway setup for the primary wg* interface?

    I'm familiar with something like this in the Proxmox context (eth0 external, non-routed IPv6/64, internal vmbr1, IPv6's from the /64) and except that you're using WG there's no other difference and it works. There's a IPv6 setup for vmbr1 (and that is the gateway for the rest of the internal VMs sitting on vmbr1).

    1. Yes, I have forwarding set up on wg* in both iptables and ip6tables.
    -A FORWARD -i wg+ -j ACCEPT
    -A FORWARD -o wg+ -j ACCEPT
    

    IPv4 and IPv6 forwarding is enabled in sysctl:

    net.ipv4.conf.all.forwarding = 1
    net.ipv6.conf.all.forwarding = 1 
    
    2. RA is usually used to provide link-local IPv6 gateway addresses, right (among other IPv6 network info)? The IPv6 gateway that MaxKVM provides is a global IPv6 address, and I do not think that MaxKVM has RA enabled on their network. I will try setting the accept_ra flag to 2 though.
    net.ipv6.conf.all.accept_ra = 2
    net.ipv6.conf.default.accept_ra=2
    net.ipv6.conf.eth0.accept_ra=2
    
    3. This was a question asked by support too. There is no gateway set on the wg interface on the VPS instance. Instead, traffic will flow according to the default route which is set by the eth0 gateway. On wireguard clients, the AllowedIPs is set to ::/0, so wireguard will create a default route for all IPv6 traffic on the client back to the VPS wg interface.
  • @jnraptor said:
    I will try setting the accept_ra flag to 2 though.

    Setting accept_ra to 2 does not help. My VPS instance is not able to reach the upstream router's link-local IPv6 address (fe80::xxxx:xxxx:xxxx:fdc0), so it is unable to reply to the NS messages coming from that address.

    jon@max1 /etc/network: ip ne show dev eth0
    107.xxx.xxx.xxx lladdr xx:xx:xx:xx:fd:c0 REACHABLE
    2402:zzzz:zzzz::1 lladdr xx:xx:xx:xx:fd:c0 router REACHABLE
    fe80::xxxx:xxxx:xxxx:fdc0 FAILED
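
For anyone reproducing this, the FAILED entry for the router's link-local address is the symptom to look for. A minimal sketch for spotting it, assuming the `ip neigh` output format shown above (state in the last field); the function name `failed_neighbors` is made up for illustration:

```shell
#!/bin/sh
# failed_neighbors: read `ip neigh show` output on stdin and print the
# address (first field) of every entry whose state (last field) is FAILED.
# Hypothetical helper; it only parses text and changes nothing.
failed_neighbors() {
    awk '$NF == "FAILED" { print $1 }'
}

# Typical use on the VPS (needs iproute2):
#   ip -6 neigh show dev eth0 | failed_neighbors
```

If the router's fe80 address shows up in that list while its global address is REACHABLE, the VPS cannot answer solicitations that arrive from the link-local source.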
    
  • Hmmm... I'm thinking aloud here.

    1. Can you check your ip6tables forward rules to confirm that there are packets coming/going (counters will help here).
    2. The wg* interface needs to have a default gateway for IPv6 that it can reach. Because in a sense it is a "virtual" interface, it will need to use an IPv6 that is on eth0 as the gateway (or so I think).
    3. Can you dump (suitably masked) ip -6 route show?
    4. From what you've posted originally, your wg* interfaces should just have your eth0 IPv6 as their gateway so roughly:
      gateway <-> eth0 <-> wg0
    5. As you mention, pinging a global IPv6 address works, so the outgoing route is fine at that point, but once upstream loses your neighbor cache entry, the NS arrives and for whatever reason a response isn't going out.
    6. (Just for reference: in my case, I have gateway <-> eth0 <-> vmbr1 <-> container/VM, everything very similar to what you have, except that I have an IPv6 address (say from a /80) on vmbr1, which is the gateway for the container. I'm just wondering, because (I assume) wg* is essentially a tun interface, whether there's something different to watch out for.)
  • @nullnothere said:
    Hmmm... I'm thinking aloud here.

    1. Can you check your ip6tables forward rules to confirm that there are packets coming/going (counters will help here).
    2. The wg* interface needs to have a default gateway for IPv6 that it can reach. Because in a sense it is a "virtual" interface, it will need to use an IPv6 that is on eth0 as the gateway (or so I think).
    3. Can you dump (suitably masked) ip -6 route show?
    4. From what you've posted originally, your wg* interfaces should just have your eth0 IPv6 as their gateway so roughly:
      gateway <-> eth0 <-> wg0
    5. As you mention, your pinging a global IPv6 works so the outgoing route is fine at that point, but once upstream looses your neighbor cache entry, the NS arrives and for whatever reason a response isn't going out.
    6. (just for reference, in my case, I have gateway <-> eth0 <-> vmbr1 <-> container/vm everything very similar to what you have except that I have an IPv6 (say a /80) for vmbr1 which is the gateway for the container. I'm just wondering because (I assume) wg* is essentially a tun interface if there's something different to watch out for.
    1. Output from ip6tables confirms that packets are being forwarded on the wg interface. This would also be confirmed since it starts working after pinging the gateway, and it also works when I switch to NATed IPv6.
    sudo ip6tables -L -v -n
    Chain FORWARD (policy DROP 179 packets, 19062 bytes)
     pkts bytes target     prot opt in     out     source               destination                    
     1161  427K ACCEPT     all      wg+    *       ::/0                 ::/0                
      717  387K ACCEPT     all      *      wg+     ::/0                 ::
    

    2/3. Output from ip -6 route below. 2402:zzzz:zzzz::1 is the gateway provided by MaxKVM and is a /48 address. 2402:zzzz:zzzz:yyyy::/64 is the /64 block assigned to the VPS and 2402:zzzz:zzzz:yyyy::200:0/112 is the /112 block I have assigned to wg0.

    jon@max1 ~/traefik: ip -6 route
    ::1 dev lo proto kernel metric 256 pref medium
    2402:zzzz:zzzz::1 dev eth0 metric 1024 pref medium
    2402:zzzz:zzzz:yyyy::200:0/112 dev wg0 proto kernel metric 256 pref medium
    2402:zzzz:zzzz:yyyy::/64 dev eth0 proto kernel metric 256 pref medium
    fe80::/64 dev eth0 proto kernel metric 256 pref medium
    default via 2402:zzzz:zzzz::1 dev eth0 metric 1024 onlink pref medium
    
    4. Yes, gateway <-> eth0 <-> wg0 would be the path that data going to and leaving the wg interface would take.
    5. Pinging the IPv6 gateway creates a reachable neigh entry on the upstream router. That router's cache entry lasts more than 6 hours, but certainly less than 1 day, before it gets cleared and I have to ping the gateway again from the wireguard client. The client is not connected all the time though.
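
Since pinging the gateway's global address from the client is what refreshes the router's neighbor entry, one stopgap is to have the client do that automatically whenever the tunnel comes up. This is only a workaround sketch, not a fix for the underlying problem. It assumes the client uses wg-quick (which runs `PostUp` commands after bringing the interface up); the function name is hypothetical and the gateway address is whatever the host documents:

```shell
#!/bin/sh
# print_gateway_postup GATEWAY
# Prints a wg-quick PostUp line for the client's wg0.conf that pings the
# provider's global gateway address a few times once the tunnel is up.
# "|| true" keeps a failed ping from aborting the wg-quick run, the same
# guard used on the PreUp lines elsewhere in this thread.
print_gateway_postup() {
    printf 'PostUp = ping -6 -c 3 %s || true\n' "$1"
}

# Example: print_gateway_postup 2001:db8::1
# (2001:db8::/32 is the documentation prefix, used here as a placeholder.)
```

This only covers the moment of connecting; a client that stays connected for longer than the router's cache lifetime would still need to repeat the ping.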
  • First, I assume the wg interface does not have a MAC. That is likely going to be one of the reasons you are not having a fe80:: address for wg0 (key difference in my case with a vmbr* interface).

    Can you also confirm that your wg0 interface's IP is reliably/consistently reachable from the outside? This is the one proxy entry you have added (and not of the clients). Are you having troubles reaching only your other clients on the wg* interfaces or even the primary wg* IPv6 that you've explicitly proxied on the VM?

    If you wish to dynamically support various client IPv6s in your /112 subnet, you'll have to either run an NDP proxy daemon on your VM or else statically add all desired IPs via neigh proxy.

    Do also dump (masked) ip -6 neigh show proxy.

    That missing MAC is getting me thinking...
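
The "statically add all desired IPs" option can be scripted. A minimal sketch, assuming the clients are numbered consecutively in the last hextet of the /112; the function name, the argument order, and the 2001:db8 documentation prefix in the example are placeholders, not anything from the setup above:

```shell
#!/bin/sh
# print_proxy_cmds PREFIX IFACE FIRST LAST
# Prints one `ip -6 neigh add proxy` command per host number in
# FIRST..LAST (decimal), appending the number in hex as the final hextet.
# A /112 leaves 16 host bits, so valid host numbers are 0..65535.
# Dry run only: review the output, then pipe it to `sudo sh` to apply it.
print_proxy_cmds() {
    prefix=$1 iface=$2 n=$3 last=$4
    while [ "$n" -le "$last" ]; do
        printf 'ip -6 neigh add proxy %s%x dev %s\n' "$prefix" "$n" "$iface"
        n=$((n + 1))
    done
}

# Example: print_proxy_cmds 2001:db8:0:1::200: eth0 4 6
```

Proxy neighbor entries live in kernel state only, so they need to be added again after a reboot.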

  • Abdullah (Hosting Provider, OG)
    edited September 2020

    Maybe try disabling ebtables for your VPS once, power off & restart the VM.
    I had a slightly similar case earlier.

  • Disabling ebtables did not help in this case. :/

    Perhaps our resident networking Ph.D can help where we, SolusVM, and HIVELOCITY failed. @yoursunny do you have any ideas?

  • I have this working on a VPS from skb-enterprise, but when I tried to recreate the same setup on tetahost it didn't work.
    While tetahost was still up, tcpdump showed that their side was constantly soliciting, getting an advertisement in response, then soliciting again as if it had ignored it.

    I would suggest validating your setup on a different network.

  • @jnraptor said:
    MaxKVM does not do routed IPv6, but uses on-link IPv6

    @MaxKVM said:
    Perhaps our resident networking Ph.D can help

    You need to have routed IPv6.

    When we teach networking class, we need to send IPv4 traffic into a simulation system, and I just asked lab staff to setup static routes for a /16 subnet on the L3 switches.

    No hostname left!

  • edited September 2020

    @nullnothere said:
    First, I assume the wg interface does not have a MAC. That is likely going to be one of the reasons you are not having a fe80:: address for wg0 (key difference in my case with a vmbr* interface).

    Can you also confirm that your wg0 interface's IP is reliably/consistently reachable from the outside? This is the one proxy entry you have added (and not of the clients). Are you having troubles reaching only your other clients on the wg* interfaces or even the primary wg* IPv6 that you've explicitly proxied on the VM?

    If you wish to dynamically support various client IPv6s in your /112 subnet, you'll have to either run an ndp proxy on your VM or else statically add all desired IPs to via neigh proxy.

    Do also dump (masked) ip -6 neigh show proxy.

    That missing MAC is getting me thinking...

    Yes, the wg interface has no MAC address:

    21: wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1000
        link/none 
        inet 10.9.0.1/24 scope global wg0
           valid_lft forever preferred_lft forever
        inet6 2402:zzzz:zzzz:yyyy::200:1/112 scope global 
           valid_lft forever preferred_lft forever
    

    I do not proxy the wg0 interface's IP, so it will not be reachable from outside. wg clients can access it though, for dns resolution.

    I mention in the original post that I am proxying ndp, testing with just 1 client IP now. The server and clients can ping each other though.

    jon@max1 ~: ip ne show proxy
    2402:zzzz:zzzz:yyyy::200:4 dev eth0  proxy
    
  • @yoursunny said:

    @jnraptor said:
    MaxKVM does not do routed IPv6, but uses on-link IPv6

    @MaxKVM said:
    Perhaps our resident networking Ph.D can help

    You need to have routed IPv6.

    When we teach networking class, we need to send IPv4 traffic into a simulation system, and I just asked lab staff to setup static routes for a /16 subnet on the L3 switches.

    Sadly, not many hosts provide routed IPv6, and that is why I have to use proxy_ndp.

  • edited September 2020

    @jnraptor said: I do not proxy the wg0 interface's IP, so it will not be reachable from outside.

    Aaha. OK. Suggestion: proxy the wg0 IPv6 as well and set up the default route for the clients via that IPv6, and you should be all set.

    If need be, set up an explicit gateway via your eth0 IPv6 address on the wg side (remember that because there's no MAC, there's no possibility of using the fe80 interface to discover routes on that part of the network, and so you need to have the explicit route set up).

    Do post your findings as this is definitely an interesting case (I have no issues with a virtual bridge which does have a MAC but otherwise everything is identical).

    Edit Add: Also (from my earlier post): If you wish to dynamically support various client IPv6s in your /112 subnet, you'll have to either run an NDP proxy daemon on your VM or else statically add all desired IPs via neigh proxy.

  • edited September 2020

    @nullnothere said:

    @jnraptor said: I do not proxy the wg0 interface's IP, so it will not be reachable from outside.

    Aaha. OK. Suggestion. Proxy the wg0 IPv6 as well and setup the default route for the clients via that IPv6 and you should be all set.

    If need be setup an explicit gateway via your eth0 IPv6 address on the wg side (remember that because there's no MAC, there no possibility of using the fe80 interface to discover routes on that part of the network and so you need to have the explicit route setup.

    Do post your findings as this is definitely an interesting case (I have no issues with a virtual bridge which does have a MAC but otherwise everything is identical).

    @jnraptor said:
    3. This was a question asked by support too. There is no gateway set on the wg interface on the VPS instance. Instead, traffic will flow according to the default route which is set by the eth0 gateway. On wireguard clients, the AllowedIPs is set to ::/0, so wireguard will create a default route for all IPv6 traffic on the client back to the VPS wg interface.

    On the client side, there is already a route created when wireguard starts the connection.

    jon@ubuntu-vm ~: ip -6 route show table 51820
    default dev wg0 metric 1024 pref medium
    

    I will try proxying the wg0 IPv6 address too though. UPDATE: that does not help. Clients can already access the VPS wg0 interface through the tunnel. On the VPS side, tcpdump now shows the same NS and NA loop for the interface address if I try pinging it from outside.

    sudo ip -6 neigh add proxy 2402:xxxx:xxxx:yyyy::200:1 dev eth0
    

    I am not exactly sure what you mean by setup an explicit gateway via your eth0 IPv6 address on the wg side? Do you mean setting up an explicit route on the client? I tried doing something similar, but I still get the NS/NA loop on the VPS.

    sudo ip -6 route del ::/0 dev wg0 table 51820
    sudo ip -6 route add 2402:zzzz:zzzz::1 dev wg0 table 51820
    sudo ip -6 route add default via 2402:zzzz:zzzz::1 table 51820
    

    Just to confirm the routing on the client side is working fine, I tried doing a traceroute on the wireguard client and it is using the expected IPv6 gateway. The same result occurs whether I use the wireguard created route, or add the explicit routing.

    jon@ubuntu-vm ~: traceroute6 google.com
    traceroute to google.com (2404:6800:4003:c00::8b) from 2402:xxxx:xxxx:yyyy::200:4, 30 hops max, 24 byte packets
     1  2402:xxxx:xxxx:yyyy::200:1  308.9000 ms  212.1753 ms  215.5430 ms
     2  2402:zzzz:zzzz::1 742.0964 ms  251.5280 ms  212.3483 ms
     3  <snipped>  255.4452 ms  217.5764 ms  294.1100 ms
    ...
    

    Interesting note is that after the traceroute, IPv6 will start working, because the 2nd hop pings the IPv6 gateway, generating a NS that my VPS can respond to.

    edit: More info to the traceroute

  • @comi said:
    I have this working on a VPS from skb-enterprise, but when I tried to recreate the same setup on tetahost it didn't work.
    While tetahost was still up tcpdump showed that their side was constantly soliciting, getting advertise in response, then soliciting again as if it ignored it.

    I would suggest validating your setup on a different network.

    The NS/NA loop that you mention looks very similar to the issue I have.

    Anyway, this is the second host where I am using NDP proxying to try and get around on-link IPv6. IPv6 completely stopped working on the first host after a few days of testing, so I cancelled it.

    However, I have successfully tested without NDP on other services before:
    1. Linode provides a /128 IPv6 address and /64 block (or larger) on request. They handle the routing of the block to the instance's /128 address so no NDP proxying is required.
    2. AWS EC2 allows for multiple IPv6 addresses to be added to an instance. By disabling DHCPv6, I can use one of the IPv6 addresses as the public IP, and allocate the rest to wireguard clients without having to proxy ndp.
    3. If I use tunnelbroker, HE will route the assigned /64 or /48 subnet to the client IPv6 address, so public IPv6 addresses can be allocated from the subnet to wireguard clients without having to proxy ndp. The IPv6 routing is not as good though and geolocation is broken too (if outside US).

  • comi (OG)
    edited September 2020

    @jnraptor
    Hosters here have mentioned on multiple occasions that IPv6 has low usage, so I think it's quite likely it simply doesn't get enough attention.

    @deepak_leb said: Any guides pls

    Not a guide but a working setup for reference (I hope I didn't forget anything):

    ~# ip a
    2: eth0:
        inet6 2a0d:xxxx::1/64 scope global
    6: wg0:
        inet6 2a0d:xxxx::b055:1/112 scope global
    
    ~# ip -6 route
    2a0d:xxxx::b055:0/112 dev wg0 proto kernel metric 256 pref medium
    2a0d:xxxx::/64 dev eth0 proto kernel metric 256 pref medium
    default via 2a0d:xxxx::1 dev eth0 metric 1024 onlink pref medium
    
    ~# cat /etc/sysctl.conf
    net.ipv4.ip_forward=1
    net.ipv6.conf.all.forwarding=1
    net.ipv6.conf.all.proxy_ndp=1
    
    ~# ip -6 neigh show proxy
    2a0d:xxxx::b055:30 dev eth0  proxy
    
    ~# cat /etc/wireguard/wg0.conf
    [Interface]
    Address = 2a0d:xxxx::b055:1/112
    
    [Peer]
    AllowedIPs = 2a0d:xxxx::b055:20/123
    
  • I'm just going back to the basics now:

    1. wg initiated ipv6 works fine - confirms that (outward initiated) routing (and return path) is not a problem and everything is ok.
    2. Once upstream's neigh cache expires, incoming (outside initiated IPv6 traffic, not wg initiated responses) will stop because of no reply to a NS from upstream to the wg IPv6.
    3. From your initial post, the NS is reaching the VPS but it is unable to reply (not clear why) and so likely upstream thinks your wg IPv6 is unreachable and you drop off the net.
    4. I'm assuming that there's no funny firewalling issues (since 1 is OK)
    5. Key is the NA (response to the NS from upstream) from your VPS (say eth0) that is not reaching upstream.

    Some thoughts (and bear with me and any stupid ideas)

    1. Do you know if you can get things to work purely using a fe80::1 gateway for eth0? Then you are using purely onlink routing which will help overcome the NA replies not going upstream
    2. I'm quite puzzled about the >6 hours upstream ndp cache - that seems very unusual and unlikely and I'm more used to a much smaller lifetime.
    3. Your latest post traceroute has latency at 300ms - what gives? Seems very odd.
    4. Do you have the luxury of a clean start to ensure that there's no "remnant" effect playing tricks here?
    5. From another IPv6-enabled host, can you consistently ping eth0's IPv6 and the wg0 IPv6? It'll start well but will/should soon (and I mean much less than 6hrs) time out on wg0. If it doesn't, you can bring down wg0, which should then trigger some unreachable responses to upstream and stop wg0 IPv6 traffic. Now when you bring up wg0 again, things will take <1m to stabilize and you should get reconnected again. Having your tcpdump (as you showed in your first post) will help identify things here (not sure if it is going to be the same or if there'll be some more data points to help).
    6. Why doesn't your tun interface have a fe80::/64 route entry?
    7. In my case, pretty much same setup, except instead of wg0 I have vmbr1 and it works exactly as expected (of course different host, but that shouldn't matter).
    8. I'm not sure on what other blackmagic wg is weaving underneath in the routing tables or if any firewalling rules but (1) above seems to confirm that all is well (unless of course it is because of established/connected/related rules). A good way to verify is to add a LOG target to your IPv6 FORWARD chain at the end (since your policy is a drop and there ARE some drops). Likewise I'm a bit puzzled on the rather high packet sizes on your IPv6 tables (relatively low packet count but large byte count - so that's not purely the ICMPv6 traffic). It'll help if you can have other things quiet and start with a fresh rule set to help check the counters and see if there's any funny business there.
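
The LOG-target suggestion in point 8 can be sketched as follows. This is a dry-run helper (hypothetical name, hypothetical default prefix) that only prints the command; `-j LOG` and `--log-prefix` are standard ip6tables options, and appending with `-A` places the rule after the existing ACCEPT rules so that only packets about to hit the DROP policy are logged:

```shell
#!/bin/sh
# print_forward_log_rule [PREFIX]
# Prints an ip6tables command that appends a LOG rule to the end of the
# FORWARD chain, so packets that fall through to the DROP policy are
# written to the kernel log first. Dry run: pipe to `sudo sh` to apply.
print_forward_log_rule() {
    prefix=${1:-"ip6-fwd-drop: "}
    printf 'ip6tables -A FORWARD -j LOG --log-prefix "%s"\n' "$prefix"
}

# Example: print_forward_log_rule | sudo sh
# Zeroing the counters first (ip6tables -Z FORWARD) makes a fresh test run
# easier to read; logged packets then appear in the kernel log (dmesg).
```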

    I hope this helps and irrespective I'm very curious on the root cause and assuming this is reasonably run-of-the-mill upstream, I'm beginning to suspect some wg trickery that is confounding things.

  • @nullnothere said:
    I'm just going back to the basics now:

    5. Key is the NA (response to the NS from upstream) from your VPS (say eth0) that is not reaching upstream.

    @jnraptor said:
    Is it normal to block ICMPv6 access to the fe80 address of the upstream router?

    Yes, my suspicion is that this is the root cause, hence my original question. I was not able to get an answer to this from MaxKVM or Hivelocity.

    1. Do you know if you can get things to work purely using a fe80::1 gateway for eth0? Then you are using purely onlink routing which will help overcome the NA replies not going upstream

    That would work if I could ping the fe80::1 gateway from eth0 and if the upstream router were configured to listen on it. Unfortunately, that does not work.

    jon@max1 ~: ping -I eth0 fe80::1
    ping: Warning: source address might be selected on device other than: eth0
    PING fe80::1(fe80::1) from :: eth0: 56 data bytes
    
    --- fe80::1 ping statistics ---
    4 packets transmitted, 0 received, 100% packet loss, time 3080ms
    
    3. Your latest post traceroute has latency at 300ms - what gives? Seems very odd.

    I am in US, and the VPS is in Singapore.

    4. Do you have the luxury of a clean start to ensure that there's no "remnant" effect playing tricks here?

    I can try that over the weekend.

    6. Why doesn't your tun interface have a fe80::/64 route entry?

    Reading online, it is likely because wireguard operates at layer 3 (IP tunnel), instead of layer 2 (ethernet mac tunnel).

    However, as part of the starting fresh, I could try to bring up an OpenVPN tunnel and route a different /112 block.

  • @jnraptor said: Is it normal to block ICMPv6 access to the fe80 address of the upstream router?

    Right and I'd say no. Lots of bad things happen when (some) ICMPv6 packets are disallowed.

    Yes, my suspicion is that this is the root case, hence my original question. I was not able to get an answer to this from MaxKVM or Hivelocity.

    Seems odd because if you have an upstream /48 gateway, I'd think that things are routed and of course you shouldn't need any fe80 magic.

    layer 3 (IP tunnel), instead of layer 2 (ethernet mac tunnel).

    Sure, that and of course there is no MAC address here... but I've seen other instances where IIRC the tun device also did get fe80::/64 - and I'm not very sure on the impact here and so just ignore it for now.

    OpenVPN tunnel and route a different /112 block

    Please post findings. Very interested in this as well.

    I am in US, and the VPS is in Singapore.

    Got it.

    Logically I'm in the clear here and can't figure out why this isn't working as expected. In any case, do keep an eye on the iptables and if possible check the LOG target and counters. I'm just suspicious that something is coming in the way and I don't yet have a clue.

    If I think of something, I'll post my thoughts and tag you @jnraptor - definitely an interesting puzzle.

  • rm_
    edited September 2020

    1) do not expect proxy_ndp to work;
    2) https://github.com/DanielAdolfsson/ndppd may have a better chance of working, but it often won't either, because of this: https://github.com/DanielAdolfsson/ndppd/issues/55
    Ask @Neoon, he had a ton of experience with this on NanoKVM.

    The proper solution is to have (beg / demand / ask for) routed subnets. Anything else is basically a set of horrible hacks to work around the provider being incompetent and giving you unusable networking setup.

  • @jnraptor said:

    @yoursunny said:

    @jnraptor said:
    MaxKVM does not do routed IPv6, but uses on-link IPv6

    @MaxKVM said:
    Perhaps our resident networking Ph.D can help

    You need to have routed IPv6.

    When we teach networking class, we need to send IPv4 traffic into a simulation system, and I just asked lab staff to setup static routes for a /16 subnet on the L3 switches.

    Sadly, not many hosts provide routed IPv6, and that is why I have to use proxy_ndp.

    ADD35 may unlock routed IPv6.

  • @yoursunny said:
    ADD35 may unlock routed IPv6.

    :/

  • Neoon (OG, Senpai)
    edited October 2020

    @rm_ said:
    Ask @Neoon, he had a ton of experience with this on NanoKVM.

    Yes, SolusVM & Virtualizor ARE BEST FOR THIS!
    RECOMMENDED 10/10

    Besides, you need to ask the provider to disable security features to make a full /64 work in Virtualizor, if it's routed.
    And as @rm_ said, you need at least one routed /64; anything else will be a pain in the arse or close to impossible.

    Also, sometimes you need to set the router's NDP entry to static to make it work,
    because the provider has a broken setup.

  • @jnraptor said:
    3. If I use tunnelbroker, geolocation is broken too (if outside US).

    Geolocation of US PoPs is also broken: Ashburn VA goes to Fremont CA, causing suboptimal CDN node selection in some cases.
    However, this isn't HE's fault. Blame the location database provider.

    @rm_ said:
    The proper solution is to have (beg / demand / ask for) routed subnets. Anything else is basically a set of horrible hacks to work around the provider being incompetent and giving you unusable networking setup.

    Truth. It's also what I said earlier.
    As a stallion coder, I don't do dirty hacks.

  • edited October 2020

    @Neoon said:
    Besides, you need to ask the provider to disable security features to make a full /64 work in Virtualizor, if its routed.

    @MaxKVM noted that they disabled ebtables. Do you know which security features need to be disabled?

    Also sometimes, you need to set the Router NDP entry on static, to make it work.
    Because the provider has a broken setup.

    I tried doing a permanent neighbor entry for the upstream router's fe80 address.

    sudo ip ne add fe80::xxxx:xxxx:xxxx:fdc0  lladdr xx:xx:xx:xx:fd:c0 dev eth0 router
    

    My VPS no longer sends out the NS entry below asking for who has the router's fe80 address, but my VPS is still unable to communicate over ICMPv6 with the router's fe80 address, so tcpdump will just show a constant NS/NA loop.

    04:32:07.550926 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::yyyy:yyyy:yyyy:2d51 > fe80::xxxx:xxxx:xxxx:fdc0: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::xxxx:xxx:xxxx:fdc0
          source link-address option (1), length 8 (1): xx:xx:xx:xx:2d:51
    

    Edit: Add first question

  • I was able to get public IPv6s from a /112 block to work on @Abdullah's WebHorizon though.

    1. Allocate IPv6 address(es) to the OVZ node on the control panel. This will automatically modify the node's /etc/network/interfaces file.
    2. Compile and install boringtun or wireguard-go, as the nodes run on a 3.10 kernel, which does not support the native WireGuard kernel module. Enable IPv6 forwarding in sysctl. No proxy_ndp is required.
    3. In the wg0.conf file, add a PreUp line that tears down the IPv6 address(es) from venet0 that will be allocated to clients:
    PreUp = ip -6 a del 2402:xxxx:xxxx:xxxx:102/64 dev venet0 || true
    PreUp = ip -6 a del 2402:xxxx:xxxx:xxxx:103/64 dev venet0 || true
    ...
    
    4. Add the IPv6 address to the AllowedIPs for each peer in wg0.conf, and also to the peers' conf files. Use a /128 netmask for all IPv6 addresses to prevent conflicts with the /112 netmask on venet0.
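
Steps 3 and 4 are repetitive once there are more than a couple of clients, so the config lines can be generated. A minimal sketch: the two function names are hypothetical, the prefix length on the `PreUp` lines is passed in as a parameter (the lines quoted above use /64), and the example addresses use the 2001:db8 documentation prefix as placeholders:

```shell
#!/bin/sh
# print_preup_lines IFACE PREFIXLEN ADDR...
# Prints one wg0.conf PreUp line per address, removing that address from
# IFACE before the tunnel comes up. "|| true" keeps wg-quick going when
# the address has already been removed.
print_preup_lines() {
    iface=$1 plen=$2
    shift 2
    for addr in "$@"; do
        printf 'PreUp = ip -6 a del %s/%s dev %s || true\n' "$addr" "$plen" "$iface"
    done
}

# print_allowed_ips ADDR...
# Prints the matching AllowedIPs line (one /128 per address); each line
# belongs in the [Peer] section of the client that owns the address.
print_allowed_ips() {
    for addr in "$@"; do
        printf 'AllowedIPs = %s/128\n' "$addr"
    done
}

# Example:
#   print_preup_lines venet0 64 2001:db8::102 2001:db8::103
#   print_allowed_ips 2001:db8::102
```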

    What is interesting is that with userland wireguard, a fe80 address is generated on the wireguard interface, most probably because it is using a tun interface behind the scenes:

    3: wg0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1420 qdisc mq state UNKNOWN group default qlen 500
        link/none 
        inet 10.9.0.101/24 scope global wg0
           valid_lft forever preferred_lft forever
        inet6 2402:xxxx:xxxx:xxxx:101/128 scope global 
           valid_lft forever preferred_lft forever
        inet6 fe80::xxxx:xxxx:xxxx:927e/64 scope link stable-privacy 
           valid_lft forever preferred_lft forever
    

    I tried using the same boringtun userland setup on MaxKVM too, and though a fe80 address was generated, I still had the same problem where my VPS instance could not communicate with the upstream router's fe80 address, so it could not reply properly to NS messages.

  • Neoon (OG, Senpai)

    @jnraptor said:

    @Neoon said:
    Besides, you need to ask the provider to disable security features to make a full /64 work in Virtualizor, if its routed.

    @MaxKVM noted that they disabled ebtables. Do you know what about security features are required to be disabled?

    Also sometimes, you need to set the Router NDP entry on static, to make it work.
    Because the provider has a broken setup.

    Exactly.

    I tried doing a permanent neighbor entry for the upstream router's fe80 address.

    sudo ip ne add fe80::xxxx:xxxx:xxxx:fdc0  lladdr xx:xx:xx:xx:fd:c0 dev eth0 router
    

    My VPS no longer sends out the NS entry below asking for who has the router's fe80 address, but my VPS is still unable to communicate over ICMPv6 with the router's fe80 address, so tcpdump will just show a constant NS/NA loop.

    Well, it does not always fix things. In the specific case where you check the neighbour discovery entries and the router's entry goes to FAILED, you can try it and see if it fixes the issue.

  • Thanks to @MaxKVM for being patient with me and following this thread too. They had me reboot the VPS after verifying the networking changes on their end and it is working now!

    My VPS is able to ping the upstream router's fe80 address now and thus able to respond back with NAs, and IPv6 starts working on wireguard clients immediately. No config changes other than what has been specified above (ip forwarding in sysctl and iptables, proxy_ndp enabled and wireguard ipv6 addresses added to neigh proxy).
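
For reference, the server-side pieces listed in the parentheses above can be collected in one place. This is a dry-run sketch built only from commands already quoted in this thread; the function name is hypothetical, and it assumes the upstream network lets the VPS reach the router's fe80 address, which was the actual fix here:

```shell
#!/bin/sh
# print_server_setup IFACE ADDR...
# Prints the server-side commands described in this thread: IPv6
# forwarding, proxy_ndp, the two FORWARD rules for wg interfaces, and one
# neigh proxy entry per WireGuard client address on the external IFACE.
# Dry run only: review the output, then pipe it to `sudo sh`.
print_server_setup() {
    iface=$1
    shift
    printf '%s\n' \
        'sysctl -w net.ipv6.conf.all.forwarding=1' \
        'sysctl -w net.ipv6.conf.all.proxy_ndp=1' \
        'ip6tables -A FORWARD -i wg+ -j ACCEPT' \
        'ip6tables -A FORWARD -o wg+ -j ACCEPT'
    for addr in "$@"; do
        printf 'ip -6 neigh add proxy %s dev %s\n' "$addr" "$iface"
    done
}

# Example: print_server_setup eth0 2001:db8:0:1::200:4
```

None of these settings survive a reboot on their own, so they would have to be persisted separately (for example in /etc/sysctl.conf for the sysctls, as in the working setup posted earlier).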
