Can’t SSH via Wireguard
Hello.
I've been installing a Proxmox cluster.
- Node 1 in provider A
- Node 2 in provider A
- Node 3 in provider B
For security reasons (maybe), I’m using WireGuard, and everything works fine except that node 3 can’t connect to the other nodes via SSH. If I use the public IPs of the nodes, the connection succeeds.
SSH log: https://pastebin.com/raw/UTvb50xx
Any ideas?
Comments
Misconfigured WireGuard, or DPI is blocking it.
Free NAT KVM | Free NAT LXC
Do a tcpdump on both sides on the wireguard interfaces for the ssh traffic, and then maybe also do a tcpdump on both sides on the wireguard traffic (and match up the packets each time).
MTU so low MSS even lower TCP cannot cope
vps9 hostname is available. affbrr
It's always DNS. When it isn't, it's always MTU.
Check MTU first as mentioned - most likely issue. I had one situation where it was a DSCP issue. Check that DSCP 0x8 is not being dropped. Quick test is to force-change outgoing DSCP to 0x10 (this is done on Node 3 in your case):
Or whatever firewall equivalent.
Adjusting MTU didn't work.
DSCP didn't work.
Only incoming/outgoing SSH connections are blocked on node 3.
Did you try the tcpdump as mentioned by @cmeerw? That'll show you if the packets are getting there and then dropped. Since it's ssh specific and other traffic is fine, it feels like it has to be a firewall.
this is tcpdump on node3. I was trying to connect from node1.
I don't have any firewall in place.
pve-firewall is stopped.
ufw not installed.
I'm ready to nuke this node.
Can you ssh in normally?
Free Hosting at YetiNode | MicroNode| Cryptid Security | URL Shortener | LaunchVPS | ExtraVM | Host-C | In the Node, or Out of the Loop?
Yes, I can use SSH normally with public addresses. Only in/out in the wireguard network fails.
The rest of the nodes don't have any issue.
The main difference is this node is in another provider (BreezeHost). Ping is between 2-3ms
mss 1460 is asking for trouble - why isn't it lower? (shouldn't it be something like 1380 with the default wireguard MTU of 1420?) To me it looks like an MTU problem. You did say "Adjusting MTU didn't work." - how did you do that? And how did the tcpdump look after the adjustment?
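The arithmetic behind that remark, as a rough sketch: the MSS is the interface MTU minus the IP and TCP headers (no options), so MTU 1420 gives 1380 over IPv4, while 1460 is what a 1500-byte interface would normally advertise. The function name here is made up for illustration.

```python
# Header sizes in bytes, without IP or TCP options.
IPV4_HEADER = 20
IPV6_HEADER = 40
TCP_HEADER = 20


def expected_mss(mtu, ipv6=False):
    """Largest TCP payload that fits in a single packet of size `mtu`."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - TCP_HEADER


if __name__ == "__main__":
    for mtu in (1500, 1420, 1280):
        print(f"MTU {mtu}: MSS {expected_mss(mtu)} over IPv4, "
              f"{expected_mss(mtu, ipv6=True)} over IPv6")
```

If the SYN seen on the WireGuard interface carries mss 1460, that would suggest it was sized for a 1500-byte MTU rather than the tunnel's.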
BTW, you are getting seq 1:41 and then seq 1489:1601, so you are missing seq 41:1489, which presumably gets dropped somewhere because of an MTU issue.
Sorry if I'm being dim but if you're running that on Node 3 why are you only seeing traffic between Node 1 and Node 2?
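For anyone who wants to check a capture for the kind of hole pointed out above (seq 1:41, then seq 1489:1601), here is a small sketch. It assumes you have already pulled the `start:end` pairs out of the tcpdump output by hand; the helper name is hypothetical.

```python
def missing_ranges(seen):
    """Return the gaps between (start, end) sequence ranges.

    Ranges follow tcpdump's "seq start:end" notation, where end is exclusive.
    """
    if not seen:
        return []
    ordered = sorted(seen)
    gaps = []
    covered_up_to = ordered[0][1]
    for start, end in ordered[1:]:
        if start > covered_up_to:
            gaps.append((covered_up_to, start))
        covered_up_to = max(covered_up_to, end)
    return gaps


if __name__ == "__main__":
    # The two segments quoted from the capture in this thread.
    seen = [(1, 41), (1489, 1601)]
    for start, end in missing_ranges(seen):
        print(f"missing seq {start}:{end} ({end - start} bytes)")
```

The hole is 1448 bytes, which is about the size of one full segment at MSS 1460 if TCP timestamps are in use (that last part is an assumption about this connection). Small packets arriving while a full-size one vanishes is the usual MTU symptom, although as it turned out here, it is not the only possible cause.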
Same O.S image? Not the first time I've experienced weird issues with these shitty panel templates.
I already tried with lower MTU. Will check again with 1380.
EDIT: Same issue with 1380 and lower.
node2 is actually node3. My mistake obfuscating the logs.
These are bare metal servers. Directly from the PVE ISO.
I'm rebuilding node3 right now.
In which case, that doesn't show any traffic on port 22/tcp, which would imply the traffic is never making it there, right? Or are you running sshd on a random port?
ssh port is 50773
sudo netstat -plant | grep :PORT (whatever port you are using)
Also ssh -vvv
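Besides netstat and ssh -vvv, a quick way to see whether small packets get through at all is to open a plain TCP connection to sshd and read its identification line. A minimal sketch, assuming Python 3 is on the node; the function name is made up.

```python
import socket


def read_ssh_banner(host, port, timeout=5.0):
    """Connect over TCP and return the first line the server sends."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.settimeout(timeout)
        data = b""
        # An SSH server sends a short identification line ending in a newline.
        while b"\n" not in data and len(data) < 255:
            chunk = conn.recv(64)
            if not chunk:
                break
            data += chunk
    first_line = data.split(b"\n", 1)[0]
    return first_line.decode("ascii", errors="replace").strip()


# Example call, using the WireGuard address and sshd port mentioned in this thread:
#   print(read_ssh_banner("10.100.0.1", 50773))
```

If the banner comes back but a real ssh session still stalls during key exchange, the listener is fine and the trouble is with the larger packets that follow.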
Can you show the corresponding tcpdump for that? (and then also the tcpdump for the wireguard traffic)
And then compare that with the tcpdump for the non-wireguarded ssh connection.
And then you can do some ping tests between the nodes with varying packet sizes.
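A sketch of what those ping tests could look like. It only builds the commands, assuming Linux iputils ping and IPv4 inside the tunnel; the helper name and the list of candidate MTUs are my own choices.

```python
ICMP_OVERHEAD_V4 = 28  # 20-byte IPv4 header + 8-byte ICMP header


def ping_commands(target, mtus):
    """Build ping commands whose packets exactly fill each candidate MTU."""
    commands = []
    for mtu in mtus:
        payload = mtu - ICMP_OVERHEAD_V4
        # -M do sets the don't-fragment bit, so an oversized packet
        # fails instead of being silently fragmented.
        commands.append(f"ping -M do -c 3 -s {payload} {target}")
    return commands


if __name__ == "__main__":
    for command in ping_commands("10.100.0.1", [1280, 1380, 1420, 1500]):
        print(command)
```

Run the printed commands from node 3 towards the other nodes' WireGuard addresses. The largest size that still gets replies tells you the usable MTU inside the tunnel.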
Reinstalled node3 from scratch and I can ping 10.100.0.1 and 10.100.0.2 after installing wireguard (not configured yet).
I didn't try ping before installing wireguard.
WTFFFFFF!!?
traceroute 10.100.0.1 on node3:
I reinstalled another distro for testing and those private IPs were reachable. I'm waiting for an answer from the provider.
I'm reinstalling PVE again and I will need to use another set of IPs.
Cluster is gone. Deleted the wrong files in the wrong node. Fuck.
Da hail you say?
Maybe it was a bad idea to run the cluster on Wireguard.
While migrating a VM, everything went down.
inevitable
How so?
Looks like the connection gets stuck during key exchange. Double check if WireGuard MTU/MSS is set correctly on node 3 (try MTU 1280 or 1420). Also, make sure no firewall or iptables rules are blocking or mangling packets on node 3. Could also try SSH with -o IPQoS=none just to test.
Thanks guys. Sorry I didn't give you an update on time.
The main issue was the provider already using the IP 10.0.0.3 for their networking stuff. That's why ping was working but not ssh.
It took me some time but right now my Proxmox cluster over Wireguard is working perfectly. I had issues with DDoS protection, corosync network bandwidth and Proxmox firewall. But it was fun.
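For anyone who lands here with the same symptom: before picking WireGuard addresses, it may be worth comparing them against whatever private ranges are already routed on the host (for example from `ip route` and `ip addr`, or from the provider's documentation). A minimal sketch with Python's standard ipaddress module; the function name and the sample networks are hypothetical, not the real ones from this cluster.

```python
import ipaddress


def find_conflicts(planned, in_use):
    """Return (planned, in_use) pairs of networks that overlap."""
    conflicts = []
    for planned_item in planned:
        planned_net = ipaddress.ip_network(planned_item, strict=False)
        for used_item in in_use:
            used_net = ipaddress.ip_network(used_item, strict=False)
            same_family = planned_net.version == used_net.version
            if same_family and planned_net.overlaps(used_net):
                conflicts.append((str(planned_net), str(used_net)))
    return conflicts


if __name__ == "__main__":
    # Hypothetical values: addresses you want on wg0 versus a range
    # the provider already uses on its side.
    planned = ["10.0.0.1/32", "10.0.0.2/32", "10.0.0.3/32"]
    in_use = ["10.0.0.3/32"]
    for mine, theirs in find_conflicts(planned, in_use):
        print(f"{mine} collides with {theirs}")
```

This only catches ranges you already know about. In a case like this one, where the provider's use of the address is not visible in the local routing table, a ping that answers before WireGuard is even configured is the giveaway.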