[2022] ★ VirMach ★ RYZEN ★ NVMe ★★ The Epic Sales Offer Thread ★★

VirMach · July 2022

Okay after working on this all day, outside of the two nodes somehow still down (SJCZ004 and NYCB036) I've counted these specifically and we've gone down from something like 2% of people after the most recent round of migrations to exactly 0.91% of people on Ryzen having a VPS with an incorrect IP or non-bootable. At this point for the other two I basically just need the DC hands to reboot them at least and then I'll probably be able to try to take it from there... I hope it doesn't end up taking two weeks for that type of request but we'll see as we're on our way there.

All the others left have been organized and require a re-migration at this point. This number also does include all the broken VMs from Ryzen migrate button, which make up the majority and are clustered on 28 different nodes. What we're going to do is first create the LVMs manually to fix the majority and boot them up, and then either within 24 hours try to restore the data if it exists, or if it takes any longer, send out a ticket and ask people if they still wanted their old data (in case they begin using it and already loading in their own or just don't have important data.)

This last bit took astronomically longer to identify and fix but now that it's the ending portion of it and heavily organized with all the other problems (the other 1% fixed today) out of the way, it should go smoothly.

Then we'll probably run our auto credit and close ticket script again for everyone, probably piss off a few hundred people at least, but actually be able to get on track with tickets as otherwise the majority of the tickets are going to end up being for issues we already resolved. I'll try to wait on this until we get the other two nodes back up and finalized as well and any remaining migrations done.

Daevien · July 2022

From the fact that I still have a number of vps that haven't migrated yet, I'm guessing there is a certain amount total that haven't changed yet. I'm assuming the future migrations left hopefully will be to existing systems that have been tested?

VirMach · July 2022

@Daevien said:
From the fact that I still have a number of vps that haven't migrated yet, I'm guessing there is a certain amount total that haven't changed yet. I'm assuming the future migrations left hopefully will be to existing systems that have been tested?

All systems are existing systems that have been tested, some more than others. The newer nodes I've actually spent less time testing, not as a result of being careless, but in that a lot of the issues were already ironed out so it was a quicker fix and it's possible less of these core issues get passed onto those nodes. We have not sold anything on these Ryzen nodes for several weeks now so a good portion of migrations will be to existing nodes that freed up a little bit. The rest will mostly be Hivelocity setups and the final bit will be the last servers I've built which essentially have the best solid state drives (IMO) so they shouldn't "drop off" and the motherboards have been more extensively tested and pre-flashed (mostly.) Any issues these nodes may have will be related to power and brackets falling apart during shipping which get fixed before it goes online. I'm being specific about where I deploy them to ensure if we run into problems, it should hopefully be a quick fix. Only caveat is that the Hivelocity locations will have zero RAM available. We had to do so many RAM swaps and they're all with existing partners, I ran out, but I memory tested everything.

Networking configuration will be the one that gives us no problems straight out the gate for the new nodes as well and I've not sent any additional servers outside of storage to the partner who shall not be named who has taken over a week for a power button press and two weeks for switch configuration or initial node deployments.

We've also gotten quite good at migrations by now so those will hopefully be as smooth as they get for the last bit.

VirMach · July 2022

I did also finally get some switch access, so you will either see more nodes up in NYC that we can use for migrations, or you'll see me break the networking.

tototo · July 2022

Is PHXZ004 working fine? I couldn't log in via SSH, so I logged in via VNC, but even ping to xxx.xxx.xxx.1 failed.
No reply required. Instead of writing a reply to me, you can rest your eyes and body 😉

VirMach · July 2022

I was going to get a flight to San Jose for tomorrow morning but I'm still waiting to see how we can even get DC access since it's the first time. I might still go on Tuesday and see if I can beat the DC hands to it for SJCZ004 at this point, only about an hour flight.

@tototo said:
Is PHXZ004 working fine? I couldn't log in via SSH, so I logged in via VNC, but even ping to xxx.xxx.xxx.1 failed.
No reply required. Instead of writing a reply to me, you can rest your eyes and body 😉

Phoenix actually has some of our infrastructure on it and I can confirm it's being terrible. But it's still mostly usable. We need to redo network configuration on it still.

netrix · July 2022

@VirMach said:

all my VPS (LA) can't ping to 8.8.8.8. any explanation on that?

VirMach · July 2022

@netrix said:

@VirMach said:

all my VPS (LA) can't ping to 8.8.8.8. any explanation on that?

Very unlucky user or very lucky user based on preference, when it happened, etc. Maybe your services are getting migrated to Ryzen, maybe they're all broken.

VirMach · July 2022

NYC storage node, I had to do a lot of weird setups for this one to make it work as we faced problems such as a disk going missing, the wrong switch, and so on in the background. Right now it's LACP aggregated 3Gbps.

Originally was supposed to have 36 disks, then had 31, then Amazon delayed it, then either one died or DC lost it so we had 30 and I can't keep waiting at this point to try to do 36 disk RAID. This is a beefier card, I figured we give it a try with two 16 disk arrays to essentially emulate two Tokyo smaller servers or to add further fun challenges if I'm wrong about this decision. I configured dm-cache but I don't know what to do with it yet outside of just maybe using it for a test VM. SolusVM surprisingly added this feature in V2, whenever that ends up happening we'll be ready and you can get a whopping 7GB of Gen4 NVMe cache with your 1TB storage (I can probably get it it up higher it has open slots.) Don't know if it's any good.

Anyway, Geekbench:
https://browser.geekbench.com/v5/cpu/16224250/claim?key=836554

This is a Threadripper 3960X. It's the infamous motherboard I dropped, I killed two of the RAM slots so that's why it's at 192GB instead of the planned 256GB and why 36 disk --> 2 x 16 (32) makes more sense, but I ran a lot of tests on it and it was otherwise healthy.

This 9560 has a weird battery (CacheVault) non-issue, I think it just takes longer to boot or has a more strict requirement before initiating. It's been like that since brand new and I've looked into it in every way I could with the limited time, but it only happens on initial boot and shows healthy state.

Fun fact though, it's now impossible to get a replacement motherboard for these, fully. I've looked so hard. So if it ever dies we'll just do a board swap with Epyc, 5950X, 6950X by then, or something like that.

9560-16i RAID controller.

Oh and we already have like 3 NICs now to make it 10Gbps, 20Gbps and later 40Gbps even maybe. We basically spam ordered and crossed fingers for this one since it's initially a lot larger than Tokyo and needs the 10Gbps. They just need to connect it.

I'm about to send the maintenance email for one of the current storage nodes, while we were running backups this week two of the disks on it died and it's degraded. That's one reason I was rushing this out in the last 2 days, I am absolutely not getting DC hands involved at the old DC, every single RAID-related thing they've done recently has been guaranteed destroying all data and the last time they worked on this they almost jumbled the data. Luckily that means we already have like 80% of a massive storage server already backed up to another storage server and I'll be moving off the people that didn't get backed up first.

YABS on the first VPS:

Basic System Information:
---------------------------------
Uptime     : 0 days, 0 hours, 4 minutes
Processor  : QEMU Virtual CPU version 2.5+
CPU cores  : 2 @ 3799.952 MHz
AES-NI     : ❌ Disabled
VM-x/AMD-V : ❌ Disabled
RAM        : 2.9 GiB
Swap       : 3.0 GiB
Disk       : 913.2 GiB
Distro     : Debian GNU/Linux 10 (buster)
Kernel     : 4.19.0-6-amd64

fio Disk Speed Tests (Mixed R/W 50/50):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 22.18 MB/s    (5.5k) | 7.53 MB/s      (117)
Write      | 22.20 MB/s    (5.5k) | 7.94 MB/s      (124)
Total      | 44.38 MB/s   (11.0k) | 15.47 MB/s     (241)
           |                      |
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 86.89 MB/s     (169) | 116.46 MB/s    (113)
Write      | 91.50 MB/s     (178) | 124.21 MB/s    (121)
Total      | 178.39 MB/s    (347) | 240.67 MB/s    (234)

iperf3 Network Speed Tests (IPv4):
---------------------------------
Provider        | Location (Link)           | Send Speed      | Recv Speed
                |                           |                 |
Clouvider       | London, UK (10G)          | 213 Mbits/sec   | 920 Mbits/sec
Online.net      | Paris, FR (10G)           | 882 Mbits/sec   | 744 Mbits/sec
Hybula          | The Netherlands (40G)     | 767 Mbits/sec   | 1.75 Gbits/sec
Uztelecom       | Tashkent, UZ (10G)        | 793 Mbits/sec   | 499 Mbits/sec
Clouvider       | NYC, NY, US (10G)         | 941 Mbits/sec   | 2.82 Gbits/sec

Okay let's pretend I didn't forget to enable the cache.


fio Disk Speed Tests (Mixed R/W 50/50):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 313.06 MB/s  (78.2k) | 1.18 GB/s    (18.4k)
Write      | 313.88 MB/s  (78.4k) | 1.18 GB/s    (18.5k)
Total      | 626.95 MB/s (156.7k) | 2.37 GB/s    (37.0k)
           |                      |
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 5.34 GB/s    (10.4k) | 5.11 GB/s     (4.9k)
Write      | 5.63 GB/s    (11.0k) | 5.45 GB/s     (5.3k)
Total      | 10.97 GB/s   (21.4k) | 10.56 GB/s   (10.3k)

Alright now let's fill it and bring those numbers down. I'll also try to start moving the other one less aggressively to LAX storage and shifting our disaster recovery backups from that to non-RAID since they're mostly complete. Once we finish the rest of the NY storage we'll let you move back and I apologize in advance for the double move but it's better than it getting yoinked offline.

vyas · July 2022

Any space left on this one for new sign ups?

(Asking for a friend)

tetech · July 2022

What is the consensus with networking in Phoenix (PHXZ003), are whole nodes having issues? I cannot ping to 8.8.8.8 from the VPS after trying the usual stuff like a hard power-off then on. I have the "Fix Ryzen IP" option, but don't want to just try stuff randomly if the whole site is having issues.

cybertech · July 2022

@tetech said:
What is the consensus with networking in Phoenix (PHXZ003), are whole nodes having issues? I cannot ping to 8.8.8.8 from the VPS after trying the usual stuff like a hard power-off then on. I have the "Fix Ryzen IP" option, but don't want to just try stuff randomly if the whole site is having issues.

my LAX had 8.8.8.8 issues for a while , but went back up.

funny enough i was trying to do a bench.sh test and it was coincidentally down, and up afterwards.

FrankZ · July 2022

@tetech said: What is the consensus with networking in Phoenix (PHXZ003), are whole nodes having issues? I cannot ping to 8.8.8.8 from the VPS after trying the usual stuff like a hard power-off then on. I have the "Fix Ryzen IP" option, but don't want to just try stuff randomly if the whole site is having issues.

Not just you, I'm on PHXZ002 and the network was so so before, but now no network at all for last ~24 hours. I don't think you can fix this yourself. VirMach mentioned issues in Phoenix above.

@VirMach said: Phoenix actually has some of our infrastructure on it and I can confirm it's being terrible. But it's still mostly usable. We need to redo network configuration on it still.

atomi · July 2022

@VirMach said:

@willie said:
Looks like my vpsshared site is down, not responding to pings. It's a super low traffic personal site, so I'll survive, but it sounds like things are probably backlogged there.

I'd scream at CC but I've given up on that. They placed a permanent nullroute on the main IP. This is probably the 30th time they've done this, from a single website being malicious. I'm trying to get a server up and just move it at this point.

Again I'd usually be freaking out and working on it immediately to get it online but I'm just being realistic here when there's probably a dozen others in worse states. It sucks, it's unprofessional, but there's no use having more stress over it as that'll just slow me down.

I'm wondering if its LA10GKVM14 since that node has been down almost one month already

FrankZ · July 2022

@atomi said: I'm wondering if its LA10GKVM14 since that node has been down almost one month already

I expect he was referring to the shared hosting server in Buffalo with the comment you quoted. I could be wrong, but that was my understanding.

VirMach · July 2022

@cybertech said:

@tetech said:
What is the consensus with networking in Phoenix (PHXZ003), are whole nodes having issues? I cannot ping to 8.8.8.8 from the VPS after trying the usual stuff like a hard power-off then on. I have the "Fix Ryzen IP" option, but don't want to just try stuff randomly if the whole site is having issues.

my LAX had 8.8.8.8 issues for a while , but went back up.

funny enough i was trying to do a bench.sh test and it was coincidentally down, and up afterwards.

Honestly at this point I think 8.8.8.8 issue is actually not on any specific datacenter's end. We had it happen on our WHMCS for a little bit, and that one's not even hosted with any datacenter we use right now for these Ryzens. We've also had it reported on LAX which is QN and Phoenix which is PhoenixNAP IIRC.

So it has to be something upstream probably unless it's in some weird way related?

VirMach · July 2022

@atomi said:

@VirMach said:

@willie said:
Looks like my vpsshared site is down, not responding to pings. It's a super low traffic personal site, so I'll survive, but it sounds like things are probably backlogged there.

I'd scream at CC but I've given up on that. They placed a permanent nullroute on the main IP. This is probably the 30th time they've done this, from a single website being malicious. I'm trying to get a server up and just move it at this point.

Again I'd usually be freaking out and working on it immediately to get it online but I'm just being realistic here when there's probably a dozen others in worse states. It sucks, it's unprofessional, but there's no use having more stress over it as that'll just slow me down.

I'm wondering if its LA10GKVM14 since that node has been down almost one month already

LA10GKVM14 was part of a massive PDU or power circuit event with them, where like a dozen plus servers went offline and online, and then offline, causing a bunch of power supplies to fail. A lot of times when we've had power issues like that, CC also moves them to another switch without telling us and then does not configure the networking properly. This is one of those cases, so it's basically been left without functional networking and no one's willing to help. We're marking it at a loss at this point because it's impossible to get them to do anything anymore. It's possible this also has other issues like a failed controller/data corruption as a result of it taking so long for them to hook up another PSU, I think it took 4 or 5 days and the battery drained for the cache but it doesn't matter. We have to try to locate backups which we haven't had luck with so far because backups also failed and were closely tied to the power event, I think we have partial backups.

So right now we're mostly stuck on regenerating services.

VirMach · July 2022

@FrankZ said:

@tetech said: What is the consensus with networking in Phoenix (PHXZ003), are whole nodes having issues? I cannot ping to 8.8.8.8 from the VPS after trying the usual stuff like a hard power-off then on. I have the "Fix Ryzen IP" option, but don't want to just try stuff randomly if the whole site is having issues.

Not just you, I'm on PHXZ002 and the network was so so before, but now no network at all for last ~24 hours. I don't think you can fix this yourself. VirMach mentioned issues in Phoenix above.

@VirMach said: Phoenix actually has some of our infrastructure on it and I can confirm it's being terrible. But it's still mostly usable. We need to redo network configuration on it still.

Phoenix is completely trashed right now.

cybertech · July 2022

@VirMach said:

@cybertech said:

@tetech said:
What is the consensus with networking in Phoenix (PHXZ003), are whole nodes having issues? I cannot ping to 8.8.8.8 from the VPS after trying the usual stuff like a hard power-off then on. I have the "Fix Ryzen IP" option, but don't want to just try stuff randomly if the whole site is having issues.

my LAX had 8.8.8.8 issues for a while , but went back up.

funny enough i was trying to do a bench.sh test and it was coincidentally down, and up afterwards.

Honestly at this point I think 8.8.8.8 issue is actually not on any specific datacenter's end. We had it happen on our WHMCS for a little bit, and that one's not even hosted with any datacenter we use right now for these Ryzens. We've also had it reported on LAX which is QN and Phoenix which is PhoenixNAP IIRC.

So it has to be something upstream probably unless it's in some weird way related?

i think so too. something related geographically in in US

VirMach · July 2022

Yeah I don't know how reliable this random site is that I found but:

atomi · July 2022

@VirMach said: We have to try to locate backups which we haven't had luck with so far because backups also failed and were closely tied to the power event, I think we have partial backups.
So right now we're mostly stuck on regenerating services.

I think most of us/customers would be happy with fresh idlers servers. Specially after a month of downtime so I wouldn't spend any extra time for hunting backups but I usually to keep my own backups.

TrueBlumfeld · July 2022

@VirMach said:

@TrueBlumfeld said:
My VPS order is: 4678913601
Pls activate it, thank you @VirMach

Tokyo is full right now, these will get activated by Wednesday most likely. If you'd like any of you guys (mucstudio, nauthnael, TrueBlumdfeld, gkl1368) can reply back to me here and request a refund or I can activate you in San Jose for now and migrate you to Tokyo when it's ready. Or of course you can just wait.

Could you please migrate me to Tokyo？
It's been offline for ten days on SJZ004 and the control pannel page timeout

order is: 4678913601

VirMach · July 2022

@TrueBlumfeld said:

@VirMach said:

@TrueBlumfeld said:
My VPS order is: 4678913601
Pls activate it, thank you @VirMach

Tokyo is full right now, these will get activated by Wednesday most likely. If you'd like any of you guys (mucstudio, nauthnael, TrueBlumdfeld, gkl1368) can reply back to me here and request a refund or I can activate you in San Jose for now and migrate you to Tokyo when it's ready. Or of course you can just wait.

Could you please migrate me to Tokyo？
It's been offline for ten days on SJZ004 and the control pannel page timeout

order is: 4678913601

How do you propose a migration of an offline server?

tototo · July 2022

@VirMach said:
How do you propose a migration of an offline server?

Can you migrate my broken VPS in Phoenix to getRandomLocation() without data?
(Yes, needless to say, this is a joke. I'd rather wait than open a ticket)

TrueBlumfeld · July 2022

@VirMach said:

@TrueBlumfeld said:

@VirMach said:

@TrueBlumfeld said:
My VPS order is: 4678913601
Pls activate it, thank you @VirMach

Tokyo is full right now, these will get activated by Wednesday most likely. If you'd like any of you guys (mucstudio, nauthnael, TrueBlumdfeld, gkl1368) can reply back to me here and request a refund or I can activate you in San Jose for now and migrate you to Tokyo when it's ready. Or of course you can just wait.

Could you please migrate me to Tokyo？
It's been offline for ten days on SJZ004 and the control pannel page timeout

order is: 4678913601

How do you propose a migration of an offline server?

there is no important data ，a new vps is enough

VirMach · July 2022

@tototo said:

@VirMach said:
How do you propose a migration of an offline server?

Can you migrate my broken VPS in Phoenix to getRandomLocation() without data?
(Yes, needless to say, this is a joke. I'd rather wait than open a ticket)

I'd actually accept these kinds of request if it was possible to keep it clean but realistically that means you get a new VM and then the old one just kind of hangs around while it's offline and takes up space until we manually verify which ones are abandoned and clear up the space.

For Phoenix though it's technically online so we could start allowing Ryzen migrations. Let me just finish up with what I need to do this morning and I'll activate the button again.

tototo · July 2022

@VirMach said:

@tototo said:

@VirMach said:
How do you propose a migration of an offline server?

Can you migrate my broken VPS in Phoenix to getRandomLocation() without data?
(Yes, needless to say, this is a joke. I'd rather wait than open a ticket)

I'd actually accept these kinds of request if it was possible to keep it clean but realistically that means you get a new VM and then the old one just kind of hangs around while it's offline and takes up space until we manually verify which ones are abandoned and clear up the space.

For Phoenix though it's technically online so we could start allowing Ryzen migrations. Let me just finish up with what I need to do this morning and I'll activate the button again.

I think it would help some users, but "(non-Phoenix location) has no buttons!!!" and that could open a ticket.
I am not really in a hurry and can wait. That said, I respect your decision.

mayiprint · July 2022

@Virmach
Additional 7 IPs to VPS two months ago. Some additional IPs have been unavailable for about a month. The additional IPs of the billing panel also cannot be displayed. I opened a ticket on July 6th. I'd appreciate it if you could help me check it out.
Ticket #218767

VirMach · July 2022

@tototo said:

@VirMach said:

@tototo said:

@VirMach said:
How do you propose a migration of an offline server?

Can you migrate my broken VPS in Phoenix to getRandomLocation() without data?
(Yes, needless to say, this is a joke. I'd rather wait than open a ticket)

I'd actually accept these kinds of request if it was possible to keep it clean but realistically that means you get a new VM and then the old one just kind of hangs around while it's offline and takes up space until we manually verify which ones are abandoned and clear up the space.

For Phoenix though it's technically online so we could start allowing Ryzen migrations. Let me just finish up with what I need to do this morning and I'll activate the button again.

I think it would help some users, but "(non-Phoenix location) has no buttons!!!" and that could open a ticket.
I am not really in a hurry and can wait. That said, I respect your decision.

Button updated.

I can't get it to work but let's see if anyone else has any luck (edit -- looks like it did work for at least one person so far.) To be far I was trying to break it by being impatient and refreshing/closing it. Let me allow it to run for 5 minutes and see what happens instead.

Oh wait I think it's because I activated our trap card

atomi · July 2022

Would it be possible to show that button also ppl with servers in broken nodes (like LA10GKVM14)? That way users could regenerate their services in different nodes

[2022] ★ VirMach ★ RYZEN ★ NVMe ★★ The Epic Sales Offer Thread ★★

Comments