[2022] ★ VirMach ★ RYZEN ★ NVMe ★★ The Epic Sales Offer Thread ★★

Comments

  • VirMach Hosting Provider

    I got lucky with SJCZ004. After about 20 reboots and messing with the BIOS, when I finally gave up and was going to try recovering data, I selected virtual disk in the BIOS and it booted into the OS instead (???). Seems like it just wants to do the opposite of everything I tried.

    Anyway, something's definitely wrong with it and it was just luck that it even powered on into the BIOS. We're short on servers, but I don't see any reason for it to crash so soon. Last time it was related to the network reconfiguration (it's been so long I don't remember exactly). We should have a lot more servers up in the next few days and we'll evaluate an emergency migration at that time.

    Huge shoutout to remote hands again for doing literally nothing until the server decided to spontaneously boot itself up. (I know they weren't involved because BMC keeps track of power events, aka, no power button press.)

    There have been a lot of cases of people doing their own network configuration due to issues with the reconfigure button, but then they set the IP address to the node's main IP or a random IP in the subnet. This is considered IP stealing/network abuse, so I don't know if I have to provide a PSA here, but definitely only use the IP address assigned to you. We've had some issues with the anti-IP theft script, which uses ebtables, so we had to disable it on a lot of nodes, but this doesn't mean you're allowed to use an IP address you're not paying for/not part of your service. You will be suspended, and while I understand in some cases it's just a mistake, we're going to be relatively strict depending on the problems it caused. With that said, one of these cases pretty much caused SJCZ004 to have problems with the initial configuration and indirectly led to this long outage.
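
    For anyone configuring the network by hand, the rule above boils down to a simple comparison. The sketch below is only an illustration and is not the ebtables-based script mentioned in the post; the function name `unassigned_addresses` is hypothetical and the addresses are made-up documentation addresses.

```python
import ipaddress


def unassigned_addresses(configured, assigned):
    """Return the configured addresses that are NOT part of the service.

    configured: addresses currently set on the VM's interface (strings)
    assigned:   addresses the provider assigned to the service (strings)
    """
    # Parse into address objects so "192.0.2.010"-style typos fail loudly
    # and comparisons are done on values, not on raw strings.
    allowed = {ipaddress.ip_address(a) for a in assigned}
    return [c for c in configured if ipaddress.ip_address(c) not in allowed]


if __name__ == "__main__":
    # Hypothetical example: the VM was assigned .10 but also has the
    # node's main IP (.1) configured by mistake.
    extra = unassigned_addresses(
        configured=["192.0.2.10", "192.0.2.1"],
        assigned=["192.0.2.10"],
    )
    print("Addresses to remove:", extra)
```

    An empty result means the interface only carries addresses that belong to the service.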

    Some other updates:

    • I've done some limited testing of re-enabling the IP security tools after the VLAN splits, and it looks like they're able to run properly at least on some servers, so we're going to focus on fixing these slowly to avoid further problems caused by IP stealing. This should at least fix a small percentage of the services that can't use their IP because someone else may be using it. Rare but possible.
    • DALZ009 & DALZ010 are resembling the issue we've had with that board / chassis / CPU / heatsink combo where the clamp down on the CPU is weak and during shipping the CPU gets slightly "lifted" causing it to stick to the heatsink. Usually a CPU re-seat will fix this but the problem is AM4 has the pins on the CPU and if it's not removed properly by a good tech, they could easily damage the CPU and make it worse. So we're shipping in new CPUs and hopefully scheduling it with non-DC techs.
    • ATLZ007 has a similar issue to SJCZ004 at this time, but not necessarily the same cause, just the same endpoint. I'm attempting to fix it in the same way but it looks like it might actually be functioning more as expected so I'm attempting boot repair.

    These nodes are still being looked into but have a problem that's either waiting on DC hands or difficult to identify/complete at this time:

    • SJCZ008
    • LAXA014
    • NYCB036
    • FFME001
  • VirMach Hosting Provider
    edited July 2022

    @willie said:
    Anyone know if CC Buffalo is on some IP blacklists? I have trouble reaching my server there from Comcast where I am right now, but it works fine from just about everywhere else. One person has been able to reproduce the issue from another location on Cox network. The server itself is fine afaict. I have not yet had the energy to compare packet traces at both ends, but will do that tomorrow. This is très weird.

    This is one of those things where I don't even think it's their fault, but they're on the UCEPROTECT blacklist, which as @FrankZ mentioned should not be trusted, but for some reason it's "industry standard" for a bunch of lazy sysadmins to use it. I guess by literally having infinite false positives, it gets their job done by also happening to block some malicious people.

    The people that maintain the blacklist essentially extort you for money, and easily put hundreds of thousands of IP addresses on their "3" list I believe, just for a few dozen IP addresses on it being marked as malicious, and then some oblivious companies use this list instead of at the very least using their "2" list which is more specific. Again, they might as well just block a few million IP addresses randomly and they'd achieve the same results.

    Thanked by (3)FrankZ tototo AlwaysSkint
  • edited July 2022

    Yo @Virmach : a wee heads-up in that there's an imminent transfer request coming. Standard Ticket to Billing?
    [$/£ exchange rate is killing me, just now. :'( ]

    P.S. Yessum, UCEProtect scam is a real PITA!

    It wisnae me! A big boy done it and ran away.
    NVMe2G for life! until death (the end is nigh)

  • VirMach Hosting Provider

    Finally fixed NYCB036.

    They had the LAN port plugged in as dedicated IPMI on the switch we use for IPMI, and labeled it as a port on the main switch when it wasn't. Took a fair bit of detective work to find since the whole setup is a mess at this point. I'm probably going to have to go down to New York and fix it one day and make it neat, they never followed initial rack instructions and just put everything in randomly basically.

    Thanked by (1)FrankZ
  • VirMach Hosting Provider

    @AlwaysSkint said:
    Yo @Virmach : a wee heads-up in that there's an imminent transfer request coming. Standard Ticket to Billing?
    [$/£ exchange rate is killing me, just now. :'( ]

    P.S. Yessum, UCEProtect scam is a real PITA!

    I haven't been billing for these recently but they also take an outrageous amount of time to complete, so a fair bit of warning. This is one of the things I've tried and failed to fix over the last two years. Maybe we'll finally have time for reworking the process but until then you can realistically expect to wait weeks if not longer.

    Yes, standard ticket to wherever and it'll get marked as logistical support by us.

    Thanked by (2)AlwaysSkint FrankZ
  • edited July 2022

    @VirMach said: ..they never followed initial rack instructions..

    If you want something done..
    [Alternate]
    Can't get the staff these days.

    (( Ahh, are you sending this equipment with colour coded ports and matching cables? "You put the blue one here.." Square peg; round hole. ))

    Thanked by (1)FrankZ

  • FrankZ Moderator
    edited July 2022

    @VirMach said: Finally fixed NYCB036.

    and ATL-Z007 seems to be back as well. :+1:


    @AlwaysSkint said:
    (( Ahh, are you sending this equipment with colour coded ports and matching cables? "You put the blue one here.." Square peg; round hole. ))

    Normally this would be a good idea, but I don't think this would work for VirMach as I expect he would get his work order assigned to the only color blind tech. :bleep_bloop:

    I am currently traveling in mostly remote areas until sometime in April 2024. Consequently DM's sent to me will go unanswered during this time.
    For staff assistance or support issues please use the helpdesk ticket system at https://support.lowendspirit.com/index.php?a=add

  • edited July 2022

    @FrankZ said: and ATL-Z007 seems to be back as well. :+1:

    I'll second that. "Hmm, shall I wait before switching it back as the secondary nameserver?" I'll see how my day goes. ;)

    ((No green & blue cables in the same batch!!!))

    Thanked by (1)FrankZ

  • @VirMach said:
    There have been a lot of cases of people doing their own network configuration due to issues with the reconfigure button, but then they set the IP address to the node's main IP or a random IP in the subnet. This is considered IP stealing/network abuse, so I don't know if I have to provide a PSA here, but definitely only use the IP address assigned to you. We've had some issues with the anti-IP theft script, which uses ebtables, so we had to disable it on a lot of nodes, but this doesn't mean you're allowed to use an IP address you're not paying for/not part of your service. You will be suspended, and while I understand in some cases it's just a mistake, we're going to be relatively strict depending on the problems it caused. With that said, one of these cases pretty much caused SJCZ004 to have problems with the initial configuration and indirectly led to this long outage.

    Two weeks ago, I noticed an IP change when I invoked sudo reboot.
    I suppose the IP renumbering occurred prior to my reboot, but my OS didn't pick it up until it rebooted.

    I have a DHCP client, but the DHCP server seems to give out very long leases (many years).
    Consequently, in the period between the IP renumbering and my reboot, my OS could have been using someone else's IP address.

    Now, if you need to renumber IP assignments, you should reduce the DHCP lease duration.
    Otherwise, it's not my fault if I use someone else's IP address for a long time.
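
    To put rough numbers on the lease concern: per RFC 2131, a DHCP client by default first tries to renew at half the lease time (T1) and falls back to rebinding at 87.5% of it (T2), unless the server sends different timer values. A minimal sketch follows; the two-year lease is a made-up stand-in for "many years", and `dhcp_timers` is a hypothetical helper, not part of any DHCP client.

```python
from datetime import timedelta


def dhcp_timers(lease):
    """Return the default (T1, T2) timers for a given lease duration.

    T1 (renewal) defaults to 50% of the lease and T2 (rebinding) to 87.5%,
    as described in RFC 2131. A server may override both values.
    """
    t1 = lease * 0.5
    t2 = lease * 0.875
    return t1, t2


if __name__ == "__main__":
    # Assumed lease lengths for illustration only.
    for label, lease in [
        ("1 hour", timedelta(hours=1)),
        ("2 years", timedelta(days=730)),
    ]:
        t1, t2 = dhcp_timers(lease)
        print(f"lease {label}: first renewal after {t1}, rebind after {t2}")
```

    With a multi-year lease the client may not contact the DHCP server again for a year or more unless it reboots, which matches the behaviour described above.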

    Original incident log:
    https://lowendspirit.com/discussion/comment/92125/#Comment_92125

    ServerFactory aff best VPS; HostBrr aff best storage.

  • DHCP for servers is a bloody silly idea, IMHumbleO. (I know, it makes auto-deployment easier!)

  • cybertech OG Benchmark King

    @VirMach said:
    Finally fixed NYCB036.

    They had the LAN port plugged in as dedicated IPMI on the switch we use for IPMI, and labeled it as a port on the main switch when it wasn't. Took a fair bit of detective work to find since the whole setup is a mess at this point. I'm probably going to have to go down to New York and fix it one day and make it neat, they never followed initial rack instructions and just put everything in randomly basically.

    How much are DC hands paid? I would like to take it up as a summer job. Seems like no experience is necessary.

    I bench YABS 24/7/365 unless it's a leap year.

  • @VirMach
    What happened? I can't use it. How can it be network abuse? SJCZ004.

  • @AlwaysSkint said:
    DHCP for servers is a bloody silly idea, IMHumbleO. (I know, it makes auto-deployment easier!)

    I keep DHCP enabled so that the provider can renumber IP addresses if they want; this also allows the service to come up after migration to a different location upon my request.
    I shouldn't be guilty of "network abuse" if the provider sets improper DHCP lease duration which causes my OS to keep using an IP address that has been unassigned from my service.

    So far, four providers have renumbered IP addresses on my services: SecureDragon, AlphaRacks, WebHorizon, VirMach.
    Among those, VirMach was the only provider that renumbered IP addresses without sending prior notice.
    If they plan to do so regularly, I may need to set up more scripts to automatically adjust DNS records and such.

    Thanked by (1)adly

  • @VirMach said:

    @atomi said:

    @AlwaysSkint said:
    @Virmach - my fallback secondary server is now working again. The Ryzen Fix IP function did the trick. Phew!
    One less thing on your plate. Like that will make a difference! :|

    I've had the same issue after migration, where some other VPS uses the same IP, and there's no Ryzen IP fix available at NYCB018. So I have a fast Ryzen idler with semi-working network :p

    I've gone down these dozens of times at this point and checked for broken IP assignments, looks like I missed one on NYCB018 because I'm semi-dyslexic and it was .212 instead of .221 on a single VM. I've corrected that one now.

    Mine is .232, and when it's used by some other server, the route goes somewhere other than NYCB018. It's happening quite often.

  • VirMach Hosting Provider

    @FrankZ said:

    @VirMach said: Finally fixed NYCB036.

    and ATL-Z007 seems to be back as well. :+1:


    @AlwaysSkint said:
    (( Ahh, are you sending this equipment with colour coded ports and matching cables? "You put the blue one here.." Square peg; round hole. ))

    Normally this would be a good idea, but I don't think this would work for VirMach as I expect he would get his work order assigned to the only color blind tech. :bleep_bloop:

    All labeling ever did for us is cause MORE problems so I stopped doing it.

    Example A:

    • Nothing is labeled.
    • Techs set it up how they want, and maybe communicate it correctly if we're lucky.
    • Tech may be able to follow instructions for something in relation to RU9.

    Example B:

    • Everything is labeled.
    • Techs set it up how they want, and maybe communicate it correctly if we're lucky.
    • Tech gets confused because RU9 server is called "014" on the label.
    Thanked by (2)FrankZ AlwaysSkint
  • VirMach Hosting Provider

    @atomi said:

    @VirMach said:

    @atomi said:

    @AlwaysSkint said:
    @Virmach - my fallback secondary server is now working again. The Ryzen Fix IP function did the trick. Phew!
    One less thing on your plate. Like that will make a difference! :|

    I've had the same issue after migration, where some other VPS uses the same IP, and there's no Ryzen IP fix available at NYCB018. So I have a fast Ryzen idler with semi-working network :p

    I've gone down these dozens of times at this point and checked for broken IP assignments, looks like I missed one on NYCB018 because I'm semi-dyslexic and it was .212 instead of .221 on a single VM. I've corrected that one now.

    Mine is .232, and when it's used by some other server, the route goes somewhere other than NYCB018. It's happening quite often.

    Yeah, possibly IP conflict. Enabled protection on it. Reconfigure and try again. If it doesn't work, let me know. This could also potentially knock out the entire network for NYCB018 if we're extremely unlucky so I'll check back in 10 minutes.
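
    The kind of conflict discussed here (two VMs ending up with the same address, for example after a .212/.221 mix-up) can be spotted by grouping assignments by IP. This is only a sketch under assumptions: the VM names and addresses are made up, `find_ip_conflicts` is a hypothetical helper, and it says nothing about how the panel actually stores assignments.

```python
from collections import defaultdict


def find_ip_conflicts(assignments):
    """Return {ip: [vm, ...]} for every IP assigned to more than one VM.

    assignments: mapping of VM name -> assigned IP address (strings)
    """
    by_ip = defaultdict(list)
    for vm, ip in assignments.items():
        by_ip[ip].append(vm)
    # Keep only addresses that appear on two or more VMs.
    return {ip: sorted(vms) for ip, vms in by_ip.items() if len(vms) > 1}


if __name__ == "__main__":
    # Hypothetical data: vm-c was meant to get .212 but was entered as .221.
    example = {
        "vm-a": "192.0.2.221",
        "vm-b": "192.0.2.232",
        "vm-c": "192.0.2.221",
    }
    for ip, vms in find_ip_conflicts(example).items():
        print(f"{ip} is assigned to more than one VM: {', '.join(vms)}")
```

    This only catches duplicates in the assignment records; an address that a customer configured by hand without it being assigned would not show up here.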

    Thanked by (1)atomi
  • @VirMach said:
    Yeah, possibly IP conflict. Enabled protection on it. Reconfigure and try again. If it doesn't work, let me know. This could also potentially knock out the entire network for NYCB018 if we're extremely unlucky so I'll check back in 10 minutes.

    Done, and it seems to work, at least for now. Hopefully net uptime has better stats now, thanks!

  • @cybertech said:
    How much are DC hands paid? I would like to take it up as a summer job. Seems like no experience is necessary.

    With VirMach's recent experiences, it seems like no braincells to rub together are needed in some of these companies. I hope he's planning on going back over updates like these and documenting them all to the DC. Or beating them with the paperwork and contracts until they have a braincell fire. Something.

    I'd have been roasted over a firepit if I did any of the things he's had done / not done, let alone any combo of them from one DC alone.

    If I wasn't Canadian and still really not wanting to travel, I'd say VirMach should hire me and I'll just fly around the US going DC to DC as your remote hands for a while lol. I've got a bunch of related knowledge, but even if I turned off my brain and just did exactly what was asked, it would be smoother for everyone :tongue:

  • @VirMach said: FFME001

    Wanted to ask what is wrong with FFME001, as my HetrixTools does not complain...

    But then I tried to log in as root and it did not allow me (???????????!?!?!? I know my password!)... so I decided to go to the panel and I see it's dead (?)

    What fucking voodoo magic is happening here? :D

    Haven't bought a single service in VirMach Great Ryzen 2022 - 2023 Flash Sale.
    https://lowendspirit.com/uploads/editor/gi/ippw0lcmqowk.png

  • VirMach Hosting Provider

    @atomi said:

    @VirMach said:

    @atomi said:

    @AlwaysSkint said:
    @Virmach - my fallback secondary server is now working again. The Ryzen Fix IP function did the trick. Phew!
    One less thing on your plate. Like that will make a difference! :|

    I've had the same issue after migration, where some other VPS uses the same IP, and there's no Ryzen IP fix available at NYCB018. So I have a fast Ryzen idler with semi-working network :p

    I've gone down these dozens of times at this point and checked for broken IP assignments, looks like I missed one on NYCB018 because I'm semi-dyslexic and it was .212 instead of .221 on a single VM. I've corrected that one now.

    Mine is .232, and when it's used by some other server, the route goes somewhere other than NYCB018. It's happening quite often.

    Yeah, possibly IP conflict.

    @cybertech said:

    @VirMach said:
    Finally fixed NYCB036.

    They had the LAN port plugged in as dedicated IPMI on the switch we use for IPMI, and labeled it as a port on the main switch when it wasn't. Took a fair bit of detective work to find since the whole setup is a mess at this point. I'm probably going to have to go down to New York and fix it one day and make it neat, they never followed initial rack instructions and just put everything in randomly basically.

    How much are DC hands paid? I would like to take it up as a summer job. Seems like no experience is necessary.

    It's actually a pretty hilarious paradox. They keep raising the rates, reducing the level of service they provide, and I highly doubt it translates over to the hands getting paid much. From what I've seen they basically end up making around 10-20% of the hourly rate you get charged. To be fair though I also assume once you add in benefits, all the dead hours where they still need to be paid, and I'd imagine some level of liability insurance, or possibly sub-contracting, it probably somehow ends up leaving thin margins still.

    The part I still don't understand though is the reduction of the level of service AND raising prices astronomically. Basically, at least one of the datacenters raised their rate by like let's say +50% and then they updated their guidelines to where within that timeframe you have to pay for, they'll only do a list of 5 different tasks and that's it. Tasks that definitely take way less than the hour minimum. So once you take that into account you might end up paying $100-200 for a button press and even then they still struggle to do it, so why not just hire more people if the demand is so high? It's not like they're hiring experienced people anyway if they're also unable to do anything above a button press.

    @Jab said: Wanted to ask what is wrong with FFME001, as my HetrixTools does not complain...

    Haven't been able to look into that one, I've been putting it off/lower on the priority list because I noticed the same thing, it only appears to be a panel issue. It could be the following:

    • Getting blocked by SolusVM
    • Wrong configuration settings (highly unlikely as it previously worked)
    • A process crashed
    • Disk issue causing partial functionality
  • @atomi said:

    @VirMach said:
    Yeah, possibly IP conflict. Enabled protection on it. Reconfigure and try again. If it doesn't work, let me know. This could also potentially knock out the entire network for NYCB018 if we're extremely unlucky so I'll check back in 10 minutes.

    Done, and it seems to work, at least for now. Hopefully net uptime has better stats now, thanks!

    It didn't help, network still going up'n'down

  • VirMach Hosting Provider

    @atomi said:

    @atomi said:

    @VirMach said:
    Yeah, possibly IP conflict. Enabled protection on it. Reconfigure and try again. If it doesn't work, let me know. This could also potentially knock out the entire network for NYCB018 if we're extremely unlucky so I'll check back in 10 minutes.

    Done, and it seems to work, at least for now. Hopefully net uptime has better stats now, thanks!

    It didn't help, network still going up'n'down

    Message me the IP, I want to make sure it's not a node-wide issue.

    Thanked by (1)atomi
  • @VirMach I haven't been able to access the service details page of my VPS for a week now; I always get timeout errors. I forgot the new IP and just want to see the IP address of my VPS, if it's still active. What should I do? Also, is there any way to reset or check my username so that I can see the IP address there? I always accessed the panel from the service details page, which logged me in automatically, so I don't know my username. Please help.

    Want free vps ? https://microlxc.net

  • @VirMach said:

    @atomi said:

    @atomi said:

    @VirMach said:
    Yeah, possibly IP conflict. Enabled protection on it. Reconfigure and try again. If it doesn't work, let me know. This could also potentially knock out the entire network for NYCB018 if we're extremely unlucky so I'll check back in 10 minutes.

    Done, and it seems to work, at least for now. Hopefully net uptime has better stats now, thanks!

    It didn't help, network still going up'n'down

    Message me the IP, I want to make sure it's not a node-wide issue.

    Done. Not sure though if it's node-wide, since the traffic seems to go to a different node when it's not working on my server.

  • VirMach Hosting Provider

    Everything other than LAXA014 should be at the very least temporarily resolved. I'm compiling a list of more permanent solutions to implement and still investigating some nodes to see what happened.

    @codelock said:
    @VirMach I haven't been able to access the service details page of my VPS for a week now; I always get timeout errors. I forgot the new IP and just want to see the IP address of my VPS, if it's still active. What should I do? Also, is there any way to reset or check my username so that I can see the IP address there? I always accessed the panel from the service details page, which logged me in automatically, so I don't know my username. Please help.

    Are you on Ryzen?

  • edited July 2022

    @652698 said:
    @ VirMach
    What happened? I can't use it. How can it be network abuse? SJCZ004.

    Have you been using weak passwords like these?
    https://www.tomsguide.com/news/worst-passwords-2022

    (edit) I did not intend to send a mention, sorry @ VirMach

  • @VirMach said:
    Everything other than LAXA014 should be at the very least temporarily resolved. I'm compiling a list of more permanent solutions to implement and still investigating some nodes to see what happened.

    @codelock said:
    @VirMach I haven't been able to access the service details page of my VPS for a week now; I always get timeout errors. I forgot the new IP and just want to see the IP address of my VPS, if it's still active. What should I do? Also, is there any way to reset or check my username so that I can see the IP address there? I always accessed the panel from the service details page, which logged me in automatically, so I don't know my username. Please help.

    Are you on Ryzen?

    One of my VPSes is on LAXA014 😢 and I already paid for the migration (without data) on 26 July. Please help me if possible.

    Invoice #1462938. Ticket ID #596130.

    Thank you

  • edited July 2022

    @VirMach said:
    Everything other than LAXA014 should be at the very least temporarily resolved. I'm compiling a list of more permanent solutions to implement and still investigating some nodes to see what happened.

    @codelock said:
    @VirMach I haven't been able to access the service details page of my VPS for a week now; I always get timeout errors. I forgot the new IP and just want to see the IP address of my VPS, if it's still active. What should I do? Also, is there any way to reset or check my username so that I can see the IP address there? I always accessed the panel from the service details page, which logged me in automatically, so I don't know my username. Please help.

    Are you on Ryzen?

    Yup, I was migrated to Ryzen last month.

  • Nice, SJC2005 is back up. My VM there responds to pings, but ssh reports that the host key has changed, and my ssh key logins don't work. I tried changing the root password through the client area but between the different retries I now seem to be locked out of ssh (refusing connections). Hopefully some timer will reset after a while. But it feels like the server has been reinstalled. I guess I can live with that, and can do my own reinstall if needed.

    I have a different SJC VM (from before this offer) on another node, that is still unreachable (client area times out). Is that the one waiting for new network hardware or something? I don't know which node it is, and can't connect to it from client area to find out.

  • @VirMach my services were renewed last month, but I haven't been able to use them for a long time.

    I hope it will be resolved soon. Can you please check ticket #621045 and do the needful if time permits?

This discussion has been closed.