Neoon
About
- Username
- Neoon
- Joined
- Visits
- 12,254
- Last Active
- Roles
- Member, OG, Content Writer, Senpai
- Thanked
- 4110
Comments
-
It's for 1 year, IPv6 only.
-
WebHorizon: Emergency Maintenance (Singapore) We have sent you this email to let you know that we have network card failure on one of our Singapore nodes which has caused servers on that node become inaccessible. Maintenance has been enabled for t…
-
Heads up for the India Node. A few days ago, I got an email saying WH is gonna discontinue their old IPv4 subnet. I just read this email today, it wasn't sent to the correct mailbox, my bad. Hence we have to change the IPv4 today, since it's gonna st…
-
any Tierhive hikes yet?
-
(Quote) 87.76.179.1
-
3.25€/m Madrid: Yet-Another-Bench-Script v2026-05-11, https://github.com/masonr/yet-another-bench-script …
-
1.28€/m Macao: Yet-Another-Bench-Script v2026-05-11, https://github.com/masonr/yet-another-bench-script …
-
1.99 Netgrid.Host Barcelona: Yet-Another-Bench-Script v2026-05-11, https://github.com/masonr/yet-another-bench-script …
-
0.80€/m Dubaaaaai VM root@basic-vm-dubai-01:~# curl -sL https://yabs.sh | bash Yet-Another-Bench-Script v2026-05-11 https:…
-
(Quote) No, they can differ, so does performance.
-
(Quote) Again, I am not using Qwen3-VL
-
https://www.reddit.com/r/LocalLLaMA/comments/1tzbcyp/llamacpp_gemma4_mtp_support_merged/ (Image)
-
EnLighten mahNuts
-
I can fit Q4 on my 32gigs, Q6 is too tight. Q6 fits easily on the 64gig idle dedi though. Google just released a new gemma model, might be worth trying, just 12B so fits 16GB. https://www.reddit.com/r/LocalLLaMA/comments/1tvtn6m/googlegemma412b_hug…
-
(Quote) Interesting, my local setup gives me about the same as the KS-LE-B.
-
(Quote) What hardware? Q6 on KS-LE-B was 9t/s for me.
-
(Quote) Actually, ik has a web interface now: github.com/ikawrakow/ik_llama.cpp. It's pretty barebones and has way fewer features than the original llama.cpp. BUT, I get about 9t/s stable on the KS-LE-B vs the 6t/s on the llama.cpp one with Qwen 3.6…
-
When South West?
-
Look mom, we are on TV https://point.free/blog/gemma-4-on-a-2016-xeon/
-
(Quote) No idea, currently not using Qwen3 VL, just Qwen 3.5/6 35B
-
(Quote) Yea, you just load the correct model in llama.cpp with vision enabled. You click upload and it processes your picture. The upload button is disabled until the model is loaded though.
-
(Quote) just input, image generation on CPU is painful.
-
(Quote) Vision on Qwen is amaze, other models suck ass. The higher the res, the longer it takes obviously.
-
Now we are cooking. (Image)
-
https://www.reddit.com/r/LocalLLaMA/comments/1tluma3/llamacpp_server_have_builtin_native_tools_exec/ You can enable built-in tools, without anything extra. (Image)
-
If you enable MTP, make sure the model supports it. You can find Qwen 3.5/3.6 here: https://huggingface.co/unsloth/Qwen3.6-35B-A3B-MTP-GGUF https://huggingface.co/unsloth/Qwen3.5-35B-A3B-MTP-GGUF
-
To enable vision support, MTP and the recommended parameters, you can provide a config file for llama.cpp to load. I copied the original from the subreddit, this is mine currently. https://pastebin.com/raw/ZLP5t0fc You just have to provide --models-p…
-
https://www.reddit.com/r/LocalLLaMA/comments/1tr7hzw/psa/ (Image)
-
(Quote) https://lowendspirit.com/discussion/10471/how-to-ab-use-your-ks-le-b-for-llm-models
-
(Quote) Sadly I don't have a 16GB Card, hence I can't fit that model into memory. Skill issue, then run it on CPU, you can actually for once read it all.
-
(Quote) Indeed, and for that a KS-LE-B with 64gig DDR4 is purfect.
-
(Quote) No, just get an RTX 6000, it only has 96GB of VRAM yes, but it's fast as fuck boy.
-
BS, if you wanna run something with 10t/s get a KS-LE-B. Unified Memory isn't as fast as an RTX 6000 Blackwell. If you wanna run things fast, get an RTX 6000. You will be able to run these models on Unified Memory for sure, but it ain't as fast.