Just a guy shilling for gun ownership, tech privacy, and trans rights.

I’m open to chats on Mastodon: https://hachyderm.io/

My blog: thinkstoomuch.net

My email: nags@thinkstoomuch.net

Always looking for penpals!

  • 9 Posts
  • 220 Comments
Joined 2 years ago
Cake day: December 21st, 2023

  • From what I understand it’s not as fast as a consumer Nvidia card, but close.

    And you can have much more “VRAM” because they use unified memory. I think the max is 75% of total system memory going to the GPU, so a top-spec Mac mini M4 Pro with 48GB of RAM would have roughly 36GB dedicated to GPU/NPU tasks for $2,000. A rough sketch of that math is below.

    Compare that to JUST a 5090 32GB at $2,000 MSRP and it’s pretty compelling.

    Add $200 and it’s the 64GB model, with two 4090s’ worth of VRAM.

    It’s certainly better than the AMD AI experience, and it’s the best price for getting into AI stuff, or so say nerds with more money and experience than me.
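
    Here’s that comparison as a quick back-of-the-envelope sketch (uses the ~75% figure and the prices above; the exact GPU split varies by machine and OS version):

    ```python
    # Rough unified-memory budget math, assuming macOS gives the GPU
    # about 75% of system RAM by default (the exact split varies).
    def gpu_budget_gb(total_ram_gb: float, gpu_fraction: float = 0.75) -> float:
        """Approximate unified memory available for GPU/NPU tasks."""
        return total_ram_gb * gpu_fraction

    for ram, price in [(48, 2000), (64, 2200)]:  # Mac mini configs from above
        budget = gpu_budget_gb(ram)
        print(f"{ram}GB Mac mini (~${price}): ~{budget:.0f}GB for GPU "
              f"(~${price / budget:.0f}/GB)")

    # Versus a bare RTX 5090: fixed 32GB of VRAM at $2,000 MSRP,
    # before you buy the rest of the PC.
    print(f"RTX 5090: 32GB at $2000 (~${2000 / 32:.0f}/GB)")
    ```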



  • Gonna write my short story about the orc barbarians who destroy human colonies that get too close to orc territory, not because they’re inherently evil, but because they’ve seen what human greed for power and domination does to subjugated races, the flow of magic, and the health of the earth. So they view humans as evil.

    “Your kind knows nothing but exploitation! You drain the lands of their nutrients to feed cities of sycophants until they are fat! Tell me, adventurer, when was the last time you heard of a dragon attacking an orc caravan? We have no fear of such beings as they only attack the depraved greed of man.”

    “Attacked the village? Do your handlers even lie to hired blades? Yes we burned the village you call Argath, but no one was harmed. Humans, as dangerous as you are, are still cowards. Surrounding a mining village and telling them to leave when they’re outnumbered ten to one is hardly, what you would call, a negotiation. We sent hunters to escort them out of the mountains of Gri’ut Kar and burned the village to ensure the trek was one way.”






  • Ollama and all of that runs on it; it’s the firewall rules and opening it up to my network that are the issue.

    I can’t get ufw, iptables, or anything like that working on it, so I usually just SSH into the PC and interact via the CLI, which is mostly fine.

    I want to use OpenWebUI so I can feed it notes and books as context, but I need the API, which isn’t open on my network.
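
    For what it’s worth, a quick sketch of checking whether the API is actually reachable from another box (assumes Ollama’s default port 11434 and a made-up LAN IP; by default Ollama only binds to localhost, so OLLAMA_HOST=0.0.0.0 usually has to be set on the server before any firewall rule matters):

    ```python
    # Reachability check for a remote Ollama API. Assumes the default
    # port 11434; the server side typically also needs OLLAMA_HOST=0.0.0.0
    # and an open port (e.g. `ufw allow 11434/tcp`) before this works.
    import requests

    HOST = "192.168.1.50"  # hypothetical LAN address of the Ollama box

    try:
        r = requests.get(f"http://{HOST}:11434/api/tags", timeout=5)
        r.raise_for_status()
        models = [m["name"] for m in r.json().get("models", [])]
        print("Reachable, models:", models)
    except requests.RequestException as e:
        print("Not reachable:", e)
    ```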




  • Replying as OP to “1U mini PC for AI?” in Selfhosted@lemmy.world:

    Ollama + Gemma/Deepseek is a great start. I have only run AI on my AMD 6600XT, and that wasn’t great; everything I know says AMD is fine for in-game AI tasks these days, but not really for LLM or gen-AI tasks.

    An RTX 3060 12GB is the easiest and best self-hosted option in my opinion: new for <$300, and used for even less. However, I ran a GeForce 1660 Ti for a while, and that’s <$100.
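
    A minimal sketch of that starting point using the ollama Python client (pip install ollama), assuming you’ve already pulled a model:

    ```python
    # Talk to a local model through the ollama Python client.
    # Assumes the model was pulled first, e.g. `ollama pull gemma2`
    # (or `ollama pull deepseek-r1`).
    import ollama

    response = ollama.chat(
        model="gemma2",  # swap for whichever model you pulled
        messages=[{"role": "user", "content": "Summarize what a 1U server is."}],
    )
    print(response["message"]["content"])
    ```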




  • I do already have a NAS. It’s in another box in my office.

    I was considering replacing the Pis with a JBOD and passing that through to one of my boxes via USB and virtualizing something. I compromised by putting 2TB SATA SSDs in each box to use for database stuff and then backing that up to the spinning rust in the other room.

    How do I do that? Good question. I take suggestions.
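
    One possible shape for that backup, sketched as a cron-able Python wrapper around rsync (the paths and hostname are hypothetical; rsync does the heavy lifting):

    ```python
    # Hypothetical nightly backup: push the local SSD database directory
    # to the NAS over SSH with rsync. Adjust paths/host to taste.
    import subprocess

    SRC = "/mnt/ssd/databases/"           # local 2TB SATA SSD
    DEST = "nas.local:/tank/backups/db/"  # spinning rust on the NAS

    subprocess.run(
        ["rsync", "-a", "--delete", SRC, DEST],
        check=True,  # raise if rsync fails so cron mails you about it
    )
    ```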


  • With an RTX 3060 12GB, I have been perfectly happy with the quality and speed of the responses. It’s much slower than my 5060 Ti, which I think is the sweet spot for text-based LLM tasks. A larger context window, provided by more VRAM or a web-based AI, is cool and useful, but I haven’t found the need for it yet in my use case.

    As you may have guessed, I can’t fit a 3060 in this rack; that’s in a different server that houses my NAS. I have done AI on my 2018 Epyc server CPU, and it’s just not usable. Even with 109GB of RAM, not usable. Even clustered, I wouldn’t try running anything on these machines. They are for Docker containers and Minecraft servers. Jeff Geerling probably has a video on trying to run an AI on a bunch of Raspberry Pis. I just saw his video using Ryzen AI Strix boards, and that was ass compared to my 3060.

    But for my use case, I am just asking AI to generate simple scripts based on manuals I feed it, or some sort of writing task. I either have it take my notes on a topic and make an outline that makes sense, which I then fill in, or I feed it finished writing and ask for grammar or tone fixes. That’s fucking it, and it boggles my mind that anyone is doing anything more intensive than that. I am not training anything, and 12GB of VRAM is plenty if I want to feed it like 10-100 pages of context. Would it be better with a 4090? Probably, but for my uses I haven’t noticed a difference in quality between my local LLM and the web-based stuff.
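
    That “feed it my notes, get fixes back” loop, sketched with the same ollama client as above (notes.md is a made-up filename):

    ```python
    # Hypothetical grammar/tone pass over a local file: read the notes,
    # hand them to the model as the user message, print the edited text.
    import ollama

    with open("notes.md") as f:
        notes = f.read()

    response = ollama.chat(
        model="gemma2",
        messages=[
            {"role": "system", "content": "Fix grammar and tone; keep my voice."},
            {"role": "user", "content": notes},
        ],
    )
    print(response["message"]["content"])
    ```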