1U mini PC for AI?

nagaram@startrek.website · edit-2 3 days ago

I learned something interesting from my AI researcher friend.

ChatGPT is actually pretty good at giving mundane medical advice.

Like “I’m pretty sure I have the flu, what should I do?” kind of advice.

His group was generating a bunch of these sorta low-stakes urgent care/free clinic type questions, and in nearly every scenario ChatGPT 4 gave good advice that the surveyed medical professionals agreed they would have given.

There were some issues though.

For instance it responded to

“Help my toddler has the flu. How do I keep it from spreading to the rest of my family?”

And it said

“You should completely isolate the child. Absolutely no contact with him.”

Which you obviously can’t do, but it is technically a correct answer.

Better still, it was also good at knowing its limits. Anything that needed more than OTC meds and bedrest was seemingly recognized, and it would suggest going to an urgent care or ER.

So they switched to Claude and Deepseek because they wanted to research how to mitigate failures and GPT wasn’t failing often enough.

nagaram@startrek.website · 4 days ago

Dell Optiplex 3050

Lenovo m720

HP whatever with a 7th gen Intel

All can be had for $50 ish

nagaram@startrek.website · 5 days ago

A few months ago now, Arizona? Arkansas maybe? Some state legalized “AI powered” home schooling systems. But it was mostly clickbait, and the system is less like ChatGPT and more like the YouTube algorithm’s machine learning. It takes into account the stuff that students do well at and lets them advance beyond “grade level” limitations while also learning how to present problem areas in ways the student responds to.

I had asked my homeschooled AI researcher buddy his thoughts and he obviously liked it. I like the idea too, but my hang-up was on socializing kids. That to me is the more important role of schools.

I wouldn’t trust an LLM in this setup though. A human tutor would still need to step in for questions outside of a FAQ IMO. I love working with an LLM by giving it all the manuals, guides, and config files I used and then asking where I went wrong, because it can usually give me a good enough interpretation to see where to go next. But that’s just a rubber duck. My mind and skills are developed. A kid learning math for the first time can’t do that.

nagaram@startrek.website · 5 days ago

As a floatplane subscriber, you’re really not missing much. I don’t even watch most of the exclusives.

nagaram@startrek.website · 5 days ago

Oh, the Rossmann video.

I hate how obsessed with dumb shit he gets. The man is legitimately doing great work usually, and then he takes something minor that someone who is otherwise an ally says or does and blows it out of proportion.

This man would have made a great tankie. Unfortunately he made a whole 20 minute video on why AOC is stupid for saying unskilled labor doesn’t exist and then explained exactly the points she was making.

I legitimately love this man’s work and I wanna support him, but man is he petty.

nagaram@startrek.website · 6 days ago

A Microsoft glazing botnet leveraging Copilot and all of r/linuxsucks as training data to shitpost on Lemmy, made by a developer who took a janitorial job at Microsoft to “get his foot in the door” during an internal hackathon he was accidentally invited to.

nagaram@startrek.website · 9 days ago

From what I understand it’s not as fast as a consumer Nvidia card, but it’s close.

And you can have much more “VRAM” because they do unified memory. I think the max is 75% of total system memory going to the GPU. So a top spec Mac mini M4 Pro with 48GB of RAM would have about 36GB available for GPU/NPU tasks for $2000.

Compare that to JUST a 5090 32GB for $2000 MSRP and it’s pretty compelling.

$200 more and it’s the 64GB model with 2x 4090s’ worth of VRAM.

It’s certainly better than the AMD AI experience, and it’s the best price for getting into AI stuff, so say nerds with more money and experience than me.
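
As a rough sketch of that memory math, assuming the 75% GPU share quoted above holds (it is a remembered figure, not a verified one), the numbers work out like this. The function name and constant are made up for illustration.

```python
# Assumption: the GPU can use up to 75% of unified system memory.
GPU_SHARE = 0.75


def gpu_budget_gb(system_ram_gb: float, share: float = GPU_SHARE) -> float:
    """Return the memory (in GB) the GPU could use out of unified RAM."""
    return system_ram_gb * share


for ram in (48, 64):
    print(f"{ram} GB unified RAM -> {gpu_budget_gb(ram):.0f} GB for the GPU")
```

Under that assumption the 64GB model lands at 48GB, which is where the "2x 4090" comparison comes from (24GB each).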

nagaram@startrek.website · 9 days ago

Honestly if you’re not gaming or playing with new hardware, there is absolutely no point.

I’ve considered swapping this computer over to Fedora for a hot minute, but it really is a gaming PC and I should stop trying to break it.

nagaram@startrek.website · 9 days ago

True, but I have an addiction and that’s buying stuff to cope with all the drawbacks of late stage capitalism.

I am but a consumer who must be given reasons to consume.

nagaram@startrek.website · 9 days ago

The Lenovo ThinkCentre M715qs were $400 total after upgrades. I fortunately had 3 32GB kits of RAM from my work’s e-waste bin, but if I had to add those it would probably be $550 ish.

The rack was $120 from 52pi.

I bought 2 extra 10in shelves for $25 each.

The Pi cluster rack was also $50 (shit, I thought it was $20. Not worth).

Patch panel was $20.

There’s a UPS that was $80.

And the switch was $80.

So in total I spent $800 on this setup.

To fully replicate from scratch you would need to spend $160 on Raspberry Pis and probably $20 on cables.

So $1000 theoretically.
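
For anyone checking the totals, here is a quick tally using only the prices listed above. The dictionary labels are shorthand, and the RAM line uses the $550 vs $400 estimate from the first paragraph.

```python
# Prices in USD, as listed in the comment above.
spent = {
    "ThinkCentre M715q mini PCs (after upgrades)": 400,
    "10in rack": 120,
    "2 extra shelves": 2 * 25,
    "Pi cluster rack": 50,
    "patch panel": 20,
    "UPS": 80,
    "switch": 80,
}

# Parts that were already on hand and would have to be bought from scratch.
from_scratch_extras = {"Raspberry Pis": 160, "cables": 20}

# The RAM was free; the comment estimates $550 instead of $400 if bought.
ram_extra = 550 - 400

total_spent = sum(spent.values())
total_from_scratch = total_spent + sum(from_scratch_extras.values())
total_with_ram = total_from_scratch + ram_extra

print(total_spent, total_from_scratch, total_with_ram)
```

That gives $800 spent and $980 from scratch, so "about $1000" holds as long as the RAM is free; paying for the RAM too would push it to roughly $1130.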

nagaram@startrek.website · 9 days ago

The Pis were honestly because I had them.

I think I’d rather use them for something else like robotics or a BirdNET-Pi.

But the pi rack was like $20 and hilarious.

The objectively correct answer for more compute is more mini PCs though. And I’m really thinking about the Mac Mini option for AI.

nagaram@startrek.website · 11 days ago

Ollama and all that runs on it. It’s just the firewall rules and opening it up to my network that’s the issue.

I cannot get ufw, iptables, or anything like that running on it. So I usually just ssh into the PC and do a CLI only interaction. Which is mostly fine.

I want to use OpenWebUI so I can feed it notes and books as context, but I need the API which isn’t open on my network.
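
A minimal sketch for checking whether the Ollama API is visible from another machine, assuming Ollama is on its default port 11434 and serves its model list at `/api/tags`. As far as I know Ollama only listens on localhost unless `OLLAMA_HOST` is set to something like `0.0.0.0` on the server, so that is worth checking before blaming the firewall. The helper names and the IP address are placeholders.

```python
import json
import urllib.error
import urllib.request

OLLAMA_PORT = 11434  # Ollama's default API port


def ollama_url(host: str, port: int = OLLAMA_PORT) -> str:
    """Build the URL of the endpoint that lists locally available models."""
    return f"http://{host}:{port}/api/tags"


def list_models(host: str, timeout: float = 3.0):
    """Return the model names served at host, or None if the API is unreachable."""
    try:
        with urllib.request.urlopen(ollama_url(host), timeout=timeout) as resp:
            data = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply all count as unreachable.
        return None
    return [model["name"] for model in data.get("models", [])]


# Example, run from another machine on the LAN (placeholder address):
# print(list_models("192.168.1.50"))
```

If this returns a list from the box itself but `None` from another machine, the API is up and the problem is the bind address or the firewall, which is the same thing OpenWebUI would run into.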

nagaram@startrek.website · 11 days ago

I was thinking about that now that I have Mac Minis on the mind. I might even just set a mac mini on top next to the modem.

nagaram@startrek.website · edit-2 11 days ago

Ollama + Gemma/Deepseek is a great start. I have only run AI on my AMD 6600XT and that wasn’t great, and everything I know says AMD is fine for gaming AI tasks these days but not really for LLM or gen AI tasks.

An RTX 3060 12GB is the easiest and best self hosted option in my opinion. New for under $300 and used for even less. However, I was running with a GeForce 1660 Ti for a while and that’s under $100.

nagaram@startrek.website · 11 days ago

A mac is a very funny and objectively correct option

nagaram@startrek.website · 11 days ago

I think I’m going to have a harder time fitting a Threadripper in my 10 inch rack than I am getting any GPU in there.

nagaram@startrek.website · 11 days ago

I do already have a NAS. It’s in another box in my office.

I was considering replacing the Pis with a JBOD and passing that through to one of my boxes via USB and virtualizing something. I compromised by putting 2TB SATA SSDs in each box to use for database stuff and then backing that up to the spinning rust in the other room.

How do I do that? Good question. I take suggestions.

nagaram@startrek.website · 11 days ago

With an RTX 3060 12GB, I have been perfectly happy with the quality and speed of the responses. It’s much slower than my 5060 Ti, which I think is the sweet spot for text-based LLM tasks. A larger context window provided by more VRAM or a web-based AI is cool and useful, but I haven’t found the need for that yet in my use case.

As you may have guessed, I can’t fit a 3060 in this rack. That’s in a different server that houses my NAS. I have done AI on my 2018 Epyc server CPU and it’s just not usable. Even with 109GB of RAM, not usable. Even clustered, I wouldn’t try running anything on these machines. They are for Docker containers and Minecraft servers. Jeff Geerling probably has a video on trying to run an AI on a bunch of Raspberry Pis. I just saw his video using Ryzen AI Strix boards and that was ass compared to my 3060.

But for my use case, I am just asking AI to generate simple scripts based on manuals I feed it, or to do some sort of writing task. I either get it to take my notes on a topic and make an outline that makes sense, which I then fill in, or I feed it finished writing and ask for grammatical or tone fixes. That’s fucking it, and it boggles my mind that anyone is doing anything more intensive than that. I am not training anything, and 12GB of VRAM is plenty if I wanna feed it like 10-100 pages of context. Would it be better with a 4090? Probably, but for my uses I haven’t noticed a difference in quality between my local LLM and the web based stuff.
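
To put "10-100 pages" in context-window terms, here is a back-of-the-envelope estimate. Both constants are loud assumptions, not measurements: about 500 words per page and about 1.3 tokens per English word. Real numbers depend on the document and the model's tokenizer.

```python
# Assumptions for illustration only; real values vary by document and tokenizer.
WORDS_PER_PAGE = 500
TOKENS_PER_WORD = 1.3


def estimate_tokens(pages: int) -> int:
    """Roughly estimate how many tokens a number of pages of text takes up."""
    return round(pages * WORDS_PER_PAGE * TOKENS_PER_WORD)


for pages in (10, 100):
    print(f"{pages} pages -> roughly {estimate_tokens(pages)} tokens")
```

Under those assumptions 10 pages is around 6,500 tokens and 100 pages is around 65,000, so the upper end only fits if the model and its configured context length allow for it.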