Guide to Self Hosting LLMs Faster/Better than Ollama

brucethemoose@lemmy.world · edit-2 8 hours ago

One can’t offload “usable” LLMs without tons of memory bandwidth and plenty of RAM. It’s just not physically possible.

You can run small models like Phi pretty quick, but I don’t think people will be satisfied with that for copilot, even as basic autocomplete.

About 2x faster than Intel’s current IGPs is the threshold where the offloading can happen, IMO. And that’s exactly what AMD/Apple are producing.

brucethemoose@lemmy.world · edit-2 8 hours ago

My level of worry hasn’t lowered in years…

But honestly? Low on the totem pole. Even with Trumpy governments.

Things like engagement optimized social media warping people’s minds for profit, the internet outside of apps dying before our eyes, Sam Altman/OpenAI trying to squelch open source generative models so we’re dependent on their Earth burning plans, blatant, open collusion with the govt, everything turning into echo chambers… There are just too many disasters for me to even worry about the government spying on me.

If I lived in China or Russia, the story would be different. I know, I know. But even now, I’m confident I can given the U.S. president the middle finger in my country, but I’d really be more scared for my life in more authoritarian strongman regions.

brucethemoose@lemmy.world · edit-2 9 hours ago

The localllama crowd is supremely unimpressed with Intel, not just because of software issues but because they just don’t have beefy enough designs, like Apple does, and AMD will soon enough. Even the latest chips are simply not fast enough for a “smart” model, and the A770 doesn’t have enough VRAM to be worth the trouble.

They made some good contributions to runtimes, but seeing how they fired a bunch of engineers, I’m not sure that will continue.

brucethemoose@lemmy.world · edit-2 9 hours ago

I wouldn’t call that “large.”

Strix Halo (256 bit LPDDR5X, 40 AMD CUs) is where I’d start calling integrated graphics “large.” Intel is going to remain a laughing stock in the gaming world without bigger designs than their little 128-bit IGPs.

brucethemoose@lemmy.world · edit-2 15 hours ago

If they wanna abandon discrete GPUs… OK.

But they need graphics. They should make M Pro/Max-ish integrated GPUs like AMD is already planning on doing, with wide busses, instead of topping out at bottom-end configs.

They could turn around and sell them as GPU-accelerated servers too, like the market is begging for right now.

brucethemoose@lemmy.world · edit-2 3 days ago

I can’t imagine (as a unmarried guy) living in a marriage where my wife would be afraid to tell me who she voted for. Even if she votes Trump, she’s my spouse, and that comes way before that. She should chew me out if I make her feel afraid.

…It makes me feel shame for the culture some men are raised in.

brucethemoose@lemmy.world · 5 days ago

Trump thanks you for your vote.

brucethemoose@lemmy.world · edit-2 5 days ago

Why does Trump always get a pass?

It’s like everyone against him has to be an utter saint, and one wrong move? Welp, voting for Trump, I guess.

And yes, strategically… this makes no sense for anyone who cares about what’s happening in Israel.

brucethemoose@lemmy.world · 6 days ago

He is really betting the farm on Trump getting elected.

Like, he is set if he is, and screwed if he isn’t.

brucethemoose@lemmy.world · edit-2 7 days ago

He’s just waiting for Trump to win, so he’ll be pardoned, I guess.

brucethemoose@lemmy.world · edit-2 8 days ago

The double standard is eye watering.

Trump painted ALL Democrats as literal demons for years. Not figuratively, like literally, you could fill encyclopedias with his rhetoric. And it barely moves the needle.

But one gaff from Biden is damaging?

Ugh. You are probably not wrong, unfortunately.

brucethemoose@lemmy.world · 8 days ago

I remember when SBF news was peaking right around the time Stable Diffusion 1.5 came out, and thinking of how fundamentally gutted the entire premise of an NFT was in like a month.

brucethemoose@lemmy.world · edit-2 8 days ago

Evangelists of the stuff will tell you that you can own your own digital corner of the information highway (Second Life came out in 2003, and most MMOs have housing), or that you can trade rare items with your fellow players (TF2 and Counter-Strike have been doing this forever). Then there’s this idea that you “own the item” in question more than you would otherwise (you don’t, you own a certificate that’s associated with it, and the item will vanish if the infrastructure does). Then there’s the whole “you could use a sword from one game in another game!” nonsense, which I think we can all agree was cooked up by people who don’t understand how game design works on even a fundamental level.

This is so on point for the web3 space, and parts of the AI space too.

Evangelists waltz in and berate you for not understanding how gloriously awesome their system is… without even making a cursory effort to check if it already exists, much less accumulate a deep understanding and appreciation like they expect you to do.

brucethemoose@lemmy.world · 8 days ago

Presumably you will advance along with humanity though, or failing that, just figure out the transcendence thing yourself with so much time?

I don’t think anyone would choose to stay ‘meatbag human’ for trillions of years.

brucethemoose@lemmy.world · edit-2 8 days ago

Companies need shareholders to get off the ground, and you don’t have to be rich to be a shareholder. That’s the whole idea… otherwise only the mega rich have the capital to start businesses.

Paying C-suites this much is just idiotic though. I own a few stocks, and seeing some of the companies pay executives and upper management so much to feud and slowly destroy companies makes me sick. It is not what anyone sane wants unless they’re parasitic daytraders or drinking the corporate kool-aid.

Greedy capitalism is the problem, but it’s also a culture problem, I think.

brucethemoose@lemmy.world · 9 days ago

But what disinformation? What’s the lie?

brucethemoose@lemmy.world · edit-2 9 days ago

Iran!

And CSTO countries that don’t partularly like Russia anymore.

brucethemoose@lemmy.world · 9 days ago

No surprise there. Russia’s command structure seems like a disaster based on what I read from the ISW.

brucethemoose@lemmy.world · 9 days ago

Seems like a hard secret to keep, even for Russia.

I’m guessing state media just downplays it? But what do they say? Like, do they just very quickly mention assistance from Korea (not even specifying North/South) in a blip and move on to the next blurb? Is there a longer justification?

Or do they keep it a secret and tell soldiers to keep their mouths shut?

I am morbidly interested in the propaganda aspect.

brucethemoose@lemmy.world · edit-2 9 days ago

I wonder how this is being portrayed inside Russia?

I mean… did the public look on North Korea very favorably before the war?

brucethemoose@lemmy.world · edit-2 24 days ago

Guide to Self Hosting LLMs Faster/Better than Ollama

brucethemoose@lemmy.world · 3 months ago

Pressure grows as "last chance" negotiations for Gaza deal resume

brucethemoose@lemmy.world · 3 months ago

Hostage-ceasefire deal talks stall over new Netanyahu demands, Israeli officials say