Hacker News

Jeff, check out the distributed-llama project. You should be able to distribute over the entire cluster.


I've been testing Exo (seems dead), llama.cpp RPC (has a lot of performance limitations), and distributed-llama (faster, but has some Vulkan quirks and only works with a few models).

See my AI cluster automation setup here: https://github.com/geerlingguy/beowulf-ai-cluster

I was building that through the course of making this video, because it's insane how much manual labor people put into building home AI clusters :D



He mentioned that in the video.



