New top story on Hacker News: Llama.cpp can do 40 tok/s on M2 Max, 0% CPU usage, using all 38 GPU cores
20 points by samwillis | 4 comments on Hacker News.
