Post by matsuu (@matsuu@fedi.matsuu.org)

Apparently, with llama.cpp's new --cpu-moe option, the gpt-oss-120b model runs fast on 64GB of main memory and a GeForce RTX 3000-class GPU.
---
Reddit - The heart of the internet
https://www.reddit.com/r/LocalLLaMA/comments/1mke7ef/120b_runs_awesome_on_just_8gb_vram/
#bookmarks