LLM AI #3

parsehex · 2023-12-31T22:14:11Z

parsehex · 2024-01-08T22:08:15Z

In calculating the num-gpu-layers:

The available VRAM factors in
For gguf, the parameter count (e.g. 7b, 13b) and the quant (e.g. Q4, Q5_K_M) factor in

I suppose, for a given model size and quant we could come up with a vram estimate, but how much vram does each layer equal in vram?

--

This is a library for Go to parse gguf files, MIT licensed. Maybe do something to run from server?

parsehex changed the title ~~Maintainability Improvements~~ LLM AI Jan 6, 2024

Provide feedback