6.11 bpw, a mixture of Q6_K and Q5_K_M
Fits ~32k context + MMPROJ on a 24 GiB GPU
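As a rough sanity check on the VRAM claim, the weight footprint of a quantized model follows from parameter count times bits per weight. A minimal sketch, assuming a hypothetical 27B-parameter model (the source does not state the parameter count); KV cache and MMPROJ overhead come on top of this figure:

```python
def model_size_gib(n_params: float, bpw: float) -> float:
    """Estimate quantized weight size in GiB: params * bits-per-weight / 8, in GiB."""
    return n_params * bpw / 8 / 2**30

# Hypothetical 27B model at 6.11 bpw -- leaves headroom on a 24 GiB GPU
# for the ~32k KV cache and the MMPROJ tensors.
print(round(model_size_gib(27e9, 6.11), 1))
```

At 6.11 bpw a model in this size class lands around 19 GiB of weights, which is consistent with the claim that ~32k of context plus the multimodal projector still fits in 24 GiB.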