Need help with quantizing the original model

by aaryaman - opened Dec 14, 2023

Dec 14, 2023

Can someone provide instructions on how the original model can be quantized?
I downloaded the model from microsoft/phi-2 and tried to quantize it using the scripts in llama.cpp, but got an error. It turns out the model is not yet supported by llama.cpp.

Any insights or suggestions would be greatly appreciated.

Bususer

Dec 15, 2023

I think Python code is used, maybe the Python transformers library or the one used in the example code.
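
As a rough illustration of the transformers route, here is a minimal sketch that loads the model with on-the-fly bitsandbytes quantization. It rests on several assumptions: that `transformers`, `accelerate`, `bitsandbytes`, and `torch` are installed, that a CUDA GPU is available, and that the installed transformers version can load microsoft/phi-2 directly (older versions may need `trust_remote_code=True`). The helper names `quantization_kwargs` and `load_quantized` are made up for this example. Note that this quantizes the weights in memory at load time; it does not produce a GGUF file for llama.cpp.

```python
def quantization_kwargs(bits=4):
    """Return keyword arguments for transformers.BitsAndBytesConfig.

    Hypothetical helper for this sketch: 4-bit uses NF4 with float16
    compute, 8-bit uses the plain LLM.int8() path.
    """
    if bits == 4:
        return {
            "load_in_4bit": True,
            "bnb_4bit_quant_type": "nf4",
            "bnb_4bit_compute_dtype": "float16",
        }
    if bits == 8:
        return {"load_in_8bit": True}
    raise ValueError("bits must be 4 or 8")


def load_quantized(model_id="microsoft/phi-2", bits=4):
    """Load tokenizer and model with bitsandbytes quantization (needs a CUDA GPU)."""
    # Imported lazily so the helper above works without these packages installed.
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    kwargs = quantization_kwargs(bits)
    if "bnb_4bit_compute_dtype" in kwargs:
        # Turn the dtype name into the actual torch dtype object.
        kwargs["bnb_4bit_compute_dtype"] = getattr(
            torch, kwargs["bnb_4bit_compute_dtype"]
        )

    config = BitsAndBytesConfig(**kwargs)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=config,
        device_map="auto",  # assumption: accelerate is installed
    )
    return tokenizer, model
```

Calling `tokenizer, model = load_quantized()` would then download the weights and return a 4-bit model, provided the assumptions above hold.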

guser

Dec 16, 2023

Robin Kroonem mentioned a change that fixes it in the TheBloke/phi-2-GPTQ release discussion: https://github.com/mrgraycode/llama.cpp/commit/12cc80cb8975aea3bc9f39d3c9b84f7001ab94c5#diff-150dc86746a90bad4fc2c3334aeb9b5887b3adad3cc1459446717638605348efR6239

rinormaloku

Jan 9, 2024

@aaryaman were you able to quantize phi-2?
