Hi, I'm one of the maintainers working on LoRA support in llama.cpp.

FYI, we already have a script, convert_lora_to_gguf.py, that can convert any PEFT-compatible LoRA adapter into GGUF without merging it into the base model.

I would like to discuss whether we can take advantage of this feature to convert a fine-tuned adapter directly into GGUF. One idea could be:
# add save_method = "lora" to export just the adapter, not mergingmodel.save_pretrained_gguf("dir", tokenizer, save_method="lora", quantization_method="f16")
As a demo, here is a collection of GGUF LoRA adapters: https://huggingface.co/collections/ggml-org/gguf-lora-adapters-677c49455d8f7ee034dd46f1
Happy to discuss more if you find this interesting.
Thank you.