
Feature request: export to GGUF LoRA (not merging) #1546

Open
ngxson opened this issue Jan 15, 2025 · 1 comment

ngxson commented Jan 15, 2025

Hi, I'm one of the maintainers working on LoRA support in llama.cpp.

FYI, we already have a script, convert_lora_to_gguf.py, that can convert any PEFT-compatible LoRA adapter into GGUF without merging it into the base model.
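
For reference, the existing flow is two steps: save the adapter in PEFT format, then run the conversion script. A minimal sketch, assuming a PEFT-wrapped model; paths are illustrative and the exact script flags may differ (check convert_lora_to_gguf.py --help):

# Step 1 (Python): save only the adapter weights in PEFT format, no merge
model.save_pretrained("lora_adapter")       # writes adapter_model.safetensors + adapter_config.json
tokenizer.save_pretrained("lora_adapter")

# Step 2 (shell): convert the PEFT adapter into a GGUF LoRA adapter
#   python convert_lora_to_gguf.py lora_adapter --outfile adapter.gguf --outtype f16
#   (a --base <base_model_dir> flag may also be needed to map tensor names)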

I would like to discuss whether we can take advantage of this feature to convert a fine-tuned adapter directly into GGUF. One idea:

# add save_method = "lora" to export just the adapter, without merging into the base model
model.save_pretrained_gguf("dir", tokenizer, save_method = "lora", quantization_method = "f16")
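
The point of a non-merged export is that the adapter stays a separate file and is applied at load time. With llama.cpp that would look something like the sketch below (binary and file names are hypothetical; shown via subprocess to keep the example in Python):

import subprocess

# Apply the exported GGUF LoRA adapter at runtime, without merging,
# using llama.cpp's --lora flag
subprocess.run([
    "llama-cli",
    "-m", "base-model-f16.gguf",   # base model converted to GGUF separately
    "--lora", "adapter.gguf",      # the GGUF adapter exported above
    "-p", "Hello",
])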

For a demo, here is a list of GGUF LoRA adapters: https://huggingface.co/collections/ggml-org/gguf-lora-adapters-677c49455d8f7ee034dd46f1

Happy to discuss more if you find this interesting.

Thank you.

danielhanchen (Contributor) commented

Yes, that would be very cool indeed! There was a code path to convert to GGUF LoRA, but it isn't maintained that well.

I'm currently working on refactoring GGUF saving in https://github.com/unslothai/unsloth-zoo/blob/nightly/unsloth_zoo/llama_cpp.py

But having direct LoRA exporting would be very cool!
