Hi, I'm one of the maintainers working on LoRA support in llama.cpp.

FYI, we already have a script, convert_lora_to_gguf.py, that can convert any PEFT-compatible LoRA adapter into GGUF without merging it into the base model.

I would like to discuss whether we can take advantage of this feature to convert a fine-tuned adapter directly into GGUF. One idea could be:
# add save_method = "lora" to export just the adapter, not mergingmodel.save_pretrained_gguf("dir", tokenizer, save_method="lora", quantization_method="f16")
As a demo, here is a collection of GGUF LoRA adapters: https://huggingface.co/collections/ggml-org/gguf-lora-adapters-677c49455d8f7ee034dd46f1
Happy to discuss more if you find this interesting.
Thank you.