
Phi-4 Conversion Failure #148

Open · c0zaut opened this issue Dec 25, 2024 · 6 comments

Comments

c0zaut commented Dec 25, 2024

@waydong

Configuration: https://huggingface.co/c01zaut/phi-4/

Model weights: https://huggingface.co/NyxKrage/Microsoft_Phi-4/

Code:

>>> from rkllm.api import RKLLM
>>> rk = RKLLM()
INFO: rkllm-toolkit version: 1.1.4
>>> rk.load_huggingface('/root/toolkit/models/phi-4')
WARNING: `flash-attention` package not found, consider installing for better performance: No module named 'flash_attn'.
WARNING: Current `flash-attention` does not support `window_size`. Either upgrade or use `attn_implementation='eager'`.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [01:06<00:00, 11.02s/it]
0
>>> rk.build(do_quantization=True, optimization_level=0, quantized_dtype="w8a8")
Building model: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 527/527 [02:56<00:00,  2.99it/s]
0
>>> rk.export_rkllm("/root/toolkit/models/phi-4.rkllm")
ERROR: Catch exception when converting model: Argument 'value' has incorrect type (expected int, got NoneType)
-1
>>> print(rk.base.state)
run_state.build_model

Could you take a look and let me know which setting or parameter needs to be set, and how to get proper debug output, so I can adjust the Phi3 config accordingly?
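
For reference, here is the full session condensed into a standalone script. Every call and argument is copied from the REPL above; treating the echoed 0/-1 values as return codes is an assumption based on that output.

# Minimal repro script; calls and arguments copied verbatim from the session above.
from rkllm.api import RKLLM

rk = RKLLM()

ret = rk.load_huggingface('/root/toolkit/models/phi-4')
assert ret == 0, "load_huggingface failed"

ret = rk.build(do_quantization=True, optimization_level=0, quantized_dtype="w8a8")
assert ret == 0, "build failed"

# This is the step that raises:
# ERROR: Catch exception when converting model: Argument 'value' has incorrect type (expected int, got NoneType)
ret = rk.export_rkllm("/root/toolkit/models/phi-4.rkllm")
assert ret == 0, "export_rkllm failed"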

c0zaut (Author) commented Dec 25, 2024

I won't post the code here, but I also tested the tokenizer via the RKLLM API, and it was producing correct output for encoding/decoding tokens.
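
For anyone who wants to reproduce that check, here is a round-trip sketch using the plain Hugging Face tokenizer instead; the RKLLM tokenizer API I tested isn't shown here, so this is a stand-in rather than the original test code.

# Stand-in round-trip check via the Hugging Face tokenizer, not the RKLLM API.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained('/root/toolkit/models/phi-4')

text = "Hello from Phi-4!"
ids = tok.encode(text, add_special_tokens=False)
decoded = tok.decode(ids)

print(ids, decoded)
assert decoded == text, "tokenizer round-trip mismatch"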

Also, is it possible to enable flash attention for optimization? I know it is possible with rknn, but I don't see an option in the LLM converter API.
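
For comparison, this is how the attention backend is normally selected on the Hugging Face side when loading the model; these are standard transformers arguments matching the warning in the log above, not RKLLM converter options.

# Selecting the attention backend at Hugging Face load time (not an RKLLM option).
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    '/root/toolkit/models/phi-4',
    torch_dtype=torch.bfloat16,  # flash-attention requires fp16/bf16 weights
    attn_implementation="flash_attention_2",  # or "eager", as the warning suggests
)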

Thank you!

imkebe commented Jan 9, 2025

The official Phi-4 has been released on HF... any updates?

waydong (Collaborator) commented Jan 10, 2025

Hi, there will be updates in the near future.

c0zaut (Author) commented Jan 11, 2025

Thanks @waydong ! Will the updates be in the same 1.1.x version of the library, i.e. 1.1.5, so it is backwards compatible? If not, will there be a similar update_rkllm() function so I can just pull and update my models in Huggingface instead of re-doing the conversion and UI?

waydong (Collaborator) commented Jan 13, 2025

> Thanks @waydong ! Will the updates be in the same 1.1.x version of the library, i.e. 1.1.5, so it is backwards compatible? If not, will there be a similar update_rkllm() function so I can just pull and update my models in Huggingface instead of re-doing the conversion and UI?

Yes, we will maintain an interface (update_rkllm) to support easy model upgrades.
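
A purely hypothetical usage sketch follows; only the function name update_rkllm is confirmed above, so the signature and return value below are assumptions.

# HYPOTHETICAL: only the name update_rkllm is confirmed by the maintainer;
# this signature and return convention are assumptions for illustration.
from rkllm.api import RKLLM

rk = RKLLM()
# Assumed: upgrades an existing .rkllm artifact in place instead of
# re-running load_huggingface / build / export_rkllm from scratch.
ret = rk.update_rkllm("/root/toolkit/models/phi-4.rkllm")
assert ret == 0, "update failed (assumed 0-on-success, like the other calls)"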

c0zaut (Author) commented Jan 14, 2025

@waydong Thank you! Will the updated library also require a new kernel module? If so, could you make it dynamic instead of built-in?
