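Feature request
Please support multimodal embedding models such as gme-Qwen2-VL. Decoder-based embedding models perform very well on multilingual, cross-lingual, and even cross-modal embedding tasks, but their parameter counts are large, so they run too slowly on CPU, and few quantization or acceleration options exist for embedding models. GGUF is strong in exactly this area: decoder-based embedding models can use its flexible quantization schemes and make full use of CPU compute. I hope the team will consider adding a llama.cpp embedding backend. Thank you for your open-source work!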
Community member @pengjunfeng11 is currently adding multi-backend support for embeddings. Once that lands, a llama.cpp backend can be added.