"Quantized models" is very overloaded terminology :) tract has some support for the QMatMul and QConv operators in ONNX, but ONNX lagged behind the state of the art for a long time when it comes to model quantization and compression (which can perhaps be explained by a focus on the training side of things). For instance, last time I checked, there was no support for a Q8-like type (with scale and offset) in regular arithmetic operations in tract (Add, Mul, ...), which makes it a difficult format for managing quantized models. And with the LLM boom, it feels like the community is moving toward more bespoke formats like GGML...
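For context, the "Q8-like type (with scale and offset)" mentioned above refers to affine quantization, the scheme ONNX's QuantizeLinear/DequantizeLinear operators use. A minimal sketch in Python (illustrative only, not tract or ONNX code):

```python
# Affine (scale + zero-point) int8 quantization sketch.
# q = round(x / scale) + zero_point, clamped to the int8 range;
# x is approximately recovered as (q - zero_point) * scale.

def quantize(x: float, scale: float, zero_point: int) -> int:
    """Map a real value to a signed 8-bit integer."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))  # clamp to int8 range

def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Approximate reconstruction of the real value."""
    return (q - zero_point) * scale

# Example: with scale=0.02 and zero_point=10,
# 0.5 quantizes to round(25) + 10 = 35 and round-trips exactly.
q = quantize(0.5, 0.02, 10)   # 35
x = dequantize(q, 0.02, 10)   # 0.5
```

The difficulty the comment alludes to: once tensors carry a scale and zero point, even plain Add or Mul must rescale both operands into a common representation, which is why supporting quantized types only in a few dedicated operators (QMatMul, QConv) is limiting.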
Just curious if anyone has tried it with tract.