
Port OSS f16_fast_gemv into fbcode #3610

Open · wants to merge 1 commit into main

Conversation

YUNQIUGUO

Summary:
X-link: facebookresearch/FBGEMM#688

This diff includes:

  1. Port the OSS FastGEMV `fp16` kernel into fbcode and, as step 1, expose it to Python as `torch.ops.fbgemm.f16_fast_gemv`:
     https://github.com/wangsiping97/FastGEMV/blob/1fdff6f74aade033c02727a419afd6a4b4bfbc3f/fast_gemv.cu#L14
  2. Add `fp16_oss_fast_gemv` to the quantize ops benchmark script (see the timing sketch after this list).
  3. Add two simple tests for the custom op `torch.ops.fbgemm.f16_fast_gemv` (see the test sketch after this list):
     - works under `torch.compile()`
     - correctness
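
The two checks in item 3 are easy to picture. Below is a minimal sketch, assuming `torch.ops.fbgemm.f16_fast_gemv(mat, vec)` computes `mat @ vec` for fp16 CUDA tensors; the argument order, shapes, and tolerances are assumptions for illustration, not the diff's actual test code:

```python
# Hypothetical sketch, not the tests from this diff. Assumes
# torch.ops.fbgemm.f16_fast_gemv(mat, vec) returns mat @ vec for
# fp16 CUDA inputs; argument order and tolerances are assumptions.
import torch

m, k = 1024, 4096
mat = torch.randn(m, k, dtype=torch.float16, device="cuda")
vec = torch.randn(k, dtype=torch.float16, device="cuda")

# Correctness: compare against the cuBLAS-backed fp16 reference.
out = torch.ops.fbgemm.f16_fast_gemv(mat, vec)
ref = mat @ vec
torch.testing.assert_close(out, ref, atol=1e-2, rtol=1e-2)

# torch.compile() compatibility: the custom op should compile and
# match the eager-mode result.
compiled = torch.compile(lambda a, b: torch.ops.fbgemm.f16_fast_gemv(a, b))
torch.testing.assert_close(compiled(mat, vec), ref, atol=1e-2, rtol=1e-2)
```

For item 2, a rough standalone timing harness in the same spirit as the quantize ops benchmark script, using `torch.utils.benchmark` directly; the shapes are made up, and the real benchmark lives in the script itself:

```python
# Hypothetical timing sketch; the actual benchmark is the quantize ops
# benchmark script. Shapes below are illustrative only.
import torch
from torch.utils import benchmark

mat = torch.randn(4096, 4096, dtype=torch.float16, device="cuda")
vec = torch.randn(4096, dtype=torch.float16, device="cuda")

for label, stmt in [
    ("f16_fast_gemv", "torch.ops.fbgemm.f16_fast_gemv(mat, vec)"),
    ("matmul (cuBLAS)", "mat @ vec"),
]:
    timer = benchmark.Timer(
        stmt=stmt,
        globals={"torch": torch, "mat": mat, "vec": vec},
        label=label,
    )
    print(timer.blocked_autorange())
```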

**Next step:**
Add fp8 mixed-precision support to the fast GEMV kernel, which is the variant we ultimately want.

Differential Revision: D68470488

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D68470488


netlify bot commented Jan 23, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

| Name | Link |
|------|------|
| 🔨 Latest commit | e6c0730 |
| 🔍 Latest deploy log | https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/6792fdced72e110008fcd8bf |
| 😎 Deploy Preview | https://deploy-preview-3610--pytorch-fbgemm-docs.netlify.app |

YUNQIUGUO added a commit to YUNQIUGUO/FBGEMM that referenced this pull request Jan 24, 2025
