
Port OSS f16_fast_gemv into fbcode #3610

Open · wants to merge 1 commit into main

Conversation

YUNQIUGUO

Summary:
X-link: facebookresearch/FBGEMM#688

This diff includes:

  1. Port the OSS FastGEMV `fp16` kernel into fbcode and, as step 1, expose it to Python as `torch.ops.fbgemm.f16_fast_gemv`:
     https://github.com/wangsiping97/FastGEMV/blob/1fdff6f74aade033c02727a419afd6a4b4bfbc3f/fast_gemv.cu#L14
  2. Add `fp16_oss_fast_gemv` to the quantize ops benchmark script (see the timing sketch after this list).
  3. Add two simple tests for the custom op `torch.ops.fbgemm.f16_fast_gemv` (see the test sketch after this list):
     - works under `torch.compile()`
     - correctness
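
The two checks in item 3 are easy to picture. Below is a minimal sketch, assuming `torch.ops.fbgemm.f16_fast_gemv(mat, vec)` computes `mat @ vec` for fp16 CUDA tensors; the argument order, shapes, and tolerances are assumptions for illustration, not the diff's actual test code:

```python
# Hypothetical sketch, not the tests from this diff. Assumes
# torch.ops.fbgemm.f16_fast_gemv(mat, vec) returns mat @ vec for
# fp16 CUDA inputs; argument order and tolerances are assumptions.
import torch

m, k = 1024, 4096
mat = torch.randn(m, k, dtype=torch.float16, device="cuda")
vec = torch.randn(k, dtype=torch.float16, device="cuda")

# Correctness: compare against the cuBLAS-backed fp16 reference.
out = torch.ops.fbgemm.f16_fast_gemv(mat, vec)
ref = mat @ vec
torch.testing.assert_close(out, ref, atol=1e-2, rtol=1e-2)

# torch.compile() compatibility: the custom op should compile and
# match the eager-mode result.
compiled = torch.compile(lambda a, b: torch.ops.fbgemm.f16_fast_gemv(a, b))
torch.testing.assert_close(compiled(mat, vec), ref, atol=1e-2, rtol=1e-2)
```

For item 2, a rough standalone timing harness in the same spirit as the quantize ops benchmark script, using `torch.utils.benchmark` directly; the shapes are made up, and the real benchmark lives in the script itself:

```python
# Hypothetical timing sketch; the actual benchmark is the quantize ops
# benchmark script. Shapes below are illustrative only.
import torch
from torch.utils import benchmark

mat = torch.randn(4096, 4096, dtype=torch.float16, device="cuda")
vec = torch.randn(4096, dtype=torch.float16, device="cuda")

for label, stmt in [
    ("f16_fast_gemv", "torch.ops.fbgemm.f16_fast_gemv(mat, vec)"),
    ("matmul (cuBLAS)", "mat @ vec"),
]:
    timer = benchmark.Timer(
        stmt=stmt,
        globals={"torch": torch, "mat": mat, "vec": vec},
        label=label,
    )
    print(timer.blocked_autorange())
```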

**Next step:**
Add fp8 mixed-precision support to the fast GEMV kernel, which is the variant we ultimately want.

Differential Revision: D68470488

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D68470488


netlify bot commented Jan 23, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

| Name | Link |
|------|------|
| 🔨 Latest commit | e6c0730 |
| 🔍 Latest deploy log | https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/6792fdced72e110008fcd8bf |
| 😎 Deploy Preview | https://deploy-preview-3610--pytorch-fbgemm-docs.netlify.app |

YUNQIUGUO added a commit to YUNQIUGUO/FBGEMM that referenced this pull request Jan 24, 2025
