
Example to run Megatron #196

Open · Juanhui28 opened this issue Feb 26, 2024 · 3 comments

Comments

@Juanhui28

Hi,

Thanks for the exciting work! I want to use the parallel methods when running Megatron, but it seems there isn't an example/script for running Megatron, and I cannot find a main function. Could you please share an example of running Megatron with the different parallel methods (e.g., data and model parallelism)? Thanks!

@laekov
Owner

laekov commented Feb 26, 2024

To run FastMoE with Megatron, you should use Megatron's own entry point, e.g. pretrain_gpt.py, with FastMoE's patch applied.
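A minimal sketch of that workflow (the patch path and file name below are illustrative assumptions, not exact paths; check the FastMoE repository for the patch that matches your Megatron version):

```shell
# Sketch only: paths and file names are placeholders.
# 1. Apply the FastMoE patch inside your Megatron-LM checkout.
cd Megatron-LM
git apply /path/to/fastmoe/patch-for-your-megatron-version.patch

# 2. Launch Megatron's own entry point as usual; the patch wires FastMoE in.
python pretrain_gpt.py <your usual Megatron arguments>
```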

@Juanhui28
Author

Thanks for your response!

Which patch should I use if I want to enable expert parallelism? Thanks!

@laekov
Owner

laekov commented Feb 26, 2024

You should use the patch that matches your Megatron version. The key step to enable MoE is adding the --fmoefy argument to the pretrain_xxx.py command line.
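For illustration, a hedged sketch of such a launch (the exact flag set depends on your Megatron version; only --fmoefy is FastMoE-specific here, the rest are standard Megatron/PyTorch launcher arguments):

```shell
# Illustrative only: adapt to your Megatron version and cluster setup.
# --fmoefy replaces the Transformer MLP blocks with FastMoE expert layers
# in the patched pretrain_gpt.py.
torchrun --nproc_per_node=8 pretrain_gpt.py \
    --tensor-model-parallel-size 1 \
    --fmoefy \
    <other standard Megatron arguments>
```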
