xe: jit: gemm: Add k-parallelism parameter to Xe2 gemm kernels #2477
base: main
make test
@Simonsays095 can you run Xe2 perf testing?
{{'G', "gemm", {"F", "H", "S"}, {"T", "N", "N"}}, {-1, -1, {-1, 25, -1}, {-1, 32, -1}, {-1, 25, -1}, {-1, 32, -1}, {16, 16, 1}, "IAB"}, "at32+m128@80 am32+m128@80 aB wg 8x1x4 ikr wx2 xaf vav hi pt sr br sb128 bk0 sm sn bm0 nmk sys", {16, (LoopType) 255, 128, {(LoopType) 209, (LoopType) 255, (LoopType) 2}, {16777216, 262144, 16777216}, {262144, 262144, 16777216}, {16, 16, 128}, {8, 1, 4}, 2, (WGType) 1, 4357, 0, 8192, {16, 16, 4}, {true, true, true}}, {'W', 1, {256}}},
{{'G', "gemm", {"F", "H", "S"}, {"T", "N", "N"}}, {-1, -1, {-1, 33, -1}, {-1, 48, -1}, {-1, 33, -1}, {-1, 48, -1}, {16, 16, 1}, "ABI"}, "at64+m64@48 am32+m16@48 aB wg 4x1 xaf rr vav hi pt sr br sb64 bk0 sm grf256 sys np", {16, (LoopType) 255, 256, {(LoopType) 208, (LoopType) 255, (LoopType) 255}, {524288, 524288, 16777216}, {524288, 524288, 16777216}, {32, 32, 64}, {4, 1, 1}, 1, (WGType) 1, 257, 0, 0, {16, 16, 4}, {true, true, true}}, {'W', 1, {1024}}},
{{'G', "gemm", {"F", "O", "S"}, {"T", "N", "N"}}, {-1, -1, {-1, -1, -1}, {-1, -1, -1}, {-1, -1, -1}, {-1, -1, -1}, {16, 16, 1}, "ABI"}, "at32+m128@96 am32x2+m64@96 aB wg 2x16 vav hi pt sr br sb128 bk0 grf256 sys acb cr16", {16, (LoopType) 255, 256, {(LoopType) 208, (LoopType) 255, (LoopType) 255}, {2097152, 262144, 16777216}, {2097152, 262144, 16777216}, {128, 16, 32}, {2, 16, 1}, 1, (WGType) 1, 257, 0, 0, {16, 16, 4}, {true, true, true}}, {'E', 17, {879529, 62860.9, 0, 0, 0, 0, 1.12572, 1.9182, 3.81465, 7.84556, 0.00532516, 0.00532516, 0, 1, 1.01261, 1.00705, -3.00232e-14}}},
{{'G', "gemm", {"F", "O", "S"}, {"T", "N", "N"}}, {-1, -1, {-1, 1, -1}, {-1, 1, -1}, {-1, -1, -1}, {-1, -1, -1}, {16, 16, 1}, "ABI"}, "at128 am128 ab wg 2x1x16 sys ikr sr br", {16, (LoopType) 255, 128, {(LoopType) 0, (LoopType) 1, (LoopType) 255}, {8192, 8192, 16777216}, {8192, 8192, 16777216}, {32, 1, 128}, {2, 1, 1}, 1, (WGType) 0, 257, 0, 0, {16, 16, 4}, {true, true, true}}, {'E', 17, {533005, 706.931, 0, 0, 0, 0, 1.03522, 1.49979, 2.9056, 6.09078, 0.0666521, -0.0162066, 0.0674277, 0.261398, 1.07943, 0, 0}}},
After modifying strategies in the catalog "by hand," you'll want to update the embedded DriverInfo structs with `ktool --reinfo`. In this case only the strategies you added `ikr` to really need it.
Addresses MFDNN-13067. The Xe2 dynamic quantization kernels have some parameters that need to be tweaked:
- `ar` -> `sr br`: This results in faster execution and should be the default in all cases
- `ikr` added to the strategy to be valid
- `di cc` can be removed from 2 kernels

The result should be a few correctness passes that used to fail (below), and slightly more optimized execution.
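
For anyone checking hand-edited catalog entries against the changes listed above, here is a rough sketch of the two mechanical parts: spotting strategy strings whose `wg` token has a third (k) dimension greater than 1 but no `ikr`, and rewriting a standalone `ar` token to `sr br`. This is not part of oneDNN or `ktool`. The helper names (`tokenize`, `wgK`, `missingIkr`, `replaceAr`) are hypothetical, and the rule "k-parallel `wg` requires `ikr`" is an assumption inferred from the two entries in this diff (`wg 8x1x4 ikr`, `wg 2x1x16 ... ikr`), not something the strategy parser is confirmed to enforce this way.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// Split a strategy string into whitespace-separated tokens.
std::vector<std::string> tokenize(const std::string &strategy) {
    std::istringstream in(strategy);
    std::vector<std::string> tokens;
    for (std::string t; in >> t;) tokens.push_back(t);
    return tokens;
}

// Return the k dimension of the token following "wg" ("8x1x4" -> 4).
// Returns 1 when there is no "wg" token or it has only two dimensions.
int wgK(const std::vector<std::string> &tokens) {
    for (std::size_t i = 0; i + 1 < tokens.size(); i++) {
        if (tokens[i] != "wg") continue;
        int m = 0, n = 0, k = 1;
        int fields = std::sscanf(tokens[i + 1].c_str(), "%dx%dx%d", &m, &n, &k);
        return fields == 3 ? k : 1;
    }
    return 1;
}

// Assumed rule (inferred from this PR, not from the parser source):
// a strategy with wg k > 1 should also carry the "ikr" token.
bool missingIkr(const std::string &strategy) {
    std::vector<std::string> tokens = tokenize(strategy);
    bool hasIkr = false;
    for (const std::string &t : tokens)
        if (t == "ikr") hasIkr = true;
    return wgK(tokens) > 1 && !hasIkr;
}

// Rewrite every standalone "ar" token to "sr br"; other tokens are untouched.
std::string replaceAr(const std::string &strategy) {
    std::string out;
    for (const std::string &t : tokenize(strategy)) {
        if (!out.empty()) out += ' ';
        out += (t == "ar") ? "sr br" : t;
    }
    return out;
}
```

Any entry flagged by `missingIkr`, or changed by `replaceAr`, would still need its embedded DriverInfo refreshed with `ktool --reinfo` as noted in the review comment; this sketch only touches the strategy string.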