Implement extension for dpnp.choose
#2201
Tests will need to be added in general, and especially to |
using dpctl::utils::keep_args_alive;
sycl::event arg_cleanup_ev =
    keep_args_alive(exec_q, {src, py_chcs, dst}, host_task_events);
It doesn't feel right that keeping the args alive depends on the lifetime of temporaries.
I believe you are doing it in order to combine several independent events into one.
But this could potentially delay deletion of the args, increasing memory consumption.
Though I don't know if it actually happens.
}

std::vector<sycl::event>
_populate_choose_kernel_params(sycl::queue &exec_q,
This function accepts too many parameters, and it is too easy to get confused.
It also does several things at once, which goes against the single-responsibility principle.
I'd refactor it into something like:
auto host_size_offset = make_size_offset();
auto host_chc_ptrs_shp = make_chc_ptrs();
auto host_shape_strides_shp = make_shape_strides();
std::vector<sycl::event> copy_evnts = batch_copy(exec_q, {device_chc_ptrs, host_chc_ptrs_shp}, {device_shape_strides, host_shape_strides_shp}, {device_chc_offsets, host_chc_offsets_shp});
sycl::event host_evnt = async_free(exec_q, copy_evnts, host_size_offset, host_chc_ptrs_shp, host_shape_strides_shp);
Here batch_copy is a hypothetical variadic function that enqueues the copies.
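To make the suggested shape more concrete, here is a minimal sketch of what a variadic batch_copy could look like. It uses only standard C++: FakeEvent, CopyPair, make_copy and copy_one are hypothetical names invented for this illustration, and a plain element-wise loop stands in for an asynchronous copy submitted to a queue. It is not the code in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the event an asynchronous copy would return (illustration only).
struct FakeEvent {
    std::size_t bytes_copied;
};

// Hypothetical pairing of a destination pointer with its host-side source.
template <typename T>
struct CopyPair {
    T *dst;
    const std::vector<T> *src;
};

// Helper so that T is deduced from the arguments.
template <typename T>
CopyPair<T> make_copy(T *dst, const std::vector<T> &src)
{
    return CopyPair<T>{dst, &src};
}

// Performs one copy and reports it as an "event".
template <typename T>
FakeEvent copy_one(const CopyPair<T> &p)
{
    for (std::size_t i = 0; i < p.src->size(); ++i) {
        p.dst[i] = (*p.src)[i];
    }
    return FakeEvent{p.src->size() * sizeof(T)};
}

// Hypothetical variadic batch_copy: one copy per pair, events collected
// in argument order so the caller can depend on all of them at once.
template <typename... Ts>
std::vector<FakeEvent> batch_copy(const CopyPair<Ts> &...pairs)
{
    return std::vector<FakeEvent>{copy_one(pairs)...};
}
```

The point of the design is that the caller passes each destination and source together, so the two cannot be mismatched across a long parameter list, and gets back a single vector of events to wait on.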
With new kernel implementation, it's no longer necessary
This squeezes the output, removing trivial out dimension
Based on suggestions by @AlexanderKalistratov. Each unique_ptr wraps a device allocation, which still needs to be freed manually after the kernel runs, but is deallocated automatically if an error occurs during the validation leading up to launch.
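The ownership pattern described in this commit message can be sketched in standard C++ as follows. This is an illustration under assumptions, not the PR's code: FakeDeviceDeleter, device_ptr and validate_and_launch are hypothetical names, and std::malloc/std::free stand in for device allocation and deallocation.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>
#include <stdexcept>

// Counts frees so the behaviour can be observed (illustration only).
static int g_free_count = 0;

// Hypothetical deleter; std::free stands in for a device deallocation.
struct FakeDeviceDeleter {
    void operator()(void *p) const
    {
        std::free(p);
        ++g_free_count;
    }
};

using device_ptr = std::unique_ptr<void, FakeDeviceDeleter>;

// Hypothetical launch wrapper. On success it returns the raw pointer,
// which the caller must free once the "kernel" has finished.
void *validate_and_launch(bool inputs_valid)
{
    device_ptr buf(std::malloc(64));
    if (!inputs_valid) {
        // The unique_ptr goes out of scope during unwinding,
        // so the allocation is released without any explicit cleanup.
        throw std::invalid_argument("validation failed");
    }
    // Validation passed: give up ownership, freeing is now manual.
    return buf.release();
}
```

With this arrangement, every early exit before launch is leak-free by construction, and the only manual free is the one tied to kernel completion.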
Removes need for accumulating a list of USM types and queues
Also changes the error raised for an unexpected dtype to TypeError, to match NumPy
Logic now handles 0d inputs to _populate_choose_kernel_params to avoid dereferencing empty shape and strides of input arrays
Co-authored-by: Anton <[email protected]>
Fix spacing in choices type list
Fix spacing in example arrays
This PR implements a new SYCL kernel for dpnp.choose in a new extension, _indexing_impl.
choose is an embarrassingly parallel copy operation, where the array to copy from is chosen from a list based on the index in the indices array.
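As a rough illustration of the selection described above, the following serial C++ reference computes the same element-wise result on flat arrays. It is a sketch under simplifying assumptions, not the SYCL kernel from this PR: choose_reference is a hypothetical name, all inputs are assumed to have the same flat length (no broadcasting), and an out-of-range index is treated as an error rather than wrapped or clipped.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Serial reference: out[i] is taken from the choice selected by indices[i].
std::vector<int> choose_reference(const std::vector<std::size_t> &indices,
                                  const std::vector<std::vector<int>> &choices)
{
    std::vector<int> out(indices.size());
    for (std::size_t i = 0; i < indices.size(); ++i) {
        const std::size_t k = indices[i];
        if (k >= choices.size()) {
            throw std::out_of_range("index selects a non-existent choice");
        }
        // Each output element depends only on position i, so the
        // iterations are independent and could run as separate work-items.
        out[i] = choices[k].at(i);
    }
    return out;
}
```

Because no iteration reads or writes another iteration's output, the loop body maps directly onto one work-item per output element, which is what makes the operation embarrassingly parallel.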