Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement extension for dpnp.choose #2201

Open
wants to merge 40 commits into
base: master
Choose a base branch
from

Conversation

ndgrigorian
Copy link
Contributor

This PR implements a new SYCL kernel for dpnp.choose in a new extension, _indexing_impl.

choose is an embarrassingly parallel copy operation, where the array to copy from is chosen from a list based on the index in the indices array.

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • If this PR is a work in progress, are you filing the PR as a draft?

dpnp/backend/extensions/indexing/choose.cpp Outdated Show resolved Hide resolved
dpnp/backend/extensions/indexing/choose.cpp Outdated Show resolved Hide resolved
dpnp/backend/extensions/indexing/choose.cpp Outdated Show resolved Hide resolved
dpnp/backend/extensions/indexing/choose.cpp Outdated Show resolved Hide resolved
dpnp/backend/extensions/indexing/choose.cpp Outdated Show resolved Hide resolved
@ndgrigorian ndgrigorian force-pushed the dedicated-choose-kernel branch 3 times, most recently from c42f2ae to 8793b54 Compare December 9, 2024 23:08
@ndgrigorian
Copy link
Contributor Author

Tests will need to be added in general, and especially to test_usm_type and test_sycl_queue, but suggestions are committed, as well as existing tests passing, so this PR is ready for first phase of review @antonwolfy

@ndgrigorian ndgrigorian marked this pull request as ready for review December 10, 2024 00:50
@ndgrigorian ndgrigorian force-pushed the dedicated-choose-kernel branch from 8793b54 to e7f0a0b Compare December 10, 2024 00:54
dpnp/backend/kernels/dpnp_krnl_indexing.cpp Show resolved Hide resolved
dpnp/dpnp_iface_indexing.py Outdated Show resolved Hide resolved
dpnp/dpnp_iface_indexing.py Show resolved Hide resolved
dpnp/dpnp_iface_indexing.py Show resolved Hide resolved
dpnp/dpnp_iface_indexing.py Outdated Show resolved Hide resolved
dpnp/dpnp_array.py Outdated Show resolved Hide resolved
dpnp/dpnp_iface_indexing.py Outdated Show resolved Hide resolved
dpnp/backend/extensions/indexing/choose.cpp Outdated Show resolved Hide resolved
dpnp/backend/extensions/indexing/choose_kernel.hpp Outdated Show resolved Hide resolved
dpnp/dpnp_iface_indexing.py Show resolved Hide resolved
dpnp/dpnp_iface_indexing.py Show resolved Hide resolved
dpnp/dpnp_iface_indexing.py Outdated Show resolved Hide resolved
dpnp/dpnp_iface_indexing.py Outdated Show resolved Hide resolved
dpnp/dpnp_iface_indexing.py Outdated Show resolved Hide resolved
dpnp/dpnp_iface_indexing.py Outdated Show resolved Hide resolved
dpnp/dpnp_iface_indexing.py Outdated Show resolved Hide resolved
dpnp/tests/test_indexing.py Outdated Show resolved Hide resolved
dpnp/backend/extensions/indexing/choose.cpp Outdated Show resolved Hide resolved
@ndgrigorian ndgrigorian force-pushed the dedicated-choose-kernel branch from ed84f7a to 7642045 Compare December 13, 2024 17:24
@coveralls
Copy link
Collaborator

coveralls commented Dec 13, 2024

Coverage Status

coverage: 71.211% (+0.4%) from 70.856%
when pulling 925b7d8 on ndgrigorian:dedicated-choose-kernel
into 356184a on IntelPython:master.

@ndgrigorian ndgrigorian force-pushed the dedicated-choose-kernel branch from 7642045 to efa7e1c Compare December 14, 2024 00:14
dpnp/backend/extensions/indexing/choose_kernel.hpp Outdated Show resolved Hide resolved
dpnp/backend/extensions/indexing/choose.cpp Outdated Show resolved Hide resolved
dpnp/backend/extensions/indexing/choose.cpp Outdated Show resolved Hide resolved

using dpctl::utils::keep_args_alive;
sycl::event arg_cleanup_ev =
keep_args_alive(exec_q, {src, py_chcs, dst}, host_task_events);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't feels right that keeping args alive depends on lifetime of temporaries.
I believe you are doing it in order to combine several independent events into one.

But this could potentially delay deletion of args resulting in increasing of memory consumption.

Though I don't know if it actually happens

}

std::vector<sycl::event>
_populate_choose_kernel_params(sycl::queue &exec_q,
Copy link
Collaborator

@AlexanderKalistratov AlexanderKalistratov Dec 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function accepts too many parameters and it is too easy to get confused.

It also doing several things at a time going against single responsibility approach.

I'd refactor it into something like:

auto host_size_offset = make_size_offset();
auto host_chc_ptrs_shp = make_chc_ptrs();
auto host_shape_strides_shp = make_shape_strides();

std::vector<sycl::events> copy_evnts;
sycl::event host_event;
auto copy_evnts = batch_copy(exec_q, {device_chc_ptrs, host_chc_ptrs_shp}, {device_shape_strides, host_shape_strides_shp}, {device_chc_offsets, host_chc_offsets_shp});
auto host_evnt = async_free(exec_q, copy_evnts, host_size_offset, host_chc_ptrs_shp, host_shape_strides_shp)

Where batch_copy is some hypothetical variadic function which enques copy.

dpnp/backend/extensions/indexing/choose.cpp Show resolved Hide resolved
dpnp/backend/extensions/indexing/choose.cpp Outdated Show resolved Hide resolved
@ndgrigorian ndgrigorian force-pushed the dedicated-choose-kernel branch from efa7e1c to 131308f Compare December 26, 2024 18:40
@ndgrigorian ndgrigorian force-pushed the dedicated-choose-kernel branch 3 times, most recently from a54f420 to 0fb1fdf Compare January 12, 2025 02:13
With new kernel implementation, it's no longer necessary
This squeezes the output, removing trivial out dimension
Based on suggestions by @AlexanderKalistratov

Create unique_ptr wraps a device allocation, which still needs to be manually freed
after kernel run, but will be deallocated automatically during validation leading
to launch
@ndgrigorian ndgrigorian force-pushed the dedicated-choose-kernel branch from 0fb1fdf to cf50357 Compare January 21, 2025 21:07
ndgrigorian and others added 26 commits January 21, 2025 13:16
Removes need for accumulating a list of USM types and queues
Also corrects errors for unexpected dtype to TypeError to match NumPy
Logic now handles 0d inputs to _populate_choose_kernel_params
to avoid dereferencing empty shape and strides of input arrays
Fix spacing in choices type list

Fix spacing in example arrays
@ndgrigorian ndgrigorian force-pushed the dedicated-choose-kernel branch from cf50357 to 925b7d8 Compare January 21, 2025 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants