The goal of this issue is to clarify the semantics of a split operation when more than one worker, i.e., process or thread, is assigned to a resource piece. For example, let's assume we have a system with 2 NUMA domains and each NUMA domain has 4 cores. A user has 4 workers and launches a split_at operation on NUMA boundaries.
This operation will split the hardware into 2 pieces. Each piece will have a NUMA domain with 4 cores. Since we have 4 workers and 2 pieces, a single piece will get 2 workers. Say, T0 and T1 are assigned to NUMA0. Two possible mappings here:
1. Both T0 and T1 are assigned to the same resources: NUMA 0 with 4 cores.
2. T0 is assigned to NUMA 0 with 2 cores and T1 is assigned to NUMA 0 with the other 2 cores.
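To make the two mappings concrete, here is a minimal toy model of the example above (2 NUMA domains, 4 cores each, 4 workers). It is only a sketch: the function names and the block-wise assignment of workers to pieces are assumptions made for illustration, not the Quo Vadis API or its actual implementation.

```python
# Toy model only: all names are hypothetical, not the Quo Vadis API.
# Cores are plain integers; NUMA 0 = cores 0-3, NUMA 1 = cores 4-7.

def group_workers(num_pieces, num_workers):
    """Assign workers to pieces in contiguous blocks (an assumption)."""
    groups = [[] for _ in range(num_pieces)]
    for w in range(num_workers):
        groups[w * num_pieces // num_workers].append(w)
    return groups

def shared_mapping(pieces, num_workers):
    """Mapping (1): every worker in a piece sees the whole piece."""
    mapping = {}
    for piece, workers in zip(pieces, group_workers(len(pieces), num_workers)):
        for w in workers:
            mapping[w] = list(piece)
    return mapping

def min_overlap_mapping(pieces, num_workers):
    """Mapping (2): workers in a piece divide its cores among themselves."""
    mapping = {}
    for piece, workers in zip(pieces, group_workers(len(pieces), num_workers)):
        n, k = len(piece), len(workers)
        for i, w in enumerate(workers):
            if k <= n:
                # Enough cores: give each worker its own contiguous chunk.
                mapping[w] = piece[i * n // k:(i + 1) * n // k]
            else:
                # More workers than cores: overlap is unavoidable, wrap around.
                mapping[w] = [piece[i % n]]
    return mapping

numa_pieces = [[0, 1, 2, 3], [4, 5, 6, 7]]
shared = shared_mapping(numa_pieces, 4)
min_overlap = min_overlap_mapping(numa_pieces, 4)
print("shared:     ", shared)
print("min overlap:", min_overlap)
```

In this model T0 and T1 both see cores 0-3 under mapping (1), while under mapping (2) T0 gets cores 0-1 and T1 gets cores 2-3.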
The first case matches our current implementation, but it may not be what a user expects. For example, in mpibind, if there are enough resources, the resulting mapping would be that of the second case, i.e., minimize resource overlap as much as possible.
If our implementation uses (1), users could still get minimum-overlap semantics by calling a second split operation on the resulting subscopes. But if most users need this behavior, each of them would have to perform two split operations (and the associated scope management) to get the desired result.
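The two-step workaround can be sketched with a toy model in which a single `split` function has shared-resource semantics and is simply applied twice. The `split` helper and its signature are hypothetical and only illustrate the extra bookkeeping a user would take on; they are not the library's interface.

```python
# Toy model only: `split` is a hypothetical helper, not the Quo Vadis API.

def split(cores, workers, num_pieces):
    """Shared-resource split: cut `cores` into `num_pieces` contiguous
    pieces and assign workers to pieces in blocks. Workers that land in
    the same piece share all of its cores."""
    n, k = len(cores), len(workers)
    result = {}
    for i, w in enumerate(workers):
        p = i * num_pieces // k
        result[w] = tuple(cores[p * n // num_pieces:(p + 1) * n // num_pieces])
    return result

# Step 1: split 8 cores at NUMA boundaries (2 pieces) among 4 workers.
first = split(list(range(8)), [0, 1, 2, 3], 2)

# Step 2: within each resulting subscope, split again so that every
# worker of that subscope gets its own piece.
workers_by_scope = {}
for w, scope in first.items():
    workers_by_scope.setdefault(scope, []).append(w)

second = {}
for scope, ws in workers_by_scope.items():
    second.update(split(list(scope), ws, len(ws)))

print("after first split: ", first)
print("after second split:", second)
```

The second loop is the scope management that every user would have to repeat if minimum overlap were not available as a policy.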
We discussed that the API could provide a coloring policy such as QV_SCOPE_SPLIT_AFFINITY_PRESERVING so that users could express whether they want the semantics of option (1) or option (2). For example, to get option (2), a user could specify the color QV_SCOPE_SPLIT_MIN_OVERLAP and to get option (1) the color QV_SCOPE_SPLIT_SHARED_RES. Then, based on driving use cases, we could make one or the other the default.
@samuelkgutierrez also suggested another coloring policy, QV_SCOPE_SPLIT_NO_OVERLAP, that would guarantee that no two workers share a PU. In this case, if the number of workers exceeds the maximum that can be served without overlap, the implementation would return NULL subscopes for a subset of workers. Users would need a function to check whether a scope is NULL. Workers with a NULL subscope did not get assigned any resources, but the others are guaranteed to have non-overlapping resources!
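A rough sketch of how the no-overlap policy could behave, again as a toy model. Two things here are assumptions that the discussion does not settle: that the workers left without resources are the highest-numbered ones, and that the NULL check is a helper called `scope_is_null`. `None` stands in for a NULL subscope.

```python
# Toy model only: names are hypothetical, not the Quo Vadis API.

def split_no_overlap(pus, num_workers):
    """Give each worker a disjoint set of PUs. If there are more workers
    than PUs, the surplus workers get None (standing in for NULL)."""
    served = min(num_workers, len(pus))
    subscopes = []
    for w in range(num_workers):
        if w < served:
            # Disjoint contiguous chunk for each served worker.
            start = w * len(pus) // served
            end = (w + 1) * len(pus) // served
            subscopes.append(pus[start:end])
        else:
            # Assumption: the highest-numbered workers are the ones left out.
            subscopes.append(None)
    return subscopes

def scope_is_null(scope):
    """Hypothetical check a user would call before using a subscope."""
    return scope is None

subscopes = split_no_overlap([0, 1, 2, 3], 6)
for worker, scope in enumerate(subscopes):
    if scope_is_null(scope):
        print(f"T{worker}: no resources assigned")
    else:
        print(f"T{worker}: PUs {scope}")
```

With 4 PUs and 6 workers, T0 through T3 each get one PU and T4 and T5 get a NULL subscope, so callers have to handle that case explicitly.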
samuelkgutierrez changed the title from "Add QV_SCOPE_SPLIT_MIN_OVERLAP" to "Implement QV_SCOPE_SPLIT_MIN_OVERLAP" on Jul 18, 2024
@eleon and @GuillaumeMercier have ideas and can expand on how this might work.