Implement QV_SCOPE_SPLIT_MIN_OVERLAP #228

samuelkgutierrez · 2024-07-16T16:37:21Z

@eleon and @GuillaumeMercier have ideas and can expand on how this might work.

eleon · 2024-07-16T22:30:57Z

The goal of this issue is to clarifty the semantics of a split operation when more than one worker, i.e., process or thread, is assigned to a resource piece. For example, let's assume we have a system with 2 NUMA domains and each NUMA has 4 cores. A user has 4 workers and launches a split_at operation on NUMA boundaries.

This operation will split the hardware into 2 pieces. Each piece will have a NUMA domain with 4 cores. Since we have 4 workers and 2 pieces, a single piece will get 2 workers. Say, T0 and T1 are assigned to NUMA0. Two possible mappings here:

Both T0 and T1 are assigned to the same resources: NUMA 0 with 4 cores
T0 is assigned to NUMA 0 with 2 cores and T1 is assigned to NUMA 0 with the other 2 cores.

The first case matches our current implementation, but it may not be what a user expects. For example, in mpibind, if there are enough resources, the resulting mapping would be that of the second case, i.e., minimize resource overlap as much as possible.

If our implementation uses (1), the semantic of minimum overlap could be implemented by the user by calling a second split operation on the resulting subscopes, but if most users need this, then a user would have to make two split operations (and associated scope management) to get the desired result.

We discussed that the API could provide a coloring policy such as QV_SCOPE_SPLIT_AFFINITY_PRESERVING so that users could express whether they want the semantics of option (1) or option (2). For example, to get option (2), a user could specify the color QV_SCOPE_SPLIT_MIN_OVERLAP and to get option (1) the color QV_SCOPE_SPLIT_SHARED_RES. Then, based on driving use cases, we could make one or the other the default.

@samuelkgutierrez also suggested another coloring policy QV_SCOPE_SPLIT_NO_OVERLAP that would provide no overlapping PUs for any two workers. In this case, if the number of workers are more than the max number with no overlap, the implementation would return NULL subscopes for a subset of workers. Users would need a function to check whether a scope is NULL. Workers with a NULL subscope did not get assigned any resources, but the others are guaranteed to have non-overlapping resources!

samuelkgutierrez self-assigned this Jul 16, 2024

samuelkgutierrez changed the title ~~Add QV_SCOPE_SPLIT_MIN_OVERLAP~~ Implement QV_SCOPE_SPLIT_MIN_OVERLAP Jul 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement QV_SCOPE_SPLIT_MIN_OVERLAP #228

Implement QV_SCOPE_SPLIT_MIN_OVERLAP #228

samuelkgutierrez commented Jul 16, 2024

eleon commented Jul 16, 2024

Implement QV_SCOPE_SPLIT_MIN_OVERLAP #228

Implement QV_SCOPE_SPLIT_MIN_OVERLAP #228

Comments

samuelkgutierrez commented Jul 16, 2024

eleon commented Jul 16, 2024