What's new
Welcome to the first release of Volcano Global!
With the rapid growth of enterprise business, a single Kubernetes cluster often cannot meet the demands of large-scale AI training and inference tasks. Users typically need to manage multiple Kubernetes clusters to achieve unified workload distribution, deployment, and management. Many users already run Volcano in multiple clusters and use Karmada to manage them. To better support AI jobs in multi-cluster environments, including global queue management, job priority, and fair scheduling, the Volcano community has incubated the Volcano Global sub-project. This project extends Volcano's powerful single-cluster scheduling capabilities to provide a unified scheduling platform for multi-cluster AI jobs, supporting cross-cluster job distribution, resource management, and priority control.
Volcano Global provides the following enhancements on top of Karmada to meet the complex demands of multi-cluster AI job scheduling:
- Supports Cross-Cluster Scheduling of Volcano Jobs
  Users can deploy and schedule Volcano Jobs across multiple clusters, fully utilizing the resources of multiple clusters to improve task execution efficiency.
- Queue Priority Scheduling
  Supports cross-cluster queue priority management, ensuring high-priority queue tasks can obtain resources first.
- Job Priority Scheduling and Queuing
  Supports job-level priority scheduling and queuing mechanisms in multi-cluster environments, ensuring critical tasks are executed promptly.
- Multi-Tenant Fair Scheduling
  Provides cross-cluster multi-tenant fair scheduling capabilities, ensuring fair resource allocation among tenants and avoiding resource contention.
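To make the interplay of queue priority and job priority concrete, here is a minimal Go sketch of one possible ordering rule: jobs in higher-priority queues come first, and jobs within the same queue are ordered by job priority. This is an illustration only, not Volcano Global's actual dispatcher code; the `Job` type, its fields, and `OrderJobs` are hypothetical names, and the "higher value wins" convention is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// Job is a hypothetical, simplified stand-in for a queued workload.
// It is not a Volcano or Karmada API type.
type Job struct {
	Name          string
	Queue         string
	QueuePriority int // assumption: higher value means the queue is served first
	JobPriority   int // assumption: higher value means the job is served first within its queue
}

// OrderJobs returns a sorted copy of jobs: queue priority first, then job priority.
// sort.SliceStable keeps the submission order for jobs that tie on both keys.
func OrderJobs(jobs []Job) []Job {
	ordered := make([]Job, len(jobs))
	copy(ordered, jobs)
	sort.SliceStable(ordered, func(i, j int) bool {
		if ordered[i].QueuePriority != ordered[j].QueuePriority {
			return ordered[i].QueuePriority > ordered[j].QueuePriority
		}
		return ordered[i].JobPriority > ordered[j].JobPriority
	})
	return ordered
}

func main() {
	jobs := []Job{
		{Name: "batch-a", Queue: "research", QueuePriority: 1, JobPriority: 5},
		{Name: "train-b", Queue: "prod", QueuePriority: 10, JobPriority: 1},
		{Name: "train-c", Queue: "prod", QueuePriority: 10, JobPriority: 8},
	}
	// Jobs from the higher-priority "prod" queue are listed before "research".
	for _, j := range OrderJobs(jobs) {
		fmt.Println(j.Name)
	}
}
```

In this sketch, both jobs from the `prod` queue are placed ahead of the `research` job even though `batch-a` has a higher job-level priority than `train-b`, because queue priority is compared first.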
For a detailed introduction and user guide, please refer to: Multi-cluster Scheduling | Volcano.
Changes
- Bump karmada version to support resourceBinding suspension. (#10 @Monokaix)
- chore: remove the pod group (#9 @Vacant2333)
- Update design img (#8 @Monokaix)
- Change the traversal method of queue to round-robin (#6 @MondayCha)
- fix: go mod conflict (#5 @Vacant2333)
- [Proposal] Queue capacity management proposal (#2 @Vacant2333)
- [Init] Add volcano-global dispatcher, controller manager, webhook manager and deploy guide (#1 @Vacant2333)