- torchtitan Public
Forked from shwetasalaria/torchtitan
A native PyTorch Library for large model training
Python · BSD 3-Clause "New" or "Revised" License · Updated Oct 16, 2024
- mlbatch Public
Forked from project-codeflare/mlbatch
Queuing and quota management for AI/ML batch jobs on Kubernetes
JavaScript · Apache License 2.0 · Updated Sep 18, 2024
- opendatahub-operator Public
Forked from opendatahub-io/opendatahub-operator
Open Data Hub operator to manage ODH component integrations
Go · Apache License 2.0 · Updated Jul 25, 2024
- autopilot Public
Forked from IBM/autopilot
A tool to detect infrastructure issues on cloud native AI systems
Python · Apache License 2.0 · Updated Jul 11, 2024
- flux-sched Public
Forked from flux-framework/flux-sched
Fluxion Graph-based Scheduler
- virtual-kubelet Public
Forked from virtual-kubelet/virtual-kubelet
Virtual Kubelet is an open source Kubernetes kubelet implementation.
Go · Apache License 2.0 · Updated Sep 7, 2022
- DeepLearningExamples Public
Forked from NVIDIA/DeepLearningExamples
Deep Learning Examples
Python · Updated Jun 9, 2022
- scheduler-plugins Public
Forked from kubernetes-sigs/scheduler-plugins
Repository for out-of-tree scheduler plugins based on scheduler framework.
- dlrm Public
Forked from facebookresearch/dlrm
An implementation of a deep learning recommendation model (DLRM)
Python · MIT License · Updated Feb 10, 2022
- scheduler-framework-sample Public
Forked from angao/scheduler-framework-sample
This repo is a sample for Kubernetes scheduler framework.
Go · Apache License 2.0 · Updated Mar 11, 2021
- data-broker Public
Forked from IBM/data-broker
The Data Broker (DBR) is a distributed, in-memory container of key-value stores enabling applications in a workflow to exchange data through one or more shared namespaces. Thanks to a small set of …
C · Apache License 2.0 · Updated Aug 4, 2020