Chapter 9: Machine Learning & GPU Orchestration (AI/MLOps)
Coming soon. This chapter will cover the NVIDIA device plugin, GPU sharing, MIG slicing, and LLM serving architectures on Kubernetes.
Planned Topics
- Compute Acceleration: NVIDIA device plugin, fractional GPU sharing, Multi-Instance GPU (MIG) slicing
- AI Workload Scheduling: Ray clusters and vLLM engines on Kubernetes for LLM serving