Chapter 9: Machine Learning & GPU Orchestration (AI/MLOps)

Coming soon. This chapter will cover the NVIDIA device plugin, GPU sharing, MIG slicing, and LLM serving architectures on Kubernetes.

Planned Topics

Compute Acceleration: NVIDIA Device Plugin, fractional GPU sharing, Multi-Instance GPU (MIG) slicing
AI Workload Scheduling: Ray clusters and vLLM engines on Kubernetes for LLM serving