The Future is Intelligent: AI in the Cloud-Native Era
Abstract
Mirantis senior staff explained why Kubernetes has become the de‑facto operating system for modern AI workloads and how open‑source, cloud‑native tooling can keep AI infrastructure vendor‑agnostic, cost‑effective, and data‑sovereign. The talk traced the evolution from early private‑cloud solutions to today’s multi‑cloud, GPU‑aware Kubernetes stacks, highlighted survey data on AI/ML developers, and detailed the technical and operational challenges of building AI‑ready platforms. The presenters then showcased a composable, open‑source platform (k0rdent) that leverages Cluster API, Helm‑based templating, and a suite of AI‑focused tools (K8sGPT, kagent, AIBrix, etc.), and concluded with a live demo of an end‑to‑end translation service running on a GPU‑enabled Azure cluster.
Detailed Summary
1. Introduction
- The session began with housekeeping (group photo) and a brief welcome from Bharath N R.
- Bharath introduced Mirantis’ Open Source Program Office (OSPO), whose mission is to contribute upstream to the open‑source projects that underpin enterprise cloud stacks.
- Satyam Bhardwaj followed, describing his focus on CNCF‑related projects, especially Kubernetes, and positioning Mirantis as a long‑standing pioneer in private‑cloud technology (OpenStack, Docker Enterprise, Mirantis Kubernetes Engine, Lens UI).
2. From Private Cloud to Cloud‑Native AI
2.1 Evolution of Cloud Architecture
- Early cloud expectations: a single, public‑cloud endpoint.
- Reality today: multi‑cloud (AWS, Azure, private clouds, edge) with 20+ clusters per organization.
- The proliferation of APIs and services has turned cloud from a “simple interaction” into a complex orchestration problem.
2.2 Kubernetes as the Common Control Plane
- Kubernetes is framed as the “OS of the future” – a universal control plane, scheduler, and API surface that can abstract away the underlying heterogeneity.
- The speaker stressed that calling Kubernetes an OS is no longer aspirational; it is already the reality for most production workloads.
2.3 Cloud‑Native Survey Highlights
| Metric | Figure (CNCF Survey) |
|---|---|
| Total cloud‑native developers | 15.6 M |
| Developers who identify as AI/ML engineers | 52 % (~8.1 M) |
| Already running AI workloads on K8s | 36 % |
| Planning AI workloads on K8s | 18 % |
- The surge in LLMs and “agents” has accelerated the migration of AI workloads onto Kubernetes.
3. AI Infrastructure Challenges (Presented by Satyam)
3.1 End‑to‑End Stack Complexity
- An AI request traverses GPU, storage, network, and monitoring layers.
- Inefficient GPU utilization (e.g., “burning GPUs”) drives the need for tighter orchestration and cost control.
3.2 Kubernetes Fundamentals (for the audience)
- Analogy: a set of five servers hosting an app; Kubernetes provides high availability, auto‑scaling, self‑healing, and automated rollouts/rollbacks.
- Kubernetes is the second‑largest open‑source project after Linux, governed by a neutral community rather than a single vendor.
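The guarantees listed above are encoded declaratively in a single Deployment object; Kubernetes controllers then reconcile reality toward it. A minimal sketch of that object as a plain Python dict (the app name and image are illustrative):

```python
# Minimal sketch of the Kubernetes Deployment object behind the "five
# servers" analogy; names and image are illustrative placeholders.

def make_deployment(name: str, image: str, replicas: int = 5) -> dict:
    """Build a Deployment manifest as a plain dict (what the YAML encodes)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,          # desired state: five healthy pods
            "selector": {"matchLabels": labels},
            "template": {                  # pod template the controller heals toward
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }

deploy = make_deployment("demo-app", "nginx:1.27")
print(deploy["spec"]["replicas"])  # -> 5
```

If a node dies and a pod disappears, the controller notices the live count no longer matches `replicas` and schedules a replacement; that reconciliation loop is the self‑healing and HA behavior the analogy describes.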
3.3 Operational Friction
| Challenge | Description |
|---|---|
| Multi‑cluster / multi‑cloud fragmentation | Managing dozens of clusters creates a proliferation of YAML manifests and divergent APIs. |
| Visibility & governance | Disparate regions make it hard to obtain a unified view; GitOps tools struggle at scale. |
| Regulation & compliance | Data‑sovereignty mandates (e.g., EU) require audit trails, security hardening, and consistent policy enforcement. |
| GPU onboarding & multi‑tenancy | On‑board GPUs quickly, share them efficiently across teams, and deal with vendor‑specific slicing (NVIDIA MIG, AMD SR‑IOV). |
| Operational efficiency | Slow provisioning (weeks for cloud GPUs), high maintenance overhead, and noisy‑neighbor effects. |
Takeaway: Modern infrastructure is harder; without automation, innovation stalls.
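From a workload's point of view, the GPU multi‑tenancy problem above surfaces as nothing more than a resource name in the pod spec: a full GPU is `nvidia.com/gpu`, while a MIG slice is exposed under a vendor‑specific name such as `nvidia.com/mig-1g.5gb` (the exact name depends on the GPU Operator's MIG strategy). A hedged sketch:

```python
# Sketch of how full GPUs vs. MIG slices appear to a pod. The MIG resource
# name assumes NVIDIA's "mixed" MIG strategy; actual names depend on the
# GPU Operator configuration. Images are illustrative.

def gpu_pod(name: str, image: str,
            resource: str = "nvidia.com/gpu", count: int = 1) -> dict:
    """Build a Pod manifest requesting `count` units of a GPU resource."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {"containers": [{
            "name": name,
            "image": image,
            "resources": {"limits": {resource: str(count)}},
        }]},
    }

# A training job gets a whole GPU; a notebook gets a 1g.5gb slice.
full = gpu_pod("trainer", "pytorch/pytorch:latest")
sliced = gpu_pod("notebook", "jupyter/base-notebook",
                 resource="nvidia.com/mig-1g.5gb")
```

Because each vendor (NVIDIA MIG, AMD SR‑IOV) exposes different resource names and slicing granularities, platform teams end up templating this field per fleet, which is one source of the fragmentation the table describes.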
4. AI‑Specific Infrastructure Pain Points
- Technical complexity – No single standard; countless frameworks (TensorFlow, PyTorch, JAX) and platform components must interoperate.
- Operational efficiency – Utilization, provisioning speed, and networking become bottlenecks once the stack is assembled.
- User experience – Developers still face 6‑week GPU request cycles on hyperscalers; they demand instant, pre‑configured environments with reliable performance.
- Multi‑tenancy – High‑cost GPUs must be shared safely; different vendors expose different slicing APIs, leading to fragmentation and noise.
5. Kubernetes AI Conformance
- A conformance layer ensures that AI/ML workloads are portable across managed Kubernetes services (GKE, AKS, private clusters).
- Six conformance factors:
- Hardware accelerators – GPU, TPU, etc.
- Operators – Standardized CRDs for AI workloads.
- Scheduling – Consistent resource‑allocation semantics.
- Security / compliance – CVE scanning, policy enforcement (OPA/Kyverno).
- Observability – OpenTelemetry, Prometheus, Grafana.
- Lifecycle management – Version‑ed APIs, upgrade pathways.
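The security/compliance factor is the easiest to make concrete: admission policies (Kyverno, OPA/Gatekeeper) are essentially predicates over workload manifests. A toy, pure‑Python reduction of such a check (the rules chosen here are illustrative, not part of the conformance spec):

```python
# Toy illustration of the policy-enforcement conformance factor: a
# Kyverno/OPA-style admission check reduced to a pure-Python predicate.
# The two rules are common examples, not an official rule set.

def validate_pod(pod: dict) -> list:
    """Return a list of violations; an empty list means the pod is admitted."""
    violations = []
    for c in pod.get("spec", {}).get("containers", []):
        limits = c.get("resources", {}).get("limits", {})
        if not limits:
            violations.append(f"container {c['name']!r}: no resource limits set")
        if c.get("image", "").endswith(":latest"):
            violations.append(f"container {c['name']!r}: ':latest' tag disallowed")
    return violations

bad_pod = {"spec": {"containers": [
    {"name": "app", "image": "demo:latest"}  # no limits, floating tag
]}}
print(validate_pod(bad_pod))  # two violations
```

Real engines evaluate the same kind of predicate at admission time, so a non‑compliant AI workload is rejected before it ever touches a GPU.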
6. AI Workflow Pillars
| Pillar | Typical Open‑Source Tools |
|---|---|
| Training | PyTorch (used by ≈ 80 % of models on Hugging Face), TensorFlow, JAX |
| Inference | AIBrix (open‑source inference stack built around vLLM), llm‑d, vLLM (distributed inference, KV‑cache) |
| Agents | Custom LLM‑driven agents that perform node selection, memory management, or service orchestration |
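The KV‑cache mentioned in the inference row is, at its core, memoization of per‑prefix attention state so that shared prompt prefixes are computed once. A deliberately tiny Python sketch of the idea (real systems like vLLM cache GPU tensors in paged blocks, not strings):

```python
# Toy sketch of the KV-cache idea behind vLLM / llm-d style inference:
# attention keys/values for a token prefix are computed once and reused
# for every request sharing that prefix. Strings stand in for tensors.

cache = {}
computations = 0

def attend(prefix: tuple) -> str:
    """Pretend 'attention state' for a token prefix, memoized like a KV-cache."""
    global computations
    if prefix not in cache:
        computations += 1          # real systems save GPU FLOPs here
        cache[prefix] = "kv:" + "|".join(prefix)
    return cache[prefix]

for prompt in [("translate", "hello"), ("translate", "world")]:
    for i in range(1, len(prompt) + 1):
        attend(prompt[:i])         # shared prefix ("translate",) hits the cache

print(computations)  # -> 3 computations for 4 prefix lookups
```

Distributed inference frameworks extend this by sharding and migrating the cache across GPUs, which is why cache management dominates their scheduling design.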
7. Platform‑Engineering Blueprint (Presented by Bharath)
A five‑step model for building a resilient AI platform:
- Developer Experience – Unified portal, self‑service tooling (e.g., Backstage).
- Security – CVE mitigation, software‑supply‑chain policies (OPA, Kyverno).
- Foundation – CI/CD pipelines, IaC, API management (Kong), feature‑flag systems.
- Resilience Engineering – Incident management, chaos testing, reliability testing.
- Cost & Observability – Cloud‑cost dashboards (OpenCost), logging & tracing (Elastic, OpenTelemetry).
7.1 Highlighted Open‑Source Tools
| Tool | Function |
|---|---|
| K8sGPT | LLM‑driven debugging of K8s clusters (log analysis + remediation suggestions). |
| kagent | Framework for deploying and orchestrating AI agents inside K8s. |
| KitOps | Packages AI/ML models, code, and datasets as OCI‑compatible artifacts (ModelKits). |
| Kubeflow | End‑to‑end ML workflow engine (training → CI/CD → deployment). |
| KServe | Production‑grade inference serving (model versioning, autoscaling). |
| AIBrix / llm‑d / vLLM | Specialized inference runtimes with GPU‑aware scheduling and KV‑cache. |
| Nebius, Chainguard, SailPoint | Security‑focused solutions for AI pipelines. |
| QAI | Multi‑agent orchestration platform for K8s. |
| GitLab, Harness | AI‑enhanced software delivery pipelines. |
| AI‑SRE, Kubecost AI | Observability and cost‑optimization for AI workloads. |
8. “MCP Server” Concept
- MCP (Model Context Protocol) servers act as plug‑and‑play adapters for each cloud‑native component, exposing a standardized tool/API surface that AI agents can consume, which simplifies integration across the ecosystem.
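The adapter idea can be sketched in a few lines: every server advertises its tools through one uniform listing, and every tool is invoked through one uniform call path. This is a toy illustration only, not the real MCP wire protocol (which is JSON‑RPC based), and the `scale` tool is a stub:

```python
# Toy sketch of the MCP adapter pattern: a server advertises tools in a
# uniform shape and exposes a single call entry point, so an agent can
# integrate any component the same way. Not the real MCP wire protocol.

class ToyMCPServer:
    def __init__(self, name: str):
        self.name = name
        self._tools = {}

    def tool(self, description: str):
        """Decorator registering a function as a callable tool."""
        def register(fn):
            self._tools[fn.__name__] = {"description": description, "fn": fn}
            return fn
        return register

    def list_tools(self) -> list:
        return [{"name": n, "description": t["description"]}
                for n, t in self._tools.items()]

    def call(self, tool: str, **kwargs):
        return self._tools[tool]["fn"](**kwargs)

k8s = ToyMCPServer("kubernetes")

@k8s.tool("Scale a deployment to N replicas (stubbed; no cluster calls).")
def scale(deployment: str, replicas: int) -> str:
    return f"{deployment} scaled to {replicas}"

print(k8s.call("scale", deployment="demo-app", replicas=3))
```

An agent only needs `list_tools` and `call`; whether the server wraps Kubernetes, a GitOps engine, or a cost dashboard is invisible to it, which is the "plug‑and‑play" property the talk emphasized.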
9. k0rdent (Composable AI‑Ready Platform)
9.1 Architectural Overview
- Three layers:
  - Cluster Management – Powered by Cluster API (CAPI) with provider‑specific implementations (AWS, Azure, GCP, OpenStack, bare metal).
  - State Management – Handled by Sveltos (Helm‑based, GitOps‑ready) for services such as ingress, cert‑manager, and GPU operators.
  - Observability – Standard stack (Prometheus, Grafana, OpenTelemetry, OpenCost).
- Composable design: each layer is defined by YAML templates (cluster spec + service spec). Swapping a GPU type (e.g., T4 → H100) or a cloud provider is a single change in the template.
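The one‑change swap can be made concrete: if the whole cluster shape lives in one spec, replacing the GPU worker flavor is a single field edit. A hedged sketch (field names and Azure VM sizes are illustrative, not the exact k0rdent schema):

```python
# Sketch of the composable-template idea: the cluster shape is one spec
# dict, so swapping GPU type or provider is a one-field change. Field
# names and VM sizes are illustrative, not the exact k0rdent schema.

BASE_SPEC = {
    "provider": "azure",
    "controlPlane": {"flavor": "Standard_D4s_v5", "replicas": 3},
    "workers": {"flavor": "Standard_NC4as_T4_v3", "gpu": "tesla-t4", "replicas": 2},
    "services": ["gpu-operator", "istio", "cert-manager", "kserve"],
}

def with_gpu(spec: dict, flavor: str, gpu: str) -> dict:
    """Return a new spec with only the GPU worker pool changed."""
    return {**spec, "workers": {**spec["workers"], "flavor": flavor, "gpu": gpu}}

# T4 -> H100 is one call; everything else (services, control plane) is reused.
h100_spec = with_gpu(BASE_SPEC, "Standard_ND96isr_H100_v5", "h100")
```

The same pattern applies to the `provider` field, which is why the talk framed cloud portability as a templating problem rather than a re‑platforming project.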
9.2 Toolchain Choices
| Layer | Technology |
|---|---|
| Kubernetes Distribution | k0s (lightweight, single‑binary K8s distribution) |
| Cluster Provisioning | Cluster‑API (CAPI) |
| Service Deployment | Helm charts (no heavy IaC tools required) |
| GitOps | Argo CD (integrated with Sveltos) |
| GPU Operator | NVIDIA GPU Operator (or AMD equivalent) |
| Service Mesh | Istio |
| Ingress / Cert‑Management | Kong, cert‑manager |
| Serverless / Model Serving | Knative, KServe |
9.3 Live Demo (Azure GPU Cluster)
- Cluster spec: Azure VM series with Tesla T4 GPUs; control‑plane and worker flavors defined in a single Helm‑based YAML.
- Service spec: deploy the NVIDIA GPU Operator, Istio, cert‑manager, and KServe onto the k0s‑based cluster.
- Application: A translation service (English → Hindi) using an offline AI model. Demonstrated end‑to‑end provisioning in ≈ 15‑20 minutes.
- Observation: The demo highlighted a gap in Indian‑language models, prompting a call for India‑native AI assets.
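If the demo's translation model is served through KServe, clients reach it via KServe's v1 predict protocol (`POST /v1/models/<name>:predict` with an `instances` payload). A sketch of building such a request; the model name and payload field are illustrative placeholders, since the exact instance schema depends on the deployed predictor:

```python
# Sketch of a client request against a KServe v1 predict endpoint, as the
# demo's translation service would expose if served via KServe. The model
# name and the "text" field are illustrative assumptions.
import json

def predict_request(model: str, texts: list) -> tuple:
    """Return (path, JSON body) for a KServe v1 predict call."""
    path = f"/v1/models/{model}:predict"
    body = json.dumps({"instances": [{"text": t} for t in texts]})
    return path, body

path, body = predict_request("en-hi-translator", ["Hello, world"])
print(path)  # -> /v1/models/en-hi-translator:predict
```

The response mirrors the request shape (`{"predictions": [...]}`), so the same client code works unchanged whether the predictor runs on a T4 or an H100, which is the portability point the demo was making.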
10. Future Outlook
- Anticipated growth: ~75 % of cloud‑native engineers will be AI/ML engineers within the next few years.
- Roadmap: tighter integration of MCP servers for autonomous platform behavior, broader support for regional AI models, and continued open‑source convergence across the CNCF landscape.
11. Q&A / Closing Remarks
- The audience asked about GPU driver pain points, time to provision clusters, and availability of Indian language models.
- Bharath reiterated that the composable, Helm‑only approach reduces operational overhead and that the open‑source community is actively building the missing models.
- The session wrapped with a thank‑you from the moderators and an invitation to connect for deeper technical discussions.
Key Takeaways
- Kubernetes is the universal OS for AI workloads; its ecosystem now includes GPU‑aware scheduling and Dynamic Resource Allocation (DRA, GA in Kubernetes 1.34).
- More than half of cloud‑native developers are AI/ML engineers, and a growing fraction already run AI on Kubernetes.
- AI infrastructure challenges are three‑fold: technical complexity, operational efficiency, and multi‑tenancy/security/compliance.
- Kubernetes AI Conformance (six-factor checklist) is essential for portable, repeatable AI workloads across clouds.
- A five‑step platform‑engineering model (developer experience → cost & observability) provides a practical roadmap for building resilient AI platforms.
- Open‑source tooling (K8sGPT, kagent, AIBrix, Kubeflow, KServe, etc.) enables end‑to‑end AI pipelines without vendor lock‑in.
- k0rdent demonstrates that a composable, Helm‑templated stack can provision a full AI‑ready Kubernetes cluster (including GPU operators) in under 20 minutes.
- MCP servers act as standardized adapters, simplifying the integration of disparate cloud‑native components.
- India‑specific AI models are still scarce; the community is urged to develop and open‑source localized models.
- Future workforce shift: expect three‑quarters of cloud‑native engineers to be AI/ML focused, underscoring the strategic importance of open, cloud‑native AI platforms.