Problem: Writing K8s YAML is Tedious and Error-Prone
You need to deploy an app to Kubernetes, but spending 30 minutes writing YAML with correct indentation, resource limits, and security contexts feels like a waste. One typo and `kubectl apply` fails.
You'll learn:
- How to generate manifests using AI (Claude, ChatGPT, or local LLMs)
- What to specify for production-ready configs
- How to validate and customize generated YAML
Time: 12 min | Level: Intermediate
Why This Matters
Manual YAML writing causes the same problems week after week:
- Indentation errors that kubectl can't parse
- Missing resource limits causing OOMKills
- No liveness probes leading to zombie pods
- Insecure defaults (running as root, privileged containers)
AI tools can generate 80% of your manifest correctly in seconds. You spend time reviewing, not typing.
Solution
Step 1: Prepare Your Requirements
Before asking AI, know exactly what you need. Here's a template:
```text
# Save this as requirements.txt
App: my-api
Image: ghcr.io/company/my-api:v1.2.3
Port: 8080
Replicas: 3
Resources: 250m CPU, 512Mi memory
Environment: production namespace
Storage: 10Gi persistent volume for /data
Ingress: api.example.com with TLS
```
Why this works: Specific requirements = accurate output. Vague prompts get generic configs.
Expected: A clear list of deployment needs
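If you keep requirements in a file like this, turning them into a prompt can be automated. A minimal Python sketch (a hypothetical helper, not part of any tool; the `Key: value` format matches the template above):

```python
# build_prompt.py - turn a requirements file into an LLM prompt.
# Hypothetical helper; assumes the "Key: value" template shown above.

def build_prompt(requirements_text: str) -> str:
    """Convert 'Key: value' lines into a bulleted generation prompt."""
    bullets = []
    for raw in requirements_text.strip().splitlines():
        if ":" not in raw or raw.lstrip().startswith("#"):
            continue  # skip comments and lines without a key
        key, value = raw.split(":", 1)
        bullets.append(f"- {key.strip()}: {value.strip()}")
    header = "Generate a production-ready Kubernetes manifest for:\n"
    footer = "\nUse Kubernetes 1.30+ features. Include proper labels and annotations."
    return header + "\n".join(bullets) + footer

if __name__ == "__main__":
    sample = """\
# Save this as requirements.txt
App: my-api
Image: ghcr.io/company/my-api:v1.2.3
Port: 8080
Replicas: 3
"""
    print(build_prompt(sample))
```

Note that `split(":", 1)` keeps colons inside values intact, so image tags like `ghcr.io/company/my-api:v1.2.3` survive.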
Step 2: Generate the Base Manifest
Use this prompt with Claude, ChatGPT, or local LLM:
```text
Generate a production-ready Kubernetes manifest for:
- Deployment named "my-api" with 3 replicas
- Image: ghcr.io/company/my-api:v1.2.3
- Container port 8080
- Resource requests: 250m CPU, 512Mi memory
- Resource limits: 500m CPU, 1Gi memory
- Liveness probe: HTTP GET /health on port 8080
- Readiness probe: HTTP GET /ready on port 8080
- Run as non-root user (UID 1000)
- Security context: read-only root filesystem
- ConfigMap for env vars: API_KEY, DATABASE_URL
- PersistentVolumeClaim: 10Gi mounted at /data
- Service (ClusterIP) exposing port 8080
- Ingress with TLS for api.example.com

Use Kubernetes 1.30+ features. Include proper labels and annotations.
```
Expected: AI returns ~200 lines of YAML covering deployment, service, configmap, PVC, and ingress.
If it fails:
- Output too generic: add specific values for every resource in the prompt
- AI returns only a Deployment: ask explicitly for "all required resources including the Service and Ingress"
Step 3: Review Critical Sections
AI gets most things right, but always check these:
```yaml
# ✅ Verify resource limits exist (prevents OOMKills)
resources:
  requests:
    cpu: "250m"
    memory: "512Mi"
  limits:
    cpu: "500m"
    memory: "1Gi"

# ✅ Confirm probes have correct paths
livenessProbe:
  httpGet:
    path: /health            # Must match your actual endpoint
    port: 8080
  initialDelaySeconds: 30    # Long enough for app startup
  periodSeconds: 10

# ✅ Security context is non-root
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  readOnlyRootFilesystem: true   # Prevents container writes
  allowPrivilegeEscalation: false

# ✅ Labels are consistent
metadata:
  labels:
    app: my-api
    version: v1.2.3
    environment: production
```
Why this matters: AI might use generic values. Verify paths, timing, and UIDs match your app.
Step 4: Add Environment-Specific Config
AI doesn't know your cluster setup. Customize these manually:
```yaml
# Add your namespace
metadata:
  namespace: production

# Use your ingress class
spec:
  ingressClassName: nginx    # or traefik, haproxy

# Reference your TLS secret
tls:
  - secretName: api-example-com-tls   # Must exist in cluster
    hosts:
      - api.example.com

# Use correct storage class
spec:
  storageClassName: fast-ssd   # or gp3, standard
```
If it fails:
- Error: "storageclass not found": Run `kubectl get storageclass` and use the exact name
- Ingress not working: Check that the ingress controller is installed (`kubectl get pods -n ingress-nginx`)
Step 5: Validate Before Applying
Always dry-run first:
```bash
# Check YAML syntax
kubectl apply -f manifest.yaml --dry-run=client

# Validate against cluster (doesn't create resources)
kubectl apply -f manifest.yaml --dry-run=server

# See what will be created
kubectl diff -f manifest.yaml
```
Expected: No errors, diff shows only new resources.
If it fails:
- Error: "unknown field": AI used a deprecated API - ask for "Kubernetes 1.30+ syntax"
- Error: "invalid indentation": Copy-paste issue - run it through `yamllint manifest.yaml`
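Beyond kubectl's dry-run, a quick local scan can catch the review items from Step 3 before you touch the cluster. A minimal Python sketch (a plain substring scan, not a real YAML parser; the marker list is illustrative):

```python
# precheck.py - quick text-level sanity scan of a generated manifest.
# Illustrative only: a substring scan, not real YAML validation.

REQUIRED_MARKERS = {
    "resource limits": "limits:",
    "liveness probe": "livenessProbe:",
    "readiness probe": "readinessProbe:",
    "non-root user": "runAsNonRoot: true",
    "explicit namespace": "namespace:",
}

def precheck(manifest_text: str) -> list[str]:
    """Return human-readable warnings for each missing marker."""
    return [
        f"missing {name} ({marker!r} not found)"
        for name, marker in REQUIRED_MARKERS.items()
        if marker not in manifest_text
    ]

if __name__ == "__main__":
    import sys
    text = open(sys.argv[1]).read() if len(sys.argv) > 1 else ""
    for warning in precheck(text):
        print("WARN:", warning)
```

This complements, rather than replaces, `kubectl apply --dry-run=server`: the scan catches omissions, the dry-run catches schema errors.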
Step 6: Apply and Monitor
Deploy it:
```bash
# Apply all resources
kubectl apply -f manifest.yaml

# Watch rollout
kubectl rollout status deployment/my-api -n production

# Check pod health
kubectl get pods -n production -l app=my-api

# View logs
kubectl logs -n production -l app=my-api --tail=50
```
Expected: Pods reach Running state within 60 seconds, readiness probes pass.
Verification
Test the deployment:
```bash
# Check all resources created
kubectl get all,ingress,pvc -n production -l app=my-api

# Test service internally
kubectl run -it --rm debug --image=curlimages/curl --restart=Never -- \
  curl http://my-api.production.svc.cluster.local:8080/health

# Test ingress externally
curl https://api.example.com/health
```
You should see: HTTP 200 responses, all pods healthy.
Advanced: Using Local LLMs
For air-gapped clusters or sensitive workloads, use local models:
```bash
# Install Ollama and pull CodeLlama
curl -fsSL https://ollama.com/install.sh | sh
ollama pull codellama:13b

# Generate manifest
ollama run codellama:13b "Generate production K8s deployment for..."
```
Why this works: Keeps config generation offline, no data leaves your network.
Limitation: Smaller models (7B) produce less accurate YAML. Use 13B+ for production configs.
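Ollama also exposes a local HTTP API, which is handy for scripting generation instead of using the interactive CLI. A sketch using only the standard library (assumes Ollama's default endpoint on localhost:11434; the prompt text is illustrative):

```python
# ollama_generate.py - script manifest generation against a local Ollama server.
# Assumes Ollama is running on its default port (11434).
import json
import urllib.request

def build_request(prompt: str, model: str = "codellama:13b") -> dict:
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    # stream=False returns one JSON object instead of a token stream
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, model: str = "codellama:13b") -> str:
    """Send the prompt to the local Ollama server and return the reply text."""
    payload = json.dumps(build_request(prompt, model)).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Print the payload only; call generate() when a server is running
    print(json.dumps(build_request("Generate a production K8s Deployment for my-api"), indent=2))
```

Pipe the output into a file, then run it through the Step 5 validation before applying.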
What You Learned
- AI generates 80% of K8s manifests correctly in seconds
- Always verify resource limits, probes, and security contexts manually
- Use dry-run validation before applying to production
When NOT to use AI:
- Complex multi-tenant configs (use Helm or Kustomize)
- Stateful apps with specific ordering requirements
- When you're learning K8s (write manifests manually first to understand)
Next steps:
- Use Kustomize to manage environment variants
- Set up GitOps with ArgoCD for automated deployments
- Learn Helm for reusable chart templates
Real-World Example
Here's what a full AI-generated manifest looks like:
```yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-api-config
  namespace: production
  labels:
    app: my-api
data:
  API_KEY: "placeholder-change-me"
  DATABASE_URL: "postgresql://db.production.svc:5432/mydb"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-api-data
  namespace: production
  labels:
    app: my-api
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
  namespace: production
  labels:
    app: my-api
    version: v1.2.3
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-api
  template:
    metadata:
      labels:
        app: my-api
        version: v1.2.3
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
      containers:
        - name: api
          image: ghcr.io/company/my-api:v1.2.3
          ports:
            - containerPort: 8080
              name: http
              protocol: TCP
          env:
            - name: API_KEY
              valueFrom:
                configMapKeyRef:
                  name: my-api-config
                  key: API_KEY
            - name: DATABASE_URL
              valueFrom:
                configMapKeyRef:
                  name: my-api-config
                  key: DATABASE_URL
          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"
            limits:
              cpu: "500m"
              memory: "1Gi"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 1000
            capabilities:
              drop:
                - ALL
          volumeMounts:
            - name: data
              mountPath: /data
            - name: tmp
              mountPath: /tmp
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: my-api-data
        - name: tmp
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: my-api
  namespace: production
  labels:
    app: my-api
spec:
  type: ClusterIP
  ports:
    - port: 8080
      targetPort: 8080
      protocol: TCP
      name: http
  selector:
    app: my-api
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-api
  namespace: production
  labels:
    app: my-api
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.example.com
      secretName: api-example-com-tls
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-api
                port:
                  number: 8080
```
What's good here:
- Security context properly configured (non-root, read-only fs)
- Resource limits prevent resource exhaustion
- Probes ensure traffic only goes to healthy pods
- Tmp volume allows writes despite read-only root
- All resources properly labeled for tracking
What to change:
- Replace placeholder API_KEY with actual secret reference
- Adjust replica count based on load
- Tune probe timing for your app's startup time
- Add horizontal pod autoscaler if traffic varies
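Two of those changes can be sketched directly in YAML. A hedged example (the Secret name `my-api-secrets`, the placeholder value, and the HPA thresholds are all illustrative, not from the generated manifest):

```yaml
# Move API_KEY out of the ConfigMap into a Secret (name is a placeholder)
apiVersion: v1
kind: Secret
metadata:
  name: my-api-secrets
  namespace: production
type: Opaque
stringData:
  API_KEY: "real-value-from-your-secret-store"
---
# In the Deployment's env, reference the Secret instead of the ConfigMap:
#   - name: API_KEY
#     valueFrom:
#       secretKeyRef:
#         name: my-api-secrets
#         key: API_KEY
---
# Scale on CPU if traffic varies (thresholds are illustrative)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-api
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-api
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Note that an HPA scaling on CPU utilization requires the resource requests shown earlier, since utilization is computed against the request.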
Common Pitfalls
1. Trusting AI blindly: Generated YAML might use deprecated APIs or incorrect probe paths. Always review.
2. Skipping resource limits: Without limits, one pod can consume all node resources and starve or evict other workloads.
3. Running as root: Many AI models default to the root user. Explicitly request a non-root security context.
4. Forgetting validation: Dry-run catches most errors before they hit production. Never skip it.
5. Not testing locally: Use minikube or kind to test manifests before deploying to shared clusters.
Tools Comparison
AI Tools for Manifest Generation:
| Tool | Pros | Cons | Best For |
|---|---|---|---|
| Claude Sonnet 4.5 | Best at K8s specifics, understands security | Requires API key | Production configs |
| ChatGPT 4 | Good all-around, explains choices | Sometimes verbose | Learning + production |
| Copilot | IDE integration, inline suggestions | Needs GitHub subscription | Active development |
| Local LLM (CodeLlama) | Offline, no data leak | Less accurate on edge cases | Air-gapped environments |
| K8sGPT | Purpose-built for K8s, includes cluster analysis | New tool, limited features | Debugging existing clusters |
Traditional Tools (still useful):
- `kubectl create deployment --dry-run=client -o yaml`: Fast for simple deploys
- Helm: Better for reusable charts across environments
- Kustomize: Better for managing variants (dev/staging/prod)
Use AI for one-off deployments or initial configs. Use Helm/Kustomize for repeated patterns.
Production Checklist
Before deploying AI-generated manifests to production:
- Resource requests and limits set
- Liveness and readiness probes configured
- Security context: non-root, read-only filesystem
- Labels include app, version, environment
- Namespace explicitly set (not default)
- ConfigMap secrets moved to Secret resources
- Ingress TLS secret exists in cluster
- Storage class matches cluster configuration
- Replica count appropriate for traffic
- Validated with `kubectl apply --dry-run=server`
- Tested in staging environment first
- Monitored post-deployment (logs, metrics, alerts)
Tested with Kubernetes 1.30.x, kubectl 1.30.0, and Claude Sonnet 4.5. Works on GKE, EKS, AKS, and on-prem clusters.