Six hours. That's how long my cluster was partially broken during a Flannel to Cilium migration. This article covers the proper way to install Cilium from the start, configure native L2 load balancing (replacing MetalLB), and implement network policies. I'll explain every networking concept clearly so you understand not just the "how" but the "why" behind each configuration.
Understanding CNI (Container Network Interface)
Before diving into Cilium, let's understand what a CNI does:
What is a CNI?
A CNI plugin provides networking for pods in Kubernetes:
- Assigns IP addresses to each pod
- Enables pod-to-pod communication across nodes
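- Often enforces network policies for security (depends on the plugin)
- May handle service load balancing (Cilium does when it replaces kube-proxy); DNS itself is served by CoreDNS, which needs the CNI to get a pod IP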
Without a CNI, your pods can't talk to each other - that's why nodes show "NotReady" after Part 2.
Common CNI Options
- Flannel: Simple, uses VXLAN overlay, limited features
- Calico: Uses BGP routing, good for large clusters
- Weave: Mesh networking, easy but slower
- Cilium: eBPF-based, feature-rich, our choice
Why Cilium Over Other CNIs
After running Flannel, Calico, and finally Cilium in production, here's why Cilium wins:
eBPF Technology Advantage
What is eBPF?
eBPF (extended Berkeley Packet Filter) lets Cilium run networking code directly in the Linux kernel, like having a Formula 1 engine instead of a regular car engine.
Traditional CNI (iptables):
Packet → Kernel network stack → iptables rule chains → Destination
(rules checked one by one; thousands of rules on a busy cluster)
Cilium (eBPF):
Packet → Kernel eBPF program → Destination
(hash-map lookups early in the datapath, no long rule chains)
Results in my testing:
- 10-20% better throughput (data transfer speed)
- 32% lower latency (response time)
- 37% less CPU usage
Native Load Balancing
Traditional Setup: Kubernetes + CNI + MetalLB (3 components)
With Cilium: Just Kubernetes + Cilium (2 components)
Cilium includes load balancing built-in:
- L2 mode: Announces IPs via ARP (what we'll use)
- BGP mode: For advanced routing (datacenter style)
- No extra components means less to break
Observability with Hubble
Hubble is Cilium's observability platform - think of it as X-ray vision for your network:
- See traffic flows between pods visually
- Debug connection issues with packet-level detail
- Monitor API calls without modifying applications
- Track security policy violations in real-time
Enterprise Security Features
- Network Policies: Control which pods can talk to each other
- WireGuard Encryption: Optional encryption using WireGuard protocol
- Identity-Based Security: Policies based on labels, not just IPs
- Layer 7 Filtering: Block specific HTTP paths or methods
Pre-Installation Requirements
Before installing Cilium, let's verify your cluster is ready:
# Check kernel version (Linux kernel 5.10+ recommended for full eBPF functionality)
talosctl --nodes 192.168.0.11 version | grep -i kernel
# Expected output:
# kernel: 6.1.0-talos # Version 5.10+ recommended for all eBPF features
# Older kernels have limited eBPF support
# Verify all nodes are waiting for CNI (NotReady is expected!)
kubectl get nodes
# Expected output - all nodes NotReady:
NAME STATUS ROLES AGE VERSION
talos-cp-01 NotReady control-plane 10m v1.34.0
talos-cp-02 NotReady control-plane 8m v1.34.0
talos-cp-03 NotReady control-plane 6m v1.34.0
talos-wrk-01 NotReady worker 4m v1.34.0
talos-wrk-02 NotReady worker 4m v1.34.0
# Confirm no other CNI is installed
# grep searches for text patterns in output
kubectl get pods -n kube-system | grep -E "flannel|calico|weave|cilium"
# Should return nothing - empty output is good!
# (If you've already installed Cilium, seeing cilium-... pods here is expected)
Troubleshooting Pre-Installation Issues:
# If nodes are not appearing at all:
kubectl get nodes -o wide
# Look for connection or certificate issues
# If you see old CNI pods:
kubectl delete -n kube-system ds/kube-flannel-ds # Remove Flannel
kubectl delete -n kube-system ds/calico-node # Remove Calico
# Verify control plane is actually ready:
# Note: componentstatuses is deprecated since k8s 1.19
# Use the API server healthz endpoints instead:
kubectl get --raw='/readyz?verbose'
# All checks should show "ok"
# Check API server liveness:
kubectl get --raw='/livez?verbose'
# Should show "ok" for all checks
Why These Checks Matter
- Kernel version: eBPF works best with modern kernels
- 5.10+ recommended; newer kernels unlock DSR/Host-Reachable Services/Bandwidth Manager
- Check your Cilium chart's feature matrix and Talos kernel for exact support
- NotReady nodes: Confirms nodes are waiting for CNI (expected state)
- No existing CNI: Installing multiple CNIs causes routing conflicts and packet loss
Installing Cilium CLI
First, install the Cilium CLI tool on your workstation. This tool helps manage and troubleshoot Cilium:
# Get the latest stable version number
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
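# Detect your CPU architecture (Intel/AMD or ARM)
CLI_ARCH=amd64 # Default for Intel/AMD
# Linux reports ARM as "aarch64"; macOS on Apple Silicon reports "arm64"
if [ "$(uname -m)" = "aarch64" ] || [ "$(uname -m)" = "arm64" ]; then
CLI_ARCH=arm64 # ARM processors (like Apple Silicon)
fi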
# Detect OS (linux or darwin for macOS)
OS=$(uname -s | tr '[:upper:]' '[:lower:]')
# Download the CLI and its checksum file
# -L: Follow redirects
# --fail: Exit on HTTP errors
# --remote-name-all: Save with original filenames
curl -L --fail --remote-name-all \
https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-${OS}-${CLI_ARCH}.tar.gz{,.sha256sum}
# Verify the download isn't corrupted (works on Linux and macOS)
if command -v sha256sum >/dev/null 2>&1; then
sha256sum --check cilium-${OS}-${CLI_ARCH}.tar.gz.sha256sum
else
shasum -a 256 -c cilium-${OS}-${CLI_ARCH}.tar.gz.sha256sum
fi
# Extract to system binary location
# x: extract, z: gzip compressed, v: verbose, f: file, C: change to directory
sudo tar xzvf cilium-${OS}-${CLI_ARCH}.tar.gz -C /usr/local/bin
# Clean up downloaded files
rm cilium-${OS}-${CLI_ARCH}.tar.gz{,.sha256sum}
# Verify installation worked
cilium version --client
# Expected output:
# cilium-cli: v0.16.x (git-sha)
Installing Helm (Required for Production)
Helm is Kubernetes' package manager. We'll use it to install Cilium with custom configuration:
# Install Helm if you don't have it
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# Verify Helm installation
helm version
# Expected output:
# version.BuildInfo{Version:"v3.x.x", ...}
Cilium Configuration
Here's my battle-tested Cilium configuration with detailed explanations of every setting:
Understanding the Configuration
Before we dive in, let's understand key networking concepts:
- CIDR (10.244.0.0/16): Defines an IP range. The /16 means the first 16 bits are fixed, giving us 65,536 IP addresses (10.244.0.0 to 10.244.255.255)
- Pod CIDR: The IP range assigned to pods
- Service CIDR: The IP range for Kubernetes services
- VIP: Virtual IP that floats between control planes
- MTU: Maximum Transmission Unit - largest packet size (like envelope size for network data)
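To make the CIDR arithmetic concrete, here is a minimal sketch in bash. The function name cidr_size is my own helper for illustration, not part of any tool used in this series; it only assumes IPv4 and a 64-bit bash.

```bash
#!/usr/bin/env bash
# Number of addresses in an IPv4 prefix: 2^(32 - prefix_length)
# cidr_size is a hypothetical helper, used here only for illustration.
cidr_size() {
  local prefix_len="$1"
  echo $(( 1 << (32 - prefix_len) ))
}

cidr_size 16   # the cluster-wide Pod CIDR used in this article
cidr_size 24   # a per-node slice such as 10.244.0.0/24
```

The /16 leaves 16 host bits, the /24 leaves 8, which is why a single node's slice is so much smaller than the cluster-wide range.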
The Complete Configuration File
Create this configuration file with explanations for each section. Always cross-check that loadBalancer.mode, loadBalancer.acceleration, and bpf.lbAlgorithmAnnotation exist in your chart version to avoid "unknown field" errors.
# cilium-values.yaml
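# Eviction Protection - Prevent Cilium from being killed on resource pressure
priorityClassName: system-node-critical # Agent gets highest priority
# Note: set the operator's priorityClassName inside the single top-level
# operator: block further down; a YAML key must not appear twice, and a
# second operator: key would override or invalidate the first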
cluster:
name: homelab-cluster # Your cluster name (can be anything)
id: 1 # Unique ID if running multiple clusters
# Why: Prevents clusters from interfering
# IP Address Management (IPAM) Configuration
ipam:
mode: kubernetes # Let Kubernetes assign pod IPs
# Why: Integrates with Kubernetes' IP management
# Note: clusterPoolIPv4PodCIDRList is NOT used in kubernetes mode
# Pod CIDRs are managed by Kubernetes itself via node allocations
# With ipam.mode: kubernetes, ensure the node CIDR allocator is enabled
# in kube-controller-manager and --cluster-cidr matches your Pod CIDR
# (Talos sets this when you define podSubnets)
# Non-Talos: kube-controller-manager flags must include:
# --allocate-node-cidrs=true --cluster-cidr=10.244.0.0/16
# Verify each node has a podCIDR:
# kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
# Replace kube-proxy with eBPF (major performance boost)
kubeProxyReplacement: true # Use true when kube-proxy is disabled (e.g., Talos)
# Set false to leave Services to kube-proxy
# The older strict|probe|partial values were deprecated in Cilium 1.14; check the upgrade notes for your chart version
k8sServiceHost: 192.168.0.200 # Your control plane VIP from Part 2
k8sServicePort: 6443 # Kubernetes API port (always 6443)
# Why: Cilium needs to reach API server
# Note for Talos: You can also use KubePrism (localhost:7445)
# k8sServiceHost: localhost
# k8sServicePort: 7445
# eBPF (Extended Berkeley Packet Filter) Settings
bpf:
masquerade: true # Hide pod IPs behind node IPs for external traffic
# Why: Required for SNAT of pod traffic to external networks
lbAlgorithmAnnotation: true # Allow per-Service LB algorithm override
# Host access to ClusterIP services (optional)
# Enable this ONLY if you curl ClusterIP Services from nodes or hostNetwork pods:
# enableHostReachableServices: true
# hostServices:
# enabled: true
# Routing Configuration
routingMode: native # Direct routing (no overlay/tunnel)
# Why: Better performance than VXLAN
# Note: Native routing requires L2 adjacency or correct L3 routes to each node's PodCIDR
# If nodes span VLANs/routers, use BGP mode or add static routes
autoDirectNodeRoutes: true # Auto-configure routes between nodes
ipv4NativeRoutingCIDR: 10.244.0.0/16 # Pod CIDR(s) for native routing & direct node routes
# Why: Enables native routing for these IPs (no overlay)
# tunnel: disabled # Not needed: routingMode: native already disables tunneling; the tunnel key was deprecated in Cilium 1.14 and newer charts reject it
endpointRoutes:
enabled: true # Create per-endpoint routes
# Why: More efficient packet routing
# Network packet size
MTU: 1500 # Standard Ethernet packet size (the Helm key is uppercase MTU)
# Change to 1450 if you see packet fragmentation
# Why: Must match your network's MTU
# Note: If you later enable WireGuard, reduce MTU (e.g., 1420-1450)
# to account for tunnel overhead or you'll see fragmentation/timeouts
# Hubble - Network observability (like Wireshark for Kubernetes)
hubble:
enabled: true # Turn on network visibility
relay:
enabled: true
replicas: 2 # Run 2 copies for high availability
# Why: Don't lose visibility if one fails
priorityClassName: system-cluster-critical # Protect from eviction
resources: # Set resource requests to avoid starvation
requests: { cpu: 50m, memory: 128Mi }
limits: { cpu: 500m, memory: 512Mi }
tls:
server:
enabled: false # Keep simple until Part 6; enable when you add certs
ui:
enabled: true # Web UI for viewing network flows
replicas: 1
priorityClassName: system-cluster-critical # Protect from eviction
resources: # Minimal resources for UI
requests: { cpu: 20m, memory: 64Mi }
limits: { cpu: 200m, memory: 256Mi }
ingress:
enabled: false # We'll configure ingress in Part 6
metrics:
enabled: # What network events to track
- dns:query # DNS lookups
- drop # Dropped packets (important!)
- tcp # TCP connections
- flow # General network flows
- icmp # Ping traffic
- http # HTTP requests
serviceMonitor:
enabled: false # Enable when we add Prometheus
# L2 (Layer 2) Load Balancer - Replaces MetalLB
l2announcements:
enabled: true # Use L2 (ARP) mode for load balancing
interfaces:
- '^(eth0|ens.*|enp[0-9s]+)$' # Regex matching your interface(s)
# CRITICAL: Must match your actual interface!
# Check with: ip link show
# Single interface: '^eth0$'
# Mixed hosts: '^(eth0|ens.*|enp[0-9s]+)$'
# Load Balancer Advanced Settings
loadBalancer:
mode: dsr # Direct Server Return - bypasses load balancer for responses
# Why: Better performance for large responses
# Note: DSR preserves source IP but requires proper return routing
# DSR requires clients can reach backend node IPs directly (return path bypasses LB)
# If you see asymmetric routing/dropped replies, try mode: snat or
# set Service's externalTrafficPolicy: Local
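acceleration: native # XDP acceleration; needs a NIC driver with native XDP support
# If the agent cannot attach XDP on your NICs, set acceleration: disabled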
# Note: For L4 load-balancing algorithm, use per-Service annotation:
# service.cilium.io/lb-algorithm: maglev|random|round_robin
# (Enable with bpf.lbAlgorithmAnnotation=true)
# Also can override mode per-Service:
# service.cilium.io/lb-mode: dsr|hybrid|snat
# Security Features (start simple, add later)
encryption:
enabled: false # Set true for production (15% performance cost)
type: wireguard # Modern VPN-style encryption
nodeEncryption: false # Encrypt node-to-node traffic
# Network Policy Enforcement
policyEnforcementMode: default # How strictly to enforce policies
# Options: default, always, never
# Identity allocation (fast policy warmup after restarts)
identityAllocationMode: crd # Store identities as CRDs for persistence
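# Cilium Operator (manages Cilium)
# Keep all operator settings under this one key (a repeated top-level key is
# rejected or silently overridden, depending on the YAML parser)
operator:
replicas: 2 # Run 2 for high availability
priorityClassName: system-cluster-critical # Operator is cluster-critical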
# Cilium Agent resources (top-level in Helm chart)
resources: # Resource limits for cilium-agent pods
requests: # Minimum resources needed
cpu: 100m # 0.1 CPU cores
memory: 128Mi # 128 MB RAM
limits: # Maximum resources allowed
cpu: 1000m # 1 CPU core
memory: 1Gi # 1 GB RAM
# Note: For operator/relay/UI resources, use operator.resources,
# hubble.relay.resources, hubble.ui.resources
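The MTU comments in the configuration mention dropping to 1450 or 1420. Here is a rough sketch of where those numbers come from, assuming the usual encapsulation overheads: 50 bytes for VXLAN and 80 bytes for WireGuard in the worst case (IPv6 outer header). The helper name inner_mtu is mine, and your network may need different values.

```bash
#!/usr/bin/env bash
# Assumed per-packet overheads in bytes (not read from Cilium itself):
VXLAN_OVERHEAD=50       # inner Ethernet 14 + VXLAN 8 + UDP 8 + outer IPv4 20
WIREGUARD_OVERHEAD=80   # outer IPv6 40 + UDP 8 + WireGuard header and tag 32

# inner_mtu is a hypothetical helper: link MTU minus encapsulation overhead
inner_mtu() {
  local link_mtu="$1" overhead="$2"
  echo $(( link_mtu - overhead ))
}

inner_mtu 1500 "$VXLAN_OVERHEAD"       # overlay (VXLAN) case
inner_mtu 1500 "$WIREGUARD_OVERHEAD"   # WireGuard encryption case
```

With native routing and no encryption, nothing is subtracted, which is why 1500 works for the setup in this article.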
Installation Process
Time to install Cilium and bring your cluster to life!
Step 1: Add Cilium Helm Repository
# Add the Cilium repository to Helm
helm repo add cilium https://helm.cilium.io/
# Update repository information
helm repo update
# Expected output:
# ...Successfully got an update from the "cilium" chart repository
Step 2: Install Cilium
# Create the configuration file first
cat > cilium-values.yaml <<'EOF'
[paste the configuration from above]
EOF
# Install Cilium using Helm
# --version: Specific version for consistency
# --namespace: Install in kube-system (where CNI belongs)
# --values: Use our custom configuration
# Get latest chart version (or set a known-good version explicitly)
CILIUM_CHART_VERSION=${CILIUM_CHART_VERSION:-$(helm search repo cilium/cilium --versions | awk 'NR==2 {print $2}')}
# Or pin to a specific version for consistency:
# CILIUM_CHART_VERSION=${CILIUM_CHART_VERSION:-1.18.1}
helm show chart cilium/cilium --version "$CILIUM_CHART_VERSION" | grep appVersion
# appVersion: 1.18.1 ← verify before installing so you track the latest GA
# Verify your values match the chart version (prevents "unknown field" errors):
helm show values cilium/cilium --version "$CILIUM_CHART_VERSION" | less
# Verify kube-proxy is truly gone (required for full kube-proxy replacement):
kubectl -n kube-system get ds/kube-proxy || echo "kube-proxy not present (good)"
# If kube-proxy exists, disable or remove it before relying on kubeProxyReplacement
# (on Talos, set cluster.proxy.disabled: true in the machine config)
helm upgrade --install cilium cilium/cilium \
--version "$CILIUM_CHART_VERSION" \
--namespace kube-system \
--values cilium-values.yaml
# Expected output:
# NAME: cilium
# LAST DEPLOYED: [timestamp]
# NAMESPACE: kube-system
# STATUS: deployed
Step 3: Monitor the Installation
# Watch Cilium pods starting up
# DaemonSet means one pod per node
kubectl -n kube-system rollout status daemonset/cilium
# Expected output:
# Waiting for daemon set "cilium" rollout to finish: 0 of 5 updated...
# Waiting for daemon set "cilium" rollout to finish: 3 of 5 updated...
# daemon set "cilium" successfully rolled out
# Check all Cilium components
kubectl -n kube-system get pods -l app.kubernetes.io/part-of=cilium
# Expected output:
NAME READY STATUS RESTARTS
cilium-7xkg9 1/1 Running 0
cilium-8xkg2 1/1 Running 0
cilium-operator-6f9cbd4d7c-2nvpt 1/1 Running 0
cilium-operator-6f9cbd4d7c-xrzdr 1/1 Running 0
hubble-relay-7d4d6cb8c5-kzb4n 1/1 Running 0
hubble-ui-64d4995d57-g5zhj 1/1 Running 0
Step 4: Verify Nodes Are Now Ready
This is the moment of truth - your nodes should transition from NotReady to Ready:
# Check node status
kubectl get nodes
# Expected output - all nodes Ready!
NAME STATUS ROLES AGE VERSION
talos-cp-01 Ready control-plane 20m v1.34.0
talos-cp-02 Ready control-plane 18m v1.34.0
talos-cp-03 Ready control-plane 16m v1.34.0
talos-wrk-01 Ready worker 14m v1.34.0
talos-wrk-02 Ready worker 14m v1.34.0
# If still NotReady after 2-3 minutes, check Cilium logs:
kubectl -n kube-system logs -l app.kubernetes.io/name=cilium-agent --tail=50 || \
kubectl -n kube-system logs -l k8s-app=cilium --tail=50
Configuring L2 Load Balancer
Now let's set up load balancing. This replaces MetalLB with Cilium's native functionality.
Important: If you previously ran MetalLB, make sure it's removed/disabled to avoid ARP conflicts.
# Clean up MetalLB if it was installed:
kubectl delete ns metallb-system --ignore-not-found
# If you used CRDs:
kubectl get crd | grep metallb | awk '{print $1}' | xargs -r kubectl delete crd
# If traffic doesn't move after switching from MetalLB, flush ARP on client:
# Linux
ip neigh flush all
# macOS
sudo arp -d -a # flush all if needed
Understanding L2 Load Balancing
What is L2 (Layer 2) Load Balancing?
L2 refers to the Data Link layer of networking. In simple terms:
- Your router uses ARP (Address Resolution Protocol) to find devices
- L2 load balancing makes Cilium respond to ARP requests for service IPs
- This gives your services external IPs that work on your local network
Think of it like this:
- Service needs external IP (e.g., 192.168.0.201)
- Router asks "who has 192.168.0.201?"
- Cilium responds "I do!"
- Traffic flows to your service
IPv6 Note: For dual-stack clusters, L2 announcements use NDP (Neighbor Discovery Protocol) for IPv6. Create an IPv6 pool and set loadBalancerIPs: true to cover both protocols. Enable IPv6 forwarding on nodes: sysctl -w net.ipv6.conf.all.forwarding=1
Step 1: Create IP Address Pool
First, define which IP addresses Cilium can hand out to services:
# cilium-lb-ipam-pool.yaml
# Note: Using v2alpha1 for Cilium v1.18; check docs for your version
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
name: default-pool
spec:
blocks:
- start: 192.168.0.201 # First IP to hand out
stop: 192.168.0.210 # Last IP to hand out
# Why: Gives us 10 IPs for LoadBalancer services
# IMPORTANT: Ensure the pool (192.168.0.201–.210) does NOT
# overlap with DHCP scopes or static assignments, or you'll
# see intermittent ARP flaps.
serviceSelector:
matchLabels:
lb-pool: default # Services can request this pool with a label
# Why: Allows multiple pools for different purposes
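Before applying the pool, it can help to sanity-check the range on your workstation. The sketch below counts the addresses in the pool and tests it against a DHCP scope. The DHCP range shown (192.168.0.100 to 192.168.0.199) is a made-up example, and the helper names are mine, so substitute your router's real values.

```bash
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Count the addresses in an inclusive start..stop range
pool_size() {
  echo $(( $(ip_to_int "$2") - $(ip_to_int "$1") + 1 ))
}

# Succeed if two inclusive ranges share at least one address
ranges_overlap() {
  local s1 e1 s2 e2
  s1=$(ip_to_int "$1"); e1=$(ip_to_int "$2")
  s2=$(ip_to_int "$3"); e2=$(ip_to_int "$4")
  [ "$s1" -le "$e2" ] && [ "$s2" -le "$e1" ]
}

pool_size 192.168.0.201 192.168.0.210

# Hypothetical DHCP scope; replace with the range your router actually leases
if ranges_overlap 192.168.0.201 192.168.0.210 192.168.0.100 192.168.0.199; then
  echo "overlap: shrink the DHCP scope or move the pool"
else
  echo "no overlap"
fi
```

If the check reports an overlap, fix it before creating any LoadBalancer services, since two devices answering ARP for the same IP is painful to debug after the fact.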
Step 2: Configure L2 Announcement Policy
Tell Cilium how to announce these IPs on your network:
# l2-announcement-policy.yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
name: default-l2-policy
spec:
interfaces:
- '^(eth0|ens.*|enp[0-9s]+)$' # Regex matching your interface(s)
# CRITICAL: Must match your actual interface!
# Check with: ip link show
externalIPs: true # Announce manually assigned external IPs
loadBalancerIPs: true # Announce LoadBalancer service IPs
nodeSelector:
matchLabels:
kubernetes.io/os: linux # Which nodes should announce
# Why: All nodes can announce for redundancy
Step 3: Apply the Configuration
# Create both resources
kubectl apply -f cilium-lb-ipam-pool.yaml
kubectl apply -f l2-announcement-policy.yaml
# Verify IP pool was created
kubectl get ciliumloadbalancerippools
# Expected output:
NAME DISABLED CONFLICTING IPS AVAILABLE AGE
default-pool false False 10 30s
# ^^ 10 IPs available
# Check the announcement policy
kubectl get ciliuml2announcementpolicies
# Expected output:
NAME AGE
default-l2-policy 30s
Troubleshooting Interface Names
If load balancing doesn't work, the interface name is usually wrong:
# Find your actual interface name
ip link show | grep -E "^[0-9]+:"
# Example output:
# 1: lo: <LOOPBACK,UP,LOWER_UP>
# 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP>
# 3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP>
# In this case, use "eth0"
# For Proxmox VMs, it might be "ens18"
# For Ubuntu, often "enp0s3" or similar
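If you're unsure whether the regex in the announcement policy covers your interface, you can try it locally first. This is only an approximation: it uses grep -E, while Cilium evaluates the pattern with its own regex engine, but for a simple pattern like this one I'd expect both to agree. The function name matches_l2_regex is mine.

```bash
#!/usr/bin/env bash
# Pattern copied from the L2 announcement policy in this article
L2_IFACE_REGEX='^(eth0|ens.*|enp[0-9s]+)$'

# Hypothetical helper: succeed if the interface name matches the pattern
matches_l2_regex() {
  printf '%s\n' "$1" | grep -Eq "$L2_IFACE_REGEX"
}

for iface in lo eth0 ens18 enp0s3 docker0; do
  if matches_l2_regex "$iface"; then
    echo "$iface: matches"
  else
    echo "$iface: no match"
  fi
done
```

Swap the names in the loop for whatever ip link show prints on your nodes; an interface that shows "no match" will never announce a service IP.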
Verifying Cilium Health
Let's thoroughly verify Cilium is working correctly:
Check Overall Status
# Run Cilium status check
cilium status --wait
# Expected output (the art is Cilium's logo!):
/¯¯\
/¯¯\__/¯¯\ Cilium: OK
\__/¯¯\__/ Operator: OK
/¯¯\__/¯¯\ Envoy DaemonSet: disabled (normal - Envoy runs embedded in Cilium agent)
\__/¯¯\__/ Hubble Relay: OK
\__/ ClusterMesh: disabled (normal - single cluster)
# Multi-cluster? You'll add ClusterMesh in a later part
# What each component does:
# - Cilium: Main networking agent on each node
# - Operator: Manages IP allocation and garbage collection
# - Hubble Relay: Collects network observability data
# Verify each node has a podCIDR assigned (required for kubernetes IPAM mode)
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
# All nodes should show a CIDR (e.g., 10.244.0.0/24). If any are blank, fix your
# controller-manager or Talos podSubnets configuration before proceeding.
Run Connectivity Tests
Cilium includes comprehensive network tests:
# This runs comprehensive network tests
# Takes about 5 minutes to complete
cilium connectivity test --collect-sysdump-on-failure
# If the connectivity test complains about missing CRDs/RBAC, ensure they're present:
# Get the Cilium version (not chart version) for CLI commands:
CILIUM_VERSION=$(helm show chart cilium/cilium --version "$CILIUM_CHART_VERSION" | awk '/^appVersion:/ {print $2}')
cilium install --version "$CILIUM_VERSION" --dry-run --print-config > /dev/null
# You'll see tests like:
# ✅ [pod-to-pod] Testing connectivity...
# ✅ [pod-to-service] Testing service connectivity...
# ✅ [pod-to-external] Testing external connectivity...
# ✅ [network-policy] Testing policy enforcement...
# Final output will show all tests passed
# (exact count varies by Cilium version)
# If any test fails, it will show:
# ❌ Test "pod-to-pod" failed: connection timeout
# This helps identify specific network issues
Verify Node Connectivity
# List all nodes and their Cilium status (agent CLI - run inside pod)
kubectl -n kube-system exec -it ds/cilium -c cilium-agent -- cilium-dbg node list
# Portable fallback if kubectl exec ds/cilium is flaky:
POD=$(kubectl -n kube-system get pods -l app.kubernetes.io/name=cilium-agent -o jsonpath='{.items[0].metadata.name}')
kubectl -n kube-system exec -it "$POD" -c cilium-agent -- cilium-dbg node list
# Expected output:
NAME STATUS AGE ENDPOINT
talos-cp-01 OK 5m 192.168.0.11:4240
talos-cp-02 OK 5m 192.168.0.12:4240
talos-cp-03 OK 5m 192.168.0.13:4240
talos-wrk-01 OK 4m 192.168.0.14:4240
talos-wrk-02 OK 4m 192.168.0.15:4240
# The endpoint port 4240 is Cilium's health check port
# All nodes should show "OK" status
Verify eBPF Programs
Check that eBPF programs are loaded in the kernel:
# List Cilium-managed endpoints; each one has eBPF programs attached (run inside cilium agent pod)
# Note: These commands require cilium-dbg inside the agent pod, not cilium-cli
kubectl -n kube-system exec -it ds/cilium -c cilium-agent -- cilium-dbg endpoint list
# Example output:
IP ADDRESS IDENTITY LABELS
10.244.0.125 4 k8s:app=nginx
10.244.1.89 5 k8s:app=redis
# Check eBPF datapath counters (advanced)
kubectl -n kube-system exec -it ds/cilium -c cilium-agent -- cilium-dbg bpf metrics list
# Shows forwarded and dropped packet counts per reason
# Steadily climbing drop counters are worth investigating
Check DNS Resolution
Verify CoreDNS is now working (it was Pending before):
# CoreDNS should now be Running
kubectl get pods -n kube-system -l k8s-app=kube-dns
# Expected output:
NAME READY STATUS RESTARTS
coredns-565d847f9-abc123 1/1 Running 0
coredns-565d847f9-def456 1/1 Running 0
# Test DNS from a pod
kubectl run test-dns --image=busybox:1.36 --rm -it --restart=Never -- \
sh -c 'nslookup kubernetes.default || wget -qO- http://kubernetes.default'
# Expected output:
# Server: 10.96.0.10 # CoreDNS IP (from Service CIDR, default 10.96.0.0/12)
# Address: 10.96.0.10:53 # If you changed Service CIDR, this IP will differ
# Name: kubernetes.default.svc.cluster.local
# Address: 10.96.0.1 # Kubernetes API Service IP
Testing Load Balancer
Let's verify the load balancer works by deploying a test application:
Create a Test Application
# test-lb-service.yaml
apiVersion: v1
kind: Service
metadata:
name: test-lb
labels:
lb-pool: default # Request IP from our default pool (matches pool label selector)
annotations: # Override LB behavior per-service
service.cilium.io/lb-algorithm: maglev # or: random|round_robin
service.cilium.io/lb-mode: dsr # or: snat|hybrid
# io.cilium/lb-ipam-ips: "192.168.0.205" # Request specific static IP (optional)
# io.cilium/lb-ipam-ips: "192.168.0.205,192.168.0.206" # Multiple IPs supported (comma-separated)
spec:
type: LoadBalancer # This triggers Cilium to assign an external IP
externalTrafficPolicy: Cluster # Spreads traffic across all nodes, client IP NOT preserved
# To preserve client IP: Change to 'Local' (only nodes with backends handle traffic)
# DSR clarification:
# - DSR + Cluster policy: Usually doesn't preserve client IP (but may in some eBPF paths)
# - DSR + Local policy: Preserves client IP BUT drops traffic to nodes without backends
# If you MUST guarantee client IP preservation, use Local and ensure backends on every node
# that might receive traffic, or requests to "empty" nodes will drop (by design).
# If clients can't route directly to node IPs (separate VLANs, L3 hops), prefer lb-mode: snat
# DSR may silently "work sometimes" with routing issues. Use snat for reliability.
ports:
- port: 80 # External port
targetPort: 8080 # Container port
selector:
app: test # Which pods to send traffic to
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: test-deployment
spec:
replicas: 3 # Run 3 copies for load balancing test
selector:
matchLabels:
app: test
template:
metadata:
labels:
app: test # Label that service uses to find pods
spec:
containers:
- name: echo
image: ealen/echo-server:latest # Automatically shows hostname/pod info
ports:
- containerPort: 8080
env:
- name: PORT
value: "8080"
Deploy and Test
# Create the test application
kubectl apply -f test-lb-service.yaml
# Watch the service get an external IP
kubectl get svc test-lb -w
# Expected output (IP appears after ~10 seconds):
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S)
test-lb LoadBalancer 10.96.10.100 <pending> 80:31234/TCP
test-lb LoadBalancer 10.96.10.100 192.168.0.201 80:31234/TCP
# ^^^^^^^^^^^^ ^^^^^^^^^^^^^
# Service CIDR Announced LB IP
# Note: CLUSTER-IP from Service CIDR (default 10.96.0.0/12); changes if you customized serviceSubnets
# Press Ctrl+C to stop watching
Test From Your Network
# Test from your workstation (not inside Kubernetes)
curl http://192.168.0.201
# Expected output (echo-server shows pod info):
# {"host":{"hostname":"test-deployment-7b8f5c6d7-abc123", ...}}
# If connection fails, flush ARP cache:
# On your client machine:
# Linux
ip neigh flush to 192.168.0.201
# macOS
sudo arp -d 192.168.0.201 || true
sudo arp -d -a # flush all if still sticky
# Test multiple times to see load balancing
for i in {1..10}; do
curl -s http://192.168.0.201 | grep -o '"hostname":"[^"]*"'
done
# You should see different pod names, proving load balancing works:
# "hostname":"test-deployment-7b8f5c6d7-abc123"
# "hostname":"test-deployment-7b8f5c6d7-def456"
# "hostname":"test-deployment-7b8f5c6d7-ghi789"
# ...
# Verify client IP preservation (from a non-cluster client):
curl -s http://192.168.0.201 | jq -r '.headers."X-Forwarded-For", .headers."X-Real-Ip", .remote_addr'
# With DSR + Cluster: May show node IP or client IP depending on eBPF path
# With DSR + Local: Shows real client IP (if backend exists on receiving node)
# With SNAT mode: Always shows node IP
# Clean up the test
kubectl delete -f test-lb-service.yaml
Troubleshooting Load Balancer Issues
If the external IP stays <pending>:
# Check if Cilium assigned an IP
kubectl get svc test-lb -o jsonpath='{.status.loadBalancer}'
# Check Cilium's IP allocation
kubectl get ciliumloadbalancerippools default-pool -o yaml
# Look for allocation events
kubectl describe svc test-lb | grep -A5 Events
# Check Cilium agent logs for L2 announcements
kubectl -n kube-system logs -l app.kubernetes.io/name=cilium-agent | grep -i "l2\|arp"
# If your chart uses k8s-app=cilium:
kubectl -n kube-system logs -l k8s-app=cilium --tail=100
# Watch service events for debugging
kubectl get events -A --field-selector \
involvedObject.kind=Service,involvedObject.name=test-lb -w
# Make sure your router/DHCP server isn't leasing 192.168.0.201–210
# Check DHCP leases on your router's admin panel
Important: L2 announcements don't cross routers/VLANs. Ensure the LB IP is on the same L2 segment as the announcing nodes, or use BGP mode instead.
Network Policies
Network policies are firewall rules for your pods. Let's implement zero-trust networking where pods can't communicate unless explicitly allowed.
Understanding Network Policies
Think of network policies like security guards:
- Default: All pods can talk to all pods (apartment building with no locks)
- With policies: Only allowed connections work (secured building with access control)
Create a Default Deny Policy
This policy blocks all traffic by default - the foundation of zero-trust:
# default-deny-all.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-deny-all
# Note: namespace will be set when applying to specific namespaces
spec:
podSelector: {} # Empty selector = all pods in namespace
policyTypes:
- Ingress # Block incoming traffic
- Egress # Block outgoing traffic
# Result: Complete isolation
Allow DNS (Essential for Pods)
Pods need DNS to resolve service names. This policy allows only DNS traffic:
# allow-dns-egress.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-dns-egress
# Note: namespace will be set when applying to specific namespaces
spec:
podSelector: {} # Apply to all pods
policyTypes:
- Egress # Only controlling outgoing traffic
egress:
- to: # Allow traffic to...
- namespaceSelector: # Pods in namespace with this label
matchLabels:
kubernetes.io/metadata.name: kube-system
podSelector: # That also have this label
matchLabels:
k8s-app: kube-dns # CoreDNS pods (Talos default)
# Alternative label (check your cluster):
# app.kubernetes.io/name: coredns
ports:
- protocol: UDP # DNS uses UDP
port: 53
- protocol: TCP # Sometimes TCP for large responses
port: 53
Apply Security Policies
# Create the policy files
cat > default-deny-all.yaml <<'EOF'
[paste the yaml above]
EOF
cat > allow-dns-egress.yaml <<'EOF'
[paste the yaml above]
EOF
# Create namespaces with security policies
for ns in production staging development; do
echo "Setting up namespace: $ns"
# Create namespace
kubectl create namespace $ns
# Apply deny-all policy (zero-trust baseline)
sed "s/# Note: namespace will be set when applying to specific namespaces/namespace: $ns/" default-deny-all.yaml | kubectl apply -f -
# Allow DNS (pods need this to function)
sed "s/# Note: namespace will be set when applying to specific namespaces/namespace: $ns/" allow-dns-egress.yaml | kubectl apply -f -
done
# Verify policies are active
kubectl get networkpolicies -A
# Expected output:
NAMESPACE NAME POD-SELECTOR AGE
production default-deny-all <none> 10s
production allow-dns-egress <none> 5s
staging default-deny-all <none> 10s
staging allow-dns-egress <none> 5s
development default-deny-all <none> 10s
development allow-dns-egress <none> 5s
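The sed trick in the loop is easy to misread, so here is a small standalone sketch of what it does to the manifest. The sample_metadata variable is a stand-in for the metadata section of default-deny-all.yaml, and render_for_namespace is a helper name I made up for this demo; nothing here touches the cluster.

```bash
#!/usr/bin/env bash
# Stand-in for the metadata section of default-deny-all.yaml
sample_metadata='metadata:
  name: default-deny-all
  # Note: namespace will be set when applying to specific namespaces'

# Hypothetical helper: swap the placeholder comment for a real namespace field
render_for_namespace() {
  local ns="$1"
  printf '%s\n' "$sample_metadata" |
    sed "s/# Note: namespace will be set when applying to specific namespaces/namespace: $ns/"
}

render_for_namespace production
```

Because sed replaces only the comment text, the indentation in front of it is kept, so the new namespace field lands at the correct level under metadata. The substitution depends on the comment being present verbatim; if you edit that line in the YAML, the policy is applied to whatever namespace kubectl currently defaults to.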
Test Network Policies
Verify the policies are working:
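# Deploy a test pod in production namespace
kubectl run test-pod --image=busybox:1.36 --namespace=production -- sleep 3600
# Wait until the pod is Ready before exec'ing into it
kubectl -n production wait --for=condition=Ready pod/test-pod --timeout=60s
# Try to ping another pod (should fail due to policy)
# -W 2 caps the wait at two seconds instead of hanging on the blocked packet
kubectl exec -n production test-pod -- ping -c 1 -W 2 10.244.0.1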
# Expected output:
# PING 10.244.0.1: 56 data bytes
# --- 10.244.0.1 ping statistics ---
# 1 packets transmitted, 0 packets received, 100% packet loss
# But DNS should work
kubectl exec -n production test-pod -- nslookup kubernetes.default
# Expected output:
# Server: 10.96.0.10
# Address: 10.96.0.10:53
# Name: kubernetes.default.svc.cluster.local
# Clean up
kubectl delete pod test-pod -n production
Example: Allow Specific Communication
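Here's how to allow web pods to reach the database. This policy opens ingress on the database side only; if your default-deny baseline also blocks egress, the web pods need a matching egress rule before the connection succeeds: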
# allow-web-to-db.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-database
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: database # Apply to database pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: web # Allow from web pods
      ports:
        - protocol: TCP
          port: 5432 # PostgreSQL port
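If your default-deny-all policy covers Egress as well as Ingress, the rule above is only half of the path: the web pods also need permission to send traffic to the database. A minimal sketch of that counterpart follows. The file name and policy name are my own choices, and it assumes the same `app: web` and `app: database` labels as above.

```shell
# Write the egress half of the web -> database path (file name is illustrative).
cat > allow-web-egress-to-db.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-egress-to-database
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: web # Apply to web pods
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: database # Allow traffic to database pods
      ports:
        - protocol: TCP
          port: 5432 # PostgreSQL port
EOF
# Apply it when ready:
#   kubectl apply -f allow-web-egress-to-db.yaml
```

Because NetworkPolicies are additive, this sits alongside the DNS egress policy without replacing it.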
Hubble Observability
Hubble is Cilium's observability platform - think of it as X-ray vision for your network. You can see network flows between pods, the connections they open, and the policy verdict for each one.
Access Hubble UI
# Forward Hubble UI to your local machine
# & runs it in background so you can continue using terminal
kubectl port-forward -n kube-system svc/hubble-ui 8080:80 &
KPF_UI=$! # Capture PID for cleanup
# Open your browser to:
# http://localhost:8080
# Stop the port-forward when done:
# kill "$KPF_UI"
# You'll see a visual map of your pods and their connections!
Using Hubble CLI
For command-line visibility:
# Install Hubble CLI
export HUBBLE_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/hubble/master/stable.txt)
OS=$(uname -s | tr '[:upper:]' '[:lower:]') # linux or darwin
ARCH=$(uname -m); [ "$ARCH" = "x86_64" ] && ARCH="amd64" || ARCH="arm64"
curl -L --remote-name-all https://github.com/cilium/hubble/releases/download/$HUBBLE_VERSION/hubble-${OS}-${ARCH}.tar.gz{,.sha256sum}
# Verify checksum (works on Linux and macOS)
if command -v sha256sum >/dev/null 2>&1; then
sha256sum --check hubble-${OS}-${ARCH}.tar.gz.sha256sum
else
shasum -a 256 -c hubble-${OS}-${ARCH}.tar.gz.sha256sum
fi
sudo tar xzvf hubble-${OS}-${ARCH}.tar.gz -C /usr/local/bin
rm hubble-${OS}-${ARCH}.tar.gz{,.sha256sum}
# Port-forward Hubble Relay
kubectl port-forward -n kube-system svc/hubble-relay 4245:80 &
KPF1=$! # Capture PID for cleanup
# Check Hubble status
hubble status --server localhost:4245
# Watch live traffic
hubble observe --server localhost:4245 --namespace default
# Example output:
# TIMESTAMP SOURCE DESTINATION TYPE VERDICT
# Dec 28 09:15:23.123 web-pod-123 db-pod-456 L4/TCP FORWARDED
# Dec 28 09:15:23.124 db-pod-456 web-pod-123 L4/TCP FORWARDED
# Filter for dropped packets (great for debugging policies)
hubble observe --server localhost:4245 --verdict DROPPED
# Filter by labels (helpful for debugging specific apps)
hubble observe --server localhost:4245 --from-label app=web --to-label app=database
# See HTTP traffic
hubble observe --server localhost:4245 --protocol http
# Stop the port-forward when done
kill "$KPF1" # Or use fg then Ctrl+C
What to Look for in Hubble
- DROPPED verdicts: Network policies blocking traffic
- Connection patterns: Which pods talk to which
- Latency issues: Slow responses between services
- Policy violations: Attempted connections that were blocked
Migration Story: Flannel to Cilium
Let me share my painful 6-hour CNI migration story so you can learn from my mistakes:
The 6-Hour Outage Timeline
Hour 1: "This should be quick" (Famous last words)
# What I did (DON'T DO THIS!)
kubectl delete -n kube-system daemonset kube-flannel-ds
helm install cilium cilium/cilium
# Immediate result:
kubectl get nodes
# All nodes NotReady - cluster broken!
Hour 2: Authentication Failures
# Cilium pods crashlooping
kubectl -n kube-system logs cilium-xxxxx
# "Failed to initialize Kubernetes client: Unauthorized"
# Problem: Service account tokens were stale
# Fix: Force kubelet to regenerate tokens
for node in $(kubectl get nodes -o name); do
talosctl -n ${node##*/} service kubelet restart
done
Hour 3: IP Range Chaos
# Pods had wrong IPs
kubectl get pods -o wide
# IPs showing 10.245.x.x instead of 10.244.x.x
# Problem: Old Flannel config still cached
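# Fix: Complete node reboot to clear IP allocations
# (talosctl -n takes explicit addresses, not ranges, so loop over the nodes)
for i in $(seq 11 17); do
  talosctl -n "192.168.0.$i" reboot
done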
Hour 4: The DNS Disaster
# No pod could resolve DNS
kubectl exec test-pod -- nslookup kubernetes
# ;; connection timed out; no servers could be reached
# Problem: forwardKubeDNSToHost was still true!
# This was THE critical issue - took 2 hours to find
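# Fix: Applied the patch from Part 2 to every node
# (talosctl -n takes explicit addresses, not ranges, so loop over the nodes)
for i in $(seq 11 17); do
  talosctl -n "192.168.0.$i" patch machineconfig \
    --patch '{"machine":{"features":{"hostDNS":{"forwardKubeDNSToHost":false}}}}'
done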
Hour 5: Storage System Offline
# All persistent volumes broken
kubectl get pv
# All showing "Released" or "Failed"
# CSI pods couldn't reach API server
# Had to delete and recreate entire CSI deployment
kubectl -n longhorn-system delete pods --all
Hour 6: The Cleanup
# 35 pods still had Flannel IPs cached
# Systematic restart of everything
for ns in $(kubectl get ns -o name); do
kubectl -n ${ns##*/} rollout restart deployment
kubectl -n ${ns##*/} rollout restart statefulset
kubectl -n ${ns##*/} rollout restart daemonset
done
# Finally working!
Root Causes Analysis
After post-mortem analysis, here's what actually went wrong:
- CNI State Persistence: Flannel left behind:
  - IP allocations in /var/lib/cni/
  - Network namespaces in /var/run/netns/
  - iptables rules that conflicted with Cilium
- The DNS Configuration: forwardKubeDNSToHost: true causes:
  - Host to query CoreDNS for cluster services
  - CoreDNS returns cluster IPs
  - Cilium eBPF doesn't masquerade these properly
  - Result: Complete DNS failure
- Service Account Tokens: When CNI changes:
  - Existing tokens become invalid
  - Pods can't authenticate to API server
  - Kubelet restart forces new tokens
Lessons Learned
- CNI migration requires careful planning
- Fresh installation is much safer than migration
- If you must migrate, plan for downtime and have rollback procedures
- Backup etcd before changes
- Document original configuration
- Test recovery procedure
- IP ranges must match with Talos configuration:
  - Talos podSubnets: 10.244.0.0/16
  - Cilium IPAM mode 'kubernetes' uses Kubernetes node allocations
  - Kubernetes allocates subnets from the Talos-defined pod CIDR
Clean state between CNIs:
# If you MUST migrate, clean everything:
rm -rf /var/lib/cni/
rm -rf /var/run/netns/
iptables -F
iptables -t nat -F
iptables -t mangle -F
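Talos Note: On Talos (immutable OS), don't try to modify /var/lib/cni or iptables directly; reboot with talosctl reboot to clear stale CNI state.
Have a rollback plan: back up etcd, keep a copy of the original CNI configuration, and rehearse the recovery procedure before touching production.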
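The backup half of a rollback plan is easy to script. Below is a minimal sketch that prints the backup commands by default and only executes them when you set DRY_RUN=0. The helper names `run` and `pre_migration_backup`, the output file names, and the idea of passing one control-plane node IP are all my own assumptions, not part of any official procedure.

```shell
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1 = IP of a control-plane node, $2 = backup directory (both placeholders)
pre_migration_backup() {
  cp_node="$1"
  outdir="$2"
  run mkdir -p "$outdir"
  # Snapshot etcd through the Talos API
  run talosctl -n "$cp_node" etcd snapshot "$outdir/etcd.snapshot"
  # Record the current kube-system DaemonSets and ConfigMaps (includes the old CNI)
  run sh -c "kubectl -n kube-system get daemonsets,configmaps -o yaml > '$outdir/kube-system-before.yaml'"
}
```

Run `pre_migration_backup 192.168.0.11 ./cni-backup` first to review the commands, then repeat with `DRY_RUN=0` in front once they look right for your cluster.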
Critical configuration for Cilium:
# In Talos config (MUST have):
machine:
  features:
    hostDNS:
      forwardKubeDNSToHost: false # Critical!
Performance Tuning
After installation, optimize Cilium for your specific needs:
Monitor eBPF Performance
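# Check BPF map pressure (inside agent pod)
# Assumption: your Cilium version exposes the bpf_map_pressure metric, which
# reports each map's fill level as a ratio; `cilium-dbg bpf metrics list`
# shows datapath forward/drop counters instead.
kubectl -n kube-system exec -it ds/cilium -c cilium-agent -- cilium-dbg metrics list | grep -i map_pressure
# Figures from my cluster, summarized as percentages: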
# BPF map pressure:
# cilium_ct4_global: 15% (7864/52428) # Connection tracking
# cilium_lb4_services: 2% (10/512) # Load balancer services
# cilium_ipcache: 8% (412/5000) # IP cache
# If any map shows >80%, increase its size (see below)
Tune Connection Tracking
For high-traffic environments, increase map sizes:
# Set map sizes through Helm values so they survive upgrades.
# Assumption: bpf.ctTcpMax and bpf.natMax exist in your chart version;
# confirm with: helm show values cilium/cilium | grep -i -E 'ctTcpMax|natMax'
#
# Increase connection tracking tables
# CT = Connection Tracking, NAT = Network Address Translation
helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --reuse-values \
  --set bpf.ctTcpMax=524288 \
  --set bpf.natMax=524288
# What these do:
# bpf.ctTcpMax: Max tracked TCP connections (524K = good for production)
# bpf.natMax: Max NAT translations
# Restart Cilium to apply
kubectl -n kube-system rollout restart daemonset/cilium
Enable Bandwidth Manager
For QoS (Quality of Service) and traffic shaping:
(Requires kernel with TC eBPF & sch_ingress/cls_bpf modules; Talos already ships modern kernels with these)
# Add to cilium-values.yaml:
bandwidthManager:
  enabled: true
  # bbr: false # Include only if 'bbr' key exists in your chart version; enable BBR via sysctl below
# Or upgrade with:
helm upgrade cilium cilium/cilium \
--namespace kube-system \
--reuse-values \
--set bandwidthManager.enabled=true
# Optional: Enable BBR congestion control on each node (persist via your OS/Talos mechanism)
# sysctl -w net.core.default_qdisc=fq
# sysctl -w net.ipv4.tcp_congestion_control=bbr
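# Now you can set bandwidth limits on pods using Kubernetes standard annotations.
# Note: Cilium's Bandwidth Manager enforces the egress annotation; the ingress
# annotation has historically not been supported, so check the docs for your version.
kubectl annotate pod my-pod \
kubernetes.io/ingress-bandwidth=10M \
kubernetes.io/egress-bandwidth=10M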
Performance Monitoring Commands
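# Check Cilium agent health, including per-node connectivity (inside agent pod;
# --all-health belongs to the agent's status command, not the workstation cilium CLI)
kubectl -n kube-system exec -it ds/cilium -c cilium-agent -- cilium-dbg status --all-health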
# Monitor eBPF program execution time (inside agent pod)
kubectl -n kube-system exec -it ds/cilium -c cilium-agent -- \
cilium-dbg bpf metrics | grep exec_time
# Check packet drops (agent CLI - run inside pod)
kubectl -n kube-system exec -it ds/cilium -c cilium-agent -- cilium-dbg monitor --type drop
# Or from your workstation using Hubble (preferred):
hubble observe --server localhost:4245 --verdict DROPPED
# Measure endpoint creation time
time kubectl run test --image=nginx --rm -it --restart=Never -- echo test
Troubleshooting Guide
Here are common issues and their solutions:
Pods Stuck in ContainerCreating
This usually means Cilium can't assign an IP:
# 1. Check Cilium agent logs for errors
# Note: Use cilium-dbg for debugging, cilium-cli for management
kubectl -n kube-system logs -l app.kubernetes.io/name=cilium-agent --tail=50 || \
kubectl -n kube-system logs -l k8s-app=cilium --tail=50
# For more detailed debugging, use cilium-dbg from within a Cilium pod:
kubectl -n kube-system exec -it ds/cilium -c cilium-agent -- cilium-dbg status
# Look for errors like:
# "Unable to allocate IP" - IP pool exhausted
# "Failed to create endpoint" - eBPF program issue
# 2. Verify IP allocation for the node
kubectl get ciliumnodes <node-name> -o yaml | grep -A5 ipam
# Should show:
# ipam:
# pools:
# allocated: 5
# available: 251
# 3. Check for IP conflicts (inside agent pod)
kubectl -n kube-system exec -it ds/cilium -c cilium-agent -- cilium-dbg ip list
# 4. Describe the stuck pod for more details
kubectl describe pod <stuck-pod-name>
# Events section will show CNI errors
# 5. Check host firewall rules (if using ufw/firewalld/nftables)
# Some distros block ARP or NodePort traffic by default
# For quick triage, temporarily disable host firewall and retest:
# systemctl stop firewalld # or: ufw disable
Load Balancer IP Not Responding
When services get IPs but don't respond:
# 1. Verify L2 announcement policy exists
kubectl get ciliuml2announcementpolicies
# If missing, reapply from earlier in this guide
# 2. Check if Cilium is announcing the IP
kubectl -n kube-system logs -l app.kubernetes.io/name=cilium-agent | grep -i "l2.*192.168.0.201" || \
kubectl -n kube-system logs -l k8s-app=cilium | grep -i "l2.*192.168.0.201"
# Should see: "Announcing L2 IP 192.168.0.201"
# 3. Verify ARP on your router/workstation
arp -a | grep 192.168.0.201
# Should show MAC address
# If not, interface name might be wrong
# 4. Check the correct interface is being used
kubectl get ciliuml2announcementpolicies -o yaml | grep interface
# Must match actual interface from: ip link show
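Interface mismatches are easy to miss by eye, so here is a minimal sketch that tests a policy's interface pattern against a node's real interface names. It assumes your announcement policy lists interfaces as regular expressions (for example `^eth[0-9]+`) and that the input looks like `ip -o link show` output; the function name `matching_interfaces` is hypothetical.

```shell
# $1 = interface regex taken from the L2 announcement policy
# stdin = output of `ip -o link show` (one interface per line)
# Prints the interface names that match; exits non-zero when none do.
matching_interfaces() {
  awk -F': ' '{ print $2 }' | sed 's/@.*//' | grep -E -- "$1"
}
```

For example, `ip -o link show | matching_interfaces '^eth[0-9]+'` prints nothing on a node whose NIC is called `enp0s1`, which would explain a LoadBalancer IP that never answers ARP. Talos nodes have no shell, so there you would list interfaces with `talosctl -n <node-ip> get links` and compare the names by hand.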
Network Policies Not Working
Policies exist but traffic still flows:
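# 1. Check enforcement mode (inside agent pod)
# The agent CLI prints its runtime options with `config`; `config view` belongs to the workstation cilium CLI
kubectl -n kube-system exec -it ds/cilium -c cilium-agent -- cilium-dbg config | grep -i policy
# Should show PolicyEnforcement set to default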
# 2. Verify policies are loaded (inside agent pod)
kubectl -n kube-system exec -it ds/cilium -c cilium-agent -- cilium-dbg policy get
# 3. Check identity allocation (inside agent pod)
kubectl -n kube-system exec -it ds/cilium -c cilium-agent -- cilium-dbg identity list
# Pods need identities for policies to work
# If missing, restart Cilium
# 4. Monitor dropped packets (agent CLI - run inside pod)
kubectl -n kube-system exec -it ds/cilium -c cilium-agent -- cilium-dbg monitor --type drop
# Or from workstation using Hubble (preferred):
hubble observe --server localhost:4245 --verdict DROPPED
# Shows real-time policy violations
# If nothing shown, policy might not be applying
# 5. Test specific pod-to-pod connectivity manually
kubectl exec -n <namespace> <src-pod> -- curl -sS http://<dst-pod-ip>:<port>
# 6. Filter Hubble traffic by labels (helpful for debugging specific apps)
hubble observe --server localhost:4245 --from-label app=web --to-label app=database
DNS Resolution Failures
Pods can't resolve service names:
# 1. Check CoreDNS is running
kubectl get pods -n kube-system -l k8s-app=kube-dns
# If no pods found, try the alternative label:
kubectl get pods -n kube-system -l app.kubernetes.io/name=coredns
# Then check the CoreDNS logs for errors:
kubectl -n kube-system logs -l k8s-app=kube-dns --tail=100 || \
kubectl -n kube-system logs -l app.kubernetes.io/name=coredns --tail=100
# 2. Test DNS from a pod
kubectl run test-dns --image=busybox --rm -it --restart=Never -- nslookup kubernetes
# 3. Check if it's a policy issue
kubectl get networkpolicy -n <namespace>
# Ensure DNS egress is allowed (we did this earlier)
# 4. Verify Cilium's DNS proxy settings (from your workstation, using the cilium CLI)
cilium config view | grep -i dns
High Memory/CPU Usage
Cilium using too many resources:
# 1. Check current usage (requires metrics-server; if not installed, skip this)
# To install metrics-server if missing:
# kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl top pods -n kube-system | grep cilium
# 2. Review eBPF map sizes (inside agent pod)
kubectl -n kube-system exec -it ds/cilium -c cilium-agent -- cilium-dbg bpf metrics
# Large maps = more memory
# 3. Reduce map sizes if needed
# Method A: Helm value (assumes bpf.ctTcpMax exists in your chart version)
helm upgrade cilium cilium/cilium --namespace kube-system --reuse-values \
  --set bpf.ctTcpMax=262144 # Reduce from 524288
# Method B: ConfigMap (more stable across upgrades)
kubectl -n kube-system edit configmap cilium-config
# Add or modify:
# bpf-ct-global-tcp-max: "524288"
# bpf-ct-global-any-max: "262144"
# bpf-nat-global-max: "524288"
# Then restart: kubectl -n kube-system rollout restart daemonset/cilium
# 4. Disable unused features
# Edit cilium-values.yaml and reinstall
Security Best Practices
Let's harden your Cilium installation for production:
Enable WireGuard Encryption
WireGuard encryption protects node-to-node pod traffic:
Understanding the Impact
- Performance cost: ~15% throughput reduction
- CPU usage: Increases by 10-20%
- Security benefit: Node-to-node pod traffic encrypted automatically
- No app changes: Completely transparent to applications
- Note: With nodeEncryption: true, this protects traffic between nodes (not same-node pod traffic)
Enable Encryption
# Upgrade Cilium with encryption enabled
helm upgrade cilium cilium/cilium \
--namespace kube-system \
--reuse-values \
--set encryption.enabled=true \
--set encryption.type=wireguard \
--set encryption.nodeEncryption=true
# Monitor the rollout
kubectl -n kube-system rollout status daemonset/cilium
# Verify encryption is active (run inside agent pod)
kubectl -n kube-system exec -it ds/cilium -c cilium-agent -- cilium-dbg encrypt status
# Expected output:
# Encryption: Wireguard
# Keys in use: 1
# Max keys: 2
# Nodes with encryption: 7/7
Test Encryption
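# Capture traffic on a node to verify encryption (via Talos)
# talosctl has no tcpdump subcommand; its packet capture command is pcap.
# Assumption: Cilium's WireGuard tunnel uses UDP port 51871 (the upstream default).
talosctl -n 192.168.0.14 pcap -i eth0 --bpf-filter "udp port 51871" --duration 10s
# You should see UDP WireGuard packets between node IPs, not plain-text pod traffic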
Implement Advanced Cilium Network Policies
Cilium policies are more powerful than standard Kubernetes policies:
Layer 7 (Application) Policy Example
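This policy inspects HTTP traffic content. Note: Cilium enforces L7 rules through Envoy. Depending on your Cilium version and Helm values, Envoy runs either embedded in the agent or as the cilium-envoy DaemonSet that the chart deploys for you; in both cases you don't install Envoy yourself.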
# cilium-layer7-policy.yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-access-control
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api-server # Apply to API pods
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: web-frontend # Allow from frontend pods
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
          rules:
            http:
              - method: "GET" # Only allow GET requests
                path: "/api/public/.*" # Only public API paths
              - method: "POST"
                path: "/api/auth/login" # Allow login
                headerMatches: # Match specific headers
                  - name: "Content-Type"
                    value: "application/json" # Require JSON
Apply and test:
# Apply the policy
kubectl apply -f cilium-layer7-policy.yaml
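# Test allowed request (should work; web-pod must carry the app: web-frontend label)
kubectl exec -n production web-pod -- \
curl -X GET http://api-server/api/public/health
# Test blocked request (the L7 proxy answers with HTTP 403 instead of dropping the connection)
kubectl exec -n production web-pod -- \
curl -s -o /dev/null -w '%{http_code}\n' -X DELETE http://api-server/api/users/1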
# Monitor blocked requests
hubble observe --server localhost:4245 --verdict DROPPED --namespace production
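To reason about which requests the two HTTP rules let through, it helps to model them locally. The sketch below is a simplified stand-in, not Cilium's implementation: it assumes method and path are matched as full-string extended regular expressions, it ignores the Content-Type header match, and the function name `http_rule_allows` is made up for illustration.

```shell
# Simplified model of the HTTP rules in the policy above (headers ignored).
# $1 = HTTP method, $2 = request path. Returns 0 if some rule allows the request.
http_rule_allows() {
  method="$1"
  path="$2"
  while IFS='|' read -r m p; do
    # grep -x requires the whole string to match the pattern
    if printf '%s\n' "$method" | grep -Eqx -- "$m" &&
       printf '%s\n' "$path" | grep -Eqx -- "$p"; then
      return 0
    fi
  done <<'RULES'
GET|/api/public/.*
POST|/api/auth/login
RULES
  return 1
}
```

With this model, `http_rule_allows GET /api/public/health` succeeds while `http_rule_allows DELETE /api/users/1` fails, which mirrors the two curl tests above. A request is allowed as soon as any one rule matches, so adding rules only ever widens access.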
DNS Security Policy
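Restrict which external domains pods can reach. The toFQDNs rules only take effect when Cilium's DNS proxy can observe the lookups, so the DNS rule needs a rules.dns section, and the kube-dns selector must name the kube-system namespace because CoreDNS runs outside production:
# dns-restriction-policy.yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: restrict-external-dns
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      restricted: "true" # Pods with this label
  egress:
    - toEndpoints:
        - matchLabels:
            "k8s:io.kubernetes.pod.namespace": kube-system
            k8s-app: kube-dns # Always allow cluster DNS
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY # UDP and TCP
          rules:
            dns:
              - matchPattern: "*" # Let the DNS proxy see lookups
    - toFQDNs:
        - matchPattern: "*.homelab.example" # Allow internal
        - matchName: "github.com" # Allow specific external
        - matchPattern: "*.docker.io" # Allow Docker Hub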
Performance Benchmarks
Here are the real improvements I measured after switching to Cilium:
Test Methodology
# Latency test (use busybox for ping since iperf3 lacks it)
kubectl run ping-server --image=busybox:1.36 --image-pull-policy=IfNotPresent -- sleep 3600
kubectl run ping-client --image=busybox:1.36 --image-pull-policy=IfNotPresent -- sleep 3600
kubectl exec ping-client -- ping -c 1000 <server-pod-ip>
# Throughput test (use iperf3 for bandwidth)
kubectl run perf-server --image=networkstatic/iperf3 --image-pull-policy=IfNotPresent -- -s
# Assumption: the image's entrypoint is iperf3, so --command is needed to run sleep instead
kubectl run perf-client --image=networkstatic/iperf3 --image-pull-policy=IfNotPresent --command -- sleep 3600
kubectl exec perf-client -- iperf3 -c <server-pod-ip> -t 60
# Load balancing test
kubectl run -i --rm perf-test --image=rakyll/hey --restart=Never -- \
-n 100000 -c 50 http://<service-ip>
Results Comparison
| Metric | Flannel + MetalLB | Cilium Only | Improvement |
|---|---|---|---|
| Pod-to-Pod Latency (same node) | 0.28ms | 0.19ms | 32% lower |
| Pod-to-Pod Latency (cross-node) | 0.52ms | 0.41ms | 21% lower |
| Throughput (10Gb network) | 9.41 Gbps | 9.87 Gbps | 5% higher |
| Service Load Balancing | 45,000 rps | 52,000 rps | 15% higher |
| P99 Latency @ 10k rps | 12ms | 8ms | 33% lower |
| CPU per node (idle) | ~400m | ~250m | 37% less |
| Memory per node | ~320MB | ~380MB | 60MB more |
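The Improvement column is plain percentage change, and you can reproduce it for your own measurements with a one-liner. A minimal sketch, where `pct_change` is my own helper name; negative results mean the Cilium figure is lower than the baseline.

```shell
# Percentage change from a baseline to a new measurement.
# $1 = before (Flannel + MetalLB), $2 = after (Cilium)
pct_change() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

pct_change 0.28 0.19    # same-node latency in ms
pct_change 45000 52000  # service load balancing in rps
pct_change 400 250      # idle CPU in millicores
```

Those three calls print -32.1, 15.6, and -37.5, which line up with the 32% lower, 15% higher, and 37% less entries in the table.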
Why Cilium Performs Better
- eBPF in kernel space: No context switching between kernel and userspace
- No iptables overhead: Direct packet processing without rule traversal
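- Efficient load balancing: Service lookups happen in eBPF hash maps, and Maglev consistent hashing is available as an opt-in (Helm value loadBalancer.algorithm=maglev)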
- Native integration: Single component instead of CNI + LB combo
Resource Usage Over Time
# After 30 days running production workloads:
# Flannel + MetalLB: 68 iptables rules per node, growing
# Cilium: 0 iptables rules, all eBPF
# Connection tracking entries:
# Flannel: ~15,000 conntrack entries
# Cilium: Handled in eBPF maps, no conntrack pressure
What's Next
Your cluster now has enterprise-grade networking! Nodes are Ready, pods can communicate, services get external IPs, and network policies provide security. In Part 4, we'll add persistent storage with Rook-Ceph, enabling your applications to store data that survives pod restarts and node failures.
Key Takeaways
Through implementing Cilium and surviving a painful migration, here's what I learned:
- eBPF is a game-changer: 32% latency reduction and 37% less CPU usage is significant
- Native L2 load balancing works perfectly: No need for MetalLB complexity
- forwardKubeDNSToHost: false is CRITICAL: This single setting caused 6 hours of downtime
- CNI migration is risky: Prefer fresh installations; if you must migrate, follow the official guide
- Hubble is invaluable: Visual network debugging saves hours
- Network policies from day one: Much easier than retrofitting security
- Interface names matter: Wrong interface = no load balancing
Troubleshooting Checklist
Save this for when things go wrong:
- [ ] Nodes NotReady? → Check Cilium pods are running
- [ ] Pods stuck Creating? → Check IP allocation with kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium-dbg ip list
  For migrations on regular Linux: sudo rm -rf /var/lib/cni/* /var/run/netns/*
  On Talos (immutable): Reboot nodes with talosctl reboot to clear CNI state
- [ ] LoadBalancer pending? → Verify L2 announcement policy and interface
- [ ] DNS not working? → Ensure DNS egress policy exists
- [ ] Policies not blocking? → Check with hubble observe --server localhost:4245 --verdict DROPPED
- [ ] High CPU usage? → Review eBPF map sizes
- [ ] Can't see traffic? → Use Hubble UI or CLI
For complex issues, collect a support bundle:
# Collect a support bundle (logs, events, config, BPF maps)
# --output-filename takes the name without an extension; the CLI adds .zip itself
cilium sysdump --output-filename cilium-sysdump-$(date +%Y%m%d-%H%M%S)
Command Quick Reference
# Health checks
cilium status --wait
cilium connectivity test
kubectl get nodes
# Quick verbose status (surfaces datapath issues fast)
kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium-dbg status --verbose
# Troubleshooting
# If exec ds/cilium fails, use: POD=$(kubectl -n kube-system get pods -l app.kubernetes.io/name=cilium-agent -o jsonpath='{.items[0].metadata.name}')
kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium-dbg monitor --type drop # Watch drops
hubble observe --server localhost:4245 --verdict DROPPED # See policy violations
kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium-dbg bpf metrics # Check eBPF performance
# Load balancer
kubectl get ciliumloadbalancerippools
kubectl get ciliuml2announcementpolicies
# Network policies
kubectl get networkpolicies -A
kubectl get ciliumnetworkpolicies -A
# Rollback/Uninstall (if needed)
helm -n kube-system history cilium # Check revision history
helm -n kube-system rollback cilium <REVISION> # Rollback to specific revision
# helm -n kube-system uninstall cilium # Complete uninstall
# Note: After uninstall, nodes return to NotReady (no CNI) until reinstalling
References
Official Documentation
- Cilium Documentation v1.18 - Complete Cilium guide
- eBPF Introduction - Understanding eBPF technology
- Cilium Network Policies - Advanced policy examples
- Hubble Observability - Network visibility guide
Specific Issues and Solutions
- L2 Announcements Guide - Load balancer setup
- Talos + Cilium Integration - Official integration guide
- Cilium Troubleshooting - Common problems and fixes
Performance and Tuning
- eBPF Map Sizing - Tuning for scale
- Maglev Load Balancing - Google's algorithm paper
Continue to Part 4: Distributed Storage with Rook-Ceph →