The "Flannel No Subnet Available" error occurs when Flannel cannot allocate a pod network subnet to a node. This prevents pods from being assigned IP addresses, blocking pod scheduling. The error is usually caused by CIDR pool exhaustion or configuration issues.
Flannel manages pod IP address allocation by assigning a subnet to each node from a larger CIDR pool. When a new node joins:
1. Flannel tries to allocate a /24 subnet from the pod CIDR
2. Flannel stores the allocation in etcd
3. If all subnets are exhausted, allocation fails
4. Pods cannot be assigned IPs on that node
Example:
- Pod CIDR: 10.244.0.0/16 (65,536 IPs, 256 /24 subnets)
- Node 1 gets: 10.244.0.0/24
- Node 2 gets: 10.244.1.0/24
- Node 256 gets: 10.244.255.0/24
- Node 257: ERROR - no subnet available
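To see how the subnet count falls out of the CIDR sizes, here is a quick arithmetic sketch in shell (nothing cluster-specific is assumed; the 16 and 24 match the defaults above):
# Number of per-node subnets = 2^(SubnetLen - pod CIDR prefix)
POD_PREFIX=16   # e.g. 10.244.0.0/16
SUBNET_LEN=24   # Flannel's default per-node subnet size
echo "Max nodes: $((2 ** (SUBNET_LEN - POD_PREFIX)))"   # prints 256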
View which subnets are allocated:
# Check pod CIDR in cluster config:
kubectl cluster-info dump | grep -m 1 -i "cluster-cidr"
# View subnet allocations for each node:
kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}'
# Or more detailed:
kubectl get nodes -o json | jq '.items[] | {name: .metadata.name, podCIDR: .spec.podCIDR}'
# Check Flannel subnet data in etcd:
kubectl exec -it <etcd-pod> -n kube-system -- etcdctl get --prefix /coreos.com/network/subnets
Note: You need etcd access (usually in kube-system namespace).
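When the etcd backend is in use, the output looks roughly like this (illustrative only; the exact subnets and backend fields depend on your cluster):
/coreos.com/network/subnets/10.244.0.0-24
{"PublicIP":"192.168.1.10","BackendType":"vxlan", ...}
/coreos.com/network/subnets/10.244.1.0-24
{"PublicIP":"192.168.1.11","BackendType":"vxlan", ...}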
Check the configured pod CIDR:
# View Flannel config:
kubectl get cm kube-flannel-cfg -n kube-flannel -o yaml
# Extract Network:
kubectl get cm kube-flannel-cfg -n kube-flannel -o jsonpath='{.data.net-conf\.json}' | jq .Network
# Expected output: "10.244.0.0/16" or similar
# Also check SubnetLen (subnet size, default /24):
kubectl get cm kube-flannel-cfg -n kube-flannel -o jsonpath='{.data.net-conf\.json}' | jq .SubnetLen
Default is /24, which gives 256 subnets for a /16 network.
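For reference, a default net-conf.json from the upstream Flannel manifest looks like this (the vxlan backend is the usual default, but your manifest may differ; note SubnetLen is often omitted, in which case jq prints null and /24 applies):
{
  "Network": "10.244.0.0/16",
  "Backend": {
    "Type": "vxlan"
  }
}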
Determine if you've run out of subnets:
# Count nodes:
KUBECTL_NODES=$(kubectl get nodes --no-headers | wc -l)
echo "Total nodes: $KUBECTL_NODES"
# Get pod CIDR:
POD_CIDR=$(kubectl cluster-info dump | grep -o 'cluster-cidr=[^" ]*' | head -1 | cut -d= -f2)
echo "Pod CIDR: $POD_CIDR"
# For /16 with /24 subnets: 256 subnets max
# For /16 with /25 subnets: 512 subnets max
# If KUBECTL_NODES > 256, you need a larger CIDR or smaller subnets
if [ "$KUBECTL_NODES" -gt 256 ]; then
  echo "ERROR: Cluster has more than 256 nodes but is using a /16 CIDR with /24 subnets"
fi
Calculate available subnets: 2^(24-16) = 256 with default settings.
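A slightly more general check computes the limit from whatever prefix and SubnetLen you found above (a sketch; it assumes POD_CIDR and KUBECTL_NODES were populated by the earlier commands):
PREFIX=${POD_CIDR#*/}   # e.g. 16 from 10.244.0.0/16
SUBNET_LEN=24           # substitute your SubnetLen if customized
MAX_NODES=$((2 ** (SUBNET_LEN - PREFIX)))
echo "Subnet pool: $KUBECTL_NODES of $MAX_NODES subnets in use"
if [ "$KUBECTL_NODES" -ge "$MAX_NODES" ]; then
  echo "ERROR: subnet pool exhausted"
fi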
Find orphaned subnet allocations:
# Get all allocated subnets from etcd:
kubectl exec -it <etcd-pod> -n kube-system -- etcdctl get --prefix /coreos.com/network/subnets | less
# Get existing nodes:
kubectl get nodes -o jsonpath='{.items[*].metadata.name}'
# Compare: if etcd has subnets for nodes that don't exist, they're stale
# Manually delete stale allocations (dangerous—do with caution):
kubectl exec <etcd-pod> -n kube-system -- etcdctl del /coreos.com/network/subnets/<stale-key>
Stale allocations prevent new nodes from getting subnets.
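One way to surface stale keys mechanically (a sketch; it assumes etcd keys use Flannel's 10.244.X.0-24 naming, which mirrors each node's podCIDR with the slash replaced by a dash):
# Subnets currently assigned to live nodes, converted to etcd key format
kubectl get nodes -o jsonpath='{range .items[*]}{.spec.podCIDR}{"\n"}{end}' | tr '/' '-' | sort > /tmp/node-subnets
# Subnets recorded in etcd
kubectl exec <etcd-pod> -n kube-system -- etcdctl get --prefix /coreos.com/network/subnets --keys-only | sed 's|.*/||' | sort | grep -v '^$' > /tmp/etcd-subnets
# Keys present in etcd but backing no node: candidates for deletion
comm -13 /tmp/node-subnets /tmp/etcd-subnets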
If the pool is exhausted, you need a larger pod network, which in practice means rebuilding or migrating the cluster:
Option 1: Use larger CIDR on new cluster
# When creating cluster, use /14 instead of /16:
kubeadm init --pod-network-cidr=10.244.0.0/14  # 4x larger (1024 subnets with /24)
Option 2: Migrate to new cluster
1. Create new cluster with larger CIDR
2. Drain old cluster workloads node by node with kubectl drain <node-name> (see the sketch after this list)
3. Migrate data/state
4. Point traffic to new cluster
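A minimal drain loop for step 2 might look like this (a sketch; kubectl drain operates on individual nodes, so each node is drained in turn):
# Drain every node in the old cluster before migrating workloads
for node in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
done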
Option 3: Change subnet size (not recommended)
# Use smaller subnets (/25 = 512 subnets per /16):
kubectl patch cm kube-flannel-cfg -n kube-flannel -p '{"data":{"net-conf.json":"...SubnetLen: 25..."}}'
Smaller subnets = more nodes supported, but fewer IPs per node.
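Spelled out, the patch has to carry the entire net-conf.json as an escaped string, something like this (a sketch; the Network value and vxlan backend are taken from the default manifest and may differ in your cluster, and existing nodes keep their old /24 leases until re-registered):
kubectl patch cm kube-flannel-cfg -n kube-flannel --type merge -p '{"data":{"net-conf.json":"{\"Network\": \"10.244.0.0/16\", \"SubnetLen\": 25, \"Backend\": {\"Type\": \"vxlan\"}}"}}'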
Remove nodes that no longer exist:
# Find nodes in NotReady state:
kubectl get nodes
# Drain pod before deletion:
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
# Delete the node:
kubectl delete node <node-name>
# Verify:
kubectl get nodes
Deleting nodes also frees their subnet allocations.
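To sweep all NotReady nodes in one pass, something like the following works (a sketch; review the node list first, since NotReady can be transient):
# Drain and delete every node currently reporting NotReady
for node in $(kubectl get nodes --no-headers | awk '$2 == "NotReady" {print $1}'); do
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data --force
  kubectl delete node "$node"
done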
Force Flannel to recalculate subnet assignments:
# Restart the Flannel DaemonSet:
kubectl rollout restart daemonset/kube-flannel-ds -n kube-flannel
# Watch for pod restarts:
kubectl get pods -n kube-flannel -w
# Check logs:
kubectl logs -n kube-flannel -l app=flannel -f
# Monitor node pod CIDR assignment:
watch kubectl get nodes -o wide
After restart, verify all nodes have podCIDR assignments.
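A quick way to verify is to list any node still missing a podCIDR (uses jq, as in the earlier commands; empty output means every node has an assignment):
# Print the names of nodes that have no podCIDR assigned
kubectl get nodes -o json | jq -r '.items[] | select(.spec.podCIDR == null) | .metadata.name'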
Set up monitoring to prevent future exhaustion:
# Create Prometheus alert:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: flannel-alerts
spec:
  groups:
  - name: flannel
    rules:
    - alert: FlannelSubnetExhaustion
      expr: |
        count(kube_node_labels{label_kubernetes_io_os="linux"}) > 240
      annotations:
        summary: "Approaching Flannel subnet limit (240+ nodes)"
Planning:
- If using /16 CIDR, maximum ~256 nodes
- For larger clusters, use /14 (1024 nodes) or /12 (4096 nodes)
- Monitor node count: kubectl get nodes | wc -l (see the sketch below)
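As a lightweight alternative to Prometheus, the same check fits in a couple of shell lines (a sketch assuming the default 256-subnet pool):
NODES=$(kubectl get nodes --no-headers | wc -l)
echo "Subnet pool utilization: $NODES/256 ($((NODES * 100 / 256))%)"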
Subnet exhaustion is usually discovered during rapid scaling. The default /16 CIDR with /24 subnets supports exactly 256 nodes, which can be tight for large deployments. For production, plan the CIDR carefully: /14 is common (1024 nodes), and /12 suits very large clusters (4096 nodes). Stale allocations in etcd are rare but can happen if nodes are force-deleted without proper cleanup. Multi-subnet topologies are not Flannel's strength; for large deployments, consider Calico or another more capable CNI. Flannel's etcd backend stores all subnet state in etcd, and etcd performance can degrade as cluster state grows, so monitor etcd health with etcdctl endpoint status. For migration paths: an in-place change of the pod CIDR is rarely practical, so plan a migration to a new cluster with a /14 (or larger) CIDR.
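The endpoint status check referenced above can be run from the etcd pod (as with the earlier etcdctl commands, <etcd-pod> is a placeholder for your actual pod name):
kubectl exec <etcd-pod> -n kube-system -- etcdctl endpoint status --write-out=table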