AKS kubelet identity errors occur when the cluster cannot authenticate or access resources due to missing permissions or misconfigured managed identity assignments. Learn how to diagnose identity issues and restore proper RBAC configuration.
The AKS kubelet identity is the managed identity that the kubelet (the Kubernetes node agent) uses to authenticate and access Azure resources. When this identity is misconfigured, improperly assigned, or lacks required permissions, kubelet operations fail, preventing the cluster from pulling images from Azure Container Registry (ACR), accessing Key Vault secrets, or performing other necessary Azure operations.
A kubelet identity error occurs when there's a mismatch between the identity assigned to the node and the permissions granted to that identity. This can happen during cluster creation, upgrades, or when manually modifying identity assignments. Common scenarios include changing the kubelet identity from one managed identity to another, creating an AKS cluster with a user-assigned kubelet identity without proper permissions, or losing permissions after cluster updates.
Check which managed identity is assigned to your AKS cluster:
az aks show --resource-group <resource-group-name> --name <cluster-name> --query "identityProfile.kubeletidentity"
This returns the clientId, objectId, and resourceId of the currently assigned kubelet identity. Note these values for the next steps.
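Later commands need a single field rather than the whole identity object. The following is a minimal sketch, assuming the profile exposes the clientId, objectId, and resourceId fields; the function name kubelet_identity_field is hypothetical and not part of the Azure CLI:

```shell
# Hypothetical helper: print one field of the cluster's kubelet identity.
# Usage: kubelet_identity_field <resource-group> <cluster-name> <field>
# <field> is assumed to be one of: clientId, objectId, resourceId.
kubelet_identity_field() {
  az aks show \
    --resource-group "$1" \
    --name "$2" \
    --query "identityProfile.kubeletidentity.$3" \
    --output tsv
}

# Example (requires a logged-in Azure CLI):
# KUBELET_OBJECT_ID="$(kubelet_identity_field <resource-group-name> <cluster-name> objectId)"
```

Using --output tsv strips the JSON quoting, so the value can be passed directly to --assignee in the commands that follow.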
Verify that the kubelet identity has the AcrPull role on your Azure Container Registry:
az role assignment list --assignee <kubelet-identity-id> --scope <acr-resource-id>
Look for an assignment with the role "AcrPull" or "Contributor". If none exists, proceed to the next step.
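The same check can be scripted so that it returns a simple yes or no. This is a sketch under a few assumptions: the helper name has_acr_pull is hypothetical, it looks only for the AcrPull role, and it adds --include-inherited so that assignments made at the resource group or subscription level are counted as well, since listing by --scope alone shows only assignments made at exactly that scope:

```shell
# Hypothetical helper: exit 0 if the assignee holds AcrPull on the scope
# (directly or inherited from a parent scope), non-zero otherwise.
# Usage: has_acr_pull <kubelet-identity-id> <acr-resource-id>
has_acr_pull() {
  local count
  count="$(az role assignment list \
    --assignee "$1" \
    --scope "$2" \
    --include-inherited \
    --query "length([?roleDefinitionName=='AcrPull'])" \
    --output tsv)"
  [ "${count:-0}" -gt 0 ]
}

# Example (requires a logged-in Azure CLI):
# ACR_ID="$(az acr show --name <acr-name> --query id --output tsv)"
# has_acr_pull <kubelet-identity-id> "$ACR_ID" && echo "AcrPull present"
```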
If the kubelet identity lacks ACR permissions, assign the AcrPull role:
az role assignment create --assignee <kubelet-identity-id> --role AcrPull --scope <acr-resource-id>
Replace the placeholders:
- <kubelet-identity-id>: The client ID or object (principal) ID returned by the az aks show command above
- <acr-resource-id>: The full resource ID of your ACR (e.g., /subscriptions/.../registries/myregistry)
Wait 60-120 seconds for the role assignment to propagate through Azure RBAC.
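Rather than sleeping for a fixed interval, you can retry the operation that was failing until it succeeds. The sketch below is a generic retry loop; the name retry_until and its argument order are assumptions made for illustration, and the command you pass in is whatever check failed before the role was granted:

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry_until <max-attempts> <delay-seconds> <command> [args...]
retry_until() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "gave up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example: try up to 8 times, 15 seconds apart (roughly 2 minutes in total).
# retry_until 8 15 <command-that-failed-before>
```

Eight attempts at 15-second intervals cover the 60-120 second window mentioned above with a small margin.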
After granting permissions, restart the nodes to refresh the managed identity token cache:
az aks nodepool upgrade --resource-group <resource-group-name> --cluster-name <cluster-name> --name <nodepool-name> --kubernetes-version <version>
Alternatively, manually restart nodes:
kubectl cordon -l agentpool=<nodepool-name>
kubectl drain -l agentpool=<nodepool-name> --ignore-daemonsets --delete-emptydir-data
# After nodes restart, uncordon them:
kubectl uncordon -l agentpool=<nodepool-name>
Or use Azure CLI to reimage nodes:
az aks nodepool upgrade --resource-group <resource-group-name> --cluster-name <cluster-name> --name <nodepool-name> --node-image-only
If you need to assign a custom user-assigned managed identity as the kubelet identity, ensure the user or service principal performing the assignment has the "Microsoft.ManagedIdentity/userAssignedIdentities/assign/action" permission on that identity.
Create a custom role with this action:
cat > custom-role.json << EOF
{
  "Name": "Kubelet Identity Assigner",
  "Description": "Allow assigning user identities as kubelet identity",
  "Actions": [
    "Microsoft.ManagedIdentity/userAssignedIdentities/assign/action"
  ],
  "AssignableScopes": [
    "/subscriptions/<subscription-id>"
  ]
}
EOF
az role definition create --role-definition @custom-role.json
Then assign this custom role to your user/service principal at the identity resource level (not just resource group level).
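Assigning the role at the identity resource level means passing the identity's full resource ID as the scope. A minimal sketch, assuming the custom role name "Kubelet Identity Assigner" from the definition above; the helper name grant_identity_assigner is hypothetical:

```shell
# Hypothetical helper: grant the custom role on one user-assigned identity.
# Usage: grant_identity_assigner <assignee-id> <identity-resource-id>
grant_identity_assigner() {
  az role assignment create \
    --assignee "$1" \
    --role "Kubelet Identity Assigner" \
    --scope "$2"
}

# Example scope (the resource ID of the user-assigned identity itself):
# /subscriptions/<subscription-id>/resourceGroups/<resource-group-name>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<identity-name>
```

If you would rather not maintain a custom role, the built-in Managed Identity Operator role also includes the assign/action permission and can be used in its place.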
After upgrading your AKS cluster, if you used --attach-acr during cluster creation, re-run the attach command to ensure the new kubelet identity gets ACR permissions:
az aks update --resource-group <resource-group-name> --name <cluster-name> --attach-acr <acr-name>
This automatically grants the AcrPull role to the kubelet identity for that ACR.
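To confirm that the cluster can actually reach and authenticate to the registry, the Azure CLI offers az aks check-acr. The wrapper below is a sketch; the name check_acr_access is hypothetical, it assumes the registry uses the default <acr-name>.azurecr.io login server, and the command needs kubectl plus credentials for the cluster:

```shell
# Hypothetical helper: validate that the cluster's nodes can reach and
# authenticate to the registry.
# Usage: check_acr_access <resource-group> <cluster-name> <acr-name>
check_acr_access() {
  az aks check-acr \
    --resource-group "$1" \
    --name "$2" \
    --acr "$3.azurecr.io"
}

# Example (requires a logged-in Azure CLI and cluster access):
# check_acr_access <resource-group-name> <cluster-name> <acr-name>
```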
Create a test pod to verify the kubelet can now pull images from ACR:
kubectl run test-pull --image=<acr-name>.azurecr.io/test-image:latest --image-pull-policy=Always
kubectl describe pod test-pull
kubectl logs test-pull
If the pod runs successfully, the kubelet identity permissions are restored. Delete the test pod:
kubectl delete pod test-pull
Managed Identity vs Service Principal: Older AKS clusters use service principals; newer clusters use managed identities by default. Managed identities are more secure because Azure handles credential rotation automatically.
Kubelet Identity Latency: After assigning new RBAC roles to a managed identity, there can be 60-120 second propagation delays due to Azure RBAC caching. If steps still fail after granting permissions, wait and retry.
Bringing Your Own Kubelet Identity: You can pre-create a user-assigned managed identity and assign it as the kubelet identity during cluster creation:
az aks create --name <cluster-name> --resource-group <resource-group> --enable-managed-identity --assign-identity <identity-resource-id> --assign-kubelet-identity <kubelet-identity-resource-id>
This requires the control plane identity and the calling principal to have permission to assign the kubelet identity before cluster creation.
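The identities can be prepared ahead of time. The sketch below creates a control plane identity and a kubelet identity, then grants the control plane identity the built-in Managed Identity Operator role on the kubelet identity. The function name prepare_byo_identities and the idea of keeping both identities in one resource group are assumptions made for illustration:

```shell
# Hypothetical helper: create both identities and allow the control plane
# identity to assign the kubelet identity. Prints the two resource IDs.
# Usage: prepare_byo_identities <resource-group> <control-plane-name> <kubelet-name>
prepare_byo_identities() {
  local rg="$1" cp_name="$2" kubelet_name="$3"
  local cp_id cp_principal kubelet_id

  cp_id="$(az identity create --resource-group "$rg" --name "$cp_name" \
    --query id --output tsv)" || return 1
  cp_principal="$(az identity show --resource-group "$rg" --name "$cp_name" \
    --query principalId --output tsv)" || return 1
  kubelet_id="$(az identity create --resource-group "$rg" --name "$kubelet_name" \
    --query id --output tsv)" || return 1

  # Managed Identity Operator includes the assign/action permission.
  az role assignment create \
    --assignee-object-id "$cp_principal" \
    --assignee-principal-type ServicePrincipal \
    --role "Managed Identity Operator" \
    --scope "$kubelet_id" >/dev/null || return 1

  echo "control plane identity: $cp_id"
  echo "kubelet identity: $kubelet_id"
}
```

The two printed resource IDs are the values to pass as the control plane identity and the kubelet identity when creating the cluster. Passing the principal type explicitly avoids a directory lookup that can fail for an identity created only seconds earlier.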
Key Vault Integration: If your cluster needs to sync secrets from Azure Key Vault, the kubelet identity must have "Get" and "List" permissions on the vault. Grant these separately if needed:
az keyvault set-policy --name <vault-name> --object-id <kubelet-identity-principal-id> --secret-permissions get list
CI/CD Considerations: If running AKS deployments through CI/CD (GitHub Actions, Azure Pipelines), ensure the kubelet identity has permissions before triggering deployments. Use role assignments scoped to the specific ACR to follow least-privilege principles.