The "ACME challenge failed" error occurs when cert-manager cannot prove domain ownership to Let's Encrypt during TLS certificate provisioning. It typically blocks certificate issuance and renewal, so HTTPS breaks once no valid certificate is available. Common causes include DNS propagation delays, an unreachable HTTP challenge endpoint, and Let's Encrypt rate limits.
This error indicates that your Kubernetes cluster, using cert-manager with Let's Encrypt, failed to complete the ACME challenge process. ACME (Automated Certificate Management Environment) requires domain ownership validation before a TLS certificate is issued. The failure prevents certificate creation or renewal: new hostnames cannot serve trusted HTTPS traffic, and existing certificates expire without being replaced. The error can occur at different stages: DNS validation (DNS-01), HTTP validation (HTTP-01), or cert-manager's own self-check before it asks Let's Encrypt to validate.
First, inspect the challenge and order objects to understand why validation failed:
kubectl describe challenge
kubectl describe order

Look for error messages indicating DNS or HTTP validation failures. The output will show the specific challenge type (dns01 or http01) and any error messages from the validation attempt.
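As a rough sketch, the inspection step above can be wrapped in two small shell functions. The function names are made up for illustration; the fields they print (`.spec.type`, `.spec.dnsName`, `.status.state`, `.status.reason`) are ones the cert-manager Challenge resource exposes, and the commands assume `kubectl` is already pointed at the affected cluster.

```shell
# Hypothetical helper: list every ACME challenge in the cluster with its
# type, domain, state, and failure reason in a single table.
list_challenges() {
  kubectl get challenges --all-namespaces \
    -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,TYPE:.spec.type,DOMAIN:.spec.dnsName,STATE:.status.state,REASON:.status.reason'
}

# Hypothetical helper: show the whole issuance chain in one namespace,
# from Certificate down to Challenge, to see where the process stalled.
show_issuance_chain() {
  namespace="$1"
  kubectl get certificate,certificaterequest,order,challenge -n "$namespace"
}
```

Calling `list_challenges` first and then `kubectl describe challenge <name> -n <namespace>` on the failing entry usually narrows the problem down faster than describing everything at once.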
For HTTP-01 challenges, verify that Let's Encrypt servers can reach your challenge URL:
curl -I http://yourdomain.com/.well-known/acme-challenge/test-token

For DNS-01 challenges, verify TXT records are visible externally:
dig _acme-challenge.yourdomain.com TXT @8.8.8.8

If using HTTP-01, check your ingress LoadBalancer configuration:
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
spec:
  externalTrafficPolicy: Cluster

Also verify that no client certificate authentication blocks the ACME validation endpoints.
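The external reachability checks for both challenge types can be sketched as a few shell helpers. The function names are hypothetical, and the default resolver `8.8.8.8` is only an example of a public resolver; the URL path and the `_acme-challenge` record name are the ones the ACME protocol defines. Run these from a machine outside the cluster so the result reflects what Let's Encrypt sees.

```shell
# Build the URL that the ACME server requests for an HTTP-01 challenge.
acme_http01_url() {
  host="$1"
  token="$2"
  printf 'http://%s/.well-known/acme-challenge/%s\n' "$host" "$token"
}

# Build the TXT record name used for a DNS-01 challenge.
# A wildcard such as *.example.com is validated at _acme-challenge.example.com.
acme_dns01_name() {
  domain="${1#\*.}"
  printf '_acme-challenge.%s\n' "$domain"
}

# Fetch the challenge URL, following redirects, and print only the HTTP status.
check_http01() {
  curl -sS -L -o /dev/null -w '%{http_code}\n' "$(acme_http01_url "$1" "$2")"
}

# Query a public resolver (second argument, default 8.8.8.8) for the TXT record.
check_dns01() {
  dig +short TXT "$(acme_dns01_name "$1")" "@${2:-8.8.8.8}"
}
```

For example, `check_http01 yourdomain.com <token>` with the token taken from the challenge's spec, or `check_dns01 '*.yourdomain.com' 1.1.1.1` to compare against a second resolver.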
For DNS-01 challenges, configure the issuer with a solver for your DNS provider. This example uses Cloudflare:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    # Secret that stores the ACME account key (required; the name is an example)
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api
              key: token

Verify that the DNS provider credentials are correct and that API access is enabled.
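One way to check the credentials is to read the token back from the Secret and ask Cloudflare whether it is valid. This is only a sketch: the function names are hypothetical, and it assumes the Secret lives in the `cert-manager` namespace, which is where cert-manager looks up Secrets for a ClusterIssuer by default. The endpoint is Cloudflare's token verification API.

```shell
# Hypothetical helper: read the API token referenced by the issuer
# (Secret "cloudflare-api", key "token"). The namespace is an assumption
# and can be overridden with the first argument.
read_cloudflare_token() {
  kubectl get secret cloudflare-api -n "${1:-cert-manager}" \
    -o jsonpath='{.data.token}' | base64 --decode
}

# Hypothetical helper: ask Cloudflare whether the given token is valid.
verify_cloudflare_token() {
  curl -sS -H "Authorization: Bearer $1" \
    https://api.cloudflare.com/client/v4/user/tokens/verify
}
```

A call such as `verify_cloudflare_token "$(read_cloudflare_token)"` returns a JSON document; a response reporting an inactive or invalid token points to the credentials rather than to DNS propagation.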
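If you see HTTP 429 errors, you have hit Let's Encrypt rate limits. Switch to the staging issuer while you debug:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    # Secret that stores the staging ACME account key (required; the name is an example)
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    # solvers: use the same solver configuration as the production issuer

Wait at least 1 hour before retrying with the production issuer; some limits take longer to reset.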
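Moving an existing Certificate between issuers can be done with a merge patch, sketched below. The function names are hypothetical, and the sketch assumes both issuers are ClusterIssuers named as in this article. Certificates from the staging environment are not trusted by browsers, so this is for debugging only.

```shell
# Build a JSON merge patch that repoints a Certificate at another ClusterIssuer.
issuer_patch() {
  printf '{"spec":{"issuerRef":{"name":"%s","kind":"ClusterIssuer"}}}\n' "$1"
}

# Hypothetical helper: apply the patch to one Certificate.
# Arguments: certificate name, namespace, issuer name.
switch_issuer() {
  cert="$1"
  namespace="$2"
  issuer="$3"
  kubectl patch certificate "$cert" -n "$namespace" \
    --type merge -p "$(issuer_patch "$issuer")"
}
```

For example, `switch_issuer example-com default letsencrypt-staging` while debugging, then the same call with `letsencrypt-prod` once the challenge succeeds against staging.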
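For TLS handshake failures, enable temporary certificate generation with the cert-manager.io/issue-temporary-certificate annotation:

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-com
  annotations:
    cert-manager.io/issue-temporary-certificate: "true"
spec:
  # Secret that receives the certificate (required; the name is an example)
  secretName: example-com-tls
  dnsNames:
    - example.com
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer

This allows the ingress to start with a temporary self-signed certificate while the ACME challenge completes.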
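To tell whether the real certificate has replaced the temporary one, you can wait for the Certificate to become Ready and then look at the issuer of the certificate stored in its Secret. The helpers below are a sketch with hypothetical names; the Secret name is whatever the Certificate's `secretName` is set to, and `openssl` must be installed locally.

```shell
# Hypothetical helper: block until the Certificate reports Ready or the timeout expires.
# Arguments: certificate name, namespace, optional timeout (default 300s).
wait_for_certificate() {
  kubectl wait --for=condition=Ready "certificate/$1" -n "$2" --timeout="${3:-300s}"
}

# Hypothetical helper: print the issuer and expiry date of the certificate
# stored in a TLS Secret. Arguments: secret name, namespace.
show_secret_issuer() {
  kubectl get secret "$1" -n "$2" -o jsonpath='{.data.tls\.crt}' \
    | base64 --decode | openssl x509 -noout -issuer -enddate
}
```

If `show_secret_issuer <secret-name> <namespace>` still shows a self-signed issuer after `wait_for_certificate` times out, the challenge has not completed and the Challenge resource is the next place to look.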
For multiple resources on the same hostname, use:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    acme.cert-manager.io/http01-edit-in-place: "true"

This allows cert-manager to modify your existing ingress for the challenge rather than creating a new one.
Platform-specific considerations: managed clusters such as EKS, AKS, and GKE often override cluster DNS servers, which can make DNS-01 self-checks fail; in that case, point cert-manager at explicit recursive nameservers with the controller's --dns01-recursive-nameservers and --dns01-recursive-nameservers-only flags. Cloudflare and Route53 require valid API credentials, and their API rate limits should be monitored on the DNS provider side. For production deployments, test with the Let's Encrypt staging issuer first to avoid hitting rate limits and prolonging an outage.
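cert-manager exposes the resolvers used for the DNS-01 self-check as controller flags. The helper below is a small sketch that prints the two flags for a given resolver list; the function name is hypothetical and the resolver addresses are examples. How the flags reach the controller depends on the installation, for instance the container args of the cert-manager Deployment or the Helm chart's extra arguments.

```shell
# Print the controller flags that force DNS-01 self-checks through the
# given recursive nameservers.
# Argument: comma-separated host:port list, e.g. 8.8.8.8:53,1.1.1.1:53
dns01_nameserver_flags() {
  servers="$1"
  printf '%s\n' \
    "--dns01-recursive-nameservers-only" \
    "--dns01-recursive-nameservers=$servers"
}
```

For example, `dns01_nameserver_flags 8.8.8.8:53,1.1.1.1:53` prints one flag per line, ready to paste into the controller's argument list.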