DynamoDB InternalServerError (HTTP 500) indicates AWS infrastructure is temporarily unable to process your request. Most are transient and automatically retried by SDK clients, but persistent errors require monitoring and investigation.
An InternalServerError in DynamoDB is an HTTP 500 response indicating that the service could not process your API call. The cause is on the AWS side: transient network issues, infrastructure problems, or temporary storage node failures. Unlike client errors (HTTP 4xx), you cannot fix it by changing your request. DynamoDB recovers from these conditions without intervention on your part. A small background rate of such errors is expected in any distributed system; this guide treats a rate below roughly 0.01% of requests as normal, in line with the 99.99% availability target in the DynamoDB SLA. The AWS SDKs automatically retry failed requests with exponential backoff, so most transient errors are invisible to your application.
Check if the error occurs consistently or intermittently:
// AWS SDK for JavaScript v2; run this inside an async function
const AWS = require("aws-sdk");
const dynamodb = new AWS.DynamoDB({ region: "us-east-1" });

// The SDK retries transient errors automatically.
// If the request succeeds after a retry, the error was transient.
const result = await dynamodb.getItem({
  TableName: "MyTable",
  Key: { id: { S: "test" } }
}).promise();
If your request succeeds on the next attempt, or subsequent calls work normally, the error was transient and no action is needed. The AWS SDK clients handle retries automatically with exponential backoff.
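If you catch and log errors yourself, it can help to separate server-side failures from client errors before deciding whether anything needs attention. The following is a minimal sketch, assuming the error shapes of the AWS SDK for JavaScript (v2 errors expose `code` and `statusCode`, v3 errors expose `name` and `$metadata.httpStatusCode`). The helper name `isTransientServerError` is hypothetical and not part of the SDK.

```javascript
// Hypothetical helper (not part of the AWS SDK): returns true when an
// SDK error looks like a server-side (5xx) failure.
function isTransientServerError(error) {
  if (!error) return false;
  // v2 errors carry `code`; v3 errors carry `name`.
  const errorName = error.code || error.name;
  // v2 errors carry `statusCode`; v3 errors carry `$metadata.httpStatusCode`.
  const status =
    error.statusCode || (error.$metadata && error.$metadata.httpStatusCode);
  return (
    errorName === "InternalServerError" ||
    (typeof status === "number" && status >= 500)
  );
}
```

In a `catch` block, a `true` result suggests logging the error and relying on retries, while a `false` result points to a problem with the request itself.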
Set up CloudWatch monitoring to distinguish transient errors from persistent issues:
1. Open CloudWatch dashboard for your DynamoDB table
2. Add the SystemErrors metric to track 5xx errors
3. Calculate error percentage:
- Divide SystemErrors count by SuccessfulRequestLatency sample count
- Multiply by 100 to get percentage
4. Create an alarm for sustained errors:
- Less than 0.01% is normal per AWS SLA
- If error rate exceeds 1% over 15 minutes, escalate to AWS Support
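The percentage calculation and the two thresholds from the steps above can be sketched as a pair of small functions. This is an illustration only: the function names, the intermediate "investigate" label, and the handling of a window with zero successful requests are assumptions, not AWS guidance.

```javascript
// Thresholds from the steps above, expressed in percent.
const NORMAL_THRESHOLD_PERCENT = 0.01;
const ESCALATION_THRESHOLD_PERCENT = 1;

// systemErrors: Sum of the SystemErrors metric for the window.
// successfulRequests: SampleCount of SuccessfulRequestLatency for the same window.
function errorRatePercent(systemErrors, successfulRequests) {
  if (successfulRequests === 0) {
    // Assumption: with no successful requests, treat any error as a 100% rate.
    return systemErrors > 0 ? 100 : 0;
  }
  return (systemErrors / successfulRequests) * 100;
}

// Maps a percentage to an action label (labels are illustrative).
function classifyErrorRate(percent) {
  if (percent < NORMAL_THRESHOLD_PERCENT) return "normal";
  if (percent > ESCALATION_THRESHOLD_PERCENT) return "escalate";
  return "investigate";
}
```

For example, 5 system errors against 100,000 successful requests falls in the "normal" band, while 2,000 errors against the same volume would be classified as "escalate" if sustained over the 15-minute window.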
# View the SystemErrors metric (AWS CLI)
# SystemErrors is reported per operation; if this returns no datapoints,
# add an Operation dimension (for example Name=Operation,Value=GetItem)
aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name SystemErrors \
  --dimensions Name=TableName,Value=MyTable \
  --start-time 2025-12-22T00:00:00Z \
  --end-time 2025-12-22T01:00:00Z \
  --period 300 \
  --statistics Sum
Determine whether the error is due to AWS infrastructure issues:
1. Go to [AWS Health Dashboard](https://phd.aws.amazon.com/)
2. Check for DynamoDB service alerts in your region
3. Look for:
- Ongoing maintenance notifications
- Service degradation alerts
- Infrastructure issues affecting your table
4. If issues exist, your requests will automatically succeed once service recovers
5. For account-specific issues, check the "Your account health" view of the AWS Health Dashboard. Programmatic access through the AWS Health API requires a Business support plan or higher.
If no service issues are reported, the error is likely isolated to your request pattern.
Use eventually consistent reads to reduce exposure to transient errors:
// Eventually consistent reads are the default for Query, Scan, GetItem, and BatchGetItem
const result = await dynamodb.query({
  TableName: "MyTable",
  KeyConditionExpression: "pk = :pk",
  ExpressionAttributeValues: { ":pk": { S: "user#123" } },
  ConsistentRead: false // Default: can be served by any replica
}).promise();

// Setting ConsistentRead explicitly on GetItem documents the intent
const item = await dynamodb.getItem({
  TableName: "MyTable",
  Key: { id: { S: "test" } },
  ConsistentRead: false // Avoids the stricter availability requirements of strong reads
}).promise();
Eventually consistent reads can be served from any available storage node and are less susceptible to transient infrastructure issues. A strongly consistent read may be unavailable during a network delay or outage, in which case DynamoDB can return an HTTP 500. Use strongly consistent reads only when your application must see the most recent successful write.
Tune the retry configuration if transient errors keep surfacing in your application:
// AWS SDK v3 retries automatically, but you can customize it:
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";

const client = new DynamoDBClient({
  region: "us-east-1",
  retryMode: "adaptive", // "standard" is the default mode
  maxAttempts: 5 // Total attempts, including the first (default is 3)
});

// For the v2 SDK (Node.js):
const AWS = require("aws-sdk");
const dynamodb = new AWS.DynamoDB({
  // Retries after the first attempt. The v2 DynamoDB client defaults to 10,
  // so confirm this value does not lower your current setting.
  maxRetries: 5,
  retryDelayOptions: { base: 50 } // Base delay of 50 ms for exponential backoff
});
The AWS SDK handles retries automatically. Only increase maxAttempts or adjust the backoff if your metrics show that the default retry counts are insufficient for your workload.
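To make the exponential backoff behavior concrete, here is a minimal sketch of a manual retry loop with "full jitter". It is not how the SDK implements its retries, and the function names, the 5000 ms cap, and the injectable `sleep` and `random` parameters are assumptions made for illustration. Wrapping SDK calls in a loop like this multiplies the total number of attempts, because the SDK's own retries still run inside each call.

```javascript
// Delay before retry number `attempt` (0-based), using full jitter:
// a random value between 0 and min(capMs, baseMs * 2^attempt).
function backoffDelayMs(attempt, baseMs = 50, capMs = 5000, random = Math.random) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Calls `operation` until it succeeds, a different error occurs,
// or `maxAttempts` total attempts have been made.
async function retryOnInternalServerError(operation, options = {}) {
  const {
    maxAttempts = 5,
    // Injectable so tests can skip real waiting.
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      const errorName = error.code || error.name; // v2 uses code, v3 uses name
      const isLastAttempt = attempt + 1 >= maxAttempts;
      if (errorName !== "InternalServerError" || isLastAttempt) throw error;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

A call might look like `await retryOnInternalServerError(() => dynamodb.getItem(params).promise())`, where `params` stands for your own request parameters. The jitter spreads retries from many clients over time so they do not all hit a recovering node at once.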
For write operations that return a 500, determine whether the write succeeded:
// After a 500 on a write, the operation may or may not have been applied.
// A TransactWriteItems call guarded by a condition is safe to retry.
try {
  await dynamodb.transactWriteItems({
    TransactItems: [
      {
        Put: {
          TableName: "MyTable",
          Item: {
            id: { S: "test123" },
            data: { S: "value" },
            timestamp: { N: Date.now().toString() }
          },
          // If an earlier attempt was applied, a retry is rejected
          // instead of writing a duplicate
          ConditionExpression: "attribute_not_exists(id)"
        }
      }
    ]
  }).promise();
} catch (error) {
  if (error.code === "InternalServerError") {
    // Safe to retry TransactWriteItems
    // Retry logic here
  }
}
// For simple PutItem/UpdateItem operations:
// Use a condition on a version number (or a similar marker) so a retry cannot apply the write twice
TransactWriteItems is generally safe to retry on 500 errors: the transaction is all-or-nothing, and the condition expression above rejects a second attempt if the first one was applied. Sending the same ClientRequestToken on each attempt also makes the call idempotent. For other write operations, use a version number or similar marker to detect whether the write already succeeded before retrying.
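The version-number approach for plain UpdateItem calls can be sketched as a function that builds conditional update parameters. This assumes each item stores a numeric `version` attribute alongside `id` and `data`; those attribute names and the helper name `buildVersionedUpdate` are hypothetical.

```javascript
// Builds UpdateItem parameters that apply only while the stored version
// still equals `expectedVersion`, and bump the version by one.
// Attribute names (id, data, version) are illustrative assumptions.
function buildVersionedUpdate(tableName, id, expectedVersion, newData) {
  return {
    TableName: tableName,
    Key: { id: { S: id } },
    UpdateExpression: "SET #data = :data, #version = :next",
    ConditionExpression: "#version = :expected",
    // Placeholders avoid clashes with DynamoDB reserved words.
    ExpressionAttributeNames: { "#data": "data", "#version": "version" },
    ExpressionAttributeValues: {
      ":data": { S: newData },
      ":expected": { N: String(expectedVersion) },
      ":next": { N: String(expectedVersion + 1) },
    },
  };
}
```

The result can be passed to `dynamodb.updateItem(params).promise()`. If a retry after a 500 fails with ConditionalCheckFailedException, the stored version has already moved on, either because the first attempt was applied or because another writer updated the item, so re-read the item before deciding what to do next.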
If errors persist beyond typical transient patterns:
1. Gather diagnostics:
- Collect DynamoDB request IDs from your logs (format: 4KBNVRGD25RG1KEO9UT4V3FQDJVV4KQNSO5AEMVJF66Q9ASUAAJG)
- Extract timestamps of failures
- Calculate failure rate percentage from CloudWatch
- Document reproducible patterns
2. Escalate if:
- Error rate exceeds 1% over 15 minutes
- Errors persist after 30 minutes
- Multiple tables are affected
- Write operations fail consistently
3. Contact AWS Support with:
- Request IDs from application logs
- CloudWatch metrics showing error rates
- Table configuration details
- Timeline of when errors started
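Collecting request IDs and timestamps is easier if every failure is logged in a consistent shape. The sketch below assumes the error fields exposed by the AWS SDK for JavaScript (v2: `code`, `statusCode`, `requestId`, `time`; v3: `name` and `$metadata`). The function name `toDiagnosticRecord` and the record layout are hypothetical.

```javascript
// Hypothetical helper: extracts the fields AWS Support typically asks for
// from an SDK error, covering both the v2 and v3 error shapes.
function toDiagnosticRecord(error, now = new Date()) {
  const metadata = error.$metadata || {};
  return {
    errorName: error.code || error.name,
    httpStatus: error.statusCode || metadata.httpStatusCode,
    requestId: error.requestId || metadata.requestId,
    // v2 errors carry a `time` Date; otherwise fall back to the current time.
    timestamp: (error.time instanceof Date ? error.time : now).toISOString(),
  };
}
```

Logging `JSON.stringify(toDiagnosticRecord(error))` in each `catch` block gives you a searchable list of request IDs and failure times to attach to a support case.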
Note: Accounts on the Basic support plan cannot open technical support cases; this requires a paid support plan (Developer or higher). Cases are opened from the AWS Support Center in the AWS Management Console.
DynamoDB SLA and Error Rates: The DynamoDB SLA commits to a monthly uptime percentage of 99.999% for global tables and 99.99% for standard tables, which corresponds to an error budget of roughly 0.001% and 0.01% of requests respectively. An error rate below that level is expected behavior and does not indicate a problem with your table.
Difference between 502 and 500: A 502 Bad Gateway may indicate that your request succeeded but delivery of the response failed, so the data may already be persisted. A 500 indicates that the service could not complete the request, although for writes the outcome can still be uncertain. Both should be retried.
TransactWriteItems Safety: When using transactions, a 500 response means that either the entire transaction succeeded or the entire transaction failed; partial writes never occur. This makes retrying transactional operations safe.
Regional Considerations: Some regions may experience higher transient error rates during peak traffic. Check CloudWatch metrics by region to identify patterns. Eventually consistent reads help distribute load across replicas, reducing per-node strain.