Batch operations: maximize throughput¶
The practice¶
Use batch operations (BatchGetItem, BatchWriteItem) instead of individual operations when you are working with multiple items.
Batch operations let you read up to 100 items or write up to 25 items in a single request, dramatically improving throughput and cutting per-request latency overhead.
Why it matters¶
Throughput¶
- Process up to 100 reads or 25 writes per request instead of 1
- Cut the total number of API calls by up to 100x for reads and 25x for writes
- Maximize utilization of provisioned capacity
Latency¶
- Eliminate per-request network overhead
- Single round-trip instead of multiple
- Particularly beneficial for high-latency connections
Cost efficiency¶
- Same RCU/WCU consumption as individual operations, but with far less per-request overhead
- Reduced Lambda execution time in serverless applications
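To make the capacity point concrete, here is the standard DynamoDB read-unit arithmetic for the scenario used later on this page (100 profiles of ~1KB each, read with strong consistency; the numbers are illustrative):
// 100 user profiles, ~1KB each, read with strong consistency:
// a strongly consistent read of up to 4KB costs 1 RCU.
const itemCount = 100
const rcuPerItem = Math.ceil(1 / 4)      // 1KB item -> 1 RCU
const totalRcu = itemCount * rcuPerItem  // 100 RCU
// 100 individual GetItem calls: 100 requests, 100 RCU
// 1 BatchGetItem call:          1 request,   100 RCU (same capacity, far fewer round-trips)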
Visual comparison¶
| Aspect | Batch operations | Individual operations |
|---|---|---|
| Items per request | Up to 100 (get) / 25 (write) | 1 |
| Network round-trips | 1 per batch | 1 per item |
| Latency overhead | Minimal | One round-trip per item |
| Throughput | High | Low |
| Code complexity | Simple (library handles chunking) | Complex (manual loops) |
Automatic Chunking
The @ddb-lib/client library automatically splits requests that exceed the per-request limits into multiple batches and retries unprocessed items, so you don't have to track batch size limits yourself.
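For context, this is roughly what the library automates. A minimal sketch of manual chunking and unprocessed-key retries using the plain AWS SDK v3 document client (not the library's internals; the table name and key shape are placeholders, and a production version would add backoff between retries):
import { DynamoDBClient } from '@aws-sdk/client-dynamodb'
import { BatchGetCommand, DynamoDBDocumentClient } from '@aws-sdk/lib-dynamodb'

const doc = DynamoDBDocumentClient.from(new DynamoDBClient({}))

// Read any number of keys: chunk to 100 per request (the BatchGetItem limit)
// and re-request whatever the service returns in UnprocessedKeys.
async function batchGetAll(tableName: string, keys: Record<string, string>[]) {
  const items: Record<string, unknown>[] = []
  for (let i = 0; i < keys.length; i += 100) {
    let pending = { [tableName]: { Keys: keys.slice(i, i + 100) } }
    while (Object.keys(pending).length > 0) {
      const res = await doc.send(new BatchGetCommand({ RequestItems: pending }))
      items.push(...(res.Responses?.[tableName] ?? []))
      pending = (res.UnprocessedKeys ?? {}) as typeof pending
    }
  }
  return items
}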
Code examples¶
✅ good: using batch get¶
import { TableClient } from '@ddb-lib/client'
const table = new TableClient({
tableName: 'Users',
partitionKey: 'pk',
sortKey: 'sk'
})
// Retrieve multiple items efficiently
const userIds = ['123', '456', '789', '101', '102']
const users = await table.batchGet({
keys: userIds.map(id => ({
pk: `USER#${id}`,
sk: 'PROFILE'
})),
projection: ['name', 'email', 'status']
})
// Single request, 5 items retrieved
// Fast and efficient!
❌ bad: individual get operations¶
// DON'T DO THIS!
const users = []
for (const userId of userIds) {
const user = await table.get({
key: { pk: `USER#${userId}`, sk: 'PROFILE' },
projection: ['name', 'email', 'status']
})
users.push(user)
}
// 5 separate requests with 5x the latency overhead
// Slow and inefficient!
Real-world performance impact¶
Example scenario¶
Retrieving 100 user profiles (1KB each):
| Approach | Requests | Network round-trips | Total latency | RCU consumed |
|---|---|---|---|---|
| Batch get | 1 | 1 | ~20ms | 100 |
| Individual gets | 100 | 100 | ~2000ms | 100 |
Performance Gain
A single BatchGetItem request can return all 100 profiles, so the batch approach replaces roughly 100 network round-trips with 1 while consuming exactly the same RCU.
Batch get operations¶
Basic batch get¶
// Retrieve multiple items from the same table
const items = await table.batchGet({
keys: [
{ pk: 'USER#123', sk: 'PROFILE' },
{ pk: 'USER#456', sk: 'PROFILE' },
{ pk: 'ORDER#789', sk: 'DETAILS' }
]
})
// Library automatically handles:
// - Chunking into 25-item batches
// - Retrying unprocessed items
// - Deduplication of keys
Batch get with projection¶
// Retrieve only needed attributes
const users = await table.batchGet({
keys: userKeys,
projection: ['name', 'email', 'avatar']
})
// Combines batch efficiency with projection cost savings
Large batch get (> 100 items)¶
// Library automatically chunks large batches
const allUsers = await table.batchGet({
keys: Array.from({ length: 250 }, (_, i) => ({
pk: `USER#${i}`,
sk: 'PROFILE'
}))
})
// Split into multiple BatchGetItem requests (at most 100 keys per request)
// All retries handled automatically
Batch write operations¶
Basic batch write¶
// Write multiple items efficiently
await table.batchWrite({
puts: [
{ pk: 'USER#123', sk: 'PROFILE', name: 'Alice' },
{ pk: 'USER#456', sk: 'PROFILE', name: 'Bob' }
],
deletes: [
{ pk: 'USER#789', sk: 'PROFILE' }
]
})
// Single request for multiple operations
Batch put¶
// Insert or update multiple items
const newUsers = [
{ pk: 'USER#123', sk: 'PROFILE', name: 'Alice', email: 'alice@example.com' },
{ pk: 'USER#456', sk: 'PROFILE', name: 'Bob', email: 'bob@example.com' },
{ pk: 'USER#789', sk: 'PROFILE', name: 'Charlie', email: 'charlie@example.com' }
]
await table.batchWrite({
puts: newUsers
})
Batch delete¶
// Delete multiple items
const keysToDelete = [
{ pk: 'USER#123', sk: 'SESSION#abc' },
{ pk: 'USER#123', sk: 'SESSION#def' },
{ pk: 'USER#123', sk: 'SESSION#ghi' }
]
await table.batchWrite({
deletes: keysToDelete
})
Mixed operations¶
// Combine puts and deletes in one batch
await table.batchWrite({
puts: [
{ pk: 'USER#123', sk: 'PROFILE', status: 'ACTIVE' }
],
deletes: [
{ pk: 'USER#123', sk: 'OLD_DATA' },
{ pk: 'USER#123', sk: 'TEMP_SESSION' }
]
})
Advanced patterns¶
Pattern 1: batch processing pipeline¶
// Process items in batches
async function processUsers(userIds: string[]) {
// Fetch in batches
const users = await table.batchGet({
keys: userIds.map(id => ({ pk: `USER#${id}`, sk: 'PROFILE' }))
})
// Process
const updates = users.map(user => ({
...user,
lastProcessed: Date.now()
}))
// Write back in batches
await table.batchWrite({
puts: updates
})
}
Pattern 2: parallel batch operations¶
// Process multiple batches in parallel
async function fetchAllOrders(userIds: string[]) {
const batches = chunk(userIds, 100) // 100 keys per batch (see the chunk helper below)
const results = await Promise.all(
batches.map(batch =>
table.batchGet({
keys: batch.map(id => ({ pk: `USER#${id}`, sk: 'ORDERS' }))
})
)
)
return results.flat()
}
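Pattern 2 above (and Tip 2 further down) assumes a chunk helper that is not part of the library; a minimal sketch:
// Split an array into consecutive slices of at most `size` elements
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size))
  }
  return out
}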
Pattern 3: batch with error handling¶
// Handle partial failures gracefully
try {
const result = await table.batchWrite({
puts: items
})
// Check for unprocessed items
if ((result.unprocessedItems?.length ?? 0) > 0) {
console.log(`${result.unprocessedItems.length} items not processed`)
// Library automatically retries, but you can handle manually if needed
}
} catch (error) {
console.error('Batch write failed:', error)
}
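If items remain unprocessed even after the library's automatic retries (typically under sustained throttling), you can re-submit them yourself with exponential backoff. A rough sketch; the helper name is hypothetical and the unprocessedItems shape follows the example above:
// Retry leftover puts with exponential backoff between attempts
async function batchWriteWithBackoff(items: Record<string, unknown>[], maxAttempts = 5) {
  let pending = items
  for (let attempt = 0; pending.length > 0 && attempt < maxAttempts; attempt++) {
    if (attempt > 0) {
      // Wait 100ms, 200ms, 400ms, ... before re-submitting
      await new Promise(resolve => setTimeout(resolve, 100 * 2 ** (attempt - 1)))
    }
    const result = await table.batchWrite({ puts: pending })
    pending = result.unprocessedItems ?? []
  }
  if (pending.length > 0) {
    throw new Error(`${pending.length} items still unprocessed after ${maxAttempts} attempts`)
  }
}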
Limitations and considerations¶
Batch size limits¶
- BatchGetItem: maximum 100 items per request
- BatchWriteItem: maximum 25 put or delete requests per request
- Maximum 16MB total request size
- Maximum 400KB per item
// Library handles chunking automatically
const largeItems = Array.from({ length: 100 }, (_, i) => ({
pk: `ITEM#${i}`,
sk: 'DATA',
payload: generateLargePayload()
}))
// Automatically split into multiple batches
await table.batchWrite({ puts: largeItems })
No conditional writes¶
Batch operations don't support conditional expressions:
// ❌ Not supported in batch operations
await table.batchWrite({
puts: [{
pk: 'USER#123',
sk: 'PROFILE',
condition: { attribute_not_exists: 'pk' } // Not supported!
}]
})
// ✅ Use individual operations for conditional writes
await table.put({
item: { pk: 'USER#123', sk: 'PROFILE' },
condition: { attribute_not_exists: 'pk' }
})
Partial failures¶
Batch operations can partially succeed:
// Some items may fail while others succeed
const result = await table.batchWrite({
puts: items
})
// Library automatically retries unprocessed items
// But you should monitor for persistent failures
if ((result.unprocessedItems?.length ?? 0) > 0) {
// Log or handle persistent failures
console.warn('Some items could not be processed')
}
When NOT to use batch operations¶
❌ don't use batches for:¶
- Single item operations
- Operations requiring conditions (batch writes don't support condition expressions)
- Transactions (when you need all-or-nothing writes; see the sketch below)
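For the last two cases, DynamoDB transactions do support condition expressions and all-or-nothing semantics across items. A minimal sketch with the raw AWS SDK v3 document client (your wrapper may expose an equivalent; table and key names are placeholders):
import { DynamoDBClient } from '@aws-sdk/client-dynamodb'
import { DynamoDBDocumentClient, TransactWriteCommand } from '@aws-sdk/lib-dynamodb'

const doc = DynamoDBDocumentClient.from(new DynamoDBClient({}))

// Atomically create a profile and delete temp data;
// the whole transaction fails if the profile already exists.
await doc.send(new TransactWriteCommand({
  TransactItems: [
    {
      Put: {
        TableName: 'Users',
        Item: { pk: 'USER#123', sk: 'PROFILE', name: 'Alice' },
        ConditionExpression: 'attribute_not_exists(pk)'
      }
    },
    {
      Delete: {
        TableName: 'Users',
        Key: { pk: 'USER#123', sk: 'TEMP_SESSION' }
      }
    }
  ]
}))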
Monitoring batch operations¶
Track batch operation efficiency:
import { StatsCollector } from '@ddb-lib/stats'
const stats = new StatsCollector()
const table = new TableClient({
tableName: 'Users',
statsCollector: stats
})
// After operations
const summary = stats.getSummary()
console.log(`Batch operations: ${summary.operations.batchGet + summary.operations.batchWrite}`)
console.log(`Individual operations: ${summary.operations.get + summary.operations.put}`)
console.log(`Batch efficiency: ${summary.batchEfficiency}%`)
// Aim for high batch usage in multi-item scenarios
Performance optimization tips¶
Tip 1: combine with projections¶
// Maximum efficiency: batch + projection
const users = await table.batchGet({
keys: userKeys,
projection: ['name', 'email'] // Only get what you need
})
Tip 2: use parallel batches for large datasets¶
import pMap from 'p-map' // concurrency-limited mapping (p-map package, or any equivalent)

type Key = { pk: string; sk: string }

// Process very large datasets efficiently
async function processLargeDataset(keys: Key[]) {
const batches = chunk(keys, 100) // chunk helper from Pattern 2 above
// Process batches in parallel (with a concurrency limit)
const results = await pMap(
batches,
batch => table.batchGet({ keys: batch }),
{ concurrency: 5 }
)
return results.flat()
}
Tip 3: deduplicate keys¶
// Remove duplicate keys before batching
const uniqueKeys = Array.from(
new Set(keys.map(k => JSON.stringify(k)))
).map(k => JSON.parse(k))
await table.batchGet({ keys: uniqueKeys })
Key takeaways¶
- Use batches for multiple items - up to 100x fewer read requests and 25x fewer write requests
- Library handles complexity - automatic chunking and retries
- Combine with projections - maximum cost efficiency
- Not for conditional writes - use individual operations or transactions
- Monitor efficiency - track batch vs individual operation ratio
Related best practices¶
- Projection expressions - Combine for maximum efficiency
- Query vs scan - Efficient data retrieval
- Capacity planning - Plan for batch throughput
Related guides¶
- Batch operations - Detailed usage guide
- Core operations - Individual operations
- Transactions - When you need ACID guarantees