Batch operations: maximize throughput¶
The practice¶
Use batch operations (BatchGetItem, BatchWriteItem) instead of individual operations when you are working with multiple items.
Batch operations let you read up to 100 items or write up to 25 items in a single request, dramatically improving throughput and cutting per-request latency overhead.
Why it matters¶
Throughput¶
- Process up to 100 reads or 25 writes per request instead of 1
- Cut the total number of API calls by up to 100x for reads and 25x for writes
- Maximize utilization of provisioned capacity
Latency¶
- Eliminate per-request network overhead
- Single round-trip instead of multiple
- Particularly beneficial for high-latency connections
Cost efficiency¶
- Same RCU/WCU consumption as individual operations, but with far less per-request overhead
- Reduced Lambda execution time in serverless applications
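To make the capacity point concrete, here is the standard DynamoDB read-unit arithmetic for the scenario used later on this page (100 profiles of ~1KB each, read with strong consistency; the numbers are illustrative):
// 100 user profiles, ~1KB each, read with strong consistency:
// a strongly consistent read of up to 4KB costs 1 RCU.
const itemCount = 100
const rcuPerItem = Math.ceil(1 / 4)      // 1KB item -> 1 RCU
const totalRcu = itemCount * rcuPerItem  // 100 RCU
// 100 individual GetItem calls: 100 requests, 100 RCU
// 1 BatchGetItem call:          1 request,   100 RCU (same capacity, far fewer round-trips)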
Visual comparison¶
| Aspect | Batch operations | Individual operations |
|---|---|---|
| Items per request | Up to 100 (get) / 25 (write) | 1 |
| Network round-trips | 1 per batch | 1 per item |
| Latency overhead | Minimal | One round-trip per item |
| Throughput | High | Low |
| Code complexity | Simple (library handles chunking) | Complex (manual loops) |
Automatic Chunking
The @ddb-lib/client library automatically splits requests that exceed the per-request limits into multiple batches and retries unprocessed items, so you don't have to track batch size limits yourself.
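For context, this is roughly what the library automates. A minimal sketch of manual chunking and unprocessed-key retries using the plain AWS SDK v3 document client (not the library's internals; the table name and key shape are placeholders, and a production version would add backoff between retries):
import { DynamoDBClient } from '@aws-sdk/client-dynamodb'
import { BatchGetCommand, DynamoDBDocumentClient } from '@aws-sdk/lib-dynamodb'

const doc = DynamoDBDocumentClient.from(new DynamoDBClient({}))

// Read any number of keys: chunk to 100 per request (the BatchGetItem limit)
// and re-request whatever the service returns in UnprocessedKeys.
async function batchGetAll(tableName: string, keys: Record<string, string>[]) {
  const items: Record<string, unknown>[] = []
  for (let i = 0; i < keys.length; i += 100) {
    let pending = { [tableName]: { Keys: keys.slice(i, i + 100) } }
    while (Object.keys(pending).length > 0) {
      const res = await doc.send(new BatchGetCommand({ RequestItems: pending }))
      items.push(...(res.Responses?.[tableName] ?? []))
      pending = (res.UnprocessedKeys ?? {}) as typeof pending
    }
  }
  return items
}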
Code examples¶
✅ good: using batch get¶
import { TableClient } from '@ddb-lib/client'
const table = new TableClient({
tableName: 'Users',
partitionKey: 'pk',
sortKey: 'sk'
})
// Retrieve multiple items efficiently
const userIds = ['123', '456', '789', '101', '102']
const users = await table.batchGet({
keys: userIds.map(id => ({
pk: `USER#${id}`,
sk: 'PROFILE'
})),
projection: ['name', 'email', 'status']
})
// Single request, 5 items retrieved
// Fast and efficient!
❌ bad: individual get operations¶
// DON'T DO THIS!
const users = []
for (const userId of userIds) {
const user = await table.get({
key: { pk: `USER#${userId}`, sk: 'PROFILE' },
projection: ['name', 'email', 'status']
})
users.push(user)
}
// 5 separate requests with 5x the latency overhead
// Slow and inefficient!
Real-world performance impact¶
Example scenario¶
Retrieving 100 user profiles (1KB each):
| Approach | Requests | Network round-trips | Total latency | RCU consumed |
|---|---|---|---|---|
| Batch get | 1 | 1 | ~20ms | 100 |
| Individual gets | 100 | 100 | ~2000ms | 100 |
Performance Gain
A single BatchGetItem request can return all 100 profiles, so the batch approach replaces roughly 100 network round-trips with 1 while consuming exactly the same RCU.
Batch get operations¶
Basic batch get¶
// Retrieve multiple items from the same table
const items = await table.batchGet({
keys: [
{ pk: 'USER#123', sk: 'PROFILE' },
{ pk: 'USER#456', sk: 'PROFILE' },
{ pk: 'ORDER#789', sk: 'DETAILS' }
]
})
// Library automatically handles:
// - Chunking into 25-item batches
// - Retrying unprocessed items
// - Deduplication of keys
Batch get with projection¶
// Retrieve only needed attributes
const users = await table.batchGet({
keys: userKeys,
projection: ['name', 'email', 'avatar']
})
// Combines batch efficiency with projection cost savings
Large batch get (> 100 items)¶
// Library automatically chunks large batches
const allUsers = await table.batchGet({
keys: Array.from({ length: 250 }, (_, i) => ({
pk: `USER#${i}`,
sk: 'PROFILE'
}))
})
// Split into multiple BatchGetItem requests (at most 100 keys per request)
// All retries handled automatically
Batch write operations¶
Basic batch write¶
// Write multiple items efficiently
await table.batchWrite({
puts: [
{ pk: 'USER#123', sk: 'PROFILE', name: 'Alice' },
{ pk: 'USER#456', sk: 'PROFILE', name: 'Bob' }
],
deletes: [
{ pk: 'USER#789', sk: 'PROFILE' }
]
})
// Single request for multiple operations
Batch put¶
// Insert or update multiple items
const newUsers = [
{ pk: 'USER#123', sk: 'PROFILE', name: 'Alice', email: 'alice@example.com' },
{ pk: 'USER#456', sk: 'PROFILE', name: 'Bob', email: 'bob@example.com' },
{ pk: 'USER#789', sk: 'PROFILE', name: 'Charlie', email: 'charlie@example.com' }
]
await table.batchWrite({
puts: newUsers
})
Batch delete¶
// Delete multiple items
const keysToDelete = [
{ pk: 'USER#123', sk: 'SESSION#abc' },
{ pk: 'USER#123', sk: 'SESSION#def' },
{ pk: 'USER#123', sk: 'SESSION#ghi' }
]
await table.batchWrite({
deletes: keysToDelete
})
Mixed operations¶
// Combine puts and deletes in one batch
await table.batchWrite({
puts: [
{ pk: 'USER#123', sk: 'PROFILE', status: 'ACTIVE' }
],
deletes: [
{ pk: 'USER#123', sk: 'OLD_DATA' },
{ pk: 'USER#123', sk: 'TEMP_SESSION' }
]
})
Advanced patterns¶
Pattern 1: batch processing pipeline¶
// Process items in batches
async function processUsers(userIds: string[]) {
// Fetch in batches
const users = await table.batchGet({
keys: userIds.map(id => ({ pk: `USER#${id}`, sk: 'PROFILE' }))
})
// Process
const updates = users.map(user => ({
...user,
lastProcessed: Date.now()
}))
// Write back in batches
await table.batchWrite({
puts: updates
})
}
Pattern 2: parallel batch operations¶
// Process multiple batches in parallel
async function fetchAllOrders(userIds: string[]) {
const batches = chunk(userIds, 100) // 100 keys per batch (see the chunk helper below)
const results = await Promise.all(
batches.map(batch =>
table.batchGet({
keys: batch.map(id => ({ pk: `USER#${id}`, sk: 'ORDERS' }))
})
)
)
return results.flat()
}
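Pattern 2 above (and Tip 2 further down) assumes a chunk helper that is not part of the library; a minimal sketch:
// Split an array into consecutive slices of at most `size` elements
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size))
  }
  return out
}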
Pattern 3: batch with error handling¶
// Handle partial failures gracefully
try {
const result = await table.batchWrite({
puts: items
})
// Check for unprocessed items
if ((result.unprocessedItems?.length ?? 0) > 0) {
console.log(`${result.unprocessedItems.length} items not processed`)
// Library automatically retries, but you can handle manually if needed
}
} catch (error) {
console.error('Batch write failed:', error)
}
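If items remain unprocessed even after the library's automatic retries (typically under sustained throttling), you can re-submit them yourself with exponential backoff. A rough sketch; the helper name is hypothetical and the unprocessedItems shape follows the example above:
// Retry leftover puts with exponential backoff between attempts
async function batchWriteWithBackoff(items: Record<string, unknown>[], maxAttempts = 5) {
  let pending = items
  for (let attempt = 0; pending.length > 0 && attempt < maxAttempts; attempt++) {
    if (attempt > 0) {
      // Wait 100ms, 200ms, 400ms, ... before re-submitting
      await new Promise(resolve => setTimeout(resolve, 100 * 2 ** (attempt - 1)))
    }
    const result = await table.batchWrite({ puts: pending })
    pending = result.unprocessedItems ?? []
  }
  if (pending.length > 0) {
    throw new Error(`${pending.length} items still unprocessed after ${maxAttempts} attempts`)
  }
}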
Limitations and considerations¶
Batch size limits¶
- BatchGetItem: maximum 100 items per request
- BatchWriteItem: maximum 25 put or delete requests per request
- Maximum 16MB total request size
- Maximum 400KB per item
// Library handles chunking automatically
const largeItems = Array.from({ length: 100 }, (_, i) => ({
pk: `ITEM#${i}`,
sk: 'DATA',
payload: generateLargePayload()
}))
// Automatically split into multiple batches
await table.batchWrite({ puts: largeItems })
No conditional writes¶
Batch operations don't support conditional expressions:
// ❌ Not supported in batch operations
await table.batchWrite({
puts: [{
pk: 'USER#123',
sk: 'PROFILE',
condition: { attribute_not_exists: 'pk' } // Not supported!
}]
})
// ✅ Use individual operations for conditional writes
await table.put({
item: { pk: 'USER#123', sk: 'PROFILE' },
condition: { attribute_not_exists: 'pk' }
})
Partial failures¶
Batch operations can partially succeed:
// Some items may fail while others succeed
const result = await table.batchWrite({
puts: items
})
// Library automatically retries unprocessed items
// But you should monitor for persistent failures
if ((result.unprocessedItems?.length ?? 0) > 0) {
// Log or handle persistent failures
console.warn('Some items could not be processed')
}
When NOT to use batch operations¶
❌ don't use batches for:¶
- Single item operations
- Operations requiring conditions (batch writes don't support condition expressions)
- Transactions (when you need all-or-nothing writes; see the sketch below)
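For the last two cases, DynamoDB transactions do support condition expressions and all-or-nothing semantics across items. A minimal sketch with the raw AWS SDK v3 document client (your wrapper may expose an equivalent; table and key names are placeholders):
import { DynamoDBClient } from '@aws-sdk/client-dynamodb'
import { DynamoDBDocumentClient, TransactWriteCommand } from '@aws-sdk/lib-dynamodb'

const doc = DynamoDBDocumentClient.from(new DynamoDBClient({}))

// Atomically create a profile and delete temp data;
// the whole transaction fails if the profile already exists.
await doc.send(new TransactWriteCommand({
  TransactItems: [
    {
      Put: {
        TableName: 'Users',
        Item: { pk: 'USER#123', sk: 'PROFILE', name: 'Alice' },
        ConditionExpression: 'attribute_not_exists(pk)'
      }
    },
    {
      Delete: {
        TableName: 'Users',
        Key: { pk: 'USER#123', sk: 'TEMP_SESSION' }
      }
    }
  ]
}))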
Monitoring batch operations¶
Track batch operation efficiency:
import { StatsCollector } from '@ddb-lib/stats'
const stats = new StatsCollector()
const table = new TableClient({
tableName: 'Users',
statsCollector: stats
})
// After operations
const summary = stats.getSummary()
console.log(`Batch operations: ${summary.operations.batchGet + summary.operations.batchWrite}`)
console.log(`Individual operations: ${summary.operations.get + summary.operations.put}`)
console.log(`Batch efficiency: ${summary.batchEfficiency}%`)
// Aim for high batch usage in multi-item scenarios
Performance optimization tips¶
Tip 1: combine with projections¶
// Maximum efficiency: batch + projection
const users = await table.batchGet({
keys: userKeys,
projection: ['name', 'email'] // Only get what you need
})
Tip 2: use parallel batches for large datasets¶
import pMap from 'p-map' // concurrency-limited mapping (p-map package, or any equivalent)

type Key = { pk: string; sk: string }

// Process very large datasets efficiently
async function processLargeDataset(keys: Key[]) {
const batches = chunk(keys, 100) // chunk helper from Pattern 2 above
// Process batches in parallel (with a concurrency limit)
const results = await pMap(
batches,
batch => table.batchGet({ keys: batch }),
{ concurrency: 5 }
)
return results.flat()
}
Tip 3: deduplicate keys¶
// Remove duplicate keys before batching
const uniqueKeys = Array.from(
new Set(keys.map(k => JSON.stringify(k)))
).map(k => JSON.parse(k))
await table.batchGet({ keys: uniqueKeys })
Key takeaways¶
- Use batches for multiple items - up to 100x fewer read requests and 25x fewer write requests
- Library handles complexity - automatic chunking and retries
- Combine with projections - maximum cost efficiency
- Not for conditional writes - use individual operations or transactions
- Monitor efficiency - track batch vs individual operation ratio
Related best practices¶
- Projection expressions - Combine for maximum efficiency
- Query vs scan - Efficient data retrieval
- Capacity planning - Plan for batch throughput
Related guides¶
- Batch operations - Detailed usage guide
- Core operations - Individual operations
- Transactions - When you need ACID guarantees