Missing projections anti-pattern¶
What is it?¶
The missing projections anti-pattern occurs when developers retrieve entire items from DynamoDB when they only need a few specific attributes. This wastes read capacity, bandwidth, and money by transferring unnecessary data.
Why is it a problem?¶
Retrieving full items when you only need specific attributes creates waste:
- Wasted Capacity: Pay for reading data you don't use
- Higher Costs: RCU consumption based on item size, not attributes used
- Slower Performance: More data to transfer over the network
- Bandwidth Waste: Unnecessary data transfer
- Memory Pressure: Larger objects in application memory
- Parsing Overhead: Deserializing unused attributes
The hidden cost¶
If your items are 10KB but you only need 1KB of attributes:
- Without projection: 10KB read = 3 RCU per item
- With projection: 1KB read = 1 RCU per item
- Waste: 2 RCU per item (67% wasted capacity)
Visual comparison¶
Example of the problem¶
❌ anti-pattern: no projection expression¶
import { TableClient } from '@ddb-lib/client'
const table = new TableClient({
tableName: 'Users',
// ... config
})
// BAD: Retrieving entire user object
const result = await table.get({
pk: `USER#${userId}`,
sk: 'PROFILE'
})
// Item contains:
// - name (100 bytes)
// - email (50 bytes)
// - profileImage (5 KB)
// - preferences (2 KB)
// - activityHistory (10 KB)
// - metadata (1 KB)
// Total: ~18 KB = 5 RCU
// But we only need the name!
const userName = result.item.name // Used only 100 bytes out of 18 KB
❌ common scenarios¶
// BAD: Query without projection
const orders = await table.query({
keyCondition: {
pk: `USER#${userId}`,
sk: { beginsWith: 'ORDER#' }
}
})
// Each order is 20 KB (includes full product details, shipping info, etc.)
// But we only need orderId and status for the list view
// Wasting 95% of read capacity!
for (const order of orders.items) {
console.log(order.orderId, order.status) // Only using 2 attributes
}
// BAD: Scan without projection
const activeUsers = await table.scan({
filter: {
status: { eq: 'ACTIVE' }
}
})
// Reading entire user profiles (10 KB each)
// But only need userId and email for notification
// Massive waste!
// BAD: Batch get without projection
const users = await table.batchGet({
keys: userIds.map(id => ({
pk: `USER#${id}`,
sk: 'PROFILE'
}))
})
// Getting full profiles for all users
// But only displaying names in UI
The solution¶
✅ use projection expressions¶
// GOOD: Get only needed attributes
const result = await table.get({
pk: `USER#${userId}`,
sk: 'PROFILE',
projection: ['name']
})
// Only transfers 100 bytes = 1 RCU
// 80% cost reduction!
const userName = result.item.name
✅ query with projection¶
// GOOD: Query with projection
const orders = await table.query({
keyCondition: {
pk: `USER#${userId}`,
sk: { beginsWith: 'ORDER#' }
},
projection: ['orderId', 'status', 'createdAt', 'total']
})
// Each order now ~500 bytes instead of 20 KB
// 95% cost reduction!
for (const order of orders.items) {
console.log(order.orderId, order.status)
}
✅ scan with projection¶
// GOOD: Scan with projection (when scan is necessary)
const activeUsers = await table.scan({
filter: {
status: { eq: 'ACTIVE' }
},
projection: ['userId', 'email', 'name']
})
// Each user now ~200 bytes instead of 10 KB
// 98% cost reduction!
✅ batch get with projection¶
// GOOD: Batch get with projection
const users = await table.batchGet({
keys: userIds.map(id => ({
pk: `USER#${id}`,
sk: 'PROFILE'
})),
projection: ['name', 'email']
})
// Only get what you need
// Significant cost savings
✅ nested attribute projection¶
// GOOD: Project nested attributes
const result = await table.get({
pk: `USER#${userId}`,
sk: 'PROFILE',
projection: [
'name',
'email',
'address.city', // Nested attribute
'address.country', // Nested attribute
'preferences.theme' // Nested attribute
]
})
// Get only specific nested fields
// Don't retrieve entire nested objects
✅ different projections for different use cases¶
// List view: minimal data
async function getUserList() {
return await table.query({
keyCondition: { pk: 'USERS' },
projection: ['userId', 'name', 'email']
})
}
// Detail view: more data
async function getUserDetail(userId: string) {
return await table.get({
pk: `USER#${userId}`,
sk: 'PROFILE',
projection: [
'userId',
'name',
'email',
'phone',
'address',
'preferences',
'createdAt'
]
})
}
// Edit view: all data
async function getUserForEdit(userId: string) {
return await table.get({
pk: `USER#${userId}`,
sk: 'PROFILE'
// No projection - need everything for editing
})
}
Cost impact¶
Real-world example¶
Scenario: E-commerce order list (1 million views per month)
Without Projection:
// Full order item: 20 KB
// RCU per item: 5
// Total RCU: 5 million
// Cost: $0.25 per million = $1.25
With Projection:
// Projected attributes: 500 bytes
// RCU per item: 1
// Total RCU: 1 million
// Cost: $0.25 per million = $0.25
// Savings: $1.00/month (80%)
Cost comparison table¶
| Item Size | Attributes Needed | Without Projection | With Projection | Savings |
|---|---|---|---|---|
| 1 KB | 100 bytes | 1 RCU | 1 RCU | 0% |
| 4 KB | 400 bytes | 1 RCU | 1 RCU | 0% |
| 5 KB | 500 bytes | 2 RCU | 1 RCU | 50% |
| 10 KB | 1 KB | 3 RCU | 1 RCU | 67% |
| 20 KB | 2 KB | 5 RCU | 1 RCU | 80% |
| 50 KB | 5 KB | 13 RCU | 2 RCU | 85% |
| 100 KB | 10 KB | 25 RCU | 3 RCU | 88% |
Monthly cost impact¶
For 10 million reads per month at $0.25 per million RCU:
| Scenario | RCU per Read | Total RCU | Monthly Cost | Annual Cost |
|---|---|---|---|---|
| No Projection (20 KB items) | 5 | 50 million | $12.50 | $150 |
| With Projection (2 KB) | 1 | 10 million | $2.50 | $30 |
| Savings | - | 40 million | $10/month | $120/year |
Detection¶
The anti-pattern detector can identify missing projections:
import { StatsCollector, AntiPatternDetector } from '@ddb-lib/stats'
const stats = new StatsCollector()
const detector = new AntiPatternDetector(stats)
// After running operations
const issues = detector.detectMissingProjections()
for (const issue of issues) {
console.log(issue.message)
// "Query on 'Orders' table retrieving full items (avg 18 KB)"
// "Consider using projection expressions to reduce capacity consumption"
// "Potential savings: 4 RCU per item (80%)"
}
Warning signs¶
You might have this anti-pattern if:
- Your read capacity costs are higher than expected
- You're transferring large amounts of data
- Your application only uses a few attributes from items
- You see high RCU consumption on queries
- Network transfer is slow
Performance impact¶
Latency improvement¶
| Item Size | Without Projection | With Projection | Improvement |
|---|---|---|---|
| 10 KB | 25ms | 15ms | 40% faster |
| 50 KB | 80ms | 20ms | 75% faster |
| 100 KB | 150ms | 25ms | 83% faster |
Throughput improvement¶
With the same provisioned capacity:
Provisioned: 1,000 RCU
Without Projection (5 RCU per item):
- Throughput: 200 items/second
With Projection (1 RCU per item):
- Throughput: 1,000 items/second
- 5x improvement!
Best practices¶
Analyze your access patterns¶
// Identify what attributes you actually use
function analyzeAttributeUsage(items: any[]) {
const usedAttributes = new Set<string>()
for (const item of items) {
// Track which attributes your code accesses
if (item.name) usedAttributes.add('name')
if (item.email) usedAttributes.add('email')
// ... etc
}
console.log('Used attributes:', Array.from(usedAttributes))
console.log('Total attributes in items:', Object.keys(items[0]).length)
// Use this to create optimal projection expressions
}
Create projection helpers¶
// Define projections for common use cases
const PROJECTIONS = {
userList: ['userId', 'name', 'email', 'status'],
userCard: ['userId', 'name', 'profileImage', 'memberSince'],
userDetail: ['userId', 'name', 'email', 'phone', 'address', 'preferences'],
orderList: ['orderId', 'status', 'total', 'createdAt'],
orderDetail: ['orderId', 'status', 'total', 'items', 'shipping', 'createdAt']
}
// Use in queries
const users = await table.query({
keyCondition: { pk: 'USERS' },
projection: PROJECTIONS.userList
})
Document projections¶
/**
* Get user profile for list view
* Projection: name, email, status (reduces RCU by 80%)
*/
async function getUserListItem(userId: string) {
return await table.get({
pk: `USER#${userId}`,
sk: 'PROFILE',
projection: ['name', 'email', 'status']
})
}
When to skip projections¶
Projections aren't always beneficial:
Skip projections when:¶
- You need all attributes (edit forms, full details)
- Items are small (<4 KB and you need most attributes)
- Projection would include most attributes anyway (>80% of item)
- Caching full items (better to cache complete data)
// OK: Skip projection when you need everything
async function getUserForEdit(userId: string) {
return await table.get({
pk: `USER#${userId}`,
sk: 'PROFILE'
// No projection - need all attributes for editing
})
}
// OK: Small items where projection doesn't help much
async function getConfig(key: string) {
return await table.get({
pk: 'CONFIG',
sk: key
// Config items are tiny (< 1 KB), projection overhead not worth it
})
}
Related resources¶
Summary¶
The Problem: Retrieving entire items when you only need specific attributes wastes 50-90% of read capacity and increases latency.
The Solution: Use projection expressions to retrieve only the attributes you need for each use case.
The Impact: Projection expressions can reduce read capacity consumption by 50-90% and improve latency by 40-80%.
Remember: Every byte you read from DynamoDB costs money and time. Use projection expressions to read only what you need. Your wallet and your users will thank you.