Missing projections anti-pattern¶

What is it?¶

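The missing projections anti-pattern occurs when developers retrieve entire items from DynamoDB when they need only a few attributes. The unused attributes still travel over the network, get deserialized, and sit in application memory. Keep in mind that DynamoDB meters read capacity by the size of the item it reads, not by the attributes it returns. A projection expression on its own therefore trims the response, not the RCU charge; lowering RCU requires reading smaller items, for example from a secondary index that projects only the needed attributes.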

Why is it a problem?¶

Retrieving full items when you only need specific attributes creates waste:

Wasted Capacity: You pay to read attributes you never use
Higher Costs: RCU consumption is based on the size of the item read, not on the attributes returned, so a projection expression alone does not reduce it
Slower Performance: More data to transfer over the network
Bandwidth Waste: Unnecessary data transfer
Memory Pressure: Larger objects in application memory
Parsing Overhead: Deserializing unused attributes

The hidden cost¶

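If your items are 10 KB but you only need 1 KB of attributes (strongly consistent reads, 1 RCU per 4 KB):

Reading the full 10 KB item from the table: 3 RCU per item, with or without a projection expression
Reading a 1 KB entry from an index that projects only those attributes: 1 RCU per item
Waste: 2 RCU per item (67% wasted capacity), plus 9 KB of unneeded data on the wire when no projection expression is used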
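
The RCU figures above follow from DynamoDB's metering rule: a strongly consistent read costs one RCU per 4 KB read, rounded up, and an eventually consistent read costs half of that. Below is a minimal sketch of that arithmetic. The helper names (`rcuForBytes`, `getItemRcu`, `queryRcu`) are hypothetical and not part of `@ddb-lib/client`, and the sketch assumes 1 KB = 1,024 bytes.

```typescript
// One RCU covers up to 4 KB for a strongly consistent read.
const RCU_BLOCK_BYTES = 4 * 1024

// RCU for one metered read of `totalBytes` (hypothetical helper).
function rcuForBytes(totalBytes: number, stronglyConsistent = true): number {
  if (!isFinite(totalBytes) || totalBytes < 0) {
    throw new RangeError('totalBytes must be a non-negative finite number')
  }
  // Even a tiny read is billed as one 4 KB block.
  const blocks = Math.max(1, Math.ceil(totalBytes / RCU_BLOCK_BYTES))
  // Eventually consistent reads cost half as much.
  return stronglyConsistent ? blocks : blocks / 2
}

// GetItem and BatchGetItem round each item up on its own.
function getItemRcu(itemBytes: number, stronglyConsistent = true): number {
  return rcuForBytes(itemBytes, stronglyConsistent)
}

// Query sums the sizes of the items it reads, then rounds up once.
function queryRcu(itemSizes: number[], stronglyConsistent = true): number {
  const total = itemSizes.reduce((sum, size) => sum + size, 0)
  return rcuForBytes(total, stronglyConsistent)
}

const fullItem = getItemRcu(10 * 1024) // 10 KB item
const smallItem = getItemRcu(1 * 1024) // 1 KB item
const eventual = getItemRcu(10 * 1024, false) // same item, eventually consistent
```

The input is the number of bytes DynamoDB reads for the operation, so the same helpers can be used to compare a base-table read with a read of smaller index entries.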

Example of the problem¶

❌ anti-pattern: no projection expression¶

import { TableClient } from '@ddb-lib/client'

const table = new TableClient({
  tableName: 'Users',
  // ... config
})

// BAD: Retrieving entire user object
const result = await table.get({
  pk: `USER#${userId}`,
  sk: 'PROFILE'
})

// Item contains:
// - name (100 bytes)
// - email (50 bytes)
// - profileImage (5 KB)
// - preferences (2 KB)
// - activityHistory (10 KB)
// - metadata (1 KB)
// Total: ~18 KB = 5 RCU

// But we only need the name!
const userName = result.item.name  // Used only 100 bytes out of 18 KB

❌ common scenarios¶

// BAD: Query without projection
const orders = await table.query({
  keyCondition: {
    pk: `USER#${userId}`,
    sk: { beginsWith: 'ORDER#' }
  }
})

// Each order is 20 KB (includes full product details, shipping info, etc.)
// But we only need orderId and status for the list view
// Wasting 95% of read capacity!

for (const order of orders.items) {
  console.log(order.orderId, order.status)  // Only using 2 attributes
}

// BAD: Scan without projection
const activeUsers = await table.scan({
  filter: {
    status: { eq: 'ACTIVE' }
  }
})

// Reading entire user profiles (10 KB each)
// But only need userId and email for notification
// Massive waste!

// BAD: Batch get without projection
const users = await table.batchGet({
  keys: userIds.map(id => ({
    pk: `USER#${id}`,
    sk: 'PROFILE'
  }))
})

// Getting full profiles for all users
// But only displaying names in UI

The solution¶

✅ use projection expressions¶

// GOOD: Get only needed attributes
const result = await table.get({
  pk: `USER#${userId}`,
  sk: 'PROFILE',
  projection: ['name']
})

// Response now carries ~100 bytes instead of ~18 KB
// RCU is still metered on the full item; the savings are in transfer, parsing, and memory

const userName = result.item.name

✅ query with projection¶

// GOOD: Query with projection
const orders = await table.query({
  keyCondition: {
    pk: `USER#${userId}`,
    sk: { beginsWith: 'ORDER#' }
  },
  projection: ['orderId', 'status', 'createdAt', 'total']
})

// Each returned order is now ~500 bytes instead of 20 KB
// ~97% less data transferred; RCU is still metered on the full 20 KB items

for (const order of orders.items) {
  console.log(order.orderId, order.status)
}

✅ scan with projection¶

// GOOD: Scan with projection (when scan is necessary)
const activeUsers = await table.scan({
  filter: {
    status: { eq: 'ACTIVE' }
  },
  projection: ['userId', 'email', 'name']
})

// Each returned user is now ~200 bytes instead of 10 KB
// 98% less data returned; the scan still consumes RCU for every item it reads

✅ batch get with projection¶

// GOOD: Batch get with projection
const users = await table.batchGet({
  keys: userIds.map(id => ({
    pk: `USER#${id}`,
    sk: 'PROFILE'
  })),
  projection: ['name', 'email']
})

// Only get what you need
// Much smaller responses to transfer and parse

✅ nested attribute projection¶

// GOOD: Project nested attributes
const result = await table.get({
  pk: `USER#${userId}`,
  sk: 'PROFILE',
  projection: [
    'name',
    'email',
    'address.city',      // Nested attribute
    'address.country',   // Nested attribute
    'preferences.theme'  // Nested attribute
  ]
})

// Get only specific nested fields
// Don't retrieve entire nested objects
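
How `@ddb-lib/client` turns the `projection` array into a request is not shown here. At the DynamoDB API level, a projection is a comma-separated `ProjectionExpression` string, and attribute names that collide with reserved words (`name` and `status` are both reserved) must go through `ExpressionAttributeNames` placeholders. The sketch below shows one way such a translation could work. `buildProjection` is a hypothetical helper, it aliases every path segment for simplicity, and it does not handle list indexes such as `items[0]`.

```typescript
interface ProjectionParams {
  ProjectionExpression: string
  ExpressionAttributeNames: Record<string, string>
}

// Hypothetical helper: converts dotted attribute paths into DynamoDB parameters.
function buildProjection(paths: string[]): ProjectionParams {
  if (paths.length === 0) {
    throw new Error('At least one attribute path is required')
  }

  const names: Record<string, string> = {}
  const placeholders = new Map<string, string>()

  // Reuse one placeholder per distinct segment so "address" is aliased only once.
  const alias = (segment: string): string => {
    let placeholder = placeholders.get(segment)
    if (placeholder === undefined) {
      placeholder = `#p${placeholders.size}`
      placeholders.set(segment, placeholder)
      names[placeholder] = segment
    }
    return placeholder
  }

  const expression = paths
    .map(path => path.split('.').map(alias).join('.'))
    .join(', ')

  return { ProjectionExpression: expression, ExpressionAttributeNames: names }
}

const params = buildProjection(['name', 'address.city', 'address.country'])
```

Aliasing every segment avoids maintaining a reserved-word list, at the cost of a slightly less readable expression.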

✅ different projections for different use cases¶

// List view: minimal data
async function getUserList() {
  return await table.query({
    keyCondition: { pk: 'USERS' },
    projection: ['userId', 'name', 'email']
  })
}

// Detail view: more data
async function getUserDetail(userId: string) {
  return await table.get({
    pk: `USER#${userId}`,
    sk: 'PROFILE',
    projection: [
      'userId',
      'name', 
      'email',
      'phone',
      'address',
      'preferences',
      'createdAt'
    ]
  })
}

// Edit view: all data
async function getUserForEdit(userId: string) {
  return await table.get({
    pk: `USER#${userId}`,
    sk: 'PROFILE'
    // No projection - need everything for editing
  })
}

Cost impact¶

Real-world example¶

Scenario: E-commerce order list (1 million views per month)

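Reading full items from the table (a projection expression does not change these numbers):

// Full order item: 20 KB
// RCU per item: 5 (strongly consistent)
// Total RCU: 5 million
// Cost: $0.25 per million = $1.25

Reading from a secondary index that projects only the list attributes:

// Projected index entry: 500 bytes
// RCU per item: 1
// Total RCU: 1 million
// Cost: $0.25 per million = $0.25
// Savings: $1.00/month (80%), before the index's own write and storage costs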

Cost comparison table¶

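RCU figures assume strongly consistent reads (1 RCU per 4 KB). A projection expression leaves the first RCU column unchanged; the second RCU column applies when the needed attributes are read from a secondary index that projects only them.

Item Size	Attributes Needed	RCU Reading Full Item	RCU Reading Narrow Index Entry	Savings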
1 KB	100 bytes	1 RCU	1 RCU	0%
4 KB	400 bytes	1 RCU	1 RCU	0%
5 KB	500 bytes	2 RCU	1 RCU	50%
10 KB	1 KB	3 RCU	1 RCU	67%
20 KB	2 KB	5 RCU	1 RCU	80%
50 KB	5 KB	13 RCU	2 RCU	85%
100 KB	10 KB	25 RCU	3 RCU	88%

Monthly cost impact¶

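For 10 million reads per month at an assumed $0.25 per million RCU (a projection expression leaves RCU per read unchanged, so the second row assumes reads served from a secondary index that projects only the needed 2 KB):

Scenario	RCU per Read	Total RCU	Monthly Cost	Annual Cost
Full Items from Table (20 KB)	5	50 million	$12.50	$150
Narrow Index Entries (2 KB)	1	10 million	$2.50	$30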
Savings	-	40 million	$10/month	$120/year
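
The figures in this table can be reproduced with a few lines of arithmetic, which makes it easy to plug in your own read volume and price. This is a minimal sketch: `estimateReadCost` is a hypothetical helper, and the $0.25 per million RCU price is the assumption used in this section, not a quoted rate.

```typescript
interface ReadCostEstimate {
  totalRcu: number
  monthlyCost: number
  annualCost: number
}

// Hypothetical helper: cost of a monthly read volume at a given price per million RCU.
function estimateReadCost(
  readsPerMonth: number,
  rcuPerRead: number,
  pricePerMillionRcu: number
): ReadCostEstimate {
  const totalRcu = readsPerMonth * rcuPerRead
  const monthlyCost = (totalRcu / 1_000_000) * pricePerMillionRcu
  return { totalRcu, monthlyCost, annualCost: monthlyCost * 12 }
}

// 10 million reads per month at the assumed price of $0.25 per million RCU.
const largeReads = estimateReadCost(10_000_000, 5, 0.25) // 5 RCU per read
const smallReads = estimateReadCost(10_000_000, 1, 0.25) // 1 RCU per read
const monthlySavings = largeReads.monthlyCost - smallReads.monthlyCost
```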

Detection¶

The anti-pattern detector can identify missing projections:

import { StatsCollector, AntiPatternDetector } from '@ddb-lib/stats'

const stats = new StatsCollector()
const detector = new AntiPatternDetector(stats)

// After running operations
const issues = detector.detectMissingProjections()

for (const issue of issues) {
  console.log(issue.message)
  // "Query on 'Orders' table retrieving full items (avg 18 KB)"
  // "Consider using projection expressions to reduce capacity consumption"
  // "Potential savings: 4 RCU per item (80%)"
}

Warning signs¶

You might have this anti-pattern if:

Your read capacity costs are higher than expected
You're transferring large amounts of data
Your application only uses a few attributes from items
You see high RCU consumption on queries
Network transfer is slow

Performance impact¶

Latency improvement¶

Item Size	Without Projection	With Projection	Improvement
10 KB	25ms	15ms	40% faster
50 KB	80ms	20ms	75% faster
100 KB	150ms	25ms	83% faster

Throughput improvement¶

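With the same provisioned capacity:

Provisioned: 1,000 RCU

Reading full 20 KB items (5 RCU per item, with or without a projection expression):
- Throughput: 200 items/second

Reading 2 KB entries from an index that projects only the needed attributes (1 RCU per item):
- Throughput: 1,000 items/second
- 5x improvement!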

Best practices¶

Analyze your access patterns¶

// Identify which of the attributes your code relies on are present in the items
function analyzeAttributeUsage(items: Record<string, unknown>[]) {
  if (items.length === 0) return  // Nothing to analyze; also avoids reading items[0]

  const usedAttributes = new Set<string>()

  for (const item of items) {
    // List the attributes your code reads; `in` also counts falsy values such as '' or 0
    if ('name' in item) usedAttributes.add('name')
    if ('email' in item) usedAttributes.add('email')
    // ... etc
  }

  console.log('Used attributes:', Array.from(usedAttributes))
  console.log('Total attributes in items:', Object.keys(items[0]).length)

  // Use this to create optimal projection expressions
}
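
The function above relies on a hand-maintained list of attribute checks. To observe which attributes code actually reads, one option is to wrap items in a `Proxy` during development or tests. This is a sketch under that assumption: `trackAccess` and the sample order are hypothetical and not part of `@ddb-lib/client`.

```typescript
// Hypothetical helper: records every top-level attribute read from `item`.
function trackAccess<T extends object>(item: T, used: Set<string>): T {
  return new Proxy(item, {
    get(target, prop, receiver) {
      if (typeof prop === 'string') used.add(prop)
      return Reflect.get(target, prop, receiver)
    }
  })
}

const used = new Set<string>()

// Sample item standing in for a row returned by a query.
const order = trackAccess(
  {
    orderId: 'o-1',
    status: 'SHIPPED',
    shipping: { carrier: 'ACME' },
    lineItems: [{ sku: 'A1', quantity: 2 }]
  },
  used
)

// Application code under observation: a list view that shows two fields.
const label = `${order.orderId}: ${order.status}`

// Object.keys does not go through the `get` trap, so it does not count as a read.
const unused = Object.keys(order).filter(key => !used.has(key))
const suggestedProjection = Array.from(used)
```

Attributes that never appear in `used` across representative requests are candidates to leave out of the projection.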

Create projection helpers¶

// Define projections for common use cases
const PROJECTIONS = {
  userList: ['userId', 'name', 'email', 'status'],
  userCard: ['userId', 'name', 'profileImage', 'memberSince'],
  userDetail: ['userId', 'name', 'email', 'phone', 'address', 'preferences'],
  orderList: ['orderId', 'status', 'total', 'createdAt'],
  orderDetail: ['orderId', 'status', 'total', 'items', 'shipping', 'createdAt']
}

// Use in queries
const users = await table.query({
  keyCondition: { pk: 'USERS' },
  projection: PROJECTIONS.userList
})
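
Projection lists written as plain string arrays can drift from the item shape. A possible refinement is to check attribute names at compile time and derive the result type from the same list. In this sketch, the `User` interface, `projectionFor`, and `pickAttributes` are all hypothetical, and whether `TableClient` accepts or returns such types is not known here. `pickAttributes` only stands in for a projected read so the types can be exercised without a table.

```typescript
interface User {
  userId: string
  name: string
  email: string
  status: string
  phone: string
}

// Returns a builder that only accepts attribute names that exist on T.
function projectionFor<T>() {
  return <K extends keyof T & string>(...attributes: K[]): readonly K[] => attributes
}

const userProjection = projectionFor<User>()

const PROJECTIONS = {
  userList: userProjection('userId', 'name', 'email', 'status'),
  userContact: userProjection('userId', 'email', 'phone')
  // userProjection('usrId') would fail to compile
}

// The shape of an item after applying projection P to T.
type Projected<T, P extends readonly (keyof T)[]> = Pick<T, P[number]>
type UserListItem = Projected<User, typeof PROJECTIONS.userList>

// Stand-in for a projected read: copies only the listed attributes.
function pickAttributes<T extends object, K extends keyof T>(
  item: T,
  attributes: readonly K[]
): Pick<T, K> {
  const result = {} as Pick<T, K>
  for (const attribute of attributes) {
    if (attribute in item) result[attribute] = item[attribute]
  }
  return result
}

const fullUser: User = {
  userId: 'u-1',
  name: 'Ada',
  email: 'ada@example.com',
  status: 'ACTIVE',
  phone: '555-0100'
}

const listItem: UserListItem = pickAttributes(fullUser, PROJECTIONS.userList)
```

Deriving `UserListItem` from the projection list means that code reading `listItem.phone` fails to compile instead of silently receiving `undefined` at runtime.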

Document projections¶

/**
 * Get user profile for list view
 * Projection: name, email, status (keeps the response to three small attributes)
 */
async function getUserListItem(userId: string) {
  return await table.get({
    pk: `USER#${userId}`,
    sk: 'PROFILE',
    projection: ['name', 'email', 'status']
  })
}

When to skip projections¶

Projections aren't always beneficial:

Skip projections when:¶

You need all attributes (edit forms, full details)
Items are small (<4 KB) and you need most attributes
Projection would include most attributes anyway (>80% of item)
Caching full items (better to cache complete data)

// OK: Skip projection when you need everything
async function getUserForEdit(userId: string) {
  return await table.get({
    pk: `USER#${userId}`,
    sk: 'PROFILE'
    // No projection - need all attributes for editing
  })
}

// OK: Small items where projection doesn't help much
async function getConfig(key: string) {
  return await table.get({
    pk: 'CONFIG',
    sk: key
    // Config items are tiny (< 1 KB), projection overhead not worth it
  })
}

Summary¶

The Problem: Retrieving entire items when you only need specific attributes means paying to read and transfer data you never use, often 50-90% of each item, and it increases latency.

The Solution: Use projection expressions to retrieve only the attributes you need for each use case.

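The Impact: Projection expressions shrink responses, which in the latency figures above makes reads 40-80% faster. They do not lower RCU by themselves, because DynamoDB meters the size of the item read; to cut read capacity consumption by 50-90%, read from a secondary index that projects only the needed attributes.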

Remember: Every byte you read from DynamoDB costs money and time. Use projection expressions to read only what you need. Your wallet and your users will thank you.