2025-09-05
AWS CDK Link Shortener Part 4: Production Deployment & Optimization
Multi-environment deployment strategies, performance optimization at scale, and cost management. Production insights and lessons learned with proper monitoring and incident response patterns.
Production Deployment & Optimization
Production optimization requires more than making things fast - it demands predictable performance under any load condition. When traffic spikes unexpectedly, infrastructure that works perfectly in staging can reveal scaling bottlenecks in production.
The most common oversight? Database provisioning for steady-state traffic rather than peak loads. A DynamoDB table optimized for normal operations can become a bottleneck when traffic increases 10x during campaigns or product launches.
In Parts 1-3, we built the foundation, core functionality, and security. This post covers making it production-ready: deployment, monitoring, and optimization.
Multi-Environment Deployment: Beyond Dev and Prod
Most tutorials show you dev and prod environments. In practice, you need at least four: dev, staging, pre-prod, and production. Here’s why and how to build them:
#!/usr/bin/env node
// bin/link-shortener.ts - The app entry point that got us through launch day
import 'source-map-support/register';
import * as cdk from 'aws-cdk-lib';
import { LinkShortenerStack } from '../lib/link-shortener-stack';
import { DatabaseStack } from '../lib/database-stack';
import { MonitoringStack } from '../lib/monitoring-stack';
const app = new cdk.App();
// Environment configuration that scales with your team
const environments = {
dev: {
account: process.env.CDK_DEFAULT_ACCOUNT,
region: 'us-west-2', // Separate region keeps dev away from production resources
stage: 'dev',
domain: 'dev-links.yourcompany.com',
customDomain: false,
monitoring: {
detailedMetrics: false,
logRetention: 7, // Days
alerting: false,
},
database: {
billingMode: 'PAY_PER_REQUEST',
pointInTimeRecovery: false,
backupRetention: 7,
},
lambda: {
reservedConcurrency: 10, // Limit dev costs
memorySize: 512,
timeout: 30,
}
},
staging: {
account: process.env.CDK_DEFAULT_ACCOUNT,
region: 'us-east-1',
stage: 'staging',
domain: 'staging-links.yourcompany.com',
customDomain: true,
monitoring: {
detailedMetrics: true,
logRetention: 14,
alerting: true,
},
database: {
billingMode: 'PAY_PER_REQUEST',
pointInTimeRecovery: true,
backupRetention: 14,
},
lambda: {
reservedConcurrency: 50,
memorySize: 1024,
timeout: 30,
}
},
'pre-prod': {
account: process.env.CDK_PREPROD_ACCOUNT,
region: 'us-east-1',
stage: 'pre-prod',
domain: 'pp-links.yourcompany.com',
customDomain: true,
monitoring: {
detailedMetrics: true,
logRetention: 30,
alerting: true,
},
database: {
billingMode: 'PROVISIONED', // Match production patterns
readCapacity: 100,
writeCapacity: 50,
pointInTimeRecovery: true,
backupRetention: 30,
},
lambda: {
reservedConcurrency: 200,
memorySize: 1024,
timeout: 30,
}
},
production: {
account: process.env.CDK_PROD_ACCOUNT,
region: 'us-east-1',
stage: 'prod',
domain: 'go.yourcompany.com',
customDomain: true,
monitoring: {
detailedMetrics: true,
logRetention: 90,
alerting: true,
dashboard: true,
},
database: {
billingMode: 'PROVISIONED',
readCapacity: 500, // Start conservative, auto-scale up
writeCapacity: 200,
pointInTimeRecovery: true,
backupRetention: 90,
globalTables: true, // Multi-region disaster recovery
},
lambda: {
reservedConcurrency: 1000, // Needs a raised account limit: the default is 1,000 and Lambda keeps 100 unreserved
memorySize: 1024,
timeout: 30,
provisionedConcurrency: 10, // Keep some functions warm
}
}
};
const stage = app.node.tryGetContext('stage') || 'dev';
const config = environments[stage as keyof typeof environments];
if (!config) {
throw new Error(`Invalid stage: ${stage}. Available stages: ${Object.keys(environments).join(', ')}`);
}
// Deploy in logical order with dependencies.
// Stack IDs use the context key, while constructs receive config.stage
// ('prod' for production), the value that stage checks such as
// props.stage === 'prod' compare against.
const databaseStack = new DatabaseStack(app, `LinkShortener-Database-${stage}`, {
env: { account: config.account, region: config.region },
stage: config.stage,
config: config.database,
});
const appStack = new LinkShortenerStack(app, `LinkShortener-App-${stage}`, {
env: { account: config.account, region: config.region },
stage: config.stage,
config,
database: databaseStack.database,
});
// Only deploy monitoring in staging+ environments
if (stage !== 'dev') {
new MonitoringStack(app, `LinkShortener-Monitoring-${stage}`, {
env: { account: config.account, region: config.region },
stage: config.stage,
config: config.monitoring,
appStack,
});
}
Why four environments? Each serves a specific purpose:
- Dev: Development isolation with cost controls for experimentation
- Staging: Integration testing with production-like data and configurations
- Pre-prod: Production replica for load testing and final validation
- Production: Live environment with full monitoring and redundancy
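The entry point falls back to `dev` when no `stage` context is passed, which is convenient locally but easy to trip over in a pipeline. A minimal sketch of a stricter resolver follows; `resolveStage`, `isStage`, and `STAGES` are hypothetical names that are not part of the original project, and the stage keys simply mirror the `environments` object above.

```typescript
// Hypothetical helper: names and behavior are illustrative, not from the original project.
const STAGES = ['dev', 'staging', 'pre-prod', 'production'] as const;
type Stage = (typeof STAGES)[number];

// Type guard that narrows an arbitrary string to a known stage key.
function isStage(value: string): value is Stage {
  return (STAGES as readonly string[]).includes(value);
}

// Resolve the stage from CDK context; fall back to 'dev' only when explicitly allowed.
function resolveStage(contextValue: unknown, allowDefault: boolean): Stage {
  if (contextValue === undefined || contextValue === null) {
    if (allowDefault) return 'dev';
    throw new Error(`Missing stage. Available stages: ${STAGES.join(', ')}`);
  }
  const value = String(contextValue);
  if (!isStage(value)) {
    throw new Error(`Invalid stage: ${value}. Available stages: ${STAGES.join(', ')}`);
  }
  return value;
}
```

In the entry point this could replace the `|| 'dev'` fallback, for example `resolveStage(app.node.tryGetContext('stage'), true)` on a laptop and `false` in CI, so that a forgotten `--context stage=...` fails loudly instead of silently deploying dev.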
Performance Optimization: Lambda Cold Starts and Beyond
Here are the optimizations that actually made a difference:
1. Lambda Configuration That Matters
// lib/constructs/optimized-lambda.ts
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as nodejs from 'aws-cdk-lib/aws-lambda-nodejs';
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
export interface OptimizedLambdaProps {
entry: string;
stage: string;
reservedConcurrency?: number;
provisionedConcurrency?: number;
memorySize?: number;
}
export class OptimizedLambda extends Construct {
public readonly function: nodejs.NodejsFunction;
constructor(scope: Construct, id: string, props: OptimizedLambdaProps) {
super(scope, id);
this.function = new nodejs.NodejsFunction(this, 'Function', {
entry: props.entry,
handler: 'handler',
runtime: lambda.Runtime.NODEJS_20_X,
// Memory configuration affects CPU - sweet spot for most workloads
memorySize: props.memorySize || 1024,
// 30s is just above API Gateway's default 29s integration timeout; lower it to fail fast
timeout: cdk.Duration.seconds(30),
// Environment variables for optimization
environment: {
NODE_OPTIONS: '--enable-source-maps',
AWS_NODEJS_CONNECTION_REUSE_ENABLED: '1', // Reuse TCP connections (needed for SDK v2; SDK v3 keeps connections alive by default)
POWERTOOLS_SERVICE_NAME: 'link-shortener',
POWERTOOLS_METRICS_NAMESPACE: 'LinkShortener',
},
// Bundle optimization
bundling: {
minify: true,
sourceMap: true,
target: 'es2022',
format: nodejs.OutputFormat.ESM,
banner: 'import { createRequire } from "module"; const require = createRequire(import.meta.url);',
externalModules: [
'@aws-sdk/*', // Don't bundle AWS SDK
],
esbuildArgs: {
'--tree-shaking': 'true',
'--platform': 'node',
'--target': 'node20',
},
},
// Reserved concurrency to prevent one function from eating all capacity
reservedConcurrency: props.reservedConcurrency,
// X-Ray tracing for performance insights
tracing: lambda.Tracing.ACTIVE,
// VPC configuration only if you need it (can add cold start latency)
// vpc: props.stage === 'prod' ? vpc : undefined,
});
// Provisioned concurrency for production critical paths
if (props.provisionedConcurrency && props.stage === 'prod') {
const version = this.function.currentVersion;
new lambda.Alias(this, 'ProductionAlias', {
aliasName: 'prod',
version,
// Provisioned concurrency is configured directly on the alias
provisionedConcurrentExecutions: props.provisionedConcurrency,
});
}
}
}
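Because Lambda allocates CPU in proportion to memory, a larger `memorySize` can shorten duration enough to offset its higher per-millisecond price. The arithmetic can be sketched as follows. `costPerMillionInvocations` and `LambdaPricing` are hypothetical names, and the rates in the example call are placeholder assumptions, so check current pricing for your region and architecture.

```typescript
// Hypothetical helper for comparing memory settings; not part of the original construct.
interface LambdaPricing {
  perGbSecond: number;        // price per GB-second of compute
  perMillionRequests: number; // price per one million invocations
}

// Estimated cost of one million invocations at a given memory size and average duration.
function costPerMillionInvocations(
  memoryMb: number,
  avgDurationMs: number,
  pricing: LambdaPricing
): number {
  const gbSecondsPerInvocation = (memoryMb / 1024) * (avgDurationMs / 1000);
  const computeCost = gbSecondsPerInvocation * 1_000_000 * pricing.perGbSecond;
  return computeCost + pricing.perMillionRequests;
}

// Assumed example rates, for illustration only.
const examplePricing: LambdaPricing = { perGbSecond: 0.00002, perMillionRequests: 0.2 };
const smallAndSlow = costPerMillionInvocations(512, 40, examplePricing);
const largeAndFast = costPerMillionInvocations(1024, 20, examplePricing);
console.log({ smallAndSlow, largeAndFast });
```

Whatever the rates, 512 MB at 40 ms and 1024 MB at 20 ms consume the same GB-seconds, so the larger setting comes out cheaper whenever it more than halves the duration. Measuring real durations at each setting is what makes the comparison meaningful.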
2. Connection Pooling That Actually Works
Creating new DynamoDB connections on every invocation was a major performance bottleneck. Here’s a connection manager that helps:
// src/utils/dynamodb-connection.ts
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient } from '@aws-sdk/lib-dynamodb';
// Global connection pool - survives between Lambda invocations
let dynamoClient: DynamoDBDocumentClient | null = null;
export function getDynamoClient(): DynamoDBDocumentClient {
if (!dynamoClient) {
const client = new DynamoDBClient({
region: process.env.AWS_REGION,
// Connection pooling configuration
maxAttempts: 3,
requestHandler: {
// Optimize for Lambda runtime
connectionTimeout: 1000, // 1s timeout
requestTimeout: 5000, // 5s total request timeout
// Connection pooling
httpsAgent: {
maxSockets: 10, // Reduced from default 50
keepAlive: true,
keepAliveMsecs: 30000,
},
},
// Credentials are left to the SDK's default provider chain, which reads the
// execution role's temporary credentials from the Lambda environment and caches them
});
dynamoClient = DynamoDBDocumentClient.from(client, {
marshallOptions: {
convertEmptyValues: false,
removeUndefinedValues: true,
convertClassInstanceToMap: false,
},
unmarshallOptions: {
wrapNumbers: false,
},
});
// Log connection creation for debugging
console.log('DynamoDB connection pool initialized');
}
return dynamoClient;
}
// Performance monitoring wrapper
export async function withPerformanceLogging<T>(
operation: string,
fn: () => Promise<T>
): Promise<T> {
const start = Date.now();
try {
const result = await fn();
const duration = Date.now() - start;
console.log(JSON.stringify({
operation,
duration,
success: true,
timestamp: new Date().toISOString(),
}));
return result;
} catch (error) {
const duration = Date.now() - start;
console.error(JSON.stringify({
operation,
duration,
success: false,
error: error instanceof Error ? error.message : String(error),
timestamp: new Date().toISOString(),
}));
throw error;
}
}
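The module-level `dynamoClient` variable works because Lambda keeps module scope alive between invocations that reuse the same execution environment. The same pattern generalizes to any expensive resource. A minimal sketch, assuming a hypothetical `lazy` helper that is not part of the original code:

```typescript
// Hypothetical helper: wraps a factory so it runs at most once per execution environment.
function lazy<T>(factory: () => T): () => T {
  let instance: T | undefined;
  let initialized = false;
  return () => {
    if (!initialized) {
      instance = factory();
      initialized = true; // only set after the factory succeeds, so a failure is retried
    }
    return instance as T;
  };
}

// Example: count how many times the factory actually runs.
let factoryCalls = 0;
const getConfig = lazy(() => {
  factoryCalls += 1;
  return { tableName: 'example-table' }; // placeholder value for illustration
});

getConfig();
getConfig();
console.log(factoryCalls); // 1
```

The first call pays the initialization cost, which on Lambda means the cold start; warm invocations get the cached instance.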
3. Production-Optimized Redirect Handler
// src/handlers/redirect-optimized.ts
import { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';
import { getDynamoClient, withPerformanceLogging } from '../utils/dynamodb-connection';
import { GetCommand, UpdateCommand } from '@aws-sdk/lib-dynamodb';
// Declare cold start tracking outside handler
let isColdStart = true;
export const handler = async (
event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
const startTime = Date.now();
const coldStart = isColdStart;
isColdStart = false;
// Extract short code from path
const shortCode = event.pathParameters?.proxy || event.pathParameters?.shortCode;
if (!shortCode) {
return createErrorResponse(404, 'Short code not found');
}
try {
const dynamodb = getDynamoClient();
// Optimized DynamoDB query with projection
const result = await withPerformanceLogging(
'GetShortUrl',
() => dynamodb.send(new GetCommand({
TableName: process.env.URLS_TABLE_NAME!,
Key: { shortCode },
ProjectionExpression: 'originalUrl, expiresAt, clickCount',
ConsistentRead: false, // Eventually consistent is fine for redirects
}))
);
if (!result.Item) {
// Log 404 for analytics; errors are caught so they never block the response
await logAnalyticsAsync('404', shortCode, event).catch(console.error);
return createErrorResponse(404, 'Link not found');
}
const { originalUrl, expiresAt } = result.Item;
// Check expiration
if (expiresAt && Date.now() > expiresAt) {
await logAnalyticsAsync('EXPIRED', shortCode, event).catch(console.error);
return createErrorResponse(410, 'Link has expired');
}
// Update click count and log the redirect. Lambda can freeze the execution
// environment as soon as the handler returns, so un-awaited promises may never
// complete; allSettled waits for both without letting a failure block the redirect.
await Promise.allSettled([
updateClickCountAsync(shortCode),
logAnalyticsAsync('SUCCESS', shortCode, event),
]);
const responseTime = Date.now() - startTime;
// Structured logging for monitoring
console.log(JSON.stringify({
event: 'redirect_success',
shortCode,
responseTime,
coldStart,
userAgent: event.headers['User-Agent']?.substring(0, 100),
referer: event.headers['Referer']?.substring(0, 100),
timestamp: new Date().toISOString(),
}));
return {
statusCode: 301, // Permanent redirect for caching
headers: {
Location: originalUrl,
'Cache-Control': 'public, max-age=300, s-maxage=3600', // 5min browser, 1hr CDN
'X-Response-Time': responseTime.toString(),
'X-Cold-Start': coldStart.toString(),
},
body: '',
};
} catch (error) {
const responseTime = Date.now() - startTime;
console.error(JSON.stringify({
event: 'redirect_error',
shortCode,
error: error instanceof Error ? error.message : String(error),
responseTime,
coldStart,
timestamp: new Date().toISOString(),
}));
return createErrorResponse(500, 'Internal server error');
}
};
async function updateClickCountAsync(shortCode: string): Promise<void> {
try {
const dynamodb = getDynamoClient();
await dynamodb.send(new UpdateCommand({
TableName: process.env.URLS_TABLE_NAME!,
Key: { shortCode },
UpdateExpression: 'ADD clickCount :inc SET lastClickAt = :timestamp',
ExpressionAttributeValues: {
':inc': 1,
':timestamp': Date.now(),
},
}));
} catch (error) {
// Don't fail redirect if analytics update fails
console.error('Failed to update click count:', error);
}
}
async function logAnalyticsAsync(
eventType: string,
shortCode: string,
event: APIGatewayProxyEvent
): Promise<void> {
// Implementation for async analytics logging
// This would typically write to a separate analytics table or queue
}
function createErrorResponse(statusCode: number, message: string): APIGatewayProxyResult {
return {
statusCode,
headers: {
'Content-Type': 'text/html',
'Cache-Control': 'no-cache',
},
body: `
<!DOCTYPE html>
<html>
<head><title>Error</title></head>
<body style="font-family: Arial, sans-serif; text-align: center; margin-top: 100px;">
<h1>${statusCode}</h1>
<p>${message}</p>
</body>
</html>
`,
};
}
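The handler reads `event.headers['User-Agent']` and `event.headers['Referer']` with exact casing. HTTP header names are case-insensitive, and the casing that reaches the function can vary with the client and with whatever sits in front of the API, so a case-insensitive lookup is a safer default. A minimal sketch, assuming a hypothetical `getHeader` helper:

```typescript
// Hypothetical helper; HeaderMap mirrors the shape of API Gateway proxy event headers.
type HeaderMap = Record<string, string | undefined>;

// Look up a header regardless of casing and truncate it for logging.
function getHeader(
  headers: HeaderMap | null | undefined,
  name: string,
  maxLength = 100
): string | undefined {
  if (!headers) return undefined;
  const wanted = name.toLowerCase();
  for (const [key, value] of Object.entries(headers)) {
    if (key.toLowerCase() === wanted && value !== undefined) {
      return value.substring(0, maxLength);
    }
  }
  return undefined;
}
```

In the structured log this would read `userAgent: getHeader(event.headers, 'User-Agent')`, keeping the 100-character truncation the handler already applies.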
Cost Optimization: Learning from Expensive Mistakes
Cost optimization becomes critical when traffic patterns change unexpectedly. Understanding how different AWS services scale and bill helps prevent budget surprises during high-traffic periods:
1. DynamoDB Optimization Strategy
// lib/database-stack-optimized.ts
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as applicationautoscaling from 'aws-cdk-lib/aws-applicationautoscaling';
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
export class OptimizedDatabaseStack extends Construct {
public readonly linksTable: dynamodb.Table;
constructor(scope: Construct, id: string, props: {
stage: string;
expectedReadsPerSecond: number;
expectedWritesPerSecond: number;
}) {
super(scope, id);
this.linksTable = new dynamodb.Table(this, 'LinksTable', {
partitionKey: {
name: 'shortCode',
type: dynamodb.AttributeType.STRING,
},
// Start with on-demand, switch to provisioned when you understand patterns
billingMode: props.stage === 'prod'
? dynamodb.BillingMode.PROVISIONED
: dynamodb.BillingMode.PAY_PER_REQUEST,
// Provisioned capacity for production
...(props.stage === 'prod' && {
readCapacity: Math.max(5, Math.ceil(props.expectedReadsPerSecond * 1.2)),
writeCapacity: Math.max(5, Math.ceil(props.expectedWritesPerSecond * 1.2)),
}),
pointInTimeRecovery: props.stage === 'prod',
deletionProtection: props.stage === 'prod',
// Encryption for compliance
encryption: dynamodb.TableEncryption.AWS_MANAGED,
stream: dynamodb.StreamViewType.NEW_AND_OLD_IMAGES, // For analytics
});
// Auto-scaling for production
if (props.stage === 'prod') {
this.setupAutoScaling();
}
// Global Secondary Index for analytics queries
this.linksTable.addGlobalSecondaryIndex({
indexName: 'UserIndex',
partitionKey: {
name: 'userId',
type: dynamodb.AttributeType.STRING,
},
sortKey: {
name: 'createdAt',
type: dynamodb.AttributeType.NUMBER,
},
projectionType: dynamodb.ProjectionType.KEYS_ONLY, // Minimize costs
// Same billing mode as main table
...(props.stage === 'prod' && {
readCapacity: Math.max(5, Math.ceil(props.expectedReadsPerSecond * 0.1)),
writeCapacity: Math.max(5, Math.ceil(props.expectedWritesPerSecond * 1.0)),
}),
});
}
private setupAutoScaling(): void {
// Read capacity auto-scaling
const readScaling = this.linksTable.autoScaleReadCapacity({
minCapacity: 5,
maxCapacity: 1000, // Reasonable ceiling
});
readScaling.scaleOnUtilization({
targetUtilizationPercent: 70, // Conservative target
scaleInCooldown: cdk.Duration.minutes(5),
scaleOutCooldown: cdk.Duration.minutes(1),
});
// Write capacity auto-scaling
const writeScaling = this.linksTable.autoScaleWriteCapacity({
minCapacity: 5,
maxCapacity: 500,
});
writeScaling.scaleOnUtilization({
targetUtilizationPercent: 70,
scaleInCooldown: cdk.Duration.minutes(5),
scaleOutCooldown: cdk.Duration.minutes(1),
});
}
}
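The stack sizes provisioned capacity as requests per second times 1.2, which implicitly treats every request as one capacity unit. DynamoDB meters reads in 4 KB units (halved for eventually consistent reads) and writes in 1 KB units, so item size and consistency mode change the number. A sketch of that arithmetic follows; the function names are hypothetical, and headroom is expressed in whole percent to avoid floating-point rounding surprises.

```typescript
// Read capacity units for a request rate, item size, and consistency mode.
function requiredReadCapacity(
  readsPerSecond: number,
  itemBytes: number,
  stronglyConsistent: boolean,
  headroomPercent: number
): number {
  // One RCU covers a strongly consistent read of up to 4 KB; eventually consistent reads cost half.
  const unitsPerRead = Math.ceil(itemBytes / 4096) * (stronglyConsistent ? 1 : 0.5);
  return Math.max(1, Math.ceil((readsPerSecond * unitsPerRead * (100 + headroomPercent)) / 100));
}

// Write capacity units for a request rate and item size.
function requiredWriteCapacity(
  writesPerSecond: number,
  itemBytes: number,
  headroomPercent: number
): number {
  // One WCU covers a write of up to 1 KB.
  const unitsPerWrite = Math.ceil(itemBytes / 1024);
  return Math.max(1, Math.ceil((writesPerSecond * unitsPerWrite * (100 + headroomPercent)) / 100));
}
```

For the redirect lookup, which uses `ConsistentRead: false` on small items, 1,000 reads per second with 20% headroom comes to 600 RCU rather than 1,200. Writes go the other way: a 1.5 KB item costs two WCU per write.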
2. CloudFront Configuration for Maximum Cost Efficiency
// lib/cdn-stack-optimized.ts
import * as cloudfront from 'aws-cdk-lib/aws-cloudfront';
import * as origins from 'aws-cdk-lib/aws-cloudfront-origins';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
export class OptimizedCDNStack extends Construct {
public readonly distribution: cloudfront.Distribution;
constructor(scope: Construct, id: string, props: {
apiGateway: apigateway.RestApi;
stage: string;
}) {
super(scope, id);
this.distribution = new cloudfront.Distribution(this, 'Distribution', {
defaultBehavior: {
origin: new origins.RestApiOrigin(props.apiGateway),
// Caching policy optimized for redirects
cachePolicy: new cloudfront.CachePolicy(this, 'RedirectCachePolicy', {
cachePolicyName: `link-shortener-${props.stage}`,
defaultTtl: cdk.Duration.minutes(5),
maxTtl: cdk.Duration.hours(24),
minTtl: cdk.Duration.minutes(1),
// Cache based on path only (ignore query strings and headers)
queryStringBehavior: cloudfront.CacheQueryStringBehavior.none(),
headerBehavior: cloudfront.CacheHeaderBehavior.none(),
cookieBehavior: cloudfront.CacheCookieBehavior.none(),
}),
// Compression saves bandwidth costs
compress: true,
// Only allow GET requests for redirects
allowedMethods: cloudfront.AllowedMethods.ALLOW_GET_HEAD,
cachedMethods: cloudfront.CachedMethods.CACHE_GET_HEAD,
viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
},
// Additional behavior for API endpoints (no caching)
additionalBehaviors: {
'/api/*': {
origin: new origins.RestApiOrigin(props.apiGateway),
cachePolicy: cloudfront.CachePolicy.CACHING_DISABLED,
viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
allowedMethods: cloudfront.AllowedMethods.ALLOW_ALL,
},
},
// Use the cheapest price class (US, Canada, Europe) in every stage
priceClass: cloudfront.PriceClass.PRICE_CLASS_100,
// Error handling
errorResponses: [
{
httpStatus: 404,
responseHttpStatus: 404,
responsePagePath: '/404.html',
ttl: cdk.Duration.minutes(5), // Cache 404s to prevent hammering origin
},
{
httpStatus: 500,
responseHttpStatus: 500,
responsePagePath: '/500.html',
ttl: cdk.Duration.minutes(1), // Short cache for server errors
},
],
// Enable logging for analytics (additional cost but necessary for insights)
...(props.stage === 'prod' && {
enableLogging: true,
logBucket: s3.Bucket.fromBucketName(this, 'LogsBucket', `cloudfront-logs-${props.stage}`),
logFilePrefix: 'link-shortener/',
}),
});
}
}
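The handler's `Cache-Control` header and the cache policy's TTL bounds interact. CloudFront takes the origin's `s-maxage` (or `max-age` when `s-maxage` is absent) and clamps it between the policy's minimum and maximum TTL; when the origin sends no caching directive it falls back to the default TTL. A simplified sketch of that rule, which ignores `Expires`, `no-cache`, and `no-store`; `effectiveTtlSeconds` and `TtlPolicy` are hypothetical names:

```typescript
// TTL bounds from a cache policy, all in seconds.
interface TtlPolicy {
  minTtl: number;
  defaultTtl: number;
  maxTtl: number;
}

// Approximate how long CloudFront keeps an object, given the origin's Cache-Control header.
function effectiveTtlSeconds(cacheControl: string | undefined, policy: TtlPolicy): number {
  // Extract a numeric directive such as max-age=300 from the header value.
  const directive = (name: string): number | undefined => {
    const match = cacheControl?.match(new RegExp(`(?:^|[\\s,])${name}=(\\d+)`, 'i'));
    return match ? Number(match[1]) : undefined;
  };
  // s-maxage applies to shared caches and takes precedence over max-age.
  const originTtl = directive('s-maxage') ?? directive('max-age');
  if (originTtl === undefined) {
    return Math.max(policy.minTtl, policy.defaultTtl);
  }
  return Math.min(Math.max(originTtl, policy.minTtl), policy.maxTtl);
}
```

With the values used in this post (`s-maxage=3600`, minimum 1 minute, maximum 24 hours) the sketch returns 3600, so a redirect would stay at the edge for an hour regardless of the 5-minute default. Clicks served from the edge never reach the Lambda function, which is worth keeping in mind when reading click counts.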
3. Cost Monitoring and Alerts
// lib/cost-monitoring-stack.ts
import * as cloudwatch from 'aws-cdk-lib/aws-cloudwatch';
import * as sns from 'aws-cdk-lib/aws-sns';
import * as subscriptions from 'aws-cdk-lib/aws-sns-subscriptions';
import * as actions from 'aws-cdk-lib/aws-cloudwatch-actions';
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
export class CostMonitoringStack extends Construct {
constructor(scope: Construct, id: string, props: {
stage: string;
alertEmail: string;
monthlyBudget: number;
}) {
super(scope, id);
// SNS topic for cost alerts
const alertTopic = new sns.Topic(this, 'CostAlerts', {
displayName: `Link Shortener Cost Alerts - ${props.stage}`,
});
alertTopic.addSubscription(
new subscriptions.EmailSubscription(props.alertEmail)
);
// DynamoDB cost monitoring
const dynamoReadAlarm = new cloudwatch.Alarm(this, 'DynamoReadUnitsHigh', {
metric: new cloudwatch.Metric({
namespace: 'AWS/DynamoDB',
metricName: 'ConsumedReadCapacityUnits',
dimensionsMap: {
TableName: 'LinksTable', // Replace with actual table name
},
statistic: 'Sum',
period: cdk.Duration.minutes(5),
}),
threshold: 1000, // Adjust based on your budget
evaluationPeriods: 2,
alarmDescription: 'DynamoDB read capacity usage is high',
});
dynamoReadAlarm.addAlarmAction(
new actions.SnsAction(alertTopic)
);
// Lambda invocation cost monitoring
const lambdaInvocationsAlarm = new cloudwatch.Alarm(this, 'LambdaInvocationsHigh', {
metric: new cloudwatch.Metric({
namespace: 'AWS/Lambda',
metricName: 'Invocations',
dimensionsMap: {
FunctionName: 'redirect-handler', // Replace with actual function name
},
statistic: 'Sum',
period: cdk.Duration.hours(1),
}),
threshold: 100000, // 100k invocations per hour
evaluationPeriods: 1,
alarmDescription: 'Lambda invocations are unusually high',
});
lambdaInvocationsAlarm.addAlarmAction(
new actions.SnsAction(alertTopic)
);
// Create cost dashboard
new cloudwatch.Dashboard(this, 'CostDashboard', {
dashboardName: `LinkShortener-Costs-${props.stage}`,
widgets: [
[
new cloudwatch.GraphWidget({
title: 'DynamoDB Read Capacity Units',
left: [dynamoReadAlarm.metric],
width: 12,
}),
],
[
new cloudwatch.GraphWidget({
title: 'Lambda Invocations',
left: [lambdaInvocationsAlarm.metric],
width: 12,
}),
],
[
new cloudwatch.GraphWidget({
title: 'CloudFront Requests',
left: [
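new cloudwatch.Metric({
namespace: 'AWS/CloudFront',
metricName: 'Requests',
dimensionsMap: {
DistributionId: 'REPLACE_WITH_DISTRIBUTION_ID', // Replace with actual distribution ID
Region: 'Global',
},
region: 'us-east-1', // CloudFront publishes its metrics in us-east-1
statistic: 'Sum',
period: cdk.Duration.hours(1),
}),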
],
width: 12,
}),
],
],
});
}
}
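The read alarm compares a five-minute `Sum` of consumed units against a fixed threshold, so the threshold is easier to reason about when it is derived from provisioned capacity. Ignoring burst capacity, consumed units over a period top out at provisioned units per second times the period length. A sketch with hypothetical helper names:

```typescript
// Alarm threshold (as a Sum over the period) that corresponds to a target utilization.
function sumThresholdForUtilization(
  provisionedUnitsPerSecond: number,
  periodSeconds: number,
  utilizationPercent: number
): number {
  return Math.floor((provisionedUnitsPerSecond * periodSeconds * utilizationPercent) / 100);
}

// Inverse view: the average per-second rate that a given Sum represents.
function averageUnitsPerSecond(sum: number, periodSeconds: number): number {
  return sum / periodSeconds;
}

console.log(sumThresholdForUtilization(500, 300, 80)); // 120000
```

For the production table's 500 RCU and a 5-minute period, 80% utilization corresponds to a Sum of 120,000, whereas a threshold of 1,000 corresponds to roughly 3.3 units per second. The lower figure may still be the right one if the goal is to flag any unexpected traffic, but it helps to know which question the alarm is answering.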
Production Monitoring: Beyond “It Works”
Here’s the monitoring approach that helped during production incidents:
1. Custom Metrics That Matter
// src/utils/metrics.ts
import { CloudWatchClient, PutMetricDataCommand, type MetricDatum } from '@aws-sdk/client-cloudwatch';
const cloudwatch = new CloudWatchClient({ region: process.env.AWS_REGION });
export class MetricsCollector {
private namespace = 'LinkShortener/Production';
// MetricDatum keeps Unit typed as StandardUnit, which PutMetricDataCommand expects
private metrics: MetricDatum[] = [];
async recordRedirectSuccess(shortCode: string, responseTime: number, coldStart: boolean): Promise<void> {
this.metrics.push(
{
MetricName: 'RedirectResponseTime',
Value: responseTime,
Unit: 'Milliseconds',
Timestamp: new Date(),
Dimensions: [
{ Name: 'ColdStart', Value: coldStart.toString() },
],
},
{
MetricName: 'RedirectCount',
Value: 1,
Unit: 'Count',
Timestamp: new Date(),
Dimensions: [
{ Name: 'Status', Value: 'Success' },
],
}
);
await this.flush();
}
async recordDatabaseLatency(operation: string, latency: number): Promise<void> {
this.metrics.push({
MetricName: 'DatabaseLatency',
Value: latency,
Unit: 'Milliseconds',
Timestamp: new Date(),
Dimensions: [
{ Name: 'Operation', Value: operation },
],
});
await this.flush();
}
async recordError(errorType: string, shortCode?: string): Promise<void> {
this.metrics.push({
MetricName: 'ErrorCount',
Value: 1,
Unit: 'Count',
Timestamp: new Date(),
Dimensions: [
{ Name: 'ErrorType', Value: errorType },
// Caution: every distinct ShortCode value creates a separately billed custom metric
...(shortCode ? [{ Name: 'ShortCode', Value: shortCode }] : []),
],
});
await this.flush();
}
private async flush(): Promise<void> {
if (this.metrics.length === 0) return;
try {
await cloudwatch.send(new PutMetricDataCommand({
Namespace: this.namespace,
MetricData: this.metrics,
}));
this.metrics = []; // Clear after successful send
} catch (error) {
console.error('Failed to send metrics:', error);
// Don't throw - metrics failures shouldn't break the main functionality
}
}
}
// Singleton instance
export const metrics = new MetricsCollector();
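Calling `PutMetricData` on every request adds a network round trip to the hot path. An alternative worth considering is the CloudWatch Embedded Metric Format, where the function writes a specially structured JSON log line and CloudWatch extracts the metrics asynchronously. A minimal sketch of building such a line by hand; `buildEmfLine` is a hypothetical helper, and the namespace and metric names are carried over from the collector above. Libraries such as Powertools for AWS Lambda can emit this format for you.

```typescript
// Subset of CloudWatch units used in this sketch.
type EmfUnit = 'Milliseconds' | 'Count';

// Build one Embedded Metric Format log line carrying a single metric.
function buildEmfLine(
  namespace: string,
  metric: { name: string; value: number; unit: EmfUnit },
  dimensions: Record<string, string>,
  timestampMs: number
): string {
  return JSON.stringify({
    _aws: {
      Timestamp: timestampMs,
      CloudWatchMetrics: [
        {
          Namespace: namespace,
          Dimensions: [Object.keys(dimensions)], // one dimension set, listing key names
          Metrics: [{ Name: metric.name, Unit: metric.unit }],
        },
      ],
    },
    ...dimensions,               // dimension values live at the top level
    [metric.name]: metric.value, // and so does the metric value
  });
}

// In a handler, writing the line to stdout is enough for it to reach CloudWatch Logs.
console.log(
  buildEmfLine(
    'LinkShortener/Production',
    { name: 'RedirectResponseTime', value: 42, unit: 'Milliseconds' },
    { ColdStart: 'false' },
    Date.now()
  )
);
```

The trade-off is that metrics arrive through the log pipeline rather than immediately, which is usually acceptable for dashboards and alarms with one-minute periods.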
2. Load Testing That Simulates Reality
// tests/load-test.ts - Load testing that helps catch scaling issues
import { performance } from 'perf_hooks';
interface LoadTestConfig {
baseUrl: string;
concurrentUsers: number;
testDurationMs: number;
rampUpMs: number;
shortCodes: string[];
}
interface LoadTestResult {
totalRequests: number;
successfulRequests: number;
failedRequests: number;
averageResponseTime: number;
p50ResponseTime: number;
p95ResponseTime: number;
p99ResponseTime: number;
errorsPerSecond: number;
requestsPerSecond: number;
}
export async function runLoadTest(config: LoadTestConfig): Promise<LoadTestResult> {
const results: Array<{
success: boolean;
responseTime: number;
timestamp: number;
error?: string;
}> = [];
const startTime = performance.now();
const endTime = startTime + config.testDurationMs;
// Create promise for each concurrent user
const userPromises = Array.from({ length: config.concurrentUsers }, async (_, userIndex) => {
// Stagger user start times during ramp-up
const userStartDelay = (config.rampUpMs * userIndex) / config.concurrentUsers;
await sleep(userStartDelay);
while (performance.now() < endTime) {
const requestStart = performance.now();
try {
// Random short code selection
const shortCode = config.shortCodes[Math.floor(Math.random() * config.shortCodes.length)];
const url = `${config.baseUrl}/${shortCode}`;
const response = await fetch(url, {
method: 'GET',
redirect: 'manual', // Don't follow redirects - we just want timing
});
const responseTime = performance.now() - requestStart;
results.push({
success: response.status >= 200 && response.status < 400,
responseTime,
timestamp: performance.now(),
});
} catch (error) {
const responseTime = performance.now() - requestStart;
results.push({
success: false,
responseTime,
timestamp: performance.now(),
error: error instanceof Error ? error.message : String(error),
});
}
// Wait before next request (adjust for desired load)
await sleep(100 + Math.random() * 200); // 100-300ms between requests per user
}
});
// Wait for all users to complete
await Promise.all(userPromises);
// Calculate statistics
const successfulResults = results.filter(r => r.success);
const responseTimes = successfulResults.map(r => r.responseTime);
responseTimes.sort((a, b) => a - b);
const totalDurationSec = (performance.now() - startTime) / 1000;
return {
totalRequests: results.length,
successfulRequests: successfulResults.length,
failedRequests: results.length - successfulResults.length,
// Guard against NaN when no request succeeded
averageResponseTime: responseTimes.length > 0
? responseTimes.reduce((a, b) => a + b, 0) / responseTimes.length
: 0,
p50ResponseTime: responseTimes[Math.floor(responseTimes.length * 0.5)],
p95ResponseTime: responseTimes[Math.floor(responseTimes.length * 0.95)],
p99ResponseTime: responseTimes[Math.floor(responseTimes.length * 0.99)],
errorsPerSecond: (results.length - successfulResults.length) / totalDurationSec,
requestsPerSecond: results.length / totalDurationSec,
};
}
async function sleep(ms: number): Promise<void> {
return new Promise(resolve => setTimeout(resolve, ms));
}
// Example usage - run this before every deployment
async function validatePerformance() {
console.log('Running pre-deployment load test...');
const testConfig: LoadTestConfig = {
baseUrl: 'https://staging-links.yourcompany.com',
concurrentUsers: 50,
testDurationMs: 60 * 1000, // 1 minute
rampUpMs: 10 * 1000, // 10 second ramp-up
shortCodes: ['test1', 'test2', 'test3', 'popular-link', 'campaign-2024'],
};
const results = await runLoadTest(testConfig);
// Performance assertions
const maxAcceptableP95 = 500; // 500ms P95 response time
const maxAcceptableErrorRate = 0.01; // 1% error rate
if (results.p95ResponseTime > maxAcceptableP95) {
throw new Error(`P95 response time too high: ${results.p95ResponseTime}ms > ${maxAcceptableP95}ms`);
}
const errorRate = results.failedRequests / results.totalRequests;
if (errorRate > maxAcceptableErrorRate) {
throw new Error(`Error rate too high: ${(errorRate * 100).toFixed(2)}% > ${(maxAcceptableErrorRate * 100)}%`);
}
console.log('Load test passed:', results);
}
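Percentile calculations are easy to get subtly wrong at the edges, such as empty input or very small samples. A small nearest-rank helper makes those cases explicit. `percentile` is a hypothetical name, and nearest-rank is only one of several common percentile definitions.

```typescript
// Nearest-rank percentile: the smallest value such that at least p% of samples are <= it.
function percentile(values: readonly number[], p: number): number | undefined {
  if (values.length === 0) return undefined; // no samples, no percentile
  if (p <= 0 || p > 100) throw new RangeError('p must be in the range (0, 100]');
  const sorted = [...values].sort((a, b) => a - b);
  // Multiply before dividing so whole-number percentiles stay exact.
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

console.log(percentile([120, 80, 100], 50)); // 100
```

Returning `undefined` for an empty sample forces the caller to decide what a run with zero successful requests means, rather than silently comparing `undefined` against a threshold.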
Blue-Green Deployments: Deploy Without Fear
A deployment strategy that reduces deployment anxiety:
// deployment/blue-green-deploy.ts
import * as aws from '@aws-sdk/client-route53';
import * as lambda from '@aws-sdk/client-lambda';
interface DeploymentConfig {
stage: 'blue' | 'green';
domainName: string;
hostedZoneId: string;
healthCheckUrl: string;
}
export class BlueGreenDeployment {
private route53 = new aws.Route53Client({});
private lambdaClient = new lambda.LambdaClient({});
async deployNewVersion(config: DeploymentConfig): Promise<void> {
console.log(`Starting ${config.stage} deployment...`);
// Step 1: Deploy new infrastructure
await this.deployCDKStack(config.stage);
// Step 2: Warm up the new environment
await this.warmUpEnvironment(config);
// Step 3: Run health checks
await this.runHealthChecks(config.healthCheckUrl);
// Step 4: Gradually shift traffic
await this.shiftTraffic(config, [10, 25, 50, 100]);
console.log(`${config.stage} deployment completed successfully`);
}
private async deployCDKStack(stage: string): Promise<void> {
// This would typically use CDK CLI or AWS SDK to deploy
console.log(`Deploying CDK stack for ${stage}...`);
// Example: exec CDK deploy command
const { spawn } = await import('child_process');
return new Promise((resolve, reject) => {
const deploy = spawn('npx', ['cdk', 'deploy', '--all', '--context', `stage=${stage}`], {
stdio: 'inherit',
});
deploy.on('close', (code) => {
if (code === 0) {
resolve();
} else {
reject(new Error(`CDK deploy failed with code ${code}`));
}
});
});
}
private async warmUpEnvironment(config: DeploymentConfig): Promise<void> {
console.log('Warming up Lambda functions...');
// Get Lambda functions for this stage. ListFunctions returns at most 50 functions
// per call, so accounts with more functions need to paginate with NextMarker.
const functions = await this.lambdaClient.send(new lambda.ListFunctionsCommand({
MaxItems: 50,
}));
const stageFunctions = functions.Functions?.filter(fn =>
fn.FunctionName?.includes(config.stage)
) || [];
// Warm up each function
const warmUpPromises = stageFunctions.map(async (fn) => {
if (!fn.FunctionName) return;
try {
await this.lambdaClient.send(new lambda.InvokeCommand({
FunctionName: fn.FunctionName,
Payload: JSON.stringify({
source: 'warm-up',
warmUp: true,
}),
}));
console.log(`Warmed up ${fn.FunctionName}`);
} catch (error) {
console.warn(`[WARN] Failed to warm up ${fn.FunctionName}:`, error);
}
});
await Promise.all(warmUpPromises);
}
private async runHealthChecks(healthCheckUrl: string): Promise<void> {
console.log('Running health checks...');
const checks = [
{ name: 'Basic redirect', path: '/test-redirect' },
{ name: 'API health', path: '/api/health' },
{ name: '404 handling', path: '/non-existent-link' },
];
for (const check of checks) {
const url = `${healthCheckUrl}${check.path}`;
const response = await fetch(url);
// Different expectations for different endpoints
const expectedStatus = check.path === '/non-existent-link' ? 404 : 200;
if (response.status !== expectedStatus) {
throw new Error(`Health check failed for ${check.name}: ${response.status}`);
}
console.log(`${check.name} health check passed`);
}
}
private async shiftTraffic(
config: DeploymentConfig,
trafficPercentages: number[]
): Promise<void> {
for (const percentage of trafficPercentages) {
console.log(`Shifting ${percentage}% traffic to ${config.stage}...`);
// Update Route53 weighted routing
await this.updateRoute53WeightedRecord(config, percentage);
// Wait for DNS propagation and monitoring
await this.sleep(120000); // 2 minutes
// Check error rates during traffic shift
await this.monitorErrorRates(config);
console.log(`${percentage}% traffic shifted successfully`);
}
}
private async updateRoute53WeightedRecord(
config: DeploymentConfig,
weight: number
): Promise<void> {
const oppositeWeight = 100 - weight;
const oppositeStage = config.stage === 'blue' ? 'green' : 'blue';
// Build the weighted record for one stage
const weightedRecord = (stage: string, stageWeight: number) => ({
Action: 'UPSERT' as const,
ResourceRecordSet: {
Name: config.domainName,
Type: 'CNAME' as const,
SetIdentifier: stage,
Weight: stageWeight,
TTL: 60, // Short TTL for quick changes
ResourceRecords: [{ Value: `${stage}-api.example.com` }],
},
});
// Update both weights in one change batch so they are applied together
await this.route53.send(new aws.ChangeResourceRecordSetsCommand({
HostedZoneId: config.hostedZoneId,
ChangeBatch: {
Changes: [
weightedRecord(config.stage, weight),
weightedRecord(oppositeStage, oppositeWeight),
],
},
}));
}
private async monitorErrorRates(config: DeploymentConfig): Promise<void> {
// This would integrate with CloudWatch to check error rates
// and automatically roll back if error rates exceed threshold
console.log('Monitoring error rates...');
// Example: Check CloudWatch metrics
// If error rate > 1%, rollback
// If response time P95 > 500ms, rollback
await this.sleep(30000); // Monitor for 30 seconds
}
async rollback(config: DeploymentConfig): Promise<void> {
console.log(`Rolling back ${config.stage} deployment...`);
// Shift all traffic back to stable version
const stableStage = config.stage === 'blue' ? 'green' : 'blue';
await this.updateRoute53WeightedRecord({
...config,
stage: stableStage,
}, 100);
console.log('Rollback completed');
}
private async sleep(ms: number): Promise<void> {
return new Promise(resolve => setTimeout(resolve, ms));
}
}
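The `monitorErrorRates` method above is a placeholder that only sleeps. A minimal sketch of the decision its comments describe, using the same 1% error-rate and 500 ms P95 thresholds, might look like the following. The `TrafficSample` shape and the `shouldRollback` name are hypothetical, and fetching the numbers from CloudWatch is deliberately left out.

```typescript
// Hypothetical snapshot of metrics gathered during one traffic-shift step.
interface TrafficSample {
  requests: number;     // total requests observed in the window
  errors: number;       // requests that failed (for example, 5xx responses)
  p95LatencyMs: number; // 95th percentile response time in milliseconds
}

interface RollbackThresholds {
  maxErrorRate: number;    // fraction of requests, 0.01 = 1%
  maxP95LatencyMs: number; // milliseconds
}

// Thresholds taken from the comments in monitorErrorRates.
const DEFAULT_THRESHOLDS: RollbackThresholds = {
  maxErrorRate: 0.01,
  maxP95LatencyMs: 500,
};

// Returns a human-readable reason when a threshold is breached,
// or null when the sample looks healthy.
function shouldRollback(
  sample: TrafficSample,
  thresholds: RollbackThresholds = DEFAULT_THRESHOLDS
): string | null {
  // No traffic means nothing to judge; avoid dividing by zero.
  if (sample.requests === 0) return null;

  const errorRate = sample.errors / sample.requests;
  if (errorRate > thresholds.maxErrorRate) {
    return `error rate ${(errorRate * 100).toFixed(2)}% is above the limit`;
  }
  if (sample.p95LatencyMs > thresholds.maxP95LatencyMs) {
    return `P95 latency ${sample.p95LatencyMs}ms is above the limit`;
  }
  return null;
}
```

Inside `monitorErrorRates`, a non-null result could be thrown as an error so that the caller invokes `rollback(config)` before shifting the next percentage. Keeping the decision in a pure function also makes the thresholds easy to unit test without touching AWS.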
Production Optimization Considerations
Running production infrastructure reveals important patterns about scaling and cost management:
1. Conservative provisioning with aggressive monitoring: Start with minimal capacity and rely on auto-scaling. Over-provisioning increases costs without improving reliability for most workloads.
2. Cold start impact on user experience: Even 2-3 seconds of cold start latency significantly degrades redirect performance. Provisioned concurrency for critical paths often justifies the additional cost.
3. DynamoDB auto-scaling timing: Auto-scaling can take 5-10 minutes to increase capacity, so a sudden spike may be throttled before new capacity arrives. Setting target utilization at 70% instead of 90% leaves a buffer for traffic spikes.
4. Business metrics over technical metrics: Tracking “redirects per campaign” and “conversion-generating links” provides more actionable insights than raw “Lambda invocations.” Business context helps prioritize optimization efforts.
5. Staging load testing effectiveness: Comprehensive load testing catches most production issues, but real user patterns often differ from synthetic tests. Focus on simulating actual traffic patterns rather than theoretical peak loads.
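To make the 70% versus 90% target in point 3 concrete, here is a small arithmetic sketch. It assumes auto-scaling settles provisioned capacity at roughly consumed capacity divided by the target utilization, and it ignores DynamoDB burst capacity. The function names and the 630 RCU steady-state figure are hypothetical.

```typescript
// Validate a target utilization expressed as a whole-number percentage.
function assertValidTarget(targetUtilizationPercent: number): void {
  if (targetUtilizationPercent <= 0 || targetUtilizationPercent > 100) {
    throw new RangeError('targetUtilizationPercent must be in (0, 100]');
  }
}

// Capacity that auto-scaling would settle on for a given steady-state load:
// provisioned = consumed / (target / 100), rounded up to whole units.
function provisionedCapacity(
  consumedUnits: number,
  targetUtilizationPercent: number
): number {
  assertValidTarget(targetUtilizationPercent);
  return Math.ceil((consumedUnits * 100) / targetUtilizationPercent);
}

// Factor by which traffic can grow before it exceeds provisioned capacity.
function spikeHeadroom(targetUtilizationPercent: number): number {
  assertValidTarget(targetUtilizationPercent);
  return 100 / targetUtilizationPercent;
}

const steadyStateRcu = 630; // hypothetical steady-state read load
for (const target of [90, 70]) {
  const provisioned = provisionedCapacity(steadyStateRcu, target);
  const headroom = spikeHeadroom(target).toFixed(2);
  console.log(`target ${target}%: ${provisioned} RCU provisioned, ${headroom}x headroom`);
}
```

Under these assumptions, a 90% target leaves room for roughly an 11% jump in traffic before throttling starts, while a 70% target leaves room for roughly 43%, at the cost of paying for about 29% more provisioned capacity (900 versus 700 RCU in this example). In CDK, the matching setting is the `targetUtilizationPercent` property passed to `scaleOnUtilization` on a table's auto-scaling configuration.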
Production Metrics That Matter
Here is the dashboard that provides useful daily insights:
// lib/production-dashboard.ts
import * as cdk from 'aws-cdk-lib';
import * as cloudwatch from 'aws-cdk-lib/aws-cloudwatch';
import { Construct } from 'constructs';
export class ProductionDashboard extends Construct {
constructor(scope: Construct, id: string) {
super(scope, id);
new cloudwatch.Dashboard(this, 'LinkShortenerProduction', {
dashboardName: 'LinkShortener-Production-Health',
widgets: [
// Row 1: Business metrics
[
new cloudwatch.SingleValueWidget({
title: 'Redirects (24h)',
metrics: [
new cloudwatch.Metric({
namespace: 'LinkShortener/Production',
metricName: 'RedirectCount',
statistic: 'Sum',
period: cdk.Duration.hours(24),
}),
],
width: 6,
}),
new cloudwatch.SingleValueWidget({
title: 'Success Rate (24h)',
metrics: [
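new cloudwatch.MathExpression({
expression: '(successful / total) * 100',
usingMetrics: {
successful: new cloudwatch.Metric({
namespace: 'LinkShortener/Production',
metricName: 'RedirectCount',
dimensionsMap: { Status: 'Success' },
statistic: 'Sum',
}),
// CloudWatch does not roll up across dimensions, so this total only has
// data if RedirectCount is also published without any dimensions
total: new cloudwatch.Metric({
namespace: 'LinkShortener/Production',
metricName: 'RedirectCount',
statistic: 'Sum',
}),
},
// Evaluate over 24 hours to match the widget title (the default is 5 minutes)
period: cdk.Duration.hours(24),
}),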
],
width: 6,
}),
],
// Row 2: Performance metrics
[
new cloudwatch.GraphWidget({
title: 'Response Time Percentiles',
left: [
new cloudwatch.Metric({
namespace: 'LinkShortener/Production',
metricName: 'RedirectResponseTime',
statistic: 'p50',
period: cdk.Duration.minutes(5),
label: 'P50',
}),
new cloudwatch.Metric({
namespace: 'LinkShortener/Production',
metricName: 'RedirectResponseTime',
statistic: 'p95',
period: cdk.Duration.minutes(5),
label: 'P95',
}),
new cloudwatch.Metric({
namespace: 'LinkShortener/Production',
metricName: 'RedirectResponseTime',
statistic: 'p99',
period: cdk.Duration.minutes(5),
label: 'P99',
}),
],
width: 12,
}),
],
// Row 3: Infrastructure health
[
new cloudwatch.GraphWidget({
title: 'DynamoDB Throttling',
left: [
// AWS/DynamoDB reports per-table throttling as ReadThrottleEvents / WriteThrottleEvents
new cloudwatch.Metric({
namespace: 'AWS/DynamoDB',
metricName: 'ReadThrottleEvents',
dimensionsMap: { TableName: 'LinksTable' },
statistic: 'Sum',
}),
new cloudwatch.Metric({
namespace: 'AWS/DynamoDB',
metricName: 'WriteThrottleEvents',
dimensionsMap: { TableName: 'LinksTable' },
statistic: 'Sum',
}),
],
width: 6,
}),
new cloudwatch.GraphWidget({
title: 'Lambda Cold Starts',
left: [
new cloudwatch.Metric({
namespace: 'LinkShortener/Production',
metricName: 'RedirectCount',
dimensionsMap: { ColdStart: 'true' },
statistic: 'Sum',
period: cdk.Duration.minutes(5),
}),
],
width: 6,
}),
],
],
});
}
}
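The dashboard reads `RedirectCount` three ways: without dimensions, filtered by `Status`, and filtered by `ColdStart`. How the series publishes these metrics is not shown in this section, so the following is only a sketch of one way to emit all three views from the Lambda handler, using the CloudWatch Embedded Metric Format (EMF). The `buildRedirectMetric` helper and the `'Failure'` status value are hypothetical. The sketch also relies on EMF accepting an empty dimension set for the dimensionless total, which is worth confirming against the current specification.

```typescript
// Shape of an EMF log record, reduced to the fields this sketch uses.
interface EmfRecord {
  _aws: {
    Timestamp: number; // milliseconds since the Unix epoch
    CloudWatchMetrics: Array<{
      Namespace: string;
      Dimensions: string[][];
      Metrics: Array<{ Name: string; Unit: string }>;
    }>;
  };
  [key: string]: unknown;
}

// Build one EMF record for a single redirect.
function buildRedirectMetric(
  success: boolean,
  coldStart: boolean,
  responseTimeMs: number,
  timestamp: number = Date.now()
): EmfRecord {
  return {
    _aws: {
      Timestamp: timestamp,
      CloudWatchMetrics: [{
        Namespace: 'LinkShortener/Production',
        // Each dimension set yields its own metric: [] is the dimensionless
        // total, ['Status'] and ['ColdStart'] are the filtered views.
        Dimensions: [[], ['Status'], ['ColdStart']],
        Metrics: [
          { Name: 'RedirectCount', Unit: 'Count' },
          { Name: 'RedirectResponseTime', Unit: 'Milliseconds' },
        ],
      }],
    },
    // Dimension values must be strings at the top level of the record.
    Status: success ? 'Success' : 'Failure',
    ColdStart: String(coldStart),
    // Metric values referenced by name in the Metrics array above.
    RedirectCount: 1,
    RedirectResponseTime: responseTimeMs,
  };
}

// In a Lambda handler, writing the JSON record to stdout is enough for
// CloudWatch Logs to extract the metrics.
console.log(JSON.stringify(buildRedirectMetric(true, false, 42)));
```

Publishing the dimensionless series alongside the filtered ones matters because CloudWatch does not aggregate across dimensions on its own, so the `total` metric in the success-rate widget would otherwise stay empty.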
What’s Next?
In Part 5, we’ll tackle the final frontier: scaling to handle millions of redirects per day, cost optimization at scale, and the operational practices that let a small team manage a high-traffic service.
We’ll cover advanced topics like multi-region deployments, database sharding strategies, and monitoring that alerts you before users notice problems.
The infrastructure we’ve built scales well, but there are specific patterns that help services handle increasing load efficiently.
AWS CDK Link Shortener: From Zero to Production
A comprehensive 5-part series on building a production-grade link shortener service with AWS CDK, Node.js Lambda, and DynamoDB. Real war stories, performance optimization, and cost management included.