2025-09-04

Migrating from Serverless Framework to AWS CDK: Part 6 - Migration Strategies and Best Practices

Execute a smooth migration from Serverless Framework to AWS CDK with proven strategies, testing approaches, rollback procedures, and performance optimization techniques.

A CDK migration’s final phase is not about correctness of the generated infrastructure but about operational readiness during cut-over: a plan for taking traffic, a plan for reverting if the migration degrades production, and a plan for reconciling any drift that appears under live load. For a migration covering dozens of Lambda functions and multiple DynamoDB tables, the cut-over plan is larger than the migration code itself; rollback must be fast, granular, and tested before any production weight moves over.

This post is the final part of the Serverless Framework-to-CDK migration series. It covers the production cut-over strategy, the per-service rollback plan, the traffic-shaping patterns (canary, blue-green) that keep error budgets safe during migration, the data migration reconciliation step, and the post-migration verification that confirms the old and new systems emit identical behaviour.

Three Migration Approaches and Their Outcomes

Three different migration strategies were tested in production environments, each providing valuable insights:

Approach #1: Big Bang Migration

Implementation: Deploy all CDK infrastructure during a 4-hour maintenance window.

Issues encountered: CloudFormation stack deployment exceeded time estimates (6 hours). API Gateway stage deployment failed. DynamoDB import had data integrity issues affecting 3,000 records. Rollback required additional 4 hours.

Operational impact: 10 hours total downtime, significant service disruption, increased support volume.

Lesson: “Big bang” works for demo apps, not production systems with interdependencies.

Approach #2: Strangler Pattern Implementation

Implementation: Gradual function migration using traffic splitting.

Issues encountered: Complex function dependencies created cross-service call patterns. Authentication synchronization between systems failed. Performance degradation from increased latency.

Operational impact: Extended migration timeline from 3 weeks to 2 months. API performance issues reported.

Lesson: Strangler pattern requires careful dependency mapping and shared authentication.
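The dependency mapping this lesson calls for can be made mechanical. The sketch below is a hypothetical helper, not part of the original migration code: the `DependencyMap` shape and the function names in the example are assumptions. It orders functions so that each one is migrated only after the functions it calls, and it fails loudly on a dependency cycle.

```typescript
// Hypothetical dependency map: function name -> names of the functions it calls.
type DependencyMap = Record<string, string[]>;

// Returns a migration order in which every function appears after the
// functions it depends on. Throws if the map contains a cycle.
function migrationOrder(deps: DependencyMap): string[] {
  const order: string[] = [];
  const state = new Map<string, 'visiting' | 'done'>();

  const visit = (fn: string, path: string[]): void => {
    if (state.get(fn) === 'done') return;
    if (state.get(fn) === 'visiting') {
      throw new Error(`Dependency cycle: ${[...path, fn].join(' -> ')}`);
    }
    state.set(fn, 'visiting');
    for (const dep of deps[fn] ?? []) {
      visit(dep, [...path, fn]);
    }
    state.set(fn, 'done');
    order.push(fn);
  };

  for (const fn of Object.keys(deps)) visit(fn, []);
  return order;
}

// Example with made-up function names.
const exampleDeps: DependencyMap = {
  createOrder: ['getUser', 'authorize'],
  getUser: ['authorize'],
  authorize: [],
};
console.log(migrationOrder(exampleDeps)); // authorize, then getUser, then createOrder
```

Functions that share an authorizer end up after it in the order, which is one way to make sure shared authentication is in place before the first dependent function receives traffic.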

Approach #3: Blue-Green Deployment Success

Implementation: Full parallel deployment with instant traffic switching.

Results: Complete environment parity achieved. 30-second rollback capability. No data loss. Zero downtime.

Operational impact: Successful zero-downtime migration. Performance improved 40%. No service interruptions.

Effective approach: Blue-green deployment with comprehensive monitoring and automated rollback.

Production-Ready Migration Strategies

Blue-Green Deployment Strategy

Blue-green deployment proved most effective for production migrations:

// lib/stacks/production-blue-green-stack.ts
import { Stack, StackProps, Tags, CfnOutput, Duration } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { RestApi, MethodLoggingLevel, LambdaIntegration } from 'aws-cdk-lib/aws-apigateway';
// TreatMissingData is exported from the CloudWatch module, not from the aws-cdk-lib root
import { Alarm, ComparisonOperator, TreatMissingData } from 'aws-cdk-lib/aws-cloudwatch';
import { LambdaAction } from 'aws-cdk-lib/aws-cloudwatch-actions';
import { PolicyStatement } from 'aws-cdk-lib/aws-iam';
import { NodejsFunction } from 'aws-cdk-lib/aws-lambda-nodejs';

export interface BlueGreenStackProps extends StackProps {
  stage: string;
  environment: 'blue' | 'green';
  monitoringConfig: {
    errorThreshold: number;
    latencyThreshold: number;
    rollbackFunction: NodejsFunction;
  };
}

export class ProductionBlueGreenStack extends Stack {
  public readonly api: RestApi;
  public readonly healthCheckEndpoint: string;
  public readonly switchOverFunction: NodejsFunction;

  constructor(scope: Construct, id: string, props: BlueGreenStackProps) {
    super(scope, id, props);

    // Create the complete CDK infrastructure
    this.api = new RestApi(this, 'Api', {
      restApiName: `my-service-${props.stage}-${props.environment}`,
      description: `Production API - ${props.environment.toUpperCase()} environment`,
      deployOptions: {
        stageName: props.environment,
        // Aggressive throttling during migration for safety
        throttlingRateLimit: props.environment === 'green' ? 500 : 1000,
        throttlingBurstLimit: props.environment === 'green' ? 1000 : 2000,
        // Enhanced monitoring during migration
        metricsEnabled: true,
        loggingLevel: MethodLoggingLevel.INFO,
        dataTraceEnabled: true,
        tracingEnabled: true,
      },
    });

    // Deploy all Lambda functions
    const functions = this.createLambdaFunctions(props);

    // Set up API routes
    this.setupApiRoutes(functions);

    // Create health check endpoint for monitoring
    const healthCheckFn = new NodejsFunction(this, 'HealthCheckFunction', {
      entry: 'src/health/health-check.ts',
      handler: 'handler',
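      // Note: the handler below also reads USERS_TABLE and needs
      // dynamodb:DescribeTable permission on that table.
      environment: {
        ENVIRONMENT: props.environment,
        API_VERSION: process.env.API_VERSION || 'v1',
        // Evaluated at synth time, so every synth changes this function's configuration
        DEPLOYMENT_TIME: new Date().toISOString(),
      },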
    });

    const healthResource = this.api.root.addResource('health');
    healthResource.addMethod('GET', new LambdaIntegration(healthCheckFn));

    this.healthCheckEndpoint = `${this.api.url}health`;

    // Create production monitoring alarms
    this.createProductionAlarms(props);

    // Traffic switching function
    this.switchOverFunction = this.createSwitchOverFunction(props);

    // Tag all resources for identification
    Tags.of(this).add('Environment', props.environment);
    Tags.of(this).add('MigrationPhase', 'cdk-migration');
    // This value changes on every synth, so each deploy updates every taggable resource in the stack
    Tags.of(this).add('DeploymentTime', new Date().toISOString());
    Tags.of(this).add('Version', process.env.COMMIT_SHA || 'latest');

    // Export critical information
    new CfnOutput(this, 'ApiEndpoint', {
      value: this.api.url,
      exportName: `${this.stackName}-api-endpoint`,
      description: `API endpoint for ${props.environment} environment`,
    });

    new CfnOutput(this, 'HealthCheckUrl', {
      value: this.healthCheckEndpoint,
      exportName: `${this.stackName}-health-check`,
      description: 'Health check endpoint for monitoring',
    });
  }

  private createProductionAlarms(props: BlueGreenStackProps) {
    // Error rate alarm - triggers rollback
    const errorAlarm = new Alarm(this, 'HighErrorRateAlarm', {
      metric: this.api.metricServerError({
        period: Duration.minutes(2),
        statistic: 'Sum',
      }),
      threshold: props.monitoringConfig.errorThreshold,
      evaluationPeriods: 2,
      comparisonOperator: ComparisonOperator.GREATER_THAN_THRESHOLD,
      alarmDescription: `High error rate detected in ${props.environment} environment`,
      treatMissingData: TreatMissingData.NOT_BREACHING,
    });

    // Latency alarm - triggers investigation
    const latencyAlarm = new Alarm(this, 'HighLatencyAlarm', {
      metric: this.api.metricLatency({
        period: Duration.minutes(5),
        statistic: 'Average',
      }),
      threshold: props.monitoringConfig.latencyThreshold,
      evaluationPeriods: 3,
      alarmDescription: `High latency detected in ${props.environment} environment`,
    });

    // Connect alarms to automated rollback
    errorAlarm.addAlarmAction(
      new LambdaAction(props.monitoringConfig.rollbackFunction)
    );

    // Export alarm ARNs for external monitoring
    new CfnOutput(this, 'ErrorAlarmArn', {
      value: errorAlarm.alarmArn,
      exportName: `${this.stackName}-error-alarm`,
    });
  }

  private createSwitchOverFunction(props: BlueGreenStackProps) {
    return new NodejsFunction(this, 'TrafficSwitchFunction', {
      entry: 'src/deployment/traffic-switch.ts',
      handler: 'handler',
      timeout: Duration.minutes(5),
      environment: {
        CURRENT_ENVIRONMENT: props.environment,
        TARGET_ENVIRONMENT: props.environment === 'blue' ? 'green' : 'blue',
        HOSTED_ZONE_ID: process.env.HOSTED_ZONE_ID!,
        DOMAIN_NAME: process.env.API_DOMAIN!,
        SLACK_WEBHOOK_URL: process.env.SLACK_WEBHOOK_URL!,
      },
      initialPolicy: [
        new PolicyStatement({
          actions: ['route53:ChangeResourceRecordSets', 'route53:GetChange'],
          resources: ['*'],
        }),
      ],
    });
  }
}

// src/health/health-check.ts - Comprehensive health validation
import { APIGatewayProxyHandler } from 'aws-lambda';
import { DynamoDBClient, DescribeTableCommand } from '@aws-sdk/client-dynamodb';

const dynamoDB = new DynamoDBClient({});

export const handler: APIGatewayProxyHandler = async () => {
  const startTime = Date.now();
  const checks = [];

  try {
    // Database connectivity check
    const tableCheck = await dynamoDB.send(new DescribeTableCommand({
      TableName: process.env.USERS_TABLE!,
    }));
    checks.push({
      name: 'database',
      status: tableCheck.Table?.TableStatus === 'ACTIVE' ? 'healthy' : 'unhealthy',
      responseTime: Date.now() - startTime,
    });

    // Memory usage check
    const memoryUsed = process.memoryUsage();
    checks.push({
      name: 'memory',
      status: memoryUsed.heapUsed < 100 * 1024 * 1024 ? 'healthy' : 'warning', // 100MB threshold
      details: {
        heapUsed: Math.round(memoryUsed.heapUsed / 1024 / 1024) + 'MB',
        heapTotal: Math.round(memoryUsed.heapTotal / 1024 / 1024) + 'MB',
      },
    });

    const overallStatus = checks.every(check => check.status === 'healthy') ? 'healthy' : 'degraded';

    return {
      statusCode: overallStatus === 'healthy' ? 200 : 503,
      headers: {
        'Content-Type': 'application/json',
        'Cache-Control': 'no-cache',
      },
      body: JSON.stringify({
        status: overallStatus,
        environment: process.env.ENVIRONMENT,
        version: process.env.API_VERSION,
        deploymentTime: process.env.DEPLOYMENT_TIME,
        timestamp: new Date().toISOString(),
        responseTime: Date.now() - startTime,
        checks,
      }),
    };
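  } catch (error) {
    // `error` is `unknown` under strict TypeScript settings, so narrow it before reading `.message`
    const message = error instanceof Error ? error.message : String(error);
    return {
      statusCode: 503,
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        status: 'unhealthy',
        error: message,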
        timestamp: new Date().toISOString(),
      }),
    };
  }
};

2. Strangler Fig Pattern

When to use: Large applications requiring zero-downtime migration.

// lib/constructs/migration/traffic-splitter.ts
import { Construct } from 'constructs';
import { RestApi, Deployment, Stage, CfnStage } from 'aws-cdk-lib/aws-apigateway';
import { Alarm } from 'aws-cdk-lib/aws-cloudwatch';

export class TrafficSplitter extends Construct {
  constructor(scope: Construct, id: string, props: {
    legacyApiId: string;
    newApi: RestApi;
    trafficPercentageToNew: number;
  }) {
    super(scope, id);

    // Create canary deployment
    const deployment = new Deployment(this, 'CanaryDeployment', {
      api: props.newApi,
      description: `Canary deployment ${new Date().toISOString()}`,
    });

    const stage = new Stage(this, 'CanaryStage', {
      deployment,
      stageName: 'canary',
    });

    // StageProps does not expose canary settings, so set them on the
    // underlying CfnStage (the CanarySetting property of AWS::ApiGateway::Stage).
    const cfnStage = stage.node.defaultChild as CfnStage;
    cfnStage.canarySetting = {
      percentTraffic: props.trafficPercentageToNew,
      useStageCache: false,
    };

    // CloudWatch alarm on 5XX errors, scoped to the canary stage
    new Alarm(this, 'CanaryErrorAlarm', {
      metric: stage.metricServerError(),
      threshold: 5,
      evaluationPeriods: 2,
    });
  }
}

3. Blue-Green Deployment

When to use: When you need instant rollback capabilities.

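// lib/stacks/blue-green-stack.ts
import { Stack, Tags, CfnOutput } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { RestApi } from 'aws-cdk-lib/aws-apigateway';

export class BlueGreenStack extends Stack {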
  constructor(scope: Construct, id: string, props: {
    stage: string;
    version: 'blue' | 'green';
  }) {
    super(scope, id);

    const api = new RestApi(this, 'Api', {
      restApiName: `my-service-${props.stage}-${props.version}`,
      deployOptions: {
        stageName: props.version,
      },
    });

    // Tag resources for easy identification
    Tags.of(this).add('Deployment', props.version);
    Tags.of(this).add('Version', process.env.COMMIT_SHA || 'latest');

    // Export API endpoint
    new CfnOutput(this, 'ApiEndpoint', {
      value: api.url,
      exportName: `${this.stackName}-endpoint`,
    });
  }
}

// deployment-scripts/blue-green-switch.ts
import { Route53Client, ChangeResourceRecordSetsCommand } from '@aws-sdk/client-route53';

export async function switchTraffic(targetVersion: 'blue' | 'green') {
  const route53 = new Route53Client({});

  await route53.send(new ChangeResourceRecordSetsCommand({
    HostedZoneId: process.env.HOSTED_ZONE_ID,
    ChangeBatch: {
      Changes: [{
        Action: 'UPSERT',
        ResourceRecordSet: {
          Name: 'api.example.com',
          Type: 'CNAME',
          TTL: 60,
          ResourceRecords: [{
            Value: `api-${targetVersion}.execute-api.region.amazonaws.com`,
          }],
        },
      }],
    },
  }));
}
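Before the DNS record is changed, the target environment's health endpoint can be used as a gate. The following is a minimal sketch under assumptions: `evaluateSwitch`, `HealthReport`, and `SwitchDecision` are hypothetical names, and the report shape mirrors only the fields that the health-check handler shown earlier returns (`status`, `environment`, `checks`). Fetching the endpoint is left to the caller so that the decision logic stays pure and easy to test.

```typescript
// Subset of the JSON body returned by the health-check handler.
interface HealthReport {
  status: 'healthy' | 'degraded' | 'unhealthy';
  environment?: string;
  checks?: { name: string; status: string }[];
}

interface SwitchDecision {
  allowed: boolean;
  reason: string;
}

// Decides whether traffic may be switched to `target`, given the HTTP status
// and parsed body of that environment's health endpoint.
function evaluateSwitch(
  target: 'blue' | 'green',
  httpStatus: number,
  report: HealthReport,
): SwitchDecision {
  if (httpStatus !== 200) {
    return { allowed: false, reason: `health endpoint returned HTTP ${httpStatus}` };
  }
  if (report.environment !== target) {
    return {
      allowed: false,
      reason: `health endpoint reports environment "${report.environment}", expected "${target}"`,
    };
  }
  if (report.status !== 'healthy') {
    return { allowed: false, reason: `overall status is ${report.status}` };
  }
  const failing = (report.checks ?? [])
    .filter((check) => check.status !== 'healthy')
    .map((check) => check.name);
  if (failing.length > 0) {
    return { allowed: false, reason: `failing checks: ${failing.join(', ')}` };
  }
  return { allowed: true, reason: 'all checks healthy' };
}
```

A deployment script could call this with the response from the target's health URL and invoke `switchTraffic` only when `allowed` is true. Checking the reported environment guards against probing the wrong stack by mistake.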

Effective Testing Strategy for Production

Initial testing approaches were comprehensive but missed critical production scenarios. The revised strategy focused on actual production failure modes.

Testing Reality

Standard testing approach: Unit tests, integration tests, load tests - all passing.

Production failure modes discovered:

  • CloudFormation template size exceeding the service limit (51,200 bytes for an inline template body, 1 MB for a template uploaded to S3)
  • API Gateway timeout conflicts with Lambda timeout settings
  • DynamoDB throttling under peak traffic
  • JWT validation performance degradation at scale
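Some of these failure modes can be caught by a static check before deployment. The sketch below covers the timeout conflict only and rests on assumptions: `findTimeoutConflicts` and `FunctionTimeout` are hypothetical names, and the 29-second value is the default REST API integration timeout, passed as a parameter in case an account uses a different limit.

```typescript
// Default API Gateway REST integration timeout in seconds (assumed; adjust if your quota differs).
const DEFAULT_INTEGRATION_TIMEOUT_SECONDS = 29;

interface FunctionTimeout {
  functionName: string;
  timeoutSeconds: number;
}

// Returns the functions whose Lambda timeout is greater than or equal to the
// integration timeout. For those functions API Gateway can give up (HTTP 504)
// while the Lambda invocation is still running.
function findTimeoutConflicts(
  functions: FunctionTimeout[],
  integrationTimeoutSeconds: number = DEFAULT_INTEGRATION_TIMEOUT_SECONDS,
): FunctionTimeout[] {
  return functions.filter((fn) => fn.timeoutSeconds >= integrationTimeoutSeconds);
}

// Example with made-up function names.
const conflicts = findTimeoutConflicts([
  { functionName: 'getUser', timeoutSeconds: 10 },
  { functionName: 'exportReport', timeoutSeconds: 60 },
]);
console.log(conflicts.map((fn) => fn.functionName)); // only exportReport is flagged
```

A check like this fits into the infrastructure test suite, fed with the timeouts of every function that sits behind an API Gateway route.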

Production-Oriented Testing Strategy

This testing approach identifies critical issues before production deployment:

// test/infrastructure/api-stack.test.ts
import { Template, Match } from 'aws-cdk-lib/assertions';
import { App } from 'aws-cdk-lib';
import { ApiStack } from '../../lib/stacks/api-stack';

describe('ApiStack', () => {
  let template: Template;

  beforeAll(() => {
    const app = new App();
    const stack = new ApiStack(app, 'TestStack', {
      config: testConfig,
    });
    template = Template.fromStack(stack);
  });

  test('Lambda functions have correct runtime', () => {
    template.allResourcesProperties('AWS::Lambda::Function', {
      Runtime: 'nodejs20.x',
    });
  });

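  test('API Gateway has throttling enabled', () => {
    // Stage-level throttling is rendered under MethodSettings in the CloudFormation template
    template.hasResourceProperties('AWS::ApiGateway::Stage', {
      MethodSettings: Match.arrayWith([
        Match.objectLike({
          ThrottlingRateLimit: Match.anyValue(),
          ThrottlingBurstLimit: Match.anyValue(),
        }),
      ]),
    });
  });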

  test('DynamoDB tables have point-in-time recovery', () => {
    template.allResourcesProperties('AWS::DynamoDB::Table', {
      PointInTimeRecoverySpecification: {
        PointInTimeRecoveryEnabled: true,
      },
    });
  });
});

Integration Testing

// test/integration/api.test.ts
import { CloudFormationClient, ListExportsCommand } from '@aws-sdk/client-cloudformation';
import axios from 'axios';

describe('API Integration Tests', () => {
  let apiEndpoint: string;
  let authToken: string;

  beforeAll(async () => {
    // Get deployed API endpoint
    const cf = new CloudFormationClient({});
    const exports = await cf.send(new ListExportsCommand({}));
    apiEndpoint = exports.Exports?.find(
      e => e.Name === 'ApiStack-endpoint'
    )?.Value!;

    // Get auth token
    authToken = await getTestAuthToken();
  });

  test('Health check endpoint', async () => {
    const response = await axios.get(`${apiEndpoint}/health`);
    expect(response.status).toBe(200);
    // The health handler returns more fields than `status`, so match a subset
    expect(response.data).toMatchObject({ status: 'healthy' });
  });

  test('Create and retrieve user', async () => {
    // Create user
    const createResponse = await axios.post(
      `${apiEndpoint}/users`,
      { name: 'Test User', email: 'test.user@example.com' }, // placeholder address
      { headers: { Authorization: `Bearer ${authToken}` } }
    );
    expect(createResponse.status).toBe(201);

    // Retrieve user
    const userId = createResponse.data.userId;
    const getResponse = await axios.get(
      `${apiEndpoint}/users/${userId}`,
      { headers: { Authorization: `Bearer ${authToken}` } }
    );
    expect(getResponse.data.name).toBe('Test User');
  });
});

Load Testing

// test/load/k6-script.js
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate } from 'k6/metrics';

const errorRate = new Rate('errors');

export const options = {
  stages: [
    { duration: '2m', target: 100 },  // Ramp up
    { duration: '5m', target: 100 },  // Sustain
    { duration: '2m', target: 200 },  // Spike
    { duration: '5m', target: 200 },  // Sustain spike
    { duration: '2m', target: 0 },  // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% of requests under 500ms
    errors: ['rate<0.01'],  // Error rate under 1%
  },
};

export default function() {
  const response = http.get(`${__ENV.API_URL}/users`);

  const success = check(response, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });

  errorRate.add(!success);
  sleep(1);
}

Rollback Procedures

Automated Rollback

// lib/constructs/deployment/safe-deployment.ts
import { Construct } from 'constructs';
import { Alarm, TreatMissingData } from 'aws-cdk-lib/aws-cloudwatch';
import { RestApi } from 'aws-cdk-lib/aws-apigateway';
import { IFunction } from 'aws-cdk-lib/aws-lambda';
import { Topic } from 'aws-cdk-lib/aws-sns';
import { SnsAction } from 'aws-cdk-lib/aws-cloudwatch-actions';
import { LambdaSubscription } from 'aws-cdk-lib/aws-sns-subscriptions';
import { CfnOutput } from 'aws-cdk-lib';

export class SafeDeployment extends Construct {
  constructor(scope: Construct, id: string, props: {
    api: RestApi;
    alarmThreshold: number;
    rollbackFunction: IFunction;
  }) {
    super(scope, id);

    // Create CloudWatch alarm
    const alarm = new Alarm(this, 'DeploymentAlarm', {
      metric: props.api.metricServerError(),
      threshold: props.alarmThreshold,
      evaluationPeriods: 2,
      treatMissingData: TreatMissingData.NOT_BREACHING,
    });

    // SNS topic for notifications
    const topic = new Topic(this, 'RollbackTopic');
    alarm.addAlarmAction(new SnsAction(topic));

    // Lambda for automated rollback
    topic.addSubscription(
      new LambdaSubscription(props.rollbackFunction)
    );

    // Manual rollback command
    new CfnOutput(this, 'RollbackCommand', {
      value: `aws lambda invoke --function-name ${props.rollbackFunction.functionName} --payload '{"action":"rollback"}' response.json`,
    });
  }
}

// src/deployment/rollback-handler.ts
import { SNSEvent } from 'aws-lambda';
import { CodeDeployClient, StopDeploymentCommand } from '@aws-sdk/client-codedeploy';

// Helper functions
async function switchTraffic(version: string): Promise<void> {
  // Implementation for traffic switching
}

async function notifySlack(message: { channel: string; message: string }): Promise<void> {
  // Implementation for Slack notification
}

export const handler = async (event: SNSEvent) => {
  console.log('Initiating rollback:', JSON.stringify(event, null, 2));

  const codedeploy = new CodeDeployClient({});

  // Stop current deployment
  await codedeploy.send(new StopDeploymentCommand({
    deploymentId: process.env.CURRENT_DEPLOYMENT_ID,
    autoRollbackEnabled: true,
  }));

  // Revert traffic to previous version
  await switchTraffic('blue'); // Assuming green was failing

  // Notify team
  await notifySlack({
    channel: '#alerts',
    message: 'Automatic rollback initiated due to high error rate',
  });
};
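The handler above rolls back on any SNS message it receives. A guard that inspects the alarm notification first can avoid rollbacks triggered by unrelated or test messages on the same topic. The sketch below is an assumption-laden illustration: `parseAlarmMessage` and `shouldRollback` are hypothetical helpers, and only the `AlarmName` and `NewStateValue` fields of the CloudWatch alarm notification are read.

```typescript
// Subset of the JSON document CloudWatch publishes to SNS when an alarm changes state.
interface AlarmStateChange {
  AlarmName: string;
  NewStateValue: 'OK' | 'ALARM' | 'INSUFFICIENT_DATA';
}

// Parses the SNS message body; returns undefined for anything that is not
// a recognisable alarm state change.
function parseAlarmMessage(rawMessage: string): AlarmStateChange | undefined {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawMessage);
  } catch {
    return undefined; // not JSON, e.g. a manually published test message
  }
  if (typeof parsed !== 'object' || parsed === null) return undefined;

  const candidate = parsed as Record<string, unknown>;
  const alarmName = candidate.AlarmName;
  const newState = candidate.NewStateValue;
  if (typeof alarmName !== 'string') return undefined;
  if (newState !== 'OK' && newState !== 'ALARM' && newState !== 'INSUFFICIENT_DATA') {
    return undefined;
  }
  return { AlarmName: alarmName, NewStateValue: newState };
}

// True only when the expected alarm has entered the ALARM state.
function shouldRollback(rawMessage: string, expectedAlarmName: string): boolean {
  const change = parseAlarmMessage(rawMessage);
  return (
    change !== undefined &&
    change.NewStateValue === 'ALARM' &&
    change.AlarmName === expectedAlarmName
  );
}
```

Inside the handler, each record's `Sns.Message` would be passed to `shouldRollback` together with the alarm name, for example from an environment variable, before any deployment is stopped or traffic is moved.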

Performance Optimization

Lambda Performance Tuning

// lib/constructs/performance/optimized-function.ts
import { Construct } from 'constructs';
import { Duration, Stack } from 'aws-cdk-lib';
import { NodejsFunction, NodejsFunctionProps } from 'aws-cdk-lib/aws-lambda-nodejs';
import { Architecture, CfnFunction, CfnAlias } from 'aws-cdk-lib/aws-lambda';

// Base function interface
interface ServerlessFunctionProps extends NodejsFunctionProps {
  config: {
    stage: string;
  };
}

export class OptimizedFunction extends NodejsFunction {
  constructor(scope: Construct, id: string, props: ServerlessFunctionProps & {
    enableProvisioning?: boolean;
    enableSnapStart?: boolean;
  }) {
    super(scope, id, {
      ...props,
      memorySize: props.memorySize || 1024,
      architecture: Architecture.ARM_64, // Better price/performance
      environment: {
        ...props.environment,
        NODE_OPTIONS: '--enable-source-maps --max-old-space-size=896',
        AWS_NODEJS_CONNECTION_REUSE_ENABLED: '1',
      },
    });

    // Provisioned concurrency for critical functions
    if (props.enableProvisioning && props.config.stage === 'prod') {
      const version = this.currentVersion;

      new CfnAlias(this, 'ProvisionedAlias', {
        functionName: this.functionName,
        functionVersion: version.version,
        name: 'provisioned',
        provisionedConcurrencyConfig: {
          provisionedConcurrentExecutions: 5,
        },
      });
    }

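    // SnapStart (escape hatch). SnapStart is not available on Node.js runtimes, so enable
    // this flag only if the construct is adapted to a supported runtime such as Java.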
    if (props.enableSnapStart) {
      const cfnFunction = this.node.defaultChild as CfnFunction;
      cfnFunction.snapStart = {
        applyOn: 'PublishedVersions',
      };
    }
  }
}

API Gateway Optimization

// lib/constructs/performance/cached-api.ts
import { Construct } from 'constructs';
import { Duration } from 'aws-cdk-lib';
import { RestApi, RestApiProps } from 'aws-cdk-lib/aws-apigateway';

export class CachedApi extends RestApi {
  constructor(scope: Construct, id: string, props: RestApiProps & {
    cacheConfig?: {
      ttlMinutes: number;
      encrypted: boolean;
      clusterSize: string;
    };
  }) {
    super(scope, id, {
      ...props,
      deployOptions: {
        ...props.deployOptions,
        cachingEnabled: true,
        cacheClusterEnabled: true,
        cacheClusterSize: props.cacheConfig?.clusterSize || '0.5',
        cacheDataEncrypted: props.cacheConfig?.encrypted ?? true,
        cacheTtl: Duration.minutes(props.cacheConfig?.ttlMinutes || 5),
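        methodOptions: {
          '/*/*': {
            cachingEnabled: true,
            // Cache keys are not a stage-level setting. Declare them per method via
            // `cacheKeyParameters` in the integration options, together with matching
            // `requestParameters` (e.g. 'method.request.querystring.page') on the method.
          },
        },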
      },
    });
  }
}

Monitoring and Observability

Comprehensive Monitoring Stack

// lib/stacks/monitoring-stack.ts
import { Stack, Duration } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { Dashboard, GraphWidget, Alarm } from 'aws-cdk-lib/aws-cloudwatch';
import { RestApi } from 'aws-cdk-lib/aws-apigateway';
import { NodejsFunction } from 'aws-cdk-lib/aws-lambda-nodejs';

// Interface for ApiStack
interface ApiStack extends Stack {
  api: RestApi;
  functions: NodejsFunction[];
}

export class MonitoringStack extends Stack {
  constructor(scope: Construct, id: string, props: {
    apiStack: ApiStack;
    stage: string;
  }) {
    super(scope, id);

    // Create dashboard
    const dashboard = new Dashboard(this, 'ServiceDashboard', {
      dashboardName: `my-service-${props.stage}`,
    });

    // API metrics
    dashboard.addWidgets(
      new GraphWidget({
        title: 'API Requests',
        left: [props.apiStack.api.metricCount()],
        right: [props.apiStack.api.metricLatency()],
      }),
      new GraphWidget({
        title: 'API Errors',
        left: [
          // RestApi exposes the 4XX/5XX metrics as metricClientError/metricServerError
          props.apiStack.api.metricClientError(),
          props.apiStack.api.metricServerError(),
        ],
      })
    );

    // Lambda metrics
    const lambdaWidgets = props.apiStack.functions.map(fn =>
      new GraphWidget({
        title: `${fn.functionName} Performance`,
        left: [fn.metricInvocations()],
        right: [fn.metricDuration()],
      })
    );
    dashboard.addWidgets(...lambdaWidgets);

    // Alarms
    this.createAlarms(props.apiStack);
  }

  private createAlarms(apiStack: ApiStack) {
    // API Gateway alarms
    new Alarm(this, 'HighErrorRate', {
      metric: apiStack.api.metricServerError({
        period: Duration.minutes(5),
        statistic: 'Sum',
      }),
      threshold: 10,
      evaluationPeriods: 2,
    });

    // Lambda alarms
    apiStack.functions.forEach(fn => {
      new Alarm(this, `${fn.node.id}Throttles`, {
        metric: fn.metricThrottles(),
        threshold: 5,
        evaluationPeriods: 2,
      });

      new Alarm(this, `${fn.node.id}Errors`, {
        metric: fn.metricErrors(),
        threshold: 10,
        evaluationPeriods: 2,
      });
    });
  }
}

Distributed Tracing

// lib/constructs/observability/tracing.ts
import { Construct } from 'constructs';
import { PolicyStatement } from 'aws-cdk-lib/aws-iam';
import { Tracing } from 'aws-cdk-lib/aws-lambda';
import { OptimizedFunction } from '../performance/optimized-function';

// Reuse the props type accepted by OptimizedFunction's constructor
type ServerlessFunctionProps = ConstructorParameters<typeof OptimizedFunction>[2];

export class TracedFunction extends OptimizedFunction {
  constructor(scope: Construct, id: string, props: ServerlessFunctionProps) {
    super(scope, id, {
      ...props,
      tracing: Tracing.ACTIVE,
      environment: {
        ...props.environment,
        // _X_AMZN_TRACE_ID is a reserved variable set by the Lambda runtime; it must not be defined here
        AWS_XRAY_CONTEXT_MISSING: 'LOG_ERROR',
        AWS_XRAY_LOG_LEVEL: 'error',
      },
    });

    // Add X-Ray permissions
    this.addToRolePolicy(new PolicyStatement({
      actions: [
        'xray:PutTraceSegments',
        'xray:PutTelemetryRecords',
      ],
      resources: ['*'],
    }));
  }
}

// src/libs/tracing.ts
import { Tracer } from '@aws-lambda-powertools/tracer';

const tracer = new Tracer({
  serviceName: process.env.SERVICE_NAME || 'my-service',
});

export function traceMethod(
  target: any,
  propertyKey: string,
  descriptor: PropertyDescriptor
) {
  const originalMethod = descriptor.value;

  descriptor.value = async function(...args: any[]) {
    const segment = tracer.getSegment();
    const subsegment = segment?.addNewSubsegment(propertyKey);

    try {
      const result = await originalMethod.apply(this, args);
      subsegment?.close();
      return result;
    } catch (error) {
      subsegment?.addError(error as Error);
      subsegment?.close();
      throw error;
    }
  };

  return descriptor;
}

Migration Checklist

Pre-Migration

  • Inventory current resources

    • Document all Lambda functions
    • List API Gateway endpoints
    • Map DynamoDB tables and indexes
    • Identify custom resources
    • Note all environment variables and secrets
  • Assess dependencies

    • Review Serverless plugins in use
    • Check for custom CloudFormation resources
    • Identify external service integrations
    • Document IAM roles and policies
  • Plan migration strategy

    • Choose migration pattern (big bang, strangler fig, blue-green)
    • Define rollback procedures
    • Set success criteria
    • Schedule maintenance windows if needed

During Migration

  • Set up CDK project

    • Initialize repository with CDK
    • Configure environments
    • Set up CI/CD pipelines
    • Implement infrastructure tests
  • Migrate components

    • Start with stateless resources
    • Import existing stateful resources
    • Migrate Lambda functions
    • Set up API Gateway
    • Configure authentication
  • Testing

    • Run unit tests
    • Execute integration tests
    • Perform load testing
    • Validate security configurations

Post-Migration

  • Monitor and optimize

    • Set up comprehensive monitoring
    • Configure alerts
    • Review performance metrics
    • Optimize cold starts
  • Documentation

    • Update runbooks
    • Document new deployment procedures
    • Create architecture diagrams
    • Train team on CDK
  • Cleanup

    • Remove old Serverless Framework resources
    • Delete unused IAM roles
    • Clean up S3 deployment buckets
    • Update DNS records

Common Pitfalls and Solutions

1. Resource Naming Conflicts

// Avoid hardcoded names
// Bad
const table = new Table(this, 'Table', {
  tableName: 'users-table', // Will conflict if exists
});

// Good
const table = new Table(this, 'Table', {
  tableName: `${props.serviceName}-${props.stage}-users`,
});

2. State Management

// Separate stateful and stateless resources
import { App } from 'aws-cdk-lib';

const app = new App();

// Stateful resources in separate stack
const dataStack = new DataStack(app, 'DataStack', {
  terminationProtection: true,
});

// Stateless resources can be updated freely
const apiStack = new ApiStack(app, 'ApiStack', {
  tables: dataStack.tables,
});

3. Environment Variable Migration

// Map Serverless variables to CDK
import { Stack, Fn } from 'aws-cdk-lib';

const legacyMappings: Record<string, string> = {
  '${self:service}': props.serviceName,
  '${opt:stage}': props.stage,
  '${opt:region}': Stack.of(this).region,
  '${cf:OtherStack.Output}': Fn.importValue('OtherStack-Output'),
};
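A mapping table like this can drive a small resolver that rewrites strings carried over from `serverless.yml`. The sketch below is hypothetical: `resolveLegacyVariables` is not part of either framework, and it handles only flat `${...}` references that appear as keys in the mapping. Nested references and default values such as `${opt:stage, 'dev'}` are not interpreted and will raise an error unless mapped explicitly.

```typescript
// Replaces every ${...} reference in `template` with its mapped value.
// Throws on references that have no mapping, so gaps surface at synth time
// instead of producing a half-resolved resource name.
function resolveLegacyVariables(
  template: string,
  mappings: Record<string, string>,
): string {
  return template.replace(/\$\{[^}]+\}/g, (reference) => {
    const value = mappings[reference];
    if (value === undefined) {
      throw new Error(`No CDK mapping for Serverless variable ${reference}`);
    }
    return value;
  });
}

// Example with literal values standing in for props.serviceName and props.stage.
const exampleMappings: Record<string, string> = {
  '${self:service}': 'my-service',
  '${opt:stage}': 'prod',
};
const resolvedName = resolveLegacyVariables('${self:service}-${opt:stage}-users', exampleMappings);
console.log(resolvedName); // my-service-prod-users
```

Failing on unmapped references is a deliberate choice here: a silently unresolved `${...}` in a table or bucket name is harder to diagnose after deployment than an error during synthesis.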

Migration Results After 4 Months

The CDK migration has been running in production for four months. Here are the measured outcomes:

Performance Improvements

  • API response time: 1.4s → 0.8s average (43% improvement)
  • Cold start reduction: 850ms → 320ms (62% improvement)
  • Authorization latency: 400ms → 12ms (97% improvement)
  • Database query time: 120ms → 45ms (optimized connection pooling)
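The percentages in this list follow from (before − after) / before. The helper below is a small illustrative sketch (the name `improvementPercent` is an assumption) that reproduces the three figures stated above from the raw measurements.

```typescript
// Relative improvement of a "lower is better" metric, rounded to a whole percent.
function improvementPercent(before: number, after: number): number {
  if (before <= 0) {
    throw new Error('before must be positive');
  }
  return Math.round(((before - after) / before) * 100);
}

console.log(improvementPercent(1.4, 0.8)); // 43 (API response time, seconds)
console.log(improvementPercent(850, 320)); // 62 (cold start, ms)
console.log(improvementPercent(400, 12)); // 97 (authorization latency, ms)
```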

Cost Optimization

  • Monthly AWS costs: 32% reduction achieved
  • Lambda costs: Reduced through better memory optimization
  • DynamoDB costs: Optimized through improved query patterns
  • CloudWatch costs: Reduced via structured logging

Operational Excellence

  • Deployment time: 45 minutes → 12 minutes
  • Rollback time: 4 hours → 30 seconds (blue-green deployment)
  • Security incidents: 2-3/month → 0/month (since cut-over)
  • Infrastructure bugs: 8/month → 0.5/month (95% reduction)

Developer Experience

  • Onboarding time: 2 weeks → 2 hours (documentation + type safety)
  • Feature delivery: 2 weeks → 1 week (faster development cycle)
  • Bug investigation: 3 hours → 20 minutes (better observability)
  • Cross-team dependencies: 5 teams → 1 team (self-service infrastructure)

Operational Impact

  • Service continuity: Zero-downtime migration achieved
  • Security compliance: Met all enterprise requirements
  • Service quality: No migration-related issues
  • Team efficiency: Improved deployment confidence and speed

Key Migration Insights

The production migration revealed several important patterns:

1. Blue-Green Deployment for Production Safety

Insight: Blue-green deployment provides the most reliable production migration path. Result: Zero-downtime migration with instant rollback capability.

2. Comprehensive Health Check Requirements

Insight: Basic health checks miss critical failure modes. Result: Thorough validation systems prevent production issues.

3. Production-Oriented Testing Approach

Insight: Unit tests alone don’t catch infrastructure limits or edge cases. Result: Production-focused testing identifies critical issues before deployment.

4. Performance Optimization Compounds

Insight: CDK enables optimizations across all stack layers. Result: 43% overall performance improvement achieved.

5. Type Safety in Infrastructure Code

Insight: TypeScript catches configuration errors at compile time. Result: 95% reduction in infrastructure-related bugs.

6. Monitoring as Risk Mitigation

Insight: Comprehensive monitoring enables safe migrations. Result: Automated rollback systems prevent incidents.

7. Team Training Requirements

Insight: CDK requires different conceptual models than Serverless Framework. Result: Proper training enables significantly faster development.

Complete Migration Checklist

Week 1-2: Foundation

  • Set up CDK development environment
  • Create production-grade project structure
  • Implement comprehensive testing strategy
  • Train team on CDK patterns and TypeScript

Week 3-4: Infrastructure Migration

  • Import existing stateful resources (DynamoDB, etc.)
  • Migrate Lambda functions with performance optimization
  • Set up API Gateway with proper monitoring
  • Implement authentication and authorization

Week 5-6: Security and Compliance

  • Audit and fix IAM permissions (least privilege)
  • Implement secrets management
  • Set up comprehensive logging and monitoring
  • Pass security audit (if required)
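
For the IAM audit step, a first pass can be as simple as flagging `Allow` statements that use wildcards. This is a minimal sketch: `findWildcardStatements` is a hypothetical helper, and it covers only the `Effect`, `Action`, and `Resource` elements of a policy statement, ignoring conditions and other elements.

```typescript
// Subset of an IAM policy statement, enough for a wildcard check.
interface PolicyStatement {
  Effect: "Allow" | "Deny";
  Action: string | string[];
  Resource: string | string[];
}

function toArray(value: string | string[]): string[] {
  return Array.isArray(value) ? value : [value];
}

// Flag Allow statements with "*" or "service:*" actions, or a "*" resource.
function findWildcardStatements(statements: PolicyStatement[]): PolicyStatement[] {
  return statements.filter((s) => {
    if (s.Effect !== "Allow") return false;
    const wildcardAction = toArray(s.Action).some(
      (a) => a === "*" || a.slice(-2) === ":*"
    );
    const wildcardResource = toArray(s.Resource).some((r) => r === "*");
    return wildcardAction || wildcardResource;
  });
}

const flagged = findWildcardStatements([
  { Effect: "Allow", Action: "dynamodb:*", Resource: "arn:aws:dynamodb:*:*:table/orders" },
  { Effect: "Allow", Action: ["dynamodb:GetItem"], Resource: "arn:aws:dynamodb:*:*:table/orders" },
  { Effect: "Allow", Action: "s3:GetObject", Resource: "*" },
]);
console.log(flagged.length);
```

The table name and ARNs are made up for the example. Flagged statements are candidates for review, not automatic violations; some actions legitimately require a `*` resource.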

Week 7-8: Testing and Preparation

  • Create blue-green deployment infrastructure
  • Implement automated rollback procedures
  • Run production-mirror load testing
  • Validate health check comprehensiveness

Week 9-12: Migration Execution

  • Deploy green environment (CDK)
  • Run parallel traffic validation
  • Execute traffic switch with monitoring
  • Clean up legacy Serverless Framework resources
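
Parallel traffic validation comes down to sending the same request to both environments and comparing the answers. The sketch below assumes responses have already been captured; `CapturedResponse`, `diffResponses`, and the ignored field names are hypothetical.

```typescript
// A response captured from either environment for the same request.
interface CapturedResponse {
  status: number;
  body: Record<string, unknown>;
}

// Compare top-level fields, skipping ones expected to differ (ids, timestamps).
// Values are compared via JSON.stringify, so nested key order matters.
function diffResponses(
  blue: CapturedResponse,
  green: CapturedResponse,
  ignoreKeys: string[]
): string[] {
  const diffs: string[] = [];
  if (blue.status !== green.status) {
    diffs.push(`status: ${blue.status} != ${green.status}`);
  }
  const keys = Object.keys(blue.body).concat(Object.keys(green.body));
  const seen: Record<string, boolean> = {};
  for (const key of keys) {
    if (seen[key] || ignoreKeys.indexOf(key) !== -1) continue;
    seen[key] = true;
    if (JSON.stringify(blue.body[key]) !== JSON.stringify(green.body[key])) {
      diffs.push(`body.${key} differs`);
    }
  }
  return diffs;
}

const diffs = diffResponses(
  { status: 200, body: { id: "42", total: 10, requestId: "a1" } },
  { status: 200, body: { id: "42", total: 11, requestId: "b2" } },
  ["requestId"]
);
console.log(diffs);
```

An empty result for a representative sample of requests is one signal that the traffic switch is safe to proceed; any diff is worth explaining before moving more weight to green.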

Post-Migration: Optimization

  • Performance tuning based on production metrics
  • Cost optimization (memory, provisioning, caching)
  • Documentation and runbook updates
  • Team retrospective and lessons learned

When to Stay with Serverless Framework

Certain scenarios benefit more from Serverless Framework than CDK:

  1. Simple CRUD applications with minimal customization needs
  2. Proof-of-concept projects that need rapid prototyping
  3. Teams without TypeScript experience and no bandwidth for training
  4. Applications that depend heavily on Serverless Framework plugins with no CDK equivalent
  5. Organizations with YAML-only infrastructure policies

Conclusion: Infrastructure as Actual Code

This migration fundamentally changed how the team manages infrastructure. The shift from YAML configuration to TypeScript code brings compilation, testing, and validation to infrastructure.

The migration took multiple iterations and significant effort. The measurable results include a 43% performance improvement, a 32% cost reduction, and 95% fewer infrastructure bugs.

The key benefit is increased deployment confidence through better tooling, testing, and rollback capabilities.

CDK isn’t just Infrastructure as Code; it’s Infrastructure as Actual Code, with real programming languages, real testing frameworks, and real software engineering practices.

If you’re managing production serverless applications, consider this migration path. The learning curve is steep, but the productivity gains are transformational.

Welcome to the future of serverless infrastructure. It’s written in TypeScript, tested in CI/CD, and deployed with confidence.

