
2026-01-22

MCP Advanced Patterns: Skills, Workflows, Integration, and RBAC

Enterprise-grade patterns for Model Context Protocol implementations including tool composition, multi-agent orchestration, role-based access control, and production observability.

Abstract

While MCP adoption has grown rapidly since its launch, most content covers basic server implementation. This post targets the next level: how to design sophisticated MCP-based systems with proper multi-agent workflows, tool composition patterns, enterprise-grade security with RBAC, and production observability. The focus is on patterns that work at scale, with concrete examples of what succeeds and what creates problems down the line.

The Scaling Challenge

Organizations successfully deploying basic MCP integrations encounter predictable challenges as they scale:

Tool Explosion: Starting with 5 tools and growing to 50 across 10 servers creates discovery and selection problems for agents.

Permission Complexity: Different users need different tool access. A junior developer should not trigger production deployments, but how do you implement fine-grained access control?

Workflow Orchestration: Complex tasks require multiple tools in sequence or parallel. Coordinating tool chains reliably becomes its own engineering problem.

Multi-Agent Coordination: Multiple AI agents working together need different tool subsets. Preventing conflicts and ensuring proper isolation requires careful design.

Audit and Compliance: Regulated industries need complete audit trails. Tracking who invoked what, when, and with what results is non-negotiable.

Figure: Scaling challenges and the pattern solutions that address them.

Tool Discovery (50+ tools across servers): Tool Registry + Annotations

Permission Control (role-based access): RBAC Layer + Progressive Auth

Workflow Coordination (multi-step operations): Orchestrator + Error Recovery

Agent Isolation (conflict prevention): Gateway + Session Management

Audit Compliance (complete trails): Observability Stack + Structured Logging

Pattern 1: Tool Design and Annotation

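Well-designed MCP tools share common characteristics that make them composable and maintainable. Tool annotations provide metadata that helps both agents and users understand tool behavior. They are hints rather than guarantees, so clients should not base security decisions on annotations from servers they do not trust.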

Note: The patterns in this post are based on MCP specification version 2025-11-25. API details may vary with newer specification versions.

Comprehensive Tool Annotations

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const server = new McpServer({
  name: "deployment-server",
  version: "1.0.0",
});

// Tool with comprehensive annotations
// Note: registerTool() takes the tool name, one config object, and the handler.
// Verify the method and field names against your installed SDK version.
// deploymentService is an application-specific client assumed to exist.
server.registerTool(
  "deploy_service",
  {
    title: "Deploy Service",
    description: "Triggers deployment of a service to the specified environment",
    inputSchema: {
      service: z.string().describe("Service name to deploy"),
      environment: z.enum(["staging", "production"]).describe("Target environment"),
      version: z.string().regex(/^\d+\.\d+\.\d+$/).describe("Semantic version"),
    },
    annotations: {
      // Behavioral hints for clients (advisory, not guarantees)
      readOnlyHint: false,  // This tool modifies state
      destructiveHint: false,  // Not destructive (can be reversed)
      idempotentHint: false,  // Multiple calls create multiple deployments
      openWorldHint: true,  // Interacts with external systems
    },
  },
  async ({ service, environment, version }) => {
    // Implementation with proper error handling
    const result = await deploymentService.deploy(service, environment, version);

    return {
      content: [{ type: "text", text: JSON.stringify(result, null, 2) }],
    };
  }
);
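
On the client side, annotations can drive how much confirmation a call needs. The sketch below is a hypothetical policy, not part of any SDK: `confirmationFor` and `ConfirmationLevel` are illustrative names. It applies the specification's documented defaults when a hint is missing (readOnlyHint defaults to false, destructiveHint defaults to true), so an unannotated tool is treated as potentially destructive.

```typescript
// Hint fields mirror the MCP tool annotation names; all are optional.
interface ToolAnnotations {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
  openWorldHint?: boolean;
}

type ConfirmationLevel = "none" | "confirm" | "confirm-with-warning";

// Hypothetical client policy: decide how much user confirmation a call needs.
function confirmationFor(annotations: ToolAnnotations = {}): ConfirmationLevel {
  // Missing hints fall back to the cautious defaults.
  const readOnly = annotations.readOnlyHint ?? false;
  const destructive = annotations.destructiveHint ?? true;

  if (readOnly) return "none";
  return destructive ? "confirm-with-warning" : "confirm";
}
```

With the annotations shown above, deploy_service would map to "confirm". A policy like this only makes sense for servers the client already trusts, since a malicious server can annotate a destructive tool as read-only.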

Output Schemas for Structured Results

Tools that return structured data benefit from output schemas, enabling programmatic result handling.

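Note: Output schemas and structuredContent were introduced in the 2025-06-18 revision of the MCP specification, so SDK releases that predate it do not support them. Verify that your SDK version supports this feature before implementation.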

// registerTool() accepts outputSchema alongside inputSchema in its config object
server.registerTool(
  "get_service_metrics",
  {
    title: "Get Service Metrics",
    description: "Retrieves performance metrics for a service",
    inputSchema: {
      service: z.string(),
      timeRange: z.enum(["1h", "24h", "7d", "30d"]),
    },
    annotations: {
      readOnlyHint: true,
      idempotentHint: true,
    },
    // Output schema enables structured content validation
    // (given as a Zod raw shape, like inputSchema)
    outputSchema: {
      service: z.string(),
      period: z.string(),
      metrics: z.object({
        requestCount: z.number(),
        errorRate: z.number(),
        p50Latency: z.number(),
        p99Latency: z.number(),
      }),
    },
  },
  async ({ service, timeRange }) => {
    const metrics = await metricsService.getMetrics(service, timeRange);

    return {
      content: [{ type: "text", text: JSON.stringify(metrics) }],
      // Structured content for programmatic access
      structuredContent: metrics,
    };
  }
);

Tool Composition Patterns

Complex operations often require chaining multiple tools. Here are patterns that work:

// Pattern 1: Sequential Tool Chain Configuration
const deploymentChain = {
  chain: [
    {
      tool: "validate_config",
      inputPath: "$.config",
      outputPath: "$.validation",
    },
    {
      tool: "run_tests",
      inputPath: "$.validation.service",
      outputPath: "$.testResults",
    },
    {
      tool: "deploy_service",
      inputPath: "$.validation",
      outputPath: "$.deployment",
    },
  ],
};

// Pattern 2: Parallel Tool Execution
const healthCheckConfig = {
  parallel: [
    { tool: "check_service_health", params: { service: "api" } },
    { tool: "check_service_health", params: { service: "database" } },
    { tool: "check_service_health", params: { service: "cache" } },
  ],
  merge: "$.healthStatus", // Combine results
};

// Pattern 3: Conditional Tool Execution
const conditionalDeployConfig = {
  condition: "$.environment === 'production'",
  ifTrue: [
    { tool: "get_deployment_approval" },
    { tool: "notify_stakeholders" },
    { tool: "deploy_service" },
  ],
  ifFalse: [
    { tool: "deploy_service" },
  ],
};
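
The configurations above are declarative, so something has to interpret them. A minimal sketch of a runner for the sequential chain follows. It assumes that `inputPath` and `outputPath` are simple dotted paths rooted at `$` (not full JSONPath) and that tools are plain async functions in a lookup table. `runChain`, `getPath`, and `setPath` are hypothetical helper names.

```typescript
type State = Record<string, unknown>;
type ToolFn = (input: unknown) => Promise<unknown>;
interface ChainStep { tool: string; inputPath: string; outputPath: string }

// Split "$.a.b" into ["a", "b"]; "$" alone yields an empty list.
function pathKeys(path: string): string[] {
  return path.replace(/^\$\.?/, "").split(".").filter(Boolean);
}

// Read a dotted path; returns undefined when a segment is missing.
function getPath(state: State, path: string): unknown {
  let node: unknown = state;
  for (const key of pathKeys(path)) {
    node = node !== null && typeof node === "object" ? (node as State)[key] : undefined;
  }
  return node;
}

// Write a value at a dotted path, creating intermediate objects as needed.
function setPath(state: State, path: string, value: unknown): void {
  const keys = pathKeys(path);
  const last = keys.pop();
  if (last === undefined) throw new Error(`Cannot write to root path ${path}`);
  let node = state;
  for (const key of keys) {
    if (node[key] === null || typeof node[key] !== "object") node[key] = {};
    node = node[key] as State;
  }
  node[last] = value;
}

// Run steps in order; each step reads its input from, and writes its output to, shared state.
async function runChain(
  steps: ChainStep[],
  tools: Record<string, ToolFn>,
  initial: State
): Promise<State> {
  const state: State = JSON.parse(JSON.stringify(initial)); // work on a copy
  for (const step of steps) {
    const tool = tools[step.tool];
    if (!tool) throw new Error(`Unknown tool: ${step.tool}`);
    setPath(state, step.outputPath, await tool(getPath(state, step.inputPath)));
  }
  return state;
}
```

A failing step rejects the returned promise and leaves later steps unexecuted, which matches the "abort" behavior used later in this post. The parallel and conditional variants would need their own interpreters.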

Pattern 2: Multi-Agent Workflow Orchestration

When multiple AI agents need to collaborate with different tool access, an orchestrator pattern provides coordination and isolation.

Figure: The workflow orchestrator (task queue, agent router, state manager) routes work to specialized agents, each restricted to one tool set: the Validator Agent (config tools only) uses the Config Server, the Health Agent (monitoring tools) uses the Monitor Server, and the Deployer Agent (deploy tools) uses the Deploy Server.

Orchestrator Implementation

Note: The HTTP+SSE transport is deprecated. For new implementations, use the Streamable HTTP transport (StreamableHTTPClientTransport). The transport class names may vary by SDK version - verify against your installed SDK.

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
// Transport import - verify class name against your SDK version
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

interface AgentConfig {
  name: string;
  role: string;
  servers: string[];
  permissions: string[];
}

interface WorkflowStep {
  name: string;
  agent: string;
  tool: string;
  params: Record<string, any>;
  outputKey: string;
  timeout?: number;
  onError: "abort" | "continue" | "rollback";
  parallel?: WorkflowStep[];
}

class MultiAgentOrchestrator {
  private agents: Map<string, AgentConfig> = new Map();
  private mcpClients: Map<string, Client> = new Map();

  async registerAgent(config: AgentConfig): Promise<void> {
    const client = new Client({
      name: `agent-${config.name}`,
      version: "1.0.0",
    });

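    // Connect to assigned MCP servers
    // Note: Use StreamableHTTPClientTransport for new implementations
    // Caveat: a Client holds one transport at a time, so an agent that spans
    // several servers needs one Client per server rather than repeated connect() calls
    for (const serverUrl of config.servers) {
      await client.connect(new StreamableHTTPClientTransport(new URL(serverUrl)));
    }

    // Filter available tools based on agent permissions
    // listTools() resolves to an object with a tools array, not a bare array
    const { tools } = await client.listTools();
    const allowedTools = tools.filter(tool =>
      this.checkToolPermission(tool, config.permissions)
    );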

    console.log(`Agent ${config.name} registered with ${allowedTools.length} tools`);

    this.agents.set(config.name, config);
    this.mcpClients.set(config.name, client);
  }

  async executeWorkflow(workflow: WorkflowDefinition): Promise<WorkflowResult> {
    const executionId = crypto.randomUUID();
    const results: Map<string, any> = new Map();

    console.log(`Starting workflow ${workflow.name} (${executionId})`);

    for (const step of workflow.steps) {
      try {
        if (step.parallel) {
          // Execute parallel steps concurrently
          const parallelResults = await Promise.all(
            step.parallel.map(subStep =>
              this.executeStep(subStep, results, executionId)
            )
          );

          // Merge parallel results
          parallelResults.forEach((result, index) => {
            results.set(step.parallel[index].outputKey, result);
          });
        } else {
          // Execute sequential step
          const result = await this.executeStep(step, results, executionId);
          results.set(step.outputKey, result);
        }
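      } catch (error) {
        // Caught values are typed unknown under strict settings, so narrow before reading message
        const message = error instanceof Error ? error.message : String(error);
        if (step.onError === "continue") {
          console.log(`Step ${step.name} failed, continuing: ${message}`);
          results.set(step.outputKey, { error: message });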
        } else if (step.onError === "rollback") {
          await this.rollbackWorkflow(executionId, results);
          throw error;
        } else {
          throw error;
        }
      }
    }

    return {
      executionId,
      status: "completed",
      results: Object.fromEntries(results),
    };
  }

  private async executeStep(
    step: WorkflowStep,
    context: Map<string, any>,
    executionId: string
  ): Promise<any> {
    const agent = this.agents.get(step.agent);
    const client = this.mcpClients.get(step.agent);

    if (!agent || !client) {
      throw new Error(`Agent ${step.agent} not found`);
    }

    // Resolve parameters from workflow context
    const params = this.resolveParams(step.params, context);

    // Verify permission before execution
    if (!this.checkToolPermission({ name: step.tool }, agent.permissions)) {
      throw new Error(`Agent ${step.agent} lacks permission for tool ${step.tool}`);
    }

    // Execute with timeout
    const result = await Promise.race([
      client.callTool({ name: step.tool, arguments: params }),
      this.timeout(step.timeout || 30000),
    ]);

    // Audit logging
    await this.auditLog({
      executionId,
      agent: step.agent,
      tool: step.tool,
      params,
      result: result.isError ? "failure" : "success",
      timestamp: new Date().toISOString(),
    });

    if (result.isError) {
      throw new Error(`Tool ${step.tool} failed: ${result.content[0]?.text}`);
    }

    return result.content;
  }

  private checkToolPermission(
    tool: { name: string },
    permissions: string[]
  ): boolean {
    // Permission format: "tool:*", "tool:read", "tool:deploy:staging"
    for (const perm of permissions) {
      if (perm === "tool:*") return true;
      if (perm === `tool:${tool.name}`) return true;
      if (perm.startsWith(`tool:${tool.name}:`)) return true;
    }
    return false;
  }
}
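
The orchestrator calls `resolveParams` and `timeout` without showing them. One possible shape for both is sketched below as standalone functions. Two assumptions are worth stating: parameter references are dotted paths such as `$.input.service`, and the caller seeds the context map with an `input` entry before the first step runs (the listing above never does this). `withTimeout` is a hypothetical replacement for the bare `Promise.race` that also clears its timer.

```typescript
// Replace "$.key.path" string values with data produced by earlier steps.
function resolveParams(
  params: Record<string, unknown>,
  context: Map<string, unknown>
): Record<string, unknown> {
  const resolved: Record<string, unknown> = {};
  for (const [name, value] of Object.entries(params)) {
    if (typeof value === "string" && value.startsWith("$.")) {
      const [root, ...rest] = value.slice(2).split(".");
      let node: unknown = context.get(root);
      for (const key of rest) {
        node = node !== null && typeof node === "object"
          ? (node as Record<string, unknown>)[key]
          : undefined;
      }
      // Fail loudly instead of passing undefined to a tool.
      if (node === undefined) {
        throw new Error(`Unresolved reference ${value} for parameter ${name}`);
      }
      resolved[name] = node;
    } else {
      resolved[name] = value; // literal values pass through unchanged
    }
  }
  return resolved;
}

// Race a promise against a timer and always clear the timer afterwards.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const expiry = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  return Promise.race([work, expiry]).finally(() => clearTimeout(timer));
}
```

Note that the orchestrator stores `result.content` (an array of content blocks) under each output key, so a reference like `$.approval.id` only resolves if the stored value is a parsed object. Storing `structuredContent` when a tool provides it is one way to make such references work.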

Workflow Definition Example

const productionDeploymentWorkflow: WorkflowDefinition = {
  name: "production-deployment",
  steps: [
    {
      name: "validate",
      agent: "validator",
      tool: "validate_deployment_config",
      params: { service: "$.input.service", version: "$.input.version" },
      outputKey: "validation",
      onError: "abort",
    },
    {
      name: "parallel-checks",
      parallel: [
        {
          name: "health-check",
          agent: "health-checker",
          tool: "check_service_health",
          params: { service: "$.input.service" },
          outputKey: "health",
          onError: "abort",
        },
        {
          name: "security-scan",
          agent: "security-scanner",
          tool: "scan_vulnerabilities",
          params: { version: "$.input.version" },
          outputKey: "security",
          onError: "abort",
        },
      ],
      onError: "abort",
    },
    {
      name: "get-approval",
      agent: "approval-bot",
      tool: "request_deployment_approval",
      params: {
        service: "$.input.service",
        validation: "$.validation",
        security: "$.security",
      },
      outputKey: "approval",
      timeout: 300000, // 5 minutes for human approval
      onError: "abort",
    },
    {
      name: "deploy",
      agent: "deployer",
      tool: "deploy_service",
      params: {
        service: "$.input.service",
        version: "$.input.version",
        approvalId: "$.approval.id",
      },
      outputKey: "deployment",
      onError: "rollback",
    },
  ],
};

Pattern 3: Role-Based Access Control

Enterprise MCP deployments need fine-grained access control. Here is a pattern that combines RBAC with attribute-based conditions.

Permission Model

import { z } from "zod";

interface Permission {
  resource: string;  // "tool:deploy_service", "resource:config:*"
  action: string;  // "invoke", "read", "write"
  conditions?: {  // Optional ABAC conditions
    environment?: string[];
    timeWindow?: { start: string; end: string };
    maxCost?: number;
  };
}

interface Role {
  name: string;
  permissions: Permission[];
  inherits?: string[];  // Role inheritance
}

interface User {
  id: string;
  email: string;
  roles: string[];
  attributes: Record<string, any>;
}

RBAC Service Implementation

class MCPAuthorizationService {
  private roles: Map<string, Role> = new Map();

  constructor() {
    this.initializeRoles();
  }

  private initializeRoles(): void {
    // Define role hierarchy
    const roles: Role[] = [
      {
        name: "viewer",
        permissions: [
          { resource: "tool:list_*", action: "invoke" },
          { resource: "tool:get_*", action: "invoke" },
          { resource: "tool:check_*", action: "invoke" },
          { resource: "resource:*", action: "read" },
        ],
      },
      {
        name: "developer",
        inherits: ["viewer"],
        permissions: [
          {
            resource: "tool:deploy_service",
            action: "invoke",
            conditions: { environment: ["development", "staging"] },
          },
          { resource: "tool:run_tests", action: "invoke" },
          { resource: "tool:build_artifact", action: "invoke" },
        ],
      },
      {
        name: "senior_developer",
        inherits: ["developer"],
        permissions: [
          {
            resource: "tool:deploy_service",
            action: "invoke",
            conditions: { environment: ["development", "staging", "production"] },
          },
          { resource: "tool:rollback_deployment", action: "invoke" },
        ],
      },
      {
        name: "admin",
        permissions: [
          { resource: "*", action: "*" },
        ],
      },
    ];

    for (const role of roles) {
      this.roles.set(role.name, role);
    }
  }

  async authorize(
    user: User,
    tool: string,
    params: Record<string, any>
  ): Promise<AuthorizationResult> {
    const permissions = this.getEffectivePermissions(user);

    // Remember the most recent condition failure, but keep looking: a later
    // permission (for example from a more senior role) may still allow the call.
    let denial: string | undefined;

    for (const perm of permissions) {
      if (!this.matchesResource(perm.resource, `tool:${tool}`)) continue;
      if (perm.action !== "*" && perm.action !== "invoke") continue;

      // Check ABAC conditions
      if (perm.conditions) {
        const conditionResult = this.evaluateConditions(
          perm.conditions,
          params,
          user
        );
        if (!conditionResult.allowed) {
          denial = conditionResult.reason;
          continue;
        }
      }
      return { allowed: true };
    }

    return {
      allowed: false,
      reason: denial ?? `User ${user.email} lacks permission for tool ${tool}`,
      requiredElevation: this.suggestElevation(tool, params),
    };
  }

  private getEffectivePermissions(user: User): Permission[] {
    const permissions: Permission[] = [];
    const processedRoles = new Set<string>();

    const processRole = (roleName: string) => {
      if (processedRoles.has(roleName)) return;
      processedRoles.add(roleName);

      const role = this.roles.get(roleName);
      if (!role) return;

      // Process inherited roles first (depth-first)
      if (role.inherits) {
        for (const inherited of role.inherits) {
          processRole(inherited);
        }
      }

      permissions.push(...role.permissions);
    };

    for (const roleName of user.roles) {
      processRole(roleName);
    }

    return permissions;
  }

  private evaluateConditions(
    conditions: Permission["conditions"],
    params: Record<string, any>,
    user: User
  ): { allowed: boolean; reason?: string } {
    // Environment restriction
    if (conditions?.environment) {
      const requestedEnv = params.environment;
      if (!conditions.environment.includes(requestedEnv)) {
        return {
          allowed: false,
          reason: `Environment ${requestedEnv} not allowed. Permitted: ${conditions.environment.join(", ")}`,
        };
      }
    }

    // Time window restriction
    if (conditions?.timeWindow) {
      const now = new Date();
      const hour = now.getHours();
      const [startHour] = conditions.timeWindow.start.split(":").map(Number);
      const [endHour] = conditions.timeWindow.end.split(":").map(Number);

      if (hour < startHour || hour >= endHour) {
        return {
          allowed: false,
          reason: `Operation only allowed between ${conditions.timeWindow.start} and ${conditions.timeWindow.end}`,
        };
      }
    }

    return { allowed: true };
  }

  private matchesResource(pattern: string, resource: string): boolean {
    if (pattern === "*") return true;

    // Escape regex metacharacters first so only "*" and "?" act as wildcards
    const regexPattern = pattern
      .replace(/[.+^${}()|[\]\\]/g, "\\$&")
      .replace(/\*/g, ".*")
      .replace(/\?/g, ".");
    return new RegExp(`^${regexPattern}$`).test(resource);
  }
}
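
Role inheritance means a user can hold several permissions for the same tool with different conditions, so evaluation order matters. The standalone sketch below shows "allow if any matching grant is satisfied" semantics in isolation. `Grant`, `matches`, and `isAllowed` are simplified, hypothetical stand-ins for the types and methods above, reduced to a resource pattern plus an optional environment list.

```typescript
interface Grant { resource: string; environments?: string[] }

// Escape regex metacharacters, then expand "*" into ".*".
function matches(pattern: string, resource: string): boolean {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`).test(resource);
}

// Allow when ANY matching grant is satisfied; deny only after all have been tried.
function isAllowed(grants: Grant[], tool: string, environment?: string): boolean {
  return grants
    .filter((g) => matches(g.resource, `tool:${tool}`))
    .some((g) =>
      !g.environments ||
      (environment !== undefined && g.environments.includes(environment)));
}

// Inherited grants come first, as in a depth-first walk of the role hierarchy.
const developer: Grant[] = [
  { resource: "tool:get_*" },
  { resource: "tool:deploy_service", environments: ["development", "staging"] },
];
const seniorDeveloper: Grant[] = [
  ...developer,
  { resource: "tool:deploy_service", environments: ["development", "staging", "production"] },
];
```

Under these semantics, `isAllowed(seniorDeveloper, "deploy_service", "production")` is true even though the inherited developer grant is checked first and does not cover production. An evaluator that returns a denial on the first failed condition would reject the same request.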

Secure MCP Server Integration

class SecureMCPServer {
  private server: McpServer;
  private authService: MCPAuthorizationService;

  constructor(config: SecureServerConfig) {
    this.server = new McpServer(config);
    this.authService = new MCPAuthorizationService();
  }

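  secureTool(
    name: string,
    schema: z.ZodRawShape,
    options: ToolOptions,
    handler: ToolHandler
  ) {
    // registerTool takes one config object; the input schema is passed as inputSchema
    this.server.registerTool(name, { ...options, inputSchema: schema }, async (params, context) => {
      // Assumption: an upstream auth layer attaches the user as context.meta.user;
      // this is not a field the SDK sets by default
      const user = context.meta?.user as User;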

      if (!user) {
        return {
          content: [{ type: "text", text: "Authentication required" }],
          isError: true,
        };
      }

      // Check authorization
      const authResult = await this.authService.authorize(user, name, params);

      if (!authResult.allowed) {
        // Audit denied access
        await this.auditLog({
          action: "tool_denied",
          user: user.email,
          tool: name,
          reason: authResult.reason,
        });

        return {
          content: [{
            type: "text",
            text: `Access denied: ${authResult.reason}${
              authResult.requiredElevation
                ? `\n${authResult.requiredElevation}`
                : ""
            }`,
          }],
          isError: true,
        };
      }

      // Audit successful access
      await this.auditLog({
        action: "tool_invoked",
        user: user.email,
        tool: name,
        params,
      });

      return handler(params, context);
    });
  }
}

Pattern 4: Progressive Authorization

Instead of granting all permissions upfront, progressive authorization elevates scopes as needed for sensitive operations.

Scope Progression

Figure: Scope progression. A user request starts with baseline scopes (read-only tools). When it needs more, it moves to standard scopes (dev environment), then with justification to elevated scopes (production ops), and with approval to admin scopes (destructive ops).

class ProgressiveAuthorizationService {
  private baseScopes = ["mcp:tools:read", "mcp:resources:read"];
  private scopeHistory: Map<string, Set<string>> = new Map();

  async getInitialToken(userId: string): Promise<TokenResponse> {
    const token = await this.oauthClient.getToken({
      grant_type: "client_credentials",
      scope: this.baseScopes.join(" "),
      user_id: userId,
    });

    this.scopeHistory.set(userId, new Set(this.baseScopes));
    return token;
  }

  async elevateScope(
    userId: string,
    requiredScope: string,
    justification: string
  ): Promise<ElevationResult> {
    const currentScopes = this.scopeHistory.get(userId) || new Set();

    if (currentScopes.has(requiredScope)) {
      return { elevated: true, token: await this.getCurrentToken(userId) };
    }

    // Check eligibility for requested scope
    const canElevate = await this.checkElevationEligibility(userId, requiredScope);

    if (!canElevate.eligible) {
      return {
        elevated: false,
        reason: canElevate.reason,
        approvalRequired: true,
        approvalWorkflow: this.getApprovalWorkflow(requiredScope),
      };
    }

    // Request elevated token
    const elevatedToken = await this.oauthClient.getToken({
      grant_type: "client_credentials",
      scope: [...currentScopes, requiredScope].join(" "),
      user_id: userId,
      justification,
    });

    currentScopes.add(requiredScope);
    this.scopeHistory.set(userId, currentScopes);

    await this.auditLog({
      action: "scope_elevated",
      user: userId,
      scope: requiredScope,
      justification,
    });

    return { elevated: true, token: elevatedToken };
  }

  private getScopeRequirements(): Record<string, ScopeRequirement> {
    return {
      "mcp:deploy:staging": {
        minRole: "developer",
        requiresApproval: false,
        expiresIn: 3600, // 1 hour
      },
      "mcp:deploy:production": {
        minRole: "senior_developer",
        requiresApproval: true,
        approvers: ["tech-lead", "sre-oncall"],
        expiresIn: 1800, // 30 minutes
      },
      "mcp:admin": {
        minRole: "admin",
        requiresApproval: true,
        approvers: ["security-team"],
        expiresIn: 900, // 15 minutes
      },
    };
  }
}
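
The service above calls `checkElevationEligibility` without defining it, and it never removes a scope once granted even though each requirement carries an `expiresIn`. The sketch below shows one way to fill both gaps. The role ordering in `ROLE_RANK`, the function `checkEligibility`, and the class `ScopeGrants` are assumptions made for illustration. Timestamps are passed in explicitly so the logic is easy to test.

```typescript
interface ScopeRequirement {
  minRole: string;
  requiresApproval: boolean;
  expiresIn: number; // seconds
}

// Hypothetical role ordering, lowest to highest privilege.
const ROLE_RANK = ["viewer", "developer", "senior_developer", "admin"];

// A user is eligible for self-service elevation only when the role is high
// enough and the scope does not require a separate approval step.
function checkEligibility(
  userRoles: string[],
  requirement: ScopeRequirement
): { eligible: boolean; reason?: string } {
  const needed = ROLE_RANK.indexOf(requirement.minRole);
  if (needed === -1) return { eligible: false, reason: `Unknown role ${requirement.minRole}` };

  const best = Math.max(-1, ...userRoles.map((role) => ROLE_RANK.indexOf(role)));
  if (best < needed) return { eligible: false, reason: `Requires role ${requirement.minRole} or higher` };
  if (requirement.requiresApproval) return { eligible: false, reason: "Approval required" };
  return { eligible: true };
}

// Track when each granted scope lapses so elevated scopes do not persist forever.
class ScopeGrants {
  private expiry = new Map<string, number>();

  grant(scope: string, expiresInSec: number, nowMs: number): void {
    this.expiry.set(scope, nowMs + expiresInSec * 1000);
  }

  has(scope: string, nowMs: number): boolean {
    const until = this.expiry.get(scope);
    return until !== undefined && nowMs < until;
  }
}
```

Replacing the plain `Set` in `scopeHistory` with a structure like `ScopeGrants` would make the "already elevated" shortcut in `elevateScope` respect the expiry windows.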

Pattern 5: Error Handling and Recovery

Production MCP deployments need robust error handling with retry strategies and circuit breakers.

Error Classification

enum MCPErrorCode {
  // Protocol errors
  PARSE_ERROR = -32700,
  INVALID_REQUEST = -32600,
  METHOD_NOT_FOUND = -32601,
  INVALID_PARAMS = -32602,
  INTERNAL_ERROR = -32603,

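  // Tool-specific errors (application-defined; JSON-RPC reserves
  // -32000 to -32099 for implementation-defined server errors)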
  TOOL_EXECUTION_FAILED = -32000,
  TOOL_TIMEOUT = -32001,
  TOOL_UNAUTHORIZED = -32002,
  TOOL_RATE_LIMITED = -32003,
  TOOL_DEPENDENCY_FAILED = -32004,
}

interface RecoverableError {
  code: MCPErrorCode;
  message: string;
  retryable: boolean;
  retryAfterMs?: number;
  suggestedAction?: string;
}

Error Handler with Retry Logic

class MCPErrorHandler {
  private retryPolicies: Map<MCPErrorCode, RetryPolicy> = new Map([
    [MCPErrorCode.TOOL_TIMEOUT, { maxRetries: 3, backoffMs: 1000, multiplier: 2 }],
    [MCPErrorCode.TOOL_DEPENDENCY_FAILED, { maxRetries: 5, backoffMs: 500, multiplier: 1.5 }],
    [MCPErrorCode.TOOL_RATE_LIMITED, { maxRetries: 3, backoffMs: 5000, multiplier: 2 }],
  ]);

  async executeWithRecovery<T>(
    toolName: string,
    executor: () => Promise<T>,
    timeoutMs: number = 30000
  ): Promise<T> {
    let lastError: RecoverableError | null = null;
    let attempt = 0;

    while (true) {
      try {
        return await Promise.race([
          executor(),
          this.createTimeout(timeoutMs),
        ]);
      } catch (error) {
        lastError = this.classifyError(error, toolName);
        attempt++;

        console.log(`Tool ${toolName} failed (attempt ${attempt}):`, {
          code: lastError.code,
          message: lastError.message,
          retryable: lastError.retryable,
        });

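        if (!lastError.retryable) break;

        const policy = this.retryPolicies.get(lastError.code);
        // attempt counts failures so far; allow up to maxRetries retries after the first try
        if (!policy || attempt > policy.maxRetries) break;

        // Exponential backoff, never shorter than a server-provided Retry-After
        const backoff = Math.max(
          policy.backoffMs * Math.pow(policy.multiplier, attempt - 1),
          lastError.retryAfterMs ?? 0
        );
        const jitter = Math.random() * 0.1 * backoff;

        await this.sleep(backoff + jitter);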
      }
    }

    throw this.createToolError(lastError!, toolName);
  }

  private classifyError(error: any, toolName: string): RecoverableError {
    if (error.code === "ETIMEDOUT" || error.message?.includes("timeout")) {
      return {
        code: MCPErrorCode.TOOL_TIMEOUT,
        message: `Tool ${toolName} timed out`,
        retryable: true,
        suggestedAction: "Check service health or increase timeout",
      };
    }

    if (error.response?.status === 429) {
      return {
        code: MCPErrorCode.TOOL_RATE_LIMITED,
        message: `Tool ${toolName} rate limited`,
        retryable: true,
        retryAfterMs: parseInt(error.response.headers["retry-after"]) * 1000 || 5000,
      };
    }

    if (error.response?.status >= 500) {
      return {
        code: MCPErrorCode.TOOL_DEPENDENCY_FAILED,
        message: `Tool ${toolName} backend error`,
        retryable: true,
      };
    }

    return {
      code: MCPErrorCode.TOOL_EXECUTION_FAILED,
      message: `Tool ${toolName} failed: ${error.message}`,
      retryable: false,
    };
  }
}
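
It helps to see the delays a retry policy actually produces before tuning it. The sketch below computes the backoff schedule for a policy without jitter. `backoffDelay` and `retrySchedule` are hypothetical helpers, and treating a Retry-After hint as a lower bound on each delay is one reasonable choice among several.

```typescript
interface RetryPolicy { maxRetries: number; backoffMs: number; multiplier: number }

// Delay before retry number `retry` (1-based), without jitter.
function backoffDelay(policy: RetryPolicy, retry: number): number {
  return policy.backoffMs * Math.pow(policy.multiplier, retry - 1);
}

// Full schedule for a policy; a Retry-After hint acts as a floor for each delay.
function retrySchedule(policy: RetryPolicy, retryAfterMs = 0): number[] {
  return Array.from({ length: policy.maxRetries }, (_, i) =>
    Math.max(backoffDelay(policy, i + 1), retryAfterMs));
}
```

For the timeout policy above (3 retries, 1000 ms base, multiplier 2) the schedule is 1000, 2000, and 4000 ms, so a call that keeps timing out adds about 7 seconds of waiting on top of four 30-second attempts. Budgets like that are worth checking against any workflow-level timeout.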

Circuit Breaker Pattern

class CircuitBreaker {
  private state: "closed" | "open" | "half-open" = "closed";
  private failures = 0;
  private lastFailureTime = 0;

  constructor(
    private readonly threshold: number = 5,
    private readonly resetTimeout: number = 30000
  ) {}

  async execute<T>(operation: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      if (Date.now() - this.lastFailureTime > this.resetTimeout) {
        this.state = "half-open";
      } else {
        throw new Error("Circuit breaker is open");
      }
    }

    try {
      const result = await operation();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  private onSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  private onFailure(): void {
    this.failures++;
    this.lastFailureTime = Date.now();

    if (this.failures >= this.threshold) {
      this.state = "open";
      console.log(`Circuit breaker opened after ${this.failures} failures`);
    }
  }
}

Pattern 6: Observability Stack

Production MCP deployments need comprehensive observability across metrics, tracing, and audit logging.

import { trace, SpanStatusCode } from "@opentelemetry/api";
import { Counter, Histogram, Gauge } from "prom-client";

class MCPObservability {
  private tracer = trace.getTracer("mcp-server");

  // Prometheus metrics
  private requestCounter = new Counter({
    name: "mcp_tool_requests_total",
    help: "Total MCP tool invocations",
    labelNames: ["tool", "status", "user_role"],
  });

  private requestDuration = new Histogram({
    name: "mcp_tool_duration_seconds",
    help: "MCP tool execution duration",
    labelNames: ["tool"],
    buckets: [0.1, 0.5, 1, 2, 5, 10, 30],
  });

  private activeRequests = new Gauge({
    name: "mcp_active_requests",
    help: "Currently executing MCP requests",
  });

  async observedToolExecution<T>(
    toolName: string,
    user: User,
    params: Record<string, any>,
    executor: () => Promise<T>
  ): Promise<T> {
    return this.tracer.startActiveSpan(`tool:${toolName}`, async (span) => {
      const startTime = Date.now();
      this.activeRequests.inc();

      try {
        span.setAttributes({
          "mcp.tool.name": toolName,
          "mcp.user.id": user.id,
          "mcp.user.role": user.roles.join(","),
          "mcp.params": JSON.stringify(this.sanitizeParams(params)),
        });

        const result = await executor();

        span.setStatus({ code: SpanStatusCode.OK });
        this.requestCounter.inc({
          tool: toolName,
          status: "success",
          user_role: user.roles[0],
        });

        return result;
      } catch (error) {
        span.setStatus({ code: SpanStatusCode.ERROR, message: error.message });
        span.recordException(error);

        this.requestCounter.inc({
          tool: toolName,
          status: "error",
          user_role: user.roles[0],
        });

        throw error;
      } finally {
        const duration = (Date.now() - startTime) / 1000;
        this.requestDuration.observe({ tool: toolName }, duration);
        this.activeRequests.dec();
        span.end();
      }
    });
  }

  private sanitizeParams(params: Record<string, any>): Record<string, any> {
    // Entries must be lowercase because keys are lowercased before matching
    const sensitiveKeys = ["password", "token", "secret", "apikey"];
    const sanitized = { ...params };

    for (const key of Object.keys(sanitized)) {
      if (sensitiveKeys.some(s => key.toLowerCase().includes(s))) {
        sanitized[key] = "[REDACTED]";
      }
    }

    return sanitized;
  }
}
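
The sanitizer above only inspects top-level keys, so a secret nested inside an object or array would still reach the trace. A recursive variant is sketched below. `redact` and the `SENSITIVE` list are illustrative, and the extra "authorization" entry is an assumption rather than something taken from the class above.

```typescript
// Lowercase fragments; a key is sensitive if its lowercased name contains one.
const SENSITIVE = ["password", "token", "secret", "apikey", "authorization"];

// Return a deep copy with sensitive values replaced, leaving the input untouched.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(
        ([key, inner]): [string, unknown] =>
          SENSITIVE.some((s) => key.toLowerCase().includes(s))
            ? [key, "[REDACTED]"]
            : [key, redact(inner)]
      )
    );
  }
  return value; // primitives pass through
}
```

Substring matching errs on the side of redacting too much (a key named tokenCount would be hidden too), which is usually the safer failure mode for telemetry.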

Audit Logging for Compliance

class AuditLogger {
  private buffer: AuditEvent[] = [];

  constructor(private config: AuditConfig) {
    // Batch flush every 10 seconds
    setInterval(() => this.flush(), 10000);
  }

  log(event: {
    action: string;
    user: User;
    resource: string;
    result: "success" | "failure" | "denied";
    params?: Record<string, any>;
  }): void {
    const auditEvent: AuditEvent = {
      timestamp: new Date().toISOString(),
      eventId: crypto.randomUUID(),
      userId: event.user.id,
      userEmail: event.user.email,
      userRoles: event.user.roles,
      action: event.action,
      resource: event.resource,
      params: this.sanitizeForAudit(event.params),
      result: event.result,
      serverVersion: this.config.serverVersion,
      environment: this.config.environment,
    };

    this.buffer.push(auditEvent);

    // Immediate flush for security-critical events
    if (event.result === "denied" || this.isCriticalAction(event.action)) {
      this.flush();
    }
  }

  private async flush(): Promise<void> {
    if (this.buffer.length === 0) return;

    const batch = [...this.buffer];
    this.buffer = [];

    try {
      await Promise.all([
        this.sendToCloudWatch(batch),
        this.writeToLocalLog(batch),
      ]);
    } catch (error) {
      console.error("Audit flush failed:", error);
      this.buffer.unshift(...batch); // Re-queue failed events
    }
  }

  private isCriticalAction(action: string): boolean {
    // "scope_elevated" matches the action name logged by the progressive authorization service
    return ["deploy:production", "delete", "permission_change", "scope_elevated"]
      .some(c => action.includes(c));
  }
}

Where to Start: Building Enterprise MCP

The patterns in this post can feel overwhelming. Here’s how to approach implementation without getting stuck.

Start with authentication, not authorization. Before worrying about fine-grained RBAC, get basic authentication working. A simple middleware that validates tokens and attaches user context to requests is enough. You can layer permissions on top later.

Get one workflow working end-to-end before abstracting. The temptation is to build a general-purpose orchestrator. Resist it. Pick your most critical multi-step operation and hardcode the workflow. Only extract patterns once you have two or three working workflows to compare.

Add observability early, not after problems appear. Instrument your first MCP server with metrics and tracing from day one. When authorization issues arise (and they will), you’ll thank yourself for having visibility into what’s happening.

Progressive authorization is worth the complexity. Starting users with minimal scopes and elevating on demand feels like more work than granting broad permissions upfront. But the security posture and audit clarity it provides justifies the investment. Implement it before you have too many tools to retrofit.

Circuit breakers prevent cascading failures. If your MCP servers call external services (databases, APIs, other services), wrap those calls in circuit breakers immediately. One slow dependency can bring down your entire agent system without them.

The key principle: each pattern addresses a specific scaling pain point. Implement patterns when you feel the pain, not before. A working system with basic auth beats an unfinished system with perfect RBAC.

Common Pitfalls

Over-Permissioning

Granting broad permissions to avoid authorization errors during development creates security debt. Start minimal, add incrementally based on actual needs.

Monolithic Tools

Creating large tools that do everything makes them harder to secure, test, and compose. Design small, focused tools that chain together.

Ignoring Partial Failures

Assuming workflows either fully succeed or fully fail leads to inconsistent states. Track workflow state, implement compensation actions, and support resumption from failure points.
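
A minimal sketch of compensation, assuming each step can supply its own undo action: completed steps are recorded, and when a later step fails the recorded steps are compensated in reverse order. `SagaStep` and `runWithCompensation` are hypothetical names, and a production version would also need to persist state so that a crashed run can resume.

```typescript
interface SagaStep {
  name: string;
  run: () => Promise<void>;
  compensate?: () => Promise<void>; // undo action, if the step has one
}

interface SagaResult {
  completed: string[];
  failedAt?: string;
  compensated: string[];
}

async function runWithCompensation(steps: SagaStep[]): Promise<SagaResult> {
  const done: SagaStep[] = [];
  for (const step of steps) {
    try {
      await step.run();
      done.push(step);
    } catch {
      // Undo completed steps, most recent first.
      const compensated: string[] = [];
      for (const prior of [...done].reverse()) {
        if (prior.compensate) {
          await prior.compensate();
          compensated.push(prior.name);
        }
      }
      return { completed: done.map((s) => s.name), failedAt: step.name, compensated };
    }
  }
  return { completed: done.map((s) => s.name), compensated: [] };
}
```

The result records which steps completed and which were undone, which is the information an operator or a resuming run needs to reason about the final state.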

Context Window Blindness

Returning excessive data from tools wastes context window capacity. Return only relevant data, use structured output schemas, and implement progressive loading for large results.
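
Progressive loading can be as simple as cursor-based paging inside the tool handler. The sketch below assumes an in-memory result set and uses the offset as the cursor for clarity. `paginate` and `Page` are hypothetical names, and a real implementation would typically use an opaque cursor so that clients cannot depend on its format.

```typescript
interface Page<T> { items: T[]; nextCursor?: string }

// Return one slice of a large result plus a cursor for the next slice.
function paginate<T>(all: T[], pageSize: number, cursor?: string): Page<T> {
  const start = cursor ? Number.parseInt(cursor, 10) : 0;
  if (!Number.isInteger(start) || start < 0 || start > all.length) {
    throw new Error("Invalid cursor");
  }
  const end = Math.min(start + pageSize, all.length);
  return {
    items: all.slice(start, end),
    nextCursor: end < all.length ? String(end) : undefined, // absent on the last page
  };
}
```

The agent then requests further pages only when the first one does not answer the question, which keeps the common case cheap in context tokens.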

Security as Afterthought

Adding security after initial implementation leads to architectural rework. Design RBAC from day one, implement security middleware before any tools.

Key Takeaways

RBAC is foundational: Not optional for enterprise deployments. Design your permission model early.

Progressive authorization works: Start with minimal scopes, elevate as needed. Don’t pre-authorize everything.

Composable tools scale better: Small tools that chain together are more maintainable and flexible than monolithic alternatives.

Plan for partial failures: Workflows should be resumable. Track state and implement compensation patterns.

Observability enables everything else: You cannot secure or optimize what you cannot measure. Invest in metrics, tracing, and audit logging from the start.

Gateway pattern simplifies operations: For multi-server deployments, a gateway centralizes authentication, routing, and monitoring.
