25 GPT-5.5 Prompts for Software Architecture and System Design

25 GPT-5.5 Prompts for Software Architecture and System Design

Article header image

In the rapidly evolving landscape of software engineering, mastering system design and architecture is paramount for developing scalable, maintainable, and high-performance applications. Architects today face multifaceted challenges, ranging from microservices orchestration and data consistency to security compliance and cost optimization. With the advent of GPT-5.5, engineers and architects can leverage advanced AI-driven reasoning to streamline complex design challenges, enabling more insightful decision-making and reducing iterative design cycles.

This comprehensive playbook introduces 25 meticulously crafted GPT-5.5 prompts tailored specifically for software architecture and system design, categorized into five critical domains: scalability, reliability, security, maintainability, and performance optimization. Each prompt is designed with explicit system instructions, contextual variables, and output schemas that align with real-world engineering workflows. For example, prompts guide the design of a real-time collaborative editor with conflict resolution mechanisms or architect a high-throughput, fault-tolerant payment gateway with multi-region deployment.

Why GPT-5.5 for System Design?

GPT-5.5 incorporates cutting-edge advancements in natural language understanding and contextual reasoning, enabling it to assist in generating detailed architectural blueprints, technology stack evaluations, and trade-off analyses. Unlike earlier models, GPT-5.5 can interpret nuanced system requirements, propose modular designs, and provide justifications grounded in established software engineering principles.

Key Capabilities Include:

  • Contextual Awareness: Understands complex system requirements and constraints.
  • Multi-turn Reasoning: Engages in iterative dialogue to refine design proposals.
  • Schema-Driven Outputs: Produces structured JSON/YAML schemas for integration with CI/CD pipelines and architecture documentation tools.
  • Code and Configuration Generation: Generates production-grade code snippets, deployment manifests, and infrastructure-as-code templates.

How to Use This Playbook

Each prompt in this collection includes:

  1. System Instructions: High-level guidance on what the prompt aims to achieve.
  2. Contextual Variables: Dynamic variables representing system parameters such as throughput, latency, user base size, or compliance requirements.
  3. Output Schema: Defines the expected format of the AI’s response, facilitating easy parsing and integration.
  4. Example Use Case: Demonstrations of prompt application with detailed walkthroughs, including code snippets, architectural diagrams, and deployment scripts.

Example: Designing a Real-Time Collaborative Editor

Consider a prompt designed to help architect a real-time collaborative text editor supporting thousands of concurrent users. GPT-5.5 can assist in generating:

  • Event-driven architecture diagrams featuring Operational Transformation (OT) or Conflict-free Replicated Data Types (CRDTs).
  • WebSocket-based communication protocols with fallback strategies.
  • Scalable backend components, including distributed state synchronization and persistence layers.
  • Sample code snippets in TypeScript demonstrating client-server synchronization logic.
interface DocumentChange {
  userId: string;
  timestamp: number;
  operations: Operation[];
}

type Operation = Insert | Delete;

interface Insert {
  type: 'insert';
  position: number;
  text: string;
}

interface Delete {
  type: 'delete';
  position: number;
  length: number;
}

// Example function to apply an operation
function applyOperation(doc: string, op: Operation): string {
  switch(op.type) {
    case 'insert':
      return doc.slice(0, op.position) + op.text + doc.slice(op.position);
    case 'delete':
      return doc.slice(0, op.position) + doc.slice(op.position + op.length);
  }
}

Example: Architecting a High-Throughput Payment Gateway

For a payment gateway handling millions of transactions per day, GPT-5.5 can generate detailed designs encompassing:

  • Microservices for transaction processing, fraud detection, and settlement.
  • Event sourcing and CQRS patterns for auditability and scalability.
  • Infrastructure-as-Code (IaC) scripts for multi-region Kubernetes deployments on AWS EKS or Google GKE.
  • Security best practices, including PCI-DSS compliance checklists and tokenization strategies.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-processor
  labels:
    app: payment-gateway
spec:
  replicas: 10
  selector:
    matchLabels:
      app: payment-gateway
      component: processor
  template:
    metadata:
      labels:
        app: payment-gateway
        component: processor
    spec:
      containers:
      - name: processor
        image: payment-processor:v1.2.0
        ports:
        - containerPort: 8080
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: payment-db-credentials
              key: url
        resources:
          requests:
            cpu: "500m"
            memory: "1Gi"
          limits:
            cpu: "2"
            memory: "4Gi"
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /readyz
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 5

By integrating these prompts into your workflow, you can harness GPT-5.5’s capabilities to elevate your architectural decisions and accelerate system design processes significantly. This playbook not only serves as a starting point but also as a continuously extensible framework for evolving your design methodologies alongside emerging technological trends.

1. Microservices & Distributed Systems

Designing microservices and distributed systems demands a meticulous approach to separation of concerns, resilience engineering, and efficient inter-service communication patterns. These architectures must handle complex operational challenges such as fault tolerance, scalability, eventual consistency, and observability. Leveraging GPT-5.5’s advanced contextual understanding, architects can generate nuanced system designs that incorporate best practices, industry standards, and emerging patterns to maximize system robustness and responsiveness under diverse workloads.

Microservices architecture breaks down monolithic applications into independently deployable services, each encapsulating a specific business capability. Distributed systems extend this concept across networked nodes, emphasizing scalability, availability, and partition tolerance. Key design principles include defining clear service boundaries, selecting appropriate communication protocols, implementing fault isolation, and enabling seamless scalability.

Core Design Considerations

  • Service Boundaries: Define services around business capabilities or bounded contexts to minimize coupling.
  • Communication Patterns: Choose synchronous (REST, gRPC) or asynchronous (messaging, event-driven) communication based on latency and consistency needs.
  • Fault Tolerance: Implement circuit breakers, retries with exponential backoff, bulkheads, and failover mechanisms to ensure resilience.
  • Data Management: Use eventual consistency, distributed transactions, or sagas to maintain data integrity across services.
  • Observability: Integrate centralized logging, metrics, tracing (e.g., OpenTelemetry) for monitoring and diagnostics.
  • Security: Enforce authentication, authorization, and secure communication channels (TLS, mTLS).

Production-Grade Infrastructure Components

  • Service Mesh: Tools like Istio or Linkerd provide advanced traffic management, observability, and security features.
  • API Gateway: Centralized entry point for routing, authentication, rate-limiting, and request aggregation.
  • Message Brokers: Kafka, RabbitMQ, or AWS SNS/SQS enable reliable asynchronous communication.
  • Container Orchestration: Kubernetes for automated deployment, scaling, and management.
  • Configuration Management: Use tools like Consul or Vault for dynamic configuration and secrets management.
  1. Prompt: Designing a Scalable Microservices Architecture for an E-commerce Platform

    System Instructions: “Generate a detailed microservices architecture for a global e-commerce platform supporting user accounts, product catalogs, order processing, and payment handling. Emphasize service boundaries, inter-service communication, and fault tolerance.”

    Context Variables:

    • userBaseSize: 10 million monthly active users
    • peakRequestRate: 50,000 requests per second
    • paymentProviders: Stripe, PayPal
    • regions: US, EU, APAC

    Output Schema:

    {
      "services": [
        {"name": "UserService", "responsibilities": "Manage user profiles and authentication", "communication": "REST API"},
        {"name": "CatalogService", "responsibilities": "Product listings and inventory", "communication": "gRPC"},
        {"name": "OrderService", "responsibilities": "Order lifecycle management", "communication": "Event-driven via message queue"},
        {"name": "PaymentService", "responsibilities": "Process payments with external providers", "communication": "REST API with retries"}
      ],
      "communicationPatterns": ["Synchronous REST between UserService and CatalogService", "Asynchronous events between OrderService and PaymentService"],
      "scalabilityStrategies": ["Auto-scaling groups per service", "Regional failover", "Circuit breaker for payment calls"]
    }

    Detailed Architecture Analysis

    The platform consists of four core microservices, each optimized for its domain:

    • UserService: Handles user registration, authentication (OAuth2/JWT), profile management. Exposes RESTful endpoints secured with TLS and rate-limiting.
    • CatalogService: Maintains product data, inventory counts, and search indexes. Uses gRPC for high-performance, low-latency RPC calls, enabling synchronous queries from UserService and frontend APIs.
    • OrderService: Manages order creation, updates, and status tracking through event-driven mechanisms (e.g., Kafka or AWS Kinesis). This decouples order processing from synchronous requests, improving throughput and fault isolation.
    • PaymentService: Integrates with external payment gateways (Stripe, PayPal) via REST APIs. Implements retry policies, circuit breakers (e.g., Hystrix or Resilience4j), and idempotency keys to prevent double charges.

    Production-Grade Configuration Example (Kubernetes Deployment Snippet)

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: payment-service
      labels:
        app: payment-service
    spec:
      replicas: 5
      selector:
        matchLabels:
          app: payment-service
      template:
        metadata:
          labels:
            app: payment-service
        spec:
          containers:
          - name: payment-service
            image: company/payment-service:1.2.0
            ports:
            - containerPort: 8080
            env:
            - name: STRIPE_API_KEY
              valueFrom:
                secretKeyRef:
                  name: payment-secrets
                  key: stripeApiKey
            - name: PAYPAL_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  name: payment-secrets
                  key: paypalClientId
            readinessProbe:
              httpGet:
                path: /health
                port: 8080
              initialDelaySeconds: 10
              periodSeconds: 15
            livenessProbe:
              httpGet:
                path: /health
                port: 8080
              initialDelaySeconds: 30
              periodSeconds: 30
            resources:
              requests:
                cpu: "500m"
                memory: "512Mi"
              limits:
                cpu: "2"
                memory: "2Gi"
            # Circuit breaker and retry implemented in service code using resilience4j or similar
    

    Inter-Service Communication Patterns

    Communication Type Source Service Destination Service Protocol Purpose Fault-Tolerance Mechanism
    Synchronous UserService CatalogService gRPC Retrieve product details for user browsing Timeouts, retries with exponential backoff
    Asynchronous OrderService PaymentService Message Queue (e.g., Kafka) Trigger payment processing after order placement Message retries, dead-letter queues, circuit breakers
  2. Prompt: Fault-Tolerant Distributed System for Real-Time Collaborative Editor

    System Instructions: “Design a distributed system architecture enabling real-time collaboration on documents with low latency and conflict resolution.”

    Context Variables:

    • concurrentUsersPerDocument: 1000
    • latencyRequirement: <100ms
    • conflictResolutionStrategy: Operational Transformation (OT) or CRDTs
    • persistence: Event sourcing with CQRS

    Output Schema:

    {
      "components": [
        {"name": "CollaborationService", "protocol": "WebSockets", "function": "Broadcast real-time changes"},
        {"name": "ConflictResolutionEngine", "algorithm": "CRDT", "function": "Merge concurrent edits"},
        {"name": "PersistenceLayer", "pattern": "Event Sourcing", "function": "Store all edit events"},
        {"name": "QueryService", "pattern": "CQRS", "function": "Serve read requests efficiently"}
      ],
      "dataFlow": ["Client <-> CollaborationService <-> ConflictResolutionEngine <-> PersistenceLayer"],
      "faultTolerance": ["Automatic reconnection", "Event replay for recovery", "Conflict-free merges"]
    }

    Architecture Breakdown

    • CollaborationService: Uses WebSocket connections for bidirectional, low-latency communication. Implements presence and session management to track active users.
    • ConflictResolutionEngine: Applies CRDT algorithms (e.g., RGA, LWW-Element-Set) which enable commutative and idempotent merges ensuring eventual consistency without locking.
    • PersistenceLayer: Event sourcing stores immutable edit events in an append-only log, facilitating auditability and temporal queries.
    • QueryService: Implements CQRS pattern separating write operations (events) from read models optimized for fast UI queries, updated asynchronously.

    Technology Stack Recommendations

    • WebSocket Framework: Use frameworks like Socket.IO (Node.js), SignalR (.NET), or Phoenix Channels (Elixir) for real-time communication.
    • CRDT Libraries: Automerge, Yjs for JavaScript clients; delta-crdts for backend implementations.
    • Event Store: Apache Kafka, EventStoreDB, or cloud services like AWS Kinesis for reliable event persistence.
    • Database for Read Models: NoSQL databases such as MongoDB or Elasticsearch optimized for read-heavy loads.

    Code Snippet: Implementing CRDT-based Merge

    import Automerge from 'automerge';
    
    // Initialize document
    let doc1 = Automerge.init();
    doc1 = Automerge.change(doc1, doc => {
      doc.text = new Automerge.Text();
      doc.text.insertAt(0, ...'Hello');
    });
    
    // Simulate concurrent edit on doc2
    let doc2 = Automerge.clone(doc1);
    doc2 = Automerge.change(doc2, doc => {
      doc.text.insertAt(0, 'A');
    });
    
    // Merge changes
    let merged = Automerge.merge(doc1, doc2);
    console.log(merged.text.toString()); // Output: "AHello"
    

    Fault Tolerance and Recovery Strategies

    1. Automatic Reconnection: Clients automatically reconnect to CollaborationService with exponential backoff on connection loss.
    2. Event Replay: Upon reconnection or service restart, missed events are replayed from the event store to synchronize client state.
    3. Conflict-Free Merges: CRDT algorithms ensure that concurrent edits merge deterministically without user intervention.
  3. Prompt: Partitioning Strategy for High-Volume Distributed Messaging System

    System Instructions: “Recommend an optimal partitioning and sharding strategy for a distributed messaging system processing billions of messages daily.”

    Context Variables:

    • messageSizeAvg: 2KB
    • retentionPeriod: 30 days
    • consumersCount: 5000
    • throughputRequirement: 100,000 messages per second

    Output Schema:

    {
      "partitionStrategy": "Topic-based partitioning with key hashing",
      "shardCount": 256,
      "replicationFactor": 3,
      "loadBalancing": "Round-robin among consumer groups",
      "dataPlacement": "Geo-distributed clusters with latency-aware routing"
    }

    Comprehensive Partitioning and Sharding Strategy

    For a system ingesting billions of messages daily at an average size of 2KB, the partitioning strategy must optimize throughput, minimize latency, and support fault tolerance:

    • Topic-Based Partitioning: Messages are categorized into topics representing logical streams (e.g., user activity, transactions). Each topic is divided into multiple partitions to parallelize consumption.
    • Key Hashing: Within each topic, messages are assigned to partitions based on a hash of their key (e.g., userID, sessionID). This maintains message order per key while distributing load evenly.
    • Shard Count: A shard count of 256 partitions balances parallelism and management overhead. This number should be chosen based on expected throughput and cluster capacity.
    • Replication Factor: A replication factor of 3 ensures data durability and availability during node failures.
    • Load Balancing: Consumers are grouped logically, with round-robin assignment to partitions, ensuring balanced processing and scalability.
    • Geo-Distributed Clusters: Deploy clusters across multiple geographic regions with latency-aware routing to serve consumers closer to their location, reducing end-to-end latency.

    Example Kafka Topic Configuration

    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaTopic
    metadata:
      name: user-activity
      labels:
        strimzi.io/cluster: my-cluster
    spec:
      partitions: 256
      replicas: 3
      config:
        retention.ms: 2592000000  # 30 days in milliseconds
        segment.bytes: 1073741824 # 1 GB segment size for optimal log compaction
        cleanup.policy: delete
    

    Consumer Group Load Balancing Example (Java)

    Properties props = new Properties();
    props.put("bootstrap.servers", "broker1:9092,broker2:9092");
    props.put("group.id", "consumer-group-1");
    props.put("enable.auto.commit", "false");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
    
    KafkaConsumer consumer = new KafkaConsumer<>(props);
    consumer.subscribe(Arrays.asList("user-activity"));
    
    // Consumer loop
    while (true) {
        ConsumerRecords records = consumer.poll(Duration.ofMillis(100));
        for (ConsumerRecord record : records) {
            // Process message
        }
        consumer.commitSync();
    }
    

    Latency-Aware Geo-Distribution Strategy

    Implementing geo-distributed clusters requires:

    • Multi-region Kafka clusters with MirrorMaker 2.0 for cross-region replication.
    • Routing clients to nearest cluster endpoints using DNS-based load balancing or Anycast IP.
    • Latency monitoring to dynamically adjust routing policies.
    • Consistent hashing to ensure messages with the same key route to the same partition across clusters, preserving ordering.

    Summary Table: Partitioning Strategy Parameters

    Parameter Value Description
    Partition Strategy Topic-based with Key Hashing Ensures ordered processing per key and balanced load
    Shard Count 256 Number of partitions per topic for parallelism
    Replication Factor 3 Number of replicas to ensure durability and availability
    Load Balancing Round-robin among consumer groups Distributes message processing evenly
    Data Placement Geo-distributed clusters Reduces latency via proximity routing and replication

2. API Design & Integration

Section illustration

API design is fundamental to creating modular, scalable, and maintainable system architectures. A well-designed API not only enables seamless communication between components and services but also enforces clear contracts that reduce integration errors and simplify future enhancements. Leveraging GPT-5.5 can dramatically accelerate the process of defining precise, consistent, and extensible API contracts, while also supporting robust integration strategies that align with best practices in modern software engineering.

In this section, we explore advanced prompts tailored to generate comprehensive API designs, covering RESTful and GraphQL specifications, as well as API Gateway architectures for hybrid cloud environments. Each prompt is enriched with contextual variables and detailed output schemas, catering to real-world complexities such as multi-tenancy, role-based access control, audit logging, real-time data streaming, and security observability.

Key Considerations for API Design & Integration

  • Consistency: Uniform naming conventions, data formats, and error handling mechanisms across all endpoints.
  • Security: Authentication, authorization, and encryption strategies that protect data in transit and at rest.
  • Scalability: Design for horizontal scaling, rate limiting, and efficient pagination to handle growing user bases.
  • Extensibility: Versioning schemes and extensible schemas to accommodate evolving business requirements.
  • Observability: Integration of logging, metrics, and distributed tracing to monitor API health and performance.
  1. Prompt: Defining RESTful API Specification for a Multi-Tenant SaaS Application

    System Instructions: “Generate a RESTful API specification for a SaaS application that supports multi-tenancy with role-based access control and audit logging.”

    Context Variables:

    • tenancyModel: Isolated databases per tenant
    • roles: Admin, User, Auditor
    • loggingRequired: True

    Output Schema:

    {
      "endpoints": [
        {
          "path": "/tenants/{tenantId}/users",
          "methods": ["GET", "POST"],
          "rolesAllowed": ["Admin"],
          "description": "Retrieve or create users within the specified tenant."
        },
        {
          "path": "/tenants/{tenantId}/users/{userId}",
          "methods": ["GET", "PUT", "DELETE"],
          "rolesAllowed": ["Admin", "User"],
          "description": "Retrieve, update, or delete a specific user. Users can access their own profile."
        },
        {
          "path": "/tenants/{tenantId}/audit-logs",
          "methods": ["GET"],
          "rolesAllowed": ["Auditor"],
          "description": "Access audit logs for the specified tenant for compliance and monitoring."
        }
      ],
      "authentication": "OAuth 2.0 with JWT tokens issued per tenant",
      "rateLimiting": "1000 requests per minute per tenant",
      "responseFormat": "JSON API standard",
      "errorHandling": {
        "standardErrors": {
          "400": "Bad Request - Invalid input data",
          "401": "Unauthorized - Invalid or missing token",
          "403": "Forbidden - Insufficient permissions",
          "404": "Not Found - Resource does not exist",
          "429": "Too Many Requests - Rate limit exceeded",
          "500": "Internal Server Error - Unexpected error"
        }
      },
      "auditLogging": {
        "enabled": true,
        "logFields": ["timestamp", "userId", "tenantId", "action", "resource", "statusCode", "ipAddress"],
        "storage": "Encrypted, append-only logs stored in a secure data lake"
      }
    }

    Detailed Walkthrough

    This API design supports a multi-tenant SaaS system where each tenant operates in an isolated database environment, enhancing data security and reducing cross-tenant interference.

    Role-based access control (RBAC) enforces strict permissions:

    • Admin: Full user management within the tenant scope.
    • User: Access only to their own profile for reading and updating.
    • Auditor: Read-only access to audit logs for compliance monitoring.

    Authentication employs OAuth 2.0 with JWT tokens that include tenant claims to ensure tenant isolation at the token validation stage.

    Example: OpenAPI 3.0 Snippet for User Endpoints

    openapi: 3.0.3
    info:
      title: Multi-Tenant SaaS API
      version: 1.0.0
    servers:
      - url: https://api.saasapp.com
    paths:
      /tenants/{tenantId}/users:
        get:
          summary: List users in a tenant
          security:
            - oauth2: [admin]
          parameters:
            - name: tenantId
              in: path
              required: true
              schema:
                type: string
          responses:
            '200':
              description: A list of users
              content:
                application/vnd.api+json:
                  schema:
                    type: array
                    items:
                      $ref: '#/components/schemas/User'
            '401':
              $ref: '#/components/responses/Unauthorized'
        post:
          summary: Create a new user in a tenant
          security:
            - oauth2: [admin]
          requestBody:
            required: true
            content:
              application/vnd.api+json:
                schema:
                  $ref: '#/components/schemas/NewUser'
          responses:
            '201':
              description: User created
            '400':
              $ref: '#/components/responses/BadRequest'
      /tenants/{tenantId}/users/{userId}:
        get:
          summary: Get user by ID
          security:
            - oauth2: [admin, user]
          parameters:
            - name: tenantId
              in: path
              required: true
              schema:
                type: string
            - name: userId
              in: path
              required: true
              schema:
                type: string
          responses:
            '200':
              description: User details
              content:
                application/vnd.api+json:
                  schema:
                    $ref: '#/components/schemas/User'
            '403':
              $ref: '#/components/responses/Forbidden'
    components:
      schemas:
        User:
          type: object
          properties:
            id:
              type: string
            username:
              type: string
            email:
              type: string
            role:
              type: string
              enum: [Admin, User, Auditor]
        NewUser:
          type: object
          required:
            - username
            - email
          properties:
            username:
              type: string
            email:
              type: string
            role:
              type: string
              enum: [Admin, User, Auditor]
      securitySchemes:
        oauth2:
          type: oauth2
          flows:
            clientCredentials:
              tokenUrl: https://auth.saasapp.com/token
              scopes:
                admin: Full admin access
                user: Basic user access
                auditor: Read-only audit access
      responses:
        Unauthorized:
          description: Unauthorized access
        BadRequest:
          description: Invalid request parameters or payload
        Forbidden:
          description: Insufficient permissions

    This OpenAPI definition enforces role-based scopes and includes detailed request and response schemas following the JSON API standard.

  2. Prompt: Designing GraphQL API for a Social Media Platform

    System Instructions: “Create a GraphQL schema that supports user profiles, posts, comments, and real-time notifications.”

    Context Variables:

    • subscriptionsRequired: True
    • paginationStyle: Cursor-based
    • authorizationModel: Role-based with field-level permissions

    Output Schema:

    {
      "types": [
        {
          "name": "User",
          "fields": [
            {"name": "id", "type": "ID!"},
            {"name": "name", "type": "String!"},
            {"name": "email", "type": "String!"},
            {"name": "profilePicture", "type": "String"},
            {"name": "friends", "type": "[User!]!"}
          ],
          "description": "Represents a user profile with personal information and social connections."
        },
        {
          "name": "Post",
          "fields": [
            {"name": "id", "type": "ID!"},
            {"name": "author", "type": "User!"},
            {"name": "content", "type": "String!"},
            {"name": "comments", "type": "[Comment!]!"},
            {"name": "createdAt", "type": "DateTime!"}
          ],
          "description": "A social media post made by a user."
        },
        {
          "name": "Comment",
          "fields": [
            {"name": "id", "type": "ID!"},
            {"name": "author", "type": "User!"},
            {"name": "content", "type": "String!"},
            {"name": "createdAt", "type": "DateTime!"}
          ],
          "description": "A comment made on a post."
        },
        {
          "name": "Notification",
          "fields": [
            {"name": "id", "type": "ID!"},
            {"name": "type", "type": "String!"},
            {"name": "message", "type": "String!"},
            {"name": "read", "type": "Boolean!"}
          ],
          "description": "Real-time notifications sent to users."
        }
      ],
      "queries": [
        {"name": "getUser", "args": {"id": "ID!"}, "returnType": "User"},
        {"name": "getPosts", "args": {"cursor": "String", "limit": "Int"}, "returnType": "[Post!]!"}
      ],
      "mutations": [
        {"name": "createPost", "args": {"content": "String!"}, "returnType": "Post"},
        {"name": "addComment", "args": {"postId": "ID!", "content": "String!"}, "returnType": "Comment"}
      ],
      "subscriptions": [
        {"name": "onNewNotification", "args": {"userId": "ID!"}, "returnType": "Notification"}
      ],
      "authorization": {
        "model": "role-based",
        "fieldLevelPermissions": true,
        "roles": ["Admin", "User", "Guest"]
      },
      "pagination": {
        "style": "cursor-based",
        "defaultLimit": 20,
        "maxLimit": 100
      }
    }

    Schema Definition Example (SDL)

    scalar DateTime
    
    type User {
      id: ID!
      name: String!
      email: String!
      profilePicture: String
      friends: [User!]!
    }
    
    type Post {
      id: ID!
      author: User!
      content: String!
      comments: [Comment!]!
      createdAt: DateTime!
    }
    
    type Comment {
      id: ID!
      author: User!
      content: String!
      createdAt: DateTime!
    }
    
    type Notification {
      id: ID!
      type: String!
      message: String!
      read: Boolean!
    }
    
    type Query {
      getUser(id: ID!): User
      getPosts(cursor: String, limit: Int = 20): [Post!]!
    }
    
    type Mutation {
      createPost(content: String!): Post
      addComment(postId: ID!, content: String!): Comment
    }
    
    type Subscription {
      onNewNotification(userId: ID!): Notification
    }

    Authorization Strategy

    Field-level permissions are enforced dynamically at runtime via middleware or resolver wrappers. For example, the email field on User type might only be visible to Admin and the user themselves, while friends list is public to all authenticated users.

    Cursor-Based Pagination Implementation

    This approach uses opaque cursors to paginate through posts efficiently, preventing issues with offset-based pagination such as inconsistent results on data changes. Example query:

    query GetPosts($cursor: String, $limit: Int) {
      getPosts(cursor: $cursor, limit: $limit) {
        id
        content
        createdAt
      }
    }

    Real-Time Notifications with Subscriptions

    Subscriptions utilize WebSocket connections to push real-time updates to clients. The onNewNotification subscription filters notifications by userId, ensuring users receive only their own notifications.

    Example Resolver Snippet (Node.js with Apollo Server)

    const { PubSub, withFilter } = require('apollo-server');
    const pubsub = new PubSub();
    const NOTIFICATION_TOPIC = 'NEW_NOTIFICATION';
    
    const resolvers = {
      Subscription: {
        onNewNotification: {
          subscribe: withFilter(
            () => pubsub.asyncIterator([NOTIFICATION_TOPIC]),
            (payload, variables, context) => {
              return payload.userId === context.user.id;
            }
          )
        }
      },
      Mutation: {
        createPost: async (_, { content }, { user }) => {
          // Save post logic here
          return createdPost;
        },
        addComment: async (_, { postId, content }, { user }) => {
          // Save comment logic here
          return createdComment;
        }
      }
    };
  3. Prompt: API Gateway Design for Hybrid Cloud Integration

    System Instructions: “Develop an API Gateway architecture that integrates on-premise legacy systems with cloud-native microservices, ensuring security and observability.”

    Context Variables:

    • legacyProtocols: SOAP, JMS
    • cloudServices: RESTful microservices with OAuth 2.0
    • securityRequirements: TLS termination, JWT validation
    • observabilityTools: Distributed tracing, metrics aggregation

    Output Schema:

    {
      "gatewayComponents": [
        {
          "name": "ProtocolTranslator",
          "function": "Convert SOAP/JMS to REST",
          "details": "Provides adapters that transform legacy SOAP requests into RESTful JSON APIs and vice versa, leveraging XML parsers and JMS clients."
        },
        {
          "name": "SecurityModule",
          "function": "TLS termination, JWT validation",
          "details": "Handles SSL/TLS termination, verifies JWT tokens against the identity provider, and enforces scopes for API access."
        },
        {
          "name": "RoutingModule",
          "function": "Dynamic routing based on service registry",
          "details": "Uses a service discovery mechanism (e.g., Consul, Eureka) to route requests to appropriate microservices or legacy endpoints."
        },
        {
          "name": "ObservabilityModule",
          "function": "Integrate tracing and metrics",
          "details": "Implements distributed tracing (e.g., OpenTelemetry), aggregates Prometheus metrics, and exports logs to centralized systems."
        }
      ],
      "deploymentModel": "High-availability cluster with autoscaling",
      "failoverStrategy": "Circuit breaker and fallback endpoints",
      "additionalFeatures": {
        "rateLimiting": "Configured per client and tenant",
        "caching": "Response caching with Redis",
        "requestValidation": "Schema validation for inbound requests"
      }
    }

    Architecture Diagram Description

    The API Gateway acts as a centralized ingress point, bridging legacy on-premise systems and modern cloud-native microservices. The ProtocolTranslator enables seamless interoperability by converting SOAP/XML and JMS messages into RESTful calls and JSON payloads.

    The SecurityModule manages secure communications by terminating TLS sessions and validating JWTs issued by a centralized Identity and Access Management (IAM) system, ensuring only authenticated and authorized requests proceed.

    The RoutingModule dynamically routes requests based on the service registry, providing resilience and scalability. It supports canary deployments and blue-green releases by tagging services accordingly.

    The ObservabilityModule integrates distributed tracing (using OpenTelemetry), aggregates metrics via Prometheus exporters, and forwards logs to ELK or similar platforms, enabling comprehensive monitoring and troubleshooting.

    Example: Kubernetes Deployment YAML Snippet with API Gateway

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: hybrid-api-gateway
      labels:
        app: api-gateway
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: api-gateway
      template:
        metadata:
          labels:
            app: api-gateway
        spec:
          containers:
            - name: api-gateway
              image: company/api-gateway:latest
              ports:
                - containerPort: 8080
              env:
                - name: PROTOCOL_TRANSLATOR_ENABLED
    
    

    3. Database Schema & Query Optimization

    Effective database design and query optimization are foundational to achieving high system performance, scalability, and maintainability in any software architecture. In modern systems, databases handle vast volumes of data with diverse workload patterns, requiring a deep understanding of relational and NoSQL schema design principles, normalization forms, indexing strategies, and query execution plans. GPT-5.5 excels in generating highly normalized schemas aligned with business domains, suggesting optimal indexes tailored to workload characteristics, and producing performant query rewrites that reduce latency and resource consumption.

    Stay Ahead of the AI Curve

    Get weekly insights on ChatGPT, OpenAI, and AI tools delivered to your inbox.

    Subscribe to Our Newsletter

    Beyond basic schema generation, GPT-5.5 can produce detailed explanations of normalization stages, identify potential bottlenecks in query plans, and recommend partitioning or sharding strategies for distributed databases. It can also tailor recommendations to specific database engines (e.g., PostgreSQL, MySQL, MongoDB), hardware configurations, and operational constraints. This section explores advanced prompt engineering for database schema design and query optimization with real-world examples and best practices.

    1. Prompt: Designing a Normalized Database Schema for a Subscription Billing System

      System Instructions: "Create a normalized relational schema to support customer subscriptions, plan tiers, usage tracking, and invoicing."

      Detailed Context Variables

      • subscriptionPlans: Basic, Pro, Enterprise — representing multiple pricing tiers with varying feature sets and pricing.
      • billingCycle: Monthly and yearly billing options, affecting invoice generation frequency and proration logic.
      • usageMetrics: Tracking API calls and storage usage to enable metered billing and overage charges.

      Normalized Schema Design Explanation

      The schema is designed to satisfy Third Normal Form (3NF), ensuring elimination of transitive dependencies and redundant data storage. Each entity represents a single concept, and foreign key constraints enforce referential integrity. This design facilitates flexibility in subscription management, accurate usage tracking, and detailed invoicing.

      Output Schema

      {
        "tables": [
          {"name": "Customers", "columns": [
            "CustomerID SERIAL PRIMARY KEY",
            "Name VARCHAR(255) NOT NULL",
            "Email VARCHAR(255) UNIQUE NOT NULL",
            "CreatedAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP"
          ]},
          {"name": "SubscriptionPlans", "columns": [
            "PlanID SERIAL PRIMARY KEY",
            "Name VARCHAR(50) NOT NULL",
            "Price DECIMAL(10, 2) NOT NULL",
            "BillingCycle VARCHAR(10) CHECK (BillingCycle IN ('Monthly', 'Yearly')) NOT NULL"
          ]},
          {"name": "Subscriptions", "columns": [
            "SubscriptionID SERIAL PRIMARY KEY",
            "CustomerID INT NOT NULL REFERENCES Customers(CustomerID)",
            "PlanID INT NOT NULL REFERENCES SubscriptionPlans(PlanID)",
            "StartDate DATE NOT NULL",
            "EndDate DATE"
          ]},
          {"name": "UsageRecords", "columns": [
            "UsageID SERIAL PRIMARY KEY",
            "SubscriptionID INT NOT NULL REFERENCES Subscriptions(SubscriptionID)",
            "Metric VARCHAR(50) NOT NULL CHECK (Metric IN ('API Calls', 'Storage Used'))",
            "Amount NUMERIC(15, 4) NOT NULL",
            "Date DATE NOT NULL"
          ]},
          {"name": "Invoices", "columns": [
            "InvoiceID SERIAL PRIMARY KEY",
            "SubscriptionID INT NOT NULL REFERENCES Subscriptions(SubscriptionID)",
            "AmountDue DECIMAL(12, 2) NOT NULL",
            "DueDate DATE NOT NULL",
            "Status VARCHAR(20) CHECK (Status IN ('Pending', 'Paid', 'Overdue')) NOT NULL"
          ]}
        ],
        "relationships": [
          {"from": "Subscriptions.CustomerID", "to": "Customers.CustomerID"},
          {"from": "Subscriptions.PlanID", "to": "SubscriptionPlans.PlanID"},
          {"from": "UsageRecords.SubscriptionID", "to": "Subscriptions.SubscriptionID"},
          {"from": "Invoices.SubscriptionID", "to": "Subscriptions.SubscriptionID"}
        ],
        "normalization": "3NF"
      }

      Production-Grade SQL Table Creation Example (PostgreSQL)

      CREATE TABLE Customers (
        CustomerID SERIAL PRIMARY KEY,
        Name VARCHAR(255) NOT NULL,
        Email VARCHAR(255) UNIQUE NOT NULL,
        CreatedAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP
      );
      
      CREATE TABLE SubscriptionPlans (
        PlanID SERIAL PRIMARY KEY,
        Name VARCHAR(50) NOT NULL,
        Price DECIMAL(10, 2) NOT NULL,
        BillingCycle VARCHAR(10) CHECK (BillingCycle IN ('Monthly', 'Yearly')) NOT NULL
      );
      
      CREATE TABLE Subscriptions (
        SubscriptionID SERIAL PRIMARY KEY,
        CustomerID INT NOT NULL REFERENCES Customers(CustomerID) ON DELETE CASCADE,
        PlanID INT NOT NULL REFERENCES SubscriptionPlans(PlanID),
        StartDate DATE NOT NULL,
        EndDate DATE
      );
      
      CREATE TABLE UsageRecords (
        UsageID SERIAL PRIMARY KEY,
        SubscriptionID INT NOT NULL REFERENCES Subscriptions(SubscriptionID) ON DELETE CASCADE,
        Metric VARCHAR(50) NOT NULL CHECK (Metric IN ('API Calls', 'Storage Used')),
        Amount NUMERIC(15,4) NOT NULL,
        Date DATE NOT NULL
      );
      
      CREATE TABLE Invoices (
        InvoiceID SERIAL PRIMARY KEY,
        SubscriptionID INT NOT NULL REFERENCES Subscriptions(SubscriptionID),
        AmountDue DECIMAL(12, 2) NOT NULL,
        DueDate DATE NOT NULL,
        Status VARCHAR(20) CHECK (Status IN ('Pending', 'Paid', 'Overdue')) NOT NULL
      );
      

      Additional Design Considerations

      • Indexing: Add indexes on foreign key columns (e.g., Subscriptions.CustomerID) to speed up joins.
      • Partitioning: For large UsageRecords, consider partitioning by date to improve query performance.
      • Proration Logic: Implement stored procedures or application logic to handle billing proration on subscription plan changes.
      • Audit Trails: Optionally add audit tables or use triggers to track subscription changes.
    2. Prompt: Query Optimization for High-Throughput Financial Transactions

      System Instructions: "Analyze and optimize SQL queries for a payment processing system handling 10,000 transactions per second."

      Context Variables

      • databaseEngine: PostgreSQL 13+
      • queries: SELECT * FROM transactions WHERE status='pending' AND created_at > now() - interval '5 minutes'
      • indexesAvailable: Primary key on transaction_id, non-clustered index on status
      • hardwareSpecs: 32-core CPU, 256GB RAM, NVMe SSD storage for low latency I/O

      Analysis of Original Query

      The original query uses a WHERE clause filtering on status and a recent created_at timestamp. The existing index on status alone is insufficient to efficiently filter by both conditions, causing sequential scans or inefficient index scans. Fetching all columns with SELECT * further increases I/O and CPU usage, particularly under high concurrency.

      Optimized Query and Index Recommendations

      {
        "optimizedQuery": "SELECT transaction_id, amount, created_at FROM transactions WHERE status = 'pending' AND created_at > now() - interval '5 minutes'",
        "indexesSuggested": ["CREATE INDEX idx_status_created_at ON transactions (status, created_at DESC)"],
        "explainPlan": {
          "seqScan": false,
          "indexScan": true,
          "cost": "Low",
          "rowsEstimated": 5000
        },
        "additionalRecommendations": [
          "Partition transactions table by time (e.g., monthly partitions) to limit data scanned",
          "Run VACUUM ANALYZE regularly to maintain index statistics",
          "Use prepared statements and connection pooling to reduce overhead",
          "Employ query parallelism if supported by the PostgreSQL version"
        ]
      }

      Example: Creating Composite Index in PostgreSQL

      CREATE INDEX idx_status_created_at ON transactions (status, created_at DESC);
      

      Step-by-Step Query Execution Improvement Walkthrough

      1. Identify bottleneck: Use EXPLAIN ANALYZE on the original query to confirm sequential scans and high cost.
      2. Refine SELECT clause: Replace SELECT * with only necessary columns to reduce I/O.
      3. Create composite index: Index on (status, created_at DESC) allows the database to efficiently filter and order data.
      4. Partition data: Implement table partitioning by date ranges to reduce the dataset queried.
      5. Maintain statistics: Schedule regular VACUUM and ANALYZE commands to keep planner statistics current.
      6. Monitor performance: Continuously monitor query latency and adjust indexing or partitioning as data grows.

      EXPLAIN Output Example for Optimized Query

      Index Scan using idx_status_created_at on transactions  (cost=0.42..15.67 rows=5000 width=48)
        Index Cond: ((status = 'pending'::text) AND (created_at > (now() - '00:05:00'::interval)))
      

      Additional Best Practices

      • Leverage connection pooling (e.g., PgBouncer) to manage high concurrency.
      • Use asynchronous processing for non-critical transaction steps.
      • Consider using materialized views or caching layers for aggregated transaction metrics.
      • Perform load testing with realistic transaction volumes to validate optimizations.
    3. Prompt: NoSQL Data Modeling for a Location-Based Social Network

      System Instructions: "Design a NoSQL document schema to support geo-queries, friend connections, and event feeds."

      Context Variables

      • databaseType: MongoDB 5.0+
      • geoIndexRequired: True — support efficient geospatial queries near user locations.
      • userConnections: Bidirectional friendships requiring fast lookup and updates.
      • eventFeedUpdateFrequency: Real-time updates with high write throughput.

      NoSQL Schema Design Principles

      Unlike relational models, document databases favor denormalization for performance and scalability. The schema design should optimize for common query patterns such as fetching nearby users, retrieving friends lists, and streaming event feeds. MongoDB's support for GeoJSON and specialized geospatial indexes enables efficient location-based queries.

      Output Schema

      {
        "collections": [
          {
            "name": "Users",
            "fields": {
              "_id": "ObjectId",
              "name": "string",
              "location": {"type": "GeoJSON Point", "index": "2dsphere"},
              "friends": "array of ObjectId (bidirectional references)",
              "lastActive": "timestamp"
            }
          },
          {
            "name": "Events",
            "fields": {
              "_id": "ObjectId",
              "userId": "ObjectId",
              "content": "string",
              "createdAt": "timestamp",
              "location": {"type": "GeoJSON Point", "index": "2dsphere"}
            }
          }
        ],
        "indexes": ["2dsphere on location fields", "Index on userId in Events"]
      }

      MongoDB Schema Creation and Indexing Example

      db.users.createIndex({ location: "2dsphere" });
      db.users.createIndex({ lastActive: 1 });
      
      db.events.createIndex({ location: "2dsphere" });
      db.events.createIndex({ userId: 1 });
      

      Data Model Explanation and Trade-offs

      • Users Collection: Stores user profile and geolocation data with a 2dsphere index to enable queries like "find users within radius X". The friends field holds an array of ObjectIds for bidirectional friendship representation, allowing fast lookup but potentially growing large arrays for popular users.
      • Events Collection: Stores user-generated content with location metadata. The 2dsphere index enables geo-filtering of events, while the index on userId supports efficient retrieval of a user's event feed.

      Example Geo-Query: Find Users Within 5km Radius

      db.users.find({
        location: {
          $nearSphere: {
            $geometry: {
              type: "Point",
              coordinates: [ -73.9667, 40.78 ]
            },
            $maxDistance: 5000
          }
        }
      });
      

      Handling Real-Time Event Feed Updates

      • Use capped collections or change streams in MongoDB to push real-time updates.
      • Consider denormalizing event data into user documents if feed read performance outweighs update complexity.
      • Implement TTL indexes on events to automatically expire outdated data.

      Scaling Considerations

      • Sharding users and events collections by geographic region or user ID hash to distribute load.
      • Use Redis or similar in-memory stores for caching hot friend lists and event feeds.
      • Monitor index sizes and performance using MongoDB Atlas or equivalent monitoring tools.

      Summary Table: Relational vs NoSQL Design Trade-offs

      Aspect Relational (Normalized) NoSQL (Document)
      Schema Flexibility Rigid, predefined tables and relationships Flexible, schema-less or schema-on-read
      Query Complexity Supports complex joins and transactions Limited joins, favors denormalization
      Performance Optimal for ACID compliance and complex queries Optimized for high write/read throughput and horizontal scaling
      Scalability Vertical scaling, sharding complex Horizontal scaling and sharding native

    4. Scalability & Performance Planning

    Section illustration

    Ensuring that software systems scale gracefully under increasing load and maintain optimal performance is a cornerstone of robust system design. This involves a deep understanding of workload characteristics, infrastructure limitations, and failure scenarios. GPT-5.5 can assist architects and engineers by generating detailed, actionable scalability and performance strategies tailored to specific system requirements and operational contexts.

    In this section, we explore advanced prompt templates designed to elicit comprehensive plans addressing scalability, capacity planning, fault tolerance, and performance tuning for complex, high-demand systems. These prompts include precise context variables, expected output schemas, and are augmented with real-world code snippets, configuration examples, and architectural patterns.

    1. Prompt: Scalability Plan for a High-Throughput Payment Gateway

      System Instructions: "Develop a scalability and capacity planning strategy for a payment gateway processing 100,000 transactions per second globally."

      Context Variables:

      • peakTPS: 100,000 transactions per second
      • latencySLA: <50ms end-to-end latency
      • failureTolerance: 99.999% uptime (5 nines)
      • infrastructure: Kubernetes clusters deployed across 5 global regions

      Detailed Scalability Approach:

      The system must support horizontal scaling at both the application and infrastructure layers. Key techniques include:

      • Horizontal Pod Autoscaling: Dynamically adjust the number of pods based on CPU/memory utilization and custom metrics like request rate.
      • Regional Load Balancing: Use geo-aware load balancers (e.g., AWS Global Accelerator, GCP Traffic Director) to route traffic to the nearest healthy region.
      • Message Queue Buffering: Introduce durable queues (e.g., Apache Kafka, AWS Kinesis) to smooth traffic spikes and decouple front-end from downstream processing.

      Capacity Planning Calculations:

      Estimate resource requirements based on peak TPS and latency constraints:

      Metric Assumptions Calculated Need
      CPU Cores Assuming 50 µs per transaction per core 2000 cores (100,000 TPS × 50 µs = 5,000 ms of CPU time per second → 5 seconds CPU per second, so ~2000 cores)
      Memory 20 MB per core for JVM heap and buffers 4000 GB total (2000 cores × 2 GB)
      Network Bandwidth Assuming 1 KB per transaction payload 100 Gbps (100,000 TPS × 1 KB × 8 bits)

      Fault Tolerance Strategies:

      • Multi-Region Failover: Active-active cluster deployments with cross-region replication to ensure zero downtime during regional outages.
      • Circuit Breakers: Implement using libraries like Netflix Hystrix or Resilience4j to prevent cascading failures.
      • Retry Policies with Exponential Backoff: Mitigate transient failures while avoiding overwhelming the system.

      Monitoring and Observability Setup:

      • Prometheus Metrics: Track custom application metrics such as transaction latency, error rates, and throughput.
      • Alertmanager Alerts: Configure alerts for SLA breaches, resource exhaustion, and anomalous patterns.
      • Distributed Tracing: Use Jaeger to trace transaction flow across microservices, identify bottlenecks, and latency hotspots.

      Example Kubernetes Horizontal Pod Autoscaler Configuration:

      apiVersion: autoscaling/v2beta2
      kind: HorizontalPodAutoscaler
      metadata:
        name: payment-gateway-hpa
        namespace: payments
      spec:
        scaleTargetRef:
          apiVersion: apps/v1
          kind: Deployment
          name: payment-gateway
        minReplicas: 50
        maxReplicas: 1000
        metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 70
        - type: Pods
          pods:
            metric:
              name: transactions_per_second
            target:
              type: AverageValue
              averageValue: 5000

      Sample Kafka Configuration for Message Queuing:

      kafka-topics.sh --create --topic payment-transactions --partitions 500 --replication-factor 3 --config retention.ms=604800000

      This command creates a Kafka topic with 500 partitions to ensure high throughput and replication factor 3 for durability.

      Summary Output Schema:

      {
        "scalingApproach": [
          "Horizontal pod autoscaling",
          "Regional load balancing",
          "Message queue buffering"
        ],
        "capacityPlanning": {
          "cpuCoresRequired": 2000,
          "memoryGBRequired": 4000,
          "networkBandwidthGbps": 100
        },
        "faultTolerance": [
          "Multi-region failover",
          "Circuit breakers",
          "Retry policies with exponential backoff"
        ],
        "monitoringSetup": [
          "Prometheus metrics",
          "Alertmanager alerts",
          "Distributed tracing with Jaeger"
        ]
      }
    2. Prompt: Performance Tuning Strategy for Real-Time Analytics Platform

      System Instructions: "Produce a performance tuning plan for a real-time analytics platform ingesting streaming data and providing sub-second query responses."

      Context Variables:

      • dataIngestionRate: 1 million events per second
      • queryLatencyTarget: <500ms
      • storageType: Columnar OLAP database (e.g., Apache Druid, ClickHouse)
      • computeModel: Serverless functions with autoscaling (e.g., AWS Lambda, Azure Functions)

      Optimization Techniques:

      • Data Pre-Aggregation: Compute rollups and summaries at ingestion time to reduce query complexity.
      • Materialized Views: Maintain pre-computed views to serve frequent queries rapidly.
      • Efficient Partition Pruning: Design partition keys to enable the query engine to skip irrelevant data segments, reducing I/O.

      Resource Allocation Strategies:

      • Dynamic Scaling: Serverless functions auto-scale based on incoming event rate and query concurrency.
      • In-Memory Caching: Use Redis or Memcached to cache hot query results and metadata.

      Bottleneck Mitigation:

      • Backpressure Handling: Implement mechanisms to slow data ingestion when downstream systems are overwhelmed, such as Kafka’s flow control or reactive streams.
      • Query Concurrency Limits: Throttle queries during peak load to maintain SLA.

      Monitoring Setup:

      • Latency Histograms: Track distribution of query response times to detect tail latency issues.
      • Error Rate Tracking: Monitor ingestion failures, query errors, and system exceptions.
      • Resource Utilization Dashboards: Visualize CPU, memory, and network usage in real-time.

      Example Apache Druid Tuning Snippet for Ingestion:

      {
        "type" : "index_parallel",
        "spec": {
          "dataSchema": {
            "granularitySpec": {
              "type": "uniform",
              "segmentGranularity": "HOUR",
              "queryGranularity": "MINUTE"
            },
            "metricsSpec": [
              {"type":"count","name":"count"},
              {"type":"doubleSum","name":"total_revenue","fieldName":"revenue"}
            ],
            "dimensionsSpec": {
              "dimensions": ["user_id", "product_id", "region"]
            }
          },
          "ioConfig": {
            "type": "index_parallel",
            "inputSource": {"type": "kafka", "topic": "events-topic"},
            "inputFormat": {"type": "json"}
          },
          "tuningConfig": {
            "type": "index_parallel",
            "maxRowsPerSegment": 5000000,
            "maxTotalRows": 100000000,
            "partitionsSpec": {
              "type": "hashed",
              "targetPartitionSize": 500000
            }
          }
        }
      }

      Serverless Autoscaling Example with AWS Lambda (Node.js):

      exports.handler = async (event) => {
        // Process streaming event
        // Ensure idempotency and minimal cold start latency by keeping dependencies lightweight
        return {
          statusCode: 200,
          body: JSON.stringify('Processed event successfully'),
        };
      };

      Summary Output Schema:

      {
        "optimizationTechniques": [
          "Data pre-aggregation",
          "Materialized views",
          "Efficient partition pruning"
        ],
        "resourceAllocation": [
          "Dynamic scaling of compute nodes",
          "In-memory caching for hot data"
        ],
        "bottleneckMitigation": [
          "Backpressure handling on ingestion",
          "Query concurrency limits"
        ],
        "monitoring": [
          "Latency histograms",
          "Error rate tracking",
          "Resource utilization dashboards"
        ]
      }
    3. Prompt: Load Balancing Strategy for Global Video Streaming Service

      System Instructions: "Design a load balancing strategy for a video streaming service with millions of concurrent viewers worldwide."

      Context Variables:

      • concurrentViewers: 5 million
      • contentDeliveryNetwork: Integrated CDN usage (e.g., Cloudflare, Akamai)
      • latencyTarget: <200ms startup time
      • failoverMechanism: Automatic rerouting on node failure

      Comprehensive Load Balancing Strategy:

      To efficiently serve a massive global audience with low startup latency, the system must leverage multi-layered load balancing and edge caching:

      • Global DNS-Based Routing: Use a DNS provider with latency-based routing (e.g., AWS Route 53 latency routing) to direct users to the closest regional cluster.
      • Regional Application Load Balancers: Within each region, application load balancers distribute incoming streaming requests across multiple backend streaming servers.
      • Edge Caching via CDN: Leverage CDN edge nodes to cache video segments near users, reducing origin server load and decreasing startup latency.

      Advanced Routing Policies:

      • Latency-Aware Routing: Continuously measure user-to-region latency and route traffic dynamically to minimize startup delay.
      • Health Checks with Automatic Failover: Employ active health monitoring to detect failing nodes and automatically reroute traffic to healthy endpoints.

      Scaling Model:

      The system should implement auto-scaling based on concurrent viewer metrics using cloud-native autoscaling groups or Kubernetes Horizontal Pod Autoscalers, combined with pre-warming strategies during anticipated peak hours.

      Example: Route 53 Latency-Based Routing Configuration:

      resource "aws_route53_record" "video_streaming_latency_routing" {
        zone_id = aws_route53_zone.primary.zone_id
        name    = "streaming.example.com"
        type    = "A"
      
        alias {
          name                   = aws_lb.region1.dns_name
          zone_id                = aws_lb.region1.zone_id
          evaluate_target_health = true
        }
      
        set_identifier = "region1"
        latency_routing_policy {
          region = "us-east-1"
        }
      }
      
      resource "aws_route53_record" "video_streaming_latency_routing_region2" {
        zone_id = aws_route53_zone.primary.zone_id
        name    = "streaming.example.com"
        type    = "A"
      
        alias {
          name                   = aws_lb.region2.dns_name
          zone_id                = aws_lb.region2.zone_id
          evaluate_target_health = true
        }
      
        set_identifier = "region2"
        latency_routing_policy {
          region = "eu-west-1"
        }
      }

      Sample NGINX Configuration Snippet for Regional Load Balancer:

      http {
        upstream backend {
          server backend1.example.com max_fails=3 fail_timeout=30s;
          server backend2.example.com max_fails=3 fail_timeout=30s;
          server backend3.example.com max_fails=3 fail_timeout=30s;
        }
      
        server {
          listen 80;
      
          location / {
            proxy_pass http://backend;
            proxy_next_upstream error timeout invalid_header http_500 http_503;
            proxy_connect_timeout 5s;
            proxy_read_timeout 30s;
          }
        }
      }

      CDN Edge Caching Best Practices:

      • Cache video segments with appropriate Cache-Control headers to maximize hit ratios.
      • Use HTTP/2 or HTTP/3 protocols for faster multiplexing.
      • Implement token-based authentication on CDN requests to secure content.

      Summary Output Schema:

      {
        "loadBalancers": [
          "Global DNS-based routing",
          "Regional application load balancers",
          "Edge caching via CDN"
        ],
        "routingPolicies": [
          "Latency-aware routing",
          "Health checks with automatic failover"
        ],
        "scalingModel": [
          "Auto-scaling based on concurrent viewers",
          "Pre-warming during peak hours"
        ],
        "monitoring": [
          "Real-time viewer metrics",
          "Load balancer health dashboards",
          "CDN cache hit ratio tracking"
        ]
      }

Get Free Access to 40,000+ AI Prompts for ChatGPT, Claude & Codex

Subscribe for instant access to the largest curated Notion Prompt Library for AI workflows.

More on this

Codex Hooks and Programmatic Access Tokens: Enterprise Automation Guide

Reading Time: 14 minutes
Codex Hooks and Programmatic Access Tokens: Enterprise Automation Guide In modern enterprise software development, automation and secure programmatic access to AI services are paramount. Codex Hooks and Programmatic Access Tokens represent a sophisticated approach to integrating AI-powered tools into continuous...