25 GPT-5.5 Prompts for Software Architecture and System Design

May 31, 2026

25 GPT-5.5 Prompts for Software Architecture and System Design

In the rapidly evolving landscape of software engineering, mastering system design and architecture is paramount for developing scalable, maintainable, and high-performance applications. Architects today face multifaceted challenges, ranging from microservices orchestration and data consistency to security compliance and cost optimization. With the advent of GPT-5.5, engineers and architects can leverage advanced AI-driven reasoning to streamline complex design challenges, enabling more insightful decision-making and reducing iterative design cycles.

This comprehensive playbook introduces 25 meticulously crafted GPT-5.5 prompts tailored specifically for software architecture and system design, categorized into five critical domains: scalability, reliability, security, maintainability, and performance optimization. Each prompt is designed with explicit system instructions, contextual variables, and output schemas that align with real-world engineering workflows. For example, prompts guide the design of a real-time collaborative editor with conflict resolution mechanisms or architect a high-throughput, fault-tolerant payment gateway with multi-region deployment.

Why GPT-5.5 for System Design?

GPT-5.5 incorporates cutting-edge advancements in natural language understanding and contextual reasoning, enabling it to assist in generating detailed architectural blueprints, technology stack evaluations, and trade-off analyses. Unlike earlier models, GPT-5.5 can interpret nuanced system requirements, propose modular designs, and provide justifications grounded in established software engineering principles.

Key Capabilities Include:

Contextual Awareness: Understands complex system requirements and constraints.
Multi-turn Reasoning: Engages in iterative dialogue to refine design proposals.
Schema-Driven Outputs: Produces structured JSON/YAML schemas for integration with CI/CD pipelines and architecture documentation tools.
Code and Configuration Generation: Generates production-grade code snippets, deployment manifests, and infrastructure-as-code templates.

How to Use This Playbook

Each prompt in this collection includes:

System Instructions: High-level guidance on what the prompt aims to achieve.
Contextual Variables: Dynamic variables representing system parameters such as throughput, latency, user base size, or compliance requirements.
Output Schema: Defines the expected format of the AI’s response, facilitating easy parsing and integration.
Example Use Case: Demonstrations of prompt application with detailed walkthroughs, including code snippets, architectural diagrams, and deployment scripts.

Example: Designing a Real-Time Collaborative Editor

Consider a prompt designed to help architect a real-time collaborative text editor supporting thousands of concurrent users. GPT-5.5 can assist in generating:

Event-driven architecture diagrams featuring Operational Transformation (OT) or Conflict-free Replicated Data Types (CRDTs).
WebSocket-based communication protocols with fallback strategies.
Scalable backend components, including distributed state synchronization and persistence layers.
Sample code snippets in TypeScript demonstrating client-server synchronization logic.

interface DocumentChange {
  userId: string;
  timestamp: number;
  operations: Operation[];
}

type Operation = Insert | Delete;

interface Insert {
  type: 'insert';
  position: number;
  text: string;
}

interface Delete {
  type: 'delete';
  position: number;
  length: number;
}

// Example function to apply an operation
function applyOperation(doc: string, op: Operation): string {
  switch(op.type) {
    case 'insert':
      return doc.slice(0, op.position) + op.text + doc.slice(op.position);
    case 'delete':
      return doc.slice(0, op.position) + doc.slice(op.position + op.length);
  }
}

Example: Architecting a High-Throughput Payment Gateway

For a payment gateway handling millions of transactions per day, GPT-5.5 can generate detailed designs encompassing:

Microservices for transaction processing, fraud detection, and settlement.
Event sourcing and CQRS patterns for auditability and scalability.
Infrastructure-as-Code (IaC) scripts for multi-region Kubernetes deployments on AWS EKS or Google GKE.
Security best practices, including PCI-DSS compliance checklists and tokenization strategies.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-processor
  labels:
    app: payment-gateway
spec:
  replicas: 10
  selector:
    matchLabels:
      app: payment-gateway
      component: processor
  template:
    metadata:
      labels:
        app: payment-gateway
        component: processor
    spec:
      containers:
      - name: processor
        image: payment-processor:v1.2.0
        ports:
        - containerPort: 8080
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: payment-db-credentials
              key: url
        resources:
          requests:
            cpu: "500m"
            memory: "1Gi"
          limits:
            cpu: "2"
            memory: "4Gi"
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /readyz
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 5

By integrating these prompts into your workflow, you can harness GPT-5.5’s capabilities to elevate your architectural decisions and accelerate system design processes significantly. This playbook not only serves as a starting point but also as a continuously extensible framework for evolving your design methodologies alongside emerging technological trends.

1. Microservices & Distributed Systems

Designing microservices and distributed systems demands a meticulous approach to separation of concerns, resilience engineering, and efficient inter-service communication patterns. These architectures must handle complex operational challenges such as fault tolerance, scalability, eventual consistency, and observability. Leveraging GPT-5.5’s advanced contextual understanding, architects can generate nuanced system designs that incorporate best practices, industry standards, and emerging patterns to maximize system robustness and responsiveness under diverse workloads.

Microservices architecture breaks down monolithic applications into independently deployable services, each encapsulating a specific business capability. Distributed systems extend this concept across networked nodes, emphasizing scalability, availability, and partition tolerance. Key design principles include defining clear service boundaries, selecting appropriate communication protocols, implementing fault isolation, and enabling seamless scalability.

Core Design Considerations

Service Boundaries: Define services around business capabilities or bounded contexts to minimize coupling.
Communication Patterns: Choose synchronous (REST, gRPC) or asynchronous (messaging, event-driven) communication based on latency and consistency needs.
Fault Tolerance: Implement circuit breakers, retries with exponential backoff, bulkheads, and failover mechanisms to ensure resilience.
Data Management: Use eventual consistency, distributed transactions, or sagas to maintain data integrity across services.
Observability: Integrate centralized logging, metrics, tracing (e.g., OpenTelemetry) for monitoring and diagnostics.
Security: Enforce authentication, authorization, and secure communication channels (TLS, mTLS).

Production-Grade Infrastructure Components

Service Mesh: Tools like Istio or Linkerd provide advanced traffic management, observability, and security features.
API Gateway: Centralized entry point for routing, authentication, rate-limiting, and request aggregation.
Message Brokers: Kafka, RabbitMQ, or AWS SNS/SQS enable reliable asynchronous communication.
Container Orchestration: Kubernetes for automated deployment, scaling, and management.
Configuration Management: Use tools like Consul or Vault for dynamic configuration and secrets management.

Prompt: Designing a Scalable Microservices Architecture for an E-commerce Platform

System Instructions: “Generate a detailed microservices architecture for a global e-commerce platform supporting user accounts, product catalogs, order processing, and payment handling. Emphasize service boundaries, inter-service communication, and fault tolerance.”

Context Variables:

userBaseSize: 10 million monthly active users
peakRequestRate: 50,000 requests per second
paymentProviders: Stripe, PayPal
regions: US, EU, APAC

Output Schema:

{
  "services": [
    {"name": "UserService", "responsibilities": "Manage user profiles and authentication", "communication": "REST API"},
    {"name": "CatalogService", "responsibilities": "Product listings and inventory", "communication": "gRPC"},
    {"name": "OrderService", "responsibilities": "Order lifecycle management", "communication": "Event-driven via message queue"},
    {"name": "PaymentService", "responsibilities": "Process payments with external providers", "communication": "REST API with retries"}
  ],
  "communicationPatterns": ["Synchronous REST between UserService and CatalogService", "Asynchronous events between OrderService and PaymentService"],
  "scalabilityStrategies": ["Auto-scaling groups per service", "Regional failover", "Circuit breaker for payment calls"]
}

Detailed Architecture Analysis

The platform consists of four core microservices, each optimized for its domain:

UserService: Handles user registration, authentication (OAuth2/JWT), profile management. Exposes RESTful endpoints secured with TLS and rate-limiting.
CatalogService: Maintains product data, inventory counts, and search indexes. Uses gRPC for high-performance, low-latency RPC calls, enabling synchronous queries from UserService and frontend APIs.
OrderService: Manages order creation, updates, and status tracking through event-driven mechanisms (e.g., Kafka or AWS Kinesis). This decouples order processing from synchronous requests, improving throughput and fault isolation.
PaymentService: Integrates with external payment gateways (Stripe, PayPal) via REST APIs. Implements retry policies, circuit breakers (e.g., Hystrix or Resilience4j), and idempotency keys to prevent double charges.

Production-Grade Configuration Example (Kubernetes Deployment Snippet)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
  labels:
    app: payment-service
spec:
  replicas: 5
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
    spec:
      containers:
      - name: payment-service
        image: company/payment-service:1.2.0
        ports:
        - containerPort: 8080
        env:
        - name: STRIPE_API_KEY
          valueFrom:
            secretKeyRef:
              name: payment-secrets
              key: stripeApiKey
        - name: PAYPAL_CLIENT_ID
          valueFrom:
            secretKeyRef:
              name: payment-secrets
              key: paypalClientId
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 15
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 30
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "2"
            memory: "2Gi"
        # Circuit breaker and retry implemented in service code using resilience4j or similar

Inter-Service Communication Patterns

Communication Type	Source Service	Destination Service	Protocol	Purpose	Fault-Tolerance Mechanism
Synchronous	UserService	CatalogService	gRPC	Retrieve product details for user browsing	Timeouts, retries with exponential backoff
Asynchronous	OrderService	PaymentService	Message Queue (e.g., Kafka)	Trigger payment processing after order placement	Message retries, dead-letter queues, circuit breakers

Prompt: Fault-Tolerant Distributed System for Real-Time Collaborative Editor

System Instructions: “Design a distributed system architecture enabling real-time collaboration on documents with low latency and conflict resolution.”

Context Variables:
- concurrentUsersPerDocument: 1000
- latencyRequirement: <100ms
- conflictResolutionStrategy: Operational Transformation (OT) or CRDTs
- persistence: Event sourcing with CQRS
Output Schema:
```
{
  "components": [
    {"name": "CollaborationService", "protocol": "WebSockets", "function": "Broadcast real-time changes"},
    {"name": "ConflictResolutionEngine", "algorithm": "CRDT", "function": "Merge concurrent edits"},
    {"name": "PersistenceLayer", "pattern": "Event Sourcing", "function": "Store all edit events"},
    {"name": "QueryService", "pattern": "CQRS", "function": "Serve read requests efficiently"}
  ],
  "dataFlow": ["Client <-> CollaborationService <-> ConflictResolutionEngine <-> PersistenceLayer"],
  "faultTolerance": ["Automatic reconnection", "Event replay for recovery", "Conflict-free merges"]
}
```
Architecture Breakdown
- CollaborationService: Uses WebSocket connections for bidirectional, low-latency communication. Implements presence and session management to track active users.
- ConflictResolutionEngine: Applies CRDT algorithms (e.g., RGA, LWW-Element-Set) which enable commutative and idempotent merges ensuring eventual consistency without locking.
- PersistenceLayer: Event sourcing stores immutable edit events in an append-only log, facilitating auditability and temporal queries.
- QueryService: Implements CQRS pattern separating write operations (events) from read models optimized for fast UI queries, updated asynchronously.
Technology Stack Recommendations
- WebSocket Framework: Use frameworks like Socket.IO (Node.js), SignalR (.NET), or Phoenix Channels (Elixir) for real-time communication.
- CRDT Libraries: Automerge, Yjs for JavaScript clients; delta-crdts for backend implementations.
- Event Store: Apache Kafka, EventStoreDB, or cloud services like AWS Kinesis for reliable event persistence.
- Database for Read Models: NoSQL databases such as MongoDB or Elasticsearch optimized for read-heavy loads.
Code Snippet: Implementing CRDT-based Merge
```
import Automerge from 'automerge';

// Initialize document
let doc1 = Automerge.init();
doc1 = Automerge.change(doc1, doc => {
  doc.text = new Automerge.Text();
  doc.text.insertAt(0, ...'Hello');
});

// Simulate concurrent edit on doc2
let doc2 = Automerge.clone(doc1);
doc2 = Automerge.change(doc2, doc => {
  doc.text.insertAt(0, 'A');
});

// Merge changes
let merged = Automerge.merge(doc1, doc2);
console.log(merged.text.toString()); // Output: "AHello"
```
Fault Tolerance and Recovery Strategies
1. Automatic Reconnection: Clients automatically reconnect to CollaborationService with exponential backoff on connection loss.
2. Event Replay: Upon reconnection or service restart, missed events are replayed from the event store to synchronize client state.
3. Conflict-Free Merges: CRDT algorithms ensure that concurrent edits merge deterministically without user intervention.

Prompt: Partitioning Strategy for High-Volume Distributed Messaging System

System Instructions: “Recommend an optimal partitioning and sharding strategy for a distributed messaging system processing billions of messages daily.”

Context Variables:

messageSizeAvg: 2KB
retentionPeriod: 30 days
consumersCount: 5000
throughputRequirement: 100,000 messages per second

Output Schema:

{
  "partitionStrategy": "Topic-based partitioning with key hashing",
  "shardCount": 256,
  "replicationFactor": 3,
  "loadBalancing": "Round-robin among consumer groups",
  "dataPlacement": "Geo-distributed clusters with latency-aware routing"
}

Comprehensive Partitioning and Sharding Strategy

For a system ingesting billions of messages daily at an average size of 2KB, the partitioning strategy must optimize throughput, minimize latency, and support fault tolerance:

Topic-Based Partitioning: Messages are categorized into topics representing logical streams (e.g., user activity, transactions). Each topic is divided into multiple partitions to parallelize consumption.
Key Hashing: Within each topic, messages are assigned to partitions based on a hash of their key (e.g., userID, sessionID). This maintains message order per key while distributing load evenly.
Shard Count: A shard count of 256 partitions balances parallelism and management overhead. This number should be chosen based on expected throughput and cluster capacity.
Replication Factor: A replication factor of 3 ensures data durability and availability during node failures.
Load Balancing: Consumers are grouped logically, with round-robin assignment to partitions, ensuring balanced processing and scalability.
Geo-Distributed Clusters: Deploy clusters across multiple geographic regions with latency-aware routing to serve consumers closer to their location, reducing end-to-end latency.

Example Kafka Topic Configuration

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: user-activity
  labels:
    strimzi.io/cluster: my-cluster
spec:
  partitions: 256
  replicas: 3
  config:
    retention.ms: 2592000000  # 30 days in milliseconds
    segment.bytes: 1073741824 # 1 GB segment size for optimal log compaction
    cleanup.policy: delete

Consumer Group Load Balancing Example (Java)

Properties props = new Properties();
props.put("bootstrap.servers", "broker1:9092,broker2:9092");
props.put("group.id", "consumer-group-1");
props.put("enable.auto.commit", "false");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

KafkaConsumer consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("user-activity"));

// Consumer loop
while (true) {
    ConsumerRecords records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord record : records) {
        // Process message
    }
    consumer.commitSync();
}

Latency-Aware Geo-Distribution Strategy

Implementing geo-distributed clusters requires:

Multi-region Kafka clusters with MirrorMaker 2.0 for cross-region replication.
Routing clients to nearest cluster endpoints using DNS-based load balancing or Anycast IP.
Latency monitoring to dynamically adjust routing policies.
Consistent hashing to ensure messages with the same key route to the same partition across clusters, preserving ordering.

Summary Table: Partitioning Strategy Parameters

Parameter	Value	Description
Partition Strategy	Topic-based with Key Hashing	Ensures ordered processing per key and balanced load
Shard Count	256	Number of partitions per topic for parallelism
Replication Factor	3	Number of replicas to ensure durability and availability
Load Balancing	Round-robin among consumer groups	Distributes message processing evenly
Data Placement	Geo-distributed clusters	Reduces latency via proximity routing and replication

2. API Design & Integration

API design is fundamental to creating modular, scalable, and maintainable system architectures. A well-designed API not only enables seamless communication between components and services but also enforces clear contracts that reduce integration errors and simplify future enhancements. Leveraging GPT-5.5 can dramatically accelerate the process of defining precise, consistent, and extensible API contracts, while also supporting robust integration strategies that align with best practices in modern software engineering.

In this section, we explore advanced prompts tailored to generate comprehensive API designs, covering RESTful and GraphQL specifications, as well as API Gateway architectures for hybrid cloud environments. Each prompt is enriched with contextual variables and detailed output schemas, catering to real-world complexities such as multi-tenancy, role-based access control, audit logging, real-time data streaming, and security observability.

Key Considerations for API Design & Integration

Consistency: Uniform naming conventions, data formats, and error handling mechanisms across all endpoints.
Security: Authentication, authorization, and encryption strategies that protect data in transit and at rest.
Scalability: Design for horizontal scaling, rate limiting, and efficient pagination to handle growing user bases.
Extensibility: Versioning schemes and extensible schemas to accommodate evolving business requirements.
Observability: Integration of logging, metrics, and distributed tracing to monitor API health and performance.

Prompt: Defining RESTful API Specification for a Multi-Tenant SaaS Application

System Instructions: “Generate a RESTful API specification for a SaaS application that supports multi-tenancy with role-based access control and audit logging.”

Context Variables:

tenancyModel: Isolated databases per tenant
roles: Admin, User, Auditor
loggingRequired: True

Output Schema:

{
  "endpoints": [
    {
      "path": "/tenants/{tenantId}/users",
      "methods": ["GET", "POST"],
      "rolesAllowed": ["Admin"],
      "description": "Retrieve or create users within the specified tenant."
    },
    {
      "path": "/tenants/{tenantId}/users/{userId}",
      "methods": ["GET", "PUT", "DELETE"],
      "rolesAllowed": ["Admin", "User"],
      "description": "Retrieve, update, or delete a specific user. Users can access their own profile."
    },
    {
      "path": "/tenants/{tenantId}/audit-logs",
      "methods": ["GET"],
      "rolesAllowed": ["Auditor"],
      "description": "Access audit logs for the specified tenant for compliance and monitoring."
    }
  ],
  "authentication": "OAuth 2.0 with JWT tokens issued per tenant",
  "rateLimiting": "1000 requests per minute per tenant",
  "responseFormat": "JSON API standard",
  "errorHandling": {
    "standardErrors": {
      "400": "Bad Request - Invalid input data",
      "401": "Unauthorized - Invalid or missing token",
      "403": "Forbidden - Insufficient permissions",
      "404": "Not Found - Resource does not exist",
      "429": "Too Many Requests - Rate limit exceeded",
      "500": "Internal Server Error - Unexpected error"
    }
  },
  "auditLogging": {
    "enabled": true,
    "logFields": ["timestamp", "userId", "tenantId", "action", "resource", "statusCode", "ipAddress"],
    "storage": "Encrypted, append-only logs stored in a secure data lake"
  }
}

Detailed Walkthrough

This API design supports a multi-tenant SaaS system where each tenant operates in an isolated database environment, enhancing data security and reducing cross-tenant interference.

Role-based access control (RBAC) enforces strict permissions:

Admin: Full user management within the tenant scope.
User: Access only to their own profile for reading and updating.
Auditor: Read-only access to audit logs for compliance monitoring.

Authentication employs OAuth 2.0 with JWT tokens that include tenant claims to ensure tenant isolation at the token validation stage.

Example: OpenAPI 3.0 Snippet for User Endpoints

openapi: 3.0.3
info:
  title: Multi-Tenant SaaS API
  version: 1.0.0
servers:
  - url: https://api.saasapp.com
paths:
  /tenants/{tenantId}/users:
    get:
      summary: List users in a tenant
      security:
        - oauth2: [admin]
      parameters:
        - name: tenantId
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: A list of users
          content:
            application/vnd.api+json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/User'
        '401':
          $ref: '#/components/responses/Unauthorized'
    post:
      summary: Create a new user in a tenant
      security:
        - oauth2: [admin]
      requestBody:
        required: true
        content:
          application/vnd.api+json:
            schema:
              $ref: '#/components/schemas/NewUser'
      responses:
        '201':
          description: User created
        '400':
          $ref: '#/components/responses/BadRequest'
  /tenants/{tenantId}/users/{userId}:
    get:
      summary: Get user by ID
      security:
        - oauth2: [admin, user]
      parameters:
        - name: tenantId
          in: path
          required: true
          schema:
            type: string
        - name: userId
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: User details
          content:
            application/vnd.api+json:
              schema:
                $ref: '#/components/schemas/User'
        '403':
          $ref: '#/components/responses/Forbidden'
components:
  schemas:
    User:
      type: object
      properties:
        id:
          type: string
        username:
          type: string
        email:
          type: string
        role:
          type: string
          enum: [Admin, User, Auditor]
    NewUser:
      type: object
      required:
        - username
        - email
      properties:
        username:
          type: string
        email:
          type: string
        role:
          type: string
          enum: [Admin, User, Auditor]
  securitySchemes:
    oauth2:
      type: oauth2
      flows:
        clientCredentials:
          tokenUrl: https://auth.saasapp.com/token
          scopes:
            admin: Full admin access
            user: Basic user access
            auditor: Read-only audit access
  responses:
    Unauthorized:
      description: Unauthorized access
    BadRequest:
      description: Invalid request parameters or payload
    Forbidden:
      description: Insufficient permissions

This OpenAPI definition enforces role-based scopes and includes detailed request and response schemas following the JSON API standard.

Prompt: Designing GraphQL API for a Social Media Platform

System Instructions: “Create a GraphQL schema that supports user profiles, posts, comments, and real-time notifications.”

Context Variables:

subscriptionsRequired: True
paginationStyle: Cursor-based
authorizationModel: Role-based with field-level permissions

Output Schema:

{
  "types": [
    {
      "name": "User",
      "fields": [
        {"name": "id", "type": "ID!"},
        {"name": "name", "type": "String!"},
        {"name": "email", "type": "String!"},
        {"name": "profilePicture", "type": "String"},
        {"name": "friends", "type": "[User!]!"}
      ],
      "description": "Represents a user profile with personal information and social connections."
    },
    {
      "name": "Post",
      "fields": [
        {"name": "id", "type": "ID!"},
        {"name": "author", "type": "User!"},
        {"name": "content", "type": "String!"},
        {"name": "comments", "type": "[Comment!]!"},
        {"name": "createdAt", "type": "DateTime!"}
      ],
      "description": "A social media post made by a user."
    },
    {
      "name": "Comment",
      "fields": [
        {"name": "id", "type": "ID!"},
        {"name": "author", "type": "User!"},
        {"name": "content", "type": "String!"},
        {"name": "createdAt", "type": "DateTime!"}
      ],
      "description": "A comment made on a post."
    },
    {
      "name": "Notification",
      "fields": [
        {"name": "id", "type": "ID!"},
        {"name": "type", "type": "String!"},
        {"name": "message", "type": "String!"},
        {"name": "read", "type": "Boolean!"}
      ],
      "description": "Real-time notifications sent to users."
    }
  ],
  "queries": [
    {"name": "getUser", "args": {"id": "ID!"}, "returnType": "User"},
    {"name": "getPosts", "args": {"cursor": "String", "limit": "Int"}, "returnType": "[Post!]!"}
  ],
  "mutations": [
    {"name": "createPost", "args": {"content": "String!"}, "returnType": "Post"},
    {"name": "addComment", "args": {"postId": "ID!", "content": "String!"}, "returnType": "Comment"}
  ],
  "subscriptions": [
    {"name": "onNewNotification", "args": {"userId": "ID!"}, "returnType": "Notification"}
  ],
  "authorization": {
    "model": "role-based",
    "fieldLevelPermissions": true,
    "roles": ["Admin", "User", "Guest"]
  },
  "pagination": {
    "style": "cursor-based",
    "defaultLimit": 20,
    "maxLimit": 100
  }
}

Schema Definition Example (SDL)

scalar DateTime

type User {
  id: ID!
  name: String!
  email: String!
  profilePicture: String
  friends: [User!]!
}

type Post {
  id: ID!
  author: User!
  content: String!
  comments: [Comment!]!
  createdAt: DateTime!
}

type Comment {
  id: ID!
  author: User!
  content: String!
  createdAt: DateTime!
}

type Notification {
  id: ID!
  type: String!
  message: String!
  read: Boolean!
}

type Query {
  getUser(id: ID!): User
  getPosts(cursor: String, limit: Int = 20): [Post!]!
}

type Mutation {
  createPost(content: String!): Post
  addComment(postId: ID!, content: String!): Comment
}

type Subscription {
  onNewNotification(userId: ID!): Notification
}

Authorization Strategy

Field-level permissions are enforced dynamically at runtime via middleware or resolver wrappers. For example, the email field on User type might only be visible to Admin and the user themselves, while friends list is public to all authenticated users.

Cursor-Based Pagination Implementation

This approach uses opaque cursors to paginate through posts efficiently, preventing issues with offset-based pagination such as inconsistent results on data changes. Example query:

query GetPosts($cursor: String, $limit: Int) {
  getPosts(cursor: $cursor, limit: $limit) {
    id
    content
    createdAt
  }
}

Real-Time Notifications with Subscriptions

Subscriptions utilize WebSocket connections to push real-time updates to clients. The onNewNotification subscription filters notifications by userId, ensuring users receive only their own notifications.

Example Resolver Snippet (Node.js with Apollo Server)

const { PubSub, withFilter } = require('apollo-server');
const pubsub = new PubSub();
const NOTIFICATION_TOPIC = 'NEW_NOTIFICATION';

const resolvers = {
  Subscription: {
    onNewNotification: {
      subscribe: withFilter(
        () => pubsub.asyncIterator([NOTIFICATION_TOPIC]),
        (payload, variables, context) => {
          return payload.userId === context.user.id;
        }
      )
    }
  },
  Mutation: {
    createPost: async (_, { content }, { user }) => {
      // Save post logic here
      return createdPost;
    },
    addComment: async (_, { postId, content }, { user }) => {
      // Save comment logic here
      return createdComment;
    }
  }
};

Prompt: API Gateway Design for Hybrid Cloud Integration

System Instructions: “Develop an API Gateway architecture that integrates on-premise legacy systems with cloud-native microservices, ensuring security and observability.”

Context Variables:

legacyProtocols: SOAP, JMS
cloudServices: RESTful microservices with OAuth 2.0
securityRequirements: TLS termination, JWT validation
observabilityTools: Distributed tracing, metrics aggregation

Output Schema:

{
  "gatewayComponents": [
    {
      "name": "ProtocolTranslator",
      "function": "Convert SOAP/JMS to REST",
      "details": "Provides adapters that transform legacy SOAP requests into RESTful JSON APIs and vice versa, leveraging XML parsers and JMS clients."
    },
    {
      "name": "SecurityModule",
      "function": "TLS termination, JWT validation",
      "details": "Handles SSL/TLS termination, verifies JWT tokens against the identity provider, and enforces scopes for API access."
    },
    {
      "name": "RoutingModule",
      "function": "Dynamic routing based on service registry",
      "details": "Uses a service discovery mechanism (e.g., Consul, Eureka) to route requests to appropriate microservices or legacy endpoints."
    },
    {
      "name": "ObservabilityModule",
      "function": "Integrate tracing and metrics",
      "details": "Implements distributed tracing (e.g., OpenTelemetry), aggregates Prometheus metrics, and exports logs to centralized systems."
    }
  ],
  "deploymentModel": "High-availability cluster with autoscaling",
  "failoverStrategy": "Circuit breaker and fallback endpoints",
  "additionalFeatures": {
    "rateLimiting": "Configured per client and tenant",
    "caching": "Response caching with Redis",
    "requestValidation": "Schema validation for inbound requests"
  }
}

Architecture Diagram Description

The API Gateway acts as a centralized ingress point, bridging legacy on-premise systems and modern cloud-native microservices. The ProtocolTranslator enables seamless interoperability by converting SOAP/XML and JMS messages into RESTful calls and JSON payloads.

The SecurityModule manages secure communications by terminating TLS sessions and validating JWTs issued by a centralized Identity and Access Management (IAM) system, ensuring only authenticated and authorized requests proceed.

The RoutingModule dynamically routes requests based on the service registry, providing resilience and scalability. It supports canary deployments and blue-green releases by tagging services accordingly.

The ObservabilityModule integrates distributed tracing (using OpenTelemetry), aggregates metrics via Prometheus exporters, and forwards logs to ELK or similar platforms, enabling comprehensive monitoring and troubleshooting.

Example: Kubernetes Deployment YAML Snippet with API Gateway

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hybrid-api-gateway
  labels:
    app: api-gateway
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-gateway
  template:
    metadata:
      labels:
        app: api-gateway
    spec:
      containers:
        - name: api-gateway
          image: company/api-gateway:latest
          ports:
            - containerPort: 8080
          env:
            - name: PROTOCOL_TRANSLATOR_ENABLED

3. Database Schema & Query Optimization
Effective database design and query optimization are foundational to achieving high system performance, scalability, and maintainability in any software architecture. In modern systems, databases handle vast volumes of data with diverse workload patterns, requiring a deep understanding of relational and NoSQL schema design principles, normalization forms, indexing strategies, and query execution plans. GPT-5.5 excels in generating highly normalized schemas aligned with business domains, suggesting optimal indexes tailored to workload characteristics, and producing performant query rewrites that reduce latency and resource consumption.


  Stay Ahead of the AI Curve
  Get weekly insights on ChatGPT, OpenAI, and AI tools delivered to your inbox.
  Subscribe to Our Newsletter


Beyond basic schema generation, GPT-5.5 can produce detailed explanations of normalization stages, identify potential bottlenecks in query plans, and recommend partitioning or sharding strategies for distributed databases. It can also tailor recommendations to specific database engines (e.g., PostgreSQL, MySQL, MongoDB), hardware configurations, and operational constraints. This section explores advanced prompt engineering for database schema design and query optimization with real-world examples and best practices.


  
    Prompt: Designing a Normalized Database Schema for a Subscription Billing System
    System Instructions: "Create a normalized relational schema to support customer subscriptions, plan tiers, usage tracking, and invoicing."
    
    Detailed Context Variables
    
      subscriptionPlans: Basic, Pro, Enterprise — representing multiple pricing tiers with varying feature sets and pricing.
      billingCycle: Monthly and yearly billing options, affecting invoice generation frequency and proration logic.
      usageMetrics: Tracking API calls and storage usage to enable metered billing and overage charges.
    
    
    Normalized Schema Design Explanation
    The schema is designed to satisfy Third Normal Form (3NF), ensuring elimination of transitive dependencies and redundant data storage. Each entity represents a single concept, and foreign key constraints enforce referential integrity. This design facilitates flexibility in subscription management, accurate usage tracking, and detailed invoicing.
    
    Output Schema
    {
  "tables": [
    {"name": "Customers", "columns": [
      "CustomerID SERIAL PRIMARY KEY",
      "Name VARCHAR(255) NOT NULL",
      "Email VARCHAR(255) UNIQUE NOT NULL",
      "CreatedAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP"
    ]},
    {"name": "SubscriptionPlans", "columns": [
      "PlanID SERIAL PRIMARY KEY",
      "Name VARCHAR(50) NOT NULL",
      "Price DECIMAL(10, 2) NOT NULL",
      "BillingCycle VARCHAR(10) CHECK (BillingCycle IN ('Monthly', 'Yearly')) NOT NULL"
    ]},
    {"name": "Subscriptions", "columns": [
      "SubscriptionID SERIAL PRIMARY KEY",
      "CustomerID INT NOT NULL REFERENCES Customers(CustomerID)",
      "PlanID INT NOT NULL REFERENCES SubscriptionPlans(PlanID)",
      "StartDate DATE NOT NULL",
      "EndDate DATE"
    ]},
    {"name": "UsageRecords", "columns": [
      "UsageID SERIAL PRIMARY KEY",
      "SubscriptionID INT NOT NULL REFERENCES Subscriptions(SubscriptionID)",
      "Metric VARCHAR(50) NOT NULL CHECK (Metric IN ('API Calls', 'Storage Used'))",
      "Amount NUMERIC(15, 4) NOT NULL",
      "Date DATE NOT NULL"
    ]},
    {"name": "Invoices", "columns": [
      "InvoiceID SERIAL PRIMARY KEY",
      "SubscriptionID INT NOT NULL REFERENCES Subscriptions(SubscriptionID)",
      "AmountDue DECIMAL(12, 2) NOT NULL",
      "DueDate DATE NOT NULL",
      "Status VARCHAR(20) CHECK (Status IN ('Pending', 'Paid', 'Overdue')) NOT NULL"
    ]}
  ],
  "relationships": [
    {"from": "Subscriptions.CustomerID", "to": "Customers.CustomerID"},
    {"from": "Subscriptions.PlanID", "to": "SubscriptionPlans.PlanID"},
    {"from": "UsageRecords.SubscriptionID", "to": "Subscriptions.SubscriptionID"},
    {"from": "Invoices.SubscriptionID", "to": "Subscriptions.SubscriptionID"}
  ],
  "normalization": "3NF"
}
Production-Grade SQL Table Creation Example (PostgreSQL)
CREATE TABLE Customers (
  CustomerID SERIAL PRIMARY KEY,
  Name VARCHAR(255) NOT NULL,
  Email VARCHAR(255) UNIQUE NOT NULL,
  CreatedAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE SubscriptionPlans (
  PlanID SERIAL PRIMARY KEY,
  Name VARCHAR(50) NOT NULL,
  Price DECIMAL(10, 2) NOT NULL,
  BillingCycle VARCHAR(10) CHECK (BillingCycle IN ('Monthly', 'Yearly')) NOT NULL
);

CREATE TABLE Subscriptions (
  SubscriptionID SERIAL PRIMARY KEY,
  CustomerID INT NOT NULL REFERENCES Customers(CustomerID) ON DELETE CASCADE,
  PlanID INT NOT NULL REFERENCES SubscriptionPlans(PlanID),
  StartDate DATE NOT NULL,
  EndDate DATE
);

CREATE TABLE UsageRecords (
  UsageID SERIAL PRIMARY KEY,
  SubscriptionID INT NOT NULL REFERENCES Subscriptions(SubscriptionID) ON DELETE CASCADE,
  Metric VARCHAR(50) NOT NULL CHECK (Metric IN ('API Calls', 'Storage Used')),
  Amount NUMERIC(15,4) NOT NULL,
  Date DATE NOT NULL
);

CREATE TABLE Invoices (
  InvoiceID SERIAL PRIMARY KEY,
  SubscriptionID INT NOT NULL REFERENCES Subscriptions(SubscriptionID),
  AmountDue DECIMAL(12, 2) NOT NULL,
  DueDate DATE NOT NULL,
  Status VARCHAR(20) CHECK (Status IN ('Pending', 'Paid', 'Overdue')) NOT NULL
);

Additional Design Considerations

Indexing: Add indexes on foreign key columns (e.g., Subscriptions.CustomerID) to speed up joins.
Partitioning: For large UsageRecords, consider partitioning by date to improve query performance.
Proration Logic: Implement stored procedures or application logic to handle billing proration on subscription plan changes.
Audit Trails: Optionally add audit tables or use triggers to track subscription changes.



Prompt: Query Optimization for High-Throughput Financial Transactions
System Instructions: "Analyze and optimize SQL queries for a payment processing system handling 10,000 transactions per second."
Context Variables

databaseEngine: PostgreSQL 13+
queries: SELECT * FROM transactions WHERE status='pending' AND created_at > now() - interval '5 minutes'
indexesAvailable: Primary key on transaction_id, non-clustered index on status
hardwareSpecs: 32-core CPU, 256GB RAM, NVMe SSD storage for low latency I/O

Analysis of Original Query
The original query uses a WHERE clause filtering on status and a recent created_at timestamp. The existing index on status alone is insufficient to efficiently filter by both conditions, causing sequential scans or inefficient index scans. Fetching all columns with SELECT * further increases I/O and CPU usage, particularly under high concurrency.
Optimized Query and Index Recommendations
{
  "optimizedQuery": "SELECT transaction_id, amount, created_at FROM transactions WHERE status = 'pending' AND created_at > now() - interval '5 minutes'",
  "indexesSuggested": ["CREATE INDEX idx_status_created_at ON transactions (status, created_at DESC)"],
  "explainPlan": {
    "seqScan": false,
    "indexScan": true,
    "cost": "Low",
    "rowsEstimated": 5000
  },
  "additionalRecommendations": [
    "Partition transactions table by time (e.g., monthly partitions) to limit data scanned",
    "Run VACUUM ANALYZE regularly to maintain index statistics",
    "Use prepared statements and connection pooling to reduce overhead",
    "Employ query parallelism if supported by the PostgreSQL version"
  ]
}
Example: Creating Composite Index in PostgreSQL
CREATE INDEX idx_status_created_at ON transactions (status, created_at DESC);

Step-by-Step Query Execution Improvement Walkthrough

Identify bottleneck: Use EXPLAIN ANALYZE on the original query to confirm sequential scans and high cost.
Refine SELECT clause: Replace SELECT * with only necessary columns to reduce I/O.
Create composite index: Index on (status, created_at DESC) allows the database to efficiently filter and order data.
Partition data: Implement table partitioning by date ranges to reduce the dataset queried.
Maintain statistics: Schedule regular VACUUM and ANALYZE commands to keep planner statistics current.
Monitor performance: Continuously monitor query latency and adjust indexing or partitioning as data grows.

EXPLAIN Output Example for Optimized Query
Index Scan using idx_status_created_at on transactions  (cost=0.42..15.67 rows=5000 width=48)
  Index Cond: ((status = 'pending'::text) AND (created_at > (now() - '00:05:00'::interval)))

Additional Best Practices

Leverage connection pooling (e.g., PgBouncer) to manage high concurrency.
Use asynchronous processing for non-critical transaction steps.
Consider using materialized views or caching layers for aggregated transaction metrics.
Perform load testing with realistic transaction volumes to validate optimizations.



Prompt: NoSQL Data Modeling for a Location-Based Social Network
System Instructions: "Design a NoSQL document schema to support geo-queries, friend connections, and event feeds."
Context Variables

databaseType: MongoDB 5.0+
geoIndexRequired: True — support efficient geospatial queries near user locations.
userConnections: Bidirectional friendships requiring fast lookup and updates.
eventFeedUpdateFrequency: Real-time updates with high write throughput.

NoSQL Schema Design Principles
Unlike relational models, document databases favor denormalization for performance and scalability. The schema design should optimize for common query patterns such as fetching nearby users, retrieving friends lists, and streaming event feeds. MongoDB's support for GeoJSON and specialized geospatial indexes enables efficient location-based queries.
Output Schema
{
  "collections": [
    {
      "name": "Users",
      "fields": {
        "_id": "ObjectId",
        "name": "string",
        "location": {"type": "GeoJSON Point", "index": "2dsphere"},
        "friends": "array of ObjectId (bidirectional references)",
        "lastActive": "timestamp"
      }
    },
    {
      "name": "Events",
      "fields": {
        "_id": "ObjectId",
        "userId": "ObjectId",
        "content": "string",
        "createdAt": "timestamp",
        "location": {"type": "GeoJSON Point", "index": "2dsphere"}
      }
    }
  ],
  "indexes": ["2dsphere on location fields", "Index on userId in Events"]
}
MongoDB Schema Creation and Indexing Example
db.users.createIndex({ location: "2dsphere" });
db.users.createIndex({ lastActive: 1 });

db.events.createIndex({ location: "2dsphere" });
db.events.createIndex({ userId: 1 });

Data Model Explanation and Trade-offs

Users Collection: Stores user profile and geolocation data with a 2dsphere index to enable queries like "find users within radius X". The friends field holds an array of ObjectIds for bidirectional friendship representation, allowing fast lookup but potentially growing large arrays for popular users.
Events Collection: Stores user-generated content with location metadata. The 2dsphere index enables geo-filtering of events, while the index on userId supports efficient retrieval of a user's event feed.

Example Geo-Query: Find Users Within 5km Radius
db.users.find({
  location: {
    $nearSphere: {
      $geometry: {
        type: "Point",
        coordinates: [ -73.9667, 40.78 ]
      },
      $maxDistance: 5000
    }
  }
});

Handling Real-Time Event Feed Updates

Use capped collections or change streams in MongoDB to push real-time updates.
Consider denormalizing event data into user documents if feed read performance outweighs update complexity.
Implement TTL indexes on events to automatically expire outdated data.

Scaling Considerations

Sharding users and events collections by geographic region or user ID hash to distribute load.
Use Redis or similar in-memory stores for caching hot friend lists and event feeds.
Monitor index sizes and performance using MongoDB Atlas or equivalent monitoring tools.

Summary Table: Relational vs NoSQL Design Trade-offs



Aspect
Relational (Normalized)
NoSQL (Document)




Schema Flexibility
Rigid, predefined tables and relationships
Flexible, schema-less or schema-on-read


Query Complexity
Supports complex joins and transactions
Limited joins, favors denormalization


Performance
Optimal for ACID compliance and complex queries
Optimized for high write/read throughput and horizontal scaling


Scalability
Vertical scaling, sharding complex
Horizontal scaling and sharding native





4. Scalability & Performance Planning

Ensuring that software systems scale gracefully under increasing load and maintain optimal performance is a cornerstone of robust system design. This involves a deep understanding of workload characteristics, infrastructure limitations, and failure scenarios. GPT-5.5 can assist architects and engineers by generating detailed, actionable scalability and performance strategies tailored to specific system requirements and operational contexts.
In this section, we explore advanced prompt templates designed to elicit comprehensive plans addressing scalability, capacity planning, fault tolerance, and performance tuning for complex, high-demand systems. These prompts include precise context variables, expected output schemas, and are augmented with real-world code snippets, configuration examples, and architectural patterns.


Prompt: Scalability Plan for a High-Throughput Payment Gateway
System Instructions: "Develop a scalability and capacity planning strategy for a payment gateway processing 100,000 transactions per second globally."
Context Variables:

peakTPS: 100,000 transactions per second
latencySLA: <50ms end-to-end latency
failureTolerance: 99.999% uptime (5 nines)
infrastructure: Kubernetes clusters deployed across 5 global regions

Detailed Scalability Approach:
The system must support horizontal scaling at both the application and infrastructure layers. Key techniques include:

Horizontal Pod Autoscaling: Dynamically adjust the number of pods based on CPU/memory utilization and custom metrics like request rate.
Regional Load Balancing: Use geo-aware load balancers (e.g., AWS Global Accelerator, GCP Traffic Director) to route traffic to the nearest healthy region.
Message Queue Buffering: Introduce durable queues (e.g., Apache Kafka, AWS Kinesis) to smooth traffic spikes and decouple front-end from downstream processing.

Capacity Planning Calculations:
Estimate resource requirements based on peak TPS and latency constraints:



Metric
Assumptions
Calculated Need




CPU Cores
Assuming 50 µs per transaction per core
2000 cores (100,000 TPS × 50 µs = 5,000 ms of CPU time per second → 5 seconds CPU per second, so ~2000 cores)


Memory
20 MB per core for JVM heap and buffers
4000 GB total (2000 cores × 2 GB)


Network Bandwidth
Assuming 1 KB per transaction payload
100 Gbps (100,000 TPS × 1 KB × 8 bits)



Fault Tolerance Strategies:

Multi-Region Failover: Active-active cluster deployments with cross-region replication to ensure zero downtime during regional outages.
Circuit Breakers: Implement using libraries like Netflix Hystrix or Resilience4j to prevent cascading failures.
Retry Policies with Exponential Backoff: Mitigate transient failures while avoiding overwhelming the system.

Monitoring and Observability Setup:

Prometheus Metrics: Track custom application metrics such as transaction latency, error rates, and throughput.
Alertmanager Alerts: Configure alerts for SLA breaches, resource exhaustion, and anomalous patterns.
Distributed Tracing: Use Jaeger to trace transaction flow across microservices, identify bottlenecks, and latency hotspots.

Example Kubernetes Horizontal Pod Autoscaler Configuration:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: payment-gateway-hpa
  namespace: payments
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payment-gateway
  minReplicas: 50
  maxReplicas: 1000
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Pods
    pods:
      metric:
        name: transactions_per_second
      target:
        type: AverageValue
        averageValue: 5000
Sample Kafka Configuration for Message Queuing:
kafka-topics.sh --create --topic payment-transactions --partitions 500 --replication-factor 3 --config retention.ms=604800000
This command creates a Kafka topic with 500 partitions to ensure high throughput and replication factor 3 for durability.
Summary Output Schema:
{
  "scalingApproach": [
    "Horizontal pod autoscaling",
    "Regional load balancing",
    "Message queue buffering"
  ],
  "capacityPlanning": {
    "cpuCoresRequired": 2000,
    "memoryGBRequired": 4000,
    "networkBandwidthGbps": 100
  },
  "faultTolerance": [
    "Multi-region failover",
    "Circuit breakers",
    "Retry policies with exponential backoff"
  ],
  "monitoringSetup": [
    "Prometheus metrics",
    "Alertmanager alerts",
    "Distributed tracing with Jaeger"
  ]
}


Prompt: Performance Tuning Strategy for Real-Time Analytics Platform
System Instructions: "Produce a performance tuning plan for a real-time analytics platform ingesting streaming data and providing sub-second query responses."
Context Variables:

dataIngestionRate: 1 million events per second
queryLatencyTarget: <500ms
storageType: Columnar OLAP database (e.g., Apache Druid, ClickHouse)
computeModel: Serverless functions with autoscaling (e.g., AWS Lambda, Azure Functions)

Optimization Techniques:

Data Pre-Aggregation: Compute rollups and summaries at ingestion time to reduce query complexity.
Materialized Views: Maintain pre-computed views to serve frequent queries rapidly.
Efficient Partition Pruning: Design partition keys to enable the query engine to skip irrelevant data segments, reducing I/O.

Resource Allocation Strategies:

Dynamic Scaling: Serverless functions auto-scale based on incoming event rate and query concurrency.
In-Memory Caching: Use Redis or Memcached to cache hot query results and metadata.

Bottleneck Mitigation:

Backpressure Handling: Implement mechanisms to slow data ingestion when downstream systems are overwhelmed, such as Kafka’s flow control or reactive streams.
Query Concurrency Limits: Throttle queries during peak load to maintain SLA.

Monitoring Setup:

Latency Histograms: Track distribution of query response times to detect tail latency issues.
Error Rate Tracking: Monitor ingestion failures, query errors, and system exceptions.
Resource Utilization Dashboards: Visualize CPU, memory, and network usage in real-time.

Example Apache Druid Tuning Snippet for Ingestion:
{
  "type" : "index_parallel",
  "spec": {
    "dataSchema": {
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "HOUR",
        "queryGranularity": "MINUTE"
      },
      "metricsSpec": [
        {"type":"count","name":"count"},
        {"type":"doubleSum","name":"total_revenue","fieldName":"revenue"}
      ],
      "dimensionsSpec": {
        "dimensions": ["user_id", "product_id", "region"]
      }
    },
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {"type": "kafka", "topic": "events-topic"},
      "inputFormat": {"type": "json"}
    },
    "tuningConfig": {
      "type": "index_parallel",
      "maxRowsPerSegment": 5000000,
      "maxTotalRows": 100000000,
      "partitionsSpec": {
        "type": "hashed",
        "targetPartitionSize": 500000
      }
    }
  }
}
Serverless Autoscaling Example with AWS Lambda (Node.js):
exports.handler = async (event) => {
  // Process streaming event
  // Ensure idempotency and minimal cold start latency by keeping dependencies lightweight
  return {
    statusCode: 200,
    body: JSON.stringify('Processed event successfully'),
  };
};
Summary Output Schema:
{
  "optimizationTechniques": [
    "Data pre-aggregation",
    "Materialized views",
    "Efficient partition pruning"
  ],
  "resourceAllocation": [
    "Dynamic scaling of compute nodes",
    "In-memory caching for hot data"
  ],
  "bottleneckMitigation": [
    "Backpressure handling on ingestion",
    "Query concurrency limits"
  ],
  "monitoring": [
    "Latency histograms",
    "Error rate tracking",
    "Resource utilization dashboards"
  ]
}


Prompt: Load Balancing Strategy for Global Video Streaming Service
System Instructions: "Design a load balancing strategy for a video streaming service with millions of concurrent viewers worldwide."
Context Variables:

concurrentViewers: 5 million
contentDeliveryNetwork: Integrated CDN usage (e.g., Cloudflare, Akamai)
latencyTarget: <200ms startup time
failoverMechanism: Automatic rerouting on node failure

Comprehensive Load Balancing Strategy:
To efficiently serve a massive global audience with low startup latency, the system must leverage multi-layered load balancing and edge caching:

Global DNS-Based Routing: Use a DNS provider with latency-based routing (e.g., AWS Route 53 latency routing) to direct users to the closest regional cluster.
Regional Application Load Balancers: Within each region, application load balancers distribute incoming streaming requests across multiple backend streaming servers.
Edge Caching via CDN: Leverage CDN edge nodes to cache video segments near users, reducing origin server load and decreasing startup latency.

Advanced Routing Policies:

Latency-Aware Routing: Continuously measure user-to-region latency and route traffic dynamically to minimize startup delay.
Health Checks with Automatic Failover: Employ active health monitoring to detect failing nodes and automatically reroute traffic to healthy endpoints.

Scaling Model:
The system should implement auto-scaling based on concurrent viewer metrics using cloud-native autoscaling groups or Kubernetes Horizontal Pod Autoscalers, combined with pre-warming strategies during anticipated peak hours.
Example: Route 53 Latency-Based Routing Configuration:
resource "aws_route53_record" "video_streaming_latency_routing" {
  zone_id = aws_route53_zone.primary.zone_id
  name    = "streaming.example.com"
  type    = "A"

  alias {
    name                   = aws_lb.region1.dns_name
    zone_id                = aws_lb.region1.zone_id
    evaluate_target_health = true
  }

  set_identifier = "region1"
  latency_routing_policy {
    region = "us-east-1"
  }
}

resource "aws_route53_record" "video_streaming_latency_routing_region2" {
  zone_id = aws_route53_zone.primary.zone_id
  name    = "streaming.example.com"
  type    = "A"

  alias {
    name                   = aws_lb.region2.dns_name
    zone_id                = aws_lb.region2.zone_id
    evaluate_target_health = true
  }

  set_identifier = "region2"
  latency_routing_policy {
    region = "eu-west-1"
  }
}
Sample NGINX Configuration Snippet for Regional Load Balancer:
http {
  upstream backend {
    server backend1.example.com max_fails=3 fail_timeout=30s;
    server backend2.example.com max_fails=3 fail_timeout=30s;
    server backend3.example.com max_fails=3 fail_timeout=30s;
  }

  server {
    listen 80;

    location / {
      proxy_pass http://backend;
      proxy_next_upstream error timeout invalid_header http_500 http_503;
      proxy_connect_timeout 5s;
      proxy_read_timeout 30s;
    }
  }
}
CDN Edge Caching Best Practices:

Cache video segments with appropriate Cache-Control headers to maximize hit ratios.
Use HTTP/2 or HTTP/3 protocols for faster multiplexing.
Implement token-based authentication on CDN requests to secure content.

Summary Output Schema:
{
  "loadBalancers": [
    "Global DNS-based routing",
    "Regional application load balancers",
    "Edge caching via CDN"
  ],
  "routingPolicies": [
    "Latency-aware routing",
    "Health checks with automatic failover"
  ],
  "scalingModel": [
    "Auto-scaling based on concurrent viewers",
    "Pre-warming during peak hours"
  ],
  "monitoring": [
    "Real-time viewer metrics",
    "Load balancer health dashboards",
    "CDN cache hit ratio tracking"
  ]
}


  
  
  
  

    

    
      
      
      
      
      

      Please leave this field empty





      
        Thank you! Please check your inbox (and spam folder) for a confirmation email. Click the link to get instant access to our 40,000+ ChatGPT Prompt Library.Check your inbox or spam folder to confirm your subscription.
        
                
      
    

      

    
  
  
  

    

    
      
      
      
      
      

      Please leave this field empty





      
        Thank you! Please check your inbox (and spam folder) for a confirmation email. Click the link to get instant access to our 40,000+ ChatGPT Prompt Library.Check your inbox or spam folder to confirm your subscription.
        
                
      
    

          
      

    
  
  
  

    

    
      
      
      
      
      

      Please leave this field empty





      
        Thank you! Please check your inbox (and spam folder) for a confirmation email. Click the link to get instant access to our 40,000+ ChatGPT Prompt Library.Check your inbox or spam folder to confirm your subscription.

Aspect	Relational (Normalized)	NoSQL (Document)
Schema Flexibility	Rigid, predefined tables and relationships	Flexible, schema-less or schema-on-read
Query Complexity	Supports complex joins and transactions	Limited joins, favors denormalization
Performance	Optimal for ACID compliance and complex queries	Optimized for high write/read throughput and horizontal scaling
Scalability	Vertical scaling, sharding complex	Horizontal scaling and sharding native

Metric	Assumptions	Calculated Need
CPU Cores	Assuming 50 µs per transaction per core	2000 cores (100,000 TPS × 50 µs = 5,000 ms of CPU time per second → 5 seconds CPU per second, so ~2000 cores)
Memory	20 MB per core for JVM heap and buffers	4000 GB total (2000 cores × 2 GB)
Network Bandwidth	Assuming 1 KB per transaction payload	100 Gbps (100,000 TPS × 1 KB × 8 bits)





  
  
  
  

    

    
      
      
      
      
      

      Please leave this field empty
Get Free Access to 40,000+ AI Prompts for ChatGPT, Claude & Codex
Subscribe for instant access to the largest curated Notion Prompt Library for AI workflows.



      
        Check your inbox or spam folder to confirm your subscription & get your free prompts link.
        
                
      
    

      

  


Facebook

Twitter

LinkedIn

Instagram






«Previous: Codex Hooks and Programmatic Access Tokens: Enterprise Automation Guide



Next: GPT-5.5 Mini Prompting Guide: Optimizing for Speed, Cost, and Edge Deployment»







Markos Symeonides



LinkedIn

Twitter

Facebook







More on this


How ChatGPT Search Actually Works in 2026: Understanding AI-Powered Web Results, Source Attribution, and When to Use It Over Google
Posted in How to
 Reading Time: 23 minutes 
ChatGPT’s search capability has evolved from a bolted-on experiment into one of the most sophisticated AI-native search experiences available in 2026. Since OpenAI launched the feature broadly in late 2023 and iterated aggressively through 2024 and 2025, the system now...
Mastering ChatGPT Canvas: The Complete Workflow Guide for Document Editing, Code Review, and Real-Time Collaboration in 2026
Posted in How to
 Reading Time: 24 minutes 
ChatGPT Canvas: The Complete Playbook for Collaborative AI Document and Code Editing ChatGPT Canvas represents OpenAI’s most significant interface evolution since the launch of ChatGPT itself. Rather than burying your work in an endless scroll of chat messages, Canvas gives...
30 ChatGPT Prompt Chains for Complex Multi-Step Workflows: Research, Content, Coding, and Business Analysis
Posted in How to
 Reading Time: 29 minutes 
30 Prompt Chains for ChatGPT-5.5: Master Workflows for Complex, Multi-Step Tasks Prompt chaining is the single most underutilized technique in professional ChatGPT workflows. While most users treat each conversation as an isolated transaction — one question, one answer — power...
GPT-5.6 Price Cuts Explained: How Luna and Terra’s New July 2026 Pricing Changes Your AI Development Budget
Posted in How to
 Reading Time: 20 minutes 
GPT-5.6 Price Reductions: Complete Guide to the July 30, 2026 Announcement On July 30, 2026, OpenAI dropped one of the most consequential pricing announcements in the company’s commercial history. The GPT-5.6 model family — comprising three distinct tiers named Sol,...

25 GPT-5.5 Prompts for Software Architecture and System Design

25 GPT-5.5 Prompts for Software Architecture and System Design

Why GPT-5.5 for System Design?

Key Capabilities Include:

How to Use This Playbook

Example: Designing a Real-Time Collaborative Editor

Example: Architecting a High-Throughput Payment Gateway

1. Microservices & Distributed Systems

Core Design Considerations

Production-Grade Infrastructure Components

Prompt: Designing a Scalable Microservices Architecture for an E-commerce Platform

Detailed Architecture Analysis

Production-Grade Configuration Example (Kubernetes Deployment Snippet)

Inter-Service Communication Patterns

Prompt: Fault-Tolerant Distributed System for Real-Time Collaborative Editor

Architecture Breakdown

Technology Stack Recommendations

Code Snippet: Implementing CRDT-based Merge

Fault Tolerance and Recovery Strategies

Prompt: Partitioning Strategy for High-Volume Distributed Messaging System

Comprehensive Partitioning and Sharding Strategy

Example Kafka Topic Configuration

Consumer Group Load Balancing Example (Java)

Latency-Aware Geo-Distribution Strategy

Summary Table: Partitioning Strategy Parameters

2. API Design & Integration

Key Considerations for API Design & Integration

Prompt: Defining RESTful API Specification for a Multi-Tenant SaaS Application

Detailed Walkthrough

Example: OpenAPI 3.0 Snippet for User Endpoints

Prompt: Designing GraphQL API for a Social Media Platform

Schema Definition Example (SDL)

Authorization Strategy

Cursor-Based Pagination Implementation

Real-Time Notifications with Subscriptions

Example Resolver Snippet (Node.js with Apollo Server)

Prompt: API Gateway Design for Hybrid Cloud Integration

Architecture Diagram Description

Example: Kubernetes Deployment YAML Snippet with API Gateway

3. Database Schema & Query Optimization

Stay Ahead of the AI Curve

Prompt: Designing a Normalized Database Schema for a Subscription Billing System

Detailed Context Variables

Normalized Schema Design Explanation

Output Schema

Production-Grade SQL Table Creation Example (PostgreSQL)

Additional Design Considerations

Prompt: Query Optimization for High-Throughput Financial Transactions

Context Variables

Analysis of Original Query

Optimized Query and Index Recommendations

Example: Creating Composite Index in PostgreSQL

Step-by-Step Query Execution Improvement Walkthrough

EXPLAIN Output Example for Optimized Query

Additional Best Practices

Prompt: NoSQL Data Modeling for a Location-Based Social Network

Context Variables

NoSQL Schema Design Principles

Output Schema

MongoDB Schema Creation and Indexing Example

Data Model Explanation and Trade-offs

Example Geo-Query: Find Users Within 5km Radius

Handling Real-Time Event Feed Updates

Scaling Considerations

Summary Table: Relational vs NoSQL Design Trade-offs

4. Scalability & Performance Planning

Prompt: Scalability Plan for a High-Throughput Payment Gateway

Context Variables:

Detailed Scalability Approach:

Capacity Planning Calculations:

Fault Tolerance Strategies:

Monitoring and Observability Setup:

Example Kubernetes Horizontal Pod Autoscaler Configuration:

Sample Kafka Configuration for Message Queuing:

Summary Output Schema:

Prompt: Performance Tuning Strategy for Real-Time Analytics Platform

Context Variables:

Optimization Techniques:

Resource Allocation Strategies:

Bottleneck Mitigation: