How to Build Agentic Commerce Workflows with OpenAI and Visa’s New Partnership: A Complete Developer Guide

By the ChatGPT AI Hub Editorial Team | Published June 2025 | Category: Tutorial

The announcement of a strategic collaboration between OpenAI and Visa marks one of the most consequential infrastructure shifts in commerce technology since the introduction of contactless payments. For the first time, AI agents built on OpenAI’s platform can be granted verified, tokenized payment credentials—enabling fully autonomous purchase workflows that operate without per-transaction human confirmation. This guide walks developers through the complete architecture, from provisioning agent payment tokens to building secure, auditable purchase flows using the OpenAI Agents SDK and Visa’s emerging agentic commerce infrastructure.

What follows is a technical deep-dive written for developers who already understand REST APIs, async Python, and basic payment processing concepts. You’ll leave with working code patterns, architectural diagrams described in prose, security checklists, and a clear mental model for how machine-initiated commerce differs from traditional e-commerce authorization flows.


Understanding the OpenAI-Visa Agentic Commerce Framework

Before writing a single line of code, it’s worth understanding what the OpenAI-Visa partnership actually provides at the infrastructure level. Traditional payment flows assume a human cardholder: someone who presents credentials, receives a one-time password, reviews a purchase, and confirms. Agentic commerce inverts this model. The AI agent is the authorized principal, acting within pre-defined spending parameters set by the human account holder.

Visa’s contribution to this framework is what they call an “agent token”—a variant of their existing network tokenization standard (EMV Payment Tokenization Specification) scoped specifically for software principals rather than physical devices or card-on-file credentials. These tokens are cryptographically bound to a specific agent identity, carry embedded spending controls (merchant category codes, maximum transaction amounts, time-based expiry), and generate cryptograms that prove authorization at transaction time without exposing the underlying PAN (Primary Account Number).

OpenAI’s contribution is the agent identity and orchestration layer. Through the OpenAI Agents SDK—particularly in conjunction with the tool-calling and structured output capabilities introduced in late 2024 and refined through 2025—developers can define agents that receive scoped payment tokens as secure environment variables, call payment tools within their reasoning loops, and emit structured purchase intents that Visa’s token requestor infrastructure can process.

Together, these two systems create a closed loop: the human sets parameters once, the agent operates autonomously within those parameters, and every transaction is cryptographically provable, reversible, and auditable. The key insight is that authorization is separated from authentication—the human authenticates at token provisioning time; the agent is authorized at transaction time via cryptographic proof.

Key Terminology You Need to Know

| Term | Definition | Who Manages It |
| --- | --- | --- |
| Agent Token | A network token scoped to an AI agent principal, carrying embedded spending controls | Visa Token Service (VTS) |
| Token Requestor ID (TRID) | Unique identifier assigned to the developer/platform requesting tokens on behalf of agents | Developer (assigned by Visa) |
| Cryptogram | One-time transaction-specific cryptographic proof generated at authorization time | VTS / Agent runtime |
| Spending Control Profile | JSON configuration defining per-transaction limits, MCC restrictions, and velocity controls | Developer / Cardholder |
| Agent Identity Assertion | OpenAI-signed JWT proving the requesting agent’s identity and current task context | OpenAI Platform |
| Purchase Intent | Structured JSON object emitted by an agent describing a proposed transaction | Agent (developer-defined schema) |
| Durable Token Vault | Encrypted server-side store for agent tokens, isolated from agent memory | Developer infrastructure |

Prerequisites and Environment Setup

This tutorial assumes you have an OpenAI Platform account with Agents SDK access (currently available to developers in the priority API access tier), a Visa Developer Program account with agentic commerce sandbox credentials, and a server-side environment running Python 3.11 or later. You’ll also need a key management service—AWS KMS, Google Cloud KMS, or HashiCorp Vault all work for the token storage patterns described here.

Installing Required Dependencies

# Core dependencies
# (version specifiers are quoted so the shell does not treat ">" as a redirect)
pip install "openai>=1.30.0"
pip install "httpx>=0.27.0"
pip install "cryptography>=42.0.0"
pip install "pydantic>=2.7.0"
pip install "python-jose[cryptography]>=3.3.0"

# For token vault integration (AWS example)
pip install "boto3>=1.34.0"
pip install "aws-encryption-sdk>=3.2.0"

# Development and testing
pip install "pytest-asyncio>=0.23.0"
pip install "respx>=0.21.0"  # HTTP mocking for Visa sandbox

Environment Variables

# .env file structure
OPENAI_API_KEY=sk-proj-...
OPENAI_ORG_ID=org-...

# Visa Developer Sandbox
VISA_API_KEY=your_visa_api_key
VISA_API_SECRET=your_visa_api_secret
VISA_TRID=your_token_requestor_id
VISA_BASE_URL=https://sandbox.api.visa.com

# Token Vault (AWS KMS example)
KMS_KEY_ARN=arn:aws:kms:us-east-1:...
TOKEN_VAULT_TABLE=agent_payment_tokens

# Agent Configuration
MAX_TRANSACTION_AMOUNT=500.00
ALLOWED_MCC_CODES=5411,5912,5999,7372
AGENT_TOKEN_TTL_HOURS=24
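
The agent configuration values above arrive as strings, so it helps to parse and validate them once at startup rather than at each use. The following is a minimal sketch using only the standard library; `AgentPaymentConfig` and `load_agent_payment_config` are hypothetical names introduced for illustration, and the 1 to 720 hour bound is borrowed from the `token_valid_hours` field in the schema shown in Step 1.

```python
import os
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation


@dataclass(frozen=True)
class AgentPaymentConfig:
    """Typed view of the 'Agent Configuration' variables (hypothetical helper)."""
    max_transaction_amount: Decimal
    allowed_mcc_codes: tuple[str, ...]
    token_ttl_hours: int


def load_agent_payment_config(env=None) -> AgentPaymentConfig:
    """Parse and validate agent settings from an environment-style mapping."""
    env = os.environ if env is None else env

    try:
        max_amount = Decimal(env["MAX_TRANSACTION_AMOUNT"])
    except InvalidOperation as exc:
        raise ValueError("MAX_TRANSACTION_AMOUNT is not a decimal number") from exc
    if not max_amount.is_finite() or max_amount <= 0:
        raise ValueError("MAX_TRANSACTION_AMOUNT must be a positive amount")

    # Split the comma-separated list and require 4-digit ASCII codes
    codes = tuple(c.strip() for c in env["ALLOWED_MCC_CODES"].split(",") if c.strip())
    if not codes or any(len(c) != 4 or not (c.isascii() and c.isdigit()) for c in codes):
        raise ValueError("ALLOWED_MCC_CODES must be comma-separated 4-digit codes")

    ttl = int(env["AGENT_TOKEN_TTL_HOURS"])
    if not 1 <= ttl <= 720:
        raise ValueError("AGENT_TOKEN_TTL_HOURS must be between 1 and 720")

    return AgentPaymentConfig(max_amount, codes, ttl)
```

Failing fast here means a malformed `.env` file surfaces as a startup error instead of a confusing rejection in the middle of a purchase flow.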

Step 1: Provisioning Agent Payment Tokens via Visa Token Service

Token provisioning is the foundational step that makes agentic commerce possible. Unlike provisioning a token for a mobile wallet where the device itself is the principal, here you’re provisioning a token for a software identity. Visa’s agentic commerce sandbox uses the same underlying EMV token infrastructure but with an extended request payload that includes agent metadata.

The provisioning flow works as follows: your backend service authenticates to Visa’s Token Service using mutual TLS and your TRID credentials, submits an enrollment request that includes the underlying PAN (obtained from the cardholder through a secure UI at setup time), specifies an agent identity descriptor, and attaches a spending control profile. Visa responds with a token reference ID and the token itself, which you must immediately store in your durable token vault—never in agent memory or logs.

Spending Control Profile Schema

from pydantic import BaseModel, Field
from typing import List, Optional
from decimal import Decimal

class SpendingControlProfile(BaseModel):
    """
    Defines the authorization boundaries for an agent payment token.
    Maps to Visa's VTS spending controls extension.
    """
    max_transaction_amount: Decimal = Field(
        description="Maximum single transaction in USD",
        ge=Decimal("0.01"),
        le=Decimal("10000.00")
    )
    daily_spend_limit: Optional[Decimal] = Field(
        default=None,
        description="Rolling 24-hour spend cap"
    )
    allowed_mcc_codes: List[str] = Field(
        description="Permitted Merchant Category Codes",
        min_length=1
    )
    allowed_merchant_ids: Optional[List[str]] = Field(
        default=None,
        description="Allowlist of specific merchant IDs (optional)"
    )
    require_purchase_intent_hash: bool = Field(
        default=True,
        description="Require cryptographic purchase intent commitment"
    )
    token_valid_hours: int = Field(
        default=24,
        ge=1,
        le=720
    )

class AgentTokenRequest(BaseModel):
    """
    Full token provisioning request payload for Visa VTS agentic endpoint.
    """
    pan: str  # Only present during provisioning, never logged
    cardholder_name: str
    expiry_month: str
    expiry_year: str
    agent_identity_descriptor: str  # OpenAI agent ID or your internal agent UUID
    agent_display_name: str
    spending_controls: SpendingControlProfile
    cardholder_consent_reference: str  # Reference to stored consent record

Token Provisioning Client

import hashlib
import hmac
import json
import os
import time
from decimal import Decimal  # needed for generate_cryptogram's type hints
from typing import Optional

import httpx

class VisaTokenClient:
    """
    Client for Visa Token Service agentic commerce endpoints.
    Uses HMAC-SHA256 request signing as required by Visa Developer API.
    """
    
    def __init__(self):
        self.api_key = os.environ["VISA_API_KEY"]
        self.api_secret = os.environ["VISA_API_SECRET"]
        self.trid = os.environ["VISA_TRID"]
        self.base_url = os.environ["VISA_BASE_URL"]
        
    def _sign_request(self, method: str, path: str, body: str, timestamp: str) -> str:
        """Generate HMAC-SHA256 signature for Visa API authentication."""
        message = f"{timestamp}\n{method.upper()}\n{path}\n{body}"
        signature = hmac.new(
            self.api_secret.encode("utf-8"),
            message.encode("utf-8"),
            hashlib.sha256
        ).hexdigest()
        return signature
    
    async def provision_agent_token(
        self, 
        request: AgentTokenRequest
    ) -> dict:
        """
        Submit token provisioning request to Visa VTS.
        Returns token reference and encrypted token payload.
        """
        path = "/vts/v1/agentic/tokens"
        timestamp = str(int(time.time()))
        
        payload = {
            "tokenRequestorId": self.trid,
            "cardholderName": request.cardholder_name,
            "primaryAccountNumber": request.pan,  # Transmitted over mTLS only
            "panExpirationDate": f"{request.expiry_month}/{request.expiry_year}",
            "agentIdentityDescriptor": request.agent_identity_descriptor,
            "agentDisplayName": request.agent_display_name,
            "cardholderConsentReference": request.cardholder_consent_reference,
            "spendingControls": {
                "maxTransactionAmount": str(request.spending_controls.max_transaction_amount),
                "dailySpendLimit": str(request.spending_controls.daily_spend_limit) 
                    if request.spending_controls.daily_spend_limit else None,
                "allowedMccCodes": request.spending_controls.allowed_mcc_codes,
                "allowedMerchantIds": request.spending_controls.allowed_merchant_ids,
                "requirePurchaseIntentHash": request.spending_controls.require_purchase_intent_hash,
                "tokenValidHours": request.spending_controls.token_valid_hours
            }
        }
        
        body_str = json.dumps(payload)
        signature = self._sign_request("POST", path, body_str, timestamp)
        
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.base_url}{path}",
                content=body_str,
                headers={
                    "Content-Type": "application/json",
                    "X-API-KEY": self.api_key,
                    "X-Request-Timestamp": timestamp,
                    "X-Signature": signature,
                    "X-TRID": self.trid
                },
                timeout=30.0
            )
            response.raise_for_status()
            
        result = response.json()
        
        # CRITICAL: Remove PAN from result before returning
        # Token value must go directly to vault, never to agent context
        return {
            "token_reference_id": result["tokenReferenceId"],
            "token_value": result["tokenValue"],  # Route to vault immediately
            "token_status": result["tokenStatus"],
            "spending_controls_applied": result["spendingControlsApplied"],
            "expires_at": result["tokenExpiresAt"]
        }
    
    async def generate_cryptogram(
        self,
        token_reference_id: str,
        transaction_amount: Decimal,
        merchant_id: str,
        purchase_intent_hash: Optional[str] = None
    ) -> dict:
        """
        Generate a single-use transaction cryptogram for authorization.
        Called at transaction time, NOT during provisioning.
        """
        path = "/vts/v1/agentic/cryptograms"
        timestamp = str(int(time.time()))
        
        payload = {
            "tokenRequestorId": self.trid,
            "tokenReferenceId": token_reference_id,
            "transactionAmount": str(transaction_amount),
            "transactionCurrencyCode": "840",  # USD
            "merchantId": merchant_id,
            "purchaseIntentHash": purchase_intent_hash,
            "cryptogramType": "TAVV"  # Token Authentication Verification Value
        }
        
        body_str = json.dumps(payload)
        signature = self._sign_request("POST", path, body_str, timestamp)
        
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.base_url}{path}",
                content=body_str,
                headers={
                    "Content-Type": "application/json",
                    "X-API-KEY": self.api_key,
                    "X-Request-Timestamp": timestamp,
                    "X-Signature": signature,
                    "X-TRID": self.trid
                },
                timeout=15.0
            )
            response.raise_for_status()
            
        return response.json()
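
When testing the client against a mock server, it can be useful to verify signatures on the receiving side as well. The sketch below mirrors the message layout used by `_sign_request` above; the `verify_signature` helper and its 300-second timestamp window are assumptions made for illustration, not documented Visa behavior.

```python
import hashlib
import hmac


def sign_request(secret: str, method: str, path: str, body: str, timestamp: str) -> str:
    """Same message layout as VisaTokenClient._sign_request in this guide."""
    message = f"{timestamp}\n{method.upper()}\n{path}\n{body}"
    return hmac.new(
        secret.encode("utf-8"), message.encode("utf-8"), hashlib.sha256
    ).hexdigest()


def verify_signature(
    secret: str,
    method: str,
    path: str,
    body: str,
    timestamp: str,
    received_signature: str,
    now: int,
    max_skew_seconds: int = 300,  # assumed replay window, tune to your needs
) -> bool:
    """Check the timestamp window, then compare signatures in constant time."""
    try:
        request_time = int(timestamp)
    except ValueError:
        return False
    if abs(now - request_time) > max_skew_seconds:
        return False
    expected = sign_request(secret, method, path, body, timestamp)
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(expected, received_signature)
```

Because the body is part of the signed message, the exact serialized string that was signed must be the one sent on the wire, which is why the client passes `content=body_str` rather than re-serializing the payload.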


Step 2: Building the Durable Token Vault

Agent tokens must never reside in agent memory, conversation context, or application logs. The vault is a server-side encrypted store that agents access only through a controlled interface that enforces spending limit checks before every token retrieval. The architecture presented here uses AWS DynamoDB with KMS envelope encryption, but the pattern maps directly to any secret management system.

Token Vault Implementation

import boto3
import json
import base64
from datetime import datetime, timezone
from decimal import Decimal
from typing import Optional
import os

class AgentTokenVault:
    """
    Encrypted durable store for agent payment tokens.
    Enforces spending controls before every token access.
    Never returns raw token values to callers who haven't 
    passed the control validation gate.
    """
    
    def __init__(self):
        self.dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
        self.kms = boto3.client("kms", region_name="us-east-1")
        self.table = self.dynamodb.Table(os.environ["TOKEN_VAULT_TABLE"])
        self.kms_key_arn = os.environ["KMS_KEY_ARN"]
    
    def _encrypt_token(self, token_value: str) -> dict:
        """Envelope encrypt token using AWS KMS."""
        response = self.kms.generate_data_key(
            KeyId=self.kms_key_arn,
            KeySpec="AES_256"
        )
        
        plaintext_key = response["Plaintext"]
        encrypted_key = base64.b64encode(response["CiphertextBlob"]).decode()
        
        # Use plaintext_key to encrypt token_value with AES-256-GCM
        from cryptography.hazmat.primitives.ciphers.aead import AESGCM
        import secrets
        
        nonce = secrets.token_bytes(12)
        aesgcm = AESGCM(plaintext_key)
        encrypted_token = aesgcm.encrypt(nonce, token_value.encode(), None)
        
        return {
            "encrypted_key": encrypted_key,
            "nonce": base64.b64encode(nonce).decode(),
            "ciphertext": base64.b64encode(encrypted_token).decode()
        }
    
    def _decrypt_token(self, encrypted_payload: dict) -> str:
        """Decrypt envelope-encrypted token."""
        from cryptography.hazmat.primitives.ciphers.aead import AESGCM
        
        encrypted_key = base64.b64decode(encrypted_payload["encrypted_key"])
        response = self.kms.decrypt(CiphertextBlob=encrypted_key)
        plaintext_key = response["Plaintext"]
        
        nonce = base64.b64decode(encrypted_payload["nonce"])
        ciphertext = base64.b64decode(encrypted_payload["ciphertext"])
        
        aesgcm = AESGCM(plaintext_key)
        return aesgcm.decrypt(nonce, ciphertext, None).decode()
    
    async def store_token(
        self,
        agent_id: str,
        token_reference_id: str,
        token_value: str,
        spending_controls: dict,
        expires_at: str
    ) -> None:
        """Store encrypted token in vault with associated controls."""
        encrypted_payload = self._encrypt_token(token_value)
        
        self.table.put_item(
            Item={
                "agentId": agent_id,
                "tokenReferenceId": token_reference_id,
                "encryptedToken": json.dumps(encrypted_payload),
                "spendingControls": json.dumps(spending_controls),
                "dailySpentAmount": Decimal("0.00"),
                "dailySpentResetAt": datetime.now(timezone.utc).isoformat(),
                "expiresAt": expires_at,
                "createdAt": datetime.now(timezone.utc).isoformat(),
                "accessLog": []
            }
        )
    
    async def retrieve_token_for_transaction(
        self,
        agent_id: str,
        transaction_amount: Decimal,
        merchant_id: str,
        mcc_code: str
    ) -> Optional[str]:
        """
        Retrieve token only after passing all spending control gates.
        Returns token_reference_id (not raw token) for cryptogram generation.
        """
        response = self.table.get_item(
            Key={"agentId": agent_id}
        )
        
        if "Item" not in response:
            raise ValueError(f"No token found for agent {agent_id}")
        
        item = response["Item"]
        controls = json.loads(item["spendingControls"])
        
        # Gate 1: Token expiry check
        if datetime.fromisoformat(item["expiresAt"]) <= datetime.now(timezone.utc):
            raise PermissionError("Agent payment token has expired")
        
        # Gate 2: Transaction amount check
        max_amount = Decimal(controls["maxTransactionAmount"])
        if transaction_amount > max_amount:
            raise PermissionError(
                f"Transaction amount {transaction_amount} exceeds token limit {max_amount}"
            )
        
        # Gate 3: MCC code allowlist check
        if mcc_code not in controls["allowedMccCodes"]:
            raise PermissionError(
                f"Merchant category code {mcc_code} not permitted for this agent token"
            )
        
        # Gate 4: Daily spend limit check
        if controls.get("dailySpendLimit"):
            daily_limit = Decimal(controls["dailySpendLimit"])
            daily_spent = item["dailySpentAmount"]
            if daily_spent + transaction_amount > daily_limit:
                raise PermissionError(
                    f"Transaction would exceed daily spend limit of {daily_limit}"
                )
        
        # Gate 5: Merchant allowlist (if configured)
        if controls.get("allowedMerchantIds"):
            if merchant_id not in controls["allowedMerchantIds"]:
                raise PermissionError(f"Merchant {merchant_id} not in agent's allowlist")
        
        # All gates passed - return token reference (not raw token value)
        # Cryptogram generation uses the reference, not the token itself
        return item["tokenReferenceId"]
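
Note that the vault stores `dailySpentResetAt`, but the Gate 4 check as written never consults it, so the counter would grow indefinitely. One way to handle this, sketched here with standard-library types only, is to treat the stored timestamp as the start of a fixed 24-hour window and start a fresh window once it has elapsed. The helper names are hypothetical, and this is a fixed window rather than a true rolling one.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal


def effective_daily_spent(
    daily_spent: Decimal,
    reset_at_iso: str,
    now: datetime,
    window: timedelta = timedelta(hours=24),
) -> tuple[Decimal, str]:
    """Return (spent, window_start) to use for the daily limit check.

    If the stored window has elapsed, the counter restarts at zero and the
    window starts at `now`. Both datetimes must be timezone-aware.
    """
    reset_at = datetime.fromisoformat(reset_at_iso)
    if now - reset_at >= window:
        return Decimal("0.00"), now.isoformat()
    return daily_spent, reset_at_iso


def within_daily_limit(
    daily_spent: Decimal,
    reset_at_iso: str,
    amount: Decimal,
    daily_limit: Decimal,
    now: datetime,
) -> bool:
    """True if `amount` fits under the limit for the current window."""
    spent, _ = effective_daily_spent(daily_spent, reset_at_iso, now)
    return spent + amount <= daily_limit
```

In the vault, the returned window start would be written back to `dailySpentResetAt` together with the reset counter so later checks see the new window.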

Step 3: Defining the Purchase Intent Schema

Purchase intent is the structured JSON object that an AI agent emits when it determines a purchase is warranted. This object serves two purposes: it’s the agent’s declaration of what it intends to buy and why, and it becomes the input to the cryptogram generation step, where its hash is embedded in the transaction proof. This creates a verifiable link between the agent’s stated reasoning and the actual transaction, which is a critical audit-trail property.

The schema design here is intentional. By requiring the agent to explicitly state the business justification, the item description, and the price it expects to pay, you create a natural forcing function that reduces hallucinated or erroneous purchases. An agent that can’t articulate a clear business justification in structured form simply won’t be able to proceed.

from pydantic import BaseModel, Field
from typing import Optional, Literal
from decimal import Decimal
from datetime import datetime
import hashlib
import json

class LineItem(BaseModel):
    description: str = Field(max_length=200)
    quantity: int = Field(ge=1, le=1000)
    unit_price: Decimal = Field(ge=Decimal("0.01"))
    
    @property
    def total_price(self) -> Decimal:
        return self.quantity * self.unit_price

class PurchaseIntent(BaseModel):
    """
    Structured purchase declaration emitted by AI agents.
    This object is hashed and included in the Visa transaction cryptogram,
    creating a cryptographic link between agent reasoning and transaction.
    """
    agent_id: str
    task_id: str  # The task context that triggered this purchase
    merchant_name: str = Field(max_length=100)
    merchant_id: str  # Visa merchant ID
    mcc_code: str = Field(pattern=r"^\d{4}$")
    line_items: list[LineItem] = Field(min_length=1, max_length=50)
    business_justification: str = Field(
        min_length=20,
        max_length=500,
        description="Agent's stated reason for this purchase"
    )
    urgency: Literal["routine", "time_sensitive", "critical"]
    requires_human_confirmation: bool = Field(
        default=False,
        description="Set True if agent believes human review is warranted"
    )
    created_at: datetime = Field(default_factory=datetime.utcnow)
    
    @property
    def total_amount(self) -> Decimal:
        return sum(item.total_price for item in self.line_items)
    
    def compute_intent_hash(self) -> str:
        """
        Deterministic SHA-256 hash of purchase intent for cryptogram binding.
        Excludes created_at so that re-hashing the same intent later
        always yields the same value.
        """
        canonical = {
            "agent_id": self.agent_id,
            "task_id": self.task_id,
            "merchant_id": self.merchant_id,
            "mcc_code": self.mcc_code,
            "total_amount": str(self.total_amount),
            "line_items": [
                {
                    "description": item.description,
                    "quantity": item.quantity,
                    "unit_price": str(item.unit_price)
                }
                for item in self.line_items
            ],
            "business_justification": self.business_justification
        }
        canonical_str = json.dumps(canonical, sort_keys=True)
        return hashlib.sha256(canonical_str.encode()).hexdigest()
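
One subtlety with hashing amounts as strings: `str(Decimal("10.0"))` and `str(Decimal("10.00"))` differ, so two representations of the same price would produce different hashes. The sketch below shows one possible normalization, fixing amounts to two decimal places before hashing. The helper names are hypothetical, and the compact JSON separators are an assumption; whatever canonical form you pick must be used identically by every party that recomputes the hash.

```python
import hashlib
import json
from decimal import Decimal

CENTS = Decimal("0.01")


def canonical_amount(value: Decimal) -> str:
    """Render a USD amount with exactly two decimal places.

    Rejects sub-cent precision instead of silently rounding it away.
    """
    quantized = value.quantize(CENTS)
    if quantized != value:
        raise ValueError(f"Amount {value} has more than two decimal places")
    return str(quantized)


def hash_canonical(payload: dict) -> str:
    """SHA-256 over a key-sorted, whitespace-free JSON encoding."""
    canonical_str = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical_str.encode("utf-8")).hexdigest()


# The same price written two ways now hashes identically
h1 = hash_canonical({"merchant_id": "M1", "total_amount": canonical_amount(Decimal("19.9"))})
h2 = hash_canonical({"total_amount": canonical_amount(Decimal("19.90")), "merchant_id": "M1"})
```

Applying `canonical_amount` to `total_amount` and each `unit_price` inside `compute_intent_hash` would make the hash robust to how the agent happened to format a price.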

Step 4: Building the Purchase-Capable OpenAI Agent

With token provisioning and vault infrastructure in place, you can now build the agent itself. The OpenAI Agents SDK supports tool registration with typed schemas—the payment execution capability is defined as a tool, and the agent accesses it through the standard tool-calling mechanism. Crucially, the agent never sees the token value; it only emits a structured purchase intent, and the execution layer handles all payment infrastructure calls.

Defining the Payment Execution Tool

import asyncio
from openai import AsyncOpenAI
from openai.types.beta.threads import Run
import json
from decimal import Decimal
from typing import Any

client = AsyncOpenAI()

# Tool schema registered with the agent
PAYMENT_TOOL_SCHEMA = {
    "type": "function",
    "function": {
        "name": "execute_purchase",
        "description": (
            "Execute a purchase on behalf of the user. Use this tool when you have "
            "identified a specific product or service to purchase within your authorized "
            "spending parameters. You must provide a complete business justification and "
            "all line item details. This tool will validate spending controls before "
            "proceeding. Do NOT call this tool speculatively."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "merchant_name": {
                    "type": "string",
                    "description": "Full legal name of the merchant"
                },
                "merchant_id": {
                    "type": "string", 
                    "description": "Visa merchant ID obtained from merchant's payment portal"
                },
                "mcc_code": {
                    "type": "string",
                    "description": "4-digit Merchant Category Code"
                },
                "line_items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "description": {"type": "string"},
                            "quantity": {"type": "integer"},
                            "unit_price": {"type": "string", "description": "Price in USD"}
                        },
                        "required": ["description", "quantity", "unit_price"]
                    }
                },
                "business_justification": {
                    "type": "string",
                    "description": "Clear explanation of why this purchase is necessary for the current task"
                },
                "urgency": {
                    "type": "string",
                    "enum": ["routine", "time_sensitive", "critical"]
                }
            },
            "required": [
                "merchant_name", "merchant_id", "mcc_code", 
                "line_items", "business_justification", "urgency"
            ]
        }
    }
}
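
Tool-call arguments reach your server as a JSON-encoded string, and the model is not guaranteed to honor every schema constraint, so it is worth validating them before building a purchase intent. The following standard-library sketch checks the fields that the schema above marks as required; `parse_purchase_tool_call` and its return shape are hypothetical, introduced only for illustration.

```python
import json

REQUIRED_FIELDS = (
    "merchant_name", "merchant_id", "mcc_code",
    "line_items", "business_justification", "urgency",
)
ALLOWED_URGENCY = {"routine", "time_sensitive", "critical"}


def parse_purchase_tool_call(name: str, arguments_json: str) -> dict:
    """Validate a raw tool call; return {'ok': True, 'args': ...} or an error dict."""
    if name != "execute_purchase":
        return {"ok": False, "error": f"Unknown tool: {name}"}
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError:
        return {"ok": False, "error": "Tool arguments were not valid JSON"}
    if not isinstance(args, dict):
        return {"ok": False, "error": "Tool arguments must be a JSON object"}

    missing = [field for field in REQUIRED_FIELDS if field not in args]
    if missing:
        return {"ok": False, "error": f"Missing required fields: {', '.join(missing)}"}
    if args["urgency"] not in ALLOWED_URGENCY:
        return {"ok": False, "error": "urgency must be routine, time_sensitive, or critical"}
    if not isinstance(args["line_items"], list) or not args["line_items"]:
        return {"ok": False, "error": "line_items must be a non-empty list"}

    return {"ok": True, "args": args}
```

Returning a structured error instead of raising lets the runtime hand the message back to the agent as the tool result, so it can correct the call rather than crash the run.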

The Payment Execution Handler

from datetime import datetime
from decimal import Decimal
from typing import Optional
import logging

import httpx

logger = logging.getLogger(__name__)

class AgentPaymentExecutor:
    """
    Handles the execution of purchase intents emitted by OpenAI agents.
    Orchestrates: intent validation → vault gate → cryptogram → authorization.
    """
    
    def __init__(self, agent_id: str):
        self.agent_id = agent_id
        self.vault = AgentTokenVault()
        self.visa_client = VisaTokenClient()
        
    async def execute_purchase(self, tool_args: dict, task_id: str) -> dict:
        """
        Main execution handler called when agent invokes execute_purchase tool.
        Returns structured result back to agent context.
        """
        try:
            # Step 1: Construct and validate purchase intent
            line_items = [
                LineItem(
                    description=item["description"],
                    quantity=item["quantity"],
                    unit_price=Decimal(item["unit_price"])
                )
                for item in tool_args["line_items"]
            ]
            
            intent = PurchaseIntent(
                agent_id=self.agent_id,
                task_id=task_id,
                merchant_name=tool_args["merchant_name"],
                merchant_id=tool_args["merchant_id"],
                mcc_code=tool_args["mcc_code"],
                line_items=line_items,
                business_justification=tool_args["business_justification"],
                urgency=tool_args["urgency"]
            )
            
            logger.info(
                "Purchase intent created",
                extra={
                    "agent_id": self.agent_id,
                    "task_id": task_id,
                    "amount": str(intent.total_amount),
                    "merchant_id": intent.merchant_id,
                    "intent_hash": intent.compute_intent_hash()
                }
            )
            
            # Step 2: Check if human confirmation is needed
            if intent.total_amount > Decimal("200.00") or intent.urgency == "critical":
                return await self._request_human_confirmation(intent)
            
            # Step 3: Vault gate - retrieve token reference if controls pass
            token_reference_id = await self.vault.retrieve_token_for_transaction(
                agent_id=self.agent_id,
                transaction_amount=intent.total_amount,
                merchant_id=intent.merchant_id,
                mcc_code=intent.mcc_code
            )
            
            # Step 4: Generate transaction cryptogram with intent hash binding
            intent_hash = intent.compute_intent_hash()
            cryptogram_response = await self.visa_client.generate_cryptogram(
                token_reference_id=token_reference_id,
                transaction_amount=intent.total_amount,
                merchant_id=intent.merchant_id,
                purchase_intent_hash=intent_hash
            )
            
            # Step 5: Submit authorization to merchant
            auth_result = await self._submit_authorization(
                intent=intent,
                cryptogram=cryptogram_response["cryptogram"],
                token_reference_id=token_reference_id
            )
            
            # Step 6: Update daily spend tracking in vault
            await self._update_spend_tracking(intent.total_amount)
            
            return {
                "success": True,
                "transaction_id": auth_result["transactionId"],
                "amount_charged": str(intent.total_amount),
                "merchant_name": intent.merchant_name,
                "authorization_code": auth_result["authorizationCode"],
                "intent_hash": intent_hash,
                "message": (
                    f"Purchase of ${intent.total_amount} from {intent.merchant_name} "
                    f"authorized. Transaction ID: {auth_result['transactionId']}"
                )
            }
            
        except PermissionError as e:
            logger.warning(f"Spending control violation: {e}", extra={"agent_id": self.agent_id})
            return {
                "success": False,
                "error_type": "SPENDING_CONTROL_VIOLATION",
                "message": str(e),
                "agent_guidance": (
                    "This purchase exceeds your authorized spending parameters. "
                    "Do not attempt to circumvent these controls. If this purchase "
                    "is genuinely required, inform the user and request they update "
                    "your spending authorization."
                )
            }
        except Exception as e:
            logger.error(f"Payment execution failed: {e}", extra={"agent_id": self.agent_id})
            return {
                "success": False,
                "error_type": "PAYMENT_SYSTEM_ERROR",
                "message": "Payment processing encountered an error. No charge was made.",
                "should_retry": False
            }
    
    async def _request_human_confirmation(self, intent: PurchaseIntent) -> dict:
        """Pause execution and request human approval for high-value transactions."""
        # Implementation depends on your notification infrastructure
        # (email, Slack, push notification, etc.)
        return {
            "success": False,
            "error_type": "HUMAN_CONFIRMATION_REQUIRED",
            "intent_hash": intent.compute_intent_hash(),
            "message": (
                f"This purchase of ${intent.total_amount} requires your confirmation. "
                f"A confirmation request has been sent to the account holder."
            ),
            "pending_intent_id": f"pending_{intent.task_id}_{int(datetime.utcnow().timestamp())}"
        }
    
    async def _submit_authorization(
        self, 
        intent: PurchaseIntent,
        cryptogram: str,
        token_reference_id: str
    ) -> dict:
        """Submit payment authorization to merchant's payment gateway."""
        # This connects to the merchant's payment processing API
        # Implementation varies by merchant/gateway
        # The cryptogram serves as proof of authorized token usage
        async with httpx.AsyncClient() as client:
            response = await client.post(
                "https://api.merchant.example.com/v1/authorize",
                json={
                    "tokenReferenceId": token_reference_id,
                    "cryptogram": cryptogram,
                    "amount": str(intent.total_amount),
                    "currency": "USD",
                    "agentInitiated": True,
                    "purchaseIntentHash": intent.compute_intent_hash()
                },
                timeout=15.0
            )
            response.raise_for_status()
            return response.json()
    
    async def _update_spend_tracking(self, amount: Decimal) -> None:
        """Update rolling daily spend in vault after successful transaction."""
        self.vault.table.update_item(
            Key={"agentId": self.agent_id},
            UpdateExpression="ADD dailySpentAmount :amount",
            ExpressionAttributeValues={":amount": amount}
        )

Step 5: Assembling the Full Agent Runtime

With all components built, the agent runtime ties everything together. The pattern uses OpenAI’s Assistants API, with tool calls executed by your own backend code rather than by the model. The agent operates in a structured reasoning loop: it receives a task, reasons about what purchases (if any) are needed, emits purchase intents via the tool, receives execution results, and continues until the task is complete.

import asyncio
import json
import os
from decimal import Decimal  # needed for the total_spent sum below

from openai import AsyncOpenAI

client = AsyncOpenAI()

class AgenticCommerceRuntime:
    """
    Full runtime for a purchase-capable OpenAI agent.
    Manages the assistant lifecycle, tool execution loop,
    and payment orchestration.
    """
    
    def __init__(self, agent_id: str, assistant_id: str):
        self.agent_id = agent_id
        self.assistant_id = assistant_id
        self.payment_executor = AgentPaymentExecutor(agent_id)
    
    async def run_task(self, task_description: str) -> dict:
        """
        Execute a task that may involve autonomous purchases.
        Returns final task result with transaction audit trail.
        """
        # Create a thread for this task
        thread = await client.beta.threads.create()
        task_id = thread.id
        
        # Add the task as a user message
        await client.beta.threads.messages.create(
            thread_id=thread.id,
            role="user",
            content=task_description
        )
        
        # Start the run
        run = await client.beta.threads.runs.create(
            thread_id=thread.id,
            assistant_id=self.assistant_id,
            tools=[PAYMENT_TOOL_SCHEMA],
            instructions=(
                "You are an autonomous commerce agent with the ability to make purchases "
                "on behalf of the user within your authorized spending parameters. "
                "Always provide detailed business justification for any purchase. "
                "Never attempt purchases that exceed your authorization scope. "
                "If a required purchase exceeds your limits, inform the user clearly."
            )
        )
        
        transaction_log = []
        
        # Tool execution loop
        while run.status in ["queued", "in_progress", "requires_action"]:
            if run.status == "requires_action":
                tool_outputs = []
                
                for tool_call in run.required_action.submit_tool_outputs.tool_calls:
                    if tool_call.function.name == "execute_purchase":
                        tool_args = json.loads(tool_call.function.arguments)
                        
                        # Execute the payment
                        result = await self.payment_executor.execute_purchase(
                            tool_args=tool_args,
                            task_id=task_id
                        )
                        
                        # Log transaction attempt
                        transaction_log.append({
                            "tool_call_id": tool_call.id,
                            "result": result
                        })
                        
                        tool_outputs.append({
                            "tool_call_id": tool_call.id,
                            "output": json.dumps(result)
                        })
                
                # Submit all tool results
                run = await client.beta.threads.runs.submit_tool_outputs(
                    thread_id=thread.id,
                    run_id=run.id,
                    tool_outputs=tool_outputs
                )
            else:
                await asyncio.sleep(1.0)
                run = await client.beta.threads.runs.retrieve(
                    thread_id=thread.id,
                    run_id=run.id
                )
        
        # Retrieve final message
        messages = await client.beta.threads.messages.list(thread_id=thread.id)
        final_message = messages.data[0].content[0].text.value
        
        return {
            "task_id": task_id,
            "status": run.status,
            "final_message": final_message,
            "transaction_log": transaction_log,
            "total_spent": sum(
                Decimal(t["result"].get("amount_charged", "0"))
                for t in transaction_log
                if t["result"].get("success")
            )
        }



Security Architecture and Threat Model

Agentic commerce introduces a novel attack surface that requires careful threat modeling. The following section outlines the primary threat vectors and the mitigations built into the architecture described in this guide.

Threat Matrix for Machine-Initiated Commerce

Threat | Vector | Severity | Mitigation in This Architecture
Prompt Injection Purchases | Malicious content in web pages or data the agent processes convinces it to make unauthorized purchases | High | Spending controls enforced at vault layer, not agent layer; agent cannot override KMS-encrypted controls
Token Exfiltration | Attacker extracts raw token from agent context or logs | Critical | Raw token never in agent context; envelope encryption in vault; tokens accessed only by reference
Amount Manipulation | Attacker manipulates product prices between intent and execution | High | Purchase intent hash bound to cryptogram; price mismatch invalidates cryptogram
Replay Attacks | Captured transaction data replayed to re-execute a purchase | High | TAVV cryptogram is single-use; Visa rejects replayed cryptograms
Agent Identity Spoofing | Attacker impersonates an authorized agent to access its token | Critical | OpenAI-signed JWT required for agent identity assertion; vault validates JWT before token access
Velocity Fraud | Agent makes many small purchases to stay under per-transaction limit | Medium | Daily spend limit enforced in vault; velocity alerts configurable in Visa spending controls
MCC Code Bypassing | Agent routes purchase through uncontrolled merchant category | Medium | MCC allowlist enforced at vault gate before cryptogram generation; merchant IDs optionally allowlisted
Hallucinated Merchant IDs | Agent invents merchant credentials and attempts authorization | Medium | Merchant ID validation against registered Visa merchant database at authorization step

Additional Security Hardening Recommendations

Beyond the architectural mitigations in the table above, production deployments should implement several additional controls. First, implement a “purchase intent confidence threshold” in your agent system prompt—require the agent to express its confidence in the purchase decision and refuse to proceed if confidence is below a configurable threshold. Second, integrate anomaly detection on your transaction log: sudden spikes in purchase frequency or unusual merchant patterns should trigger automatic token suspension and human review.
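The anomaly-detection recommendation can be sketched as a sliding-window counter over the transaction log. This is a minimal illustration under stated assumptions, not part of any Visa or OpenAI API: the class name `PurchaseVelocityMonitor`, the default thresholds, and the in-memory storage are all hypothetical, and a production system would persist this state and call whatever token-suspension hook its vault exposes.

```python
from collections import deque
from datetime import datetime, timedelta, timezone


class PurchaseVelocityMonitor:
    """Hypothetical in-memory sliding-window check on purchase frequency."""

    def __init__(self, max_purchases=10, window=timedelta(hours=1)):
        self.max_purchases = max_purchases
        self.window = window
        self._timestamps = deque()      # purchase times still inside the window
        self._known_merchants = set()   # merchant IDs seen so far

    def record(self, merchant_id, at):
        """Record one purchase and return the anomaly flags it raises."""
        flags = []
        # Drop timestamps that have aged out of the window.
        cutoff = at - self.window
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        self._timestamps.append(at)
        if len(self._timestamps) > self.max_purchases:
            flags.append("VELOCITY_EXCEEDED")
        if merchant_id not in self._known_merchants:
            flags.append("NEW_MERCHANT")
            self._known_merchants.add(merchant_id)
        return flags
```

Any non-empty flag list would be the trigger for token suspension and human review; how those two actions are carried out depends on your vault and notification infrastructure.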

Third, implement segregated agent identities: an agent authorized for office supply procurement should have a completely separate token, vault entry, and identity from an agent authorized for software license purchases. Cross-domain contamination, where an agent with one purpose acquires the authorization of another, is a real risk in complex multi-agent systems. Fourth, write all purchase intents to an append-only audit log with a retention period matching your jurisdiction’s financial records requirements (typically 7 years in most US contexts).
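One way to make an append-only audit log tamper-evident is to chain each entry to the hash of the previous one. The sketch below is an assumption about how such a log could look, not a described part of the Visa or OpenAI stack: `AppendOnlyAuditLog` and its field names are hypothetical, and a real deployment would back this with write-once storage rather than a Python list.

```python
import hashlib
import json


class AppendOnlyAuditLog:
    """Hypothetical hash-chained log: each entry commits to the one before it."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self._entries = []

    def append(self, intent_hash, payload):
        """Add an entry and return its hash."""
        prev_hash = self._entries[-1]["entry_hash"] if self._entries else self.GENESIS
        body = {"prev_hash": prev_hash, "intent_hash": intent_hash, "payload": payload}
        entry_hash = self._digest(body)
        self._entries.append({**body, "entry_hash": entry_hash})
        return entry_hash

    def verify(self):
        """Recompute every hash; return False if any entry was altered or reordered."""
        prev_hash = self.GENESIS
        for entry in self._entries:
            body = {k: entry[k] for k in ("prev_hash", "intent_hash", "payload")}
            if entry["prev_hash"] != prev_hash or self._digest(body) != entry["entry_hash"]:
                return False
            prev_hash = entry["entry_hash"]
        return True

    @staticmethod
    def _digest(body):
        # Canonical JSON so the same content always hashes the same way.
        canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because each entry embeds its predecessor’s hash, editing or deleting an old record breaks verification for everything after it.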


Testing Your Agentic Commerce Implementation

Visa provides a comprehensive sandbox environment that mirrors production token service behavior. The sandbox supports simulated spending control violations, cryptogram generation, and even simulated network declines—all essential for building robust error handling in your agent.

Unit Tests for Critical Paths

import pytest
import pytest_asyncio
from decimal import Decimal
from unittest.mock import AsyncMock, MagicMock, patch

class TestSpendingControlGates:
    """
    Test that vault spending control gates cannot be bypassed.
    These tests verify the security properties of the architecture.
    """
    
    @pytest.mark.asyncio
    async def test_transaction_exceeding_limit_is_rejected(self):
        """Gate 2: Transaction amount must not exceed token limit."""
        vault = AgentTokenVault()
        
        # Mock DynamoDB response with $100 limit
        vault.table.get_item = MagicMock(return_value={
            "Item": {
                "agentId": "test-agent",
                "tokenReferenceId": "ref-123",
                "spendingControls": '{"maxTransactionAmount": "100.00", "allowedMccCodes": ["5411"], "dailySpendLimit": "500.00"}',
                "dailySpentAmount": Decimal("0.00"),
                "expiresAt": "2099-01-01T00:00:00+00:00",
                "encryptedToken": "{}"
            }
        })
        
        with pytest.raises(PermissionError, match="exceeds token limit"):
            await vault.retrieve_token_for_transaction(
                agent_id="test-agent",
                transaction_amount=Decimal("150.00"),  # Exceeds $100 limit
                merchant_id="merchant-123",
                mcc_code="5411"
            )
    
    @pytest.mark.asyncio
    async def test_disallowed_mcc_is_rejected(self):
        """Gate 3: MCC code must be in allowlist."""
        vault = AgentTokenVault()
        
        vault.table.get_item = MagicMock(return_value={
            "Item": {
                "agentId": "test-agent",
                "tokenReferenceId": "ref-123",
                "spendingControls": '{"maxTransactionAmount": "500.00", "allowedMccCodes": ["5411"]}',
                "dailySpentAmount": Decimal("0.00"),
                "expiresAt": "2099-01-01T00:00:00+00:00",
                "encryptedToken": "{}"
            }
        })
        
        with pytest.raises(PermissionError, match="not permitted"):
            await vault.retrieve_token_for_transaction(
                agent_id="test-agent",
                transaction_amount=Decimal("50.00"),
                merchant_id="merchant-456",
                mcc_code="7995"  # Gambling - not in allowlist
            )
    
    @pytest.mark.asyncio
    async def test_purchase_intent_hash_is_deterministic(self):
        """Purchase intent hash must be deterministic for cryptogram binding."""
        intent = PurchaseIntent(
            agent_id="agent-1",
            task_id="task-abc",
            merchant_name="Test Merchant",
            merchant_id="merchant-001",
            mcc_code="5411",
            line_items=[LineItem(description="Test Item", quantity=2, unit_price=Decimal("25.00"))],
            business_justification="Required for weekly supply replenishment task",
            urgency="routine"
        )
        
        hash1 = intent.compute_intent_hash()
        hash2 = intent.compute_intent_hash()
        
        assert hash1 == hash2, "Intent hash must be deterministic"
        assert len(hash1) == 64, "SHA-256 hash must be 64 hex characters"
    
    @pytest.mark.asyncio 
    async def test_modified_intent_produces_different_hash(self):
        """Changing any field must change the hash - prevents tampering."""
        base_intent_data = {
            "agent_id": "agent-1",
            "task_id": "task-abc",
            "merchant_name": "Test Merchant",
            "merchant_id": "merchant-001",
            "mcc_code": "5411",
            "line_items": [LineItem(description="Test Item", quantity=2, unit_price=Decimal("25.00"))],
            "business_justification": "Required for weekly supply replenishment task",
            "urgency": "routine"
        }
        
        original = PurchaseIntent(**base_intent_data)
        
        tampered_data = base_intent_data.copy()
        tampered_data["line_items"] = [
            LineItem(description="Test Item", quantity=2, unit_price=Decimal("30.00"))  # Price tampered
        ]
        tampered = PurchaseIntent(**tampered_data)
        
        assert original.compute_intent_hash() != tampered.compute_intent_hash()

Production Deployment Checklist

Before moving any agentic commerce implementation to production, work through each of these categories systematically. This checklist reflects the security and compliance requirements for live payment processing with Visa network tokens.

Infrastructure Security

  • Token vault access is restricted via IAM policies—only the payment executor service role has vault read access
  • KMS key rotation is enabled and scheduled (90-day maximum rotation interval)
  • All vault access events are logged to CloudTrail with tamper-evident log integrity enabled
  • Network policies prevent agent runtime pods from making direct calls to Visa APIs—all payment calls route through the executor service
  • mTLS certificates for Visa API communication are stored in Secrets Manager with automatic rotation
  • Token vault DynamoDB table has point-in-time recovery enabled and is encrypted at rest with a dedicated CMK

Agent Safety Controls

  • System prompt explicitly forbids the agent from discussing its payment capabilities with external content sources
  • Purchase intents above configurable threshold require dual confirmation (agent + human)
  • Maximum daily transaction count limit set in addition to dollar limits
  • Agent token is automatically suspended if three consecutive spending control violations occur
  • Agent cannot initiate purchases for merchants outside the pre-approved working set without human approval
  • Purchase intent review window of 30 seconds implemented for non-urgent purchases (human can cancel)
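The “three consecutive violations” control in the list above can be sketched as a small streak counter. This is an illustration only: `ViolationTracker` is a hypothetical name, the state is held in memory, and the actual suspension call depends on your vault. The `SPENDING_CONTROL_VIOLATION` string matches the `error_type` the payment executor returns earlier in this guide.

```python
from typing import Optional


class ViolationTracker:
    """Hypothetical counter that suspends an agent after N consecutive violations."""

    def __init__(self, max_consecutive: int = 3):
        self.max_consecutive = max_consecutive
        self._streaks = {}       # agent_id -> current run of violations
        self.suspended = set()   # agent IDs whose tokens should be suspended

    def record_result(self, agent_id: str, error_type: Optional[str]) -> bool:
        """Record one purchase outcome; return True if the agent is suspended."""
        if error_type == "SPENDING_CONTROL_VIOLATION":
            self._streaks[agent_id] = self._streaks.get(agent_id, 0) + 1
        else:
            # Any other outcome breaks the streak.
            self._streaks[agent_id] = 0
        if self._streaks[agent_id] >= self.max_consecutive:
            self.suspended.add(agent_id)
        # Suspension is sticky: only a human should lift it.
        return agent_id in self.suspended
```

Keeping suspension sticky is a deliberate choice in this sketch: a later successful purchase should not quietly re-enable an agent that has tripped the control.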

Compliance and Legal

  • Cardholder consent captured and stored with explicit disclosure of AI agent purchase capabilities
  • PCI DSS SAQ-D completed and approved for the token vault environment
  • Spending control parameters documented in user-facing terms of service
  • Transaction disputes process updated to handle agent-initiated purchase investigations
  • Data retention policies for purchase intent logs comply with applicable regulations
  • User-accessible dashboard provides real-time view of all agent transactions and a kill switch for immediate token suspension

Monitoring and Alerting

Metric | Alert Threshold | Response Action
Spending control violations per hour | > 3 | Suspend token, notify security team
Failed cryptogram generations | > 5 in 10 minutes | Circuit breaker, human review
Transaction velocity | > 10 per hour per agent | Velocity hold, cardholder notification
New merchant ID encountered | Any (if not on allowlist) | Log and alert if allowlist mode enabled
Purchase intent hash mismatches | Any | Immediate suspension, security incident
Vault decryption errors | > 1 | Alert security team immediately
Daily spend threshold reached | 80% of daily limit | Notify cardholder, prepare for limit
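Several of the thresholds above can be expressed as a plain rule check over pre-aggregated metrics. The sketch below is a hedged illustration: the metric key names and the action strings are assumptions made for this example, and how the counts are collected (CloudWatch, a log pipeline, or something else) is left open.

```python
from decimal import Decimal


def evaluate_alerts(metrics: dict, daily_limit: Decimal) -> list:
    """Return the response actions for every threshold breached in `metrics`."""
    actions = []
    if metrics.get("violations_last_hour", 0) > 3:
        actions.append("SUSPEND_TOKEN")
    if metrics.get("failed_cryptograms_last_10m", 0) > 5:
        actions.append("OPEN_CIRCUIT_BREAKER")
    if metrics.get("transactions_last_hour", 0) > 10:
        actions.append("VELOCITY_HOLD")
    if metrics.get("intent_hash_mismatches", 0) > 0:
        actions.append("SECURITY_INCIDENT")
    # Daily spend alert fires at 80% of the limit.
    spent = metrics.get("daily_spent", Decimal("0"))
    if daily_limit > 0 and spent >= daily_limit * Decimal("0.8"):
        actions.append("NOTIFY_CARDHOLDER")
    return actions
```

Using `Decimal` for the spend comparison avoids floating-point rounding at the 80% boundary.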

Handling Disputes and Reversals for Agent-Initiated Transactions

One of the least-discussed aspects of agentic commerce is what happens when something goes wrong after the transaction. The agent may have purchased the wrong item, been deceived by a fraudulent merchant, or simply made an error in interpreting the user’s task. Because each transaction is cryptographically bound to a specific purchase intent, your dispute process has artifacts that traditional card disputes lack.

The purchase intent hash stored in your audit log can be presented to Visa during a dispute as evidence of what the agent was authorized to purchase versus what it actually bought. This hash, embedded in the transaction’s cryptogram, serves as immutable proof of the agent’s declared intent at authorization time. If the merchant charged a different amount than the one committed in the purchase intent, the cryptogram itself contains evidence of the discrepancy.

class TransactionDisputeHandler:
    """
    Generates dispute evidence packages for agent-initiated transactions.
    Leverages purchase intent audit trail for Visa dispute resolution.
    """
    
    async def generate_dispute_evidence(
        self, 
        transaction_id: str,
        task_id: str
    ) -> dict:
        """
        Compile complete evidence package for transaction dispute.
        """
        # Retrieve original purchase intent from audit log
        intent_record = await self._get_intent_from_audit_log(task_id)
        
        # Retrieve agent task context (conversation thread)
        thread_transcript = await self._get_thread_transcript(task_id)
        
        # Retrieve transaction authorization record
        auth_record = await self._get_auth_record(transaction_id)
        
        # Verify intent hash consistency
        stored_hash = intent_record["intent_hash"]
        auth_hash = auth_record.get("purchaseIntentHash")
        hash_consistent = (stored_hash == auth_hash)
        
        # Compare authorized and charged amounts once and reuse the result
        amount_matches = (
            intent_record["total_amount"] == auth_record["chargedAmount"]
        )
        
        return {
            "transaction_id": transaction_id,
            "dispute_evidence": {
                "purchase_intent": intent_record["intent_payload"],
                "intent_hash": stored_hash,
                "auth_intent_hash": auth_hash,
                "hash_consistent": hash_consistent,
                "agent_justification": intent_record["business_justification"],
                "authorized_amount": intent_record["total_amount"],
                "charged_amount": auth_record["chargedAmount"],
                "amount_matches": amount_matches,
                "task_context_summary": thread_transcript[:2000],
                "authorization_timestamp": auth_record["authorizedAt"],
                "intent_timestamp": intent_record["createdAt"]
            },
            "dispute_strength": self._assess_dispute_strength(
                hash_consistent=hash_consistent,
                amount_matches=amount_matches
            )
        }
    
    def _assess_dispute_strength(
        self, 
        hash_consistent: bool, 
        amount_matches: bool
    ) -> str:
        if not hash_consistent:
            return "STRONG - Intent hash mismatch indicates transaction tampering"
        if not amount_matches:
            return "STRONG - Charged amount differs from authorized purchase intent"
        return "STANDARD - Transaction matched intent; investigate merchant fulfillment"

Scaling to Multi-Agent Workflows

The architecture described so far covers a single agent with a single payment token. Production applications increasingly involve multiple specialized agents working in coordination—a procurement agent, a price comparison agent, a supplier communication agent, and an approval routing agent might all participate in a single procurement workflow. Each agent in this chain should have its own scoped token with the minimal permissions necessary for its specific role.

The orchestration pattern that works best here is an “authorization delegation” model: the human grants a top-level orchestrator agent a token with maximum permitted scope, and that orchestrator can delegate sub-tokens with further restricted scopes to specialized sub-agents. Sub-tokens must have equal or lower limits than the token they were derived from—you cannot escalate permissions through delegation. This mirrors the principle of least privilege in traditional IAM design.
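The “no escalation through delegation” rule can be sketched as a scope comparison performed before a sub-token is issued. This is an illustrative sketch, not a Visa API: `TokenScope` and `derive_sub_scope` are hypothetical names, and the three fields simply mirror the `maxTransactionAmount`, `dailySpendLimit`, and `allowedMccCodes` controls used in the vault tests earlier.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class TokenScope:
    """Hypothetical subset of the spending controls carried by an agent token."""
    max_transaction_amount: Decimal
    daily_spend_limit: Decimal
    allowed_mcc_codes: frozenset


def derive_sub_scope(parent: TokenScope, requested: TokenScope) -> TokenScope:
    """Return `requested` if it is no broader than `parent`, else raise PermissionError."""
    if requested.max_transaction_amount > parent.max_transaction_amount:
        raise PermissionError("Sub-token transaction limit exceeds parent limit")
    if requested.daily_spend_limit > parent.daily_spend_limit:
        raise PermissionError("Sub-token daily limit exceeds parent limit")
    # The sub-token's MCC allowlist must be a subset of the parent's.
    if not requested.allowed_mcc_codes <= parent.allowed_mcc_codes:
        raise PermissionError("Sub-token MCC allowlist is not a subset of parent allowlist")
    return requested
```

For example, an orchestrator scoped to MCCs 5411 and 5943 could delegate a 5943-only sub-token, while a request that adds 7995 would be rejected.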

In implementation terms, this means your vault supports a token hierarchy: each token record includes a parent token reference (or null for top-level tokens), and your spending control validation gates check cumulative spending across the entire delegation chain—a sub-agent spending $50 counts against both its own $100 sub-limit and the parent agent’s $500 top-level limit. This prevents the classic “aggregate bypass” attack where multiple sub-agents each make small purchases that collectively exceed what the human authorized.
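The cumulative check across a delegation chain can be sketched with a parent pointer on each token record. This is a minimal in-memory illustration under stated assumptions: `TokenNode` is a hypothetical stand-in for a vault record, and a real vault would need to perform the check and the update atomically in its datastore.

```python
from decimal import Decimal
from typing import Optional


class TokenNode:
    """Hypothetical vault record: one token with a spend limit and an optional parent."""

    def __init__(self, token_id: str, limit: Decimal, parent: Optional["TokenNode"] = None):
        if parent is not None and limit > parent.limit:
            raise PermissionError("Sub-token limit cannot exceed parent limit")
        self.token_id = token_id
        self.limit = limit
        self.parent = parent
        self.spent = Decimal("0")

    def _chain(self):
        """Yield this token, then each ancestor up to the top-level token."""
        node = self
        while node is not None:
            yield node
            node = node.parent

    def authorize(self, amount: Decimal) -> None:
        """Check every token in the chain first, then charge all of them."""
        for node in self._chain():
            if node.spent + amount > node.limit:
                raise PermissionError(f"Spend would exceed limit of token {node.token_id}")
        # Only reached if no limit in the chain would be exceeded.
        for node in self._chain():
            node.spent += amount
```

With a $500 top-level token and several $100 sub-tokens, five sub-agents can each spend $100, after which a sixth is rejected even though its own sub-limit is untouched. Checking the whole chain before charging any node means a rejection leaves no partial spend recorded.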


What Comes Next: The Evolving Agentic Commerce Landscape

The OpenAI-Visa collaboration is the first major infrastructure partnership in agentic commerce, but it will not be the last. Mastercard has publicly discussed similar token-for-agents initiatives, and American Express has been piloting agent-capable corporate card APIs in enterprise contexts since late 2024. The standards work is moving quickly: EMVCo has an active working group on software principal payment credentials, and the W3C Payments Working Group is considering how agent identity might integrate with the Payment Request API.

For developers building on this infrastructure today, a few forward-looking considerations merit attention. First, the agent identity assertion mechanism will likely be standardized: today’s OpenAI-specific JWT will probably converge toward a broader “AI agent verifiable credential” standard that works across model providers. Building your vault’s token access layer to accept pluggable identity validators now will save significant refactoring later.
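A pluggable validator layer could look like the sketch below. Everything here is an assumption made for illustration: the `IdentityValidator` interface, the registry, and the scheme names are hypothetical, and the stand-in validator does a dictionary lookup where a real one would verify a signed JWT or another credential format.

```python
from typing import Optional, Protocol


class IdentityValidator(Protocol):
    """Hypothetical interface: map a raw credential to an agent ID, or None if invalid."""

    def validate(self, credential: str) -> Optional[str]: ...


class StaticTestValidator:
    """Stand-in validator for illustration; a real one would verify a signature."""

    def __init__(self, known: dict):
        self._known = known

    def validate(self, credential: str) -> Optional[str]:
        return self._known.get(credential)


class IdentityValidatorRegistry:
    """Routes an identity assertion to the validator registered for its scheme."""

    def __init__(self):
        self._validators = {}

    def register(self, scheme: str, validator: IdentityValidator) -> None:
        self._validators[scheme] = validator

    def resolve_agent(self, scheme: str, credential: str) -> str:
        validator = self._validators.get(scheme)
        if validator is None:
            raise PermissionError(f"Unsupported identity scheme: {scheme}")
        agent_id = validator.validate(credential)
        if agent_id is None:
            raise PermissionError("Agent identity assertion rejected")
        return agent_id
```

With this shape, the vault’s token access path depends only on `resolve_agent`, so supporting a new credential format means registering one more validator rather than changing the vault gates.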

Second, the concept of “agent reputation” is emerging in industry discussions—the idea that an agent’s transaction history could inform the spending limits it’s granted, similar to how consumer credit scores work. Designing your audit log schema to be compatible with potential future reputation systems (clear task-to-transaction linkage, standardized dispute outcomes, consistent metadata) is a low-cost investment with potentially high future value.

Third, the regulatory environment is shifting rapidly. The EU AI Act’s provisions on autonomous systems making financial decisions are still being interpreted for agentic commerce specifically, and several US states have introduced bills that could impose disclosure requirements on AI-initiated purchases. Building robust human notification infrastructure now—so users always know what their agents have purchased—positions you well for compliance regardless of how regulations ultimately land.


Conclusion

The convergence of OpenAI’s agent infrastructure and Visa’s tokenization platform represents a genuine architectural shift, not merely a marketing collaboration. For the first time, developers have access to a coherent, production-grade stack for building AI agents that can autonomously complete purchases within cryptographically enforced boundaries set by human principals.

The architecture in this guide—token provisioning with embedded spending controls, envelope-encrypted vault storage, purchase intent hashing for cryptogram binding, and layered spending control gates—addresses the core security requirements of machine-initiated commerce. None of these components are theoretical: each maps to real APIs and infrastructure patterns you can deploy today using Visa’s developer sandbox and the OpenAI Agents SDK.

The most important conceptual shift for developers approaching this domain is understanding that the security model is fundamentally different from consumer e-commerce. You’re not trying to verify that a human is who they say they are—you’re trying to ensure that an AI agent operates strictly within boundaries an authenticated human has deliberately set. Every design decision in the architecture flows from that distinction: the vault gates enforce human-set controls, the cryptogram binding proves the agent declared its intent before acting, and the audit trail ensures every autonomous purchase is fully attributable and reversible.

Start with the sandbox, build your vault and control layer first (before any agent integration), and treat spending control violations as security events rather than operational noise. The developers who get this right early will have a significant advantage as agentic commerce moves from infrastructure experiment to mainstream commerce pattern over the next 18 to 24 months.
