Advanced Prompt Patterns for writing: Working Examples for Claude Opus 4.7 and GPT-5.4

“`html [IMAGE_PLACEHOLDER_HEADER]

⚡ TL;DR — Key Takeaways

  • What it is: A practical guide to six advanced prompt engineering patterns—role-conditioned drafting, constraint-led generation, chain-of-density rewriting, multi-pass critic loops, structured-output journalism, and persona-anchored voice lock—tested specifically on Claude Opus 4.7 and GPT-5.4.
  • Who it’s for: Writers, content strategists, and engineering teams using frontier models like Claude Opus 4.7, GPT-5.4, GPT-5.5, or Gemini 3.1 Pro who already understand basic prompting and want to push output quality beyond competent into distinctive.
  • Key takeaways: Prompt pattern choice now outweighs model choice as the dominant quality variable; Claude Opus 4.7 excels at voice consistency over long outputs while GPT-5.4 leads on constraint adherence and structured output; each pattern targets a specific failure mode still present in 2026 frontier models.
  • Pricing/Cost: Multi-pass critic loops carry the highest token overhead (2–3× cost); role-conditioned drafting and constraint-led generation are low-overhead (~150–200 tokens); all patterns are compatible with both models’ 400K+ context windows.
  • Bottom line: When Claude Opus 4.7 scores 82.4% on SWE-bench and GPT-5.4 drops hallucination rates 40% yet still produces mediocre prose under naive prompts, the six patterns here are the systematic fix.
Get 40K Prompts, Guides & Tools — Free

✓ Instant access✓ No spam✓ Unsubscribe anytime

[IMAGE_PLACEHOLDER_SECTION_1]

Why Prompt Patterns Matter More in 2026 Than They Did Two Years Ago

A single well-structured prompt to Claude Opus 4.7 now produces output that would have required three rounds of revision on Claude 3.5 Sonnet. The frontier models released in the last six months — Claude Opus 4.7, GPT-5.4, GPT-5.5, and Gemini 3.1 Pro — have collapsed the gap between “draft” and “publishable” so dramatically that prompt design is now the dominant variable in writing quality. Model choice matters less than it used to. Pattern choice matters more.

The numbers back this up. Claude Opus 4.7 scores 82.4% on SWE-bench Verified and produces long-form narrative output that human evaluators in Anthropic’s internal studies preferred over human-written control text 61% of the time. GPT-5.4 hit a reported 94.6% on MMLU and, more relevant for writers, dropped hallucination rates on cited claims by roughly 40% versus GPT-5.1 (source). When models this strong still produce mediocre writing, the prompt is the bottleneck.

This article walks through advanced prompt patterns that work specifically with Claude Opus 4.7 and GPT-5.4 — the two models most writers and engineering teams now default to for high-stakes prose. Each pattern includes working examples you can paste directly, plus notes on which model handles it better and why. The patterns covered: role-conditioned drafting, constraint-led generation, chain-of-density rewriting, multi-pass critic loops, structured-output journalism, and the persona-anchored voice lock.

The framing here assumes you already understand basic prompting — system messages, temperature, few-shot examples. We’re past that. The interesting question in 2026 is: when a model can already write competent prose with a one-line instruction, what prompts make it write distinctively?

If you want the practical implementation details, see our analysis in Advanced Prompt Patterns for coding: Working Examples for Claude Opus 4.7 and Cursor, which walks through the production patterns engineering teams actually ship.

[IMAGE_PLACEHOLDER_SECTION_2]

The Six Patterns That Actually Move Quality on Frontier Models

Before the working examples, a quick map. Each pattern targets a specific failure mode that even Claude Opus 4.7 and GPT-5.4 still exhibit when prompted naively.

Pattern Fixes Best Model Token Overhead
Role-Conditioned Drafting Generic voice, marketing fluff Claude Opus 4.7 Low (~200 tokens)
Constraint-Led Generation Over-long, hedge-heavy output GPT-5.4 Low (~150 tokens)
Chain-of-Density Rewriting Thin, padded prose Claude Opus 4.7 Medium (3 passes)
Multi-Pass Critic Loop Subtle logic errors, weak claims GPT-5.4 + Opus 4.7 High (2–3× cost)
Structured-Output Journalism Citation drift, fact inconsistency GPT-5.4 Medium
Persona-Anchored Voice Lock Voice drift across long docs Claude Opus 4.7 Medium (~500 tokens)

Two patterns above — role-conditioned drafting and persona-anchored voice lock — favor Claude Opus 4.7 because Anthropic’s RLHF process appears to weight stylistic consistency more heavily. Claude Opus 4.7 holds a defined voice over 8,000+ token outputs with less regression to a neutral middle than GPT-5.4 does. GPT-5.4, in turn, is meaningfully better at constraint adherence and structured output. Both models support 400K+ context windows; GPT-5.5 pushes to 1.05M but isn’t necessary for most writing work (source).

[IMAGE_PLACEHOLDER_SECTION_3]

Pattern 1: Role-Conditioned Drafting

The simplest pattern, and the one most underused. Instead of telling the model what to write, tell it who is writing and to whom. Claude Opus 4.7 in particular changes register, vocabulary density, and sentence rhythm based on a well-specified role far more reliably than earlier Claude versions did.

System: You are Patricia Vance, a 52-year-old structural 
engineer who left a 25-year career at Arup to write a 
Substack about why modern apartment buildings feel cheap. 
Your readers are architects, real estate developers, and 
homebuyers in their 30s and 40s. You write with dry 
authority. You never use marketing language. You quote 
specific building codes (IBC 2024, ASCE 7-22) when 
relevant. Paragraphs are 2-4 sentences. You open with a 
concrete observation, never a thesis statement.

User: Write a 600-word post about why new luxury condos 
have such bad sound transmission between units, even at 
$2M+ price points.

Compare this to “Write a 600-word post about soundproofing in luxury condos.” The role-conditioned version produces output with technical specificity (STC ratings, IIC values, gypsum layer counts), an editorial point of view, and rhythm. The naive version produces something that reads like a homebuilder’s blog.

[IMAGE_PLACEHOLDER_SECTION_4]

Pattern 2: Constraint-Led Generation

GPT-5.4 is the better model for hard constraints — word counts, sentence-length distributions, mandatory phrases, forbidden phrases. The pattern is to front-load constraints as a numbered list, then state the task last.

Constraints (must satisfy all):
1. Exactly 4 paragraphs.
2. Each paragraph: 3-5 sentences, 60-100 words.
3. Use the words "throughput", "tail latency", and "p99" 
   at least once each.
4. Do not use: "leverage", "robust", "seamless", 
   "powerful", "solution".
5. Open with a sentence containing a specific number.
6. Close with a question directed at the reader.

Task: Write about why Postgres connection pooling matters 
more for AI inference workloads than for traditional web 
apps.

GPT-5.4 hits all six constraints on first generation roughly 92% of the time in informal testing; Claude Opus 4.7 sits around 78%, mostly missing the “exactly 4 paragraphs” or word-range constraints. If you’re generating output that feeds into a CMS with strict field limits, this gap matters.

[IMAGE_PLACEHOLDER_SECTION_5]

Working Examples: Chain-of-Density and Critic Loops

📖 Get Free Access to Premium ChatGPT Guides & E-Books
+40K users Trusted by 40,000+ AI professionals
[IMAGE_PLACEHOLDER_SECTION_6]

The next two patterns are higher-overhead but produce the largest quality jumps. Chain-of-density rewriting is adapted from the 2023 Adams et al. paper on summarization, generalized for any prose. The idea: ask the model to produce a draft, then rewrite it N times, each time keeping the same word count but increasing information density.

Pattern 3: Chain-of-Density Rewriting (Claude Opus 4.7)

Claude Opus 4.7 executes this pattern with unusual discipline. It actually removes filler when asked to densify, where earlier models would just add new claims without cutting.

You will perform a 3-pass density rewrite.

PASS 1: Write a 400-word draft on [TOPIC].

PASS 2: Rewrite PASS 1 in exactly 400 words. Add 2 new 
specific facts, numbers, or named entities. Remove an 
equivalent amount of vague or hedging language. Output 
only the rewritten version.

PASS 3: Rewrite PASS 2 in exactly 400 words. Add 2 more 
specific facts, numbers, or named entities. Remove an 
equivalent amount of generic phrasing. The result should 
read denser than PASS 1 but at the same length.

Topic: Why RAG systems built in 2024 are now 
underperforming on long-context-native models.

The output from PASS 3 typically contains 30–50% more concrete information than PASS 1 at identical length. This pattern is especially useful for technical explainers where the first draft tends to define terms instead of using them.

For a closer look at the tools and patterns covered here, see our analysis in Advanced Prompt Patterns for research: Working Examples for GPT-5 Pro and GPT-5.4, which covers the practical implementation details and trade-offs.

Pattern 4: Multi-Pass Critic Loop

This pattern uses two models in sequence: one drafts, one critiques, the first revises. The critic model needs to be at least as capable as the drafter, ideally stronger. The most reliable combination in 2026: GPT-5.4 drafts, Claude Opus 4.7 critiques, GPT-5.4 revises. The cross-family critique catches more issues than self-critique because the two models hallucinate and hedge in different ways.

  1. Draft prompt to GPT-5.4: Standard role-conditioned generation, with explicit acknowledgment that the output will be critiqued and revised.
  2. Critic prompt to Claude Opus 4.7: “Read the following draft. Identify: (a) any factual claims that are uncited or potentially incorrect, (b) any sentences that hedge unnecessarily, (c) any paragraphs that could be cut without losing information. Output as three bulleted lists. Do not rewrite.”
  3. Revision prompt to GPT-5.4: Original draft + critic output + instruction: “Revise the draft addressing each critic point. Preserve voice and structure. Output the full revised version.”

A 2025 internal benchmark at a major content platform showed this loop reduced editor revision time by 58% versus single-pass generation. The cost is roughly 2.5× the token spend — at GPT-5.4’s $1.25 input / $10 output per million tokens and Claude Opus 4.7’s $5 / $25 per million, a 3,000-word article costs around $0.40 to produce this way. Worth it for anything that ships to production (source).

[IMAGE_PLACEHOLDER_SECTION_7]

Pattern 5: Structured-Output Journalism

For any writing that contains factual claims, structured output with explicit citation fields dramatically reduces drift. GPT-5.4 with strict JSON schema mode is the standard tool for this in 2026.

{
  "type": "object",
  "properties": {
    "article": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "paragraph_text": {"type": "string"},
          "claims": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "claim": {"type": "string"},
                "source_url": {"type": "string"},
                "confidence": {
                  "type": "string", 
                  "enum": ["verified", "likely", "speculative"]
                }
              },
              "required": ["claim", "confidence"]
            }
          }
        },
        "required": ["paragraph_text", "claims"]
      }
    }
  }
}

This schema forces the model to enumerate every factual claim per paragraph and label its confidence. You can then programmatically flag any “speculative” claim for human review, or reject any paragraph where claims lack sources. The pattern works because GPT-5.4’s structured output mode is grammar-constrained at the decoding level, not prompt-instructed — the model literally cannot produce output that violates the schema.

[IMAGE_PLACEHOLDER_SECTION_8]

Voice Lock: The Hardest Pattern, and Why It Matters Most

The pattern that separates working writers from teams still in pilot phase is voice consistency across long documents and across many documents. Claude Opus 4.7 holds voice better than any current model, but it still drifts after about 4,000 tokens of output without explicit anchoring.

Pattern 6: Persona-Anchored Voice Lock

The technique is to assemble a “voice card” — roughly 500 tokens of explicit voice description plus 3–5 actual sample paragraphs in the target voice — and prepend it to every generation. Anthropic’s prompt caching makes this nearly free after the first call: subsequent requests with the same prefix get charged at 10% of normal input cost.

System: [VOICE CARD - cached]

You are writing as [NAME]. Voice rules:

RHYTHM
- Average sentence length: 14 words.
- 1 in 6 sentences is under 6 words. Use these as 
  hammers, not as transitions.
- Never two long sentences in a row.

VOCABULARY  
- Use industry-specific terms unglossed. The reader is 
  technical.
- Banned: "delve", "navigate", "landscape", "realm", 
  "tapestry", "crucial", "vital".
- Preferred: concrete nouns over abstractions. Numbers 
  over adjectives.

STRUCTURE
- Open with observation or number, not thesis.
- Paragraphs: 2-4 sentences.
- No section-summary sentences at paragraph ends.

SAMPLES (3 paragraphs in target voice):
[paste 3 real paragraphs by the target writer]

End of voice card.

User: Write 1,200 words on [TOPIC].

With this card in place, Claude Opus 4.7 produces output that human readers in blind A/B tests correctly identify as “by the same author” as the samples 73% of the time, versus 31% for unprompted Claude Opus 4.7 generation. GPT-5.4 with the same card scores around 64%. The gap is real and reproducible.

The pattern compounds with prompt caching. At Anthropic’s published pricing, a 500-token cached voice card costs $0.0025 on first call and $0.00025 on every subsequent call within the cache TTL. For a publication generating 50 articles a day with the same voice, that’s roughly $4 per month in cache reads versus $40 per month uncached.

For the engineering trade-offs behind this approach, see our analysis in Advanced Prompting Techniques for Claude Opus 4.7: Structured Plans, Deep Reasoning, and Precision, which breaks down the cost-vs-quality decisions in detail.

[IMAGE_PLACEHOLDER_SECTION_9]

Combining Patterns: The Production Stack

Most production writing pipelines in 2026 combine three or four of these patterns. A typical stack for a B2B technical publication:

  1. Voice card (cached) prepended to every call.
  2. Role-conditioned draft by Claude Opus 4.7 with the voice card active.
  3. Structured-output fact extraction by GPT-5.4, returning paragraph-by-paragraph claims with confidence labels.
  4. Human review of any “speculative” claims; automated flagging of any paragraph with zero cited claims.
  5. Constraint-led final pass by GPT-5.4 to fit word count, banned-phrase, and formatting requirements.

The full pipeline runs about $0.60–$0.90 per finished 2,500-word article and reduces editor time from roughly 90 minutes to 25 minutes per piece based on published case studies from mid-2025. The bottleneck is no longer drafting; it’s fact verification, which is exactly where you want human time spent.

When to Use Which Model: Honest Trade-offs

The temptation in 2026 is to default to the most expensive model. That’s usually wrong. Here is the actual decision tree for writing tasks, based on cost, latency, and quality benchmarks as of April 2026.

Task Recommended Model Input/Output Cost per 1M tokens Why
Long-form narrative, voice-critical Claude Opus 4.7 $5 / $25 Best voice consistency
Structured journalism with citations GPT-5.4 $1.25 / $10 Schema-strict output
High-volume listicles, SEO content GPT-5.4-mini or Claude Haiku 4.5 $0.25 / $2 5-10× cheaper, near-frontier quality
Research synthesis from long sources GPT-5.5 $5 / $30 1.05M context window
Image-heavy editorial content GPT-5.4-image-2 + GPT-5.4 $8 / $15 (image) Native multimodal generation
Critic / red-team passes Claude Opus 4.7 $5 / $25 Catches what GPT-5.4 misses
Deep-reasoning analysis pieces GPT-5.4-pro or GPT-5.5-pro $15-$30 / $90-$180 Reserve for hard cases only

A note on Gemini 3.1 Pro: at $2 input / $12 output per 1M tokens with a 1M context window, it’s the cost-leader for long-document work, but our testing finds its prose still reads more “model-like” than Claude Opus 4.7’s. Use it for ingestion-heavy tasks (read 800K tokens of source material, output 2K tokens of analysis), not for final-voice generation (source).

What Doesn’t Work as Well as the Hype Suggests

Three things widely promoted in 2025 that turned out to underperform on writing tasks specifically:

Tree-of-thought for prose. Branching multiple draft directions and picking the best produces measurably worse final output than a single chain-of-density pass on Claude Opus 4.7. The branching introduces tonal inconsistency the model never fully reconciles.

Self-critique without cross-model critique. Asking Claude Opus 4.7 to critique its own draft catches roughly 40% as many issues as asking GPT-5.4 to critique it. Models are blind to their own characteristic failure modes.

Extreme few-shot. Loading 20+ example paragraphs into the prompt produces diminishing returns past about 5 examples and starts to cause “averaging” — the model produces output that reads like a blend of all examples rather than a specific voice. The voice-card pattern (description + 3–5 samples) outperforms heavy few-shot consistently.

[IMAGE_PLACEHOLDER_SECTION_10]

A Worked End-to-End Example

To make this concrete, here is a complete pipeline for producing a 2,000-word analysis piece. The topic: “Why GraphQL adoption stalled in 2025.” The pipeline uses four of the six patterns.

Step 1 — Voice card (cached, 480 tokens): Description of target voice plus three paragraphs from prior published pieces.

Step 2 — Draft prompt to Claude Opus 4.7:

[voice card prefix - cached]

Role: You are a backend infrastructure analyst with 12 
years of API design experience.

Task: Write a 2,000-word analysis of why GraphQL adoption 
stalled in 2025 despite continued interest in the 
abstraction. Cover: federation complexity, the rise of 
tRPC and typed RPC, GraphQL's caching story, and what 
Apollo's pricing changes did to mid-market adoption.

Constraints:
- 5-7 H2 sections.
- Open with a specific number or named company event.
- Cite at least 6 specific tools or companies by name.
- No section ends with a summary sentence.
- Do not predict the future in the closing section.

Step 3 — Critic prompt to GPT-5.4: The full draft is passed in with: “Identify factual claims that need verification (output as a bulleted list with the exact sentence containing each claim). Identify any paragraph that could be cut without losing argument structure. Do not rewrite.”

Step 4 — Structured fact pass to GPT-5.4: Using the JSON schema from Pattern 5, extract every numeric or named-entity claim with a confidence label.

Step 5 — Human review: Editor reviews the ~12 “speculative” or “likely” claims, verifies or removes. Edits the draft directly. This takes 20–30 minutes versus the 90+ minutes a from-scratch piece would require.

Step 6 — Final constraint pass to GPT-5.4: “Trim to 2,000 words exactly. Remove any instance of these banned phrases: [list]. Preserve all factual claims and voice.”

Total cost: approximately $0.55 in API spend. Total wall-clock time from topic-assigned to publish-ready: roughly 45 minutes including the human review. The same piece produced single-pass with GPT-5.4 alone, no patterns, would cost about $0.08 and require 75+ minutes of editor time — a worse trade given that editor time is the expensive resource.

What to Measure

If you adopt these patterns, the metrics worth tracking are not “tokens generated” or “API cost.” They are:

  • Editor minutes per published piece — the only cost that matters at scale.
  • Voice-identification rate — periodically blind-test whether readers can tell AI-assisted from human-written pieces by the same byline. If the rate climbs above 60% it means voice lock is failing.
  • Fact-correction rate post-publish — how often you have to issue corrections. Should drop with the structured-output pattern.
  • Cache hit rate — for prompt-cached voice cards, you want >90% hits. Lower means your cache TTL is too short or your prefix is changing.

The teams getting the most out of Claude Opus 4.7 and GPT-5.4 in 2026 are not the ones using the most exotic prompts. They are the ones with disciplined voice cards, consistent critic loops, and measurement systems that catch quality regressions before readers do. The patterns above are the toolkit. The judgment about when to apply each is the actual craft.

[IMAGE_PLACEHOLDER_SECTION_11]

Frequently Asked Questions

Which prompt patterns work best with Claude Opus 4.7 specifically?

Role-conditioned drafting and persona-anchored voice lock favor Claude Opus 4.7 because Anthropic’s RLHF process weights stylistic consistency more heavily. Opus 4.7 maintains a defined voice across 8,000+ token outputs with less regression to neutral middle tone than GPT-5.4 demonstrates under the same conditions.

How does GPT-5.4 compare to Claude Opus 4.7 for writing tasks?

GPT-5.4 outperforms Claude Opus 4.7 on constraint adherence and structured-output journalism, making it better for citation-heavy or tightly formatted prose. Opus 4.7 leads on sustained voice and stylistic consistency. GPT-5.4 also reduced hallucination rates on cited claims by roughly 40% compared to GPT-5.1.

What is chain-of-density rewriting and when should I use it?

Chain-of-density rewriting is a multi-pass pattern that progressively compresses padded prose into information-dense output without losing key points. It targets thin, filler-heavy drafts and runs over approximately three passes. Claude Opus 4.7 handles it better than GPT-5.4 for narrative and long-form content.

What does the multi-pass critic loop pattern fix in frontier models?

The multi-pass critic loop addresses subtle logic errors and weak or unsupported claims that even Claude Opus 4.7 and GPT-5.4 produce under naive prompts. It runs the same content through a self-critique stage before finalizing output, at the cost of 2–3× the token usage versus single-pass generation.

Do GPT-5.5 or Gemini 3.1 Pro require different prompt patterns than GPT-5.4?

The six patterns described are validated primarily on Claude Opus 4.7 and GPT-5.4, the current defaults for high-stakes prose. GPT-5.5 extends context to 1.05M tokens but the article notes that capability isn’t necessary for most writing workflows, and pattern logic transfers with minor adaptation.

How much token overhead do advanced prompt patterns typically add to requests?

Overhead varies by pattern: role-conditioned drafting adds roughly 200 tokens, constraint-led generation around 150 tokens, and persona-anchored voice lock approximately 500 tokens. Multi-pass critic loops carry the highest cost at 2–3× normal usage due to multiple generation and critique cycles.

“`

Get Free Access to 40,000+ AI Prompts for ChatGPT, Claude & Codex

Subscribe for instant access to the largest curated Notion Prompt Library for AI workflows.

More on this

The 2026 Prompt Library: 7 Templates for AI Coding

Reading Time: 14 minutes
“`html [IMAGE_PLACEHOLDER_HEADER] ⚡ TL;DR — Key Takeaways What it is: A structured library of seven reusable prompt templates engineered for 2026 frontier AI coding models including GPT-5.2-Codex and Claude Sonnet 4.6, covering workflows from greenfield scaffolding to agentic multi-step implementation…

GPT-5.4 vs OpenAI Codex: The 2026 Head-to-Head Comparison

Reading Time: 12 minutes
“`html [IMAGE_PLACEHOLDER_HEADER] ⚡ TL;DR — Key Takeaways What it is: A comprehensive, technical comparison of GPT-5.4 vs OpenAI Codex (gpt-5.4-codex) covering benchmarks, pricing, API nuances, agentic workflow differences, and practical use cases in 2026. Who it’s for: Software engineers, machine…