Claude prompt compression before Anthropic API calls

Claude applications often keep quality high by sending broad context. That works until repeated chunks, tool traces, old chat turns, and policy boilerplate make each call more expensive than it needs to be.

Adola compresses the final context block before the model call. Rose 1 returns plain compressed text plus a receipt with original tokens, output tokens, saved tokens, compression ratio, latency, and risk flags.

Assemble contextCompressCall ClaudeReturn answerLog receipt

Best fits

Claude RAG calls

Compress retrieved documents before adding them to the Claude message context.

Tool-using agents

Reduce tool output and previous steps before the next Claude planning call.

Support copilots

Shrink ticket history and policy text before Claude drafts a support response.

Minimal server-side pattern

Keep both keys on your server. The Anthropic request receives the compressed context, while your application logs the receipt for cost and quality review.

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

export async function answerWithCompressedClaude({ question, context }) {
  const compressed = await fetch("https://api.adola.app/v1/compress", {
    method: "POST",
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${process.env.ADOLA_API_KEY}`
    },
    body: JSON.stringify({
      model: "rose-1",
      query: question,
      input: context,
      compression: { target_ratio: 0.35, preserve_order: true }
    })
  }).then((response) => response.json());

  const message = await anthropic.messages.create({
    model: "claude-3-5-haiku-latest",
    max_tokens: 600,
    messages: [
      {
        role: "user",
        content: `Question: ${question}\n\nCompressed context:\n${compressed.output}`
      }
    ]
  });

  return { message, compressionReceipt: compressed.receipt };
}

Try the compression hop first

The public demo endpoint is capped and does not require a key. Use one real prompt to inspect the receipt before changing your Claude path.

curl -s https://api.adola.app/v1/demo/compress \
  -H 'content-type: application/json' \
  --data '{
    "model": "rose-1",
    "query": "What should Claude answer?",
    "input": "Long retrieved context, tool trace, support ticket, or policy text...",
    "compression": { "target_ratio": 0.35, "preserve_order": true }
  }'