Claude applications often keep quality high by sending broad context. That works until repeated chunks, tool traces, old chat turns, and policy boilerplate make each call more expensive than it needs to be.
Adola compresses the final context block before the model call. Rose 1 returns plain compressed text plus a receipt with original tokens, output tokens, saved tokens, compression ratio, latency, and risk flags.
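To make the receipt fields concrete, here is a minimal sketch of how saved tokens and compression ratio relate to the two token counts. The field names `original_tokens` and `output_tokens` are assumptions, as is the definition of the ratio as output divided by original. Check both against a real receipt before relying on them.

```javascript
// Hypothetical receipt shape: field names are assumptions, not a documented schema.
function summarizeReceipt(receipt) {
  const original = receipt.original_tokens;
  const output = receipt.output_tokens;
  const saved = original - output;
  // Assumed definition: ratio = output tokens / original tokens.
  const ratio = original > 0 ? output / original : 1;
  const percentSaved = original > 0 ? (saved * 100) / original : 0;
  return { saved, ratio, percentSaved };
}

// Illustrative numbers only.
const summary = summarizeReceipt({ original_tokens: 4000, output_tokens: 1400 });
console.log(summary);
```

Logging a summary like this per call is one way to track whether compression is paying for its extra hop.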
Best fits
Claude RAG calls
Compress retrieved documents before adding them to the Claude message context.
Tool-using agents
Reduce tool output and previous steps before the next Claude planning call.
Support copilots
Shrink ticket history and policy text before Claude drafts a support response.
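In all three cases the application first gathers its pieces into one context block. The sketch below shows one possible way to do that before compression. The helper `buildContext`, the section labels, and the separator are hypothetical choices for illustration and are not part of the Adola API.

```javascript
// Hypothetical helper: assembles the single context string to compress.
// Labels and separators are illustrative, not required by any API.
function buildContext(sections) {
  return sections
    .filter((section) => section.text && section.text.trim().length > 0)
    .map((section) => `[${section.label}]\n${section.text.trim()}`)
    .join("\n\n");
}

// Made-up sample inputs for a support copilot.
const context = buildContext([
  { label: "retrieved", text: "Refund policy: 30 days from delivery." },
  { label: "tool_trace", text: "lookup_order(1042) -> delivered" },
  { label: "ticket_history", text: "   " } // blank sections are skipped
]);
console.log(context);
```

Keeping the sections labeled makes it easier to compare the original and compressed text when reviewing a receipt.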
Minimal server-side pattern
Keep both keys on your server. The Anthropic request receives the compressed context, while your application logs the receipt for cost and quality review.
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

export async function answerWithCompressedClaude({ question, context }) {
  // Compress the final context block before the Claude call.
  const response = await fetch("https://api.adola.app/v1/compress", {
    method: "POST",
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${process.env.ADOLA_API_KEY}`
    },
    body: JSON.stringify({
      model: "rose-1",
      query: question,
      input: context,
      compression: { target_ratio: 0.35, preserve_order: true }
    })
  });

  // Fail loudly rather than sending an undefined context to Claude.
  if (!response.ok) {
    throw new Error(`Adola compression failed with status ${response.status}`);
  }
  const compressed = await response.json();

  // Claude receives the compressed text, not the original context.
  const message = await anthropic.messages.create({
    model: "claude-3-5-haiku-latest",
    max_tokens: 600,
    messages: [
      {
        role: "user",
        content: `Question: ${question}\n\nCompressed context:\n${compressed.output}`
      }
    ]
  });

  // Return the receipt so the caller can log it for cost and quality review.
  return { message, compressionReceipt: compressed.receipt };
}

Try the compression hop first
The public demo endpoint is capped and does not require a key. Use one real prompt to inspect the receipt before changing your Claude path.
curl -s https://api.adola.app/v1/demo/compress \
  -H 'content-type: application/json' \
  --data '{
    "model": "rose-1",
    "query": "What should Claude answer?",
    "input": "Long retrieved context, tool trace, support ticket, or policy text...",
    "compression": { "target_ratio": 0.35, "preserve_order": true }
  }'
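Once a receipt has been inspected by hand, the same check can be sketched in code. The example below falls back to the original context when the receipt looks risky or the savings are small. The receipt field names `risk_flags` and `saved_tokens`, the helper `chooseContext`, and the `minSavedTokens` threshold are all assumptions for illustration. Adjust them to match the receipt the endpoint actually returns.

```javascript
// Hypothetical receipt fields: `risk_flags` (array of strings) and
// `saved_tokens` (number). Neither name is confirmed by the source.
function chooseContext(original, compressed, { minSavedTokens = 100 } = {}) {
  const receipt = compressed.receipt ?? {};
  const flags = receipt.risk_flags ?? [];
  const saved = receipt.saved_tokens ?? 0;

  // Fall back to the uncompressed context when any risk flag is present
  // or the savings are too small to justify the extra hop.
  if (flags.length > 0 || saved < minSavedTokens) {
    return { text: original, usedCompression: false };
  }
  return { text: compressed.output, usedCompression: true };
}

// Made-up response for illustration.
const choice = chooseContext("long original context", {
  output: "short context",
  receipt: { risk_flags: [], saved_tokens: 500 }
});
console.log(choice.usedCompression);
```

Treating any risk flag as a reason to fall back is a deliberately conservative default. A production path might instead log the flag and decide per flag type.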