LangChain ContextualCompressionRetriever alternative

LangChain's ContextualCompressionRetriever helps when the bottleneck is too many retrieved documents. Production RAG and agent systems often hit a second bottleneck later in the pipeline: the final prompt also contains policies, memory, tool output, prior turns, and formatting instructions.

That is where Adola fits. Keep your existing retriever. Join the context exactly as your chain or graph already does, send that block to Rose 1 with the user query, then pass the compressed output to your model.

Retrieve → Rerank → Assemble → Compress → Call LLM
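The assemble step can be sketched as a small helper that joins every prompt component into one labeled block before compression. The `ContextPart` type, the `assembleContext` name, and the `[label]` prefix format are assumptions made for illustration; they are not part of LangChain or the Adola API, and whatever joining logic your chain already uses should stay as it is.

```typescript
// Hypothetical helper: the names and the "[label]" format are illustrative assumptions.
interface ContextPart {
  label: string; // e.g. "policy", "memory", "tool output", "doc 1"
  text: string;
}

// Join non-empty parts in the order given, so the block matches what the
// model would have received without compression.
function assembleContext(parts: ContextPart[]): string {
  return parts
    .filter((part) => part.text.trim().length > 0)
    .map((part) => `[${part.label}] ${part.text.trim()}`)
    .join("\n\n");
}

const block = assembleContext([
  { label: "policy", text: "Refunds are allowed within 30 days." },
  { label: "memory", text: "" }, // empty parts are skipped
  { label: "doc 1", text: "Order 1234 shipped on Monday." },
]);
```

The resulting `block` string is what would be sent as `input` in the compression request, alongside the user query.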

How it differs

Retriever filters

LangChain compression retrievers usually shrink the retrieved document set before prompt assembly.

Prompt compression

Rose 1 compresses the final context block you were about to send to the model.

Provider-neutral output

The result is plain text that can go to OpenAI, Claude, DeepSeek, local models, or a gateway.
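Because the compressed context is an ordinary string, the same value can be dropped into whichever request shape a provider expects. The sketch below builds only the request bodies for an OpenAI-style chat call and an Anthropic-style Messages call; the function names are hypothetical, the model name is whatever you already use, and each provider's current API reference should be checked before relying on these shapes.

```typescript
// Sketch only: builds request bodies, does not call any provider.
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

const SYSTEM_PROMPT =
  "Answer from the supplied context. Say when context is insufficient.";

// The user turn is identical for every provider.
function userTurn(question: string, compressedContext: string): string {
  return `Question: ${question}\n\nContext:\n${compressedContext}`;
}

// OpenAI-style chat body: the system prompt is the first message.
function toOpenAIStyle(model: string, question: string, compressedContext: string) {
  const messages: ChatMessage[] = [
    { role: "system", content: SYSTEM_PROMPT },
    { role: "user", content: userTurn(question, compressedContext) },
  ];
  return { model, messages };
}

// Anthropic-style Messages body: the system prompt is a top-level field
// and max_tokens is required.
function toAnthropicStyle(
  model: string,
  question: string,
  compressedContext: string,
  maxTokens: number
) {
  const messages: ChatMessage[] = [
    { role: "user", content: userTurn(question, compressedContext) },
  ];
  return { model, system: SYSTEM_PROMPT, max_tokens: maxTokens, messages };
}
```

Only the envelope changes between providers; the compressed text itself needs no conversion.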

Server-side pattern

const context = retrievedDocs
  .map((doc, i) => `[doc ${i + 1}] ${doc.pageContent}`)
  .join("\n\n");

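// Send the assembled block and the user query to Rose 1.
const response = await fetch("https://api.adola.app/v1/compress", {
  method: "POST",
  headers: {
    "content-type": "application/json",
    authorization: `Bearer ${process.env.ADOLA_API_KEY}`
  },
  body: JSON.stringify({
    model: "rose-1",
    query: question,
    input: context,
    compression: { target_ratio: 0.35, preserve_order: true }
  })
});

// Fail loudly on HTTP errors instead of parsing an error body as a result.
if (!response.ok) {
  throw new Error(`Compression request failed: ${response.status}`);
}

const compressed = await response.json();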

const answer = await model.invoke([
  ["system", "Answer from the supplied context. Say when context is insufficient."],
  ["human", `Question: ${question}\n\nContext:\n${compressed.output}`]
]);
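The pattern above reads `compressed.output` directly. In a production chain it may be worth guarding that read so a malformed or empty response falls back to the uncompressed block instead of sending an empty context to the model. The following is a minimal sketch under that assumption: `pickContext`, `CompressResponse`, and `ContextChoice` are hypothetical names, and only the `output` field is taken from the example above.

```typescript
// Only `output` comes from the example above; treating it as optional is a
// defensive assumption, not documented API behavior.
interface CompressResponse {
  output?: unknown;
}

interface ContextChoice {
  text: string;
  compressed: boolean;
  charRatio: number; // compressed length / original length, in characters (not tokens)
}

function pickContext(
  original: string,
  response: CompressResponse | null
): ContextChoice {
  const output = response?.output;
  if (
    typeof output === "string" &&
    output.trim().length > 0 &&
    original.length > 0
  ) {
    return {
      text: output,
      compressed: true,
      charRatio: output.length / original.length,
    };
  }
  // Fall back to the uncompressed block so the chain can still answer.
  return { text: original, compressed: false, charRatio: 1 };
}
```

In the pattern above this would be used as `const { text } = pickContext(context, compressed);`, with `text` placed in the human message. `charRatio` is a rough character-based measure for logging; it is not expected to equal `target_ratio`, since how that ratio is measured is not stated here.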

Try it without a key

The capped demo endpoint lets you test one real retrieval block before creating a workspace or touching your production chain.

curl -s https://api.adola.app/v1/demo/compress \
  -H 'content-type: application/json' \
  --data '{
    "model": "rose-1",
    "query": "What should the answer say?",
    "input": "Paste retrieved documents or LangGraph tool output here...",
    "compression": { "target_ratio": 0.35, "preserve_order": true }
  }'