Convoy
Convoy is an A/B testing platform for AI agents. Any change to a model, prompt, tool, workflow step, or sandbox can be tested against real user traffic. Convoy routes a small slice of requests to the new version, evaluates the results, and automatically promotes or rolls back the change.
Integration
Every AI agent you want to test on Convoy must be triggered by an API call. Each endpoint is a testable unit, whether it’s a microservice API, a route in a monolith, or any other unit you want to A/B test end-to-end. There are two integration points:
Client
In the code that calls your agent, redirect requests through the Convoy proxy.
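As a rough illustration of the client-side change, the sketch below builds a request aimed at the proxy instead of the agent and attaches the shared secret as a bearer token. The `https` scheme, the `/chat` path, the JSON body shape, and the helper name `build_proxied_request` are assumptions made for the example, not part of Convoy's documented API.

```python
import json
import urllib.request

# Example proxy host taken from the Convoy dashboard.
# The https scheme and the "/chat" path are assumptions for this sketch.
PROXY_URL = "https://acme--chatbot.proxy.convoylabs.com/chat"


def build_proxied_request(payload: dict, shared_secret: str,
                          proxy_url: str = PROXY_URL) -> urllib.request.Request:
    """Build a POST aimed at the Convoy proxy rather than the agent itself."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        proxy_url,
        data=body,
        headers={
            # The shared secret is sent as a bearer token.
            "Authorization": f"Bearer {shared_secret}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_proxied_request({"message": "Hello"}, shared_secret="example-secret")
# Sending is left to the caller, e.g. urllib.request.urlopen(req, timeout=30).
```

Only the destination and the `Authorization` header differ from a direct call, so the rest of the client code can usually stay unchanged.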
Agent Backend
In the agent itself, verify Convoy’s signature and report metrics back.
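The signature check might look something like the sketch below. This page does not describe Convoy's signing scheme, so the example assumes an HMAC-SHA256 digest of the raw request body keyed with the shared secret and delivered as a hex string. Treat the scheme and the function names `sign` and `is_from_convoy` as hypothetical and adapt them to what Convoy actually sends.

```python
import hashlib
import hmac


def sign(body: bytes, shared_secret: str) -> str:
    """Hex HMAC-SHA256 of the raw body (assumed scheme, not confirmed by Convoy)."""
    return hmac.new(shared_secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def is_from_convoy(body: bytes, received_signature: str, shared_secret: str) -> bool:
    """Return True only if the received signature matches the one we compute."""
    expected = sign(body, shared_secret)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode("utf-8"),
                               received_signature.encode("utf-8"))
```

An agent would typically reject the request (for example with a 401 response) when `is_from_convoy` returns `False`, before doing any model work.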
How it works
- Your client sends a request to the Convoy proxy instead of directly to your agent
- Convoy routes it to the correct version (stable or test) and forwards it to your agent backend
- Your agent verifies the request came from Convoy, processes it, and reports metrics (latency, outcome, cost, input/output) back to Convoy
- Convoy uses these metrics — along with an LLM judge — to automatically promote or roll back test versions
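The metrics named in the third step (latency, outcome, cost, input/output) can be collected with a thin wrapper around the agent call. The sketch below is one possible shape: the field names, the `"success"`/`"error"` outcome labels, and the `run_and_measure` helper are hypothetical, and the call that actually submits the report to Convoy is omitted because its endpoint is not described here.

```python
import json
import time
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass
class AgentMetrics:
    # Field names are illustrative; match them to Convoy's reporting format.
    latency_ms: float
    outcome: str
    cost_usd: float
    input: str
    output: str


def run_and_measure(agent: Callable[[str], str], user_input: str,
                    cost_usd: float = 0.0) -> AgentMetrics:
    """Run the agent once and record latency, outcome, cost, input and output."""
    start = time.perf_counter()
    try:
        output = agent(user_input)
        outcome = "success"
    except Exception as exc:  # a failed run is still worth reporting
        output = str(exc)
        outcome = "error"
    latency_ms = (time.perf_counter() - start) * 1000
    return AgentMetrics(latency_ms, outcome, cost_usd, user_input, output)


metrics = run_and_measure(lambda text: text.upper(), "hi", cost_usd=0.002)
report_body = json.dumps(asdict(metrics))  # JSON body for the metrics report
```

Recording failed runs as well as successful ones matters here, since an automatic rollback decision presumably depends on seeing the errors a test version produces.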
Before you start
In the Convoy platform, create an agent to get:
- Proxy URL: where your client sends requests (e.g. acme--chatbot.proxy.convoylabs.com)
- Shared secret: used by both the client (as a bearer token) and the agent (for signature verification and metric reporting)
- Stable URL: your production agent backend
- Testing URL: your test environment for the new version of your agent
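One way to keep these values out of source code is to load them from environment variables at startup. The sketch below covers only the proxy URL and shared secret, on the assumption that the stable and testing URLs are used by Convoy for forwarding and are not needed by your own code. The variable names `CONVOY_PROXY_URL` and `CONVOY_SHARED_SECRET` are made up for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ConvoyConfig:
    proxy_url: str
    shared_secret: str


def load_config(env: Mapping[str, str] = os.environ) -> ConvoyConfig:
    """Read Convoy settings from the environment, failing fast if any is missing."""
    # Hypothetical variable names; pick whatever fits your deployment.
    names = {
        "proxy_url": "CONVOY_PROXY_URL",
        "shared_secret": "CONVOY_SHARED_SECRET",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError(f"missing Convoy settings: {', '.join(missing)}")
    return ConvoyConfig(**{field: env[var] for field, var in names.items()})


config = load_config({
    "CONVOY_PROXY_URL": "https://acme--chatbot.proxy.convoylabs.com",
    "CONVOY_SHARED_SECRET": "example-secret",
})
```

Because the shared secret both authenticates the client and verifies signatures, it should be handled like any other credential and kept out of version control.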

