Convoy

Convoy is an A/B testing platform for AI agents. Any change to a model, prompt, tool, workflow step, or sandbox can be tested against real user traffic. Convoy routes a small slice of requests to the new version, evaluates the results, and automatically promotes or rolls back the change.

Integration

Every AI agent you want to test on Convoy must be triggered by an API call. Each endpoint is a testable unit — whether it’s a microservice API, a route in a monolith, or any unit you want to A/B test end-to-end. There are two integration points: the client, which sends its requests through the Convoy proxy, and the agent backend, which verifies those requests and reports metrics back to Convoy.

How it works

  1. Your client sends a request to the Convoy proxy instead of directly to your agent
  2. Convoy routes it to the correct version (stable or test) and forwards it to your agent backend
  3. Your agent verifies the request came from Convoy, processes it, and reports metrics (latency, outcome, cost, input/output) back to Convoy
  4. Convoy uses these metrics — along with an LLM judge — to automatically promote or roll back test versions
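The client side of steps 1–2 can be sketched as a plain HTTP POST to the proxy, using the shared secret as a bearer token. This is a minimal stdlib-only sketch; the proxy URL, secret, and payload shape below are placeholders, not Convoy's actual request schema.

```python
import json
import urllib.request

# Placeholder values -- substitute the proxy URL and shared secret
# shown for your agent in the Convoy platform.
PROXY_URL = "https://acme--chatbot.proxy.convoylabs.com"
SHARED_SECRET = "your-shared-secret"

def build_proxy_request(payload: dict) -> urllib.request.Request:
    """Build a request to the agent via the Convoy proxy.

    Convoy picks the stable or test version and forwards the
    request to the matching backend.
    """
    return urllib.request.Request(
        PROXY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The shared secret authenticates the client to the proxy.
            "Authorization": f"Bearer {SHARED_SECRET}",
        },
        method="POST",
    )

# To actually send it:
#   with urllib.request.urlopen(build_proxy_request({"message": "hi"})) as resp:
#       result = json.loads(resp.read())
```

From the client's point of view nothing else changes: it talks to one URL and stays unaware of which version served the response.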

Before you start

In the Convoy platform, create an agent to get:
  • Proxy URL — where your client sends requests (e.g. acme--chatbot.proxy.convoylabs.com)
  • Shared secret — used by both the client (as a bearer token) and the agent (for signature verification and metric reporting)
  • Stable URL — your production agent backend
  • Testing URL — your test environment for the new version of your agent
Both your stable and testing environments must be running and reachable. Convoy routes traffic to both — stable serves your current production version, and the testing URL serves the new version you want to evaluate. Make sure the test environment is up before deploying a new version.
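On the agent side, the shared secret is used to verify that an incoming request really came from Convoy. As a hedged sketch, the check below assumes an HMAC-SHA256 signature computed over the raw request body and sent hex-encoded in a header; the actual signing scheme and header name are defined by Convoy, so treat this as illustrative only.

```python
import hashlib
import hmac

# Placeholder -- use the shared secret issued for your agent.
SHARED_SECRET = b"your-shared-secret"

def verify_signature(body: bytes, signature_hex: str) -> bool:
    """Check that a request came from Convoy.

    Assumes (for illustration) an HMAC-SHA256 signature over the
    raw request body, hex-encoded in a request header.
    """
    expected = hmac.new(SHARED_SECRET, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(expected, signature_hex)
```

Requests that fail the check should be rejected before any processing; the same secret is then used when the agent reports its metrics (latency, outcome, cost, input/output) back to Convoy.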

Use with AI tools

These docs are available as an MCP server. Connect it to your AI tool to use Convoy docs as context while coding.
claude mcp add --transport http convoy https://docs.convoylabs.com/mcp