Platform Setup

This guide walks you through setting up Convoy from sign-up to your first live test.
1. Sign up

Go to app.convoylabs.com and create your account. This creates your organization — all agents, tests, and team members live under it.
2. Create an agent

An agent represents one testable endpoint — a model, prompt, workflow, or any unit you want to A/B test.
  1. Click Create Agent
  2. Configure your two environments:
    • Stable URL — your current production backend (e.g. https://api.acme.com/agent)
    • Testing URL — the environment running your new version (e.g. https://api-test.acme.com/agent)
  3. Convoy generates two values you need for integration:
    • Proxy URL — where your client sends requests (e.g. acme--chatbot.proxy.convoylabs.com)
    • Shared secret — used by the client as a bearer token and by the agent for signature verification
Both your stable and testing environments must be running and reachable. Convoy routes traffic to both — stable serves your current production version, and the testing URL serves the new version you want to evaluate.
Next, integrate Convoy into your code so that your client sends requests to the proxy URL and authenticates with the shared secret.
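As a rough illustration of the client side, the sketch below builds a request addressed to the proxy URL with the shared secret as a bearer token, as described in step 2. The `https://` scheme, the JSON payload shape, and the helper name `build_proxy_request` are assumptions made for this example; this page does not specify the request format or the signature scheme your agent verifies.

```python
import json
import urllib.request

# Example proxy host from step 2; the https:// scheme is an assumption.
PROXY_URL = "https://acme--chatbot.proxy.convoylabs.com"
# Placeholder: use the shared secret Convoy generated for your agent.
SHARED_SECRET = "replace-with-your-shared-secret"


def build_proxy_request(
    payload: dict,
    proxy_url: str = PROXY_URL,
    secret: str = SHARED_SECRET,
) -> urllib.request.Request:
    """Build (but do not send) a POST request aimed at the Convoy proxy.

    The payload is whatever your agent already accepts; its shape here
    is hypothetical.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        proxy_url,
        data=body,
        method="POST",
        headers={
            # The shared secret is sent as a bearer token.
            "Authorization": f"Bearer {secret}",
            "Content-Type": "application/json",
        },
    )


# To send the request once both environments are reachable:
#   with urllib.request.urlopen(build_proxy_request({"message": "hello"}), timeout=10) as resp:
#       print(resp.status)
```

Keeping request construction separate from sending makes it easy to check the headers locally before any traffic reaches the proxy.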
3. Deploy a test

Once integrated and both environments are live, deploy a test to start routing traffic to the new version.
  1. Open your agent and click Deploy Test
  2. Configure the judge:
    • Judge model — the LLM that evaluates each session
    • Judge prompt — describes what to evaluate for your specific change. The judge receives each session’s input and output (reported via the session ingest endpoint) and scores it.
  3. Set thresholds that control automatic decisions:
    • Promote — when the test version meets this bar, Convoy increases its traffic share
    • Rollback — when the test version falls below this bar, Convoy cancels the test and sends all traffic back to stable
Thresholds apply to all metrics reported through the session ingest endpoint — outcome rates, latency, cost, and judge scores.
You can also adjust the advanced rollout plan and evaluation settings to match your traffic level.
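To make the two thresholds concrete, here is a minimal sketch of how a promote bar and a rollback bar could partition a metric into three outcomes. It is an illustration of the semantics described above, not Convoy's actual decision logic. The names `Thresholds` and `decide` are hypothetical, and the sketch assumes a higher-is-better metric such as a judge score; for latency or cost the comparisons would be inverted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    promote: float   # at or above this bar, the test version gains traffic
    rollback: float  # below this bar, the test is cancelled

    def __post_init__(self) -> None:
        if self.rollback > self.promote:
            raise ValueError("rollback threshold must not exceed promote threshold")


def decide(score: float, thresholds: Thresholds) -> str:
    """Return 'rollback', 'promote', or 'hold' for a higher-is-better score."""
    if score < thresholds.rollback:
        return "rollback"  # falls below the rollback bar
    if score >= thresholds.promote:
        return "promote"   # meets the promote bar
    return "hold"          # in between: keep the current traffic split


if __name__ == "__main__":
    bars = Thresholds(promote=0.8, rollback=0.5)
    for score in (0.9, 0.65, 0.4):
        print(score, decide(score, bars))
```

The gap between the two bars acts as a neutral zone: scores that land there neither increase the test version's traffic share nor cancel the test.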
4. Monitor and act

After deploying, the agent page is your control center:
  • Rollout status — current traffic split, session counts, and judge scores
  • Pause — freeze the traffic split to investigate
  • Modify traffic — manually adjust the percentage going to the test version
  • Roll back — cancel the test and send all new sessions to stable
  • Promote — mark the test version as the new stable. First merge your changes in your codebase and deploy them to your stable environment, then promote on Convoy to route all traffic to stable.
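For intuition about what a percentage-based traffic split means at the session level, the sketch below shows one common technique: hashing a session ID into a bucket so that each session is routed consistently. This is an assumption for illustration only; this page does not describe how Convoy assigns sessions, and `route_session` is a hypothetical name.

```python
import hashlib


def route_session(session_id: str, test_percent: int) -> str:
    """Route a session to 'test' or 'stable' based on a stable hash bucket.

    test_percent is the share of sessions (0 to 100) sent to the test version.
    """
    if not 0 <= test_percent <= 100:
        raise ValueError("test_percent must be between 0 and 100")
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash onto a bucket in the range 0..99.
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "test" if bucket < test_percent else "stable"


if __name__ == "__main__":
    for sid in ("session-a", "session-b", "session-c"):
        print(sid, route_session(sid, 25))
```

With this scheme, setting the percentage to 0 sends every session to stable, which mirrors a rollback, and raising the percentage only ever moves additional sessions to the test version.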