Inference providers
The harness brokers all LLM calls coming out of the Sandbox VM. This page describes the provider model, the supported back-ends, and how to add a new one.
🚧 Stub. Fill in once the provider plugin interface is finalized.
The contract (TBD)
A provider is anything that can answer:
POST /v1/chat/complete
request body: { messages: [...], tools: [...], model: "...", stream: bool }
response: streaming JSON events (delta, tool_call, finish, error)
All providers must conform to this minimal interface; the harness translates from the workbench's internal event model into provider-native calls.
📝 TODO: write the Go interface signature, document the streaming contract, and link to the reference implementation (probably the Bedrock adapter, since that's the most thorough today).
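Until that lands, here is a rough sketch of what the Go interface could look like, inferred from the wire contract above. Every name here (Provider, ChatRequest, Event, and so on) is a placeholder, not the final API.

```go
// Sketch only: inferred from the wire contract above, not the final API.
package provider

import "context"

// Message, Tool, and ToolCall are deliberately minimal placeholders.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type Tool struct {
	Name        string `json:"name"`
	Description string `json:"description,omitempty"`
}

type ToolCall struct {
	Name      string `json:"name"`
	Arguments string `json:"arguments"` // raw JSON, left to the caller to decode
}

// ChatRequest mirrors the request body: messages, tools, model, stream flag.
type ChatRequest struct {
	Messages []Message `json:"messages"`
	Tools    []Tool    `json:"tools,omitempty"`
	Model    string    `json:"model"`
	Stream   bool      `json:"stream"`
}

// Event is one streaming event: delta, tool_call, finish, or error.
type Event struct {
	Type     string    `json:"type"` // "delta" | "tool_call" | "finish" | "error"
	Delta    string    `json:"delta,omitempty"`
	ToolCall *ToolCall `json:"tool_call,omitempty"`
	Err      string    `json:"error,omitempty"`
}

// Provider is anything that can answer a chat-completion request and stream
// events back until the channel is closed.
type Provider interface {
	ChatComplete(ctx context.Context, req ChatRequest) (<-chan Event, error)
}
```

In this sketch the broker would range over the returned event channel and translate each event back into the workbench's internal event model.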
Bedrock (default for development)
In development, the harness talks to Amazon Bedrock using a long-term API key (a service-specific credential) minted from the kameas-ai-sandbox AWS account. The minting flow is documented at the org level; the harness just expects an env var or a keychain entry and picks it up at startup.
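As a sketch of what that startup lookup could look like (the actual env var and keychain names aren't pinned down yet; BEDROCK_API_KEY and the keychain service name below are placeholders):

```go
// Placeholder credential lookup: env var and keychain names are illustrative only.
package provider

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

func bedrockCredential() (string, error) {
	// Prefer an env var if one is set (e.g. in CI or a dev shell).
	if key := os.Getenv("BEDROCK_API_KEY"); key != "" {
		return key, nil
	}
	// Otherwise fall back to the OS keychain (macOS shown here).
	out, err := exec.Command("security", "find-generic-password",
		"-s", "kameas-ai-sandbox-bedrock", "-w").Output()
	if err != nil {
		return "", fmt.Errorf("no Bedrock credential in env or keychain: %w", err)
	}
	return strings.TrimSpace(string(out)), nil
}
```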
Models we use today:
- anthropic.claude-sonnet-* — default for chat
- (others as needed; document here once they're pinned)
Direct providers
🚧 TODO once the adapters land:
- Anthropic (api.anthropic.com)
- OpenAI (api.openai.com)
- OpenRouter (openrouter.ai) — multi-model gateway, useful for evals
- Bento — self-hosted inference; the target customer is a compliance-heavy org that can't send prompts to a SaaS provider
BYO endpoint
For enterprise customers running their own inference (typically an OpenAI-compatible endpoint behind their VPN), the harness ships a generic OpenAI-compatible adapter. The customer configures:
- Base URL
- API key (lives in OS keychain, never leaves the host)
- Model name(s) to expose to the user
📝 TODO: document the config schema once it's locked in.
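Until then, a rough sketch of the shape it might take; field names are illustrative, not the final schema:

```go
// Illustrative only: the real config schema is TBD.
type BYOEndpointConfig struct {
	BaseURL   string   `json:"base_url"`    // e.g. https://inference.internal.example/v1
	APIKeyRef string   `json:"api_key_ref"` // name of the OS keychain entry; the key itself never lands in config
	Models    []string `json:"models"`      // model names exposed to the user
}
```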
Adding a new provider
🚧 TODO. Should cover: implementing the provider interface, adding a config schema entry, registering with the broker, and writing the smoke test against a fake server.
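As a placeholder until that page is written, the smoke test against a fake server might look roughly like this. It assumes the Provider sketch above plus a hypothetical newOpenAICompat constructor; neither exists yet.

```go
// Skeleton only: assumes the Provider sketch above and a hypothetical
// newOpenAICompat(baseURL, apiKey) constructor.
package provider

import (
	"context"
	"net/http"
	"net/http/httptest"
	"testing"
)

func TestOpenAICompatSmoke(t *testing.T) {
	// Fake server that answers every request with a single finish event.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(`{"type":"finish"}`))
	}))
	defer srv.Close()

	p := newOpenAICompat(srv.URL, "test-key") // hypothetical adapter constructor
	events, err := p.ChatComplete(context.Background(), ChatRequest{
		Model:    "test-model",
		Messages: []Message{{Role: "user", Content: "ping"}},
	})
	if err != nil {
		t.Fatal(err)
	}

	// A smoke test only needs to prove the stream terminates with a finish event.
	var last Event
	for ev := range events {
		last = ev
	}
	if last.Type != "finish" {
		t.Fatalf("expected finish event, got %+v", last)
	}
}
```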