Open source reliability infrastructure

Ship Reliable AI Agents.
Every Time.

Trace every node, evaluate every change, monitor production health, and catch regressions before users do.

View Live Demo
$ pip install cortexops
Open Source · MIT License · 12 Frameworks · CI Ready
Hero Dashboard
Sample trace (node, latency):
  • classify_intent, 1.18s
  • evaluate_policy, 890ms
  • tool: issue_refund, 2.01s
PaymentGatewayTimeout after tool call
Trusted by

Built for teams shipping agents to production.

Open Source
MIT
Python SDK
GitHub Actions
Works with LangGraph
Works with CrewAI
Supported Frameworks

One integration. Every framework.

Dedicated landing pages for the stacks teams actually use — instrument without rewrites.

How It Works

Trace. Evaluate. Monitor.

Three production disciplines that turn opaque agent behavior into a system your team can operate.

1. Trace every run

Wrap your agent once and capture nodes, tools, state, latency, and failure details.

from cortexops import CortexTracer
tracer = CortexTracer(project="agent")
agent = tracer.wrap(graph)
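To make "wrap your agent once" concrete, here is a minimal sketch of what a tracing wrapper can do in general. It is not the CortexOps implementation: SketchTracer, Span, and their fields are hypothetical names used only for illustration, and it records just the step name, latency, and failure details.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class Span:
    """One recorded step (node or tool call) of an agent run."""
    name: str
    latency_s: float
    ok: bool
    error: Optional[str] = None


@dataclass
class SketchTracer:
    """Hypothetical stand-in for a tracer; not the CortexOps API."""
    project: str
    spans: List[Span] = field(default_factory=list)

    def wrap(self, fn: Callable[..., Any], name: Optional[str] = None) -> Callable[..., Any]:
        label = name or fn.__name__

        def traced(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            try:
                result = fn(*args, **kwargs)
            except Exception as exc:
                # Record failure details, then re-raise so callers still see the error.
                self.spans.append(Span(label, time.perf_counter() - start, False, repr(exc)))
                raise
            self.spans.append(Span(label, time.perf_counter() - start, True))
            return result

        return traced
```

Wrapping each node function this way yields one span per call, which is the raw material for a waterfall view.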

2. Evaluate in CI

Run golden datasets in CI and stop regressions before they reach production.

cortexops eval run \
  --dataset gold.yaml \
  --fail-on "score < 0.9"
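The idea behind a gate like the one above can be sketched in plain Python. This is an illustration, not how the cortexops CLI works internally: GoldenCase, exact_match, and run_gate are hypothetical names, and exact-match scoring stands in for whatever scorer a real eval would use.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class GoldenCase:
    """One golden example: an input and the answer we expect."""
    prompt: str
    expected: str


def exact_match(output: str, expected: str) -> float:
    """Score 1.0 on a case-insensitive exact match, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_gate(
    agent: Callable[[str], str],
    cases: List[GoldenCase],
    threshold: float = 0.9,
    scorer: Callable[[str, str], float] = exact_match,
) -> int:
    """Return a process exit code: 1 if the mean score is below threshold, else 0."""
    if not cases:
        raise ValueError("golden dataset is empty")
    scores = [scorer(agent(case.prompt), case.expected) for case in cases]
    mean = sum(scores) / len(scores)
    print(f"score={mean:.3f} threshold={threshold}")
    return 1 if mean < threshold else 0
```

In a CI job, passing the result to sys.exit makes a low score fail the build, because CI runners treat a nonzero exit code as a failed step.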

3. Monitor production

Watch health, drift, latency, and alerts after your agents are live.

tracer.monitor(
    alert="slack",
    on="quality_drop"
)
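One simple way to think about a "quality_drop" alert is a rolling mean compared against a baseline. The sketch below assumes that approach; it is not CortexOps's detection logic, and QualityDropMonitor, its parameters, and the alert callback are hypothetical.

```python
from collections import deque
from typing import Callable, Deque


class QualityDropMonitor:
    """Alert once when the rolling mean score falls below baseline - tolerance."""

    def __init__(
        self,
        baseline: float,
        tolerance: float,
        window: int,
        alert: Callable[[str], None],
    ) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self.baseline = baseline
        self.tolerance = tolerance
        self.scores: Deque[float] = deque(maxlen=window)
        self.alert = alert
        self.alerting = False

    def record(self, score: float) -> None:
        self.scores.append(score)
        if len(self.scores) < self.scores.maxlen:
            return  # Wait for a full window before judging quality.
        mean = sum(self.scores) / len(self.scores)
        dropped = mean < self.baseline - self.tolerance
        if dropped and not self.alerting:
            # Fire only on the transition into the dropped state to avoid alert spam.
            self.alert(f"quality_drop: rolling mean {mean:.2f} below baseline {self.baseline:.2f}")
        self.alerting = dropped
```

The alert callback could post to a chat webhook; here it is any function that accepts a message string.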
Dashboard Screenshots

Everything in the CortexOps console.

From overview to API keys — the product surface teams use day to day at app.getcortexops.com.

Overview

Project health, live status, and the signals that matter first.

Projects

Organize agents by project with isolated keys and retention.

Traces

Node waterfall, tool calls, latency, and failure context.

Evaluations

Golden datasets, pass rates, regressions, and judge scores.

Prompt Versions

Track prompt changes against evals and roll back safely.

Datasets

Versioned golden cases for CI and local eval runs.

Metrics

Task completion, latency, error rate, and drift over time.

Alerts

Route quality drops, timeouts, and anomalies to your channels.

API Keys

Create, rotate, and revoke keys with least privilege.

Usage

Understand volume and plan limits without surprise bills.

Settings

Projects, retention, integrations, and team preferences.

Open dashboard
Why CortexOps

Here is why every AI engineering team needs CortexOps.

Trace Explorer

Full agent waterfall with nodes, tools, branches, latency, state, and failure context.

Evaluation

LLM-as-judge scoring, golden datasets, pass rates, and semantic quality checks.

Monitoring

Production health, latency, drift, anomaly, and cost signals in one view.

Prompt Versions

Connect prompt changes to evals and traces so teams can roll back regressions.

CI/CD Gates

GitHub Actions-ready eval gates that fail builds when quality drops.

Alerts

Route failures, drift, latency spikes, and quality drops to your team channels.

Capability              | CortexOps               | LangSmith         | Langfuse  | Arize Phoenix
Agent execution tracing | Full waterfall          | LangChain focused | LLM calls | Span tree
Framework support       | 12 frameworks           | LangChain         | Via SDK   | Several
CI/CD eval gate         | First-class CLI         | Partial           | Manual    | Scripted
Open source             | MIT                     | No                | Yes       | Elastic v2
Production alerts       | Quality, drift, latency | Limited           | Limited   | Yes
Pricing

Start free. Scale when you are ready.

Free

$0

For side projects and evaluation.

  • Core tracing
  • Local eval runs
  • Python SDK
  • Community support
Start free

Enterprise

Custom

For compliance, scale, and private deployment.

  • Everything in Pro
  • SSO / SAML
  • VPC or self-hosted deploy
  • Dedicated support
Contact sales
FAQ

Questions, answered.

Which frameworks do you support?
CortexOps supports LangGraph, CrewAI, OpenAI Agents SDK, PydanticAI, Google ADK, Smolagents, LlamaIndex, Haystack, AutoGen, DSPy, Agno, and any callable wrapper. Dedicated pages cover the most common stacks.
How is this different from LangSmith or Langfuse?
Those tools are strongest around LLM calls. CortexOps is designed around agent execution: nodes, tools, state transitions, eval gates, monitoring, and alerts.
Can we self-host?
Yes. CortexOps is open source and MIT licensed, with a Python SDK and deployment paths for teams that need control over data.
Does it work in CI?
Yes. The eval command can run in GitHub Actions and fail a build when score, regression, latency, or task-completion thresholds are missed.
Where is the live dashboard?
Open app.getcortexops.com with your API key. Marketing lives here; the authenticated console is on the app subdomain.

Ship agents you can trust.

Developers love demos. Start with the live preview, then connect your first project when you are ready.