v0.3.0 live on PyPI — pip install cortexops

Stop shipping agents you can't trust

Evaluate, observe, and gate LangGraph and CrewAI agents before they reach production. Built by a Senior AI Engineer at PayPal.

Python
from cortexops import CortexTracer, EvalSuite

tracer = CortexTracer(project="payments-agent")  # key auto-loaded
agent  = tracer.wrap(your_langgraph_app)   # zero refactoring

results = EvalSuite.run(
    dataset="golden_v1.yaml",
    agent=agent,
    fail_on="task_completion < 0.90",  # CI gate
)
print(results.summary())
5 built-in eval metrics
1-line instrumentation
CI gate — blocks PRs
No per-trace billing
MIT licensed · open source
Pro plan

See inside every agent run. In real time.

Click any trace row to see the node waterfall — exactly which step took how long, which tools were called, and what the output was. Debug a 2am incident in 30 seconds.

app.getcortexops.com
CortexOps Observability
project payments-agent
Live · 5s
Task completion: 94.2% (↑ 2.1%)
Error rate: 2.3% (↓ 0.8%)
Avg latency: 487ms
P95 latency: 1,240ms
Total traces: 1,847
ID        Case                Latency  Failure     Time
4398c8e8  refund_approved     342ms    -           09:24:11
b21fa3c2  balance_check       198ms    -           09:24:08
9f3e1d77  dispute_escalation  3,240ms  TIMEOUT     09:23:55
c84ab910  refund_approved     287ms    -           09:23:41
e12cd456  kyc_verification    743ms    -           09:23:30
a77bf219  fraud_detection     1,820ms  HALLUCI...  09:23:18
Health
Success: 97.7%
Eval gate: Passing
Regressions: 0
Failures
TIMEOUT: 2
HALLUCI...: 1
Latency
<200ms: 312
200–500: 498
>1s: 23
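
The tiles and latency buckets above are the kind of figures that can be derived from raw trace rows. The following is a minimal sketch of that aggregation, not CortexOps code. It uses only the six trace rows shown above, a hypothetical `Trace` record, a nearest-rank P95, and an assumed 500ms to 1s bucket that the dashboard does not display. Because it covers six rows rather than all 1,847 traces, its numbers will not match the tiles.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trace:
    """Hypothetical trace record for illustration; not a CortexOps type."""
    trace_id: str
    case: str
    latency_ms: int
    failure: Optional[str] = None


# The six rows visible in the dashboard table.
TRACES = [
    Trace("4398c8e8", "refund_approved", 342),
    Trace("b21fa3c2", "balance_check", 198),
    Trace("9f3e1d77", "dispute_escalation", 3240, "TIMEOUT"),
    Trace("c84ab910", "refund_approved", 287),
    Trace("e12cd456", "kyc_verification", 743),
    Trace("a77bf219", "fraud_detection", 1820, "HALLUCI..."),
]


def p95(latencies):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(latencies)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


def bucket(latency_ms):
    """Map a latency to a histogram bucket label."""
    if latency_ms < 200:
        return "<200ms"
    if latency_ms <= 500:
        return "200-500ms"
    if latency_ms <= 1000:
        return "500ms-1s"  # assumed bucket, not shown on the dashboard
    return ">1s"


def summarize(traces):
    """Aggregate trace rows into dashboard-style summary figures."""
    if not traces:
        raise ValueError("need at least one trace")
    latencies = [t.latency_ms for t in traces]
    failed = [t for t in traces if t.failure]
    buckets = {}
    for ms in latencies:
        label = bucket(ms)
        buckets[label] = buckets.get(label, 0) + 1
    return {
        "total": len(traces),
        "error_rate": len(failed) / len(traces),
        "avg_latency_ms": sum(latencies) / len(latencies),
        "p95_latency_ms": p95(latencies),
        "buckets": buckets,
    }


summary = summarize(TRACES)
print(summary)
```

With so few rows, the nearest-rank P95 is simply the slowest trace; a real dashboard would compute it over a much larger window.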

From prototype to trusted production

Four steps. No refactoring. Works with any LangGraph or CrewAI agent.

1. Instrument
Wrap any agent in one line. CortexTracer auto-detects LangGraph and CrewAI.
    tracer = CortexTracer(project="my-agent")
    agent = tracer.wrap(your_app)
2. Define
Write golden datasets in YAML: expected keywords, tool calls, and latency budgets.
    expected_output_contains:
      - refund
      - approved
    max_latency_ms: 3000
3. Gate
One fail_on expression. PRs are blocked automatically when quality drops.
    EvalSuite.run(fail_on="task_completion < 0.90")
4. Observe
Live traces, Slack alerts, waterfall debug view. Root cause in 30 seconds.
    GET /v1/traces?project=payments → node waterfall
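
The YAML fields in step 2 suggest what a single golden-case check looks at: expected keywords in the output and a latency budget. The sketch below is an assumption about that logic, not the CortexOps implementation. `check_case`, `stub_agent`, and the `input` field are hypothetical; only `expected_output_contains` and `max_latency_ms` come from the snippet above. The case is a plain dict so that no YAML loader is assumed, and keyword matching is case-insensitive by assumption.

```python
import time

# One golden case as a plain dict. "input" is a hypothetical field name;
# the other two keys come from the YAML snippet in step 2.
CASE = {
    "input": "Customer requests a refund for order 1234",
    "expected_output_contains": ["refund", "approved"],
    "max_latency_ms": 3000,
}


def stub_agent(prompt):
    """Stand-in for a wrapped agent; returns a canned answer."""
    return "Refund approved for order 1234."


def check_case(agent, case):
    """Run one case and check keywords and latency budget."""
    start = time.perf_counter()
    output = agent(case["input"])
    latency_ms = (time.perf_counter() - start) * 1000

    text = str(output).lower()  # case-insensitive match is an assumption
    missing = [
        kw for kw in case["expected_output_contains"]
        if kw.lower() not in text
    ]
    within_budget = latency_ms <= case["max_latency_ms"]
    return {
        "passed": not missing and within_budget,
        "missing_keywords": missing,
        "latency_ms": latency_ms,
    }


result = check_case(stub_agent, CASE)
print(result["passed"], result["missing_keywords"])
```

Returning the missing keywords alongside the pass flag makes a failed case easier to debug than a bare boolean would.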

Everything your agent needs to ship safely

Zero-config instrumentation
Wraps LangGraph, CrewAI, or any callable with one line. No decorators, no config files, no refactoring.
Free
Golden dataset evals
YAML-based test cases with expected outputs, tool calls, and latency budgets. Run locally or in CI.
Free
CI eval gate
Block PRs with a single fail_on expression. Works with GitHub Actions, GitLab CI, any CI system.
Free
Hosted trace storage
90-day retention. Node waterfall, tool calls, latency breakdown. Live dashboard at app.getcortexops.com.
Pro
Slack + webhook alerts
Get paged when production regresses — before your users notice. Configurable thresholds per project.
Pro
LLM-as-judge scoring
GPT-4o evaluates open-ended outputs against your criteria. Heuristic fallback always included.
Pro
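
A `fail_on` expression such as `task_completion < 0.90` can be read as a metric name, a comparison operator, and a threshold. The sketch below shows one way a gate like that could be evaluated in a CI script. It illustrates the idea under that assumption and is not how CortexOps parses the expression; `should_fail` and the `metrics` dict are hypothetical names.

```python
import operator
import re

# Supported comparison operators for the hypothetical gate expression.
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}

# "<metric> <op> <number>", e.g. "task_completion < 0.90".
PATTERN = re.compile(r"^\s*(\w+)\s*(<=|>=|<|>)\s*([0-9]*\.?[0-9]+)\s*$")


def should_fail(fail_on, metrics):
    """Return True when the fail_on condition holds for the given metrics."""
    match = PATTERN.match(fail_on)
    if match is None:
        raise ValueError(f"cannot parse fail_on expression: {fail_on!r}")
    name, op, threshold = match.groups()
    if name not in metrics:
        raise KeyError(f"unknown metric: {name}")
    return OPS[op](metrics[name], float(threshold))


# Example: the task completion value shown on the dashboard.
metrics = {"task_completion": 0.942}
gate_failed = should_fail("task_completion < 0.90", metrics)
print("gate failed" if gate_failed else "gate passed")
```

In a CI job, a script built on this idea would typically exit with a non-zero status when the gate fails, which is what lets the CI system block the PR.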

Know your bill before you ship

LangSmith charges $39/seat plus $2.50–$5.00 per 1,000 traces. At 50k traces/month, one seat comes to $164 at the $2.50 rate ($39 + 50 × $2.50) and $289 at the $5.00 rate. CortexOps is $49/seat. Flat.
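
The comparison above can be reproduced with a few lines of arithmetic. This sketch uses only the prices quoted on this page and assumes all traces are billed against a single seat with no free allowance; the function names are illustrative.

```python
def langsmith_cost(traces, rate_per_1k, seats=1, seat_price=39.0):
    """Seat fee plus per-trace fee, using the prices quoted on this page."""
    return seats * seat_price + (traces / 1000) * rate_per_1k


def cortexops_cost(seats=1, seat_price=49.0):
    """Flat per-seat price with no trace fee."""
    return seats * seat_price


low = langsmith_cost(50_000, 2.50)   # lower quoted trace rate
high = langsmith_cost(50_000, 5.00)  # upper quoted trace rate
flat = cortexops_cost()

print(f"LangSmith, 50k traces: ${low:.2f} to ${high:.2f}")
print(f"CortexOps: ${flat:.2f}")
```

Changing `traces` or `seats` shows how the gap moves with volume and team size under the same assumptions.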

Capability         CortexOps             LangSmith
Pricing model      $49/seat flat         $39/seat + trace fees
Trace cost         Unlimited, included   $2.50–$5.00 / 1k
CI eval gate       1-line fail_on        Manual setup required
Framework lock-in  None, any agent       Best with LangChain
Payments domain    Built-in templates    Not available
Free local evals   Unlimited             5k traces/month
Open source        MIT licensed          Proprietary

Start free. Scale with your agents.

No credit card required for the free tier. Cancel Pro anytime.

Free
$0
Full SDK, unlimited local evals, CI gate. Forever free.
  • pip install cortexops
  • Unlimited local eval runs
  • GitHub Actions CI gate
  • Golden dataset YAML format
  • CLI tool
Enterprise
Custom
VPC deployment, SSO, SLA guarantee, dedicated support.
  • Everything in Pro
  • VPC / on-prem deployment
  • SSO / SAML
  • Custom data retention
  • SLA guarantee

Complete docs at docs.getcortexops.com

18 pages covering installation, golden datasets, CI gate, LangGraph, CrewAI, API reference, and more. No GitHub redirect.

Getting started: Quickstart · CI eval gate · LangGraph
API reference: Traces API · SDK reference

Quickstart

Install the SDK and run your first eval in under 2 minutes.

# Install (shell)
pip install cortexops

# Run your first eval (Python)
from cortexops import CortexTracer, EvalSuite

tracer = CortexTracer(project="my-agent")
agent = tracer.wrap(my_agent)  # my_agent: your existing LangGraph app, CrewAI crew, or callable

results = EvalSuite.run(
    dataset="golden_v1.yaml",
    agent=agent,
    verbose=True,
)