The Open Stack for Self-Improving Agents
Train, deploy, and continuously improve your own models on an integrated stack for compute, RL post-training, environments, evals, and inference.
Lab. Post-train your own self-improving agents
FIG.1
Hosted evaluations for you to benchmark the performance of your models.
FIG.2
Train large-scale models optimized for agentic workflows.
FIG.3
Dedicated or serverless inference for your custom models, with native LoRA support.
FIG.4
Feed production data back into training to compound model performance over time.
“Prime Intellect's Lab product auto-instruments your favorite coding agent out of the box — it's a huge differentiator that makes everything 10x easier. We've had vibe coding and vibe finance; now we're getting vibe RL”
Alex Shevchenko
Head of Applied Research
“Evals are the foundation for building better agents. Prime Intellect helps turn them into real improvement loops.”
Robin Salimans
Principal AI Engineer

Environment Hub
Access and contribute to 2,500+ open-source RL environments and a community of researchers and developers.
opencode-science
Solve science problems using OpenCode agent via...
deepdive
DeepDive QA RL environment with a Serper-powered search tool
rubric-discovery
Meta-environment for learning rubric functions from labeled...
mini-swe-agent-plus
Mini SWE Agent Plus environment for solving SWE issues inside Pri...
science-env
A collection of challenging single-turn science problems
hud-text-2048
Text-based 2048 game for training agents to reach target tiles through strategic moves
will/tau2-bench
Verifiers implementation of tau2-bench
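Each environment on the hub pairs tasks with a programmatic reward. As a minimal self-contained sketch in the spirit of the reverse-text example above (the function name and partial-credit scheme are illustrative, not any hub environment's actual code):

```python
def reward_reverse_text(prompt: str, completion: str) -> float:
    """Illustrative reward: 1.0 for an exact reversal of the prompt,
    partial credit for the fraction of matching characters."""
    target = prompt[::-1]
    if completion == target:
        return 1.0
    # Partial credit: positions where the completion matches the target.
    matches = sum(a == b for a, b in zip(completion, target))
    return matches / max(len(target), 1)

print(reward_reverse_text("abc", "cba"))  # 1.0: exact reversal
```

Dense, verifiable rewards like this are what let RL training loops score rollouts automatically, with no human labels in the loop.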
import verifiers as vf

vf_env = vf.ToolEnv(
    dataset=dataset,
    parser=parser,
    rubric=rubric,
    tools=tool_list,
    max_turns=10,
)
A library of modular components for creating RL environments and training LLM agents.
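In a tool environment like the snippet above, the tools passed via tool_list are typically plain Python callables whose docstrings describe them to the model. A self-contained sketch of what such a list might contain (both tools are hypothetical stand-ins, not part of the verifiers API):

```python
def search(query: str) -> str:
    """Hypothetical tool: look up a query in a tiny in-memory index."""
    index = {"capital of france": "Paris"}
    return index.get(query.lower(), "no results")

def calculator(expression: str) -> str:
    """Hypothetical tool: evaluate a basic arithmetic expression."""
    # eval is acceptable here only because inputs are illustrative;
    # untrusted model output would need a real sandbox.
    return str(eval(expression, {"__builtins__": {}}, {}))

tool_list = [search, calculator]
print(calculator("2 + 3 * 4"))  # 14
```

The agent then decides per turn whether to call a tool or answer, for up to max_turns turns.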
uv run rl \
  --trainer @ examples/reverse_text/rl/train.toml \
  --orchestrator @ examples/reverse_text/rl/orch.toml \
  --inference @ examples/reverse_text/rl/infer.toml
A framework for asynchronous reinforcement learning (RL) at scale.
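The trainer / orchestrator / inference split exists because asynchronous RL decouples trajectory generation from gradient updates. A toy asyncio sketch of that pattern (purely illustrative; not the framework's actual internals):

```python
import asyncio

async def rollout(task_id: int) -> dict:
    """Stand-in for an inference-server call generating one trajectory."""
    await asyncio.sleep(0.01)  # simulate generation latency
    return {"task": task_id, "reward": 1.0}

async def main() -> int:
    # Rollouts run concurrently; the trainer consumes completed
    # trajectories instead of blocking on the slowest generation.
    trajectories = await asyncio.gather(*(rollout(i) for i in range(8)))
    return len([t for t in trajectories if t["reward"] > 0])

print(asyncio.run(main()))  # 8
```

At scale, the same idea lets training step on fresh trajectories while the next batch is still being generated.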
deepswe-sandbox-1
python:3.11-slim
deepcoder-sandbox-1
python:3.11-slim
i3-math-sandbox-1
python:3.11-slim
Secure code execution optimized for large-scale reinforcement learning.
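Conceptually, a sandbox runs untrusted, model-generated code in an isolated process under a hard time limit. A minimal standard-library sketch of that idea (a production sandbox, such as the python:3.11-slim containers above, adds OS-level isolation, resource limits, and network policy on top):

```python
import subprocess
import sys

def run_untrusted(code: str, timeout_s: float = 5.0) -> str:
    """Conceptual sketch: execute code in a separate interpreter
    process, capture its stdout, and enforce a hard timeout."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return proc.stdout.strip()

print(run_untrusted("print(sum(range(10)))"))  # 45
```

In an RL loop, the captured output feeds directly into the reward function, so thousands of such executions can be scored in parallel.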
Compute. Reliable compute operated globally, from a single GPU to the largest clusters.
On demand
Instant access to 1-256 GPUs.
Use your GPUs across clouds in a single platform.
FIG.5
H200
$1.99/HR
H200
$1.80/HR
H200
$1.23/HR
H200
$0.47/HR
B300
$4.99/HR
B200
$3.49/HR
H200
$3.14/HR
H100
$2.43/HR
Spot $0.94/HR
GH200
$3.14/HR
RTX Pro 6000
$3.14/HR
A100
$3.14/HR
A40
$3.14/HR
Liquid Reserved Clusters
Request large-scale clusters from 50+ providers.
Sell back idle GPUs to our spot market.
FIG.7
B300 SXM6 x 512
$5.00/HR/GPU
TOTAL $2,560/HR
Profit on idle capacity
+$20,183,040
Research. Our Contributions to the Frontier of Open-Source AI
Discover INTELLECT-3
A 100B+ parameter Mixture-of-Experts model trained on our RL stack.
Customer Stories
How Zapier Turned AutomationBench Into a Continuous Agent Improvement Loop
How Ramp Trained a Small RL Subagent to Beat Frontier Models at Spreadsheet Retrieval
How Arcee trained Trinity Large (400B) on 2,048 GPUs
Join Prime Intellect
We are seeking the most ambitious developers to join our team — in San Francisco or remotely. Please send us examples of your exceptional work.