The Open Stack for Self-Improving Agents
Train, deploy, and continuously improve your own models on an integrated stack for compute, RL post-training, environments, evals, and inference.
Lab. Post-train your own self-improving agents
FIG.1
Hosted evaluations for you to benchmark the performance of your models.
FIG.2
Train large-scale models optimized for agentic workflows.
FIG.3
Dedicated or serverless inference for your custom models, with native LoRA support.
FIG.4
Feed production data back into training to compound model performance over time.
“Prime Intellect's Lab product auto-instruments your favorite coding agent out of the box — it's a huge differentiator that makes everything 10x easier. We've had vibe coding and vibe finance; now we're getting vibe RL”
Alex Shevchenko
Head of Applied Research
“Evals are the foundation for building better agents. Prime Intellect helps turn them into real improvement loops.”
Robin Salimans
Principal AI Engineer

Environment Hub
Access and contribute to 2,500+ open-source RL environments and a community of researchers and developers.
opencode-science
Solve science problems using OpenCode agent via...
deepdive
DeepDive QA RL environment with a Serper-powered search tool
rubric-discovery
Meta-environment for learning rubric functions from labeled...
mini-swe-agent-plus
Mini SWE Agent Plus environment for solving SWE issues inside Pri...
science-env
A collection of challenging single-turn science problems
hud-text-2048
Text-based 2048 game for training agents to reach target tiles through strategic moves
will/tau2-bench
Verifiers implementation of tau2-bench
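Each environment on the hub pairs tasks with a programmatic reward. As a minimal self-contained sketch in the spirit of the reverse-text example above (the function name and partial-credit scheme are illustrative, not any hub environment's actual code):

```python
def reward_reverse_text(prompt: str, completion: str) -> float:
    """Illustrative reward: 1.0 for an exact reversal of the prompt,
    partial credit for the fraction of matching characters."""
    target = prompt[::-1]
    if completion == target:
        return 1.0
    # Partial credit: positions where the completion matches the target.
    matches = sum(a == b for a, b in zip(completion, target))
    return matches / max(len(target), 1)

print(reward_reverse_text("abc", "cba"))  # 1.0: exact reversal
```

Dense, verifiable rewards like this are what let RL training loops score rollouts automatically, with no human labels in the loop.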
import verifiers as vf

vf_env = vf.ToolEnv(
    dataset=dataset,
    parser=parser,
    rubric=rubric,
    tools=tool_list,
    max_turns=10,
)
A library of modular components for creating RL environments and training LLM agents.
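In a tool environment like the snippet above, the tools passed via tool_list are typically plain Python callables whose docstrings describe them to the model. A self-contained sketch of what such a list might contain (both tools are hypothetical stand-ins, not part of the verifiers API):

```python
def search(query: str) -> str:
    """Hypothetical tool: look up a query in a tiny in-memory index."""
    index = {"capital of france": "Paris"}
    return index.get(query.lower(), "no results")

def calculator(expression: str) -> str:
    """Hypothetical tool: evaluate a basic arithmetic expression."""
    # eval is acceptable here only because inputs are illustrative;
    # untrusted model output would need a real sandbox.
    return str(eval(expression, {"__builtins__": {}}, {}))

tool_list = [search, calculator]
print(calculator("2 + 3 * 4"))  # 14
```

The agent then decides per turn whether to call a tool or answer, for up to max_turns turns.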
uv run rl \
  --trainer @ examples/reverse_text/rl/train.toml \
  --orchestrator @ examples/reverse_text/rl/orch.toml \
  --inference @ examples/reverse_text/rl/infer.toml
A framework for asynchronous reinforcement learning (RL) at scale.
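The trainer / orchestrator / inference split exists because asynchronous RL decouples trajectory generation from gradient updates. A toy asyncio sketch of that pattern (purely illustrative; not the framework's actual internals):

```python
import asyncio

async def rollout(task_id: int) -> dict:
    """Stand-in for an inference-server call generating one trajectory."""
    await asyncio.sleep(0.01)  # simulate generation latency
    return {"task": task_id, "reward": 1.0}

async def main() -> int:
    # Rollouts run concurrently; the trainer consumes completed
    # trajectories instead of blocking on the slowest generation.
    trajectories = await asyncio.gather(*(rollout(i) for i in range(8)))
    return len([t for t in trajectories if t["reward"] > 0])

print(asyncio.run(main()))  # 8
```

At scale, the same idea lets training step on fresh trajectories while the next batch is still being generated.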
deepswe-sandbox-1
python:3.11-slim
deepcoder-sandbox-1
python:3.11-slim
i3-math-sandbox-1
python:3.11-slim
Secure code execution optimized for large-scale reinforcement learning.
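Conceptually, a sandbox runs untrusted, model-generated code in an isolated process under a hard time limit. A minimal standard-library sketch of that idea (a production sandbox, such as the python:3.11-slim containers above, adds OS-level isolation, resource limits, and network policy on top):

```python
import subprocess
import sys

def run_untrusted(code: str, timeout_s: float = 5.0) -> str:
    """Conceptual sketch: execute code in a separate interpreter
    process, capture its stdout, and enforce a hard timeout."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return proc.stdout.strip()

print(run_untrusted("print(sum(range(10)))"))  # 45
```

In an RL loop, the captured output feeds directly into the reward function, so thousands of such executions can be scored in parallel.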
Compute. Reliable compute operated globally, from a single GPU to the largest clusters.
On demand
Instant access to 1-256 GPUs.
Use your GPUs across clouds in a single platform.
FIG.5
H200
$1.99/HR
H200
$1.80/HR
H200
$1.23/HR
H200
$0.47/HR
B300
$4.99/HR
B200
$3.49/HR
H200
$3.14/HR
H100
$2.43/HR
Spot $0.94/HR
GH200
$3.14/HR
RTX Pro 6000
$3.14/HR
A100
$3.14/HR
A40
$3.14/HR
Liquid Reserved Clusters
Request large-scale clusters from 50+ providers.
Sell back idle GPUs to our spot market.
FIG.7
B300 SXM6 x 512
$5.00/HR/GPU
TOTAL $2,560/HR
Profit on idle capacity
+$20,183,040
Research. Our Contributions to the Frontier of Open-Source AI
Discover INTELLECT-3
A 100B+ parameter Mixture-of-Experts model trained on our RL stack.
Customer Stories
How Zapier Turned AutomationBench Into a Continuous Agent Improvement Loop
How Ramp Trained a Small RL Subagent to Beat Frontier Models at Spreadsheet Retrieval
How Arcee trained Trinity Large (400B) on 2,048 GPUs
Join Prime Intellect
We are seeking the most ambitious developers to join our team — in San Francisco or remotely. Please send us examples of your exceptional work.