Today, we are excited to introduce SYNTHETIC-1, a collaborative effort to create the largest open-source dataset of verified reasoning traces for math, coding and science, leveraging DeepSeek-R1. Our dataset consists of 1.4 million high-quality tasks and verifiers, designed to advance reasoning model training.
We invite everyone to contribute compute and join us in our effort to scale distributed reinforcement learning to o3-scale and beyond.
In our recent post, Decentralized Training in the Inference-Time-Compute Paradigm, we explored how this paradigm shift will fundamentally reshape compute infrastructure, making globally distributed training the way forward.
The DeepSeek-R1 paper highlights the importance of generating cold-start synthetic data for RL. As our first step toward state-of-the-art reasoning models, SYNTHETIC-1 generates verified reasoning traces across math, coding, and science using DeepSeek-R1.
With the release of DeepSeek-R1 and our own INTELLECT-MATH model, we have gained new insights into training state-of-the-art reasoning models such as o1.
The DeepSeek team first trained DeepSeek-R1-Zero entirely via reinforcement learning, applying Group Relative Policy Optimization (GRPO) to DeepSeek-V3. They then used R1-Zero to generate cold-start long Chain-of-Thought reasoning data to fine-tune DeepSeek-V3. Finally, they applied GRPO training again on the resulting SFT model, producing the stronger DeepSeek-R1 model.
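For reference, the core idea of GRPO (as introduced in the DeepSeekMath paper and reused for R1) is to replace PPO's learned value baseline with a group-relative one: for each prompt, the policy samples a group of $G$ completions with rewards $r_1, \dots, r_G$, and each completion's advantage is normalized against its own group,

$$\hat{A}_i = \frac{r_i - \operatorname{mean}(r_1, \dots, r_G)}{\operatorname{std}(r_1, \dots, r_G)},$$

which is then plugged into a clipped, PPO-style policy-gradient objective with a KL penalty toward a reference policy.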
Therefore, our open reproduction and scaling of R1 will proceed in two steps, closely mirroring the DeepSeek-R1 approach: first, generating verified cold-start reasoning data with SYNTHETIC-1, and then scaling up globally distributed reinforcement learning.
SYNTHETIC-1 consists of 1.4M curated tasks spanning math, coding, software engineering, STEM, and synthetic code understanding. It includes both programmatically verifiable problems (e.g., coding tasks with unit tests) and open-ended reasoning challenges verified using LLM judges. We also introduce a novel recipe to generate synthetic code understanding tasks that are highly challenging for state-of-the-art LLMs.
To collect verifiable math problems, we use the NuminaMath dataset, which consists mostly of high-school competition-level mathematics questions. We apply LLM-based filtering on top of the data to remove questions that are not automatically verifiable (e.g., questions asking for proofs) and to rewrite multiple-choice questions into a direct question-answer format.
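As a minimal sketch of how such a math task can be checked automatically (not the exact genesys implementation), verification reduces to extracting the final answer from a reasoning trace and comparing it to the ground truth after light normalization:

```python
# Illustrative sketch: verifying a model's final answer against the ground truth.
import re

def extract_final_answer(response: str) -> str:
    """Pull the last \\boxed{...} expression out of a reasoning trace, if present."""
    boxed = re.findall(r"\\boxed\{([^{}]*)\}", response)
    if boxed:
        return boxed[-1].strip()
    lines = response.strip().splitlines()
    return lines[-1].strip() if lines else ""

def normalize(answer: str) -> str:
    """Strip whitespace and trailing punctuation so '1/2' and ' 1/2 ' compare equal."""
    return answer.replace(" ", "").rstrip(".")

def verify_math(response: str, ground_truth: str) -> bool:
    return normalize(extract_final_answer(response)) == normalize(ground_truth)

# Example: a trace ending in "... the answer is \boxed{42}." verifies against "42".
assert verify_math(r"Thus the answer is \boxed{42}.", "42")
```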
We aggregate coding problems along with unit tests from the publicly available APPS, CodeContests, Codeforces, and TACO datasets. Since most problems were available in Python only, we rewrote them for JavaScript, Rust, and C++, growing the set from ~36k problems to ~144k. To verify LLM responses, we use containerized execution environments implemented in our library genesys.
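The sketch below illustrates the idea behind unit-test verification. Note that SYNTHETIC-1 runs candidate programs inside containerized environments via genesys; here we approximate that with a plain subprocess and stdin/stdout test cases, purely for illustration:

```python
# Simplified sketch of unit-test-based verification for competitive-programming tasks.
import subprocess
import tempfile

def run_python_solution(code: str, stdin: str, timeout: float = 5.0) -> str:
    """Execute a candidate Python program on one test input and capture stdout."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run(
        ["python3", path], input=stdin, capture_output=True, text=True, timeout=timeout
    )
    return result.stdout.strip()

def verify_against_tests(code: str, test_cases: list[tuple[str, str]]) -> bool:
    """A solution passes only if it matches the expected output on every test case."""
    return all(run_python_solution(code, inp) == out.strip() for inp, out in test_cases)

# Example: an "echo the sum" problem with two hidden test cases.
solution = "a, b = map(int, input().split())\nprint(a + b)"
print(verify_against_tests(solution, [("1 2", "3"), ("10 -4", "6")]))  # True
```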
We process CommitPack, a dataset of real-world GitHub commits, to curate a set of 70k open-ended software engineering problems. Each task consists of a code file before a commit and an LLM-generated instruction to modify that code; the instruction is synthesized by an LLM with access to the original commit message and the post-commit state of the file. To score solutions, we use an LLM judge that compares a given solution to the actual state of the file after the commit.
We use the StackExchange dataset to collect questions from a wide range of technical & scientific domains. We apply LLM-based filtering to only select questions that have objectively correct responses (e.g. we filter out questions that explicitly ask for opinions) and that require reasoning rather than just retrieving information. Similar to our software engineering problems, we score responses by using an LLM judge that has access to the most upvoted response to a question.
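Both the software engineering tasks and the StackExchange questions are scored with this reference-based LLM-judge pattern. The sketch below shows the general shape of such a judge; the prompt wording and the `call_llm` helper are hypothetical placeholders, not the actual genesys judge:

```python
# Hedged sketch of reference-based LLM-as-judge scoring: the judge sees a trusted
# reference (the real post-commit file, or the most upvoted StackExchange answer)
# and grades the candidate response against it. `call_llm` stands in for any
# chat-completion client and is an assumption for illustration.

JUDGE_PROMPT = """You are grading a candidate answer against a trusted reference.

Question / instruction:
{question}

Reference solution:
{reference}

Candidate answer:
{candidate}

Rate the candidate from 0 to 10 for correctness relative to the reference,
then output only the integer score on the final line."""

def judge_score(question: str, reference: str, candidate: str, call_llm) -> int:
    """Return an integer 0-10 score parsed from the judge model's reply."""
    reply = call_llm(JUDGE_PROMPT.format(
        question=question, reference=reference, candidate=candidate
    ))
    return int(reply.strip().splitlines()[-1])
```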
We propose Synthetic Code Understanding, a task that is highly challenging for state-of-the-art LLMs and can be generated in a fully automated manner without human annotators. The objective of this task is to predict the output of code that applies arbitrary transformations to a string, given some test input. To generate task data, we prompt LLMs to generate arbitrary string-processing functions and recursively make them more complex using a prompting scheme similar to Evol-Instruct.
To obtain inputs, we both generate random strings and select snippets from news articles, then feed them through the LLM-generated code to obtain ground-truth outputs. To verify a predicted output, we simply check whether the predicted string matches the ground-truth string.
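As a concrete illustration (the real transformation functions are LLM-generated and far more convoluted than this toy example), the task and its exact-match verification look roughly like this:

```python
# Minimal sketch of a synthetic code-understanding task: a string transformation
# is applied to a test input to produce the ground truth, and a predicted output
# verifies only if it matches that string exactly.

def transform(s: str) -> str:
    """Toy example of the kind of arbitrary string-processing function being generated."""
    # Reverse each word, then upper-case every second character of the result.
    reversed_words = " ".join(word[::-1] for word in s.split())
    return "".join(c.upper() if i % 2 else c for i, c in enumerate(reversed_words))

def verify_prediction(predicted: str, test_input: str) -> bool:
    """Verification is a plain string comparison against the executed ground truth."""
    return predicted == transform(test_input)

ground_truth = transform("prime intellect")               # the string a model must predict
print(ground_truth)                                        # "eMiRp tCeLlEtNi"
print(verify_prediction(ground_truth, "prime intellect"))  # True
```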
For long input strings and many iterations of increasing the code's complexity, even o1 has a near-0% solve rate on this task. Our final dataset consists of 61k problems of varying complexity across Python, JavaScript, Rust, and C++.
Genesys is our open-source library for synthetic data generation and verification that powers the release of SYNTHETIC-1. It contains efficient implementations of verifiers such as LLM judges and containerized code execution environments that run asynchronously, making it well suited not only for synthetic data generation but also for reinforcement learning.
Genesys is designed to be easily extendable: you can generate responses for your own tasks by providing a Hugging Face dataset that follows our flexible schema, and add your own verifier with minimal implementation effort. Instructions for this can be found in our README.
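To make the idea concrete, here is a hypothetical sketch of what a task record and a custom verifier could look like. The field names and the verifier interface below are assumptions for illustration only; the actual schema and extension points are documented in the genesys README:

```python
# Hypothetical sketch of plugging a custom task + verifier into a genesys-style pipeline.
from dataclasses import dataclass

@dataclass
class Task:
    problem_id: str
    prompt: str              # instruction shown to the model
    verification_info: dict  # whatever your verifier needs (tests, reference answer, ...)

class ExactAnswerVerifier:
    """Toy verifier: reward 1.0 if the response contains the expected answer string."""
    def verify(self, task: Task, response: str) -> float:
        return float(task.verification_info["expected_answer"] in response)

# A task dataset is then just a collection of such records:
task = Task("demo-001", "What is 7 * 6? Answer with the number only.", {"expected_answer": "42"})
print(ExactAnswerVerifier().verify(task, "The answer is 42"))  # 1.0
```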
We agree with Karpathy: one of the most impactful things the open-source community can do is to crowdsource tasks & verifier environments. Great examples of such environments include KernelBench, which evaluates an LLM's ability to write efficient GPU kernels, and SWE-Gym, which tests LLMs on real-world software engineering tasks.
Get involved: Contribute to our open-source codebase, and reach out on Discord if you’d like to add high-quality tasks and verifiers to our ongoing SYNTHETIC-1 run.
Anyone can now contribute resources to advance open-source AI through our platform, and later on in a permissionless way with their own hardware.
Dashboard: https://app.primeintellect.ai/intelligence (watch the run and contribute compute)
SYNTHETIC-1 marks our first release running entirely on our internal protocol testnet.
We are building a trustless, peer-to-peer protocol for globally distributed compute, enabling decentralized intelligence markets — from decentralized model training, inference, fine-tuning, and synthetic data generation to autonomous agents.
Anyone will be able to permissionlessly launch intelligence markets and have communities pool compute and intelligence resources toward the collective creation, ownership, and direction of intelligence.
As the next step, we plan to open our protocol infrastructure to permissionless public testnet contributions. Our recent work on TOPLOC, which enables verifiable inference through a locality-sensitive hashing scheme, will serve as the foundation for decentralized, verifiable compute contributions.
More details on our protocol roadmap and architecture will be shared soon!
SYNTHETIC-1 is just the first step towards scaling to state-of-the-art open reasoning models.
We are looking forward to community contributions in the form of compute, code or data to create the largest open reasoning dataset to date.
Furthermore, we are currently extending our training framework prime to support globally distributed reinforcement learning as the next step in our plan.
If this sounds exciting to you, join us. We are a small, talent-dense and highly ambitious team and are looking for engineers and researchers to help us build open source AGI.