Dataset Catalog

Production-ready datasets built by domain experts

STEM & Multimodal

Adversarial STEM Reasoning

PhD-level problems in mathematics, physics, chemistry, and biology designed to challenge frontier models through multi-step reasoning and adversarial examples.

Dataset Specifications:

📊 10K+ Problems
🎓 PhD-Level
Verified Ground-Truth Answers (GTAs)
🔬 Multi-Modal
Math, Physics, Chemistry, Biology
Request Access

CLI & DevOps

Terminal Reasoning Benchmarks

Multi-step command-line tasks that run in Docker environments and test system-level reasoning, containerization, and real-world development workflows.

Dataset Specifications:

📊 500+ Tasks
🐳 Docker Ready
<40% Pass Rate
Auto-Graded
Bash/Shell, Docker, Kubernetes, CI/CD
Request Access

Finance & Economics

Financial Reasoning Tasks

Complex financial analysis problems requiring market research, valuation modeling, and investment rationale synthesis across multiple data sources.

Dataset Specifications:

📊 2K+ Problems
💼 Industry-Grade
CFA-Level
📈 Real Markets
Valuations, Market Analysis, Investment, Risk
Request Access

Coding & Engineering

Software Engineering Benchmarks

Full-stack development challenges including API design, system architecture, algorithmic problems, and real-world codebase scenarios.

Dataset Specifications:

📊 3K+ Challenges
💻 Multi-Language
Test Suites
🔧 Production-Grade
Python, JavaScript, System Design, Algorithms
Request Access

Multi-Model Testing

Adversarial Failure Induction

Targeted prompts designed to induce specific failure modes across multiple frontier LLMs, including GPT-5, Claude Sonnet 4.5, and others.

Dataset Specifications:

📊 5K+ Prompts
🤖 4+ Models
Verified Failures
🎯 Targeted
GPT-5, Claude 4.5, Adversarial, QA Testing
Request Access

Research & Analysis

Multi-Source Synthesis Tasks

Research-heavy problems that require gathering information from multiple academic papers, datasets, and other sources to reach verifiable conclusions.

Dataset Specifications:

📊 1.5K+ Tasks
📚 Multi-Source
Peer-Reviewed
🧠 Deep Research
Academic, Synthesis, Citations, Analysis
Request Access

Need Custom Datasets?

We can create tailored datasets for your specific AI training needs.

Contact Us