RL Environments Architect

About Us

Our mission is to raise AGI with the richness of human intelligence — curious, witty, imaginative, and full of unexpected brilliance.

Surge was founded by engineers and researchers who dreamed of building the next generation of AI. Our platform powers the most capable models in the world, in partnership with companies like OpenAI, Anthropic, Meta, and Google.

At Surge, we believe the path to AGI isn't just about scaling compute—it's about embracing the unlimited ceiling of human intelligence and creativity in the data that shapes these systems. Our platform combines elite human expertise with cutting-edge tools for scalable oversight, from building rich RL environments to conducting rigorous evaluations that go beyond benchmarks. We've run a profitable business from day one without raising venture funding.

The Role

As an RL Environments Architect, you’ll design, instrument, and govern the simulated worlds where agents learn — from compact task microcosms to multi-agent, tool-using ecosystems. You’ll define the primitives, reward structures, interfaces, and telemetry that let us stress-test emerging capabilities while keeping training signals faithful, stable, and scalable.

You’ll not only build environments but also craft standards for data quality and reproducibility across large-scale agent gyms. This is a role for someone who sweats the details of simulation fidelity, thinks in terms of coverage and failure surfaces, and loves turning messy real-world phenomena into learnable curricula. Your work will form the backbone for safe, rapid progress in agentic systems.

What You'll Do

  • Architect a modular environment framework with clear APIs, curriculum scaffolds, and configurable reward/termination schemas
  • Establish quality bars: coverage metrics, invariance checks, and trace audits for environment outputs and agent experience buffers
  • Instrument rich telemetry for episode rollouts to detect and mitigate reward hacking, mode collapse, and exploitable loopholes
  • Partner with researchers to translate real-world tasks into robust simulations, including synthetic data generators and evaluation suites

What We’re Looking For

  • Simulation & Systems Depth – Experience building RL environments or simulators (e.g., custom physics, multi-agent, tool APIs) with an eye for determinism, performance, and observability
  • Data Quality Leadership – Strong instincts for designing reward functions, scenario taxonomies, and QA pipelines that keep signals aligned and drift-free
  • Builder’s Mindset – Comfort collaborating across research and engineering to ship pragmatic, testable environments that evolve with model capabilities

How to Apply

To apply, please email talent@surgehq.ai with a resume and 2-3 sentences describing your interest in Surge. We love personal projects and writings too!
