#21

Vmax

Commercial

medium confidence

vmax.ai · Status: active (confirmed) · Founded 2025

Vmax is a San Francisco reinforcement-learning startup (founded 2025 by three RL/robotics PhDs from UCL and UPenn) that automates the conversion of proprietary data and evals into RL environments for LLM-based agents, targeting long-horizon and coding tasks. Its public research includes unix-ctf, a procedural generator of capture-the-flag tasks for Unix/shell competence.

Key facts

Headquarters

San Francisco, USA (reported)

Headcount band

1-10 (estimated)

Total raised

unknown

Last round

unknown

SOC 2

unknown

What they sell

environments (confirmed)

Open source

no (estimated)

Deployment

unknown

Scale & velocity

Current headcount

~11 (LinkedIn 'all 11 employees' link, 2026-06-07); official LinkedIn size band 2-10 (reported)

Headcount growth

unknown

Open roles

8 (confirmed)

Other locations

unknown

Distributed / remote

unknown

Research depth

Has researchers

yes (confirmed)

Researcher count

Small research-led team (~11 total). The 3 co-founders are RL/robotics PhDs; named researchers from the unix-ctf paper: Geoffrey Bradway, Roger Creus Castanyer, Lorenz Wolf (reported)

Backgrounds

Matthew Sargent: RL PhD, University College London (2019-2024); co-founder
Augustine Mavor-Parker: RL PhD, University College London; CTO; previously Redwood Research (AI safety), Cold Spring Harbor Laboratory (NeuroAI), Illumina (AI for genomics)
Heejin Jeong: PhD in ESE/robotics, University of Pennsylvania (GRASP Lab, 2020); co-founder; off-policy TD learning for robotics/autonomous systems
Founding team described on vmax.ai as 3 RL PhDs from UCL and UPenn with publications at NeurIPS, ICML, AAAI (reported)

Papers / benchmarks

unix-ctf: Procedural Environments for Unix-Competence Reinforcement Learning, arXiv:2605.29115 (https://arxiv.org/abs/2605.29115); procedural generator of CTF tasks for shell agents (656 portable variants); authors include Geoffrey Bradway, Roger Creus Castanyer, Lorenz Wolf
Collaboration with Martian/ARES releasing ~1k JavaScript coding tasks in Harbor format (compatible with the Terminal-Bench ecosystem) (confirmed)

Capital

Total raised

unknown

Last round

unknown

Investors

Race Capital, South Park Commons (reported)

Valuation

unknown

Revenue signals

unknown

Security & compliance

SOC 2

unknown

Other certifications

unknown

Security page

unknown

Product

What they sell

environments (confirmed)

Open source

no (estimated)

License

unknown

Deployment model

unknown

Maturity

unknown

Notable customers

Martian / ARES team (withmartian): partnership jointly releasing ~1k JavaScript coding tasks in the Harbor format (Harbor = Terminal-Bench task format); a collaboration rather than a paying customer (self-claimed)

Buyer analysis

Best fit: Teams needing custom, research-grade RL environments to train coding and long-horizon shell/terminal agents from proprietary data.

How we verified this

Re-verified the draft against vmax.ai, the company's Greenhouse, LinkedIn, South Park Commons, PitchBook, the unix-ctf arXiv paper, and the withmartian ARES post. Confirmed this is the correct entity (SF RL-environments startup, vmax.ai) matching the 'research-grade RL environments' note, and ruled out two same-named decoys: the V-Max autonomous-driving framework (arXiv:2503.08388, valeoai) and a Shenzhen EV-charging 'VMAX' (Tracxn, founded 2005). Confirmed: what_they_sell=environments, 8 SF open roles, ~11 employees / band 1-10, unix-ctf paper (arXiv:2605.29115) with Vmax-affiliated authors, and investors Race Capital + South Park Commons (kept 'reported'). Corrected the founder set to three co-founders by adding Heejin Jeong (UPenn); the draft listed only two and attributed all of them to UCL. Corrected the customer attribution from 'Harbor / Laude Institute' to Martian/ARES (a collaboration, not a paying customer; remains self-claimed). Removed 'execution infrastructure' from focus_areas as unsupported. Funding amount, round, and valuation all remain unknown; no SOC 2, other certifications, or security page found. Overall confidence: medium.

Sources

vmax.ai/ · 2026-06-07, Official site, RL environments, open-ended learning, hiring researchers, unix-ctf research link, Greenhouse careers link
job-boards.greenhouse.io/vmax · 2026-06-07, Careers, 8 open roles, all San Francisco
www.linkedin.com/company/vmax-ai · 2026-06-07, Public LinkedIn snippet, ~11 employees, San Francisco (Brannan St 94107), 'Open ended task generation', named team members
www.linkedin.com/in/augustine-mavor-parker/ · 2026-06-07, Co-founder/CTO; RL PhD UCL; ex-Redwood Research, CSHL, Illumina
www.linkedin.com/in/matthewjsargent/ · 2026-06-07, Co-founder; RL PhD UCL
www.southparkcommons.com/companies/vmax/ · 2026-06-07, SPC portfolio, founders, founded 2025, SF, description; SPC as backer
arxiv.org/abs/2605.29115 · 2026-06-07, unix-ctf research paper, procedural Unix-competence CTF environments
withmartian.com/post/ares-open-source-infrastructure-for-online-rl-on-co · 2026-06-07, Mentions Vmax partnership releasing ~1k JavaScript tasks in Harbor format
newsletter.semianalysis.com/p/rl-environments-and-rl-for-science · 2026-06-07, SemiAnalysis, lists Vmax among RL-environment vendors
pitchbook.com/profiles/company/907262-38 · 2026-06-07, PitchBook profile (403 blocked); no disclosed funding amount per search snippets
www.crunchbase.com/organization/vmax · 2026-06-07, Crunchbase listing referenced; funding unannounced
x.com/MavorParker/status/1868009967518880183 · 2026-06-07, Founder post describing agents leveraging large-company dataset structure for multistep RL on long-horizon tasks

Last updated 2026-06-07 · Every quantitative field carries a source and a confidence tag. Fields we could not source publicly are marked unknown, never estimated. See the methodology.