rl-list.com
UPDATED 2026.06.07
#21

Vmax

Commercial
medium confidence
vmax.ai · Status: active (confirmed) · Founded 2025

Vmax is a San Francisco reinforcement-learning startup (founded 2025 by three RL/robotics PhDs from UCL and UPenn) that automates the conversion of proprietary data and evals into RL environments for LLM-based agents, targeting long-horizon and coding tasks. Its public research includes unix-ctf, a procedural generator of capture-the-flag tasks for Unix/shell competence.

Key facts
Headquarters
San Francisco, USA (reported)
Headcount band
1-10 (estimated)
Total raised
unknown
Last round
unknown
SOC 2
unknown
What they sell
environments (confirmed)
Open source
no (estimated)
Deployment
unknown

Scale & velocity

Current headcount
~11 (LinkedIn 'all 11 employees' link, 2026-06-07); official LinkedIn size band 2-10 (reported)
Headcount growth
unknown
Open roles
8 (confirmed)
Other locations
unknown
Distributed / remote
unknown

Research depth

Has researchers
yes (confirmed)
Researcher count
Small research-led team (~11 total). The three co-founders are RL/robotics PhDs; named researchers from the unix-ctf paper: Geoffrey Bradway, Roger Creus Castanyer, Lorenz Wolf (reported)
Backgrounds
Matthew Sargent: RL PhD, University College London (2019-2024); co-founder
Augustine Mavor-Parker: RL PhD, University College London; CTO; previously Redwood Research (AI safety), Cold Spring Harbor Laboratory (NeuroAI), Illumina (AI for genomics)
Heejin Jeong: PhD in ESE/robotics, University of Pennsylvania (GRASP Lab, 2020); co-founder; off-policy TD learning for robotics/autonomous systems
Founding team described on vmax.ai as 3 RL PhDs from UCL and UPenn with publications at NeurIPS, ICML, AAAI (reported)
Papers / benchmarks
unix-ctf: Procedural Environments for Unix-Competence Reinforcement Learning, arXiv:2605.29115 (https://arxiv.org/abs/2605.29115); procedural generator of CTF tasks for shell agents (656 portable variants); authors include Geoffrey Bradway, Roger Creus Castanyer, Lorenz Wolf
Collaboration with Martian/ARES releasing ~1k JavaScript coding tasks in Harbor format, compatible with the Terminal-Bench ecosystem (confirmed)

Capital

Total raised
unknown
Last round
unknown
Investors
Race Capital, South Park Commons (reported)
Valuation
unknown
Revenue signals
unknown

Security & compliance

SOC 2
unknown
Other certifications
unknown
Security page
unknown

Product

What they sell
environments (confirmed)
Open source
no (estimated)
License
unknown
Deployment model
unknown
Maturity
unknown
Notable customers
Martian / ARES team (withmartian): a partnership jointly releasing ~1k JavaScript coding tasks in the Harbor format, the Terminal-Bench task format (self-claimed)

Buyer analysis

Best fit: Teams needing custom, research-grade RL environments to train coding and long-horizon shell/terminal agents from proprietary data.

How we verified this

Re-verified the draft against vmax.ai, the company's Greenhouse job board, LinkedIn, South Park Commons, PitchBook, the unix-ctf arXiv paper, and the withmartian ARES post. Confirmed this is the correct entity (SF RL-environments startup, vmax.ai) matching the 'research-grade RL environments' note, and ruled out two same-named decoys: the V-Max autonomous-driving framework (arXiv:2503.08388, valeoai) and a Shenzhen EV-charging 'VMAX' (Tracxn, founded 2005). Confirmed: what_they_sell=environments, 8 SF open roles, ~11 employees / band 1-10, unix-ctf paper (arXiv:2605.29115) with Vmax-affiliated authors, and investors Race Capital + South Park Commons (kept 'reported'). Corrected the founder set to three co-founders (added Heejin Jeong, UPenn); the draft listed only two and misattributed all to UCL. Corrected the customer attribution from 'Harbor / Laude Institute' to Martian/ARES (a collaboration, not a paying customer; remains self-claimed). Removed 'execution infrastructure' from focus_areas as unsupported. All funding amount, round, and valuation fields correctly remain unknown; no SOC 2 report, other certifications, or security page found. Overall confidence: medium.

Sources

  1. vmax.ai/ · 2026-06-07, Official site, RL environments, open-ended learning, hiring researchers, unix-ctf research link, Greenhouse careers link
  2. job-boards.greenhouse.io/vmax · 2026-06-07, Careers, 8 open roles, all San Francisco
  3. www.linkedin.com/company/vmax-ai · 2026-06-07, Public LinkedIn snippet, ~11 employees, San Francisco (Brannan St 94107), 'Open ended task generation', named team members
  4. www.linkedin.com/in/augustine-mavor-parker/ · 2026-06-07, Co-founder/CTO; RL PhD UCL; ex-Redwood Research, CSHL, Illumina
  5. www.linkedin.com/in/matthewjsargent/ · 2026-06-07, Co-founder; RL PhD UCL
  6. www.southparkcommons.com/companies/vmax/ · 2026-06-07, SPC portfolio, founders, founded 2025, SF, description; SPC as backer
  7. arxiv.org/abs/2605.29115 · 2026-06-07, unix-ctf research paper, procedural Unix-competence CTF environments
  8. withmartian.com/post/ares-open-source-infrastructure-for-online-rl-on-co · 2026-06-07, Mentions Vmax partnership releasing ~1k JavaScript tasks in Harbor format
  9. newsletter.semianalysis.com/p/rl-environments-and-rl-for-science · 2026-06-07, SemiAnalysis, lists Vmax among RL-environment vendors
  10. pitchbook.com/profiles/company/907262-38 · 2026-06-07, PitchBook profile (403 blocked); no disclosed funding amount per search snippets
  11. www.crunchbase.com/organization/vmax · 2026-06-07, Crunchbase listing referenced; funding unannounced
  12. x.com/MavorParker/status/1868009967518880183 · 2026-06-07, Founder post describing agents leveraging large-company dataset structure for multistep RL on long-horizon tasks
Last updated 2026-06-07 · Every quantitative field carries a source and a confidence tag. Fields we could not source publicly are marked unknown, never estimated. See the methodology.