rl-list.com
UPDATED 2026.06.07
#14

HUD

Commercial
medium confidence
www.hud.ai · Status: active (confirmed) · Founded 2025

HUD (YC W25, formerly hud.so) is a platform for building reinforcement-learning environments and evaluations for computer-use and browser agents. It lets teams wrap real software/code as agent-callable tools in isolated containers, define tasks and rewards, and run evals/RL at scale via an open-source SDK plus a cloud-hosted gateway. It maintains public benchmarks (OSWorld-Verified contributions, SheetBench-50) and positions frontier AI labs and agent-first startups as its target customers.
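
The wrap-tools, define-task, score-reward pattern described above can be illustrated with a generic sketch. This is not HUD's SDK or API: every name here (SpreadsheetEnv, Task, run_episode, scripted_policy) is hypothetical, and a fresh in-memory object per episode stands in for an isolated container.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical illustration only; none of these names come from HUD's SDK.
ToolCall = Tuple[str, Dict[str, object]]


@dataclass
class SpreadsheetEnv:
    """Toy stand-in for real software exposed to an agent as callable tools."""
    cells: Dict[str, float] = field(default_factory=dict)

    def set_cell(self, ref: str, value: float) -> str:
        self.cells[ref] = value
        return f"{ref}={value}"

    def get_cell(self, ref: str) -> float:
        return self.cells.get(ref, 0.0)

    def tools(self) -> Dict[str, Callable[..., object]]:
        # The tool registry is the only surface the agent may touch.
        return {"set_cell": self.set_cell, "get_cell": self.get_cell}


@dataclass
class Task:
    prompt: str
    reward: Callable[[SpreadsheetEnv], float]  # scores the final environment state


def run_episode(task: Task, policy: Callable[[str], List[ToolCall]]) -> float:
    env = SpreadsheetEnv()  # fresh state per episode (assumed analogue of a container)
    for name, kwargs in policy(task.prompt):
        env.tools()[name](**kwargs)
    return task.reward(env)


task = Task(
    prompt="Put the sum of 2 and 3 in cell A3.",
    reward=lambda env: 1.0 if env.get_cell("A3") == 5.0 else 0.0,
)


def scripted_policy(prompt: str) -> List[ToolCall]:
    # A real agent would choose tool calls from the prompt; this one is hard-coded.
    return [("set_cell", {"ref": "A3", "value": 5.0})]


score = run_episode(task, scripted_policy)
```

Because the reward inspects only the final environment state, the same task definition can serve for an evaluation run or as the scalar signal in an RL loop.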

Key facts
Headquarters: San Francisco, USA (confirmed)
Headcount band: 11-50 (reported)
Total raised: unknown
Last round: Seed, amount undisclosed (reported)
SOC 2: unknown
What they sell: environments (confirmed)
Open source: yes (confirmed)
Deployment: managed-hosted + self-hosted (open-source SDK; cloud platform with local CLI execution and an OpenAI-compatible model gateway at inference.hud.ai) (confirmed)

Scale & velocity

Current headcount: ~15 (per YC profile, accessed 2026-06-07; reported)
Headcount growth: unknown
Open roles: 5 (reported)
Other locations: unknown
Distributed / remote: no (estimated)

Research depth

Has researchers: yes (reported)
Researcher count: unknown
Backgrounds: Jay Ram (CEO), consumer apps and ML/quant research; Lorenss Martinsons (CPO), Cognitive Science, Yale; Parth Patel (CTO), evals and RL environments; team reported to include International Olympiad medalists (IOI, IPhO) and researchers with ICLR/NeurIPS publications (reported)
Papers / benchmarks: OSWorld-Verified (369+ real-world desktop tasks; HUD/'Human Data' acknowledged among institutions providing feedback/fixes, per XLANG Lab); SheetBench-50 (financial-analyst spreadsheet benchmark, developed with Sepal AI; per HUD case study) (reported)

Capital

Total raised: unknown
Last round: Seed, amount undisclosed (reported)
Investors: Y Combinator (W25 batch), Exceptional Capital (reported)
Valuation: unknown
Revenue signals: unknown

Security & compliance

SOC 2: unknown
Other certifications: unknown
Security page: https://www.hud.ai/dpa (Data Processing Addendum; reported)

Product

What they sell: environments (confirmed)
Open source: yes (confirmed)
License: MIT (confirmed)
Deployment model: managed-hosted + self-hosted (open-source SDK; cloud platform with local CLI execution and an OpenAI-compatible model gateway at inference.hud.ai) (confirmed)
Maturity: GA (estimated)
Notable customers: DoorDash, UiPath, Sharpe, OpenAI, Anthropic (all self-claimed)

Buyer analysis

Best fit: Teams that need to benchmark or RL-train computer-use/browser agents against real-software tasks with reproducible, containerized environments.

How we verified this

Confirmed this is the correct company: HUD (hud.ai, YC W25, formerly hud.so), a platform for wrapping real software/code as agent-callable RL environments and evals for computer-use/browser agents, matching the 'wrapper layer' directory note. Disambiguated from an unrelated same-named Israeli startup ('Hud', runtime code sensor, backed by Square Peg, ~$21M) whose funding figures had leaked into search snippets. HUD's actual funding amount remains undisclosed, though a seed round is confirmed via Exceptional Capital's portfolio (added as an investor, reported).

Biggest correction: OpenAI and Anthropic were downgraded from 'verified' to 'self-claimed' customers. The cited XLANG/OSWorld-Verified page only co-acknowledges HUD as a benchmark feedback contributor alongside those labs, which is collaboration, not customer verification.

Other changes: benchmarks were downgraded to 'reported' (the OSWorld page names 'Human Data'; SheetBench is self-reported). The open_roles_count field was downgraded to 'reported'. License was upgraded to 'confirmed' after re-fetching the MIT license field on GitHub. DoorDash, UiPath, and Sharpe correctly remain self-claimed. No SOC 2 or ISO certification was found; security_page is only a DPA. Overall confidence: medium.

Sources

  1. www.ycombinator.com/companies/hud · 2026-06-07, YC profile: founders (Jay Ram, Lorenss Martinsons, Parth Patel), founded 2025, ~15 people, SF, W25, frontier-lab positioning
  2. www.ycombinator.com/companies/hud/jobs · 2026-06-07, 5 open roles, all San Francisco
  3. github.com/hud-evals/hud-python · 2026-06-07, OSS SDK, MIT license (confirmed via the GitHub license field), ~258 stars, v0.5.41 (Apr 2026), prebuilt computer/shell/file/browser tools
  4. pypi.org/project/hud-python/ · 2026-06-07, PyPI package for hud-python
  5. docs.hud.ai/ · 2026-06-07, Deployment model: cloud-hosted + local CLI, containerized isolated environments, OpenAI-compatible gateway at inference.hud.ai (Claude/GPT/Gemini/Grok); no SOC2/license stated in docs
  6. www.hud.ai/ · 2026-06-07, Homepage (HTTP 429 on fetch); via search: customers DoorDash, UiPath, Sharpe self-claimed; positioning as RL environments/evals for CUAs
  7. www.hud.ai/case-studies/sheetbench-50 · 2026-06-07, SheetBench-50 developed with Sepal AI; finance professionals from PwC/Cisco/Charles Schwab/Fannie Mae involved
  8. www.hud.ai/dpa · 2026-06-07, Data Processing Addendum page
  9. xlang.ai/blog/osworld-verified · 2026-06-07, Third-party: HUD (Human Data, hud.so) listed among institutions providing OSWorld-Verified feedback alongside OpenAI, Anthropic, Moonshot, ByteDance, Simular - verifies frontier-lab collaboration
  10. foundertrace.com/companies/hud_yc_w25/ · 2026-06-07, Founders, Martinsons Yale Cognitive Science background
  11. www.workatastartup.com/companies/hud · 2026-06-07, Team described with Olympiad medalists and ICLR/NeurIPS researchers (via search snippet)
  12. newsletter.semianalysis.com/p/rl-environments-and-rl-for-science · 2026-06-07, Technical description: HUD wraps software in dockerized container + MCP server exposing agent tools
  13. www.linkedin.com/company/hud-evals · 2026-06-07, Correct LinkedIn handle for HUD (YC W25) RL-environments company
  14. www.crunchbase.com/organization/hud · 2026-06-07, Crunchbase profile (HTTP 403 on fetch); funding not verified
  15. x.com/hud_evals/status/1919262852088570225 · 2026-06-07, HUD X post: hiring research engineers, works with frontier labs to evaluate CUAs
Every quantitative field carries a source and a confidence tag. Fields we could not source publicly are marked unknown, never estimated.