#14

HUD

Commercial

medium confidence

www.hud.ai ↗ · Status: active confirmed · Founded 2025

HUD (YC W25, formerly hud.so) is a platform for building reinforcement-learning environments and evaluations for computer-use and browser agents. It lets teams wrap real software/code as agent-callable tools in isolated containers, define tasks and rewards, and run evals/RL at scale via an open-source SDK plus a cloud-hosted gateway. It maintains public benchmarks (OSWorld-Verified contributions, SheetBench-50) and positions frontier AI labs and agent-first startups as its target customers.

Key facts

Headquarters

San Francisco, USAconfirmed cite

Headcount band

11-50reported cite

Total raised

unknown

Last round

Seed (amount undisclosed)reported cite

SOC 2

unknown

What they sell

environmentsconfirmed cite

Open source

yesconfirmed cite

Deployment

managed-hosted + self-hosted (open-source SDK; cloud platform with local CLI execution and an OpenAI-compatible model gateway at inference.hud.ai)confirmed cite

Scale & velocity

Current headcount

~15 (per YC profile, accessed 2026-06-07)reported cite

Headcount growth

unknown

Open roles

5reported cite

Other locations

unknown

Distributed / remote

noestimated cite

Research depth

Has researchers

yesreported cite

Researcher count

unknown

Backgrounds

Jay Ram (CEO) - consumer apps, ML/quant research, Lorenss Martinsons (CPO) - Cognitive Science, Yale, Parth Patel (CTO) - evals and RL environments, Team reported to include International Olympiad medalists (IOI, IPhO) and researchers with ICLR/NeurIPS publicationsreported cite

Papers / benchmarks

OSWorld-Verified (369+ real-world desktop tasks; HUD/'Human Data' acknowledged among institutions providing feedback/fixes, per XLANG Lab), SheetBench-50 (financial-analyst spreadsheet benchmark, developed with Sepal AI; per HUD case study)reported cite

Capital

Total raised

unknown

Last round

Seed (amount undisclosed)reported cite

Investors

Y Combinator (W25 batch), Exceptional Capitalreported cite

Valuation

unknown

Revenue signals

unknown

Security & compliance

SOC 2

unknown

Other certifications

unknown

Security page

https://www.hud.ai/dpa (Data Processing Addendum)reported cite

Product

What they sell

environmentsconfirmed cite

Open source

yesconfirmed cite

License

MITconfirmed cite

Deployment model

managed-hosted + self-hosted (open-source SDK; cloud platform with local CLI execution and an OpenAI-compatible model gateway at inference.hud.ai)confirmed cite

Maturity

GAestimated cite

Notable customers

DoorDash self-claimed UiPath self-claimed Sharpe self-claimed ⚑OpenAI self-claimed ⚑Anthropic self-claimed cite

Buyer analysis

Best fit: Teams that need to benchmark or RL-train computer-use/browser agents against real-software tasks with reproducible, containerized environments.

How we verified this

Confirmed this is the correct company: HUD (hud.ai, YC W25, formerly hud.so), a platform for wrapping real software/code as agent-callable RL environments and evals for computer-use/browser agents - matching the 'wrapper layer' directory note. Disambiguated from an unrelated same-named Israeli startup ('Hud', runtime code sensor, backed by Square Peg, ~$21M) whose funding figures had leaked into search snippets; HUD's actual funding amount remains undisclosed, though a seed round is confirmed via Exceptional Capital's portfolio (added as an investor, reported). Biggest correction: OpenAI and Anthropic were downgraded from 'verified' to 'self-claimed' customers - the cited XLANG/OSWorld-Verified page only co-acknowledges HUD as a benchmark feedback contributor alongside those labs, which is collaboration, not customer verification. Benchmarks downgraded to 'reported' (OSWorld page names 'Human Data', SheetBench is self-reported). open_roles_count downgraded to reported. License upgraded to confirmed after re-fetching the MIT license field on GitHub. DoorDash/UiPath/Sharpe correctly remain self-claimed. No SOC 2/ISO found; security_page is only a DPA. Overall confidence: medium.

Related vendors

Sources

www.ycombinator.com/companies/hud · 2026-06-07, YC profile: founders (Jay Ram, Lorenss Martinsons, Parth Patel), founded 2025, ~15 people, SF, W25, frontier-lab positioning
www.ycombinator.com/companies/hud/jobs · 2026-06-07, 5 open roles, all San Francisco
github.com/hud-evals/hud-python · 2026-06-07, OSS SDK, MIT license reported, ~258 stars, v0.5.41 (Apr 2026), prebuilt computer/shell/file/browser tools
pypi.org/project/hud-python/ · 2026-06-07, PyPI package for hud-python
docs.hud.ai/ · 2026-06-07, Deployment model: cloud-hosted + local CLI, containerized isolated environments, OpenAI-compatible gateway at inference.hud.ai (Claude/GPT/Gemini/Grok); no SOC2/license stated in docs
www.hud.ai/ · 2026-06-07, Homepage (HTTP 429 on fetch); via search: customers DoorDash, UiPath, Sharpe self-claimed; positioning as RL environments/evals for CUAs
www.hud.ai/case-studies/sheetbench-50 · 2026-06-07, SheetBench-50 developed with Sepal AI; finance professionals from PwC/Cisco/Charles Schwab/Fannie Mae involved
www.hud.ai/dpa · 2026-06-07, Data Processing Addendum page
xlang.ai/blog/osworld-verified · 2026-06-07, Third-party: HUD (Human Data, hud.so) listed among institutions providing OSWorld-Verified feedback alongside OpenAI, Anthropic, Moonshot, ByteDance, Simular - verifies frontier-lab collaboration
foundertrace.com/companies/hud_yc_w25/ · 2026-06-07, Founders, Martinsons Yale Cognitive Science background
www.workatastartup.com/companies/hud · 2026-06-07, Team described with Olympiad medalists and ICLR/NeurIPS researchers (via search snippet)
newsletter.semianalysis.com/p/rl-environments-and-rl-for-science · 2026-06-07, Technical description: HUD wraps software in dockerized container + MCP server exposing agent tools
www.linkedin.com/company/hud-evals · 2026-06-07, Correct LinkedIn handle for HUD (YC W25) RL-environments company
www.crunchbase.com/organization/hud · 2026-06-07, Crunchbase profile (HTTP 403 on fetch); funding not verified
x.com/hud_evals/status/1919262852088570225 · 2026-06-07, HUD X post: hiring research engineers, works with frontier labs to evaluate CUAs

Last updated 2026-06-07 · Every quantitative field carries a source and a confidence tag. Fields we could not source publicly are marked unknown, never estimated. See the methodology.