Now accepting pilot partners

See how they actually work.
Not how they interview.

A comprehensive AI-native assessment suite — work trials, interviews, cognitive tests — that predicts on-the-job performance with evidence.

Trusted by founders from YC, SPC, and more

The Problem

Hiring is broken.
Everyone knows it.

That sinking feeling three months in when your "great interview" turns into a performance problem.

Resumes lie

Credentials don't predict performance. You're hiring based on marketing, not capability.

Interviews are theater

Candidates perform rehearsed answers. You learn who interviews well, not who ships.

LeetCode is cargo cult

Inverting binary trees has nothing to do with building products. Memorization isn't problem-solving.

AI changes everything

Your best hire uses AI to 10x their output. Traditional interviews penalize this.

"The best predictor of future performance is past performance — in similar conditions."

So why are we still using artificial conditions to predict real work?

How It Works

Four steps to certainty

A comprehensive assessment suite — work trials, interviews, cognitive tests — that shows you who someone actually is.

01

Use any AI tools

Real environment. Real tools.

Candidate works on their own machine with their choice of AI tools. Just like their actual job.

Watcher connected
$ claude --version
Claude Code v1.0.14
$ git clone project-repo && cd project-repo
$ npm run dev
02

Ambiguous Assignment

Designed to reveal agency.

A real-world project with incomplete requirements. See how they handle ambiguity and turn chaos into shipped code.

Assignment Brief

Build a support inbox triage system. Ingest tickets, cluster by topic, propose auto-replies, admin UI for review.

"Requirements intentionally incomplete. Ask questions. Make decisions. Ship something real."

ambiguity · agency · tradeoffs
03

Interviews & Cognitive Tests

Multi-modal signal, not just code.

Structured interviews probe reasoning, judgment, and communication. Cognitive tests measure raw problem-solving ability. Patterns across contexts — not one performance.

Walk me through a tradeoff you'd make differently today.
Verbal reasoning: 4.4
Pattern recognition: 4.7
Numerical reasoning: 4.1
04

Evidence-Based Report

Signal, not vibes.

Rubric scores, evidence clips, and a clear recommendation you can act on.

Candidate Report: HIRE
Intelligence: 4.5
AI Tool Usage: 4.8
Judgment: 4.2
Agency: 4.6
"Strong evidence of independent thinking and effective AI usage..."
The Report

Decisions backed by evidence

Not a vibe. A report that makes the hiring decision obvious — with receipts.

Candidate Evaluation Report

Full-Stack Engineer • Jan 2025

STRONG HIRE
Active Time: 14.2 hours

Commits: 47

Decisions Logged: 12

Rubric Scores

Intelligence: 4.1
Grit: 4.6
AI Tool Usage: 4.4
Openness: 4.6

Key Evidence

14:23 — "Chose to ship auth-less MVP first, documented security as Day 2 priority. Explicitly traded speed for completeness."

16:45 — When tests failed, diagnosed root cause in 8 mins using Claude Code. Fixed without introducing regressions.

What we measure

Six dimensions that predict on-the-job success. Each scored 1–5 with evidence.

Intelligence

Problem decomposition, pattern recognition, learning speed

Grit

Persistence through ambiguity and setbacks

AI Tool Usage

Leverages AI effectively as a force multiplier

Openness

Receptivity to feedback, new approaches, and change

Judgment

Prioritization, tradeoffs, knowing what matters

Agency

Proactive decisions, owns outcomes, escalates with options

Get Early Access

Join the waitlist

Limited pilot spots available. Be among the first.

No spam, ever. Pilot spots limited.