Project Canary

Completed

Foundational MOVE Fellowship project (Sept–Oct 2025): a community-driven effort to train and refine frontier AI models. The fellowship completed 15,000+ tasks across 15 domains and improved Review 1 approval rates from 10% to 40%.

AI Training · Data Generation · Handshake AI · MOVE Fellowship

Project Canary — MOVE Fellowship Foundation

The foundational phase of the MOVE Fellowship at Handshake AI (Sept–Oct 2025), focused on large-scale task generation to train and refine frontier AI models.

Overview

Project Canary was a community-driven initiative where subject-matter experts contributed training tasks across 15 academic domains. The goal: generate high-quality, diverse training data that would push frontier models toward deeper domain expertise.

My Contribution

As a core contributor in the Computer Science domain, I:

  • Generated and reviewed tasks covering algorithms, data structures, systems design, machine learning, and software engineering
  • Contributed to the more than 15,000 tasks completed across the fellowship
  • Focused on tasks requiring PhD-level reasoning — problems that couldn't be solved by simply retrieving information

Impact

Quality Metrics

The most significant achievement was improving task quality:

Metric                    Before    After
Review 1 approval rate    10%       40%
Tasks completed                     15,000+
Domains covered                     15

A 4× improvement in first-review approval rates meant significantly less rework, faster iteration, and higher-quality training data reaching the model.
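As a rough illustration of the arithmetic behind that figure, here is a minimal Python sketch. Only the 10% and 40% approval rates come from the metrics above; the function names and the batch size of 1,000 tasks are hypothetical, chosen just to make the numbers concrete.

```python
def fold_improvement(before_pct: int, after_pct: int) -> float:
    """Ratio of the new approval rate to the old one."""
    return after_pct / before_pct


def first_review_rejections(batch_size: int, approval_pct: int) -> int:
    """Number of tasks in a batch that fail first review at a given approval rate."""
    return batch_size * (100 - approval_pct) // 100


BEFORE_PCT, AFTER_PCT = 10, 40  # Review 1 approval rates reported above
BATCH_SIZE = 1_000              # hypothetical batch size, for illustration only

print(fold_improvement(BEFORE_PCT, AFTER_PCT))          # 4.0
print(first_review_rejections(BATCH_SIZE, BEFORE_PCT))  # 900
print(first_review_rejections(BATCH_SIZE, AFTER_PCT))   # 600
```

Under these assumptions, the approval rate quadruples while the number of tasks sent back after first review drops by a third (from 900 to 600 per 1,000 submitted).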

Lessons Learned

  1. Domain expertise matters: Generic annotators produce generic data. PhD-level contributors produce training signal that actually moves the needle on hard problems.
  2. Quality over quantity: A single well-crafted reasoning task is worth more than a hundred trivial ones.
  3. Review feedback loops: Tight review cycles with specific feedback accelerate quality improvement dramatically.

Legacy

Project Canary laid the groundwork for Project Orion, the specialized refinement phase that followed, in which the focus shifted from broad data generation to targeted reasoning, safety, and red-teaming work.