OpenTrain AI

Biology Reasoning Evaluator — PhD in Biology

OpenTrain AI · Remote · Worldwide · Posted Jun 9, 2026

Hourly · $80/hr

About OpenTrain

OpenTrain aggregates data-labeling and AI-training jobs from many companies and platforms into a single job board so you can discover this work without hunting dozens of sites. Creating an OpenTrain account is free and applying takes only a few minutes.

About AI Training Work

AI systems learn from curated human examples and evaluations. This project focuses on the human judgment that shapes biological reasoning in generative models: reviewing responses, spotting errors, and providing feedback that guides models toward greater accuracy and safety.

  • Work is fully remote and typically flexible in schedule.
  • Contributors directly influence model behavior by producing high-quality evaluations and exemplar solutions.

The Role

We are hiring contractors with a PhD in Biology (or closely related life science) to evaluate AI-generated biology answers. You will apply your domain expertise to judge biological correctness, reasoning depth, and clarity, and to produce exemplar explanations.

  • Employment type: Contractor, Part-time.
  • Workload: Under 20 hours/week, with a minimum availability of 17–20 hours/week.
  • Pay: $80 USD per hour.
  • Worldwide applicants accepted.

What You'll Do

Your day-to-day work involves close review and structured rating of model outputs. You will follow detailed rubrics to assign evaluation ratings, write clear critiques, and supply corrected or exemplary responses.

  • Assess biological correctness, reasoning depth, and clarity of AI-generated responses.
  • Identify issues in study design, methodology, statistics, calculations, and interpretation.
  • Fact-check claims and provide precise, referenced corrections when required.
  • Draft exemplar explanations or model solutions with step-by-step reasoning and correct terminology.
  • Rate and compare multiple responses using detailed rubrics (label type: EVALUATION_RATING) on TEXT data.

Requirements

You must meet every qualification listed below to be considered. These requirements ensure evaluations are rigorous, reproducible, and usable for model training.

  • PhD in Biology or a closely related life science; degree from a Top-100 university preferred.
  • Peer-reviewed publication record (first- or co-author).
  • Proven experience creating or critically reviewing complex biological content (protocols, curricula, grant sections, manuscripts, or computational analyses).
  • Breadth across core areas such as molecular/cell biology, genetics/genomics, biochemistry, and physiology.
  • Strong experimental design and statistics literacy; ability to spot methodological flaws.
  • Exceptional scientific writing at C1+ English level with clear, rigorous, stepwise reasoning.
  • Strong fact-checking using reputable public sources and precise referencing when required.
  • Consistent application of evaluation rubrics and meticulous attention to detail and reproducibility.
  • Availability for a minimum of 17–20 hours/week; preferred cadence ~8 hours/day during active sprints.

Preferred Experience

These are not required but will strengthen your application and onboarding success.

  • Prior experience with data labeling, RLHF, or AI model evaluation is a plus.
  • Familiarity with structured rubric-based evaluation workflows and production-quality feedback.

How It Works & Onboarding

Selection includes paid qualification and project exams to ensure evaluators meet the project's standards. The labeling software for this project is listed as OTHER; details and access will be provided during onboarding.

  • Onboarding includes a paid 1–2 hour qualification exam and a paid 1–2 hour project exam.
  • You will work with TEXT data and assign EVALUATION_RATING labels according to detailed rubrics.
  • Assignments are contractor-based and scheduled around project sprints; work may be concentrated into multi-hour days during active periods.

Who Should Apply

Apply if you meet the PhD and publication requirements and enjoy precise, methodical review of scientific reasoning. This role suits subject-matter experts who can convert disciplinary judgment into clear, reproducible feedback for AI training.

  • You are detail-oriented, methodical, and comfortable writing rigorous critiques in professional English.
  • You want flexible, remote, part-time contractor work that directly impacts how AI systems reason about biology.