Biology Reasoning Evaluator — PhD in Biology

OpenTrain AI · Remote · Worldwide · Posted Jun 9, 2026

About OpenTrain

OpenTrain aggregates data-labeling and AI-training jobs from many companies and platforms into a single job board, so you can find this work without searching dozens of sites. Creating an OpenTrain account is free, and applying takes only a few minutes.

About AI Training Work

AI systems learn from curated human examples and evaluations. This project focuses on the human judgment that shapes biological reasoning in generative models: reviewing responses, spotting errors, and providing feedback that improves model accuracy and safety.

Work is fully remote and typically flexible in schedule.
Contributors directly influence model behavior by producing high-quality evaluations and exemplar solutions.

The Role

We are hiring contractors with a PhD in Biology (or closely related life science) to evaluate AI-generated biology answers. You will apply your domain expertise to judge biological correctness, reasoning depth, and clarity, and to produce exemplar explanations.

Employment type: Contractor, Part-time.
Workload: Under 20 hours/week, with a minimum availability of 17–20 hours/week.
Pay: $80 USD per hour.
Worldwide applicants accepted.

What You'll Do

Your day-to-day work involves close review and structured rating of model outputs. You will follow detailed rubrics to assign evaluation ratings, write clear critiques, and supply corrected or exemplary responses.

Assess biological correctness, reasoning depth, and clarity of AI-generated responses.
Identify issues in study design, methodology, statistics, calculations, and interpretation.
Fact-check claims and provide precise, referenced corrections when required.
Draft exemplar explanations or model solutions with step-by-step reasoning and correct terminology.
Rate and compare multiple text responses against detailed rubrics, assigning an evaluation rating to each.

Requirements

You must meet every qualification listed below to be considered, except where an item is marked as preferred. These requirements ensure evaluations are rigorous, reproducible, and usable for model training.

PhD in Biology or a closely related life science; degree from a Top-100 university preferred.
Peer-reviewed publication record (first- or co-author).
Proven experience creating or critically reviewing complex biological content (protocols, curricula, grant sections, manuscripts, or computational analyses).
Breadth across core areas such as molecular/cell biology, genetics/genomics, biochemistry, and physiology.
Strong experimental design and statistics literacy; ability to spot methodological flaws.
Exceptional scientific writing at C1+ English level with clear, rigorous, stepwise reasoning.
Strong fact-checking using reputable public sources and precise referencing when required.
Consistent application of evaluation rubrics and meticulous attention to detail and reproducibility.
Availability for a minimum of 17–20 hours/week; preferred cadence ~8 hours/day during active sprints.

Preferred Experience

These are not required, but they will strengthen your application and make onboarding smoother.

Prior experience with data labeling, RLHF, or AI model evaluation.
Familiarity with structured rubric-based evaluation workflows and production-quality feedback.

How It Works & Onboarding

Selection includes paid qualification and project exams to ensure evaluators meet the project's standards. The labeling software for this project is not one of the standard listed tools; details and access will be provided during onboarding.

Onboarding includes a paid 1–2 hour qualification exam and a paid 1–2 hour project exam.
You will work with text data and assign evaluation ratings according to detailed rubrics.
Assignments are contractor-based and scheduled around project sprints; work may be concentrated into multi-hour days during active periods.

Who Should Apply

Apply if you meet the PhD and publication requirements and enjoy precise, methodical review of scientific reasoning. This role suits subject-matter experts who can convert disciplinary judgment into clear, reproducible feedback for AI training.

You are detail-oriented, methodical, and comfortable writing rigorous critiques in professional English.
You want flexible, remote, part-time contractor work that directly impacts how AI systems reason about biology.