Biology Reasoning Evaluator — PhD in Biology
OpenTrain AI · Remote · Worldwide · Posted Jun 9, 2026
About OpenTrain
OpenTrain aggregates data-labeling and AI-training jobs from many companies and platforms into a single job board so you can discover this work without hunting dozens of sites. Creating an OpenTrain account is free and applying takes only a few minutes.
About AI Training Work
AI systems learn from curated human examples and evaluations. This project focuses on the human judgment that shapes biological reasoning in generative models: reviewing responses, spotting errors, and providing rubric-guided feedback to improve accuracy and safety.
- Work is fully remote and typically flexible in schedule.
- Contributors directly influence model behavior by producing high-quality evaluations and exemplar solutions.
The Role
We are hiring contractors with a PhD in Biology (or closely related life science) to evaluate AI-generated biology answers. You will apply your domain expertise to judge biological correctness, reasoning depth, and clarity, and to produce exemplar explanations.
- Employment type: Contractor, Part-time.
- Workload: under 20 hours/week, with a minimum availability of 17–20 hours/week.
- Pay: $80 USD per hour.
- Worldwide applicants accepted.
What You'll Do
Your day-to-day work involves close review and structured rating of model outputs. You will follow detailed rubrics to assign evaluation ratings, write clear critiques, and supply corrected or exemplary responses.
- Assess biological correctness, reasoning depth, and clarity of AI-generated responses.
- Identify issues in study design, methodology, statistics, calculations, and interpretation.
- Fact-check claims and provide precise, referenced corrections when required.
- Draft exemplar explanations or model solutions with step-by-step reasoning and correct terminology.
- Rate and compare multiple responses using detailed rubrics (label type: EVALUATION_RATING) on TEXT data.
Requirements
You must meet every qualification listed below to be considered. These requirements ensure evaluations are rigorous, reproducible, and usable for model training.
- PhD in Biology or a closely related life science; degree from a Top-100 university preferred.
- Peer-reviewed publication record (first- or co-author).
- Proven experience creating or critically reviewing complex biological content (protocols, curricula, grant sections, manuscripts, or computational analyses).
- Breadth across core areas such as molecular/cell biology, genetics/genomics, biochemistry, and physiology.
- Strong experimental design and statistics literacy; ability to spot methodological flaws.
- Exceptional scientific writing at C1+ English level with clear, rigorous, stepwise reasoning.
- Strong fact-checking using reputable public sources and precise referencing when required.
- Consistent application of evaluation rubrics and meticulous attention to detail and reproducibility.
- Availability for a minimum of 17–20 hours/week; preferred cadence ~8 hours/day during active sprints.
Preferred Experience
These are not required but will strengthen your application and ease onboarding.
- Prior experience with data labeling, RLHF, or AI model evaluation.
- Familiarity with structured rubric-based evaluation workflows and production-quality feedback.
How It Works & Onboarding
Selection includes paid qualification and project exams to ensure evaluators meet the project's standards. The labeling software for this project is listed as OTHER; details and access will be provided during onboarding.
- Onboarding includes a paid 1–2 hour qualification exam and a paid 1–2 hour project exam.
- You will work with TEXT data and assign EVALUATION_RATING labels according to detailed rubrics.
- Assignments are contractor-based and scheduled around project sprints; work may be concentrated into multi-hour days during active periods.
Who Should Apply
Apply if you meet the PhD and publication requirements and enjoy precise, methodical review of scientific reasoning. This role suits subject-matter experts who can convert disciplinary judgment into clear, reproducible feedback for AI training.
- You are detail-oriented, methodical, and comfortable writing rigorous critiques in professional English.
- You want flexible, remote, part-time contractor work that directly impacts how AI systems reason about biology.