Bilingual AI Safety Data Evaluator (English/Spanish C1+)
OpenTrain AI · Remote · Worldwide · Posted Apr 3, 2026
About OpenTrain
OpenTrain is a central job board for AI-training and data-labeling work. We gather openings from many companies and labeling platforms so you can discover roles like this one without searching dozens of sites.
Creating an OpenTrain account is free and applying takes only a few minutes.
About AI training work
AI training (data labeling or human feedback work) is the human side of building modern AI: people annotate, evaluate, and correct model outputs so systems learn to be accurate, safe, and useful.
This job focuses on safety evaluation and red‑teaming for large language models, where your decisions and written rationales directly shape how models respond to sensitive or adversarial inputs.
The Role
You will review AI-generated text in Spanish and English, evaluate the quality of safety and reasoning decisions, and provide detailed annotations and feedback. This is a remote, hourly-paid contractor position.
Work includes labeling and quality-checking safety data, performing red‑teaming to uncover edge cases, and applying nuanced policy judgments to detect and mitigate risky or unsafe model responses.
- Employment type: Contractor (remote, worldwide).
- Pay: USD 20 per hour (reported range: USD 14–24 per hour).
- Data type: Text. Label types: Evaluation rating, RLHF, Text generation.
- Labeling software: Other.
What You'll Do
Provide clear, reproducible rationales for safety and moderation decisions in both Spanish and English. Your annotations will be used to train and evaluate model behavior.
Identify adversarial prompts and edge cases, recommend mitigations, and help improve policy coverage and model robustness through red‑teaming.
- Label AI outputs for safety, accuracy, logic, and explanation quality.
- Quality-check and audit safety datasets to ensure consistency and completeness.
- Perform adversarial testing / red‑teaming to surface tricky or harmful model behaviors.
- Apply nuanced policy judgments across multilingual content, preserving meaning and severity across English and Spanish.
Requirements
You must meet every requirement listed below, except where an item is marked as preferred; no substitutions are accepted unless the hiring team explicitly notes otherwise.
Work may involve explicit, toxic, violent, sexual, or psychologically disturbing content; emotional resilience and comfort with such material are required.
- Near-native or native Spanish proficiency in reading and writing.
- Minimum C1 English proficiency in reading and writing.
- Bachelor’s degree or higher in Communications, Linguistics, Psychology, Law/Policy, Security Studies, or equivalent professional experience.
- 5+ years professional experience in Trust & Safety, content moderation, policy operations, risk, compliance, investigations, or related safety roles.
- Proven LLM red‑teaming or adversarial testing experience, including identifying edge cases and recommending mitigations.
- Strong knowledge of safety domains: hate and harassment, sexual content, self-harm, violence, bias, illegal goods/services, malicious activities, malicious code, and misinformation.
- Experience applying policy guidelines consistently across multilingual or cross-cultural content, especially Spanish and English.
- Localization or translation experience preferred, with ability to preserve meaning, severity, and intent across languages.
- Strong analytical writing skills; able to produce clear, reproducible rationales for moderation and safety decisions.
Who Should Apply
Apply if you have substantial hands-on experience in Trust & Safety or moderation and a track record of working with LLMs or in policy-driven content review.
This role suits professionals who can make consistent, well-documented safety judgments in two languages and who are comfortable testing model behavior under adversarial conditions.
- Intermediate experience level: experienced practitioners with domain expertise.
- Ideal for bilingual (Spanish/English) safety specialists, policy analysts, moderators, or red‑teamers.
How It Works
OpenTrain lists this contractor opportunity so you can apply quickly; creating an account on OpenTrain is free and applications take only a few minutes.
If selected, you will work remotely and be paid hourly as a contractor. Specific schedules and onboarding details will be provided by the hiring team after application.
- Worldwide applicants are accepted; the role is remote.
- Pay is hourly: USD 20 per hour reported, with a reported range of USD 14–24 per hour.
- Expect text-based annotation tasks focused on evaluation, RLHF, and text generation.