LlamaIndex Developers Needed for AI Code Review & Evaluation

OpenTrain AI · Remote · Worldwide · Posted May 23, 2026

About OpenTrain

OpenTrain aggregates data-labeling and AI-training jobs from many companies and labeling platforms into one searchable place. Creating an OpenTrain account is free and applying to listings takes only a few minutes.

About AI Training Work

AI training (data labeling and human feedback) is the human layer that teaches models to behave correctly and usefully. For developer-focused projects like this one, contributors review code, prompts, and model outputs so LLMs learn accurate retrieval and structured-data handling.

The Role

We are hiring an experienced LlamaIndex developer to review and evaluate AI-generated LlamaIndex code, prompts, and explanations. You will label, categorize, and provide detailed, structured feedback on the AI’s outputs to improve retrieval, indexing, and query performance.

This listing requires hands-on LlamaIndex expertise and strong English writing skills; you will also conduct AI-driven interviews to screen other LlamaIndex developer candidates for this evaluation project.

Compensation: $25 USD per hour, paid hourly.
Time commitment: Less than 20 hours per week, part-time contractor work.
Location: Worldwide — fully remote.
Labeling focus: computer programming and coding tasks on source-code data.
Labeling software: a project-specific tool (listed as OTHER).

What You'll Do

Your day-to-day work mixes hands-on labeling and interviewing. You will analyze AI-generated prompts, responses, and code snippets related to LlamaIndex and record structured labels and evaluations that capture accuracy, relevance, and best-practice adherence.

Review AI-generated LlamaIndex prompts, explanations, and code for correctness and completeness.
Label and categorize outputs according to project guidelines and produce structured feedback.
Identify errors, inconsistencies, inefficiencies, and missing best practices in code and explanations.
Assess retrieval quality: whether indexed sources are correctly retrieved, structured, and processed.
Conduct AI-driven interviews that probe candidates' practical LlamaIndex experience and communication skills.
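
To make "structured labels and evaluations" concrete, here is a minimal sketch of what one evaluation record and one retrieval-quality check might look like. The project's actual labeling schema is not described in this listing, so every name below (Evaluation, IssueType, recall_at_k), the 1 to 5 scale, and the field layout are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from enum import Enum


class IssueType(str, Enum):
    # Categories mirror the task list above; the identifiers are hypothetical.
    ERROR = "error"
    INCONSISTENCY = "inconsistency"
    INEFFICIENCY = "inefficiency"
    MISSING_BEST_PRACTICE = "missing_best_practice"


@dataclass
class Evaluation:
    """One reviewer's structured verdict on a single AI-generated output."""
    output_id: str
    accuracy: int        # assumed scale: 1 (wrong) to 5 (fully correct)
    relevance: int       # same assumed 1-5 scale
    best_practice: int   # same assumed 1-5 scale
    issues: list[IssueType] = field(default_factory=list)
    feedback: str = ""

    def __post_init__(self) -> None:
        # Reject scores outside the assumed 1-5 range.
        for name in ("accuracy", "relevance", "best_practice"):
            score = getattr(self, name)
            if not 1 <= score <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {score}")


def recall_at_k(retrieved_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of the relevant sources that appear in the top-k retrieved results."""
    if not relevant_ids:
        return 0.0
    hits = relevant_ids.intersection(retrieved_ids[:k])
    return len(hits) / len(relevant_ids)
```

A reviewer could fill in one Evaluation per output and attach a recall_at_k value when the task includes a known set of relevant sources. Whatever schema the project tool enforces takes precedence over this sketch.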

Requirements

You must demonstrate substantial, practical experience with LlamaIndex and related retrieval systems. Strong English writing and the ability to explain technical issues clearly are mandatory.

At least 5 years of hands-on experience working with LlamaIndex (required by the project).
Deep knowledge of retrieval-augmented generation (RAG), document indexing, and structured-data retrieval.
Experience with vector databases and embedding workflows (examples: FAISS, Pinecone, ChromaDB).
Expertise in query optimization, document chunking strategies, and integrating LlamaIndex with LLMs or LangChain.
Proven ability to analyze and improve AI-generated code and explanations; prior code-review experience preferred.
Availability for under 20 hours/week and willingness to work as a contractor/part-time contributor.
Comfort using a project-specific labeling tool (listed as OTHER) to submit structured labels and feedback.
Strong, clear English communication and documentation skills.
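
As an illustration of the chunking knowledge listed above, the sketch below splits text into fixed-size character windows that overlap, so a sentence cut at one boundary still appears whole in a neighboring chunk. It is a dependency-free stand-in, not LlamaIndex code: the function name chunk_text and its parameters are hypothetical, and LlamaIndex's own splitters (for example SentenceSplitter) work on tokens and sentence boundaries rather than raw characters.

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters sharing `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        # overlap >= chunk_size would make the window stop advancing.
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
# prints ['abcd', 'cdef', 'efgh', 'ghij']
```

Larger overlap preserves more context across boundaries at the cost of more chunks to embed and store, which is the kind of trade-off a reviewer would be expected to comment on.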

Interview Guidelines You'll Follow

As the interviewer for candidates applying to the evaluation project, you will follow structured steps that confirm practical skills and communication. The guidelines below define the topics and checks you must cover during interviews.

Experience Assessment: verify 5+ years with LlamaIndex; ask about real-world RAG, indexing, and structured retrieval projects.
Technical Knowledge Check: present a short LlamaIndex code snippet with a deliberate issue and ask the candidate to identify and fix it.
System Design Questions: ask how they'd structure, optimize, and retrieve data efficiently using LlamaIndex, vector stores, and chunking.
Task Understanding & Evaluation Skills: ask how they'd analyze, correct, and enhance an AI-generated LlamaIndex response; evaluate their critique approach.
Communication Check: require a clear, simple explanation of a complex LlamaIndex feature (e.g., hierarchical retrieval or embedding-based search).
Final Confirmation: confirm prior experience reviewing AI-generated code, willingness to complete structured labeling tasks, and current availability.
Important: proceed only with candidates who demonstrate strong, hands-on LlamaIndex experience; reject vague, purely theoretical answers.
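
For the Technical Knowledge Check, an exercise with a deliberate issue might look like the sketch below. It is a plain-Python stand-in for a retrieval step, not actual LlamaIndex API code, and all names (cosine, top_k_buggy, top_k_fixed, the sample vectors) are hypothetical. In a real interview the candidate would see only the buggy function, without the comment that gives the issue away.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_buggy(query: list[float], docs: dict[str, list[float]], k: int) -> list[str]:
    # Deliberate issue: sorted() is ascending by default, so this returns
    # the k LEAST similar documents instead of the most similar ones.
    ranked = sorted(docs, key=lambda doc_id: cosine(query, docs[doc_id]))
    return ranked[:k]


def top_k_fixed(query: list[float], docs: dict[str, list[float]], k: int) -> list[str]:
    # Fix: rank by similarity in descending order before taking the top k.
    ranked = sorted(docs, key=lambda doc_id: cosine(query, docs[doc_id]), reverse=True)
    return ranked[:k]


# Toy two-dimensional "embeddings" for three documents.
docs = {"install": [1.0, 0.0], "pricing": [0.0, 1.0], "setup": [0.9, 0.1]}
query = [1.0, 0.0]

print(top_k_buggy(query, docs, 2))  # prints ['pricing', 'setup']
print(top_k_fixed(query, docs, 2))  # prints ['install', 'setup']
```

A strong candidate should spot the sort direction, explain why the wrong documents come back, and propose the one-line fix; a vague answer about "tuning the retriever" would not meet the bar set above.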

How to Apply

Apply through OpenTrain by creating a free account and submitting your profile. Applications usually take only a few minutes.

Be prepared to demonstrate past LlamaIndex work, provide concrete examples of RAG projects, and complete sample evaluation tasks during the interview and onboarding.

Pay: $25/hour (USD) as a contractor; part-time engagement under 20 hours/week.
Project scope: labeling and evaluating AI-generated LlamaIndex code and conducting evaluation interviews.
Worldwide candidates accepted; ensure you can meet the English communication requirement.