PDF Structuring & Annotation (C1/C2 English, JSON/HTML Experience)
OpenTrain AI · Remote · Worldwide · Posted Mar 5, 2026
About OpenTrain
OpenTrain is a central job hub that gathers AI-training and data-labeling opportunities from many companies and platforms into one place. Creating an OpenTrain account is free, and applying takes only a few minutes.
We make it easy to discover short-term, remote annotation projects so you can find flexible work that fits your schedule.
About AI Training Work
AI training (also called data labeling or annotation) is the human effort that helps machine-learning models learn structure, meaning, and safe behavior. Tasks include annotating document layout, transcribing text, describing images, and converting content into structured formats.
This kind of work is commonly remote, flexible, and accessible to people with attention to detail and basic technical skills.
- Work from anywhere with an internet connection.
- Choose flexible hours on most projects.
- Many projects need no prior experience; specialist tasks pay more for domain knowledge.
The Role
We are seeking detail-oriented freelancers with strong English (C1/C2) and basic HTML/JSON experience to annotate PDF documents. You will create structured representations of each PDF’s content and layout using bounding boxes and exact text transcription.
This is an entry-level, part-time contractor role open worldwide. The label types are BOUNDING_BOX and TEXT_GENERATION, and the work is done in labeling software provided by the client (listed as OTHER).
- Employment type: Contractor, Part-time
- Experience level: Entry level
- Data type: DOCUMENT
- Label types: BOUNDING_BOX, TEXT_GENERATION
- Labeling software: OTHER
- Location: Worldwide (remote)
What You'll Do
Your core task is transforming PDF pages into a clean, machine-readable structure that captures layout, hierarchy, and content exactly as shown.
- Draw bounding boxes around elements such as headings, paragraphs, figures, tables, and images.
- Categorize elements (e.g., section header, text block, figure, chart, image).
- Assign hierarchy (section vs. subsection) and other structural roles.
- Transcribe all visible English text exactly as it appears.
- Describe images and figures in plain English.
- Convert tables into well-formatted JSON arrays representing rows, columns, and cells.
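To make the tasks above concrete, here is a minimal Python sketch of how one annotated page might be represented. Everything in it is an assumption made for illustration: the posting does not publish the client's schema, so the field names (category, bbox, text, level), the category labels, the coordinate convention, and the sample text are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional, Tuple


# Hypothetical record for one annotated element. The client's labeling
# tool defines the real schema; these field names are illustrative only.
@dataclass
class Element:
    category: str                             # e.g. "section_header", "text_block", "figure"
    bbox: Tuple[float, float, float, float]   # assumed order: left, top, right, bottom
    text: str                                 # exact transcription, or a plain-English description
    level: Optional[int] = None               # 1 = section, 2 = subsection, None = not a heading

    def __post_init__(self):
        left, top, right, bottom = self.bbox
        # A box with zero or negative width/height cannot enclose anything.
        if not (left < right and top < bottom):
            raise ValueError(f"degenerate bounding box: {self.bbox}")


# Made-up page content, used only to show the shape of the data.
page = [
    Element("section_header", (72, 90, 540, 112), "1. Introduction", level=1),
    Element("text_block", (72, 120, 540, 300), "AI adoption varies by industry."),
]

page_json = json.dumps([asdict(e) for e in page], indent=2)
print(page_json)
```

The check in `__post_init__` reflects one design choice: rejecting an impossible box at creation time is cheaper than finding it during review.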
Requirements
You must meet the essential qualifications below to be eligible for this project.
- C1/C2 level English proficiency is required. You will need to provide an official document certifying this after you apply and before you start the project.
- Experience with document annotation, transcription, or structured data creation.
- Familiarity with document structures (sections, subsections, headers, etc.).
- Ability to accurately draw bounding boxes and label elements.
- Basic knowledge of HTML or JSON formatting and experience converting tables into JSON arrays.
- Ability to transcribe text exactly as it appears, and to identify and describe figures, charts, and images.
Test Task (Required During Interview)
During the live chat interview you will be asked to complete a short JSON test task and submit your JSON response before the interview finishes. Your answer must match the requested structure exactly.
- Test Task: Table Annotation (JSON Format)
- Convert the sample table below into a JSON structure with rowCount, columnCount, and a cells array.
- Sample table to convert (Table 1: AI Adoption Rates by Industry):
  Industry   | Adoption Rate
  Healthcare | 76%
  Finance    | 65%
  Retail     | 48%
- Cells must include: kind ("columnHeader" or "bodyCell"), rowIndex (0 for header), columnIndex (0-based), content (text).
- Your JSON must be clean, structured, and match the format described above exactly. You must submit this JSON before you complete the live chat interview.
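As an illustration of the structure described above, the following Python sketch builds the JSON for the sample table. It rests on assumptions the posting does not settle: rowCount is taken to include the header row (4 for the sample), the table title is left out because no field is named for it, and the helper name table_to_json is hypothetical. Treat it as one plausible reading, not the client's reference answer.

```python
import json


def table_to_json(header, rows):
    """Build the rowCount / columnCount / cells structure from a header and body rows."""
    cells = []

    # Header cells sit in row 0 and use kind "columnHeader".
    for column_index, text in enumerate(header):
        cells.append({
            "kind": "columnHeader",
            "rowIndex": 0,
            "columnIndex": column_index,
            "content": text,
        })

    # Body rows start at rowIndex 1 because row 0 is the header.
    for row_index, row in enumerate(rows, start=1):
        if len(row) != len(header):
            raise ValueError(f"row {row_index} has {len(row)} cells, expected {len(header)}")
        for column_index, text in enumerate(row):
            cells.append({
                "kind": "bodyCell",
                "rowIndex": row_index,
                "columnIndex": column_index,
                "content": text,
            })

    return {
        "rowCount": len(rows) + 1,   # assumption: the header counts as a row
        "columnCount": len(header),
        "cells": cells,
    }


table = table_to_json(
    ["Industry", "Adoption Rate"],
    [["Healthcare", "76%"], ["Finance", "65%"], ["Retail", "48%"]],
)
print(json.dumps(table, indent=2))
```

Cell content is kept as text ("76%" rather than the number 76), in line with the requirement to transcribe exactly what appears in the document.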
Compensation & Schedule
Pay is USD 7.00 per hour. This is a part-time project with an expected workload of less than 20 hours per week. You will be engaged as a contractor.
- Payment type: Pay per hour
- Hourly rate: USD 7.00
- Time requirement: Less than 20 hours/week
- Work setting: Remote, worldwide
How It Works
Apply on OpenTrain by creating a free account and submitting the short application. If selected, you will have a live chat interview where you must complete the JSON test task described above.
After the interview and before starting, you must provide proof of C1/C2 English proficiency as stated in the requirements.
- Create a free OpenTrain account and apply (takes only a few minutes).
- If invited, join a live chat interview and submit the required JSON test response during that chat.
- Provide the required English certification after applying and before beginning work.
- You will be engaged as a part-time contractor and paid hourly at the listed rate.
Who Should Apply
This role is a good fit if you have strong written English, an eye for layout and hierarchy, and basic experience with HTML or JSON. Prior annotation or transcription experience is helpful but not mandatory.
If you are detail-focused, comfortable transcribing text exactly, and able to convert table data into structured JSON, please apply.
- Good candidates: document annotators, transcribers, content-structure specialists, and people with basic web/data formatting experience.
- Not required: advanced coding skills or specialized domain knowledge.