Vibecode Specialist (Web Scraping & Data Extraction)

OpenTrain AI · Remote · Worldwide · Posted Jun 9, 2026

About OpenTrain

OpenTrain is a central job board for AI-training and data-labeling work. We aggregate roles across many AI companies and labeling platforms so you can find remote, flexible projects in one place.

Creating an OpenTrain account is free, and applying for positions here takes only a few minutes.

About AI training and this work

AI systems learn from examples prepared and reviewed by people. In this space, developers and data specialists produce high-quality training datasets that make models more accurate, robust, and safe.

This role sits at the intersection of software engineering and data preparation: your scrapers and validations will feed fine-tuning and evaluation pipelines used to improve model behavior.

The role — Vibecode Specialist (Web Scraping & Data Extraction)

You will own end-to-end scraping workflows that extract structured data from complex websites and deliver clean datasets (CSV, JSON, Google Sheets). The work combines hands-on Python scraping, use of internal tools, and quality control in a hybrid AI+human setup.

This is a remote, contract, part-time opportunity requiring 20+ hours per week. The rate is $20 USD per hour. Candidates from any country may apply.

Work type: Contractor, Part-time
Time commitment: 20+ hours/week
Pay: $20 USD per hour
Location: Global — any location (remote)

What you'll do

Build resilient scraping workflows and deliver accurate, normalized structured outputs for downstream AI training and evaluation.

Design and implement Python-based scrapers using tools like BeautifulSoup plus Selenium/Playwright or equivalents.
Scrape dynamic/JS-heavy sites, handling infinite scroll, AJAX, and JS-rendered content reliably.
Extract data across multi-level/hierarchical site structures (e.g., category → entity → details).
Validate, clean, normalize, and format outputs as CSV, JSON, or Google Sheets per specification.
Use a mix of internal tools (Apify, OpenRouter) and your own scripts, integrating AI agents where they speed up repetitive steps.
Implement batching, parallelization, or other scaling strategies for large jobs and troubleshoot failures.
Document edge cases, selector fallbacks, retry logic, and any changes in site structure for reviewers.
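
To give a sense of the validate, clean, normalize, and format step listed above, here is a minimal sketch using only the Python standard library. The field names (category, name, price), the sample rows, and the helper names are hypothetical and chosen for illustration; actual project specs define their own schema and rules.

```python
import csv
import io
import json

# Hypothetical field spec for illustration; real projects define their own.
REQUIRED_FIELDS = ("category", "name", "price")


def normalize_record(raw):
    """Trim and collapse whitespace, then parse the price into a float.

    Returns None when a required field is missing/empty or the price
    cannot be parsed, so the caller can log it as an edge case.
    """
    cleaned = {}
    for field in REQUIRED_FIELDS:
        value = raw.get(field)
        if value is None or not str(value).strip():
            return None
        cleaned[field] = " ".join(str(value).split())
    try:
        cleaned["price"] = float(cleaned["price"].replace("$", "").replace(",", ""))
    except ValueError:
        return None
    return cleaned


def to_csv(records):
    """Serialize normalized records to a CSV string with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=REQUIRED_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records):
    """Serialize normalized records to a JSON string."""
    return json.dumps(records, indent=2)


# Made-up sample rows: one valid, two that fail validation.
raw_rows = [
    {"category": " Books ", "name": "Clean   Code", "price": "$1,234.50"},
    {"category": "Books", "name": "", "price": "10"},              # empty name
    {"category": "Books", "name": "Refactoring", "price": "n/a"},  # bad price
]
records = [r for r in (normalize_record(row) for row in raw_rows) if r is not None]
```

In this sketch, rejected rows are simply dropped; in practice you would likely record them alongside the reason so reviewers can see which edge cases occurred.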

Requirements

You must meet all of the following essential qualifications.

At least 1 year of experience in one or more of the following: web scraping, data engineering, software development, automation, or data analysis.
Strong Python web scraping skills (e.g., BeautifulSoup + Selenium/Playwright or equivalents).
Proven experience scraping dynamic/JS-heavy sites (infinite scroll, AJAX, JS-rendered content).
Experience extracting from multi-level/hierarchical site structures (category → entity → details).
Ability to handle changing site structures and implement resilient scraping strategies (selectors, fallbacks, retries).
Ability to clean, normalize, and validate scraped data and deliver structured formats (CSV, JSON, Google Sheets).
Experience with batching/parallelization or equivalent approaches for scaling scraping jobs.
Familiarity with using LLMs/AI tools to accelerate workflows (prompting, automation, extraction assistance).
English level B2+ (upper-intermediate or higher) with ability to follow detailed specs and document edge cases clearly.

Who should apply

This role is a fit for mid-level engineers and data specialists who enjoy hands-on scraping and delivering production-quality datasets for ML pipelines.

Web scrapers, data engineers, automation engineers, or developers with practical scraping experience.
People comfortable combining custom scripts with third-party tooling and AI assistants.
Applicants who can work independently on remote, contractor engagements and communicate issues clearly in English.

How it works and next steps

Create a free OpenTrain account and submit your application — applying takes only a few minutes. If selected, you'll receive instructions for any technical checks or onboarding tasks.

During work, you'll follow detailed project specs, deliver structured files, and document edge cases and scraping logic for reviewers and downstream teams.