Backend Engineer (Python / JavaScript) | $15/hr Remote

Crossing Hurdles · Posted 2026-04-25

Position: SwarmBench Task Engineer SWE / Code
Type: Short-Term Contract (4 weeks)
Compensation: $15 per hour
Location: Remote
Commitment: 20-40 hours per week, with 4 hours of overlap with PST

Role Responsibilities
- Build multi-agent benchmark tasks based on real-world open-source code changes such as bug fixes, migrations, and refactors
- Work with the Harbor evaluation framework to run and validate tasks inside Docker environments
- Write clear and precise task instructions specifying file paths, function signatures, expected behavior, and constraints
- Design and implement Python-based verification scripts to validate the correctness of agent-generated code changes
- Create decomposition strategies that split complex code changes across multiple independent sub-agents
- Run, debug, and refine tasks within containerized environments to ensure reproducibility and determinism
- Evaluate task performance signals and improve task quality, clarity, and difficulty
- Contribute to benchmark development for advanced AI coding agents

Requirements
- Strong, multi-year experience in Python and JavaScript development
- Experience with AI coding benchmarks (e.g., SWE-bench, Terminal-Bench)
- Strong experience reading and navigating large open-source codebases (e.g., Django, Flask, FastAPI, Node.js, or similar)
- Familiarity with Git workflows, including pull requests, diffs, cherry-picking, and working with specific commits
- Comfort with Docker, including writing Dockerfiles, building images, and debugging container issues
- Experience writing test scripts using pytest, unittest, or custom assertion-based testing
- Ability to write clear, precise, and unambiguous technical specifications
- Ability to work independently in a remote environment

Application Process
1. Apply (or Easy Apply) and check your email for the application form
2. Fill out the Google form
3. Assessment link (after shortlisting; to be completed within 24 hours)
