Position:
SWE Expert
Type:
Hourly contract
Compensation:
$70-$150 per hour
Commitment:
Project-based
Location:
Remote
Role Responsibilities
Translate high-level AI evaluation objectives into structured, testable deliverables with defined inputs, outputs, and success criteria
Create documentation describing expected behavior, constraints, and edge cases for evaluation workflows
Develop lightweight automation scripts that generate artifacts, validate outputs, and enforce formatting requirements
Build deterministic Python verifier scripts that confirm task completion through output or final-state validation
Design prompts and evaluation tasks that reliably trigger intended workflow behavior while preventing instruction leakage
Implement robust error handling and clear failure messaging in verification tooling
Develop negative-control or baseline approaches that test whether evaluation systems correctly distinguish valid solutions from invalid ones
Maintain well-structured, reproducible artifacts with consistent naming and version control practices
Requirements
Strong Python skills including scripting, file system operations, parsing, and deterministic validation logic
Experience with automated evaluation, testing frameworks, or verification workflows
Familiarity with prompt design and evaluation methodologies for large language models
Ability to create structured technical documentation in formats such as Markdown
Experience with developer tooling such as Git, command-line workflows, virtual environments, and dependency management
Understanding of reproducible evaluation practices and deterministic task design
Strong communication skills with the ability to produce clear specifications and controlled project scope
Application Process (takes about 20 minutes)
Upload resume
Interview (15 min)
Submit form