Position:
Senior Software Engineer – Python (LLM Evaluation & Repository Validation)
Type:
Contractor Assignment (3 months)
Compensation:
$19-$20/hr
Location:
Remote
Commitment:
40 hrs/week, with some working-hours overlap with PST
Role Responsibilities
Evaluate and analyze GitHub issues across widely used open-source repositories.
Set up and configure development environments for repositories, including Docker setup.
Modify and run codebases locally to evaluate LLM performance in debugging and bug-fixing scenarios.
Evaluate unit test coverage and assess the quality of existing tests.
Work with real-world Python repositories to identify engineering tasks suitable for LLM evaluation.
Collaborate with researchers to expand datasets used for training and evaluating AI models.
Identify repositories and issues that present challenging real-world scenarios for LLMs.
Support the creation of realistic software engineering tasks based on repository histories.
Requirements
Strong experience in software development.
Strong experience with Python programming.
Proficiency in Git, Docker, and development environment setup.
Experience working with well-maintained public GitHub repositories.
Ability to understand, modify, and test complex codebases.
Strong analytical thinking and problem-solving skills.
Excellent written English and ability to document findings clearly.
Ability to work independently in a remote environment.
Desktop or laptop with a stable internet connection.
Preferred Qualifications
Experience contributing to open-source projects.
Exposure to LLM evaluation, AI research, or machine learning projects.
Experience with developer tools, automation agents, or CI/CD pipelines.
Application Process
Application Form
ICF + Profile Review
Assessment
Submit