We are seeking a Machine Learning Engineer to join our team and support the GenAI initiative. In this role, you will focus on designing, improving, and optimizing backend infrastructure to power LLM-based applications using OpenAI APIs. Your skills in MLOps, CI/CD, observability, and cloud-native technologies will be essential to ensure the reliability, scalability, and efficiency of AI-driven systems.
Responsibilities
Develop and improve backend infrastructure for AI and LLM-based solutions
Integrate and oversee LLM applications within cloud environments
Scale AI systems to meet performance and reliability requirements
Implement automated deployment processes through CI/CD pipelines
Track and maintain the performance of AI services to ensure consistency
Establish logging and observability frameworks for monitoring LLM API performance
Collaborate with DevOps teams to streamline workflows and enhance system dependability
Work closely with AI and Data Science teams to develop and enhance application features
Leverage cloud platforms, especially Azure, to deploy and scale AI-powered applications
Design and build APIs and microservices to support AI-driven functionalities
Requirements
At least 2 years of experience in Machine Learning Engineering with a focus on backend and software development
Strong expertise in integrating and working with OpenAI APIs and other AI services
Hands-on experience with MLOps tools such as Orion, ArgoCD, and Opsera for deployment automation
Proficiency with monitoring and observability tools, including Grafana, Dynatrace, and ThoughtSpot
Comprehensive knowledge of cloud platforms, particularly Azure, as well as Apache Spark and Databricks
Advanced Python programming skills for backend development and implementation
Proven experience in designing and building APIs and microservices architecture
Fluency in English, both verbal and written, with a minimum proficiency level of B2+
Nice to have
Knowledge of Data Science principles and workflows
Experience with Large Language Models (LLMs)
Understanding of Natural Language Processing (NLP) methodologies and applications
We offer
Career plan and real growth opportunities
Unlimited access to LinkedIn Learning solutions
Constant training, mentoring, online corporate courses, eLearning and more
English classes with a certified teacher
Support for employees' initiatives (algorithms club, Toastmasters, agile club and more)
Enjoyable working environment (gaming room, napping area, amenities, events, sports teams and more)
Flexible work schedule and dress code
Collaborate in a multicultural environment and share best practices from around the globe
Hired directly by EPAM & 100% under payroll
Statutory benefits (IMSS, INFONAVIT, 25% vacation bonus)
Insurance: life and major medical expenses, with dental & vision coverage (for the employee and direct family members)
13% employee savings fund, capped at the legal limit
Grocery coupons
30-day December bonus
Employee Stock Purchase Plan
12 vacation days
Official Mexican holidays, plus 5 extra holidays (Maundy Thursday and Good Friday, November 2nd, December 24th & 31st)
Monthly non-taxable amount for the electricity and internet bills
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
By applying to our role, you are agreeing that your personal data may be used as set out in EPAM's Privacy Notice and Policy.