Join our team as a Site Reliability Engineer specializing in cloud platform development and DevOps.
You will collaborate with cross-functional teams to design, implement, and maintain scalable cloud infrastructure, ensuring optimal performance and reliability. Apply now to contribute to innovative cloud solutions and enhance our operational excellence.
Responsibilities
Collaborate with cross-functional teams to design and implement cloud-based solutions
Ensure optimal performance, reliability, and scalability of cloud infrastructure through proactive monitoring and automation
Implement and maintain CI/CD pipelines for cloud applications
Contribute to improving cloud architecture and best practices
Monitor and manage incident response using ITSM tools
Develop and maintain technical documentation and automation scripts
Support continuous integration and deployment processes
Requirements
Solid experience in platform engineering for microservices-based environments, with 2+ years in site reliability engineering
Strong knowledge of instrumentation, monitoring, alerting, and incident management
Familiarity with DevOps methodologies and tooling
Proficiency in software engineering principles including modular design and automated testing
Hands-on experience with Python or other scripting languages such as Bash or PowerShell
Experience working with Microsoft Azure services, including AKS, ACR, and Virtual Machines, or with equivalent services on other cloud platforms
Basic understanding of Kubernetes concepts and operations
Experience with at least one CI/CD tool such as GitHub Actions, GitLab CI, Jenkins, or CircleCI
Excellent communication and collaboration skills
English proficiency at B2 level or higher
Nice to have
Experience with Azure DevOps Pipelines
Familiarity with Datadog
Knowledge of Splunk
Experience with incident management platforms
We offer
International projects with top brands
Work with global teams of highly skilled, diverse peers
Healthcare benefits
Employee financial programs
Paid time off and sick leave
Upskilling, reskilling and certification courses
Unlimited access to the LinkedIn Learning library and 22,000+ courses
Global career opportunities
Volunteer and community involvement opportunities
EPAM Employee Groups
Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn