Join our evolving Enterprise Technology team as a Senior Site Reliability Engineer to maintain critical applications and infrastructure.
You will apply DevOps tools and engineering expertise to create sustainable solutions. If you are passionate about building scalable infrastructure and driving continuous improvement, we encourage you to apply.
Responsibilities
Maintain and improve enterprise application infrastructure using DevOps methodologies
Implement and manage CI/CD pipelines to support rapid and reliable software releases
Administer and optimize Kubernetes clusters for scalability and security
Develop automation scripts and tools primarily using Python
Manage cloud infrastructure on Amazon Web Services and Microsoft Azure with a focus on security and access management
Collaborate with software development teams to enhance infrastructure as code using Terraform
Monitor system performance and implement proactive measures to ensure high availability
Handle operational requests and maintenance events efficiently
Analyze and resolve complex technical issues related to infrastructure and deployment
Ensure compliance with security policies and best practices across all systems
Document infrastructure configurations and standard operating procedures
Participate in disaster recovery and business continuity planning
Continuously evaluate new technologies to improve system reliability and performance
Requirements
Minimum of 3 years of experience in Site Reliability Engineering or a similar DevOps role
Expert knowledge of Python programming language
Extensive experience with Amazon Web Services and Microsoft Azure, including APIs, authentication, and serverless technologies
Strong understanding of cloud networking, Kubernetes cluster administration, security, IAM, and configuration automation
In-depth knowledge of CI/CD processes, source control, containers, and infrastructure management using Terraform
Experience with IaaS enablement and enhancement
Proven track record in enterprise-scale software development and release management
Solid understanding of automation principles related to CI/CD and IaaS
Excellent analytical skills and the ability to solve complex problems
Ability to manage operational requests and maintenance events effectively
Strong communication skills with English proficiency at B2+ level