Join our team as a Senior Site Reliability Engineer focusing on cloud platform development and DevOps practices.
You will partner with various teams to design, build, and sustain scalable cloud infrastructure, ensuring high performance and dependability. Apply now to make an impact on cloud innovations and operational excellence.
Responsibilities
Partner with cross-disciplinary teams to design and deploy cloud-native solutions
Maintain and enhance cloud infrastructure performance, reliability, and scalability through automation and monitoring
Develop and manage CI/CD workflows for cloud deployments
Participate in refining cloud architecture and operational best practices
Manage incident response using ITSM tools
Create and update technical documentation and automation scripts
Support continuous integration and delivery operations
Requirements
Proven platform engineering experience in microservices environments, including more than 3 years in site reliability engineering
In-depth knowledge of instrumentation, monitoring, alerting, and incident management
Experience with DevOps tools and practices
Strong foundation in software engineering including modular design and automated testing
Proficiency in Python or a scripting language such as Bash or PowerShell
Hands-on experience with Microsoft Azure services such as AKS, ACR, and Virtual Machines, or with equivalent services on a similar cloud platform
Basic understanding of Kubernetes concepts and operations
Familiarity with at least one CI/CD system such as GitHub Actions, GitLab CI, Jenkins, or CircleCI
Excellent communication and teamwork abilities
English language proficiency at B2 level or above
Nice to have
Hands-on experience with Azure DevOps Pipelines
Knowledge of Datadog monitoring tools
Understanding of Splunk analytics
Experience with incident management systems