Scope of Work
Design, build, and maintain CI/CD pipelines using Jenkins, GitLab CI, and ArgoCD to streamline integration and deployment processes.
Manage and optimize cloud infrastructure on Amazon Web Services (AWS) and Google Cloud Platform (GCP).
Build and maintain infrastructure as code (IaC) using Terraform.
Deploy, configure, and maintain Kubernetes clusters, ensuring high availability and scalability.
Implement and maintain monitoring systems using Prometheus, Grafana, and related observability tools.
Set up and manage centralized logging systems using ELK (Elasticsearch, Logstash, Kibana) or equivalent stacks.
Proactively monitor system health, performance, and capacity to ensure uptime and reliability.
Troubleshoot and resolve application, infrastructure, and networking issues across multiple environments.
Collaborate with developers to ensure system design follows best practices for reliability, performance, and security.
Automate operational tasks through scripting.
Recruitment Requirements
At least 5 years of experience.
Proven experience with the AWS and GCP cloud platforms.
Hands-on experience with CI/CD tools such as Jenkins, GitLab CI, and ArgoCD.
Strong knowledge of Kubernetes for container orchestration.
Experience with monitoring and alerting tools, particularly Prometheus and Grafana, or similar.
Experience with logging and analysis systems using the ELK stack (Elasticsearch, Logstash, Kibana).
Solid understanding of Linux systems administration and network troubleshooting.
Strong problem-solving skills and ability to perform root-cause analysis.
Strong experience with Terraform for Infrastructure as Code (IaC).
Excellent collaboration and communication skills in English.
Preferred Qualifications
Experience working in production-grade Kubernetes environments.
Knowledge of service meshes, cloud networking, or security best practices.
Background in SRE principles (SLIs, SLOs, SLAs, and error budgets).
Scripting proficiency in Python, Bash, or Go.