Key Responsibilities
Design, implement, and maintain GitLab CI/CD pipelines for automated testing, deployment, and monitoring.
Manage and optimize Kubernetes clusters for scalability, performance, and high availability in production environments.
Architect and support cloud infrastructure on AWS, including networking, security, and cost optimization.
Build and maintain infrastructure as code (IaC) using tools such as Terraform or Ansible.
Develop and manage monitoring and alerting systems to proactively identify and resolve production issues.
Collaborate closely with developers to streamline build, release, and deployment processes.
Drive automation across infrastructure provisioning, testing, and deployment workflows.
Ensure system reliability, scalability, and security by applying DevOps best practices.
Document infrastructure configurations, CI/CD pipelines, and standard operating procedures.
Requirements
3+ years of hands-on production experience in a DevOps or Site Reliability Engineering (SRE) role.
Proven experience managing AWS production environments.
Solid understanding of Kubernetes (deployment, scaling, troubleshooting).
Strong experience in GitLab CI/CD pipeline design and automation.
Proficiency in scripting (e.g. Python, Bash, or JavaScript).
Familiarity with Terraform, Ansible, or similar IaC tools.
Understanding of networking, containerization, and distributed systems.
Experience working in Agile environments.
Preferred Skills
Experience with monitoring tools (e.g. Prometheus, Grafana, ELK Stack).
Exposure to multi-cloud or hybrid deployments.
Knowledge of security and compliance best practices in DevOps.
Strong problem-solving and collaboration skills.