The DevOps Lead is responsible for defining the strategy, governance, and execution of all DevOps practices within the organization. This role combines technical leadership, operational excellence, and strategic vision, ensuring scalability, automation, and reliability across the entire software delivery lifecycle. The DevOps Lead manages a multidisciplinary team and collaborates closely with the Development, Operations, and AI Engineering groups to drive modernization, automation, and innovation across cloud and on-premises environments.
Key Responsibilities
Leadership & Strategy
Define and implement the DevOps strategy, ensuring alignment with business and technology goals.
Lead, mentor, and grow the DevOps engineering team, promoting a culture of automation, continuous improvement, and operational excellence.
Define best practices and governance for CI/CD, cloud infrastructure, observability, and security.
Coordinate with senior management and project stakeholders to ensure DevOps initiatives contribute to the broader digital transformation strategy.
Automation & Delivery
Oversee the design and optimization of CI/CD pipelines for continuous integration, testing, and deployment across multiple products and platforms.
Implement Infrastructure-as-Code for standardized, reproducible environments.
Drive automation across configuration management, testing, and monitoring.
Cloud & Platform Engineering
Supervise cloud operations, ensuring reliability, scalability, and cost efficiency.
Guide the transition toward cloud-native architectures and container orchestration (Kubernetes, Docker).
Ensure the integration of AI/ML workloads (MLOps) within the company’s infrastructure, supporting model deployment and lifecycle management.
Reliability & Observability
Define and maintain SLAs/SLOs, ensuring optimal system reliability and uptime.
Implement monitoring, logging, and observability frameworks (Prometheus, Grafana, ELK) in collaboration with Operations.
Lead incident management and root-cause analysis, ensuring post-mortem documentation and continuous improvement.
Security & Compliance
Enforce DevSecOps principles, integrating security checks within CI/CD and IaC pipelines.
Collaborate with IT Security to ensure compliance with industry standards (ISO 27001, SOCx, GDPR, etc.).
Requirements
Required Qualifications
Proven experience (5+ years) in DevOps or Site Reliability Engineering roles, including leadership or senior responsibilities.
Strong technical background in the following areas (or comparable tools):
- Automation tools: Jenkins, GitLab CI/CD, Ansible, n8n, Airflow.
- Containers & Orchestration: Docker, Kubernetes, Nomad, Consul.
- Cloud platforms: AWS, Azure, GCP (at least one required).
- IaC: Terraform, CloudFormation.
- Monitoring & Logging: Prometheus, Grafana, ELK.
- Programming/Scripting: Python, Bash, Go, Java or Node.js.
Solid understanding of networking, system design, and distributed architectures.
Soft Skills
Strong leadership, team-building, and mentoring abilities.
Excellent communication and collaboration across technical and non-technical teams.
Analytical mindset and problem-solving skills in complex environments.
Strategic thinking and decision-making capability.
Proactive attitude and passion for innovation in cloud, automation, and AI.
Preferred Qualifications
Experience in multi-cloud strategies or hybrid environments.
Knowledge of AI/ML platform operations.
Certifications such as:
- AWS Solutions Architect / DevOps Engineer
- Azure DevOps Expert
- Google Cloud Professional Engineer