Job description
We are looking for a DevOps Engineer with over 2 years of experience to help design, operate, and scale our hybrid infrastructure (on-premises and AWS Cloud).
The ideal candidate is a systems thinker with strong skills in automation, Kubernetes, and CI/CD, who takes ownership of reliability, cost-efficiency, and continuous improvement.
Key Responsibilities
1. Infrastructure Management (40%)
* Maintain Proxmox/VMware production and non-production environments.
* Build and operate Kubernetes and HPC/Slurm clusters.
* Perform security patching and hardening (ISMS/ISO 27001).
* Back up and restore data.
* Monitor infrastructure (Prometheus, Grafana, Checkmk, ...).
2. AWS Infrastructure & Cost Optimization (20%)
* Operate AWS production workloads (EC2, EKS, RDS, S3, IAM, Batch).
* Analyze usage and apply cost optimization strategies (Savings Plans, Reserved Instances).
* Manage S3 lifecycle, EBS cleanup, and resource right-sizing.
* Support capacity and resource planning for future growth.
3. CI/CD & Automation (20%)
* Build and maintain CI/CD pipelines (GitLab CI, ArgoCD).
* Implement GitOps practices.
* Build self-service tools for developers.
4. Documentation & Collaboration (20%)
* Support team members with infrastructure-related requests.
* Maintain runbooks, SOPs, and architecture documentation.
Your skills and experience
Must-have skills:
* 2+ years with AWS (EC2, EKS, RDS, S3, IAM, VPC).
* 2+ years with Docker and Kubernetes/Rancher (EKS or on-prem).
* Experience with Terraform/CloudFormation, GitLab CI/CD, and ArgoCD.
* Strong background in Linux, scripting, and monitoring tools (Prometheus/Grafana).
* Knowledge of security, IAM, and cost management best practices.
Nice to have:
* Experience with VMware/Proxmox/HPC.
* Familiarity with Prometheus, Grafana, or the ELK Stack.
* Background in SysOps or on-premises infrastructure management.
Why you'll love working here
Company-provided laptop
A fast-paced, engineering-driven environment focused on automation and reliability
12 days annual leave
Bonus of at least one month’s salary per year
Insurance coverage