Senior SRE Engineer - Cloud Operations

Qdrant • 🌐 Remote

Posted 1 day, 3 hours ago

Job Description

Qdrant is a cutting-edge vector database company on a mission to revolutionize how organizations manage and query unstructured data. Our open-source engine and managed cloud solutions power AI-driven search, recommendation, and data discovery at scale. We are a remote-first company, building a global team of passionate engineers to push the boundaries of database infrastructure.

As a Senior DevOps / SRE Engineer on the Cloud Operations team, you will focus on keeping Qdrant Cloud reliable, observable, and secure as usage and infrastructure complexity grow. Your primary responsibility is operational excellence: stability, incident response, and continuous improvement of production systems.

This role is operations-heavy and suits engineers who thrive on owning reliability and reducing operational risk at scale.

Tasks

Operate and maintain production cloud infrastructure at scale

Own Kubernetes infrastructure, networking, and deployment pipelines

Improve monitoring, logging, alerting, and operational visibility

Lead incident response, root cause analysis, and follow-up actions

Reduce operational toil through automation and better tooling

Improve reliability, security, and performance of production systems

Collaborate closely with Platform and Regions & Clusters teams

Maintain and evolve runbooks, operational procedures, and alerts

Participate in on-call rotations and continuous reliability improvements

Requirements

Must have

5+ years of experience in DevOps, SRE, or infrastructure operations roles

Strong hands-on experience operating Kubernetes in production

Solid knowledge of Linux systems, networking, and cloud infrastructure

Experience working with AWS, GCP, or Azure

Strong understanding of monitoring, alerting, and incident management

Experience with infrastructure-as-code and automation tooling

Comfortable owning on-call responsibilities and production incidents

Strong operational mindset and clear communication skills

Nice to have

Experience with Terraform or similar IaC tools

Familiarity with Prometheus, Grafana, Loki, or OpenTelemetry

Exposure to security, compliance, or hardening initiatives

Scripting experience in Python, Bash, or Go

Experience in SaaS, cloud, or data infrastructure environments

Benefits

Competitive salary, equity, and benefits

Fully remote setup with flexible working hours

Clear ownership of reliability and operational excellence

Opportunity to work on mission-critical customer-facing infrastructure

Strong collaboration with platform and engineering teams

If you enjoy keeping complex systems reliable and improving operations through automation and discipline, we’d love to hear from you.

Recruiting Agencies and Headhunters, please only via . ?ref\=qdrant
