Role: Site Reliability Engineer (SRE) – Database Services
Location: Open to LATAM
About the Role
We are looking for a Site Reliability Engineer (SRE) to join the Database Engineering team and contribute to the reliability, resilience, and automation of mission-critical PostgreSQL environments.
This role is ideal for an SRE who wants to grow into database operations and work closely with senior DBREs.
Key Responsibilities
Maintain, execute, and improve runbooks and standard operating procedures (SOPs).
Troubleshoot complex production issues and collaborate with engineering teams for long-term fixes.
Build automation, tooling, and monitoring to eliminate manual database operations.
Investigate incidents, perform root-cause analysis, and deliver preventive solutions.
Learn PostgreSQL and progressively take on database operations tasks.
Participate in the PostgreSQL on-call rotation after gaining sufficient expertise.
Minimum Qualifications
Excellent written and verbal communication skills.
5+ years of experience operating large-scale SaaS platforms.
Strong Linux fundamentals.
Hands-on production experience with Docker (building images).
Experience running workloads on Kubernetes or similar container orchestration systems.
Solid understanding of CI/CD (Jenkins, ArgoCD preferred).
Preferred Qualifications
Experience with relational databases (PostgreSQL preferred).
Strong interest in learning PostgreSQL internals.
Automation/tooling development in Python or Go.
Cloud experience (OCI, AWS, GCP, Azure).
What We Offer
Opportunity to work with highly complex, large-scale database systems.
Mentorship from expert DBREs.
Ownership of automation and reliability initiatives impacting production platforms.
Ankit Kumar Srivastava
Ankit.Srivastava@quantumworldit.com
If you are based in a LATAM country, feel free to join the group using the link below 👇
https://chat.whatsapp.com/JRtO82ZvzaZ636HHt829Dn
Connect on LinkedIn for LATAM jobs:
https://www.linkedin.com/in/ankit-s-2b684b8b/