We are seeking an experienced Senior Site Reliability Engineer to join our team. In this role, you will monitor and optimize systems, debug issues, and automate routine tasks while collaborating closely with our development teams.
Responsibilities:
Monitor service health, performance, alerts, and capacity.
Dive deep into the application stack to optimize performance and troubleshoot issues.
Automate routine tasks and improve system reliability.
Collaborate with developers to improve workflows and scalability.
Adapt and thrive in a rapidly changing, high-growth environment.
Qualifications:
Bachelor's degree in Computer Science or a related field.
At least 5 years of relevant work experience.
Strong experience with Kubernetes and Unix/Linux environments.
Proficiency in monitoring and debugging Kafka message queues.
Hands-on experience with Python for scripting and automation.
Experience monitoring and troubleshooting APIs.