Role Description:

The SRE will work within the Video Network division to design, build, and operate our next-generation Video Cloud platform, driving efficiency, reliability, and scalability across our cloud infrastructure. The role works primarily on AWS, with opportunities to expand across multi-cloud (Azure, GCP).

Deliverables:

Deploy solutions in POC, Staging, and Production environments, ensuring reliability & scalability

Lead/support customer onboarding, including environment setup and configuration

Provide tech support to partners/customers on Synamedia technologies, products & solutions

Troubleshoot and resolve moderate to complex tech issues, ensuring timely resolution & customer satisfaction

Replicate/analyze issues in a controlled lab environment to validate fixes and improvements

Document tech solutions & best practices, contributing to internal knowledge bases & support documentation

Deliver tech presentations & cross-training sessions to internal/external stakeholders

Collaborate closely with cross-functional teams (Engineering, Sales, and Product Management) to enhance product quality and customer experience

Foster teamwork by actively sharing insights & collaborating with peers toward common objectives

Demonstrate a continuous commitment to technical excellence, innovation, and learning

Responsibilities:

Design, build, and operate scalable and secure Cloud infrastructure solutions across AWS, Azure, or GCP

Manage and resolve Service Requests, Incidents, Problems, and Change Requests related to Cloud environments

Analyze complex technical issues, propose effective solutions, and communicate recommendations clearly to stakeholders

Drive automation across the infrastructure: develop tools, scripts, and pipelines to minimize manual intervention and improve operational efficiency

Monitor system performance and anticipate scaling needs to ensure service stability under varying workloads

Implement and maintain monitoring and observability frameworks to proactively detect and remediate system anomalies

Create and maintain documentation, including architecture diagrams, runbooks, and knowledge base articles

Define and track key metrics for Cloud resource utilization, performance, and cost efficiency

Build cost-optimization dashboards and automation to visualize and control cloud spend at both infrastructure and Kubernetes levels

Collaborate with development and operations teams to enhance CI/CD pipelines, ensuring smooth deployments and high availability

Continuously research and adopt emerging tools, frameworks, and best practices in Cloud and DevOps

Soft Skills:

Analytical and troubleshooting skills

Eager to learn, with the technical aptitude to assimilate new material quickly (essential)

Excellent written and verbal communication skills (essential)

Flexible: able to adapt readily to a changing environment (essential)

Able to take initiative and drive change (essential)

Performs well under pressure and in disruptive environments where priorities can change in response to customer demand (essential)

Capacity and passion to help customers. Good customer engagement (essential)

Customer-facing skills: negotiation, customer satisfaction, and clear verbal, written, and presentation communication

Highly organized, with the ability to manage multiple projects & escalations in a fast-paced environment

Site Reliability Engineer (SRE)
