DevOps-6

Realign LLC • 🌐 In Person • 💵 $142,000 - $142,000

Job Description

Toronto, Ontario M5V 3L9 Posted March 12th, 2026

Job Type: Full Time

Job Category: IT

Job Description DevOps

Toronto, ON - Onsite

Required Skills:

We are seeking a lead Site Reliability Engineer to join our dynamic Capital Markets Platform Engineering team. This role will focus on designing, building, and maintaining a secure, scalable, and resilient platform infrastructure with Kubernetes and Google Kubernetes Engine (GKE) as the core technologies. The ideal candidate will have expertise in container orchestration, cloud infrastructure, automation, and CI/CD pipelines.

Kubernetes & GKE Management: Design, deploy, and maintain Kubernetes clusters on GKE / on-premises infrastructure for mission-critical capital markets applications, ensuring high availability, scalability, and compliance with security standards.

Platform Automation: Develop Infrastructure-as-Code (IaC) solutions using tools like Terraform, Helm, and Ansible to automate cluster provisioning, scaling, and maintenance.

Cloud Infrastructure Management: Architect and manage cloud infrastructure within Google Cloud Platform (GCP) with a focus on cost optimization, network security, and performance.

CI/CD Enablement: Implement and optimize CI/CD pipelines for containerized applications using tools like Jenkins, ArgoCD, and GitLab CI, ensuring rapid and secure software delivery.

Observability & Monitoring: Implement monitoring, logging, and alerting solutions using tools such as Dynatrace, Prometheus, ELK Stack, and Google Cloud Operations Suite.

Security & Compliance: Ensure platforms meet security, regulatory, and compliance requirements for capital markets, including RBAC, service mesh, encryption, and network policies.

Collaboration & Mentoring: Work closely with development, security, and operations teams to promote a DevOps culture and mentor junior engineers on platform engineering best practices.

Incident Management: Participate in L2/L3 on-call rotations and help with incident management, root cause analysis, and proactive risk mitigation.

Required Skills

CLOUD DEVELOPER

SQL APPLICATION DEVELOPER
