Location:
Montreal, QC
Openings:
1
Category:
Information Technology — DevOps / Platform Engineering
Engagement:
Fully on-site in Montreal, QC; full-time, permanent position
Project / Program Overview
Venture-backed deep-tech startup building AI and game-streaming software and scaling a global edge-computing platform. The company is tripling headcount within 12 months; this role will build and own the infrastructure foundation that supports that growth.
Role Summary
A hands-on DevOps expert who will design, scale, and operate cloud-native infrastructure, set engineering standards, and own CI/CD and release practices.
Key Requirements
DevOps SME:
Deep expertise in infrastructure, deployment pipelines, and production operations.
Startup-minded:
Comfortable with ambiguity, rapid iteration, and shifting priorities.
Responsibilities
Build secure, scalable, high-performance cloud-native infrastructure.
Own, optimize, and scale CI/CD pipelines and release engineering.
Define DevOps standards; mentor engineers and champion best practices.
Introduce modern tools to increase engineering velocity and platform resilience.
Recommend and define a pragmatic DevOps roadmap within 6 months.
Support production deployments and participate in an on-call rotation as needed.
Must-Have Requirements
8+ years in software/DevOps roles, with demonstrated expertise in Kubernetes and Linux.
Linux:
Deep knowledge of internals and performance tuning; strong scripting (Bash, Python).
Kubernetes:
Proven experience operating at scale, including building Kubernetes clusters from scratch for platforms designed to serve millions of users.
CI/CD:
Expert in building and maintaining robust, scalable pipelines (GitHub Actions/GitLab CI/Jenkins).
Containers:
Advanced Docker usage and orchestration patterns.
Cloud & IaC:
AWS/GCP/Azure plus Terraform/Ansible (or similar).
Observability:
Prometheus, Grafana, ELK/EFK; alerting and SLO-driven ops.
Security:
DevSecOps best practices, least-privilege access, and secrets management.
Nice to Have
Experience with GPU drivers (NVIDIA/AMD) or other performance-sensitive systems.
Ways of Working & Tools
OS/Platform:
Linux, Kubernetes, containers
CI/CD & IaC:
GitHub Actions / GitLab CI / Jenkins; Terraform, Ansible
Cloud:
AWS / GCP / Azure (any combination)
Observability:
Prometheus, Grafana, ELK/EFK, alerting/on-call
Languages/Scripting:
Bash, Python; Git workflows (PRs, trunk-based/GitFlow)
Security:
DevSecOps, shift-left testing, secrets management, least-privilege/Zero Trust principles