Job Title:
Sr DevOps Engineer (with ML)
Location:
Toronto, ON (Hybrid, 4 days onsite)
Job Description:
You will deploy and modernize Kubernetes infrastructure for machine learning and ensure that deployments comply with enterprise security standards. Day-to-day tasks involve the delivery, deployment, and upgrading of scalable systems designed for data ingestion, processing, validation, model training, large-scale computation, monitoring, and serving prediction results.
Our DevOps stack includes Kubernetes, Docker, Databricks, Blobfuse, Terraform, Helm, GitHub Actions, and SaltStack, with the majority of our infrastructure running on Azure.
Job Requirements
Must-Have:
5+ years of experience building sophisticated and automated production infrastructure
Experience with Kubernetes, Docker, and container orchestration
Experience with Terraform
A background in software engineering, working within a software development team
Solid cloud experience (preferably Azure or AWS)
Strong scripting skills, e.g., Bash, Python, or Groovy
Experience managing CI/CD tools and pipelines
Linux systems administration experience in a cloud environment (Red Hat and Ubuntu)
Experience with Git and Jenkins
Strong verbal and written communication skills, with the ability to work effectively across teams and produce engineering documentation
BA/BS degree or equivalent experience; Computer Science background preferred
Nice-to-Have:
Knowledge of IP networking, VPNs, DNS, load balancing, and firewalls
Familiarity with monitoring tools
Experience with automated testing tools
Experience troubleshooting and tuning system performance
Experience with SaltStack or other configuration management tools
Experience triaging and resolving Docker image problems
Experience optimizing system-level design and architecture