Site Reliability Engineer

NTT DATA North America • 🌐 In Person

Posted 2 days, 18 hours ago

Job Description

SRE – Site Reliability Engineer

We are currently seeking a Site Reliability Engineer to join our team in Guadalajara (GDL), Jalisco (MX-JAL), Mexico (MX).

Perform L1.5 activities such as monitoring, deployment, and rollback. Monitor the efficiency of the Azure cloud systems to prevent outages, and initiate an Incident Management bridge when an outage occurs. Troubleshoot Azure resources and escalate to Level 3 (Software Development Team) when needed.

Understand the Microsoft Azure cloud - ideally Azure Fundamentals certified, or holding a degree in Computer Science or Information Systems Management.

Familiar with PaaS and IaaS services: VMs, Storage, Event Hubs, Service Fabric Cluster (SFC), Azure Kubernetes Service (AKS), Cosmos DB, SQL Server, IoT Hub, Databricks, Key Vault, Data Lake. Understand Internet of Things (IoT) concepts: telemetry, ingestion, processing, data storage, reporting.

Understand on-prem virtualization (VMware, Hyper-V, or Nutanix).

Understand the concepts behind CI/CD, source control, and infrastructure automation tools: Octopus Deploy, Bamboo, Terraform, Azure DevOps, Jenkins, GitHub, Ansible.

Understand the concept of container orchestration platforms (e.g. Kubernetes).

Understand scripting concepts: PowerShell, Python.

Understand the difference between NoSQL and SQL databases, and how to maintain them.

Understand monitoring and logging systems (Log Analytics, Splunk, ELK, Prometheus, Nagios, Zabbix, etc.).

Independent thinker - asks why something breaks and what can be done proactively to fix it.
