👨🏻‍💻 postech.work

Site Reliability Engineer

Softensity Inc • 🌐 In Person

Posted 1 week ago

Job Description

Role Summary

The SRE Technical Member will:

Deliver engineering, operational, and administrative support for the application and its technology landscape.

Address reliability and operational challenges such as application failures, production issues, infrastructure performance (disk, memory), monitoring, and security.

Serve as a mid-level subject matter expert, integrating with multiple teams to develop and evolve SRE practices for Azure-based environments.

Participate in production support activities, including deployments, upgrades, and critical issue resolution.

This role is central to designing, implementing, and maintaining monitoring, alerting, and reporting solutions across servers, containers, databases, and cloud infrastructure components.

Key Responsibilities

Collaborate with Central SRE, DevOps, and InfoSec teams on new projects, platform builds, and deployments.

Contribute to the design, implementation, and operation of large-scale, Azure-based platforms.

Apply industry best practices in monitoring, alerting, reporting, and cloud architecture.

Participate in infrastructure, application, and security planning, focusing on scalability, redundancy, and data preservation.

Support development teams in building and running high-availability topologies.

Produce documentation and weekly operational status reports detailing project progress and key metrics.

Provide engineering and support for technical infrastructure, cloud services, databases, and application performance.

Manage incident response, change management, and user permissions following SRE best practices (Google SRE model).

Maintain close collaboration between Application, Central SRE, DevOps, InfoSec, and business units.

Assist in configuring and onboarding new applications into the Azure DevOps (ADO) platform.

Core Technical Skills

Strong understanding of SRE fundamentals: monitoring, alerting, reporting, performance, availability, and incident response.

Hands-on experience with CI/CD tools (Git, Azure Pipelines, Ansible, etc.).

Infrastructure as Code (IaC) design, scripting, and setup.

Deep knowledge of Azure Web Services: installation, configuration, and management.

Experience administering Microsoft applications (.NET, C#, Angular) with a focus on automation, optimization, and security.

Proficiency in Cosmos DB and MS SQL operational tasks.

Excellent troubleshooting, root-cause analysis, and problem-solving skills.

Experience with disaster recovery, scalability testing, and capacity planning.

Qualifications

Bachelor’s degree in a technical discipline (Computer Science, Engineering, or related field).

5+ years of industry experience in SRE, DevOps, or related technical operations roles.

Proven experience in cloud infrastructure, automation, and application reliability engineering within large-scale enterprise environments.
