
Site Reliability Engineer

Internet Brands • 🌐 Remote

Posted 5 days ago

Job Description

WebMD is an Equal Opportunity/Affirmative Action employer and does not discriminate on the basis of race, ancestry, color, religion, sex, gender, age, marital status, sexual orientation, gender identity, national origin, medical condition, disability, veteran status, or any other basis protected by law.

Responsibilities:

Assist developers in converting their applications to 12-Factor principles so that the apps can run cleanly in a containerized environment, and migrate cleanly from one environment to the next

Set up, monitor, and debug CI/CD pipelines constructed in GitLab and Jenkins. This includes Docker container builds managed by bespoke Jenkins libraries, so the job will entail debugging and enhancing these libraries (written in Groovy) from time to time

Assist developers with application configuration, code questions, and debugging across various languages: PHP, NodeJS, Java, Ruby, .NET Core, and occasionally Elixir

Help developers introduce exception reporting and APM metrics into existing and new applications to improve runtime performance and error monitoring

Monitor, tune, and debug issues related to application health and Kubernetes cluster health

Identify bugs or shortcomings in the platform and implement solutions to make the IKE Platform better for users and/or easier to maintain

Required Skills:

Advanced spoken and written English

Proficiency in using GenAI tools to troubleshoot and resolve issues

git, including management of remotes, branching/merging, and rebasing

Basics of Jenkins: Jobs, Pipelines, webhooks

Basic RDBMS design and tuning

Linux "power user" skills

Docker concepts: builds, including multi-stage builds; layer management and caching; volume management; security; container networking and IPC

docker-compose

Use of Kubernetes: namespaces, core objects (Deployment, StatefulSet, PersistentVolumes, etc.)

Familiarity with configuring and automating Prometheus metrics collection and Grafana dashboards

Familiarity with principles of web application development: convention over configuration (e.g. Rails, Laravel); proxy server configuration (Nginx and/or Apache); exception reporting, logging, and performance metrics

Good to know:

Rancher Kubernetes cluster management

12-Factor concepts and their real-life application

Familiarity with as many programming languages and frameworks as possible: PHP/Laravel, NodeJS, Java/Spring Boot, Ruby/Rails, .NET Core, Elixir, Rust, Python/Django/Flask, etc.

Experience with MVC frameworks in particular
