Role: Sr. Site Reliability Engineer
Location: Mexico - Remote
Job Type: Contract
Job Description: The Senior SRE is ultimately responsible for the reliability, availability, and performance of the technology and systems that directly support our end customers and internal customers. They will work closely with the product development and platform engineering teams to build and maintain scalable systems and robust automation that support the company's business goals. The ideal candidate will have a history of successfully implementing and using tools like Terraform, Packer, Splunk, SignalFx, and other observability/IaC tools in support of systems with around-the-clock availability requirements. In addition, the ideal candidate will possess sufficient software skills to properly scrutinize and troubleshoot the applications supporting our customers. They should have a strong aptitude for learning new technologies and for driving solutions to challenging projects and problems. This role requires a seasoned engineer who can collaborate across multiple cross-functional teams, brings a rich set of problem-solving skills, and is self-motivated with a passion for quality.
Responsibilities:
Develop and maintain monitoring tools, alerts, and dashboards to provide visibility into system health and performance.
Proactively gather and analyze both metric and log data from systems and applications to perform anomaly detection, performance tuning, capacity planning and fault isolation.
Collaborate with development teams to implement and deploy new features and enhancements, ensuring they meet reliability, security and performance standards.
Partner closely with other teams on enterprise standards/best practices.
Identify options for problem resolution and initiate corrective actions.
Mentor junior members, document and share solutions.
Collaborate cross-functionally.
Qualifications:
Minimum 4 years’ experience in any combination of software engineering roles: SRE, DevOps, applications, services, tools/automation, release, etc.
Minimum 3 years’ experience with SRE/DevOps practices and automation tooling
Experience with observability tools such as Splunk, Datadog, SignalFx, etc.
Experience deploying, maintaining and supporting software applications/services in the AWS ecosystem
Proactive approach to identifying problems and solutions
Experience writing code in one or more interpreted languages such as Python, PHP, Perl, Ruby, or Linux shell
Experience with Terraform or CloudFormation scripting
Experience with configuration management tools like Ansible, Chef or Puppet
Experience with standard software development best practices and tools such as code repositories (Git preferred)
Experience executing in an agile software development environment
Good understanding of pricing/cost models across AWS services, especially compute, storage, and database offerings
Must be able to multitask and work well with changing priorities in a fast-paced, 24x7 environment
Must be highly collaborative and be able to work in a team environment consisting of both technical and business people
Excellent communication, problem-solving, and customer service skills
A strong ability to learn and adapt to new technologies
Education: Bachelor’s degree in computer science, science, or engineering, or workforce-equivalent technical certifications, preferred
Thanks
Rakesh Pathak | Senior Technical Recruiter
Phone: 609-360-2642
Rakesh.pathak@ampstek.com | www.ampstek.com
https://www.linkedin.com/in/rakesh-kumar-pathak-00b039167/