Lead Site Reliability Engineer-Infra

Lead Site Reliability Engineer-Infra Job Description Template

Our company is looking for a Lead Site Reliability Engineer-Infra to join our team.

Responsibilities:

  • Work closely with development teams to create resilient systems that can run and repair themselves with minimal human intervention;
  • Build infrastructure as code using Python or Ruby and tools such as Chef and Puppet;
  • Participate in an on-call rotation as required;
  • Lead architectural efforts to continuously improve the performance, scalability, and resiliency of enterprise and microservices platforms;
  • Collaborate with cross-functional teams to see products through from conception to delivery.

Requirements:

  • 8+ years of experience architecting integrated stack solutions (storage, network, compute) within an enterprise-scale production environment;
  • Hands-on experience with orchestration and system configuration tools such as Chef, Terraform, Puppet, and Ansible;
  • Hands-on experience managing and operating SQL and NoSQL databases such as MySQL, Postgres, and MongoDB;
  • Hands-on experience with CI/CD tools such as Jenkins, TeamCity, GitLab, Bamboo, TravisCI, or CircleCI;
  • Excellent written and verbal communication, interpersonal, and collaboration skills;
  • BS/MS in Computer Science or equivalent, with 10+ years of experience designing, troubleshooting, and tuning large-scale cloud systems;
  • Experience working in Agile/Scrum teams across development, testing, and DevOps in an enterprise software development environment.