Sr. Site Reliability Engineer-Platform Operations

Sr. Site Reliability Engineer-Platform Operations Job Description Template

Our company is looking for a Sr. Site Reliability Engineer-Platform Operations to join our team.

Responsibilities:

  • Administration of Linux machines, web servers, application servers and databases;
  • Drive iterative improvement by limiting time spent on operational work, running blameless post-mortems and proactively identifying potential outages;
  • Release engineering for new product releases and maintenance;
  • Participate in a 24×7 on-call rotation for after-hours emergencies;
  • Application and infrastructure support for customer environments;
  • Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation and refinement.

Requirements:

  • 4+ years of experience with Linux operating system internals and administration (e.g., filesystems, inodes, system calls);
  • 7+ years of total professional experience;
  • 2+ years of experience administering AWS or Azure cloud environments;
  • Ability to debug and optimize code and automate routine tasks;
  • 4+ years of production system administration and web operations experience;
  • 2+ years of experience with configuration management tools like Chef, Puppet, Salt or equivalent;
  • Experience in production support for massive-scale web operations;
  • Bachelor’s degree in Computer Science/Technology;
  • Interest in designing, analysing and troubleshooting large-scale distributed systems;
  • 2+ years of experience with DevOps automation development using Perl, Go, PHP, Python, Ruby or equivalent;
  • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.