Cloud Site Reliability Engineer Job Description

Our company is looking for a Cloud Site Reliability Engineer to join our team.

Manage our cloud application using common DevOps and Agile practices to maintain uptime and reliable delivery;
Spend about 50% of your time automating the site systems so that they self-manage and self-heal.

Experience managing Windows and Linux servers;
Minimum of 5 years of experience;
Experience with public clouds (preferably Azure and AWS);
Deep understanding of modern monitoring and alerting tools such as the ELK stack, Nagios, Prometheus, Qualys, and Dome9;
Excellent communication and documentation skills;
Bachelor’s degree in Computer Science, Computer Engineering, or a related field, or equivalent experience;
Experience with configuration management software such as Puppet, Salt, or Chef;
Working knowledge of a scripting language such as Python, C#, PowerShell, or Bash;
Experience with standard DevOps tools such as Docker, Cloudera, Hadoop, Terraform, Jenkins, Git, Consul, and Vault;
Strong problem-solving skills.