Senior Site Reliability Engineer - Platform Engineering
AgileEngine
AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards.
WHY JOIN US
If you're looking for a place to grow, make an impact, and work with people who care, we'd love to meet you!
ABOUT THE ROLE
We are looking for a Senior Site Reliability Engineer to strengthen our platform reliability and observability capabilities. You will own the design and operation of monitoring infrastructure, including Datadog APM, alerting, and distributed tracing, across Kubernetes-based microservices on AWS. The role spans backend engineering and SRE practice in roughly a 65/35 split, with direct involvement in CI/CD integration and observability automation. You will also support internal teams in adopting monitoring best practices as we modernize our R&D platform.
WHAT YOU WILL DO
- Design, build, and maintain scalable backend and platform components;
- Implement and manage observability solutions across distributed systems;
- Configure dashboards, alerts, and APM for tracing, metrics, and logging;
- Monitor and improve system reliability, scalability, and performance;
- Deploy, operate, and maintain services in Kubernetes environments;
- Integrate observability tools into CI/CD pipelines and cloud infrastructure;
- Automate monitoring and operational workflows using scripting;
- Provide operational and training support for observability platforms, especially Datadog;
- Collaborate with engineering teams to improve system visibility and reliability practices.
MUST HAVES
- 4+ years of experience with Python, Node.js, or Java;
- Hands-on experience with API integrations;
- Strong experience in Kubernetes environments;
- Experience with Datadog or similar tools such as Prometheus and Grafana;
- Ability to configure dashboards, alerts, and APM;
- Experience monitoring containerized and microservices architectures;
- Hands-on experience with AWS;
- Experience integrating observability tools into cloud environments;
- Experience with CI/CD integrations for observability;
- Ability to automate monitoring and operational tasks using scripting;
- Upper-intermediate English level.
NICE TO HAVES
- Experience owning and operating an internal engineering platform, especially observability platforms;
- Demonstrated ownership of reliability, scalability, and performance;
- Ability to proactively lead maintenance and platform improvements;
- Experience installing and configuring Datadog agents and integrations;
- Experience managing API keys and secure configurations;
- Experience managing user roles and access controls;
- Familiarity with Go (Golang);
- Experience with additional observability tools such as New Relic, Dynatrace, Elastic Stack, or Splunk.
PERKS AND BENEFITS
- Remote work & Local connection: Work where you feel most productive and join periodic team meet-ups to strengthen your network and connect with other top experts.
- Legal presence in India: We ensure full local compliance with a structured, secure work environment tailored to Indian regulations.
- Competitive Compensation in INR: Fair pay in INR, with dedicated budgets for your personal growth, education, and wellness.
- Innovative Projects: Leverage the latest tech and create cutting-edge solutions for world-recognized clients and the hottest startups.


