Lead Site Reliability Engineer - Cloud Infrastructure
Vikash Technologies
Job Description

Job Summary:
We are seeking an experienced Lead Site Reliability Engineer (SRE) to drive the reliability, scalability, performance, and operational excellence of mission-critical platforms and applications. The ideal candidate will possess strong expertise in Site Reliability Engineering, DevOps practices, cloud infrastructure, automation, and production operations. As a technical leader, you will be responsible for ensuring high availability of systems, improving operational efficiency through automation, leading incident management processes, and mentoring engineering teams on SRE best practices.

Key Responsibilities:
- Lead the design, implementation, and optimization of highly available, scalable, and resilient production environments.
- Drive Site Reliability Engineering initiatives focused on system reliability, performance, and operational excellence.
- Develop and implement automation solutions to eliminate manual processes and improve operational efficiency.
- Design, build, and maintain CI/CD pipelines to support rapid and reliable software delivery.
- Manage and optimize cloud infrastructure across Azure, AWS, or GCP environments.
- Define and implement monitoring, alerting, logging, and observability frameworks.
- Lead incident response, troubleshooting, root cause analysis, and post-incident reviews.
- Establish reliability metrics, SLAs, SLOs, and error budgets to improve service performance.
- Collaborate with development, infrastructure, security, and product teams to enhance platform stability and scalability.
- Mentor and guide SRE/DevOps engineers while promoting reliability engineering best practices.
- Drive continuous improvement initiatives related to infrastructure, deployment processes, and operational workflows.
- Ensure compliance with security, governance, and operational standards.

Requirements:
- Minimum 7+ years of hands-on experience in Site Reliability Engineering (SRE), DevOps, Platform Engineering, or Infrastructure Operations.
- Proven experience in a Technical Lead or Team Lead role.
- Strong expertise in Linux administration, performance tuning, and troubleshooting.
- Hands-on experience with scripting and automation using Shell, Python, PowerShell, or similar technologies.
- Strong knowledge of DevOps practices and CI/CD pipeline implementation.
- Experience working with modern infrastructure automation and deployment methodologies.
- Hands-on experience with at least one major cloud platform:
  i. Microsoft Azure
  ii. Amazon Web Services (AWS)
  iii. Google Cloud Platform (GCP)
- Strong experience managing large-scale production environments with a focus on:
  i. Availability
  ii. Reliability
  iii. Scalability
  iv. Monitoring
  v. Incident Management
- Experience implementing monitoring, logging, and observability solutions.
- Strong troubleshooting, root cause analysis, and problem-solving skills.
- Experience handling critical production incidents and driving resolution efforts.
- Strong understanding of infrastructure reliability, disaster recovery, and business continuity concepts.
- Excellent communication, stakeholder management, and leadership skills.
- Ability to work effectively in fast-paced, highly collaborative environments.

(ref:hirist.tech)
