AI Solutions and Platforms Operations Engineer
PepsiCo
The AI Observability Engineer (Agentic Frameworks & AI Agent Operations Center Developer) builds and operationalizes agentic AI solutions using modern orchestration frameworks and contributes to an AI Agent Operations Center that enables safe, reliable, and observable agent behavior at scale. This role focuses on developing agent workflows (planning, tool execution, memory, and RAG), integrating guardrails and evaluations, and delivering operational capabilities such as run management, telemetry, and incident triage for production agents.
Responsibilities
- AI Agent Operations Center (70%)
- Build “operations center” capabilities for agent runtime management: agent registry, versioning, deployment tracking, and run histories
- Enable operational workflows such as incident triage, replay/debug runs, trace correlation, and root-cause analysis across agent steps
- Implement operational dashboards and views for agent health: success rate, latency, tool failure rate, cost per run, and loop detection
- Instrument agent flows end-to-end using OpenTelemetry (or equivalent), enabling correlation across prompts, tool calls, retrieval, and responses
- Implement semantic conventions and tagging standards (agent name/version, tool name, model provider, environment, tenant/app)
- Partner with SRE/observability teams to ensure production-grade monitoring, alerting, and operational readiness
- Collaboration with Teams (10%)
- Collaborate with transformation teams and business stakeholders to understand requirements and tailor AI agents to specific domains.
- Work closely with AI platform teams to build scalable and cross-domain AI agents while ensuring end-to-end observability.
- Integration & Deployment (10%)
- Build and maintain CI/CD pipelines for agent services and operations center components, including automated testing and deployment
- Automate onboarding for new agent use cases (templates, scaffolding, configuration checks)
- Drive best practices for secure, scalable, and cost-effective agent deployments
- Continuous Learning (10%)
- Stay updated with the latest advancements in AI and machine learning technologies and integrate these into existing or new AI agents.
- Conduct thorough testing and validation to ensure the reliability and accuracy of AI agents and solutions.
Key Skills/Experience Required
Minimum Qualifications:
- Education: Bachelor’s in Computer Science, AI/ML, Data Science, or a related field.
- Experience: 3–5+ years of software engineering experience; 1+ years building and observing AI/ML or GenAI applications preferred
- Required Expertise:
- Hands-on experience with agentic frameworks (CrewAI, LangChain, Semantic Kernel, AutoGen, or similar)
- Proficiency in Python (primary) and familiarity with APIs/microservices patterns
- Strong experience with RAG patterns (embeddings, vector search, retrieval evaluation, chunking strategies)
- Experience with cloud environments (Azure/AWS/GCP) and containerized deployments (Kubernetes/AKS/EKS)
- Familiarity with observability fundamentals (logs/metrics/traces) and production troubleshooting
- Experience building internal developer platforms or operational consoles (agent registry, run tracking, dashboards)
- Familiarity with OpenTelemetry, distributed tracing, and telemetry pipelines
- Experience with Azure AI Search / vector databases, prompt/version management, and evaluation frameworks
- Knowledge of Responsible AI practices: data handling, safety guardrails, audit trails, and redaction strategies
- FinOps exposure: token/GPU cost optimization and chargeback/showback reporting
- Technical Proficiency: Agent orchestration design (planning, tool execution, memory, RAG); strong engineering discipline (testing, versioning, CI/CD, automation); operational mindset (reliability, debuggability, and incident response support)
- Problem-Solving: Ability to translate business challenges into technical solutions.
- Collaboration Skills: Effective at working within cross-functional teams.
- Agility: Flexibility to adapt to changing requirements and new technologies.
- Communication Skills: Capable of explaining complex technical concepts to non-technical stakeholders.

