Data Engineer

PropStream

Role Overview

We are looking for a hands-on, senior Databricks Architect to design, build, and govern our Lakehouse data platform from the ground up. You will own the end-to-end architecture of our data infrastructure, from raw ingestion through the Medallion layers to serving, and establish the engineering standards that will guide the entire data organization. This is a highly strategic and technical role focused on driving adoption of Databricks, Unity Catalog, and modern Lakehouse patterns across all data products and pipelines.

Key Responsibilities

Lakehouse Architecture & Design

Design and implement a production-grade Medallion Architecture (Bronze / Silver / Gold) across all data pipelines.
Establish best practices for Delta Lake table design, partitioning strategies, Z-ordering, and optimization across large-scale datasets.
Define data modeling standards and schema evolution policies across the Lakehouse.
Architect end-to-end data flows from ingestion (streaming and batch) through transformation and serving layers.

Unity Catalog & Data Governance

Lead the setup, configuration, and rollout of Unity Catalog as the centralized governance layer for all data assets.
Design metastore hierarchy, catalog/schema/table organization, and tagging standards.
Implement fine-grained access control (row-level, column-level), data masking policies, and audit logging.
Establish data lineage tracking and ensure end-to-end visibility across all pipelines.
Define and enforce data classification and sensitivity frameworks for PII and regulated data assets.

Pipeline Development & Orchestration

Build and maintain production-grade data pipelines using PySpark, Delta Live Tables (DLT), and Databricks Workflows / Jobs.
Design modular, reusable pipeline patterns including incremental ingestion, CDC (Change Data Capture), and full-refresh strategies.
Implement robust pipeline observability: logging, alerting, lineage tracking, and SLA monitoring.
Leverage Databricks Repos for CI/CD integration, managing code promotion across dev / staging / production environments.

Performance & Compute Optimization

Optimize Spark execution plans, and identify and resolve performance bottlenecks across large-scale distributed workloads.
Right-size cluster configurations: Serverless warehouses, auto-scaling job clusters, and Photon-enabled SQL warehouses.
Leverage Serverless Warehouses and SQL Warehouses for BI and ad hoc analytics workloads, minimizing cost and cold-start latency.
Manage cost governance for compute, storage, and DBU consumption across workspaces.

Developer Experience & Standards

Set up and maintain Databricks Repos with standardized project structures and Git integration.
Define Python coding standards, notebook best practices, and modular library patterns for the data engineering team.
Build reusable Python utility libraries for common patterns: schema validation, data quality checks, Delta operations, and logging.
Establish unit testing and integration testing frameworks for Spark pipelines.

Security, Compliance & Networking

Configure workspace-level and account-level security: Private Link, IP access lists, and secrets management via Databricks Secrets or AWS Secrets Manager.
Design and enforce network isolation for sensitive data workloads.
Ensure compliance with data residency and access control requirements for customer data.

Collaboration & Enablement

Partner with data engineers, data scientists, and analytics engineers to ensure the platform meets diverse workload needs.
Mentor the engineering team on Databricks, Spark optimization, and Lakehouse best practices.
Produce architectural documentation, runbooks, and internal knowledge bases.
Evaluate and recommend new Databricks features and third-party integrations relevant to the organization's data roadmap.

Required Qualifications

Core Databricks & Lakehouse

5+ years of hands-on experience with Databricks, with at least 2 years in an architect or senior lead role.
Deep expertise in Unity Catalog: metastore setup, three-level namespace, ACL design, and data governance workflows.
Strong mastery of the Medallion Architecture and Delta Lake: ACID transactions, time travel, compaction, and OPTIMIZE/VACUUM strategies.
Proven experience designing and deploying production pipelines with Databricks Jobs and Workflows, including multi-task job DAGs, retry logic, and notifications.
Hands-on experience with Databricks Repos and CI/CD integration for notebook and Python library deployments.
Experience configuring and operating Serverless SQL Warehouses and Serverless compute for Jobs.

Apache Spark

Expert-level PySpark development: DataFrames, Spark SQL, window functions, broadcast joins, and UDFs.
Strong understanding of Spark internals: DAG execution, shuffle optimization, memory management, and speculative execution.
Experience with structured streaming and micro-batch processing patterns.
Proven ability to diagnose and resolve Spark performance issues using Spark UI and event logs.

Python & Software Engineering

Advanced Python skills with a strong software engineering background: packaging, testing (pytest), virtual environments, and dependency management.
Experience building modular Python libraries for data engineering use cases.
Familiarity with common data engineering libraries: pandas, pydantic, great_expectations or similar DQ frameworks.

Cloud & Infrastructure

Experience deploying Databricks on AWS, including workspace provisioning, IAM integration, and VPC configuration.
Familiarity with cloud-native storage (S3/ADLS), external locations in Unity Catalog, and storage credentials management.
Exposure to infrastructure-as-code tooling (Terraform, Databricks Asset Bundles, or similar).

Preferred Qualifications

Databricks Certified Data Engineer Professional or Databricks Certified Associate Developer for Apache Spark certifications.
Experience with Delta Live Tables (DLT) for declarative pipeline authoring.
Familiarity with dbt (data build tool) integrated with Databricks SQL.
Experience with Databricks Feature Store or MLflow for ML platform use cases.
Exposure to Databricks Marketplace and Partner Connect integrations.
Experience with Elasticsearch, Apache Kafka, or other streaming/search technologies complementary to the Lakehouse.

Vacancy posted 2 days ago
