Senior Lead Engineer – Data Engineering

Department: Data Engineering
Project Location(s): Bangalore, Karnataka
Job Type: Full Time
Education: Bachelor in Engineering / Technology

We are seeking a Senior Data Engineer with in-depth knowledge of Databricks and Unity Catalog to serve as the subject matter expert for all things Databricks within our organization. This role requires deep expertise in Databricks, including CI/CD setup for Databricks, data lineage through Unity Catalog, and strong proficiency in ETL, SQL, and modern data engineering practices. You will be the go-to person for designing, implementing, and optimizing data solutions with Databricks.

 

Key Responsibilities:

  • Serve as the point of contact and subject matter expert for all Databricks-related activities, including architecture, development, and operational best practices.
  • Work closely with the Sales team to propose data roadmaps to prospects intending to migrate to the cloud, and create proofs of concept that showcase our expertise.
  • Design, develop, and manage ETL/ELT pipelines in Databricks using Python (PySpark), integrating various data sources to support business operations.
  • Leverage Unity Catalog to ensure data lineage, security, and governance are properly managed across the Databricks environment.
  • Implement and maintain CI/CD pipelines for Databricks, ensuring smooth deployments, version control, and automation using Git and other DevOps tools.
  • Build scalable data architectures, including Data Lakes, Lakehouses, and Data Warehouses, ensuring efficient data management and accessibility.
  • Configure and optimize Databricks clusters, jobs, and workflows for both batch and streaming data processing to handle large-scale datasets.
  • Stay up to date with the latest Databricks features and advancements, continuously enhancing our data engineering practices.
  • Collaborate with cross-functional teams to implement data governance and ensure compliance with security and industry regulations.
  • Monitor and tune Databricks workloads to ensure high performance and scalability, adapting to business needs as required.
  • Provide training, guidance and mentorship to fellow cloud engineers, ensuring adherence to best practices and fostering a collaborative environment.

 

Qualifications

  • 5+ years of experience in data engineering with significant expertise in Databricks and Apache Spark.
  • Proficient in Unity Catalog for managing data lineage, security, and governance within the Databricks ecosystem.
  • Experience estimating and migrating legacy data warehouse workloads to Azure or hybrid cloud environments.
  • Experience building and optimizing ETL pipelines using tools like Azure Data Factory, Informatica, or similar.
  • Strong understanding of CI/CD practices with experience in Git for version control and integration with Databricks.
  • Expertise in SQL development and performance tuning for large-scale datasets.
  • Knowledge of the Azure ecosystem, including data services like Azure Data Factory, Azure Data Lake, and Azure Storage.
  • Ability to work with both batch and streaming data processing pipelines.
  • Experience with data modeling and dimensional design (e.g., star schema).
  • Good understanding of data governance, compliance, and security best practices.
  • Excellent communication and problem-solving skills, with the ability to manage multiple priorities.
  • Ability to stay current on Databricks innovations and proactively introduce new features and capabilities to the team.
