Job Summary:
We are seeking a skilled Data Engineer with expertise in Azure Databricks to design, develop, and maintain scalable data solutions. You will work closely with data scientists, analysts, and business stakeholders to build robust data pipelines and enable data-driven decision-making.
Key Responsibilities:
Data Pipeline Development:
Design and implement scalable ETL/ELT pipelines using Azure Databricks, Azure Data Factory, and Apache Spark.
Integrate data from various sources including REST APIs, SFTP, and cloud storage.
Data Storage & Management:
Work with Azure Data Lake Storage, Azure SQL Database, and Delta Lake to manage structured and unstructured data.
Ensure data quality, integrity, and security across all storage layers.
Performance Optimization:
Monitor and troubleshoot data pipelines for performance and reliability.
Apply best practices for cost-effective and efficient data processing.
Collaboration:
Partner with data scientists, analysts, and business teams to understand data requirements.
Translate business needs into technical solutions.
Documentation & Compliance:
Document pipeline architecture, data flows, and technical specifications.
Ensure compliance with data governance and security standards.
Required Qualifications:
Bachelor’s degree in Computer Science, Engineering, or a related field.
2–5 years of experience in data engineering, with a focus on Azure Databricks.
Proficiency in SQL, Python, and Apache Spark.
Experience with Azure Data Factory, Delta Lake, and Azure Synapse Analytics.
Strong understanding of data warehousing, ETL processes, and big data technologies.
Preferred Skills:
Experience with tools like Airflow, Kafka, or Starburst.
Familiarity with machine learning data preparation.
Knowledge of data governance, security, and compliance frameworks.