About the Role
We are looking for a Senior Data Engineer to join our Data Platform team and lead the design, development, and optimization of our enterprise data products on Azure Databricks and the Lakehouse architecture. You will work across ingestion, data modeling, automation, ML, and data governance to build a scalable data ecosystem serving analytics, ML, and business activation use cases.
Key Responsibilities
Design & develop scalable data pipelines using Azure Databricks (SQL, Python, PySpark).
Implement Delta Lake / Lakehouse Medallion architecture (Bronze → Silver → Gold).
Optimize performance and cost through cluster tuning, job scheduling, and serverless compute.
Implement CI/CD, DBX version control, Unity Catalog governance & cluster policies.
Integrate Databricks with ADLS Gen2, Azure SQL, ADF / Databricks Jobs, Event Hubs, Key Vault, and Terraform.
Build automation: automated EDA (profiling, anomaly detection), AutoML & MLflow pipelines.
Apply LLMs / Data GPT for automated SQL generation, documentation, data lineage & data-quality explanations.
Work closely with business teams to translate requirements into scalable solutions.
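To make the Medallion responsibility concrete, here is a minimal illustrative sketch in Databricks SQL; all table names, column names, and the storage path are hypothetical, not an actual Masan schema:

```sql
-- Bronze: land raw files as-is from ADLS Gen2 (hypothetical path and schema)
CREATE TABLE IF NOT EXISTS bronze.sales_events
AS SELECT * FROM read_files('abfss://raw@storageacct.dfs.core.windows.net/sales/');

-- Silver: cleaned, typed, and standardized records
CREATE TABLE IF NOT EXISTS silver.sales_events
AS SELECT
  CAST(event_id AS BIGINT)       AS event_id,
  TO_DATE(event_ts)              AS event_date,
  UPPER(TRIM(store_code))        AS store_code,
  CAST(amount AS DECIMAL(18, 2)) AS amount
FROM bronze.sales_events
WHERE event_id IS NOT NULL;

-- Gold: business-level aggregate serving analytics and activation
CREATE TABLE IF NOT EXISTS gold.daily_store_sales
AS SELECT event_date, store_code, SUM(amount) AS total_amount
FROM silver.sales_events
GROUP BY event_date, store_code;
```

Each layer only reads from the one below it, which keeps lineage simple and lets data quality checks attach at the Bronze→Silver boundary.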
Platform Scope You Will Help Build:
Data ingestion system, data cleaning & standardization
Global-ID data connection / mapping
Data crawler
Enterprise Data Lake & Feature Store
Real-time & batch analytics
Activation API
Data Catalog, Data Lineage, Data Quality Monitoring
Data access governance, usage monitoring, pricing & FinOps visibility
Data security best practices
Qualifications
Requirements
5+ years in Data Engineering or distributed data processing
Expert in Azure Databricks (Delta Lake, Unity Catalog, DBX version control, cluster policies, CI/CD)
Strong data modeling (star schema, dimensional, data vault) & ELT frameworks
Hands-on with PySpark, SQL, Python, Databricks SQL
Experience with AutoML / MLflow (train → deploy → monitor)
Experience applying GenAI / Data GPT to data workflows
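As a reference point for the data modeling requirement, a minimal star-schema sketch in SQL; every table and column name here is hypothetical and purely illustrative:

```sql
-- Dimension: one row per store, surrogate key for joins
CREATE TABLE dim_store (
  store_key  BIGINT,
  store_code STRING,
  region     STRING
);

-- Dimension: calendar dates
CREATE TABLE dim_date (
  date_key  INT,
  full_date DATE,
  year      INT,
  month     INT
);

-- Fact: one row per sale, foreign keys into the dimensions
CREATE TABLE fact_sales (
  store_key BIGINT,          -- references dim_store.store_key
  date_key  INT,             -- references dim_date.date_key
  amount    DECIMAL(18, 2)
);

-- Typical analytic query: slice the fact by dimension attributes
SELECT d.year, s.region, SUM(f.amount) AS revenue
FROM fact_sales f
JOIN dim_store s ON f.store_key = s.store_key
JOIN dim_date  d ON f.date_key  = d.date_key
GROUP BY d.year, s.region;
```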
Nice-to-have
Streaming: Structured Streaming, Auto Loader, Kafka/EventHub
Databricks Photon, Serverless SQL, fine-grained access control
Cost governance & FinOps for Databricks & Azure
Why Join Masan
Be part of Masan’s digital transformation journey with high-impact, real-world data challenges.
Build a modern end-to-end data platform from scratch
Work with a cutting-edge stack: Databricks, AutoML, Generative AI
High ownership, high-impact engineering role