Job Description
We are seeking a Data Governance Engineer to establish and maintain data governance practices across our local data platform. The role focuses primarily (70%) on data governance initiatives, including security management, query optimization, and resource allocation, while supporting analytics automation and pipeline maintenance (30%).
Primary Responsibilities: Data Governance (70%)
Data Governance Framework
Optimize data pipelines and SQL queries to ensure minimal resource consumption, high stability, and efficient performance across platforms such as HDFS, Presto, and Spark
Establish and maintain continuous monitoring of governance scores, ensuring all teams achieve and sustain scores above 85/100, based on storage utilization and compute resource efficiency
Develop governance policies and best practices tailored to local team needs
Data Service Management
Integrate all VN reports and data pipelines into the SLA Manager System for comprehensive resource tracking
Lead optimization of resource allocation across all teams through performance monitoring and capacity planning
Develop and manage an Asset Status Tracker for complete visibility of data assets and project dependencies
Implement and monitor failed-task notifications, working with teams to set them up in Seatalk
Data Quality & Performance
Prioritize and implement data quality rules for key tasks to ensure stable data resources
Conduct regular query performance reviews and optimization sessions with Functional BI teams
Establish data quality metrics and monitoring dashboards for proactive issue detection
Lead troubleshooting efforts for data quality issues and performance bottlenecks
Security & Access Management
Manage data access controls and security policies across all data platforms
Conduct regular security audits and compliance reviews
Work with the Regional team to ensure alignment with global data governance standards
Requirements
Strong experience with data governance frameworks and best practices
Hands-on experience with HDFS, Presto, Spark, and other distributed systems
Proven track record of working with large distributed data warehouse systems and optimizing queries over very high data volumes
Strong SQL skills and understanding of data modeling & data warehouse management principles
Knowledge of data security and access control management
Excellent communication skills to work with both technical and business stakeholders
Experience with data quality tools and frameworks
Experience with Python for automation and monitoring scripts
Experience with Asset Status Tracking systems
Knowledge of notification systems integration (Seatalk or similar)
Background in implementing data governance in cross-functional environments
Experience with LLM applications