POSITION:
DATA ENGINEER
JOB DESCRIPTION
Design and maintain robust and scalable data pipelines to handle large datasets.
Implement and manage data storage solutions, including databases and data lakes, ensuring data integrity and performance.
Integrate data from various internal and external sources such as databases, APIs, flat files, and streaming data.
Ensure data consistency, quality, and reliability through rigorous validation and transformation processes.
Implement ETL processes to automate data ingestion, transformation, and loading into data warehouses and lakes.
Optimize ETL workflows to ensure efficient processing and minimize data latency.
Implement data quality checks and validation processes to ensure data accuracy and completeness.
Develop data governance frameworks and policies to manage data lifecycle, metadata, and lineage.
Facilitate effective communication and collaboration between the AI/data teams and other technical teams.
Identify areas for improvement in data infrastructure and pipeline processes.
Stay up to date with the latest industry trends and technologies in data engineering and big data.
JOB REQUIREMENTS
Bachelor’s degree in Computer Science, Data Science, or a related field.
Minimum of 4 years of experience as a Data Engineer or in a similar role.
Proven experience with cloud platforms such as AWS, GCP, or Azure.
Strong background in data integration, ETL processes, and data pipeline development.
Experience leading the design and development of high-performance AI and data platforms, including IDEs, permission management, data pipelines, code management, and model deployment systems.
Proficiency in Python or Java.
Strong knowledge of SQL, NoSQL databases, and data lakes.
Experience with big data technologies (e.g., Apache Spark, Hadoop).
Experience with CI/CD tools such as Jenkins, GitLab CI, or CircleCI.
Understanding of data engineering and MLOps methodologies.
Awareness of security best practices in data environments.
Excellent problem-solving skills and attention to detail.