Description
We are looking for a skilled Data Engineer to join our Tech team. The ideal candidate will have experience with SQL and Python for developing data pipelines.
Responsibilities
The incumbent is expected to deliver the following responsibilities, among others:
Collaboration and Support: Collaborate with cross-functional teams to understand data requirements and provide technical support for data-related initiatives and projects.
Data Pipeline Implementation: Design, develop, and maintain scalable data pipelines to ingest, transform, and load data from various sources into cloud-based storage and analytics platforms using Python, PySpark, and SQL.
Data Infrastructure Development: Design, build, and maintain scalable data infrastructure on Azure, AWS, or GCP, using PySpark for data processing, to support various data initiatives and analytics needs within the organization.
Performance Optimization: Optimize data processing workflows and cloud resources for efficiency and cost-effectiveness, leveraging PySpark's capabilities, and implement data quality checks and monitoring to ensure the reliability and integrity of data pipelines.
Data Warehousing: Build and optimize data warehouse solutions for efficient storage and retrieval of large volumes of structured and unstructured data.
Data Governance and Security: Implement data governance policies and security controls to ensure compliance and protect sensitive information across Azure, AWS, and GCP environments.
Qualifications and Skills
An ideal candidate will be/have:
Bachelor's degree in Computer Science, Engineering, Statistics, Mathematics, or a related field; Master's degree preferred
2 years of experience as a Data Engineer
Experience with cloud data storage is mandatory
Strong understanding of data modeling, ETL processes, and data warehousing concepts
Experience with SQL, relational data modelling, and sound knowledge of database administration is mandatory
Proficiency in Python for data engineering, including developing data pipelines, ETL (Extract, Transform, Load) processes, and automation scripts
Proficiency in Microsoft Excel
Experience integrating data management into business and data analytics is mandatory
Experience working with cloud platforms for deploying and managing scalable data infrastructure
Experience with technologies such as dbt, Airflow, Snowflake, and Databricks, among others, is a plus
Excellent stakeholder communication skills
Familiarity with working with numerous large data sets
Comfort in a fast-paced environment
Strong analytical skills with the ability to collect, organize, analyze, and disseminate significant amounts of information with attention to detail and accuracy
Excellent problem-solving skills
Advanced English is mandatory
Strong interpersonal and communication skills for working with cross-functional teams
Proactive approach to continuous learning and skill development