Responsibilities
Data Infrastructure is at the core of a data-driven culture. The Data Infrastructure team is responsible for building and maintaining the platforms and frameworks that drive the data ecosystem. These systems include data pipelines (ingestion), the Spark ecosystem and other frameworks for creating derived data, the company-wide data lake, and query execution systems (Presto/Trino).
Model multiple data streams into storage formats such as Parquet to support statistical calculations.
Work with product teams (application, BI, DS, etc.) to help them make the most of their data.
Implement observability systems to track data quality and consistency.
Basic Qualifications
Proficiency in Python and an intention to pursue a career in Big Data.
Knowledge of DevOps.
Understanding of data flow in a data warehouse (DWH).
At least 2 years of relevant working experience.
Preferred Qualifications
Proficiency in Java and Scala.
Good understanding of data management: data lineage, metadata, and data governance.
Understanding of big data and big data platforms.
Knowledge of S3, Spark, Jupyter, Hive, Docker, Kubernetes, Airflow, DataHub, Trino, StarRocks, and Superset.
Task and time management skills; a proactive problem solver.
Ability to learn independently and apply new knowledge at work.
Teamwork and communication skills.
Why you'll love working here
Negotiable salary according to qualifications (13th-month salary + performance bonus);
An open, energetic, and professional working environment with more opportunities for career advancement;
Free food & drink: lunch, fresh fruit/cake, coffee & tea;
Premium healthcare: annual health check and attractive healthcare coverage under the company’s own policy;
Employee relations: company trips, team bonding & sports clubs;
Paid leave: 12 days of annual leave & 6 days of sick leave (max 18 days).