Project overview: The Advanced Analytics team delivers long-term value across the company by building custom data products and analytical models, primarily with Machine Learning but also with a broader range of advanced analytics techniques. Our work enhances business capabilities and integrates directly with operational systems. We follow Agile development practices, use Lean tools for change management, and rely on an MLOps environment to deliver scalable, production-ready solutions.
Position overview: We’re looking for a Data Engineer to join the Advanced Analytics team. You’ll design and build the data pipelines that power our data products, working closely with Data Scientists and ML Engineers. This includes ingesting data from raw and structured sources (batch and streaming), integrating with the Data Lake, and keeping pipelines at production quality through engineering best practices. The ideal candidate is hands-on, collaborative, and passionate about data engineering in a cloud-first environment.
Responsibilities: Design and implement scalable data pipelines (batch and streaming)
Collaborate with Data Scientists and ML Engineers to support model development and retraining
Ingest data into the Data Lake using GCP-native tools
Apply MLOps and CI/CD best practices in pipeline development
Build and maintain ETL/ELT processes and unified data models
Continuously test and optimize data workflows
Promote a data-driven culture with a focus on reliability and scalability
Requirements: 4+ years of experience as a Data Engineer or Software Engineer
Experience with GenAI projects
Strong Python and SQL skills
Experience working in UNIX environments
Hands-on experience with GCP, especially its data tools (e.g., BigQuery, Dataflow, Pub/Sub, Cloud Storage, Composer)
Experience with CI/CD pipelines and Docker
Solid understanding of data structures and distributed processing
Experience testing and debugging data pipelines
Nice to have: Experience with Terraform
Experience with API development
Experience with stream processing tools (e.g., Apache Beam, Kafka)
Exposure to MLOps workflows and tools