
Data Engineer

Atlantis IT Group

Posted 2 days, 21 hours ago

Job Description

Job Title: GCP Data Engineer

Location: Toronto, ON – Hybrid

Contract Role

We are seeking a highly skilled Senior Data Engineer with deep expertise in Google Cloud Platform (GCP), distributed data processing, and cloud-native data architectures. This role involves designing, building, optimizing, and maintaining scalable data pipelines and analytical platforms supporting enterprise‑grade workloads. The ideal candidate brings strong hands‑on experience in BigQuery, Dataflow, Dataproc, Dataform, Cloud Composer (Airflow), PySpark, and end‑to‑end ELT/ETL frameworks, along with robust knowledge of metadata, lineage, data quality, and CI/CD automation.

Key Responsibilities

1. Data Engineering & Architecture

Design and implement end‑to‑end data architectures on GCP, including data lakes, data marts, and warehouse models.

Build scalable batch and streaming pipelines using Dataflow, Dataproc (Spark), Dataform, and Pub/Sub.

Architect low‑latency, high‑throughput processing solutions supporting advanced analytics and ML workloads.

Develop pre‑aggregated models, materialized views, and optimized analytical structures in BigQuery.

2. ETL/ELT Pipeline Development

Design, develop, test, and optimize ELT/ETL pipelines for structured and unstructured data.

Use Dataform and Cloud Composer (Airflow) for orchestration, dependency management, and metadata logging.

Implement best practices for ingestion, transformation, storage, and data access patterns.
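As a rough illustration of the ELT/ETL pattern these responsibilities describe, the sketch below models ingestion, transformation, and load as separate stages in plain Python. All names (`extract`, `transform`, `load`, the sample columns, and the in-memory "warehouse") are hypothetical stand-ins, not part of this role's actual stack; in a real GCP pipeline these stages would typically be Dataflow or Dataform jobs orchestrated by Cloud Composer (Airflow).

```python
# Minimal ELT-style pipeline sketch. Function and table names are
# illustrative assumptions only; a dict stands in for a BigQuery table.

def extract(rows):
    """Ingest raw source records (plain dicts standing in for source rows)."""
    return list(rows)

def transform(rows):
    """Apply a simple cleaning/typing transformation to each record."""
    out = []
    for r in rows:
        out.append({
            "user_id": int(r["user_id"]),
            "amount": round(float(r["amount"]), 2),
        })
    return out

def load(rows, table):
    """Append transformed rows to a destination 'table'; return rows loaded."""
    table.setdefault("rows", []).extend(rows)
    return len(rows)

if __name__ == "__main__":
    warehouse = {}
    raw = [{"user_id": "1", "amount": "10.5"},
           {"user_id": "2", "amount": "3.333"}]
    loaded = load(transform(extract(raw)), warehouse)
    print(loaded)  # number of rows loaded
```

Keeping the stages as independent, testable units is the main point: each stage can then be validated, retried, and monitored separately by the orchestrator.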

3. Data Quality, Metadata & Governance

Implement enterprise‑grade data quality checks using Great Expectations or custom Python frameworks.

Manage metadata, lineage tracking, data cataloging, and compliance with governance standards.

Ensure data integrity, schema enforcement, and security‑by‑design principles across all data pipelines.
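The "custom Python frameworks" option mentioned above can be as simple as a set of check functions gated on before load. The sketch below is a minimal, hypothetical version of that idea (the check names, result shape, and sample data are all assumptions, not a real framework or this employer's code); Great Expectations provides the same kind of checks in a declarative form.

```python
# Lightweight data-quality checks in plain Python (illustrative only).
# Each check returns a result dict; run_checks aggregates them into a
# pass/fail gate a pipeline could enforce before loading data.

def check_not_null(rows, column):
    """Fail if any row has a null value in the given column."""
    failures = [i for i, r in enumerate(rows) if r.get(column) is None]
    return {"check": f"not_null:{column}", "passed": not failures,
            "failing_rows": failures}

def check_unique(rows, column):
    """Fail if the given column contains duplicate values."""
    seen, dupes = set(), []
    for i, r in enumerate(rows):
        v = r.get(column)
        if v in seen:
            dupes.append(i)
        seen.add(v)
    return {"check": f"unique:{column}", "passed": not dupes,
            "failing_rows": dupes}

def run_checks(rows, checks):
    """Run every check; a deployment gate would block if any check fails."""
    results = [check(rows) for check in checks]
    return results, all(r["passed"] for r in results)

if __name__ == "__main__":
    data = [{"id": 1, "email": "a@x.com"},
            {"id": 2, "email": None},
            {"id": 2, "email": "b@x.com"}]
    results, ok = run_checks(data, [
        lambda rows: check_not_null(rows, "email"),
        lambda rows: check_unique(rows, "id"),
    ])
    print(ok)  # False: both checks fail on this sample
```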

4. Cloud Infrastructure & DevOps

Build and automate cloud infrastructure using Terraform, Jenkins, GitLab CI, and IaC best practices.

Develop CI/CD workflows for pipeline deployments, testing gates, and operational automation.

Monitor pipelines using Cloud Monitoring & Logging, optimizing for performance and cost.

5. Cross‑Functional Collaboration

Work closely with data scientists, analysts, platform engineering, and product owners to translate complex business needs into scalable data solutions.

Support legacy-to-GCP migration initiatives, including Hadoop and on‑premises workloads.

Enable advanced analytics and ML workloads through ML‑ready data pipelines.

6. Advanced Analytics & ML Support

Support feature engineering and ML data preparation for Vertex AI, Gemini, Hugging Face, or other ML platforms.

Enable vector database workflows and generative AI data pipelines.

Required Technical Skills

Cloud & Big Data

Google Cloud Platform: BigQuery, Dataproc, Dataflow, Cloud Composer (Airflow), GCS, Cloud Run, Eventarc

Distributed Computing: Apache Spark, PySpark, Kafka

Data Lake & Lakehouse Architectures

Programming \& Tools

Python, SQL, Java

Git, Bitbucket, Jenkins, GitLab CI

Terraform (IaC)

REST APIs, FastAPI

Airflow DAG development

Data Engineering Competencies

Data modeling (OLTP/OLAP)

Data Warehousing

ELT/ETL pipelines

Streaming \& real‑time processing

Data profiling and validation

Metadata, lineage, quality management
