Hybrid – 3 days on site
Key Responsibilities
Data Pipeline Development
Design and implement robust ETL/ELT pipelines using GCP services such as Dataflow, Dataproc, Cloud Composer, and Data Fusion.
Automate data ingestion from diverse sources (APIs, databases, flat files) into BigQuery and Cloud Storage; a minimal ingestion sketch follows below.
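Illustrative only: a minimal Apache Beam sketch of the ingestion pattern described above, loading a CSV from Cloud Storage into BigQuery. The project, bucket, table, and schema names are placeholders, not part of the role.

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def run():
    # Placeholder project/bucket/region; swap in real values before running.
    options = PipelineOptions(
        runner="DataflowRunner",          # use "DirectRunner" for local testing
        project="my-project",
        region="us-central1",
        temp_location="gs://my-bucket/tmp",
    )
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadCsv" >> beam.io.ReadFromText(
                "gs://my-bucket/raw/orders.csv", skip_header_lines=1)
            | "ParseRow" >> beam.Map(
                lambda line: dict(zip(("order_id", "amount"), line.split(","))))
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "my-project:analytics.orders",
                schema="order_id:STRING,amount:STRING",
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )


if __name__ == "__main__":
    run()
```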
Data Modelling & Warehousing
Develop and maintain data models and marts in BigQuery.
Optimize data storage and retrieval for performance and cost efficiency (see the partitioning sketch below).
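As a sketch of the storage-optimization point above (assuming the google-cloud-bigquery client library; the dataset, table, and column names are hypothetical), a date-partitioned and clustered mart table keeps scan costs proportional to the data actually queried:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project ID

# Partitioning by order date and clustering by customer_id limits how much
# data BigQuery scans for typical filtered queries, reducing cost.
ddl = """
CREATE TABLE IF NOT EXISTS analytics.orders_mart
PARTITION BY DATE(order_ts)
CLUSTER BY customer_id AS
SELECT order_id, customer_id, order_ts, amount
FROM analytics.orders
"""
client.query(ddl).result()
```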
Security & Compliance
Implement GCP security best practices, including IAM, VPC Service Controls, and encryption; a small IAM sketch follows below.
Ensure compliance with GDPR, HIPAA, and other regulatory standards.
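A minimal IAM sketch using the google-cloud-storage client, granting least-privilege read access on a bucket to a pipeline service account; the bucket name and service-account address are invented for illustration.

```python
from google.cloud import storage

client = storage.Client(project="my-project")   # placeholder project ID
bucket = client.bucket("my-data-lake-bucket")   # hypothetical bucket

# Grant read-only object access to the ETL service account (least privilege).
policy = bucket.get_iam_policy(requested_policy_version=3)
policy.bindings.append({
    "role": "roles/storage.objectViewer",
    "members": {"serviceAccount:etl-runner@my-project.iam.gserviceaccount.com"},
})
bucket.set_iam_policy(policy)
```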
Monitoring & Optimization
Set up monitoring and alerting using Cloud Monitoring and Cloud Logging (formerly Stackdriver).
Create custom log-based metrics and dashboards for pipeline health and performance; a brief sketch follows below.
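A brief sketch of a custom log-based metric, assuming the google-cloud-logging client library; the metric name and filter are illustrative and would feed an alerting policy or dashboard.

```python
from google.cloud import logging

client = logging.Client(project="my-project")  # placeholder project ID

# Log-based metric counting Dataflow error entries; the filter is illustrative.
metric = client.metric(
    "dataflow_pipeline_errors",
    filter_='resource.type="dataflow_step" AND severity>=ERROR',
    description="Count of Dataflow error log entries, used for alerting.",
)
if not metric.exists():
    metric.create()
```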
Collaboration & Support
Work closely with cross-functional teams to gather requirements and deliver data solutions.
Provide architectural guidance and support for cloud migration and modernization initiatives.
Skillset
Technical Skills
Languages: Python, SQL, Java (optional)
GCP Services: BigQuery, Dataflow, Dataproc, Cloud Storage, Cloud SQL, Cloud Functions, Composer (Airflow), App Engine
Tools: GitHub, Jenkins, Terraform, dbt, Apache Beam
Databases: Oracle, Postgres, MySQL, Snowflake (basic)
Orchestration: Airflow, Cloud Composer
Monitoring: Cloud Monitoring and Cloud Logging (formerly Stackdriver), alerting
Certifications
Google Cloud Certified – Professional Data Engineer
Google Cloud Certified – Associate Cloud Engineer
Google Cloud Certified – Professional Cloud Architect (optional)
Soft Skills
Strong analytical and problem-solving skills
Excellent communication and stakeholder management
Ability to work in Agile environments and manage multiple priorities
Experience Requirements
Extensive experience in data engineering
Strong hands-on experience with GCP
Experience in cloud migration and real-time data processing is a plus.