
Senior Data Engineer

PAVE • 🌐 In Person

Posted 1 day, 8 hours ago

Job Description

Position Overview

We’re looking for a talented Data Engineer with strong AWS expertise to design, build, and maintain the data infrastructure that powers our vehicle inspection platform. At Pave.ai, you’ll be responsible for developing scalable and reliable data pipelines that process millions of vehicle inspections, images, and automotive data points — delivering real-time insights to customers across the automotive ecosystem.

In this role, you will collaborate closely with our engineering and data science teams based in both Canada and Vietnam, working together to design end-to-end solutions that support advanced analytics, machine learning models, and business intelligence tools. You’ll play a key role in ensuring data accuracy, scalability, and system performance.

Key Responsibilities

Data Pipeline Development

Design and implement scalable ETL/ELT pipelines for processing vehicle inspection data, images, and metadata

Build real-time data processing workflows for instant inspection results and damage detection

Create data ingestion solutions from mobile apps, APIs, IoT devices, and third-party automotive systems

Implement data quality frameworks to ensure inspection accuracy and compliance

Optimize pipelines for processing high-volume image data and computer vision outputs

AWS Data Platform Management

Architect data warehousing solutions using Amazon Redshift for vehicle inspection analytics

Design schemas optimized for automotive data (VIN, inspection history, damage reports, pricing)

Implement data lakes using S3 for storing inspection images, videos, and unstructured data

Manage inspection metadata and vehicle catalogs using AWS Glue Data Catalog

Build ML-ready datasets for computer vision and damage detection models

Analytics & Visualization

Develop QuickSight dashboards for vehicle inspection metrics, damage trends, and pricing analytics

Create self-service analytics for dealerships, insurers, and fleet operators

Build real-time inspection monitoring dashboards for quality assurance

Implement predictive analytics for vehicle valuation and damage assessment

Design automated reports for inspection volumes, accuracy rates, and customer KPIs

Data Integration & Orchestration

Integrate with automotive data providers (Carfax, KBB, automotive APIs)

Build real-time processing for mobile inspection data using Kinesis

Implement workflows connecting inspection data with customer CRMs and dealer management systems

Design event-driven architectures for inspection status updates and notifications

Create APIs for inspection data access by partners and third-party platforms

Infrastructure & Operations

Implement Infrastructure as Code using CloudFormation or Terraform

Set up monitoring and alerting using CloudWatch and SNS

Ensure data security through encryption, VPC configuration, and IAM policies

Optimize AWS costs through resource management and Reserved Instances

Maintain data recovery and backup strategies

Own the operational reliability of the data platform, including versioned pipelines, CI/CD integration for test data provisioning, and data quality and governance improvements that prevent application failures caused by mismatches between raw and processed data

Required Qualifications

Experience

4+ years of experience as a Data Engineer or in a similar role

3+ years of hands-on experience with AWS data services

Experience with image/video data processing and storage at scale

Background in automotive, insurance, or inspection technology is a plus

Proven track record of building production data pipelines for high-volume consumer applications

Technical Skills

AWS Services Expertise:

Amazon Redshift: Cluster management, performance tuning, Spectrum

Amazon QuickSight: Dashboard development, SPICE, ML insights

AWS Glue: ETL jobs, crawlers, data catalog

Amazon S3: Data lake architecture, lifecycle policies, partitioning

Amazon Athena: Query optimization, partition projection

Amazon Kinesis: Real-time data streaming and analytics

AWS Lambda: Serverless data processing

Amazon EMR: Big data processing with Spark/Hadoop

Programming and Tools:

Strong programming skills in Python and SQL

Experience with PySpark or Spark SQL

Proficiency with Git and CI/CD pipelines

Knowledge of data orchestration tools (Airflow, Step Functions)

Familiarity with dbt (data build tool) for data transformation

Data Engineering Concepts:

Strong understanding of data warehousing and data lake architectures

Experience with both batch and stream processing paradigms

Knowledge of data modeling techniques (star schema, data vault)

Understanding of data governance and lineage

Core Competencies

Strong analytical and problem-solving skills

Excellent communication skills for working with technical and business stakeholders

Self-motivated, with the ability to work independently

Detail-oriented approach to data quality

Passion for automation and optimization

Preferred Qualifications

AWS Certifications (Solutions Architect, Data Analytics Specialty)

Experience with computer vision data pipelines and ML model deployment

Knowledge of automotive industry data standards (VIN decoding, OBD-II)

Experience with geospatial data and location-based analytics

Familiarity with image optimization and CDN strategies

Understanding of data privacy regulations (GDPR, CCPA) for consumer data

Experience with mobile app analytics and real-time data synchronization

Background in building multi-tenant SaaS data architectures

What We Offer

Competitive salary

Flexible work arrangements, including hybrid options

13th-month bonus in accordance with company policy

Comprehensive health, dental, and vision insurance for the employee and one dependent

Professional development budget

Opportunity to shape the future of AI technology

Collaborative and innovative work environment
