There has never been a more exciting time to join Vistra.
At Vistra, our purpose is progress. We believe our clients have the power to change the world and do great things for global progress, so we exist to remove the friction that comes with the complexity of global business – helping our clients achieve progress without friction.
But progress only happens when people come together and take action. And we’re absolutely committed to building a culture where our people can do just that.
We are seeking a Data Engineer who will design, build, and maintain robust data pipelines and analytics infrastructure to support data-driven decision-making across the organization. Working with modern data technologies including Python, AWS serverless services, MySQL databases, and analytics tools, this role ensures reliable data flow from source systems to analytical platforms. The ideal candidate combines strong data engineering skills with experience in cloud-native architectures to deliver scalable, cost-effective data solutions that enable business intelligence and advanced analytics.
Key Responsibilities:
Design and implement scalable ETL/ELT pipelines using AWS services including AWS Glue, Lambda, S3, and Step Functions to process structured and unstructured data from multiple sources (an illustrative sketch of such a pipeline step follows this list).
Build and optimize data integration processes connecting MySQL databases, APIs, and external data sources to analytical systems and data warehouses.
Develop automated data quality monitoring, validation, and cleansing processes to ensure data accuracy, completeness, and consistency across all data pipelines.
Create and maintain data models, schemas, and documentation to support analytics teams, data scientists, and business stakeholders in accessing and understanding data.
Implement real-time and batch data processing solutions using serverless architectures, optimizing for performance, scalability, and cost-effectiveness.
Collaborate with development teams to integrate data collection points into Next.js applications and Node.js services, ensuring seamless data capture from user interactions and system events.
Build and maintain data analytics APIs and services that provide clean, transformed data to business intelligence tools, dashboards, and reporting systems.
Monitor data pipeline performance, troubleshoot issues, and implement proactive alerting and logging mechanisms using AWS CloudWatch and other monitoring tools.
Design and implement data backup, archival, and disaster recovery strategies to ensure data availability and business continuity.
Work with data analysts and business stakeholders to understand reporting requirements and translate them into efficient data processing workflows.
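To give candidates a concrete feel for the work, here is a minimal sketch of the kind of serverless pipeline step described above: an AWS Lambda handler that picks up a raw CSV landing in S3, applies basic data-quality checks, and writes the cleaned result back as Parquet. The bucket layout, column names ("id", "amount"), and the pandas/pyarrow dependency are illustrative assumptions, not a description of Vistra's actual stack.

```python
# Illustrative sketch only -- bucket prefixes and column names ("id",
# "amount") are hypothetical; pandas/pyarrow are assumed to be available
# to the Lambda via a layer or container image.
import io
import urllib.parse

import boto3
import pandas as pd

s3 = boto3.client("s3")

def handler(event, context):
    # Triggered by an S3 PUT notification; locate the new raw object.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = urllib.parse.unquote_plus(record["object"]["key"])

    # Extract: stream the raw CSV from S3 into a DataFrame.
    body = s3.get_object(Bucket=bucket, Key=key)["Body"]
    df = pd.read_csv(body)

    # Transform: basic data-quality gates -- drop incomplete rows and
    # coerce the amount column to numeric, discarding unparseable values.
    df = df.dropna(subset=["id", "amount"])
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
    df = df[df["amount"].notna()]

    # Load: write columnar Parquet to a curated prefix for Athena/Glue.
    out_key = key.replace("raw/", "curated/", 1).rsplit(".", 1)[0] + ".parquet"
    buf = io.BytesIO()
    df.to_parquet(buf, index=False)
    s3.put_object(Bucket=bucket, Key=out_key, Body=buf.getvalue())

    return {"rows_processed": len(df), "output_key": out_key}
```

In practice, a step like this would sit inside a Step Functions or Glue workflow for orchestration and retries, with CloudWatch providing the alerting and logging mentioned above.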
Required Qualifications:
Bachelor’s degree in Computer Science, Data Engineering, Mathematics, or a related technical field.
4-6 years of hands-on data engineering experience with strong proficiency in Python for data processing, transformation, and pipeline development.
Extensive experience with AWS data services including AWS Glue, Lambda, S3, Athena, Redshift, and Kinesis for building serverless data pipelines.
Strong SQL skills and experience with MySQL database design, optimization, and administration including performance tuning and query optimization.
Experience with data pipeline orchestration tools such as Apache Airflow, AWS Step Functions, or similar workflow management systems.
Proficiency in data formats including JSON, CSV, Parquet, and Avro, with an understanding of when to use each format for optimal performance (a brief illustration follows this list).
Knowledge of data warehousing concepts, dimensional modeling, and analytics best practices for supporting business intelligence requirements.
Experience with version control systems, CI/CD pipelines, and infrastructure as code practices for deploying and managing data infrastructure.
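As a hedged illustration of the format trade-offs mentioned above, the short script below writes the same made-up table as CSV, line-delimited JSON, and Parquet and compares file sizes; actual gains depend on schema, cardinality, and compression settings, and the file names are placeholders.

```python
# Illustrative sketch only -- the sample data and file names are made up;
# real size and performance differences depend on schema and compression.
import os

import pandas as pd

df = pd.DataFrame({
    "id": range(100_000),
    "country": ["SG", "HK", "UK", "US"] * 25_000,  # low-cardinality column
    "amount": [round(i * 0.01, 2) for i in range(100_000)],
})

df.to_csv("sample.csv", index=False)                       # row-oriented text
df.to_json("sample.jsonl", orient="records", lines=True)   # line-delimited JSON
df.to_parquet("sample.parquet", index=False)               # columnar, compressed

for path in ("sample.csv", "sample.jsonl", "sample.parquet"):
    print(f"{path}: {os.path.getsize(path):,} bytes")
```

Columnar Parquet typically comes out smallest and scans fastest in engines like Athena, which read only the columns a query touches, while CSV and JSON remain useful at ingestion and interchange boundaries.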
Bonus Qualifications:
AWS certifications such as AWS Certified Data Analytics – Specialty or AWS Certified Solutions Architect.
Experience with streaming data technologies including Apache Kafka, AWS Kinesis, or real-time data processing frameworks.
Knowledge of machine learning workflows and experience building data pipelines that support ML model training and inference.
Familiarity with business intelligence tools such as Tableau, Power BI, or AWS QuickSight for creating data visualizations and dashboards.
Experience with containerization technologies like Docker and orchestration platforms for deploying data processing applications.
Understanding of data governance, privacy regulations, and security best practices for handling sensitive data in cloud environments.
Experience with NoSQL databases such as DynamoDB, MongoDB, or Elasticsearch for handling unstructured data and high-volume analytics workloads.
Company Benefits:
At our Singapore office, we believe in putting our employees’ well-being first! We offer a flexible hybrid working arrangement and birthday leave.
Additionally, we provide comprehensive medical insurance and dental coverage, a wellness allowance, and a competitive annual leave entitlement to support your well-being and give you time to recharge or explore your passions outside of work.