
Senior Data Quality Engineer

EPAM Systems • 🌐 Remote

Posted 1 day, 10 hours ago

Job Description

We are looking for a skilled and experienced Senior Data Quality Engineer to join our team. In this role, you will play a critical part in ensuring the accuracy, reliability, and efficiency of our data systems and processes at scale. If you are passionate about leading impactful data quality initiatives and working with cutting-edge technologies, this position will allow you to shape the future of our data ecosystem.

Responsibilities

Lead the development and execution of data quality strategies, ensuring accuracy and reliability across data products and processes

Drive data quality initiatives while promoting best practices across teams and projects

Develop and implement advanced testing frameworks and methodologies to meet enterprise data quality standards

Manage and prioritize complex data quality tasks, ensuring efficiency under tight deadlines and competing priorities

Design and maintain comprehensive testing strategies for evolving system architectures and data pipelines

Provide guidance on resource allocation and prioritize testing efforts to align with business and regulatory requirements

Establish and continuously improve a data quality governance framework to ensure compliance with industry standards

Build, scale, and optimize automated data quality validation pipelines for production environments; a brief sketch of one such check follows this list

Collaborate with cross-functional teams to address infrastructure challenges and enhance system performance

Mentor junior team members and maintain detailed documentation for test strategies, plans, and frameworks
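To give a concrete flavor of this work, below is a minimal sketch of one automated validation check in Python. It assumes a pandas DataFrame as input; the dataset, column names, and thresholds are illustrative assumptions, not details from this posting.

```python
# Minimal sketch of automated data quality checks, assuming a pandas
# DataFrame as input. Dataset, columns, and thresholds are hypothetical.
import pandas as pd

def check_completeness(df: pd.DataFrame, column: str, max_null_ratio: float = 0.01) -> dict:
    """Flag a column whose null ratio exceeds an agreed threshold."""
    null_ratio = float(df[column].isna().mean())
    return {
        "check": "completeness",
        "column": column,
        "null_ratio": round(null_ratio, 4),
        "passed": null_ratio <= max_null_ratio,
    }

def check_uniqueness(df: pd.DataFrame, column: str) -> dict:
    """Flag duplicate values in a column that should be a unique key."""
    duplicates = int(df[column].duplicated().sum())
    return {"check": "uniqueness", "column": column, "duplicates": duplicates, "passed": duplicates == 0}

if __name__ == "__main__":
    # Hypothetical orders extract; in production this would come from the pipeline.
    orders = pd.DataFrame({"order_id": [1, 2, 2, 4], "amount": [9.99, None, 14.50, 3.25]})
    for result in (check_completeness(orders, "amount"), check_uniqueness(orders, "order_id")):
        print(result)
```

In a production pipeline, checks like these would typically run as a scheduled or event-driven stage, with failures routed to alerting rather than printed.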

Requirements

At least 3 years of professional experience in Data Quality Engineering

Advanced programming skills in Python for data validation and automation

Expertise in Big Data platforms, including tools from the Hadoop ecosystem such as HDFS, Hive, and Spark, as well as modern streaming platforms like Kafka, Flume, or Kinesis

Practical experience managing large-scale datasets in NoSQL databases such as Cassandra, MongoDB, or HBase

Proficiency in data visualization tools like Tableau, Power BI, or TIBCO Spotfire to support analytics and decision-making

Extensive experience with cloud platforms such as AWS, Azure, or GCP, with a strong understanding of multi-cloud architectures

Advanced knowledge of relational databases and SQL (PostgreSQL, MSSQL, MySQL, Oracle) in high-volume, real-time environments

Proven experience in implementing and scaling ETL processes using tools like Talend, Informatica, or similar platforms

Familiarity with deploying and integrating Master Data Management (MDM) tools into workflows, as well as with performance testing tools like JMeter

Advanced experience with version control systems and platforms such as Git, GitLab, or SVN, and expertise in automation for large-scale systems

Comprehensive understanding of modern testing approaches (TDD, data-driven testing, behavior-driven testing) and their application in data environments; a parametrized data-driven test sketch follows this list

Experience with CI/CD practices, including pipeline implementation using tools like Jenkins or GitHub Actions

Strong analytical and problem-solving skills, with the ability to translate complex datasets into actionable insights

Exceptional English communication skills (B2 level or higher), with experience engaging stakeholders and leading discussions
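To illustrate the data-driven testing (DDT) style referenced above, here is a hedged sketch in which each quality rule is a row of test data fed to a single parametrized pytest case. The table, rules, and expected values are invented for illustration; an in-memory SQLite database stands in for a real warehouse.

```python
# Sketch of data-driven data quality tests: each rule is a (name, SQL, expected)
# tuple driving one parametrized test. Table and rules are hypothetical.
import sqlite3
import pytest

@pytest.fixture()
def conn():
    c = sqlite3.connect(":memory:")
    c.execute("CREATE TABLE customers (id INTEGER, email TEXT, country TEXT)")
    c.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "a@example.com", "US"), (2, "b@example.com", "DE"), (3, "c@example.com", "US")],
    )
    yield c
    c.close()

# Each row of this table drives the same assertion logic (the data-driven part).
QUALITY_RULES = [
    ("no duplicate ids", "SELECT COUNT(*) - COUNT(DISTINCT id) FROM customers", 0),
    ("no null emails", "SELECT COUNT(*) FROM customers WHERE email IS NULL", 0),
    ("country codes are 2 chars", "SELECT COUNT(*) FROM customers WHERE LENGTH(country) <> 2", 0),
]

@pytest.mark.parametrize("name,query,expected", QUALITY_RULES, ids=[r[0] for r in QUALITY_RULES])
def test_quality_rule(conn, name, query, expected):
    actual = conn.execute(query).fetchone()[0]
    assert actual == expected, f"rule '{name}' failed: got {actual}, expected {expected}"
```

Run with `pytest`; each rule appears as its own test case in the report, so a failing rule is immediately identifiable.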

Nice to have

Hands-on experience with additional programming languages such as Java or Scala, or with advanced Bash scripting, for production data solutions

Advanced knowledge of XPath and its use in data validation and transformation workflows; a short XPath validation sketch follows this list

Experience designing custom data generation tools and synthetic data techniques for advanced testing scenarios
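As a hint of how XPath typically appears in validation work, the sketch below uses lxml (an assumed choice; the posting names no XML library) to flag structurally invalid records in a hypothetical XML feed.

```python
# Sketch of XPath-based validation on an XML feed, using lxml (assumed here;
# the posting does not name a library). Feed structure is hypothetical.
from lxml import etree

FEED = b"""
<orders>
  <order id="1001"><amount currency="USD">9.99</amount></order>
  <order id=""><amount>14.50</amount></order>
</orders>
"""

def validate_feed(xml_bytes: bytes) -> list[str]:
    root = etree.fromstring(xml_bytes)
    errors = []
    # Every <order> must carry a non-empty id attribute.
    for order in root.xpath("//order[not(@id) or @id='']"):
        errors.append(f"order at line {order.sourceline} has a missing or empty id")
    # Every <amount> must declare a currency.
    for amount in root.xpath("//amount[not(@currency)]"):
        errors.append(f"amount at line {amount.sourceline} lacks a currency attribute")
    return errors

if __name__ == "__main__":
    for problem in validate_feed(FEED):
        print(problem)
```

The same expressions could equally drive transformation logic, e.g. selecting only the valid order nodes for downstream loading.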
