Position Overview

Job Title: Data Engineer (6-Month Contract)

Department: Services

Location: Singapore

Reporting To: Contract

Duration: 6 months

Tookitaki is seeking a Data Engineer (Contract) with strong expertise in Apache Spark and Cloudera (CDP) to support high-priority data initiatives for our AI-driven financial crime prevention platforms—FinCense and the AFC Ecosystem. This role will contribute to building and maintaining robust data pipelines that ensure accurate, scalable, and production-grade data processing across real-time and batch workflows.

Position Purpose

This role is designed to support data engineering efforts during a critical delivery phase. The engineer will work closely with platform, product, and services teams to enable high-quality data ingestion, transformation, and availability across Tookitaki’s compliance modules. The work done in this role directly contributes to risk scoring, transaction monitoring, and fraud detection systems for global banks and fintech clients.

Key Responsibilities

1. Spark-Based Data Development

Design and optimize batch and streaming pipelines using Apache Spark.

Debug performance and memory issues in Spark-based ETL processes.

2. Cloudera Data Platform (CDP) Handling

Leverage HDFS, Hive, Impala/Trino, and HBase within Cloudera to support data workflows.

Collaborate with infra teams to ensure CDP cluster reliability and schema alignment.

3. Pipeline Development & Monitoring

Build ingestion pipelines using Kafka, Hive, and Spark for large-scale financial datasets.

Support Airflow-based orchestration and ensure production SLAs are met.

4. Data Validation & Debugging

Write and optimize SQL queries to validate data accuracy and ingestion success.

Assist in tracing pipeline issues and executing backfills if necessary.

5. Cross-Functional Collaboration

Coordinate with data scientists, DevOps, and service teams to support platform releases.

Deliver on strict project timelines tied to active client deployments.

Qualifications and Skills

Education

Bachelor’s/Master’s in Computer Science, Engineering, or related discipline.

Experience

5–8 years as a Data Engineer, with at least 2 years in Spark-heavy environments.

Prior experience working with Cloudera Data Platform (CDP) in production.

Technical Expertise

Apache Spark (Core, SQL, Tuning)

Cloudera CDP: Hive, HDFS, HBase, Impala/Trino

Kafka, Airflow, SQL

Python and Bash scripting

Familiarity with Linux-based environments

Exposure to AWS is a plus

Soft Skills

Strong problem-solving mindset

Ability to thrive in contractual, delivery-driven settings

Clear communication and documentation habits

Focus on execution, quality, and speed

Key Competencies

Data Pipeline Ownership

Big Data Architecture

Execution Agility in Project Timelines

Collaborative Implementation Mindset

Operational Readiness

Success Metrics

On-time delivery of assigned pipeline components

Stability and performance of Spark workflows in UAT and production

Accuracy of data validation and transformation logic

Cross-team satisfaction with deliverables in rollout sprints
