Domain Skills:
1. KYC / AML expertise
2. Client Onboarding & Lifecycle Management
Technical Skills:
Data Pipeline Development (scalable ETL/ELT pipelines)
Big Data Infrastructure (Spark, Flink, Hadoop, Kafka)
Data Orchestration (Airflow or Prefect)
Data Processing Frameworks (Spark, Hadoop, or Flink)
Cloud Data Platforms (AWS, Azure, or GCP)
Programming (Python or Scala)
SQL and database technologies (e.g., Oracle, PostgreSQL)
NoSQL database technologies (e.g., Mongo, Couch)
Technical Skills (Plus):
Containerization (Docker, Kubernetes)
Technical Skills (Good to Have):
Large-scale document processing (spaCy, NLTK, LLMs)
Agentic RAG frameworks (LangChain, CrewAI, vector databases)
Fine-tuning LLMs
Distributed caching (Hazelcast or Redis)
Distributed, multi-tier application experience
High-performance, scalable application experience
Soft Skills:
Strong communication and stakeholder management
Analytical mindset with attention to data accuracy and integrity
Ability to work in agile, cross-functional teams