🚀
**Be part of a movement to change the way Europe pays**
In today’s digital world, payments often still feel outdated: random delays and confusing rules make it harder than it should be to pay and get paid. The European Payments Initiative (EPI) is here to change all that, forever.
With Wero, our digital wallet, we make sending and receiving money simple, seamless and secure across France, Belgium and Germany, with more countries and omnichannel solutions coming soon. Supported by 14 major banks and the two largest European acquirers, EPI is building a new, proudly European payment system: easy, instant and transparent, all for the greater good.
🔎
What's in it for you
We are currently on the lookout for an NLP Data Engineer with hands-on experience in text data processing and document ranking. You’ll join a team working on heterogeneous data sources, where your work will help establish processing pipelines that optimize document relevance and retrieval.
🐝 About The Team
You’ll join a pan-European AI team that reports directly to the company's CTPO. The team is dedicated to building reliable and impactful GenAI solutions. It operates in an Agile environment and works closely across countries to design, benchmark, and deploy LLM and RAG systems that truly make a difference. Team members are based across Europe, in Germany, France, the Netherlands and Belgium.
💥
Your impact
Prepare, clean, and sanitize heterogeneous text data sources (Confluence, README files, Slack, GitHub) for use in AI applications
Design and implement document scoring and ranking pipelines using models like BM25, TF-IDF, or neural rankers
Build and optimize embedding workflows to support semantic similarity, nearest-neighbour search, and RAG pipelines
Research and benchmark the latest NLP methods to continuously improve document processing, retrieval, and ranking
💻
Technology stack
Languages & frameworks: Python, PySpark, NumPy, pandas
NLP libraries: spaCy, NLTK, Hugging Face Transformers
Embedding models: Word2Vec, GloVe, FastText, BERT, RoBERTa, LASER
RAG frameworks: LangChain, LlamaIndex, Haystack, Guidance, PromptLayer
Infrastructure & deployment: AWS, Docker, Kubernetes
🕵🏻♀️
To succeed, you should meet at least 70% of these requirements
3+ years’ experience building NLP and text processing pipelines. Prior experience in AI and ML more broadly is an advantage.
Ability to understand, design, and apply evaluation metrics for document ranking and retrieval
Experience with embeddings and similarity search, including pre-trained and fine-tuned models such as BERT, Word2Vec, or FastText
Experience implementing document scoring and ranking models (BM25, TF-IDF, or neural rankers)
Familiarity with RAG frameworks such as LangChain, LlamaIndex, or Haystack
Proficiency in Python and NLP libraries such as spaCy, NLTK, and Hugging Face Transformers
Ability to evaluate, design, and implement improvements to document processing and retrieval workflows
Fluent in English (CEFR C1 or C2); French, German, Dutch, or other European languages are a plus.
🪜 If this looks like you, the recruitment steps are:
A first call with one of our recruiters
An interview focused on the mission and your expertise with your future manager
A final interview with our CTPO
Hopefully, an offer you can’t refuse
⛔ Turn back if …
You’re not comfortable getting hands-on with messy, heterogeneous text data and turning it into structured, usable inputs
You lack a solid understanding of document ranking, embeddings, and the importance of metrics-driven evaluation
You're not confident using both pre-built NLP tools and custom-built NLP and text processing pipelines
Otherwise, apply!
🫶 Our commitment to equal employment opportunities
EPI offers the same job opportunities to all, without distinction of gender, ethnicity, religion, sexual orientation, social status, disability or age. EPI promotes the development of an inclusive work environment that mirrors the diversity of the clients our product serves.