Wanted: Data Scientist who will help us keep LLMs under control.
About us
============
White Circle is an AI Safety company building the safety, reliability, and optimization layer for AI systems. At the core of our platform are policies: simple natural-language rules that define what an AI model should and shouldn't do. We automatically test, enforce, and continuously improve these policies at scale.
We've raised $11M from top funds, founders, and senior leaders at OpenAI, Anthropic, HuggingFace, Mistral, DeepMind, Datadog, Sentry, and others
We process over one hundred million API calls every month
We fine-tune and train our own LLMs so they run faster and cheaper than any open or proprietary model
We're a small, highly focused team. If you want to work deeply on hard problems, see your work ship to production quickly, and influence how AI safety is actually built, you're the one we need.
You will:
=============
Turn petabytes of unstructured text into a structured, explorable view (topics, clusters, segments, trends, anomalies): iterate from "unknown unknowns" to stable definitions we can track.
Build scalable representation pipelines: sampling strategies, preprocessing/normalization, embeddings at scale, indexing, and retrieval to make the corpus searchable and analyzable.
Use LLMs pragmatically: labeling/classification, weak supervision, data enrichment, summarization, and automated diagnostics of inbound volumes (with cost/quality controls).
Deliver insights that change decisions: translate findings into product and operational actions (what data we have, what's missing, where quality breaks, what to prioritize next).
Ship self-serve analytics: datasets, data models, and lightweight tools/dashboards so the team can explore and answer questions without ad-hoc requests.
Partner closely with engineering/research: align pipelines with production constraints (latency/cost/privacy), and integrate outputs into workflows.
You'll fit right in if you:
===============================
Have strong Python + SQL skills with an engineering mindset: you can build reliable pipelines, not just notebooks.
Have solid applied NLP/ML experience on real-world text: embeddings, clustering, topic modeling, semantic search, classification; you understand failure modes and how to debug them.
Are comfortable at scale: distributed processing, large-scale storage and querying, and performance/cost tradeoffs.
Know how to evaluate fuzzy problems: offline/online metrics, human-in-the-loop labeling, inter-annotator agreement, drift monitoring, and reproducibility.
Have prior work with safety/moderation datasets, policy/rule systems, or high-volume logging/observability.
Why White Circle
====================
Salary of $80,000 to $150,000 + equity
20 days of paid vacation
Work from Paris (hybrid) + relocation package
Best medical insurance in France
All the hardware, tools, and services you need
Covered subscriptions for AI agents and IDEs
Team off-sites twice a year: we've recently been to the Alps and to Saint-Tropez
How we hire
===============
Intro call with one of our colleagues
Complete the take-home assignment
Show your best during the technical interview
Final call with our CEO and CTO
Please submit your application in English - it's our company language so you'll be speaking lots of it if you join