
Python Data Engineer – AI/Healthcare

Hays • 🌐 In Person

Posted 3 days, 16 hours ago

Job Description

Your New Company

A leading Canadian retailer at the forefront of food and health innovation, committed to delivering quality, value, and sustainability. With a diverse portfolio of trusted brands and a strong presence across grocery, pharmacy, and wellness, this organization plays a vital role in serving millions of Canadians every day. Their focus on community, environmental responsibility, and long-term growth makes them a cornerstone of the Canadian retail landscape.

Your Role:

In this role, you will bridge the gap between complex data orchestration and advanced machine learning services. You will be responsible for developing the "brain" of our agent—implementing multi-agent architectures, robust RAG pipelines, and safety guardrails—while ensuring our underlying data ecosystem (Vector DBs, SQL, and Knowledge Graphs) is perfectly tuned for high-stakes medical contexts.

Agentic Systems & ML Services

Architect Multi-Agent Workflows:

Design and implement a 2-layer supervisor-router graph and subgraphs using LangGraph and LangChain to coordinate complex task execution.

Reasoning & Planning:

Implement ReAct patterns to enable the AI to perform autonomous chain-of-thought reasoning, strategic planning, and tool-based actions.

Tool Integration:

Develop and maintain function calling capabilities and Model Context Protocol (MCP) servers to allow the agent to interact with external APIs and databases.

Safety & Guardrails:

Build and deploy rigorous guardrail systems to detect and mitigate malicious inputs, handle medical crisis queries, and prevent inappropriate or biased outputs.

Evaluation Frameworks:

Build and maintain a comprehensive evaluation service to measure LLM performance, grounding accuracy, and agentic reliability.

Data Engineering & Infrastructure

Data Acquisition:

Develop scalable web scrapers and data collection pipelines using Scrapy and BeautifulSoup.

Pipeline Orchestration:

Manage complex ETL/ELT workflows using Apache Airflow to process and ingest healthcare data.

Hybrid Data Storage:

Architect and optimize data ingestion into Weaviate (Vector DB) for semantic search, with a possible future use case for Neo4j (Knowledge Graph) for structured relationship mapping.

RAG Optimization:

Build and refine a full Retrieval-Augmented Generation (RAG) pipeline to ensure all LLM responses are grounded in verified healthcare data sources.

Minimum Qualifications

Advanced Python:

Expert-level proficiency in Python and its data ecosystem.

LLM Orchestration:

Proven experience with LangChain and LangGraph for building stateful, multi-agent systems.

Database Expertise:

Hands-on experience with Vector Databases (e.g., Weaviate), Graph Databases (e.g., Neo4j), and standard SQL.

Data Engineering:

Proficiency with Apache Airflow and web crawling frameworks (Scrapy/BeautifulSoup).

Experience & Knowledge

RAG & Grounding:

Deep understanding of embedding models, retrieval strategies, and grounding techniques to minimize hallucinations.

Agentic Patterns:

Practical experience implementing ReAct, Plan-and-Execute, or similar agentic reasoning patterns.

Safety & Ethics:

Experience implementing LLM safety layers and handling sensitive user queries (preferably in a regulated domain like healthcare).

API Development:

Strong experience building and consuming RESTful APIs and implementing tool-calling interfaces.

Preferred Qualifications:

Experience with healthcare data standards (e.g., HIPAA compliance, FHIR).

Experience building and scaling production-grade evaluation suites for LLMs (e.g., PromptEval, RAGAS, LangSmith).

What You'll Get in Return

Competitive rate.

Challenging and great work environment.
