ABOUT HAIBOT
Haibot is an AI-powered bookkeeping automation service built for UK accounting practices. We handle the execution of bookkeeping workflows — bank reconciliations, payroll submissions, invoice publishing, VAT returns, and document management — on behalf of our clients, who are accounting practices serving small and medium businesses across the UK and Ireland.
We currently serve approximately 90 practices and operate on a Service-with-Software model: we do not just sell software, we run the work. Our automation stack is the core of our service delivery. We are now rebuilding that stack from the ground up, replacing our UiPath-based platform with a fully custom, self-hosted automation infrastructure that gives us complete control, lower operating costs, and the architectural foundation to scale.
This role is at the centre of that build.
THE ROLE
You will own the design and delivery of Haibot's new automation control plane — the system that schedules, executes, monitors, and recovers our bots across 90+ client environments. This is a greenfield build: you will make the foundational architecture decisions, set the engineering standards, and lay the technical groundwork that the rest of the team builds on.
You will work directly with the founder and a small team of automation engineers. There is no bureaucracy, no lengthy approval chains, and no legacy codebase holding you back. There is also no hand-holding — we need someone who can think clearly about distributed systems, make defensible trade-offs, and build things that work reliably in production.
WHAT YOU WILL BUILD
Control Plane (replaces UiPath Orchestrator)
Temporal-based workflow orchestration layer with per-client Namespace isolation
Scheduling architecture for time-based, event-based, and manually triggered bot runs
Queue management with priority, retry logic, and dead-letter handling
Multi-tenant isolation ensuring one client's bots cannot affect another's
Execution logging pipeline: structlog → Fluent Bit → Loki, queryable per client and run
Monitoring and alerting via Grafana and Prometheus with Slack integration
Secrets & Credential Management
Self-hosted HashiCorp Vault with per-client secret paths
AppRole authentication for bot workers — no static credentials in environment variables
Audit logging of all secret access across client environments
Bot Framework (co-owned with Automation Engineers)
Shared Python library for bot authoring: retry logic (Tenacity), structured logging (structlog), error taxonomy
Temporal activity and workflow patterns, reusable across all bot types
Bot skeleton template and folder structure standards
Operator API (backend for the dashboard)
FastAPI backend exposing endpoints for bot triggering, status queries, and run history
JWT-based authentication with role-based access control (Admin / Operator / Read-Only)
Server-Sent Events for real-time bot status streaming to the operator dashboard
WHAT WE ARE LOOKING FOR
Essential — you must have these
Production experience with async Python (asyncio, Pydantic, modern packaging)
Hands-on experience with a workflow orchestration system — Temporal strongly preferred; Prefect, Celery+Redis, or Airflow accepted with clear ability to learn Temporal
Strong understanding of distributed systems concepts: durability, idempotency, retry semantics, queue design
PostgreSQL — schema design, indexing, and operational confidence
FastAPI or equivalent async Python web framework
Comfortable making and defending architecture decisions without a committee
Experience building multi-tenant systems with hard isolation requirements
Strongly preferred — will differentiate your application
Direct Temporal experience in production (workflows, activities, Namespaces, retry policies)
HashiCorp Vault — secrets engines, AppRole auth, policy management
Self-hosted infrastructure: Docker Compose or Kubernetes, service networking, backup and restore
Grafana stack: Prometheus metrics, Loki log aggregation, Fluent Bit log shipping
Experience in a regulated or compliance-sensitive domain (fintech, legal, healthcare)
RPA or automation engineering background — understanding of what reliable bot execution actually requires
Nice to have — not required
Playwright or browser automation experience
Windows automation (pywinauto, desktop application interaction)
Xero or accounting software API integration
Next.js / React — enough to collaborate effectively with a frontend engineer
WHAT THIS ROLE IS NOT
We want to be direct about fit. This role is probably not right for you if:
You want to work in a large engineering team with established processes and senior oversight at every decision
You are primarily a data engineer — this is application and systems engineering, not pipelines or analytics
You prefer synchronous Python and have limited async experience
You are not comfortable owning infrastructure decisions in addition to application code
You want a role where the tech stack is already decided and your job is to execute within it
WHAT YOU CAN EXPECT FROM US
Direct access to the founder — no product managers between you and the person setting direction
Genuine ownership of the architecture — your decisions will shape the platform for years
A team that understands automation deeply and will challenge you on the right things
Competitive compensation with scope to grow as the platform and team scale
Hybrid working from Birmingham — we value in-person collaboration but are not presenteeist
The chance to build something from scratch that actually runs in production, at scale, with real consequences
Job Type: Full-time
Pay: £30,000.00-£40,000.00 per year
Benefits:
Casual dress
Free parking
Profit sharing
Work from home
Work Location: Hybrid remote in Birmingham