Role Description
We are looking for a hands-on, experienced DevOps Engineer with a proven track record of working successfully in a seed-stage startup or on a ground-up build. You will be responsible for building and maintaining our cloud infrastructure and establishing best practices for security, monitoring, and incident response. You will also be a key partner in building the serving infrastructure that supports our core ML/AI models.
This role requires a strong generalist who can thrive with minimal supervision and is comfortable wearing multiple hats.
Key Responsibilities
Infrastructure Management: Design, deploy, and maintain highly available and scalable infrastructure using Infrastructure as Code (IaC) principles (e.g., Terraform, CloudFormation) across multiple cloud environments.
CI/CD Implementation: Build, maintain, and optimize Spector-wide CI/CD pipelines.
Monitoring and Logging: Implement and manage monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, ELK stack) to ensure the health and performance of our services.
Security: Establish and enforce security best practices across the infrastructure and application stack, including secrets management, network segmentation, and vulnerability patching.
Cost Optimization: Proactively identify and implement strategies for optimizing cloud expenditure without compromising performance or reliability.
Collaboration: Work closely with the engineering team to streamline development workflows, troubleshoot production issues, and provide infrastructure support.
Must Have:
5+ years of professional experience as a DevOps Engineer, Site Reliability Engineer (SRE), or similar role.
Demonstrated experience working in a seed-stage startup or on a ground-up build.
Understanding of the resource constraints, rapid pivoting, and hands-on generalist nature of this environment.
Strong experience with Docker and Kubernetes (EKS/GKE/AKS).
Expert proficiency with Terraform or similar IaC tools.
Experience setting up and maintaining CI/CD pipelines (e.g., GitLab CI, GitHub Actions, Jenkins).
Experience with database administration (e.g., PostgreSQL, MongoDB, Redis).
Preferred:
Experience in MLOps and building production serving infrastructure for machine learning models.
Security certifications or extensive experience with cloud security frameworks.
Experience in on-prem deployment and observability.
Experience in building and deploying ML models for real-time time-series data such as sensor/IoT data.
About Spector.ai
Spector.ai is a well-funded, fast-paced, innovative seed-stage startup on a mission to solve the $1.5 trillion challenge of industrial asset reliability. Spector is building an AI-first industrial agent platform designed to transform plant reliability and performance from reactive to autonomous operations. By combining machine learning and domain-specific industrial AI agents, Spector.ai enables real-time diagnostics, root cause analysis, and actionable recommendations at scale. The platform extracts insights from complex industrial data, including unstructured documentation and live sensor streams, reducing false positives, shortening time to resolution, and scaling expertise without reliance on data scientists.
We are rapidly growing and looking for an experienced, self-driven DevOps Engineer to join our core team. This is a unique opportunity to shape our infrastructure, tooling, and deployment practices from the ground up, ensuring we can scale effectively and reliably as we move toward our next stage of funding and growth.