```html
About the Company:
UptimeAI builds predictive analytics and AI-driven solutions that maximize operational uptime for industrial and enterprise clients. Our platform applies data science to deliver actionable insights that improve efficiency and reliability. UptimeAI uniquely combines Artificial Intelligence with Subject Matter Knowledge from 200+ years of cumulative experience to explain interrelations across upstream/downstream equipment, adapt to changes, identify problems, and provide prescriptive diagnoses as a human expert would.
About the Role:
We are a fast-growing, AI-first SaaS startup backed by top-tier investors and operating across India and the US. Our platform helps enterprises optimize critical business functions using cutting-edge AI and automation. As we scale, we’re looking for a hands-on DevOps Engineer who thrives in startup environments and can take ownership of cloud infrastructure, deployment, and CI/CD workflows.
Responsibilities:
Design, implement, and manage cloud infrastructure across Azure for both internal platforms and customer-specific deployments
Configure and maintain virtual networks (VNets/VPCs), VPNs, and network peering to enable secure, scalable, and isolated environments
Build and automate CI/CD pipelines for application and ML workloads
Manage multi-tenant and single-tenant deployments based on customer requirements
Implement monitoring, alerting, logging, and disaster recovery strategies
Work closely with engineering to ensure seamless Dev→Prod flows and secure release management
Set up and manage infrastructure as code (e.g., Terraform, Pulumi, Bicep, CloudFormation)
Optimize costs, performance, and availability for both internal and customer-facing cloud workloads
Enforce security best practices, access control, and compliance across infrastructure
Qualifications:
3-8 years of experience as a DevOps/SRE/Cloud Engineer in high-growth SaaS or product startups
AWS Certified (at least Solutions Architect - Associate) and Azure Certified (e.g., AZ-104 or higher)
Strong experience with cloud networking on Azure and AWS, including: VNets/VPCs, VPNs, Subnets, Route Tables, Network Security Groups/Security Groups, NAT Gateways
Site-to-site VPN setups for enterprise customers
Proven experience deploying applications to customer-controlled cloud environments (BYOC) and company-controlled SaaS environments
Expertise with tools such as GitHub Actions, GitLab CI, and Azure Pipelines (CI/CD); Terraform, Bicep, or Pulumi (IaC); and Docker and Kubernetes, EKS/AKS preferred (containers)
Familiarity with Secrets Management, IAM, Role-based Access Control, and SSO/SAML integration
Strong scripting skills in Bash, Python, or PowerShell
Comfortable working in a fast-paced, ambiguous startup environment
Required Skills:
Experience with AI/ML pipeline deployment or GPU workloads
Exposure to SOC2, ISO27001, or GDPR compliance in a cloud environment
Familiarity with tools like Prometheus, Grafana, Datadog, ELK, or Azure Monitor
Equal Opportunity Statement:
UptimeAI is committed to diversity and inclusivity in the workplace.
Why Join UptimeAI:
Impact Industry-Wide Change: Contribute to transformative solutions that significantly improve operational efficiency and reliability for global clients.
Collaborative and Growth-Oriented Environment: Join a talented, passionate team that values innovation, continuous learning, and professional growth.
Opportunities for Leadership and Innovation: Lead pioneering projects, influence product development, and shape the future of industrial AI solutions.
```