About ALLSIDES
ALLSIDES is redefining how the world experiences 3D content. We combine physically accurate scanning and generative AI to power content creation workflows for e-commerce, virtual environments, and immersive experiences. Our clients include global brands like adidas, Meta, Amazon, and Zalando.
We operate a rapidly scaling photorealistic 3D scanning operation, capturing tens of thousands of assets annually while training next-generation AI models. As an NVIDIA Inception member, we collaborate with leading research institutions and actively participate in top-tier conferences in 3D computer vision and AI.
More info: https://www.allsides.tech | https://blogs.nvidia.com/blog/covision-adidas-rtx-ai/
Position Overview
We're looking for an Infrastructure & DevOps Engineer to build and maintain the foundation of our compute infrastructure. You'll work on hardware provisioning, networking, container orchestration, and deployment pipelines across cloud and on-premise environments. This role focuses on making our multi-GPU clusters reliable, our deployments reproducible, and our developers productive.
Main Responsibilities
Provision, configure, and maintain heterogeneous compute clusters (CPU/GPU) across multiple physical locations
Implement dynamic compute and storage provisioning based on workload demands
Design storage solutions at both hardware and software level (NAS, distributed filesystems, storage tiering)
Implement and manage container orchestration systems (Kubernetes, Docker) for development and production workloads
Design and maintain infrastructure as code using tools like Terraform and Ansible
Build and optimize job scheduling and resource allocation systems (Slurm, Kubernetes)
Set up monitoring, alerting, and observability infrastructure (Prometheus, Grafana, IPMI)
Profile and optimize system-level performance: GPU utilization, memory bandwidth, I/O throughput, network latency
Manage networking, VPNs, and secure access across distributed systems
Handle reliability concerns: hardware failure detection, job checkpointing, disaster recovery
Qualifications
Strong Linux system administration knowledge
Experience with containerization (Docker) and orchestration (Kubernetes)
Knowledge of infrastructure as code (Terraform, Ansible)
Experience with HPC clusters and job scheduling (Slurm)
Familiarity with monitoring solutions (Prometheus, Grafana)
Understanding of networking principles and implementation
Experience with hardware infrastructure management (IPMI, BMC, server maintenance)
Knowledge of storage systems design (NFS, Ceph, distributed filesystems)
Nice to Have
Experience with cloud services (AWS or others)
Familiarity with bare-metal provisioning (MaaS)
What We Offer
Compensation that reflects your experience, including stock options
Lunch vouchers for working days
Relocation assistance
Flexible working hours and work-from-home policy
Family-friendly environment
Amazing office space in South Tyrol, located at the Durst Group
Personal and professional growth opportunities
You don't have to tick every box to apply. Your drive and passion matter most!
This role is located on-site in Brixen/Bressanone, Italy. If you are interested, please send your application, with your CV attached, to careers@allsides.tech.