Verition Fund Management LLC (“Verition”) is a multi-strategy, multi-manager hedge fund founded in 2008. Verition focuses on global investment strategies including Global Credit, Global Convertible, Volatility & Capital Structure Arbitrage, Event-Driven Investing, Equity Long/Short & Capital Markets Trading, and Global Quantitative Trading.
We are seeking a Senior DevOps Engineer to lead the design and operation of mission-critical trading infrastructure in the cloud. This role will own production Kubernetes platforms, CI/CD pipelines, and core infrastructure systems that power real-time trading operations. The Senior DevOps Engineer will work closely with trading, quant, risk management, and development teams to ensure system reliability, security, and performance for applications processing live Bloomberg pricing data and portfolio analytics. This role reports to the Director of Cloud Platform Engineering.
Responsibilities:
Design, build, & maintain production Kubernetes (EKS) clusters running critical trading analytics platforms and live market data distribution systems
Design, build, & maintain CI/CD pipelines using Jenkins and ArgoCD that deploy trading applications across development and production environments
Design, build, & maintain enterprise repository infrastructure (Nexus) hosting Docker images, Python packages, and internal APIs serving 15+ development teams
Design, build, & maintain comprehensive monitoring and alerting systems using Datadog to provide observability across all trading and risk management applications
Own incident response for infrastructure outages affecting live trading operations, including certificate failures, load balancer issues, and data pipeline disruptions
Lead complex infrastructure migrations, including database transitions (RDS to Aurora), storage modernization (FSx/EFS), and operating system standardization
Ensure DevOps best practices (CI/CD, GitOps, infrastructure-as-code with Terraform, automated testing, documentation) are followed
Partner with business stakeholders including portfolio managers, quants, risk analysts, and development teams to deliver reliable, performant infrastructure
Own day-to-day operations including triaging production incidents, managing access requests for 50+ users monthly, handling certificate renewals, and coordinating maintenance windows
Implement security and compliance controls for source code management (Bitbucket), authentication (SSO/SAML), and access management across all systems
Optimize system performance to address degradation in reporting processes and resource constraints affecting data science workflows
Mentor junior engineers through code reviews, documentation creation, and knowledge sharing sessions
Refine Jira tickets with technical rationale, acceptance criteria, and risk assessment
Qualifications:
6+ years of experience in a DevOps, Site Reliability Engineering, or Infrastructure Engineering role
4+ years of experience within financial services (ideally investment management or trading operations)
4+ years of hands-on experience with Kubernetes in production environments
4+ years of experience with AWS services (EKS, EC2, RDS/Aurora, S3, IAM, VPC, Load Balancers)
4+ years of experience with CI/CD platforms (Jenkins, ArgoCD, or similar)
Strong experience with containerization technologies (Docker, container registries)
Proficiency with infrastructure-as-code tools (Terraform preferred)
Strong Linux system administration skills
Solid scripting skills in Python, Bash, or similar languages
Experience with monitoring and observability platforms (Datadog, CloudWatch, or similar)
Proven track record responding to production incidents in high-pressure environments
Excellent verbal and written communication skills
Nice to Have:
Experience with GitOps workflows and declarative infrastructure management
Knowledge of trading systems, market data feeds (Bloomberg), or portfolio analytics platforms
Experience with repository management platforms (Nexus, Artifactory)
Familiarity with Apache Airflow or other workflow orchestration tools
Experience with certificate management and PKI infrastructure
Background in database administration (MySQL, PostgreSQL, Aurora)
Experience with SSO/SAML authentication systems and identity management
AWS certifications (Solutions Architect, DevOps Engineer, or similar)
Experience with performance optimization and capacity planning for high-throughput systems
Track record building self-service platforms and internal tooling to improve developer productivity