Role Overview
Jefferies is seeking a hands-on senior systems engineer to join the global platform engineering organization and build, automate, and operate cutting-edge platform solutions in a hybrid cloud environment.
Responsibilities include designing, delivering, and managing a scalable, stable, and highly available private cloud platform with a strong focus on automation, orchestration, and tooling; fulfilling user requests; and continuously enhancing the Jefferies Private Cloud Platform.
The role is technical and execution-focused, requires a willingness to step outside one's comfort zone when necessary, and involves close collaboration with global technical leads, infrastructure teams, and cross-functional teams to deliver end-to-end solutions that ensure reliability, performance, and security at scale.
Technical Environment
OS: Oracle Linux, Amazon Linux, RHEL; Windows Server
Linux: systemd, SELinux, LVM, multipathing, bonding/teaming, PAM/SSH, kernel tuning
Windows: AD-integrated servers, Group Policy, WSUS/SCCM patching, Failover Clustering, DFS, SMB/NFS, IIS
Compute/Virtualization: VMware vSphere, Nutanix; Cisco UCS, HPE; KVM
Storage/Protection: Pure Storage, NetApp ONTAP, Cohesity; SAN (Cisco/Brocade); snapshot/replication technologies
Identity/Network: Active Directory, DNS/DHCP/IPAM (Infoblox/Microsoft), ADFS/AD CS; SSSD/realmd/Kerberos for Linux; TCP/IP, VLANs (partner with Network Engineering)
Cloud/Automation: AWS, Azure; Terraform, Ansible, PowerShell, Python, Bash/KornShell, Packer, Git, CI/CD (Azure DevOps/Jenkins), artifact registries
Containers/Observability: Docker/Podman, EKS/AKS (preferred/desired); Zabbix, Splunk, AppDynamics, SCOM, OpenTelemetry
Imaging/Config: MDT, Orchestrator, SCCM/SCOM; kickstart/cloud-init for Linux; packaging/repo management (yum/dnf/apt; Spacewalk or equivalent)
DevOps & Governance: Platform-as-a-Product mindset; IaC/GitOps for infra and app workloads; automated testing (lint/unit/integration), security scanning (SAST/DAST/SCA), policy-as-code (OPA/Conftest); change management driven by CI/CD pipelines; artifact and SBOM governance
Core Responsibilities
Build hardened golden images and baselines
Linux: Kickstart/cloud-init, Packer; repo/patching with yum/dnf/apt; SSH/PAM and security hardening
Windows: GPO baselines, MDT/SCCM task sequences, Sysprep; WSUS/SCCM patching; Defender/ATP integration
Deliver desired state automation using Terraform, Ansible, PowerShell, and Python; manage code in Git/Bitbucket; integrate CI/CD (Jenkins/Bamboo/GitHub Actions) with automated tests, security scans, and artifact management; enable self-service via APIs/portals for app teams
Implement and test HA/DR; validate RTO/RPO; tune performance and capacity across VMware vSphere/Nutanix clusters; document work in Confluence, Jira, and ServiceNow, following change management (CAB)
Administer VMware vSphere, Nutanix, Cisco UCS/HPE servers; lifecycle, capacity, firmware/driver updates; vCenter/Prism operations
Provision and operate storage (Pure Storage, NetApp ONTAP) and data protection (Cohesity); SAN zoning/masking on Cisco/Brocade; snapshot/replication orchestration
Linux: multipath/udev, LVM, ext4/xfs, snapshot consistency
Windows: MPIO, NTFS/ReFS, Volume Shadow Copy Service (VSS)
Operate core services: Microsoft Active Directory, DNS/DHCP/IPAM (Infoblox/Microsoft), ADFS/AD CS; automate with PowerShell/Ansible
Linux: SSSD/realmd/Kerberos integration, sudo policy, cert enrollment
Windows: OU/GPO hygiene, AD CS templates, auto-enrollment
Container support (preferred/desired): collaborate with app teams on base image standards and registry governance; advise on scanners (vulnerability/SBOM); provide templates (e.g., Helm/Kustomize) and GitOps guidance (e.g., Argo CD/Flux) for EKS/AKS
Instrument platforms and critical applications; build shared dashboards/alerts/runbooks using Zabbix, SCOM, Splunk, AppDynamics, and OpenTelemetry; define SLOs/error budgets with app teams; enable trace/log/metric correlation
Linux: journald/syslog, auditd, node exporters
Windows: Event Logs, PerfMon counters, SCOM agents
Drive vulnerability remediation and compliance (CIS/NIST); coordinate patch cycles and reporting; use Splunk for evidence and audit support
Troubleshoot cross-domain incidents spanning infra and apps; lead RCA/postmortems; document fixes in Jira/Confluence; deliver changes through CAB and pipelines; drive “shift-left” reliability improvements with app teams
Participate in weekly on-call rotation providing off-hours and weekend support; ensure operational readiness with runbooks and automation
Champion DevOps across Infra-to-App: standardize IaC/GitOps, shared pipeline templates, environment provisioning (ephemeral/dev/test/prod), security/compliance guardrails, and continuous improvement using DORA metrics (lead time, deployment frequency, change failure rate, MTTR)
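For illustration only, the four DORA metrics named above could be computed from deployment records roughly as follows. This is a minimal sketch in Python, not a description of any Jefferies tooling: the `Deployment` record, its field names, and the `dora_metrics` function are hypothetical, and measuring time to restore from the moment of deployment is a simplifying assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def _hours(start: datetime, end: datetime) -> float:
    """Elapsed time between two timestamps, in hours."""
    return (end - start).total_seconds() / 3600


def dora_metrics(deployments: list[Deployment], window_days: int) -> dict:
    """Summarize the four DORA metrics over a reporting window."""
    if not deployments:
        raise ValueError("no deployments in the reporting window")

    lead_times = [_hours(d.committed_at, d.deployed_at) for d in deployments]
    failures = [d for d in deployments if d.failed]
    # Assumption: restore time is measured from the failing deployment.
    restore_times = [
        _hours(d.deployed_at, d.restored_at)
        for d in failures
        if d.restored_at is not None
    ]

    return {
        "lead_time_hours_median": median(lead_times),
        "deployments_per_week": len(deployments) / (window_days / 7),
        "change_failure_rate": len(failures) / len(deployments),
        "mttr_hours": (sum(restore_times) / len(restore_times)
                       if restore_times else None),
    }
```

Given a list of `Deployment` records for a two-week window, `dora_metrics(records, window_days=14)` returns a dictionary that could feed a dashboard or a periodic report. In practice the records would come from pipeline and incident data rather than being built by hand.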
Qualifications and Experience
10–15+ years in platform/systems engineering across Linux/Windows, storage, and virtualization
Expert hands-on with VMware/Nutanix; Cisco UCS/HPE; Pure/NetApp; Cohesity; SAN (Cisco/Brocade)
Strong automation: Terraform, Ansible, PowerShell, Python, Shell (Bash/KornShell); CI/CD (Azure DevOps/Jenkins/GitHub Actions), Git, Packer; artifact/registry/SBOM management; GitOps tools (Argo CD/Flux)
Deep identity and core services: AD, DNS/DHCP/IPAM (Infoblox/Microsoft), PKI/cert services; Linux integration (SSSD/LDAP/Kerberos)
Proven HA/DR design and testing; performance tuning, capacity planning, cost optimization across compute/storage platforms
Observability and SRE: SLOs, error budgets, postmortems; Zabbix, Splunk, AppDynamics, SCOM, OpenTelemetry
Networking fundamentals (TCP/IP, VLANs) and security/compliance (CIS, NIST, ISO 27001, SOX)
Containers (preferred/desired): Docker/Podman, EKS/AKS; image governance and supply chain security; GitOps tools (Argo CD/Flux)
Packaging and repository management for Linux; experience with Spacewalk or equivalent a plus
Education/Certs preferred: Bachelor’s in CS or a related field; RHCE/LFCS, MCSE/MCA, VMware/Nutanix, AWS/Azure
DevOps mindset: experience driving change across infra and app delivery; establishing IaC/GitOps standards, shared pipeline templates, automated testing/security scanning/policy-as-code; measurable improvement using DORA metrics (lead time, deployment frequency, change failure rate, MTTR)
Desirable Non-Technical Skills
Clear written and verbal communication in English with technical and non-technical audiences
Effective, timely decision-making aligned with standards and best practices
Strong problem-solving and customer service; performs well under pressure
Produces and maintains runbooks and design/deployment documentation
Initiative and drive; self-directed learning/upskilling
Works effectively in local and global teams, with and without direct supervision
Willing to participate in on-call rotation and weekend support