Division: Group Technology Services (GTS)
Role Summary
We are looking for a Network Monitoring Engineer to strengthen the Network Automation & Monitoring squad within our banking environment. You will design, implement, and evolve proactive monitoring capabilities across complex, mission-critical networks (switches, routers, firewalls, load balancers, and more) while ensuring compliance with financial regulations and internal operational processes. You'll combine deep networking expertise with automation-first thinking to reduce MTTR, increase observability, and improve service reliability.
Key Responsibilities
Monitoring Architecture & Implementation: Design and deploy end-to-end monitoring solutions (fault, performance, capacity, and topology) across multi-vendor network technologies.
Tooling & Integrations: Configure, maintain, and extend monitoring platforms; integrate with incident, logging, and CMDB systems (e.g., ServiceNow/Tivoli/Splunk) to enable alerting, ticketing, runbooks, and auto-diagnostics.
Automation & Reliability: Build and enhance automation for data collection, health checks, diagnostics, and remediation (scripts, jobs, pipelines) to drive proactive operations.
Observability & Reporting: Define KPIs/SLAs, create dashboards, and produce actionable reports for operations, risk, and audit stakeholders; champion data quality and coverage.
Operational Excellence: Follow established processes (change, release, incident, problem, configuration management); contribute to standards, SOPs, and continuous improvements.
Compliance & Risk: Ensure solutions and workflows meet banking regulatory requirements and internal controls; support audits and evidence gathering.
Collaboration: Work closely with Network Engineering, Security, Service Management, and application teams to align monitoring with business priorities and project roadmaps.
Knowledge Sharing: Document architectures, procedures, and runbooks; mentor teammates and promote best practices within the squad.
Required Qualifications
Experience: Minimum 10 years of hands-on networking experience across switching, routing, firewalls, load balancers, and related technologies in large, complex environments.
Monitoring Expertise: Proven track record designing and operating enterprise-grade network monitoring solutions (fault/performance/capacity/topology).
Cross-Technology Knowledge: Strong transversal understanding of heterogeneous network stacks and vendor ecosystems.
Scripting & Automation: Practical experience implementing monitoring via automation (e.g., scripts, scheduled jobs, APIs) and/or tooling integrations.
Ways of Working:
- Ability to work autonomously in an enterprise context.
- Strong analytical/problem-solving skills.
- Ability to learn new technologies quickly.
- Effectiveness in a fast-paced team environment.
- Clear, concise communication (verbal and written), plus organizational skills and high attention to quality while following processes and operational procedures.
- Team-oriented mindset and ability to collaborate across squads and departments.
Nice-to-Have Skills
Tools: Knowledge of SevOne and NetBrain (usage, administration, or integration).
Development & CI/CD: Experience with Python (structured, testable code), pipelines (e.g., Azure DevOps/GitLab CI), and packaging; exposure to API design/consumption.
Ecosystem: Familiarity with integrations into ServiceNow, Tivoli, Splunk, CMDBs, and topology/diagnostic platforms.
Data & Storage: Basic understanding of time-series/metrics stores and dashboarding concepts (e.g., KPIs, SLOs).
Security & Compliance: Awareness of controls relevant to financial services (e.g., change management, evidence, traceability).
What Success Looks Like
Comprehensive monitoring coverage with reliable alerting, low false positives, and clear diagnostic workflows.
Reduced MTTR through actionable dashboards and automation-driven triage.
High audit-readiness with documented configurations, evidence, and repeatable procedures.
Strong collaboration with engineering and operations, leading to measurable resilience and performance improvements.