Platform Monitoring Engineer
PARTNER COMPANY
OneWeb Technologies
ENGAGEMENT TYPE
In-Person
LOCATION
Tyson's Corner, VA
Opportunity Description
Position
The Platform Monitoring Engineer will spearhead the development and support of our Zabbix monitoring platform, ensuring its seamless operation and efficacy. With a focus on maintaining the current Zabbix installation, managing the user interface and dashboard display, and applying regular updates and patches, the engineer collaborates with service owners to logically group data points and proactively identifies and addresses gaps in the monitoring solution. Proficient in Linux environments, the engineer conducts daily system checks, responds to events, and manages server patching while automating reports for IT management. With a keen eye for detail and a passion for system reliability, the engineer thrives in a collaborative environment, continuously enhancing the effectiveness of the monitoring systems.
Skill Sets: Should have good communication skills and the knowledge to set up, initialize, and configure monitoring tools (Zabbix, Nagios, SolarWinds, etc.)
Skills required
Make changes and updates to the network monitoring tool to reflect the current state of the enterprise IT infrastructure
Implement monitoring solutions utilizing the SNMP protocol
Work with networking and routing protocols such as TCP/IP, EIGRP, BGP, OSPF, and MPLS; experience with network management protocols and data sources such as SNMPv3, syslog, and log files
Develop scripts to automate tasks using APIs
Design and implement data and systems integrations with network monitoring products such as Zabbix
Configure monitoring for AWS servers to ensure application availability
Utilize network performance tools to detect network performance problems.
Assign thresholds to network KPIs and implement alerts
Familiarity with RHEL-like Linux distributions
Understanding of basic IP routing concepts and DNS
Good to have: knowledge of the ITIL framework, Agile ways of working, and experience with Jira and ServiceNow
Experience setting up, initializing, and configuring monitoring tools (Zabbix, Nagios, etc.)
Well-practiced troubleshooting methodology
Primary Responsibilities
Develop and support the Zabbix monitoring platform
Maintain the current Zabbix installation
Maintain the Zabbix user interface and dashboard display
Perform product updates, upgrades and apply patches to the application and the underlying platform
Work with different service owners to logically group individual data points into actionable categories
Proactively identify gaps within the monitoring solution and remediate those gaps
Work with service owners to filter irrelevant data and alerts
Manage and work in a Linux operating system and services environment.
Perform daily system checks, review and respond to events reflected in various management tools, and perform server patch management.
Conduct system audit reviews and perform maintenance functions as required to ensure system health.
Develop and automate various reports required by IT Management that depict availability of various systems
Maintain runbooks for the monitoring systems
Familiarity with how IT environments are set up across on-premises and cloud data centers
Familiarity with Site Reliability Engineering and AIOps concepts
Develop automation scripts to enhance and implement automated health checks
Candidates with these skills will be given preferential consideration
Should have strong networking knowledge of networking devices such as routers, switches, and firewalls (Cisco and Juniper)
Should have strong system administration experience (Windows and Linux).
Should have strong requirements planning and project implementation skills and be able to work with internal technical staff and outside contractors to deliver results.
A broad understanding of systems, networking, and network monitoring administration
Proven ability to successfully plan, develop and execute technology solutions.
Has served as the point person for critical issues affecting the server and storage infrastructure of the network monitoring platform.
Experience with or understanding of the SNMP protocol, including the ability to perform MIB analysis and configure SNMP trap parsing and SNMP table polling.
Experience with monitoring instrumentation to detect network performance problems.
Zabbix Administration.
Qualifications
Experience is preferred over education/degree, but one of the following will suffice:
A bachelor's degree in computer science, engineering, or a related field and 4 to 5 years of related experience.
Some education and 4 to 5 years of related experience.
Benefits (If hired after internship)
Medical Insurance
Dental Insurance
Vision Insurance
Flexible Spending Account
Short- and Long-Term Disability
Life Insurance
PTO
10 Paid Holidays
401k with Matching