Monitoring is a key part of any IT infrastructure. It is required to keep an eye on “what’s going on,” ensure good quality of service for end-users, and provide peace of mind for everyone involved, from the IT support team up to senior leadership.
Its primary task is to collect and analyze data from the different systems in the ecosystem (messaging, web, MTD, email, etc.), detect any possible issue or service deterioration, and notify the responsible person so they can take action if needed. Ideally this happens proactively, before the issue reaches end-users and impacts them. The last thing any IT service provider wants is a phone call from a VIP user or, even worse, from the CEO, wondering what is going on and why the team was not already aware of it and working on a solution to mitigate the impact ASAP.
Another crucial aspect of monitoring is detecting recurrent issues that might be related to a bigger problem (e.g., a hardware malfunction), as well as helping anticipate the need to update or upgrade the infrastructure to keep up with the growing demand driven by end-users and the mobile devices they bring to the workplace. For the latter, inventory is crucial to ensure a minimum level of security and detect any device not previously authorized, so that it can be blocked or the user can be asked to enroll it with the company UEM solution to bring it back into compliance.
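As a rough illustration of the inventory check described above, here is a minimal Python sketch. It assumes the inventory is a simple dictionary keyed by device ID; the Device class, the find_unauthorized function and the action labels are hypothetical names chosen for this example and do not come from any particular UEM product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    device_id: str
    owner: str
    enrolled: bool  # True if enrolled in the (hypothetical) company UEM


def find_unauthorized(seen_ids, inventory):
    """Return (device_id, action) pairs for devices that need attention.

    seen_ids  -- device IDs observed on the network
    inventory -- dict mapping device ID to Device (assumed data model)
    """
    flagged = []
    for device_id in seen_ids:
        device = inventory.get(device_id)
        if device is None:
            # Never authorized: block it or ask the user to enroll it.
            flagged.append((device_id, "block_or_enroll"))
        elif not device.enrolled:
            # Known device that fell out of compliance: ask for enrollment.
            flagged.append((device_id, "enroll"))
    return flagged
```

Devices that are both known and enrolled are left alone, so the result only lists what someone actually has to act on.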
However, in the end monitoring is only as good as the response given to an alert, which is why we need to ensure people take the right actions once notified, and this is where the fun starts. In a small environment, this might be easy for a jack-of-all-trades IT administrator working 9 to 5, who would know what to do because they most likely built the environment themselves. But in most environments, there are one or more larger teams, each responsible only for specific tasks and parts of the infrastructure. We need to ensure we trigger the right team for the right issue and have the right process in place so they know what to do in most situations (e.g., restart a specific service).
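To make the idea of triggering the right team with the right process a little more concrete, here is a minimal Python sketch. The team names, runbook texts and the route_alert function are all made up for illustration; a real environment would keep this mapping in its monitoring or ticketing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    system: str   # which part of the infrastructure raised the alert
    message: str


# Hypothetical routing table: system -> (responsible team, first runbook step).
ROUTES = {
    "email": ("messaging-team", "Restart the mail transport service."),
    "web": ("web-team", "Check the load balancer, then restart the web service."),
    "mtd": ("mobile-security-team", "Review the flagged devices in the MTD console."),
}

# Fallback so that no alert is ever left without an owner.
DEFAULT_ROUTE = ("it-helpdesk", "Triage manually and reassign to the owning team.")


def route_alert(alert):
    """Pick the team and the runbook step for an alert."""
    team, runbook = ROUTES.get(alert.system, DEFAULT_ROUTE)
    return {"team": team, "runbook": runbook, "message": alert.message}
```

The default route is a deliberate choice in this sketch: an alert from a system nobody has mapped yet still lands with someone, instead of being silently dropped.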
All of this applies when our team is made up of internal employees working on-premises. But with outsourced services or the recent growing demand for work-from-home scenarios, we also need to ensure they can actually connect to the infrastructure from wherever they are (a service center around the globe, a home office a couple of miles away, etc.), seamlessly and securely.
We also need to make sure that they are reachable when they are needed. This might seem obvious, but anyone who has worked on duty might remember a situation where they needed to call in to escalate to another on-duty engineer… and that person was not reachable! This could be due to a phone turned off, a dead battery, no signal, landline-to-mobile forwarding not working, or one of our favorites: simply sleeping 😊.
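One common way to cope with an unreachable engineer is an escalation chain: if the first contact does not answer, try the next one. Here is a minimal Python sketch of that idea. The escalate function and the try_contact callback are hypothetical; in practice the callback would place a call or send a page and wait for an acknowledgement.

```python
def escalate(chain, try_contact):
    """Walk an escalation chain until someone acknowledges.

    chain       -- ordered list of engineer names, first responder first
    try_contact -- callable taking a name and returning True if that
                   person acknowledged (assumed interface)

    Returns (acknowledged_by, tried), where acknowledged_by is None
    if nobody in the chain could be reached.
    """
    tried = []
    for engineer in chain:
        tried.append(engineer)
        if try_contact(engineer):
            return engineer, tried
    return None, tried


# Usage: only carol picks up, so alice and bob are tried first.
reachable = {"carol"}
owner, attempts = escalate(["alice", "bob", "carol"], lambda name: name in reachable)
```

Returning the list of people tried keeps a trace of who did not answer, which is useful afterwards to find out whether it was a dead battery or a forwarding rule that failed.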
For instance, back in March 2020 a prominent cloud provider encountered an issue that affected as many as 6,000 companies in Europe for up to 9 hours. If that can happen to a well-established IT company, why wouldn’t it happen to others?
Monitoring is integral to the quality of service for end-users, not to mention the quality of IT infrastructures themselves. If a company's monitoring solution can ensure detection and alerting of existing as well as potential issues and can be backed up by a diligent and responsive service team, then that solution will be successful.
How are you monitoring your mission-critical infrastructure? And are you monitoring ALL of your mobile infrastructure?
(C) Rémi Frédéric Keusseyan, Mobility Expert/Master Trainer