The working life of today’s IT professional is consumed by the management of a seemingly never-ending stream of data. As I am sure you are aware (and I hate to be the bearer of bad news if not), not only must this flood of information be monitored, but its interpretation also falls within the IT professional’s job remit.

Within the decision-making process, there are some decisions that aren’t really decisions at all; in fact, some require so little cognitive input that they are often performed on a Friday afternoon. Unfortunately, the data streams that are left for Monday morning do involve a little thinking and, at times, an added sprinkle of extra data to make sense of it all.

Finding the signal through the noise

What you are then left with is what we in the industry like to call a ‘shed-load of data.’ The real fun begins when you must determine which of this information has any practical future use or value, and then separate it from the rest.

DataLine NORD-4 monitoring center

While this is common practice in many a data center process, infrastructure monitoring is where it really starts to get interesting.

Within this aspect of your work, isolating the critical data from the rest is crucial for ensuring system uptime and IT performance, both of which have a huge impact on overall business success.

In my experience, there are four key fundamentals that it’s helpful to bear in mind when monitoring:

  • Context. It may seem obvious, but it is imperative to have a comprehensive understanding of what is being monitored from the outset. If you are receiving alerts from a piece of critical infrastructure that continues to be fully functional, it’s probably best to ignore the lazy voices inside your head and take a deeper look. The alerts may well be indicative of a wider developing issue which, if left alone, will likely have a far more severe impact further down the line.
  • Severity. An understanding of processes within the data center must also include insight into the criticality ranking of its applications and components as, unfortunately, not all things in this world are created equal. Consequently, when everything goes south and things stop working, the ability to identify and prioritize what is of greater importance is highly valuable.
  • Simplicity. Time is a luxury that the majority of data center professionals do not have the privilege of commanding. For many, even taking a legally contracted work break isn’t likely, so dedicating time to implementing all-singing, all-dancing monitoring solutions is plainly unrealistic. It’s therefore important to ensure that monitoring and alerting solutions are simple and easy to maintain. This will help you identify issues quickly and avoid system downtime.
  • Correlation. No disrespect to the other three, but if you’re going to remember any of these, I recommend it be this one (which is why I put it last). Root cause identification is the real end goal, as it will save you the most time in the long term. You should be using monitoring and alerting to correlate events and occurrences, which will in turn help locate the original issues (see the sketch after this list).
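To make the severity and correlation ideas a little more concrete, here is a minimal sketch in Python. It is an illustration under assumed conventions, not the workings of any particular monitoring product: the SEVERITY ranking, the Alert fields and the 60-second correlation window are all hypothetical. The sketch orders alerts by criticality and groups bursts of alerts that fire close together in time, treating the earliest alert in each burst as a root-cause candidate.

```python
from dataclasses import dataclass

# Hypothetical severity ranking: a lower number means more critical.
SEVERITY = {"critical": 0, "major": 1, "minor": 2, "info": 3}

@dataclass
class Alert:
    timestamp: float  # seconds since epoch
    source: str       # component that raised the alert, e.g. "core-switch-1"
    severity: str     # one of SEVERITY's keys
    message: str

def prioritize(alerts):
    """Surface the most critical (and, within a tier, the oldest) alerts first."""
    return sorted(alerts, key=lambda a: (SEVERITY[a.severity], a.timestamp))

def correlate(alerts, window=60.0):
    """Group alerts that fire within `window` seconds of the first alert in a
    burst; the earliest alert in each group is a root-cause candidate."""
    groups = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        if groups and alert.timestamp - groups[-1][0].timestamp <= window:
            groups[-1].append(alert)
        else:
            groups.append([alert])
    return groups

if __name__ == "__main__":
    # Hypothetical burst: a switch failure cascades into downstream alerts.
    alerts = [
        Alert(100.0, "core-switch-1", "critical", "link down"),
        Alert(112.0, "app-server-3", "major", "database unreachable"),
        Alert(115.0, "web-frontend", "minor", "elevated response times"),
        Alert(900.0, "backup-node", "info", "nightly job finished"),
    ]
    for group in correlate(alerts):
        root = prioritize(group)[0]
        print(f"root-cause candidate: {root.source}: {root.message}")
```

Real-world tools correlate on far richer signals than a simple time window (topology, dependency maps, shared resources), but the principle is the same: group related events, then chase the most critical one at the head of the chain.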

Context, severity, simplicity and correlation. I guess when you say them all in a row it sounds like a call-to-action from a self-help book which, now that I think about it, this may well be. Nonetheless, using these four as foundations for your monitoring and alerting methods will go a long way towards achieving the goals of your data center and the wider organization.

Chris Paap is a technical product manager at SolarWinds