While the work done by data centers is invisible to most of the general public, data centers are responsible for delivering a wide variety of digital services that are critical to today’s economy. They support the SaaS applications used to manage our food and other supply chains, store the medical records used by health care professionals, and deliver the data that allows us to continue working on documents, spreadsheets, and other files, whether we are at the office or at home.
The Covid-19 pandemic has made people even more dependent on these digital services over the past few months, while also making it more difficult for data center operators to ensure their customers’ data is always available. For example, a recent Uptime Institute survey found that Covid-19 has caused four percent of operators to experience an outage and 10 percent to experience a slowdown.
However, with robust data backup and recovery processes, these operators can avoid data loss and ensure their customers’ data is available even as they struggle with employees who are off sick for weeks, stay-at-home orders that limit how many of their employees can work on-site, a greater volume of cyberattacks, and similar pandemic problems. By preparing for these and other data backup and recovery challenges, whether they arise from a pandemic, hurricane, cyberattack, or other disaster, data center operators can keep providing their customers with the fast, reliable access to data that they have come to depend on.
One way data center operators can prepare for these types of challenges and continue to deliver high levels of data availability to their customers during disasters like the Covid-19 pandemic is to adopt ISO 9001 and similar quality management standards. With these standards, data center operators can implement quality management systems that ensure their data backup and recovery processes are documented, audited, and tested. These standards also provide a jumping-off point for customizing these processes to meet their data centers’ specific needs. By adopting these standards, and adapting them to their own infrastructure and operations, data center operators can implement a set of backup and recovery processes that are repeatable, defensible, and cost-effective, even in the midst of a disaster.
Identify potential pitfalls through documentation
ISO 9001 and similar quality management system standards require data center operators not just to establish disaster recovery and other quality objectives but also to document, step by step, the actual processes they will use to achieve these objectives. In doing so, data center operators can identify problems with these processes that might arise if they have to be completed during a pandemic or other disaster. For example, if everyone is working remotely during the Covid-19 crisis, or if a specific employee is sick, a disaster recovery process that requires a specific person to physically turn on a server in the data center cannot be completed. Another recent Uptime Institute survey found that a third of data center operators believe the biggest risk to their operations is a reduced level of staff. By adopting ISO or other quality management standards, these operators can put in place processes that allow them to maintain continuity even if some of their employees are unavailable or working offsite during a disaster.
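As a rough illustration of how a documented, step-by-step process can surface these pitfalls, the sketch below records each recovery step along with who is able to perform it and whether it needs someone on-site, then flags the steps that would stall. The `RecoveryStep` schema, the `find_pitfalls` helper, and the staff names are hypothetical and are not drawn from ISO 9001 or any particular tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RecoveryStep:
    """One documented step in a recovery procedure (hypothetical schema)."""
    name: str
    qualified_staff: List[str]     # people trained to perform this step
    requires_onsite: bool = False  # True if someone must be physically present


def find_pitfalls(steps, unavailable=(), site_closed=False):
    """Return (step name, reason) pairs for steps at risk of not being completed."""
    problems = []
    away = set(unavailable)
    for step in steps:
        available = [p for p in step.qualified_staff if p not in away]
        if not available:
            problems.append((step.name, "no qualified staff available"))
        elif len(step.qualified_staff) == 1:
            problems.append((step.name, "depends on a single person"))
        if step.requires_onsite and site_closed:
            problems.append((step.name, "requires on-site access"))
    return problems


# Illustrative procedure; names and steps are made up for this example.
procedure = [
    RecoveryStep("Power on standby server", ["alice"], requires_onsite=True),
    RecoveryStep("Restore latest backup", ["alice", "bob"]),
    RecoveryStep("Verify customer access", ["bob", "carol"]),
]

# Scenario from the text: one employee is sick and staff cannot enter the site.
for name, reason in find_pitfalls(procedure, unavailable=["alice"], site_closed=True):
    print(f"{name}: {reason}")
```

Running the same check with no one absent still flags the first step as depending on a single person, which is the kind of weakness documentation is meant to expose before a disaster does.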
A common roadmap
In addition, by requiring operators to document their backup and recovery processes, ISO and other quality management standards provide a common roadmap and language for all of a data center’s staff to work from when data loss occurs during or as a result of a disaster. This avoids confusion and ensures that all the necessary data recovery processes are completed in the right manner and order, even if team members are based in different locations around the world. If ISO or similar standards are in place, available team members can always refer to the process documents for guidance on how to quickly recover the data their customers need.
Audits and tests
Another benefit of these standards is that they mandate periodic audits and other testing of backup and recovery processes, documentation of the results of these tests, and resolution of any problems encountered during these tests. Just as initial documentation of processes provides an opportunity to identify problems with disaster recovery processes, so do periodic tests that ensure these processes actually work in practice. Moreover, if these audits or tests reveal problems, these problems are documented and can then be fixed before a real disaster arises. The last thing data center operators want to experience during a disaster is a nasty last-minute surprise, especially if the surprise could have been detected by a test months earlier.
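To make the idea of a documented test concrete, here is a minimal sketch of one possible restore check: it compares a restored file against its source by SHA-256 checksum and appends the outcome to a log so that failures leave a record to follow up on. The `run_restore_test` function, the log format, and the file names are assumptions made for illustration; a real test would exercise the operator's actual backup tooling rather than a file copy.

```python
import hashlib
import json
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def run_restore_test(source, restored, log_path):
    """Compare a restored file with its source and append the result to a log."""
    passed = sha256_of(source) == sha256_of(restored)
    record = {
        "tested_at": datetime.now(timezone.utc).isoformat(),
        "source": str(source),
        "passed": passed,
        "needs_follow_up": not passed,  # failed tests should be tracked to resolution
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return record


# Demonstration with temporary files standing in for real backup data.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    original = tmp / "records.db"
    original.write_bytes(b"customer data")
    restored = tmp / "records.restored.db"
    shutil.copy(original, restored)  # stand-in for a real restore operation
    result = run_restore_test(original, restored, tmp / "restore_tests.jsonl")
    print(result["passed"])
```

Appending one JSON record per test keeps a simple history that an auditor could review, although the right format for such records will depend on each operator's quality management system.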
Use creativity and advanced solutions
However, while data center operators should look to adopt ISO 9001 or similar standards to help them develop and implement robust disaster recovery processes, they need to remember that these standards are a general guide, and one size does not fit all. They will need to adapt these standards to their data center’s specific infrastructure and operations if they hope to maximize the benefits of adoption. This means thinking creatively and using the standards to account for all the likely problems that might arise with backup and recovery at their own data center during a disaster.
It also means using advanced backup and recovery solutions that allow them to more easily implement the processes recommended by the standards. For example, data center operators should look for solutions that allow them to easily document the results of the tests of their backup and recovery processes, automate these processes, and adjust these processes as their operations change over time.
ISO and similar standards provide valuable guidance on how data center operators can chart a path to developing and implementing robust backup and recovery processes that will allow them to avoid data loss during a disaster like the Covid-19 pandemic. Their adoption should be at or near the top of every data center operator’s to-do list, if they have not adopted them already.