The British Antarctic Survey (BAS) is one of the world’s leading environmental research centers, with a long and distinguished history of carrying out research and surveys in the Antarctic and surrounding regions. As a component of the Natural Environment Research Council (NERC), the Cambridge-based center has undertaken the majority of Britain’s scientific research on and around the frozen continent for more than 60 years.
The organization’s current research strategy is called Polar Science for Planet Earth. As part of this mission, BAS combines advanced science programs with essential logistics to carry out complicated and sophisticated scientific field research. To that end, BAS employs more than 400 staff and supports three research stations in the Antarctic, located at Rothera, Halley and Signy, as well as two stations on South Georgia, at King Edward Point and Bird Island. The organization also uses ice-strengthened ships, such as the RRS James Clark Ross, along with five aircraft, to drive advancements in oceanographic and climate research.
Our rate of change is increasing dramatically
According to Jeremy Robst, IT support engineer and head of Unix systems at BAS, keeping pace with rapid change and surging data growth are top priorities. “We are collecting 10 times the amount of data we gathered just 10 years ago,” he explains. “Our rate of change is increasing dramatically, which puts pressure on our data collection and storage systems. This year, we’ll collect twice the amount of data as last year.”
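Taken at face value, the two figures Robst quotes describe an accelerating curve. A quick back-of-the-envelope comparison (our arithmetic on the quoted numbers, not BAS’s own analysis):

```python
# Comparing the two growth figures quoted above (our arithmetic).
# "10 times the data of 10 years ago" implies a compound annual
# growth rate of 10**(1/10) - 1, roughly 26% per year.
historical_cagr = 10 ** (1 / 10) - 1

# "Twice the amount of data as last year" is 100% year-on-year growth,
# so the growth rate itself has nearly quadrupled.
current_growth = 1.0

print(f"Implied historical growth: {historical_cagr:.1%}/year")  # ~25.9%/year
print(f"Current growth: {current_growth:.0%}/year")              # 100%/year
```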
Robst and a small yet dedicated IT team struggled to accommodate insatiable data storage demands generated by BAS’s various months-long expeditions and ongoing scientific research.
For example, during a recent 2.5-month Antarctic expedition, Robst was responsible for safeguarding the data captured from a multitude of sensors, floating buoys and GPS systems used to map the seafloor in precise detail and to collect water column sonar data along with vital geological and environmental information. “We measure as much of the Antarctic environment as possible,” says Robst. “Dozens of individual and highly specialized probes collect up to 10GBs of data an hour. All this information is then secured for subsequent analysis and scientific modeling.”
Everything from air and water temperature to water turbulence, wind conditions and the amount of solar radiation falling on the ocean is captured, along with the types and amounts of marine life moving through the water.
One six-week expedition produces at least 500GBs of data, which then is brought back to the BAS high-performance computing (HPC) center in Cambridge to be fed into a variety of models used by hundreds of scientists around the world. “Modeling is responsible for at least 60 percent of our data storage growth,” says Robst. “The rest is raw data collection from multiple sources, which also is rising sharply.”
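The per-hour rate and the per-expedition total quoted above can be reconciled with a simple estimate (the duty-cycle interpretation below is ours, not BAS’s):

```python
# Rough sanity check of the data volumes quoted in the article:
# up to 10 GB/hour from the probe suite, and at least 500 GB
# per six-week expedition.
PEAK_RATE_GB_PER_HOUR = 10
EXPEDITION_WEEKS = 6
HOURS = EXPEDITION_WEEKS * 7 * 24  # 1008 hours

# Continuous collection at the peak rate would yield about 10 TB;
# the quoted 500 GB floor therefore implies an average collection
# rate of roughly 5% of peak.
peak_total_gb = PEAK_RATE_GB_PER_HOUR * HOURS  # 10080 GB
implied_duty_cycle = 500 / peak_total_gb       # ~0.05

print(f"Peak-rate total: {peak_total_gb} GB (~{peak_total_gb/1000:.1f} TB)")
print(f"Implied average duty cycle for 500 GB: {implied_duty_cycle:.1%}")
```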
Time for SuperDARN
In addition to meeting expeditions’ data needs, BAS became part of a major global initiative called the Super Dual Auroral Radar Network (SuperDARN). As one of three hubs worldwide collecting and processing a variety of atmospheric radar measurements, BAS faced an immediate requirement for major storage expansion. The challenge was finding the solution that best fit the organization’s requirements for high-capacity, high-performance storage within its budget parameters.
BAS developed a set of selection criteria to assist in identifying the fastest storage at the best possible price. Redundancy and reliability also were critical factors, as BAS’s data center operates 24/7 to accommodate scientific modeling demands. “It can take two to three months to process code,” says Robst. “So we needed to make sure that high-performance storage would be available to support long-term projects, and that everything, including power, controllers and connections, was redundant.”
To ensure better performance, reliability and redundancy, the BAS team also was planning to move off VxFS in favor of the Lustre parallel file system, and so was looking for a solution provider with proven Lustre experience. “Switching to Lustre would give us something more suitable for our environment, especially given our growing community of Lustre users, including researchers at Cambridge University, which is right next door to us,” Robst says. “We also felt we could achieve the optimal price-performance benefits in a Lustre environment.”
BAS also wanted a solution that would support a growing VMware environment as the organization expected to grow well beyond its current installation of 100 virtual machines. In reviewing a mixture of vendor responses, BAS discovered that some solutions were too expensive, too large physically, too heavy or required too much power.
Storing big data sets
Previously, the organization had used storage from DataDirect Networks (DDN), so the team was familiar with the company’s price-performance benefits. BAS was interested in DDN’s SFA7700X platform, which features “pay-as-you-grow” scalability. “Fitting within our budget was where DDN really started to stand out,” says Robst. “Then we saw the amount of storage, in a high-density appliance at the proper size. That made my mind up.”
To expand storage capacity, BAS selected DDN’s SFA7700X hybrid flash storage appliance, which is purpose-built to tackle big data requirements. Moreover, the SFA7700X enabled BAS to combine the power of Flash technology with the economy of hard drives to lower overall total cost ownership. “Part of the attraction with DDN’s SFA7700X was the ability to mix and match storage types in one enclosure,” Robst adds. “This was the first time we were able to deploy SSD drives, which enabled us to blend some of the fastest online storage with near-line storage.”
Reaping the benefits
With DDN SFA7700X storage in place, BAS now is ideally positioned to extract maximum value from its ever-increasing big data requirements. For instance, scientific researchers fully utilize the mix of flash and rotating media to speed performance for a variety of applications and climate models, such as a weather forecasting model developed by the UK’s Hadley Centre for Climate Prediction and Research. Additionally, researchers can take full advantage of the Weather Research and Forecasting (WRF) model and the MIT General Circulation Model (MITgcm), which is designed for the study of the atmosphere, ocean and climate. To aid in genomics research, scientists process models using the Basic Local Alignment Search Tool (BLAST).
“We use a lot of common models to ensure that BAS research can be shared globally,” says Robst. “DDN delivers scalable capacity for us to ensure data is always available and accessible. With the SuperDARN project, we have a similar set-up to how the European Organization for Nuclear Research, known as CERN, operates. For example, we support a system that collects radar data from around the world. As the European data collection hub, we now ensure that hundreds of scientists can share information collected here as well as at other hubs in North America and Asia.”
Adding storage without needing more power
Thanks to DDN’s high-density configuration, BAS has been able to replace a 30U rack containing 50TBs of storage with a 16U rack holding 650TBs of usable storage, a more than tenfold boost in capacity. “DDN’s density is our No. 1 benefit, as we now can grow capacity without increasing our storage footprint,” adds Robst. “The ability to increase storage without needing more power or space will make a big difference in the future.” BAS can continue to upgrade its existing storage infrastructure, adding more SFA7700X systems to achieve a three-to-four-times increase in capacity, up to 1.5PBs, within the original footprint.
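Because the new rack is both smaller and far larger in capacity, the density gain is even bigger than the raw capacity gain. A quick calculation on the figures quoted above (our arithmetic, not DDN's):

```python
# Storage density before and after the DDN deployment,
# using the usable-capacity figures quoted in the article.
old_u, old_tb = 30, 50     # 30U rack, 50 TB usable
new_u, new_tb = 16, 650    # 16U rack, 650 TB usable

capacity_gain = new_tb / old_tb                      # 13x usable capacity
density_gain = (new_tb / new_u) / (old_tb / old_u)   # ~24x TB per rack unit

print(f"Capacity gain: {capacity_gain:.0f}x")        # Capacity gain: 13x
print(f"Density (TB/U) gain: {density_gain:.1f}x")   # Density (TB/U) gain: 24.4x
```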
Additionally, DDN’s scalability will enable BAS to better accommodate ever-increasing research requirements. “With the latest DDN expansion, we have effectively doubled the capacity we had last year,” adds Robst. “That should last us another two years of aggressive growth before we need to think about another expansion.”
BAS now is well prepared for its next Antarctic expedition in September 2015. “DDN storage helps us manage our research needs with reliable, scalable Lustre and VMware support,” concludes Robst.