The storage industry is increasingly aware that an alternative methodology is required to efficiently and securely store the exabytes of unstructured information we generate around the world every day. Object storage is not a new paradigm: most people interact with it daily when they retain and share information on Dropbox, store or retrieve photos on Facebook, or listen to songs on Spotify. But this explosion in unstructured data has catapulted object storage to prominence as an inexpensive, scalable, self-healing, multitenant platform for storing petabytes of data.

What is web object storage?
Object storage is essentially just a different way of storing, organizing and accessing data on disk, one that is much more scalable and much more cost-effective.

Like files, objects contain data, but unlike files, objects are not organized in a hierarchy. Every object exists at the same level in a flat address space called a storage pool, and one object cannot be placed inside another. Objects are immutable, making this storage methodology well suited to long-term retention of data archives, analytics data, and service provider storage with SLAs associated with data delivery.

Both files and objects have metadata associated with the data they contain, but objects are characterised by their extended metadata. Each object is assigned a unique identifier that allows a server, or end user, to retrieve the object without needing to know the physical location of the data. This approach is useful for automating and streamlining data storage in cloud computing environments.
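To make the flat address space and identifier-based retrieval concrete, here is a minimal in-memory sketch in Python. The class name `FlatObjectStore`, its methods, and the metadata fields are hypothetical and chosen only for illustration; they do not represent WOS or any real platform's interface.

```python
import uuid


class FlatObjectStore:
    """Toy in-memory object store: one flat namespace, no directories."""

    def __init__(self):
        # Object ID -> (immutable data, metadata). No nesting is possible.
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        """Store data with arbitrary metadata and return a new unique ID."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = (bytes(data), dict(metadata))
        return object_id

    def get(self, object_id: str) -> bytes:
        """Retrieve data by ID alone; the caller never sees a location."""
        return self._objects[object_id][0]

    def metadata(self, object_id: str) -> dict:
        """Return a copy of the object's extended metadata."""
        return dict(self._objects[object_id][1])


store = FlatObjectStore()
# Hypothetical metadata fields, attached at write time.
oid = store.put(b"ACGT" * 4, project="genomics", retention_years=10)
print(oid, store.metadata(oid))
```

The caller holds only the identifier; where the bytes physically live is the store's concern, which is the property the paragraph above describes.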

As stated, object storage is not a new concept. We were an early innovator with our Web Object Scaler (WOS) in 2009 and, more recently, with the launch of WOS 360, the latest version of our object storage offering. But much has advanced since then, and the current generation of object storage platforms cannot be compared to the earlier generations, which were merely black boxes designed to store immutable copies of documents, mostly for compliance environments.

The current generation of object storage platforms is designed with openness and flexibility in mind. To solve the unstructured data challenge, we need flexible object storage platforms that support the entire data life cycle and integrate with a wider range of applications: genomics research, worldwide collaboration, file sync and share, social media applications, content distribution and many more.

The promise of web object storage
The promise of object storage is simple: to enable customers to build a highly reliable, infinitely scalable and highly efficient storage pool for all their unstructured data needs. For this to be true, object storage platforms need to meet all of the requirements outlined below and provide a set of tunable parameters. Yet few platforms available today live up to that promise.

In essence, customers need to weigh five key requirements when defining the architecture of their scale-out storage infrastructures: scalability, accessibility, efficiency, reliability and performance.

- Scalability: Object storage was purposefully designed for very large volumes of unstructured data, with unlimited scalability as the ultimate objective.

Inherently, storage platforms can be scaled in three dimensions: the total volume of storage, the number of objects and the number of sites. The number of objects tends to be a particular challenge when objects are very small or the ratio of small vs. big objects cannot be accurately predicted.

Applications may not have high scalability requirements for all three dimensions initially, but over time those requirements can change due to external elements, so it’s important to build in all three scalability dimensions from the start.
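A quick back-of-the-envelope calculation shows why the object-count dimension is sensitive to object size. The capacity and object sizes below are illustrative assumptions, not figures from any particular deployment.

```python
def object_count(capacity_bytes: int, avg_object_bytes: int) -> int:
    """Number of objects needed to fill a capacity at a given average size."""
    return capacity_bytes // avg_object_bytes


PB = 10**15  # one petabyte, decimal

# Assumed average object sizes, for illustration only.
small = object_count(PB, 10 * 10**3)  # 10 KB objects
large = object_count(PB, 1 * 10**9)   # 1 GB objects

print(f"10 KB objects per PB: {small:,}")
print(f"1 GB objects per PB:  {large:,}")
```

Under these assumptions the same petabyte holds one hundred billion small objects but only one million large ones, so the identifier and metadata layers must scale independently of raw capacity.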

- Accessibility: One of the key benefits of true object storage is the absence of file systems. This also creates a challenge: the accessibility of the data.

Typically, object storage is accessed through applications, which use application programming interfaces (APIs) to interact directly with the backend storage pool.

Several attempts have been made to have the industry agree on a standard object interface, with mixed success. The Amazon® S3 and OpenStack® Swift APIs are currently seeing the widest adoption, but it remains to be seen whether either of them will ever become a true standard and an open API.

Therefore, it is important for object storage platforms to provide wide support for multiple protocols and applications, including file system gateways to integrate with legacy applications.

- Efficiency: Most object storage platforms claim to be the most cost-efficient solution on the market, and many platforms will be, for a very specific use case and for a very limited range of parameters. What is more difficult is finding the platform that offers the best overall efficiency, including infrastructure cost (raw storage, data centre, etc.), management cost (how many people are required to manage the infrastructure) and bandwidth consumption.

- Reliability: Central in any storage architecture is reliability. Whether you are building a low-cost archive or a high-performance storage cloud, reliability is key. But there is a lot more to data reliability than most storage providers like to admit.

Data reliability has three dimensions: availability (the time your data is instantly available for access); durability (the guarantee that your data will not be corrupted); and integrity (the assurance that your data will remain unchanged and cannot be tampered with). Security is a feature inherent to data integrity.

Only when reliability is provided in all three dimensions will you achieve true data reliability. Most platforms provide acceptable reliability grades on one, sometimes two, of the above.
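One common way to check the integrity dimension is to record a cryptographic digest when an object is written and compare it on every read. The sketch below uses SHA-256 from Python's standard library; the function names are hypothetical, and it is not a description of how any specific platform verifies data.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_digest: str) -> bool:
    """True if the data still matches the digest recorded at write time."""
    return fingerprint(data) == expected_digest


original = b"archived record"
stored_digest = fingerprint(original)  # recorded when the object is written

print(is_intact(original, stored_digest))            # True
print(is_intact(b"archived recorD", stored_digest))  # False: one byte differs
```

A digest mismatch detects corruption or tampering but does not repair it; recovery still depends on the durability mechanisms of the platform, such as replicas or erasure coding.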

- Performance: Storage performance is measured in throughput, IOPS and latency. However, most platforms are not optimized for small objects, so IOPS tends to be neglected in performance conversations.

Similarly, most object storage platforms do not allow for latency optimization, which is why latency goes unmentioned in their performance conversations. It is therefore important to look for solutions that report all three performance metrics.
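The relationship between throughput and operations per second helps explain why a single headline number can mislead. In the sketch below, throughput is approximated as operations per second multiplied by object size; the workload figures are invented for illustration and are not measurements of any platform.

```python
def throughput_mb_s(ops_per_second: float, object_bytes: int) -> float:
    """Approximate throughput in MB/s for a uniform object size."""
    return ops_per_second * object_bytes / 1e6


# Hypothetical small-object workload: many operations, little data moved.
print(throughput_mb_s(10_000, 4_000))      # 40.0

# Hypothetical large-object workload: few operations, a lot of data moved.
print(throughput_mb_s(100, 100_000_000))   # 10000.0
```

Under these assumed numbers, the large-object workload shows far higher throughput while performing one hundredth of the operations, so a platform quoted only on throughput may still perform poorly for small objects.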

Object-based storage is experiencing huge growth as customers realise its benefits as an inexpensive, scalable solution. End users looking at web object storage solutions for the first time should follow this checklist of key requirements for a comprehensive, 360-degree approach to storing the exabytes of unstructured information we generate every day.