Web-Scale Architectures

About the Author:

Stefan Bernbo is the founder and CEO of Compuverde. For 20 years, Stefan has designed and built numerous enterprise-scale data storage solutions that store huge data sets cost-effectively. From 2004 to 2010, Stefan worked in this field for Storegate, the wide-reaching Internet-based storage solution for consumer and business markets, with high availability and scalability requirements. Previously, Stefan worked on system and software architecture for several projects with Ericsson.

Feature Article: April 2015


Gartner estimates that the Internet of Things (IoT) will include 26 billion installed units by 2020, and that the IoT market will exceed $300 billion by that time, mostly in services. These billions of connected devices and sensors will continue to generate petabytes of data. In addition, cloud services are migrating huge volumes of data as well.

These rapid technological advances demand new storage architectures. It is becoming increasingly clear that even a linear growth trajectory for storage is insufficient to deliver the quantity of storage needed for data produced by the IoT. Current architectures have bottlenecks that, while merely inconvenient for legacy data, are simply untenable for the scale of storage needed today.

Enterprises are learning to accommodate the explosive growth of data by adopting web-scale architectures that enable virtualization, compute and storage functionality on a tremendous scale.

Scale and Performance to Overcome Bottlenecks

A single point of entry can become a bottleneck and a single point of failure, especially with the demands of cloud computing on Big Data storage. A key element in web-scale storage design is that it removes all bottlenecks from storage architecture. Adding redundant, expensive, high-performance components to alleviate the bottleneck, as most service providers presently do, adds cost and complexity to a system very quickly. On the other hand, a horizontally scalable web-scale system designed to distribute data among all nodes makes it possible to choose cheaper, lower-energy hardware.
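As a rough illustration of what "distributing data among all nodes" can look like, the sketch below uses consistent hashing, one common placement technique. It is not a description of any particular product. The `HashRing` class, the node names and the `vnodes` parameter are all hypothetical. Every client can compute an object's owner locally, so no request has to pass through a single point of entry.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring that spreads objects over nodes."""

    def __init__(self, nodes, vnodes=64):
        # Each node is placed at several points ("virtual nodes") so that
        # load spreads more evenly across inexpensive machines.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        """Return the node owning `key`: the first ring point clockwise."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("videos/holiday.mp4")  # computed locally, no central lookup
```

One reason for choosing this scheme in a sketch: when a node is removed, only the objects that node owned move elsewhere, so capacity can be added or retired one commodity server at a time.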

For cloud providers, which must manage far more users and greater performance demands than enterprises do, solving performance problems like data bottlenecks is a big concern. While the average user of an enterprise system demands high performance, these systems typically have fewer users, and those users can access their files directly through the local network. Furthermore, enterprise users typically access, send and save relatively small files such as documents and spreadsheets, which uses less storage capacity and lightens the performance load.

It’s a different story, though, for a cloud user outside the enterprise. The system is accessed simultaneously over the Internet by far more users, and that concurrency itself becomes a performance bottleneck. The cloud provider’s storage system not only has to scale with each additional user, but must also maintain performance across the aggregate of all users. Significantly, the average cloud user accesses and stores far larger files (music, photos and video) than the average enterprise user does. Web-scale architectures are designed to prevent the bottlenecks that this volume of usage causes in traditional legacy storage setups.

Goodbye to Hardware

To build out affordably to meet increasing demand, web-scale architecture must forgo proprietary hardware appliances and be defined entirely in software. Hardware inevitably fails, and at a number of points within the machine. Traditional appliances, meaning storage hardware with proprietary software built in, therefore typically include multiple copies of expensive components to anticipate and prevent failure. These extra layers of identical hardware mean higher energy costs and add layers of complication to a single appliance. Because the actual cost per appliance is quite high compared with commodity servers, cost estimates often skyrocket when companies begin examining how to scale out their data centers. One way to avoid this is by using software-defined vNAS or vNAS in a hypervisor environment, both of which offer a way to build out servers at a web-scale rate.

Reducing Complexity, Increasing Consistency

Centralization has been a trend in data centers, but distributed storage presents the best way to build at web-scale levels.  This is because there are now ways to improve performance at the software level that neutralize the performance advantage of a centralized data storage approach.  

The appeal of cloud-based services is that users can access them from anywhere at any time. Service providers, therefore, must be able to offer data centers located across the globe to minimize load time. With global availability, however, come a number of challenges. Each company’s load is active in the data center in its own region, yet the data stored in all locations must be kept in sync. From an architecture point of view, it’s important to solve these problems at the storage layer instead of up at the application layer, where they become more difficult and complicated to solve.

If a local data center or server goes down, global data centers must reroute data quickly to available servers to minimize downtime. While there are certainly solutions today that solve these problems, they do so at the application layer.  Attempting to solve these issues that high up in the hierarchy of data center infrastructure – instead of solving them at the storage level – presents significant disadvantages in terms of cost and complexity. Solving these issues directly at the storage level through web-scale architectures delivers significant benefits in efficiency, time and cost savings.
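To make the idea of rerouting at the storage level concrete, here is a minimal sketch, assuming each object is written to a fixed number of sites and reads fall back to the next healthy copy. The `ReplicatedStore` class, the site names and the simple placement rule are hypothetical; a real system would also need to resynchronize a site after it recovers.

```python
class ReplicatedStore:
    """Hypothetical storage layer that hides site failures from callers."""

    def __init__(self, sites, copies=2):
        # Assumes copies <= number of sites.
        self.sites = {name: {} for name in sites}  # site name -> local objects
        self.down = set()                          # sites currently unreachable
        self.copies = copies

    def replicas(self, key):
        """Pick `copies` sites for a key with a simple deterministic rule."""
        names = sorted(self.sites)
        start = sum(key.encode()) % len(names)
        return [names[(start + i) % len(names)] for i in range(self.copies)]

    def put(self, key, value):
        # Write to every reachable replica site for this key.
        for site in self.replicas(key):
            if site not in self.down:
                self.sites[site][key] = value

    def get(self, key):
        # Try replica sites in order, skipping any that are down.
        for site in self.replicas(key):
            if site not in self.down and key in self.sites[site]:
                return self.sites[site][key]
        raise KeyError(key)


store = ReplicatedStore(["asia", "eu", "us"], copies=2)
store.put("report.pdf", b"v1")
store.down.add(store.replicas("report.pdf")[0])  # simulate a site outage
data = store.get("report.pdf")                   # served by the surviving copy
```

The application only ever calls `put` and `get`. Which site answers is decided inside the storage layer, which is the separation of concerns the paragraph above argues for.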

Building for the Future

If companies continue to rely on expensive, inflexible appliances in their data centers as demand for storage grows, they will be forced to spend heavily to develop the storage capacity they need to meet customer needs. Having an expansive, rigid network environment locked into configurations determined by an outside vendor severely constrains the ability of the organization to react nimbly to market demands, much less anticipate them proactively.  Web-scale storage philosophies enable major enterprises to “future proof” their data centers.  Since the hardware and the software are separate investments, either may be switched out to a better, more appropriate option as the market dictates, at minimal cost.  

Storage architecture at web-scale is required to meet the challenges that are arising as cloud services increase in popularity and the Internet of Things becomes a reality. Technologies such as hyper-converged infrastructures and software-defined storage are helping enterprises and ISPs to expand their compute capacity. These options will serve organizations that need to efficiently store ever-growing volumes of data.

By Stefan Bernbo

Founder and CEO of Compuverde

