About the Author:
Stefan Bernbo is the founder and CEO of Compuverde. For 20 years, Stefan has designed and built numerous enterprise-scale data storage solutions designed to be cost effective for storing huge data sets. From 2004 to 2010 Stefan worked within this field for Storegate, the wide-reaching Internet-based storage solution for consumer and business markets, with high availability and scalability requirements. Previously, Stefan worked on system and software architecture on several projects with Ericsson.
Feature Article: July 2014
According to a recent IBM study, over 90 percent of the data in the world today was created in the last three years alone. This massive accumulation of data can be attributed to sources that include weather, social media, digital pictures and videos, financial transactions and, of course, cell phone records. Unsurprisingly, businesses continue to wrestle with how best to collect, analyze and monetize this wealth of information. One popular answer? Virtualization.
In virtualization, a software program is integrated into the network architecture to simulate the functions of physical hardware. This approach offers more flexibility and greater hardware cost savings as more hardware functionality is virtualized. The growing popularity of virtualization allows organizations to run substantially more applications at a given time, which demands high volumes of storage and revives the need for elegant management, flexibility and efficiency. The entire storage ecosystem must therefore adjust to a virtualized world as fast as it can for organizations to remain competitive.
Virtualization on the Rise
Virtualization has been gaining traction amongst the IT crowd for the cost savings and flexibility it offers. Data center managers benefit from virtualization’s ability to use data center hardware efficiently. Physical servers commonly sit idle as activity fluctuates. By running several virtual servers on the same hardware, however, the organization can put that idle computing power to work and get more out of the machines it already owns.
Another significant benefit of virtualization is its ability to boost flexibility in the data center. The convenience of maintaining network infrastructure running virtual rather than physical machines cannot be overstated. For example, if the organization wants to change hardware, the data center administrator can easily migrate the virtual server to the updated, more powerful hardware, gaining better performance at a lower cost. Prior to the availability of virtualization, administrators had to install the new server and then reinstall and migrate all the data stored on the old server. Instead of manually migrating the data, virtualization enables administrators to move the entire virtual machine in one fell swoop.
Virtualization Isn’t for Everyone
Not all data centers are interested in virtualization. Those with a significant number of servers – somewhere in the range of 20-50 or more – are the ones starting to consider converting their servers into virtual machines. First, these organizations can benefit from the substantial cost savings and flexibility described above. Additionally, virtualizing servers makes them far easier to manage. The challenge administrators and staff face in managing large numbers of physical servers can be overwhelming. Virtualization eases data center management by enabling administrators to run the same total number of servers on fewer physical machines.
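To make the consolidation idea concrete, the following is a minimal sketch rather than a sizing tool. It assumes each existing server's workload can be summarized as a single CPU percentage, and it packs those workloads onto as few hosts as it can using a simple first-fit-decreasing heuristic. The function name, the load figures and the 80 percent capacity ceiling are all hypothetical values chosen for illustration.

```python
def consolidate(vm_loads: list[int], host_capacity: int) -> list[list[int]]:
    """Pack VM loads (percent of one host's CPU) onto hosts, first-fit decreasing."""
    hosts: list[list[int]] = []
    # Place the largest workloads first; they are the hardest to fit.
    for load in sorted(vm_loads, reverse=True):
        if load > host_capacity:
            raise ValueError("a single VM exceeds the capacity of one host")
        for host in hosts:
            # Use the first host that still has room for this workload.
            if sum(host) + load <= host_capacity:
                host.append(load)
                break
        else:
            # No existing host has room, so bring another host into use.
            hosts.append([load])
    return hosts


# Hypothetical estate: 20 lightly used servers, each needing 5-30% of a host.
vm_loads = [5, 10, 15, 20, 30] * 4

# Assumed ceiling: never plan to fill a host beyond 80% of its CPU.
hosts = consolidate(vm_loads, host_capacity=80)
print(len(vm_loads), "servers run on", len(hosts), "physical hosts")
```

Real capacity planning would also weigh memory, I/O and peak-hour behavior; the sketch only shows why lightly loaded servers leave so much room for consolidation.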
Despite the clear benefits of virtualization, the trend towards greater adoption of virtual servers is placing stress on traditional data center infrastructure and storage. The first virtual machines used the local storage found within the physical server, making it impossible for administrators to migrate a virtual machine from one physical server to another with a more powerful CPU. Introducing shared storage – either network-attached storage (NAS) or a storage area network (SAN) – to the VM hosts resolved this issue, and its success paved the way for growing numbers of virtual machines, all of which were kept in shared storage. Eventually the situation matured into today’s server virtualization landscape, where all physical servers and VMs are connected to the same storage. Much like the freeway system in Los Angeles, the problem now becomes one of congestion.
A single point of entry quickly becomes a single point of failure. With all data flow forced through a single gateway, data gets bogged down rapidly during periods of heightened traffic. With the number of VMs and quantity of data only projected to grow to dizzying levels, it is clear that this approach to storage architecture must be improved.
Learning from Lessons Past
Early adopters of virtualized servers – such as telcos or major service providers – have already encountered problems with data bottlenecks and are taking steps to negate their impact. As other organizations begin to virtualize their data centers, they will run into the same issues.
However, there are workarounds to address network congestion. Organizations looking to optimize virtualization but wanting to avoid data congestion can achieve a balance by removing the single point of entry. Both NAS and SAN storage solutions today have just a single gateway that controls the flow of data, leading to congestion when demand spikes. Instead, organizations should seek solutions that have multiple data entry points and distribute load evenly among all servers. That way the system maintains optimal performance and reduces lag time, even under a heavy load. While this approach represents the most straightforward fix, the next generation of storage architecture offers another alternative as well.
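As a rough illustration of the difference, the sketch below counts how many requests each entry point would receive when the same workload is sent through a single gateway versus spread across four storage nodes. It is not modeled on any particular product: the node names, the key format and the hash-based placement rule are assumptions made purely for the example.

```python
import hashlib
from collections import Counter


def pick_entry_point(object_key: str, entry_points: list[str]) -> str:
    """Choose an entry point for a key using a stable hash (assumed placement rule)."""
    digest = hashlib.sha256(object_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(entry_points)
    return entry_points[index]


def request_counts(keys: list[str], entry_points: list[str]) -> Counter:
    """Count how many requests each entry point would have to serve."""
    return Counter(pick_entry_point(key, entry_points) for key in keys)


# Hypothetical workload: 10,000 block requests coming from 200 VMs.
keys = [f"vm-{i % 200}/block-{i}" for i in range(10_000)]

# Traditional layout: every request passes through one gateway.
single_gateway = request_counts(keys, ["gateway-0"])

# Alternative layout: four nodes, each able to accept requests directly.
four_nodes = request_counts(keys, [f"node-{n}" for n in range(4)])

print("single gateway:", dict(single_gateway))
print("four entry points:", dict(four_nodes))
```

With one gateway, every request lands on the same machine; with four entry points, each node handles roughly a quarter of the traffic, so a spike in demand is shared rather than queued behind a single door.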
Where Computing and Storage Meet
To meet the storage challenge of scale-out virtual environments, the practice of running VMs inside the storage nodes themselves (or running the storage inside the VM hosts) – thereby turning each storage node into a compute node as well – is quickly rising to prominence as the next generation of storage architecture.
In this approach, the entire architecture is decentralized. By contrast, when an organization uses shared storage in a SAN, the VM hosts typically sit on top of the storage layer, essentially turning it into one huge storage system with a single entry point. To address the data congestion issues this arrangement generates, some organizations are moving away from the traditional two-layer architecture toward one in which both the virtual machines and the storage run in the same layer.
The movement towards virtualizing infrastructure is not showing signs of slowing anytime soon. Indeed, increasing numbers of companies will adopt virtualization and will inevitably encounter the performance latency previously described. However, by taking a cue from the early adopters who have developed the best practices above, organizations can deploy a successful scale-out virtual environment that keeps costs low while maximizing performance.
By Stefan Bernbo
Founder and CEO of Compuverde