About the Author:
Daniel Hardman is a savvy architect with passion, business and people sense, and a track record of orchestrating killer solutions to complex problems. He has significant experience managing both local and outsourced teams. Hardman has been the Chief Architect at Adaptive Computing for the past two years.
Feature Article: March 2014
Ask a hundred pundits, and you’ll get a hundred definitions of big data. Some suggest a specific size (“anything over 50 TB is big data”); others like to talk about the 3 Vs (volume, velocity, variety) or the 4 Vs (3 Vs + veracity). But I think the simplest definition is best:
Big data is any data too overwhelming to mine for insight with naive methods.
Notice the term “naive methods”—not “easy methods” or “familiar methods” or “old methods.” If you can think of a straightforward and practical way to get what you want out of the data, off the top of your head, it’s not big data. Even if your solution is expensive, big, or time-consuming. On the other hand, if using the data requires thoughtful weighing of tradeoffs and expenses, discussions with stakeholders, the creation of custom tools, trial and error, or the resetting of expectations, then you’ve met a big data test.
The other half of the definition is also significant: “mine for insight.” If all you want to do is dump data onto massive tape libraries and archive it for a decade, it’s not really in the big data sweet spot. You may be wrestling with data, and it may be big, but you’re not really pursuing the problem that has the whole tech industry buzzing.
Big data’s raison d’etre is insight.
Which leads us to cloud. Tackling big data without a cloud-centric worldview is sort of like building a skyscraper without doing a soil study first: you might make some initial progress, but sooner or later you’ll discover that you need to understand and thoroughly adapt an (inadequate) foundation. At a minimum, you’ll experience false starts and thrashing; in many cases, you may never place a capstone.
The reason for this claim goes back to the two bolded assertions above: big data resists naive methods, and its purpose is insight. Cloud is all about dynamic environments, agility, adjusting, experimenting… If you’re going to do some analyzing, you want to do it without massive CapEx, so you can learn while the price is affordable. That’s cloud.
Cloud is also about flexible applications: scaling out, plumbing connections when they’re needed, renting access to world-class tools you could not otherwise afford… And that’s what you need for insight. Most of us don’t have the deep pockets to build or buy the computational horsepower of Google BigQuery, or of Amazon’s DynamoDB or CloudSearch or Elastic MapReduce. But with cloud, we can rent it. This makes entire categories of insight accessible to mere mortals.
The CIA didn’t hire Amazon to create an internal cloud just so they could run an intranet and internal wiki. They are building insight factories out of their intelligence, and they need a cloud to make it work.
Not all compelling tech problems live at this nexus of big data and cloud, but an amazing number do, and the convergence is intensifying.
By Daniel Hardman
Chief Architect, Adaptive Computing