There is a lot of hyperbole about big data. A lot of hyperbole.
I saw a video last week in which someone said that all the knowledge and wisdom in the world is now available in the data on the Internet. If you are following me, you are sure to know my reaction to that kind of statement. "Foolish" does not even come close to describing my opinion of statements like that.
There is NO KNOWLEDGE in big data. There is only big data. Data is just data. And big data is a big headache when people make hyperbolic statements like the one in this video.
Another interview in the same video proved to be more thought-provoking. The guy expressed contempt for the practice of using big data as a marketing club, sending messages to people urging them to buy more stuff. His point was that big data is really a symptom of the AGE OF CONNECTION. From his viewpoint, the networking of data makes it possible to create more value for users by allowing for greater connection among people.
Perhaps. First, we have to be able to swim through a swiftly growing sea of data.
One way to look at big data is to consider the three Vs. Call it V-Cubed, or V3.
Volume… How much?
Velocity… How fast?
Variety… What kinds?
The three Vs don’t add up; they are multipliers. Volume fortifies velocity. Variety multiplies volume. Velocity enables variety. The combination of the three dimensions adds power to the growth of our data and systems.
I call this effect Gordon Moore’s Law on steroids. In its popular form, Moore’s Law says that the computing power of processors doubles every 18 months while the cost is cut in half. (Moore’s own observation was about the number of transistors on a chip, which doubles roughly every two years.) The same pattern shows up in all sorts of other technology, and in the adoption of it. Where Moore’s Law is a squaring, a doubling of the capability at half the cost, the speed of adoption of the data and the addition of software applications and hardware combine to cube the effect.
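One way to read the squaring and cubing language above is as plain multiplication of growth factors. The sketch below is only an illustration of that reading: the function name `combined_factor` is made up for this example, and the assumption that every factor exactly doubles over the same period is mine, not a measured figure.

```python
import math


def combined_factor(*factors):
    """Multiply independent growth factors into one overall factor."""
    return math.prod(factors)


# Assumption for illustration only: each factor doubles over one period.
# "Squaring": twice the capability at half the cost, so 2 x 2 per dollar.
moore = combined_factor(2, 2)

# "Cubing": volume, velocity, and variety each double, so 2 x 2 x 2.
three_vs = combined_factor(2, 2, 2)

print(moore, three_vs)
```

The point of multiplying rather than adding is that a modest gain in each dimension compounds into a much larger gain overall.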
Volume is a function of HOW MUCH data we are creating. Every time you download an image, you make a copy of the image on your computer. You craft an e-mail — yesterday’s memo — and the recipient gets a duplicate. The increasing volume of data is mostly caused by the duplication of data. Variety lends a helping hand, creating volume. Ten years ago we were impressed with two-megapixel cameras. Today your phone sucks if it doesn’t have eight megapixels!
Velocity is a combined function of the increasing speed of the networks and the number of people the networks reach. In 1990, a 10 Mbps fiber backbone in a building was the cutting-edge hotrod of networking. Today I have 25 Mbps into my house, thanks to FiOS and fiber optic networks! The current mainline standard is 10–40 Gbps (gigabits per second) between major switching nodes, and 10 Gbps into the user circuits. They are testing 100 Gbps in the field today, and 400 Gbps in the lab, which should hit the field in three years.
Add in the effect of improvements in multiplexing, the mixing of more signals into a single fiber, and both volume and velocity pick up, so the fiber networks can carry even more data.
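To get a feel for what those link speeds mean, here is a rough sketch that reads the figures above as megabits and gigabits per second and works out how long one photo would take to cross each link. The 3 MB photo size and the link labels are assumptions picked for illustration, and the calculation ignores protocol overhead and congestion.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Ideal time to move size_bytes over a link, ignoring all overhead."""
    return size_bytes * 8 / link_bits_per_second


# Hypothetical file size: roughly one compressed multi-megapixel photo.
PHOTO_BYTES = 3_000_000

# Link speeds taken from the text, read as bits per second.
links = [
    ("1990 building backbone", 10e6),
    ("home fiber today", 25e6),
    ("user circuit", 10e9),
    ("field trial", 100e9),
]

for label, bits_per_second in links:
    seconds = transfer_seconds(PHOTO_BYTES, bits_per_second)
    print(f"{label}: {seconds:.6f} s")
```

The same photo that ties up the 1990 backbone for a couple of seconds is a rounding error on the newer links, which is why faster pipes invite bigger files.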
The build-out of wireless infrastructure in the Third World further magnifies the multiplicative effect of velocity and volume. Ten years ago, the question of copper or fiber in the final mile was a First World question. The Third World did not have to worry about that, because there was no copper to start with. But there were great distances. Microwave communications, however, allowed for 20-mile-plus line-of-sight distances between cell phone towers. Moore’s Law applies to the power consumption of the chips, too, so integrated cell phone tower systems, complete with self-contained generators and interconnected with microwave line haul, became the solution in Africa. In 2008, a 747 freighter left the US every week, bound for Nigeria, filled with the radio systems that helped spread a wireless telephone and data system across a large area of Africa.
In the next decade, people in Third World Africa will enjoy data parity with people in the First World. And it will not only be the rich that have access, because Gordon Moore’s Law tells us that the cost is cut in half every 18 months, so if a communications device costs $100 today, seven halvings later, in 2023, it will cost $0.78.
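The arithmetic behind that price is simple compounding, and a few lines make it easy to check. This is a sketch under the stated assumption that the price halves once every 18 months without interruption; the function name `projected_cost` is made up for this example.

```python
def projected_cost(cost_today, years, halving_period_years=1.5):
    """Cost after `years` if the price halves every `halving_period_years`."""
    return cost_today * 0.5 ** (years / halving_period_years)


# Seven halvings at 18 months apiece take 10.5 years.
print(f"${projected_cost(100.0, 10.5):.2f}")  # prints $0.78
```

Change the halving period to two years and the same device still costs a few dollars after a decade, so the conclusion is sensitive to how fast you think prices really fall.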
The number of bytes in the average digital photo has doubled twice in the past three years. We have more people posting photographs on the Internet now than last year. Photographs change shape and message, as people lift the images and add words and captions, creating memes. We not only duplicate data, we take the duplicates and modify them, creating even more data to load onto the global network.
What about the amount of music available on the Internet? A decade ago the music industry was on a death hunt for Napster. Now we not only have the legal iTunes as a source of music, but dozens of others like it. How about television? A decade ago you could not lose yourself looking at videos of a Russian video prankster in West Palm Beach, watch a guy pull pranks on drive-through window girls, or even watch a guy give video reviews of fast food from the driver’s seat of his car. YouTube did not exist. And it is not the only game in town. Let me introduce you to Hulu, Vimeo, Dailymotion, iFilm, and MyVideo.
Just look at how much the Google home page has changed in the past year! How much more can they squeeze onto the screen? Folks, variety is growing at the same pace as volume and velocity. Talk about a snowball. Hell, this is an avalanche. It is part of the headache. There has to be some systematic reaction to this self-perpetuating cycle of growth.