If you work in IT, you have almost certainly heard people mention, or seen articles about, a new buzzword called ‘Big Data’ in the last few months. It is as if the IT domain were so short of larger-than-life terms that analysts and consultants have been throwing this new jargon around as though their lives depended on its popularity. I think it was last year (2011) that the battle cry became louder, building on the speculation of the previous two or three years. I am not going to talk about what Big Data refers to or what its proper definition is; I am just going to write down my over-simplistic thoughts in this post. Having said that, it makes sense to borrow a popular definition of Big Data. This Fujitsu whitepaper mentions “Big data – in commonly-used term to describe data that exceeds the processing capacity of conventional database systems”.
Buildup to Big Data
The main players in IT know that the emergence of Big Data was not unexpected. The executive management of top corporations has long wanted “holistic dashboards” that show a snapshot of all possible data related to their company whenever they want, rapidly, on any device of their choice. This requirement has been the Achilles heel of BI/data warehousing consultants who proudly proclaimed that they could transform the company by bringing all its data together. Then we have some governments that wish to record everything about their citizens so as to protect them! In my view, these two are the main drivers behind Big Data.
History has seen many thinkers who wanted to see and control everything, but the mechanisms just weren’t there. With the aid of computer technology, which has evolved in a grand manner over the last three decades, many of the old visions are finally seeing, or will soon see, the light of day. So, what are the technologies that matter most for Big Data? Cloud computing and distributed storage are the obvious answers, as they come to mind right away. But it is the declining cost of computer hardware and the increasing speed of broadband internet that are the major breakthroughs, as they operate at the infrastructural level for Big Data. Then we had the data mining movement, which scans data for all sorts of patterns so as to increase sales, and which has led to the development of complex, self-learning algorithms. So data handling mechanisms arrived to complement the data collection and storage mechanisms. To put it simply, you can now store anything about anything in multiple synchronized places across the globe, and access it and make predictions from it anywhere, using any device.
Real “Agenda” of Big Data
The online activities of tech analysts and consultants show that they are trying to make Big Data as popular as the Beatles by mentioning it in every second tweet, Facebook post, article, talk and webinar. Candidly speaking, IT industry players have always created clamours of this type to ensure they stay in business. The western business model, which fosters innovation by pushing for growth year on year, makes sure companies try everything possible to meet the two pseudo-lofty ideals of continuous growth and global expansion. Therefore, Big Data will also be accepted as part of the IT family even though some companies may not need it. On a lighter note, there must be some old CFOs and senior management folks thinking along the lines of “Oh no! Now I have to learn yet another piece of IT jargon. The old days were much better, with simple books, notes and logs.”
I see Big Data as the brainchild of capitalism. Corporations want to mine customers’ data in every possible way. Ultra-personalization refines our tastes even if we don’t want it to. It leads to over-indulgence, and that means more money in the coffers of the companies. I could go on and crib about this phenomenon, but I guess you can see by now that it is a key capitalistic agenda.
The other real beneficiary of Big Data would be our national governments. Due to the rise of terrorism in the last few years, government bodies are trying to read, see and hear anything out there. This means maintaining records of telephone conversations, using mobile data to track our movements (see this TED video for reference), video-recording our living areas, monitoring our online presence, checking our transactions and finally using all these different types of data to predict civil unrest or spot anomalies. Anyone visualizing the breadth and depth of the data involved here would immediately point to Big Data. In fact, Big Data gained some mainstream traction after the White House published an article on Big Data.
Big Brother lends its adjective to Data for its own benefit. The Panopticon vision also comes to the fore here. More data might lead to better governance, but at the cost of our privacy. Let’s see if world peace, at least, is achieved with Big Data.
Offshoots of Big Data
Interestingly, Big Data could help in realizing other revolutionary constructs that have been in the pipeline for a long time. ‘Internet of Things’ and ‘Semantic Web’ are the ones that immediately come to my mind. Big Data would serve as a solid foundation for these two constructs. In fact, Linked Data (a condensed version of the Semantic Web) and Big Data need each other to evolve, as effective metadata management and semantics are very much necessary for Big Data. A symbiotic relationship can be seen here. From a societal perspective, the large-scale logging of user data may lead to a massive opt-out campaign by consumers who are unwilling to allow companies to analyse their personal data. This is already happening, but what I foresee is a mass-awakening kind of movement (see this related TED video on being wary of online filter bubbles).
Well, Big Data can be seen as an all-encompassing Knowledge Management solution, or as a killer combination of Business Intelligence and Predictive Analytics on steroids, or as the fictional lightsaber of capitalistic economies and governments, but it is here to stay and swamp my social media timelines.