million at Accel Partners, the VC firm.
"Big data is one of the biggest transformational changes in the data center and IT landscape," said Ping Li, a partner at the VC firm Accel Partners, which is running a $100 million Big Data fund. "It happens once in a generation," he told the audience at a Churchill Club panel in Silicon Valley.
Gartner predicts that data will grow 800 percent over the next five years and that 80 percent of the data will be unstructured.
And just what constitutes Big Data? After SC2011, the US supercomputing conference in Seattle in November, Addison Snell, an industry analyst, blurred the lines a bit in a podcast with HPCwire. "There is small Big Data just as there is entry-level high performance computing," he said. "Someone who has worked in gigabytes and now has to work in terabytes is dealing with Big Data," added Snell, the CEO of Intersect360. "It's relative to the infrastructure you had before." It may incorporate complex event processing, data mining and complex real-time analytics. Big Data can have many elements: large files, large volumes and real-time I/O within a short data life span.
"Every vendor at SC2011 was talking about big data," agreed Nicole Hemsoth, editor of Datanami. Or to put Snell's observation another way, Big Data breaks existing systems and ways of working.
"A lot of people know how to work with data," observed Anand Rajaraman, "but now there is a lot more data, so the kinds of things you can do with it and the way you work with it are very different." The founder of companies which have been acquired by Amazon and Walmart, Rajaraman is now senior vice president at Walmart Global ecommerce and co-founder of @WalmartLabs, and a professor at Stanford.
"The tools [for Big Data] are very different. Many of the fundamental algorithms for predictive analytics depend crucially on keeping the data in main memory with a single CPU to access it. Big Data breaks that condition. The data can't all be in memory at the same time, so it needs to be processed in a distributed fashion. That requires a new programming model." This can be hard for traditional data users to understand. He watches students attack Big Data problems by creating a sample, but that defeats the value of Big Data with all its potentially informative outliers.
"Businesses are catching on to the promise of Big Data," said Luke Lonergan, chief technology officer at Greenplum, an analytics company that was acquired by EMC. "Every business is looking for ways to get tighter connection with its customers, to improve prediction and move them along a trajectory. We see a certain urgency around Big Data."
Traditional users of Big Data (retail, telecom and intelligence) are already comfortable with it, said Lonergan. The next big set of users are in mobile-social, especially incorporating geo-location. Some areas have
been underserved, such as health care, which
he described as the third rail because it has been too hard and too slow. But now health care is experiencing fundamental change similar to what retail felt when customers came in armed with smartphones and had more information than sales people. Patients are starting to acquire more information and health care providers are developing more analytics.
"Big Data is also going to change science," said Rajaraman. "The way science is done has evolved over the centuries: observation, notebooks and then you come up with a theory. That was followed by algorithmic science. Now data analysis is the fourth paradigm for the new way science is being done, ranging from earth science to chemistry to biology and psychology. Science is about more and more data."
"Big Data is not an American fad; it is a global phenomenon," said Accel's Li. "It's amazing how global this is." Adoption of Big Data is happening faster in some emerging markets where they don't have to contend with legacy databases. Instead they are using open source tools with commodity servers and storage.
"I think we are in the very early days of transformation," Li added. Accel is looking for companies working in data management and platforms. "A lot of the applications that ride on top of these new data platforms have yet to be invented. Right now users are importing legacy business intelligence and ERP, putting it on top and calling it Big Data. I think we will see new applications that are Big Data. We are just starting to see the seeds as people recognize the potential of the new platform. We will see startups that will build new applications which will break the existing platforms and start again."
Rajaraman agreed. "Big Data is at a stage where early innovators are out there. This is like the early days of ecommerce when pioneers had to figure out components like fraud and payments," he added. "Each of those is now a big company. In Big Data, each of the new use cases will get commercialized and become a big company on its own."
Data growth today is faster than Moore's law and faster than bandwidth growth, so we need new ways of organizing data, new computing and algorithms because computational power is not keeping up.
"People underestimate the degree to which this can change their business," said Greenplum's Lonergan. "Those with data science teams begin to understand; others don't see how much it can do. Rather than any single stovepipe, we are about predicting the future." Some expect they can buy an application to make Big Data happen, but it doesn't work that way. Businesses should see an explosion of new activity around the explosion of Big Data.