
Big Data

New Frontiers for IT Management

Executive Briefing
Definition
Catalysts
Potential Value of Big Data
Thought Leaders
Leading Technology Vendors

The Age of Big Data

Data is a new class of economic asset, like currency and gold.

What is Big Data

A volume of structured and unstructured data so large that it is difficult to process with traditional database and software techniques.

What is Big Data


Walmart handles more than 1 million customer transactions every hour.
Facebook handles 40 billion photos from its user base.
Decoding the human genome originally took 10 years; now it can be achieved in one week.

Volume, Velocity and Variety


Big data spans three dimensions:

Volume
Petabytes per day/week

Velocity
Real-time capture and real-time analytics

Variety
Unstructured data, web logs, audio, video, images

Traditional Approach vs. Big Data Approach


Traditional Approach
Structured & Repeatable Analysis
Business users determine what question to ask
IT structures the data to answer that question
Monthly sales reports, profitability analysis, customer surveys

Big Data Approach
Iterative & Exploratory Analysis
IT delivers a platform to enable creative discovery
Business explores what questions could be asked
Brand sentiment, product strategy, maximum asset utilization

Catalyst: Commodity Servers


Commodity server hardware makes cost-effective massively parallel processing (MPP) possible.

An example server might contain:

CPU: 16 cores
RAM: 1 terabyte
Disk: 500 terabytes
Ethernet: 1 Gbit
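
As a rough illustration of what such a server means for capacity planning, the sketch below estimates how many of them would be needed to hold a given data volume. Only the 500-terabyte disk figure comes from the example server; the daily volume, retention period, replication factor and the function name `servers_needed` are hypothetical values chosen for illustration.

```python
import math

# Disk capacity of the example server, in terabytes (from the slide).
DISK_PER_SERVER_TB = 500


def servers_needed(daily_volume_tb, retention_days, replication=1):
    """Return how many servers are needed to store the retained data.

    daily_volume_tb, retention_days and replication are illustrative
    inputs, not figures from the briefing.
    """
    total_tb = daily_volume_tb * retention_days * replication
    return math.ceil(total_tb / DISK_PER_SERVER_TB)


if __name__ == "__main__":
    # Hypothetical workload: 1 petabyte (1000 TB) per day,
    # kept for 7 days, stored in 3 copies.
    print(servers_needed(1000, 7, replication=3))  # prints 42
```

The estimate considers disk space only; CPU, memory and network limits would need a separate calculation.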

Catalyst: Humans and the Internet


1.2 billion active mobile broadband subscriptions
Web sites with 300+ million unique visitors/month: Facebook, Yahoo, Google, YouTube

Potential Value of Big Data

$300 billion potential annual value to US health care.


$600 billion potential annual consumer surplus from using personal location data.
60% potential increase in retailers' operating margins.

Source: McKinsey Global Institute - 2010

Leading Technology Vendors


Example vendors:
IBM Netezza
EMC Greenplum
Oracle Exadata

Commonality:
MPP architectures
Commodity hardware
RDBMS based
Full SQL compliance

Hadoop Open Source



Grew out of Google's published MapReduce and file system designs and development work at Yahoo!
Now open source Hadoop
NoSQL approach to data

Foundational technologies:
Hadoop data storage framework
MapReduce engine
Hive and Pig query tools (almost SQL compliant)
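
To make the MapReduce engine concrete, here is a minimal single-process sketch of the map, shuffle and reduce steps, using word counting as the task. It is plain Python, not Hadoop's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical. In a real cluster the map and reduce steps run in parallel across many servers.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big servers"]))
```

Because each line is mapped independently and each key is reduced independently, the work can be split across machines without changing the result, which is what makes the pattern suit commodity clusters.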

Leading Vendors - Hadoop


Cloudera: Open Source Hadoop
Production releases
Very good support
Conferences and education

Amazon's Elastic Compute Cloud
Map/Reduce environment
MPP for everyone
Cost effective
And you can buy a book!

IBM Netezza
Simplifies Data Warehousing
Speed: 10-100x better performance
Simplicity: Admin cost reduced by 75-90%
Scalability
Smart system: In-database analytics

IBM Netezza

Significant response improvement
Faster platform means better report response

Direct data availability
Higher trust in data, one version of truth
Aggregation reduction
Any attribute available

Operational benefits
Storage savings (no data replicas)
Administration cost reduction (DBA)
Infrastructure simplification
Lower environment complexity

BigDataArchitecture.com

Tavo@BigDataArchitecture.com
