
Accelerate Big Data Analytics with IBM PureData System

2012 IBM Corporation

April 2012 - IBM announced a new family of expert integrated systems: IBM PureSystems

Systems with integrated expertise and built for cloud

Built-in Expertise
Capturing and automating what experts do from the infrastructure patterns to the application patterns

Integration by Design
Deeply integrating and tuning hardware and software in a ready-to-go, workload-optimized system

Simplified Experience
Making every part of the IT lifecycle easier - with integrated management of the entire system and a broad open ecosystem of optimized solutions

New announcements for the PureSystems family that change the economics and experience of IT and accelerate time to value

Infrastructure
Delivering Infrastructure Services

Application Platform
Delivering Platform Services

Data Platform
Delivering Data Services

Meeting Big Data Challenges Fast and Easy! IBM PureData System
For apps like E-commerce
System for Transactions

Database cluster services optimized for transactional throughput and scalability

For apps like Customer Analysis


System for Analytics

Data warehouse services optimized for high-speed, peta-scale analytics and simplicity

Powered by Netezza technology

For apps like Real-time Fraud Detection


System for Operational Analytics

Operational data warehouse services optimized to balance high performance analytics and real-time operational throughput

IBM PureData System: Optimized exclusively for data services

Optimized for data services:


Transactional Analytics

Workload optimized performance


Integrated management Integrated maintenance Single point of support

Expert integrated:
Data platform Infrastructure Unified platform management Built-in expertise

Data Platform
Delivering Data Services


The NEW PureData System for Analytics provides:

System for Analytics

The FASTEST time to value on the market today
Optimized analytics performance for Big Data (NEW)
Simple administration for fast and agile deployment
Large library of analytic functions to accelerate analytic performance


IBM PureData System for Analytics


Optimized exclusively for analytic data workloads
System for Analytics: Delivering data services for analytics

Speed
10-100x faster than traditional custom systems (1)
Patented, hardware-accelerated MPP (Massively Parallel Processing)

Simplicity
Data load ready in hours
No database indexes
No tuning
No storage administration

Scalability
Peta-scale data capacity (2)

Smart
Designed to run complex analytics in minutes, not hours
Richest set of in-database analytics

1 Based on IBM customers' reported results. "Traditional custom systems" refers to systems that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
2 Peta-scale capacity offered in the N1001 model.


Benefits of the IBM PureData System for Analytics


The fastest performance of Netezza technology to date!

Accelerate Performance of Analytic Queries
3X faster performance (1) for Big Data analytics
128 GB/sec effective scan rate per rack (2) to tackle Big Data faster

Increase Efficiency of your Data Center
50% greater data capacity per rack (3) helps optimize data center efficiency
More capacity and less power per rack than both Oracle and Teradata

Simplicity and Ease of Administration
Improved system management and resilience to spend less time managing and more time delivering value

1 Based on a comparison of the IBM PureData System for Analytics N2001 to the IBM PureData System for Analytics N1001. The performance speed refers to the query times on both macro-analytic and mixed workload tests as conducted in IBM engineering lab benchmarks. The N2001 query times were an average of 3x faster than those of the N1001. Individual results may vary.
2 128 GB/sec scan rate assuming an average of 4x compression across the system. Individual results may vary.
3 Capacity of IBM PureData System for Analytics N2001 compared to previous generation IBM PureData System for Analytics N1001.

Accelerate analytic performance


With the fastest out of the box scan rates

Accelerate Performance of Analytic Queries

Scan rate limits how quickly data can be read and processed
Big Data complex analytics requires fast access to big volumes of data
Storage bottlenecks limit efficiency and mean long wait times
Faster performance drives revenue for data-driven businesses

PureData System for Analytics
Increased parallelism
Offloads decompression, filtering, and processing to FPGA for blazing fast performance!
Industry-leading out of the box effective scan rate of 128 GB/sec (3) for ALL the data

Limited to what can fit in flash cache. Disk bandwidth only 25 GB/sec!

1 Based on a scan rate of 38 GB/sec per rack for Teradata 2690 from http://www.teradata.com/News-Releases/2011/Fifth-Generation-Teradata-Data-Warehouse-Appliance-Delivers-Double-the-Performance-and-Triple-the-Data-Capacity/ versus IBM PureData System for Analytics N2001 scan rate of 32 GB/sec raw scan rate per rack, 128 GB/sec effective scan rate per rack. 128 GB/sec scan rate per rack assuming an average of 4X compression across the system. Individual results may vary. Scan rates per rack are based on out of the box configurations.
2 Based on a scan rate of 100 GB/sec for flash cache operations from Oracle X3-2 datasheet versus IBM PureData System for Analytics N2001 scan rate of 32 GB/sec raw scan rate, 128 GB/sec effective scan rate. 128 GB/sec scan rate assuming an average of 4X compression across the system. Individual results may vary. Based on per rack and out of the box configurations.
3 128 GB/sec scan rate assuming an average of 4X compression across the system. Individual results may vary.
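The scan-rate comparison in the footnotes above reduces to simple arithmetic. The following is a minimal sketch that uses only the per-rack figures quoted on this slide (32 GB/sec raw with an assumed 4X average compression for the N2001, 38 GB/sec for the Teradata 2690, 100 GB/sec flash-cache scan rate for the Oracle X3-2). The function names are illustrative and are not part of any IBM tool.

```python
def effective_scan_rate(raw_gb_per_sec, compression_ratio):
    """Effective rate = raw disk scan rate multiplied by the average compression ratio."""
    return raw_gb_per_sec * compression_ratio


def percent_faster(rate, baseline):
    """How much faster `rate` is than `baseline`, as a percentage of the baseline."""
    return (rate - baseline) / baseline * 100.0


# Figures quoted in the slide footnotes (per rack, out of the box).
n2001_effective = effective_scan_rate(32.0, 4.0)      # 4X compression is the slide's assumption
vs_teradata = percent_faster(n2001_effective, 38.0)   # Teradata 2690
vs_oracle = percent_faster(n2001_effective, 100.0)    # Oracle X3-2, flash cache operations

print(f"N2001 effective scan rate: {n2001_effective:.0f} GB/sec")
print(f"Faster than Teradata 2690 by {vs_teradata:.1f}%")
print(f"Faster than Oracle X3-2 by {vs_oracle:.1f}%")
```

With these inputs the effective rate is 128 GB/sec, roughly 237% faster than the Teradata figure and 28% faster than the Oracle figure. Everything depends on the 4X compression assumption; at a different compression ratio the effective rate scales proportionally.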

High scan rates drive analytic results- faster!

Accelerate Performance of Analytic Queries

NYSE has replaced an Oracle relational database with a data warehousing appliance from Netezza allowing it to conduct rapid searches of 650 terabytes of data.
- ComputerWeekly.com


Spend less on PureData than Oracle Exadata for faster performance!

Accelerate Performance of Analytic Queries

Comparing dollars per GB per second, Oracle will cost you more than 500% more to scan the same gigabyte of data! (1)

553% more

1 IBM PureData System for Analytics N2001 = $2,637,300 list price / 128 GB per sec = $20,603.91/GB/sec. 128 GB/sec scan rate assumes an average of 4x out of the box compression across the system. Oracle X3-2 = $11,384,280 list price / 100 GB per sec = $113,842.80/GB/sec. Oracle scan rate from Oracle X3 datasheet at http://www.oracle.com/us/products/database/exadata-db-machine-x3-2-1851253.pdf and pricing documents at http://www.oracle.com/us/corporate/pricing/exadata-pricelist-070598.pdf and http://www.oracle.com/us/corporate/pricing/technology-price-list-070617.pdf. Individual results may vary. This comparison is based on out of box configurations for both systems.
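The dollars-per-GB-per-second figures in the footnote can be reproduced directly from the quoted list prices and scan rates. This is a small sketch of that arithmetic; the function and variable names are illustrative, and the inputs are exactly the numbers stated on the slide.

```python
def dollars_per_gb_per_sec(list_price_usd, scan_rate_gb_per_sec):
    """List price divided by scan rate, in USD per (GB/sec)."""
    return list_price_usd / scan_rate_gb_per_sec


ibm_cost = dollars_per_gb_per_sec(2_637_300, 128)      # N2001, 128 GB/sec effective (4x compression assumed)
oracle_cost = dollars_per_gb_per_sec(11_384_280, 100)  # Oracle X3-2, 100 GB/sec flash cache scan rate

ratio = oracle_cost / ibm_cost          # Oracle cost as a multiple of the IBM cost
percent_more = (ratio - 1.0) * 100.0    # strict "percent more" reading

print(f"IBM:    ${ibm_cost:,.2f} per GB/sec")
print(f"Oracle: ${oracle_cost:,.2f} per GB/sec")
print(f"Ratio:  {ratio:.2f}x ({percent_more:.0f}% more)")
```

The per-unit costs match the footnote ($20,603.91 and $113,842.80). The ratio comes out at about 5.53, which corresponds to the 553% figure when read as "553% of the IBM cost"; read strictly as "percent more", the same inputs give about 453%.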


Increase data center efficiency


With faster, more efficient systems

Increase Efficiency of your Data Center

PureData uses Less Power than other systems (1)
PureData has More Capacity than other systems (2,3)
PureData has Out of the box Faster Scan Rates than other systems

PureData System for Analytics
Offers more than 160% greater capacity than Teradata 2690
Has more than a 200% faster scan rate than Teradata!
Nearly 30% faster scan rate than Oracle X3-2
Uses less power than Oracle X3-2
Using nearly 40% less power than Teradata 2690

Less floor space
Less power per data center tile
Less cooling per data center tile

1 Teradata 2690 data sheet claims 8.8 kW power per rack. IBM PureData System for Analytics N2001 is 7.5 kW power per rack.
2 Oracle high performance drives provide 45 TB per rack before compression per Oracle X3-2 datasheet. IBM PureData System for Analytics N2001 pre-compression capacity is 48 TB per rack. Claims are based on out of the box configurations.
3 300 GB drive capacity for Teradata 2690 is 18.2 TB before compression per http://www.teradata.com/News-Releases/2011/Fifth-Generation-Teradata-Data-Warehouse-Appliance-Delivers-Double-the-Performance-and-Triple-the-Data-Capacity/. IBM PureData System for Analytics N2001 pre-compression capacity is 48 TB per rack. Claims are based per rack and on out of the box configurations.
See Slide 9 (Accelerate analytic performance) for the scan rate footnotes.
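The per-rack capacity claims can be checked against the pre-compression figures given in the footnotes. A minimal sketch, using only those quoted numbers (48 TB for the N2001, 18.2 TB for the Teradata 2690, 45 TB for the Oracle X3-2); the helper name is illustrative.

```python
def percent_greater(value, baseline):
    """How much larger `value` is than `baseline`, as a percentage of the baseline."""
    return (value - baseline) / baseline * 100.0


# Pre-compression capacity per rack, in TB, as quoted in the slide footnotes.
N2001_TB = 48.0
TERADATA_2690_TB = 18.2
ORACLE_X3_2_TB = 45.0

capacity_vs_teradata = percent_greater(N2001_TB, TERADATA_2690_TB)
capacity_vs_oracle = percent_greater(N2001_TB, ORACLE_X3_2_TB)

print(f"Capacity vs Teradata 2690: {capacity_vs_teradata:.1f}% greater")
print(f"Capacity vs Oracle X3-2:   {capacity_vs_oracle:.1f}% greater")
```

The Teradata comparison works out to roughly 164%, in line with the "more than 160% greater capacity" statement; the Oracle comparison is a much smaller margin of about 7%.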


Spend less time managing and more time innovating

Simplicity and Ease of Administration

Easy Administration Portal


No software installation
No indexes and tuning
No storage administration
No dbspace/tablespace sizing and configuration
No redo/physical/logical log sizing and configuration
No page/block sizing and configuration for tables
No extent sizing and configuration for tables
No Temp space allocation and monitoring
No RAID level decisions for dbspaces
No logical volume creations of files
No integration of OS kernel recommendations
No maintenance of OS recommended patch levels
No JAD sessions to configure host/network/storage

Data Experts, not Database Experts



How we did it, conceptually


More Drives with Faster Scan Rates
Faster FPGA Cores, Driving Higher Performance
Leading to Faster Performance

[Figure: per-core data path. Drives (2.5 drives @ 130 MB/sec each, compared with 1 drive @ 120 MB/sec) feed an FPGA core (Decompress, Project, Filter), which feeds a CPU core (Analyze). Throughput labels in the figure: 1000 MB/sec, 1000 MB/sec +, 500 MB/sec, 800 MB/sec +. Caption: Balanced Performance]
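The division of labor named on this slide (the FPGA core decompresses, projects, and filters; the CPU core analyzes) can be illustrated with a toy software pipeline. This is only a conceptual sketch in plain Python, not how the appliance is implemented: the row layout, function names, and use of zlib are all assumptions made for illustration.

```python
import zlib


def compress_block(rows):
    """Pack rows as compressed CSV text; a stand-in for compressed on-disk storage."""
    text = "\n".join(f"{r['id']},{r['region']},{r['amount']}" for r in rows)
    return zlib.compress(text.encode("utf-8"))


def fpga_stage(blocks, columns, predicate):
    """Decompress, project, and filter, so that only needed data moves downstream."""
    for block in blocks:
        text = zlib.decompress(block).decode("utf-8")            # decompress
        for line in text.splitlines():
            row_id, region, amount = line.split(",")
            row = {"id": int(row_id), "region": region, "amount": float(amount)}
            projected = {name: row[name] for name in columns}    # project
            if predicate(projected):                             # filter
                yield projected


def cpu_stage(rows):
    """Analyze: a simple aggregate over the rows that survived the earlier stages."""
    return sum(row["amount"] for row in rows)


rows = [
    {"id": 1, "region": "east", "amount": 10.0},
    {"id": 2, "region": "west", "amount": 5.0},
    {"id": 3, "region": "east", "amount": 2.5},
]
blocks = [compress_block(rows[:2]), compress_block(rows[2:])]

total_east = cpu_stage(
    fpga_stage(blocks, ["region", "amount"], lambda r: r["region"] == "east")
)
print(total_east)
```

The point of the arrangement is that the analysis stage only ever sees the columns and rows the query needs, so the later, more expensive stage does less work per byte read from disk.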

PureData System for Analytics hardware overview


Model N2001
Scales from Rack to 4 Racks

12 Disk Enclosures
288 600 GB SAS2 Drives: 240 User Data, 14 S-Blade, 34 Spare
RAID 1 Mirroring

2 Hosts (Active-Passive)
2 6-Core Intel 3.46 GHz CPUs
7x300 GB SAS Drives
Red Hat Linux 6 64-bit

7 PureData for Analytics S-Blades
2 Intel 8 Core 2+ GHz CPUs
2 8-Engine Xilinx Virtex-6 FPGAs
128 GB RAM + 8 GB slice buffer
Linux 64-bit Kernel

User Data Capacity: 192 TB*
Data Scan Speed: 478 TB/hr*
Load Speed (per system): 5+ TB/hr
Power Requirements: 7.5 kW
Cooling Requirements: 27,000 BTU/hr

* 4X compression assumed
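Two of the figures on this slide can be cross-checked with simple accounting: the drive counts and the compressed user data capacity. The sketch below uses the drive breakdown from this slide, and takes the 48 TB pre-compression capacity per rack from the footnotes of the data center efficiency slide in this deck. The constant names are illustrative.

```python
# Drive breakdown for one N2001 rack, as listed on the hardware overview slide.
DRIVES_USER_DATA = 240
DRIVES_S_BLADE = 14
DRIVES_SPARE = 34
DISK_ENCLOSURES = 12

# Pre-compression capacity per rack (from the efficiency slide footnotes)
# and the compression ratio the slides assume throughout.
PRE_COMPRESSION_TB = 48
ASSUMED_COMPRESSION = 4

total_drives = DRIVES_USER_DATA + DRIVES_S_BLADE + DRIVES_SPARE
drives_per_enclosure = total_drives // DISK_ENCLOSURES
user_data_capacity_tb = PRE_COMPRESSION_TB * ASSUMED_COMPRESSION

print(f"Total drives: {total_drives}")
print(f"Drives per enclosure: {drives_per_enclosure}")
print(f"User data capacity at 4X compression: {user_data_capacity_tb} TB")
```

The drive counts sum to the 288 drives stated on the slide (24 per enclosure), and 48 TB at 4X compression gives the 192 TB user data capacity marked with an asterisk.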


Leveraging the


Smarter Analytics should be your goal


CIOs rank Analytics as the #1 factor contributing to an organization's competitiveness. (1)
Organizations that embrace analytics are more than 2X as likely to outperform their peers. (2)
Financial outperformers are 64% more likely to use analytics to evaluate talent supply and demand on an ongoing basis. (3)
Enterprises that apply advanced analytics have 33% more revenue growth and 12X more profit growth. (4)

1 IBM CIO Study 2009
2 IBM IBV/MIT Sloan Management Review Study 2011
3 IBM CHRO Study 2010
4 IBM CFO Study 2010


Achieve Smarter Analytics by using all types of analytics against all types of data
Operational reports Analytic reports Documents Transactional & Application Data

Smarter Analytics

Alerts
Statistical Analysis

Enterprise Content

Spatial Analysis
Forecasting Predictive Modeling Optimization Social Analytics Web Analytics

Machine & Sensor Data

Social Data

Web Data

But how do you achieve that?


There are many challenges to overcome
Do you face these challenges:
Difficulty adding new data or analytic capability
Lack of analytical insight
Broad spectrum of workload and SLA requirements
Growing data volume, variety and velocity
Complicated system lifecycles
Administration complexity
Growing costs of IT

Then you need a platform that provides:
Increased Agility
Accelerated Time to Value
Fit for Purpose Solutions
Tools for gaining insight from Big Data
Reduced Complexity
Simplicity
Increased Efficiency

Today's big data challenges for both transactions and analytics are increasing demands on data systems
Mobile Commerce Social Analytics Big Data Cloud

Increasing Volume of data requires growing capacity: 50x growth from 2010 to 2020, to 35 ZB by 2020
Increasing Velocity of data requires higher performance: millions of transactions per second
Increasing Variety of data requires new techniques: billions of devices & sensors


Built-in expertise makes this as simple as an appliance

Dedicated device
Optimized for purpose
Complete solution
Fast installation
Very easy operation
Standard interfaces
Low cost

Simplify
Move analytics into the Data Warehouse
Integrate the server, storage and database into one optimized package
Move complex analytics into the database
Integrated, high performance analytics within the data warehouse

Analytics

Database

Storage

Server


IBM PureData System for Analytics The Simple Appliance for Serious Analytics
Built-in Expertise
No indexes or tuning
Data model agnostic
Hardware accelerated, fully parallel, optimized, In-Database Analytics

Integration by Design
Server, Storage, Database in one easy to use package
Automatic parallelization and resource optimization to scale efficiently and economically
Enterprise-class security and platform management

Simplified Experience
Up and running in hours
Minimal up front design and tuning
Minimal ongoing administration
Standard interfaces to best of breed Analytics, Business Intelligence, and data integration tools
Built-in, complex analytical capabilities allow users to derive insight from their data quickly
Easy connectivity to other Big Data Platform components

IBM PureData System for Analytics

Transforms the User Experience


Purpose-built analytics engine
Integrated database, server and storage
Standard interfaces
Low total cost of ownership
Speed: 10-100x faster than traditional systems
Simplicity: Minimal administration and tuning
Scalability: Peta-scale user data capacity
Smart: High-performance advanced analytics

NYSE Euronext improves data management with IBM PureData System for Analytics
Need
Greater flexibility to meet market demands
Reduce the time taken to access business-critical data on its network, which was taking 26 hours
The previous Oracle system trawled through large amounts of irrelevant information to complete searches

Benefits
Ability to conduct rapid searches of 650 TB of data; storing over 1 PB on PureData
Time to access business-critical data reduced from 26 hours to 2 minutes; short time to value, up and running within weeks

Video Testimonial


Premier Healthcare Alliance improves patient outcomes while reducing spending by USD 2.85B
Need
Improve patient outcomes through enhanced data sharing and analytics
Make data in different formats shared across thousands of locations easier to access

Benefits
157 participating hospitals saved an estimated 23,000 lives and reduced healthcare spending by USD 2.85B
Reduced health spending by decreasing unnecessary readmissions, ER visits and procedures
Improved access to data helps deliver the promise of Accountable Care Organizations

Video Testimonial


What makes PureData System for Analytics different?


Speed
Up to 2000X faster than before Growing by 30% every month
Netezza has allowed us to reduce the complexity of regulatory reporting and processing of exchange data from days down to minutes.

Up and running 6 months before having any training

Simplicity

Allowing the business users access to the Netezza box was what sold it.
- Steve Taff, Executive Dir. of IT Services

200X faster than Oracle system
ROI in less than 3 months
1 PB on Netezza
7 years of historical data
100-200% annual data growth

NYSE has replaced an Oracle relational database with a data warehousing appliance from Netezza, allowing it to conduct rapid searches of 650 terabytes of data. - ComputerWeekly.com

Scalability

Smart

SUNY Buffalo researchers reduced the time to perform quintillions of computations from 27 hours to 12 minutes

Once we had the data on Netezza we were able to do the same analysis and much more complex analysis in minutes. The research draws on medical records, lab results, MRI scans, and patient surveys. - Dr. Murali Ramanathan, SUNY Buffalo

What's new?
Improved performance, energy efficiency and resiliency

Increased Performance & Capacity
3x faster performance (1)
Faster disk scan rate: 128 GB/sec effective scan rate per rack

Improved Energy Efficiency
50% greater capacity and faster performance, with no increase in floor space, power or cooling requirements (2)
Uses less power than the competition
Better capacity and power requirements than competition

Improved Resiliency and Fault Tolerance
More spare drives per cabinet
Faster disk regeneration due to smaller drives
70% fewer service calls (3)

1 3x faster performance refers to the query times on both macro-analytic and mixed workload tests as conducted in IBM engineering lab benchmarks where the IBM PureData System for Analytics N2001 was shown to be an average of 3x faster than N1001. Individual results may vary.
2 50% greater capacity when compared to previous model PureData System for Analytics N1001. Power and cooling specifications within 97% of previous model PureData System for Analytics N1001.
3 Each N2001 rack contains 34 hot spare drives and 240 active drives for a ratio of 1 spare per 7 drives. Each N1001 rack contains 4 hot spare drives and 92 active drives for a ratio of 1 spare per 23 drives. The N2001 has 3.3x more spares per active drive. Frequency of disk related service calls expected to decrease by 70% assuming the same drive failure rates.
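The spare-drive footnote can be reproduced from the drive counts it quotes. In the sketch below, the ratios come straight from the footnote; the final step, which treats the frequency of disk-related service calls as inversely proportional to spares per active drive, is an assumption made here to show how a figure near 70% can arise, since the slide does not state its model.

```python
def active_per_spare(active_drives, spare_drives):
    """Number of active drives covered by each hot spare."""
    return active_drives / spare_drives


n2001_ratio = active_per_spare(240, 34)   # about 1 spare per 7 active drives
n1001_ratio = active_per_spare(92, 4)     # 1 spare per 23 active drives

# How many times more spares per active drive the N2001 carries.
spare_density_gain = n1001_ratio / n2001_ratio

# Assumption (not stated on the slide): service calls scale inversely with
# spare density when drive failure rates are unchanged.
expected_call_reduction = 1.0 - 1.0 / spare_density_gain

print(f"N2001: 1 spare per {n2001_ratio:.1f} active drives")
print(f"N1001: 1 spare per {n1001_ratio:.0f} active drives")
print(f"Spare density gain: {spare_density_gain:.1f}x")
print(f"Expected reduction in service calls: {expected_call_reduction:.0%}")
```

Under that assumption the gain is about 3.3x and the reduction is about 69%, consistent with the rounded 70% on the slide.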


IBM DB2 Analytics Accelerator


Synergy and integration with IBM and partner products

The PureData System for Analytics N2001 is also the next-generation DB2 Analytics Accelerator
Providing the same improvements to our DB2 for z/OS customers


Internal Use Only | Do Not Distribute


PureData System for Analytics takes analytics beyond reporting


Optimization: What is the best choice?
Predictive Analytics: What will happen? What will the impact be?
BI Reporting and Ad-Hoc Analysis: What happened? When and where? How much?


Integrated by design IBM Netezza In-Database Analytics Version 2.0


Netezza In-Database Analytics 2.0
Transformations
Mathematical
Geospatial
Predictive
Statistics
Time Series
Data Mining

No data movement
Analyze deep and wide data
High performance, parallel computation

Pre-built in-database analytics


Statistics
Descriptive Statistics+
Distance Measures*
Hypothesis Testing*
Chi-Square & Contingency Tables*
Univariate & Multivariate Distributions+
Monte Carlo Simulation*

Transformations
Data Profiling / Descriptive Statistics+
General Diagnostics Statistics+
Sampling Data prep

Time Series
Autoregressive+
Forecasting*

Mathematical
Basic Math*
Permutation and Combination*
Greatest Common Divisor and Least Common Multiple*
Conversion of Values*
Exponential and Logarithm*
Gamma and Beta Functions
Matrix Algebra+
Area Under Curve*
Interpolation Methods*

Data Mining
Association Rules+
Clustering+
Feature Extraction+
Discriminant Analysis*

Predictive
Linear Regression+
Logistic Regression+
Classification Bayesian Sampling Model Testing

Geospatial
Geospatial Data Type
Geometric Functions
Geometric Analysis

* Fuzzy Logix DB Lytix capabilities
+ Netezza Analytics and Fuzzy Logix DB Lytix capabilities

Combining spatial and corporate data


Delivering more insight
Combine location with 100 Million call data records
Customer usage and location can now be used together to:
Report on usage by area
Develop new marketing campaigns to gain customers in low usage areas
Plan for additional towers and network capacity

PureData System for Analytics optimization with other IBM offerings


Big Data Platform

InfoSphere Streams
InfoSphere BigInsights
System ML (Machine Learning)
Information Server v9.1
InfoSphere Discovery v4.5
InfoSphere Data Architect v8.1
InfoSphere CDC Heterogeneous Replication
InfoSphere Optim Data Archive 9.1
Industry Models v8.4: Banking, Insurance, Healthcare
Industry Model Packs: Supply Chain, Customer, Market & Campaign
Tivoli Storage Manager
Vivisimo Data Explorer v8.2

Data Integration

Business Intelligence / Performance Management

Cognos v10.2
Cognos TM1 v9.5
Guardium DB Monitoring v9
SPSS Modeler v15
Unica EMM Marketing Analytics 8.6
Unica NetInsights 8.6

Coming Soon: PureData System for Operational Analytics Guardium Informix Data Warehouse Edition SPSS v16

System Z

IBM DB2 Analytics Accelerator (IDAA) zLinux ODBC driver

Loading the PureData System for Analytics

Data Integration: Ab Initio, Cloudera, Composite Software, IBM Big Insights, IBM Information Server, IBM InfoSphere Streams, Informatica, Oracle Data Integrator, Oracle GoldenGate, SAP Business Objects

Data In: ODBC, JDBC, OLE-DB, SQL

Querying the PureData System for Analytics

Reporting & Analysis: IBM Cognos, IBM SPSS, IBM Unica, Information Builders, Kalido, KXEN, Microsoft Excel, MicroStrategy, Oracle OBIEE, SAP Business Objects, SAS, Actuate

Data Out: ODBC, JDBC, OLE-DB, SQL

Big Data meets deep analytics

Analytics without constraint



IBM Netezza Analytics: Built-in features and capabilities


IBM InfoSphere Streams Tanay GPU Appliance by Fuzzy Logix IBM InfoSphere BigInsights Cloudera Apache Hadoop
Software Development Kit 3rd Party In-Database Analytics IBM In-Database Analytics

IBM SPSS
SAS Revolution Analytics Esri Eclipse BI Tools Visualization Tools

PureData for Analytics AMPP Platform


Using advanced analytics with BI tools and SQL


[Figure: Applications (BI, ETL, Loader) submit Extended SQL to the Host (two hosts shown), which has an Accelerated SQL Result Processor. The Host exchanges Snippets with three Nodes. Each Node is labeled: IO Stream Processor, SQL Snippet, Result Processor Snippet. An Advanced Analytics component is shown alongside the snippet processing.]
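The host-and-nodes arrangement in this figure follows a general scatter and gather pattern: the host sends the same unit of work (a snippet) to every node, each node runs it over its own slice of the data, and the host merges the partial results. The following is a toy sketch of that general pattern in plain Python. It is not Netezza code; the distribution rule, row layout, and function names are assumptions for illustration only.

```python
from collections import Counter


def distribute(rows, node_count):
    """Spread rows across nodes by row id, so each node holds one slice of the table."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[row["id"] % node_count].append(row)
    return nodes


def run_snippet(node_rows):
    """Work done on one node: a partial SUM(amount) GROUP BY region over its slice."""
    partial = Counter()
    for row in node_rows:
        partial[row["region"]] += row["amount"]
    return partial


def host_merge(partials):
    """Work done on the host: combine the per-node partial aggregates."""
    total = Counter()
    for partial in partials:
        total.update(partial)   # adds values for matching keys
    return dict(total)


rows = [
    {"id": 0, "region": "east", "amount": 10},
    {"id": 1, "region": "west", "amount": 5},
    {"id": 2, "region": "east", "amount": 7},
    {"id": 3, "region": "west", "amount": 1},
    {"id": 4, "region": "east", "amount": 3},
]

node_slices = distribute(rows, node_count=3)
result = host_merge(run_snippet(node_rows) for node_rows in node_slices)
print(result)
```

Because each node only aggregates its own slice, the result is the same for any number of nodes; adding nodes reduces the amount of data each one has to scan.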


Part of the IBM Big Data Platform


Workload Optimized Solutions for all your analytic needs PureData Solutions System for Analytics
Analytics & Decision Management IBM Big Data Platform
Visualization & Discovery Application Development Accelerators Hadoop System Stream Computing Data Warehouse Systems Management

Information Integration & Governance

Big Data Infrastructure



Listen to what some other clients have to say

http://www.youtube.com/watch?v=uwn8HX0IM8o http://www.youtube.com/watch?v=yKIGQuSYUd4

http://www.youtube.com/watch?v=Y0j1XSdMSDE

http://www.youtube.com/watch?v=-6UBeGIIc98

http://www.youtube.com/watch?v=ySLiDYgObFc


PureExperience Program Let us prove it at no charge:


1. Guided analysis of business value
2. PureSystems Technology Demonstration
3. On-Site Trial & Support
Free execution of on-site service engagement
Continued use of the PureSystems offering for 30 days
Access to a technical advocate for usage questions and advice
Single point of IBM support and maintenance

www.ibm.com/PureExperience

Take the next step.


Discover the value and begin your journey with IBM PureSystems:
Visit ibm.com/puresystems to learn more
Join the conversation about this new category of computing:
Twitter: @IBM PureSystems
Hashtag: #IBMPureSystems or #expertintsys
YouTube Channel: expertintegratedsys
Blog: expertintegratedsystemsblog.com
Developers: Get started today with our no charge trial offerings! ibm.com/developerworks/puresystems/try
Explore PureSystem partner solutions: ibm.com/puresystems/centre
Take a Test Drive: ibm.com/PureExperience

International Business Machines Corporation 2012
International Business Machines Corporation, New Orchard Road, Armonk, NY 10504
IBM, the IBM logo, PureSystems, PureFlex, PureApplication, PureData and ibm.com are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. A current list of IBM trademarks is available on the Web at www.ibm.com/legal/copytrade.shtml
All rights reserved.
