
SAP Big Data Overview

Santosh Gangone

2014 Cognizant


Agenda

Big Data Definition

Big Data Evolution

Big Data Architecture

Use cases/solutions


Big Data: Changing the way we do business


Big Data Definition


Big data is an all-encompassing term for any collection of data sets so large or complex that it becomes
difficult to process them using traditional data processing applications.
Turn raw data into insights that drive massive business value, and enable customers to achieve real-time
business results on Big Data.
Massive data results in problems of:
capturing
storing
managing
analyzing
Source of challenges: the 3 Vs
Volume - Data ranging up to petabytes needs special storage
Variety - Multiple data sources producing different data types, e.g. social media, unstructured data, machine
sensors, etc.
Velocity - High speed and volume of incoming data for real-time decisions


Volume Variety Velocity


Big Data Evolution


Source: SAP

Code Halos
Data, devices and interactions surrounding all of us

Code Halo /kohd hey-loh/, noun
The information that surrounds people, organizations, processes and products


SAP Big Data Patterns aligned to Cognizant Code Halos

Machine Data Insight - Providing insight from machines, assets, and devices for better real-time decisions, predictions, and operational performance

Customer Behavior Insight - Providing insight from high volume and high variety data for real-time analytics & actionable intelligence

Data sources and characteristics shown in the diagram: M2M, Mobile, Sensors, Unstructured, Multi-media, Enterprise Data, IoT, Realtime, Social, Processes, Assets, Voluminous Data, Geospatial

Cognizant SAP Big Data Reference Architecture


Components shown in the diagram:
SAP Big Data Application Types, Data Science and Statistical Modeling - M2M and Customer Behavior Analytics
Analytics & Big Data Visualization - SAP Big Data Visualization, SAP KXEN/Infinite Insight; Reporting, Analysis, Dashboards, Exploration, Visualization, Predictive
Models - Multiple Regression Models, Linear Models, Univariate/Multivariate models
Smart Data Access
Data Stores - Data Lake, BW 7.4, Extended Storage (Sybase IQ), Hadoop, HANA DataMart
Hadoop - Large Scale Data Capture, Generate Analytical Datasets, Train/Validate Predictive Models
SAP HANA Data Platform - Enterprise HANA 1.0 SPS8
Data Ingest - SQL Anywhere, SLT, Data Services, Sybase ESP
Source systems - SAP ERP, SAP CRM, SAP HCM, SAP SRM, Non-SAP, Others
Source data - Documents & Emails, Web Logs, Click Streams, Social Networks, Machine Generated Data, Sensor Data, Geolocation Data
Legend - SAP, Non-SAP, Hadoop

Cognizant and SAP Big Data Solutions Architecture

Components shown in the diagram:
Cognizant Verticals - BFS, Comms, LS, Tech, E&U, Media, T&H, Insurance, CG, Healthcare, Education, Manuf., Retail, Logistics
Apps - Retail Omnichannel Analytics, Vaccine Yield Analysis, Marketing Insights for Downstream Oil & Gas, Plant Equipment Analysis, Advanced Genome Analysis
App types - Customer Behavior Apps, Machine to Machine Apps, Business and Suite Apps, Custom HANA Solutions
Analytics Tools - SAP KXEN/Infinite Insight, SAP Lumira, SAP Predictive Analytics, SAP BusinessObjects BI (Exploration, Charting, Visualization, Dashboards, Reports); SAP Data Scientists
SAP HANA Platform - Extended Application Services, Application Development, Unified Administration, Processing Engine, Database Services (OLTP + OLAP), Application Function Libraries & Data Models, Integration Services
Data Platform - Smart Data Access, Extended Storage (Sybase IQ), Transfer Datasets, Hadoop
Hadoop - Large Scale Data Capture, Generate Analytical Datasets, Train/Validate Predictive Models

What is Hadoop?

Apache Hadoop, an open-source software library, is a framework that
allows for the distributed processing of large data sets across clusters
of commodity hardware using simple programming models. It is designed
to scale up from single servers to thousands of machines, each offering
local computation and storage.
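
The "simple programming models" in this definition usually refers to MapReduce. The following is a minimal single-process sketch of that model in plain Python, not the Hadoop API: the function names (map_phase, shuffle, reduce_phase, word_count) are illustrative assumptions, and a real cluster would run the map and reduce steps in parallel on many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Because each line is mapped independently and each key is reduced independently, the same logic can be spread over many nodes, which is what lets the framework scale from one server to thousands.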


SAP Certified Hadoop distribution partners


H2 Power of HANA & Hadoop


Combine strengths of different data processing domains

Modern in-memory platform
Transact/analyze in real time
Native predictive, text, and spatial algorithms

Petascale, columnar database
Tight integration with HANA via Smart Data Access
Extend HANA tables with hot data in HANA, and warm data in IQ

Distributed data storage and processing on commodity hardware
Store very large amounts of unstructured data
NoSQL access


SAP HANA Smart Data Access


Data virtualization for on-premise and hybrid cloud environments
Transactions + Analytics
Benefits
Enables access to remote data just like a local table
Smart query processing, including query decomposition with predicate push-down and functional compensation
Supports data-location-agnostic development
No special syntax to access heterogeneous data sources
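
Predicate push-down, named in the benefits above, means the filter is evaluated at the remote source so that only matching rows travel to the local system. The sketch below illustrates the idea in plain Python under stated assumptions: RemoteSource and both query functions are hypothetical names for illustration and are not part of any SAP API.

```python
class RemoteSource:
    """Hypothetical stand-in for a remote table (for example in Hive or IQ)."""

    def __init__(self, rows):
        self._rows = rows
        self.rows_transferred = 0  # rows that crossed the "network"

    def scan(self, predicate=None):
        """Return rows, applying the predicate at the source when one is given."""
        result = [r for r in self._rows if predicate is None or predicate(r)]
        self.rows_transferred += len(result)
        return result


def query_without_pushdown(source, predicate):
    """Fetch every row, then filter locally."""
    return [r for r in source.scan() if predicate(r)]


def query_with_pushdown(source, predicate):
    """Ship the predicate to the source so only matching rows are fetched."""
    return source.scan(predicate)
```

Both functions return the same rows; the difference shows up in rows_transferred, which stays at the number of matching rows when the predicate is pushed down.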

SAP HANA: HANA Tables, Virtual Tables
Heterogeneous Data Sources: IQ, ASE


SAP HANA to Hadoop (Hive)
Teradata
SAP Sybase ASE
SAP Sybase IQ
HANA to HANA

SAP HANA - Hadoop Integration

Integration at ETL layer (Hive, HDFS, MapReduce, Pig, Apache HBase, Flume, Ambari, Oozie, Avro, etc.)
Federation at BI layer (BOBJ multi-source Universe accessing Hadoop Hive)
Smart Data Access - direct HANA-Hadoop connectivity


Two speed analytics

Long running batch analytic jobs in Hadoop


Push results to SAP HANA
Combine with other data, e.g. from SAP Business Suite
User accesses result through BI Tools on SAP HANA
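
The flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a Counter stands in for the long-running Hadoop batch job, an in-memory SQLite database stands in for SAP HANA, and the table and function names are hypothetical.

```python
import sqlite3
from collections import Counter


def batch_job(click_log):
    """Slow layer: aggregate raw events per customer (stand-in for a Hadoop job)."""
    return Counter(event["customer_id"] for event in click_log)


def push_and_combine(click_counts, customers):
    """Fast layer: load batch results next to master data and join them."""
    db = sqlite3.connect(":memory:")  # stand-in for SAP HANA (assumption)
    db.execute("CREATE TABLE clicks (customer_id INTEGER, clicks INTEGER)")
    db.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
    db.executemany("INSERT INTO clicks VALUES (?, ?)", list(click_counts.items()))
    db.executemany("INSERT INTO customers VALUES (?, ?)", customers)
    rows = db.execute(
        "SELECT c.name, k.clicks FROM customers c "
        "JOIN clicks k ON k.customer_id = c.customer_id ORDER BY c.name"
    ).fetchall()
    db.close()
    return rows
```

The batch step reduces the raw log to a small result set, so only aggregates are pushed to the fast layer, where they are joined with business data and queried by BI tools.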


Big Data Solutions in Innovation Lab


Omni channel Insight for Retail, Powered by Code Halos
Maintain appropriate inventory and strategize buying decisions
Improve retail distribution, store operations & product quality
Combine enterprise and digital information to create Code Halos for consumers and products
Analyze sentiment from Twitter, Facebook, LinkedIn or industry-specific social media streams

Vaccine Yield Analysis
Improve vaccine yield and lower cost of production
Reduce the variability in the production process to reduce costs
Analyze large volumes of manufacturing information (spread across disparate platforms), including: time
series (pressures, temperatures, etc.), event, transaction, change control, quality, raw material and
environmental data

Advanced Genome Analysis
Reduce delays and minimize the costs associated with new drug discovery by optimizing the process for genome analysis
Speed up decision making for hospitals that conduct cancer detection based on DNA sequence matching

Marketing Insights for Downstream Oil & Gas
How can transactions at gas pumps & PoS transactions at the retail gas outlets be related?
How can we engage with the gas consumer in terms of cross sell & up sell marketing offers?
How can we retain consumer loyalty & brand?
How can we make the consumer return to the same gas station consistently?
How can we analyze the buying behaviors of gas vs. retail SKUs bought inside the store?
How can we understand the gas consumer wallet & spend?


Appendix


SAP HANA - Hadoop Integration


SP9 includes the following new features:
Direct access to the HDFS file system
Develop and invoke custom MapReduce jobs
SAP HANA studio as the single IDE to invoke the M/R jobs
Leverage HANA Repository design time for MapReduce jobs, remote Hadoop sources and virtual functions
Support for remote result caching of virtual function execution
Data Provisioning support for remote source connectivity for IM in HANA via SAP HANA Service/Adapter
Framework

Smart Data Access features:
SAP MaxDB Support
Statistics Enhancements
Changed Default System Parameter Behavior
Improved Join Relocation
Function Translate Improvement
Read-only Remote Sources and Smart Data Access Connections


SAP Lumira 1.21 - More access to Big Data

Big Data, big Insights
Amazon EMR Hive 0.11
Apache Hive 0.12 & 0.13
Cloudera Impala 1.0

Performance Optimization
Smoother scrolling through a large number of columns in the Prepare room
