
Geo Intelligence India 13-14 Jun 2013 New Delhi

The story of DATA comes down to two words...

One is ZERO... the other is ONE...

Big Spatial Data

Security

WELCOME

BIG SPATIAL DATA

has been with us for ages in various forms, but pretty invisible!

In ancient Egypt, engineers along the River Nile (6,695 km long) used to try data analysis to predict crop yields

Challenges, Perceptions, Concepts, Basic Intro

The 15-minute route to the THANK YOU slide

An English professor wrote these words on the chalkboard and asked his students to punctuate them correctly:

A woman without her man is nothing

A woman, without her man, is nothing.
A woman: without her, man is nothing.

How do we understand it?

[Bar chart, Series 1, horizontal axis 0 to 20. Categories: Social media data; The latest buzzword; Large volumes of Geo data; Non-traditional forms of Geo data; Data influx from new technologies; Real-time Geo information; New kinds of Geo data and analysis; A greater scope of Geo Int info]

DEFINING BIG SPATIAL DATA

BIG SPATIAL DATA


Spatial data sets exceeding the capacity of current computing systems to manage, process or analyze the data with reasonable effort, due to Volume, Velocity, Variety and Veracity


DATA is Exploding in
Volume Velocity VARIETY

While decreasing in
Veracity


BIG SPATIAL DATA


Finding actionable info in massive volumes of both structured and unstructured geo data that is so large and complex that it is difficult to process with traditional database and software techniques

- Volume: data at rest
- Velocity: data in motion
- Variety: data in many forms
- Veracity: data in doubt


Gigabyte (GB) - 1,024 MB
Terabyte (TB) - 1,024 GB
Petabyte (PB) - 1,024 TB
Exabyte (EB) - 1,024 PB

U.S. drone aircraft sent back 24 years' worth of video footage in 2009

90% of data in the world was created in the last 2 years

2.5 EB of data is created every day
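To put the unit ladder and the daily-volume figure side by side, here is a minimal arithmetic sketch. It assumes the binary (1,024-based) units listed on the slide; the helper name `to_bytes` is illustrative and not part of the original deck.

```python
# Binary unit ladder as given on the slide: each step is 1,024x the previous.
UNITS = ["MB", "GB", "TB", "PB", "EB"]

def to_bytes(value, unit):
    """Convert a value in MB/GB/TB/PB/EB to bytes (hypothetical helper)."""
    exponent = UNITS.index(unit) + 2  # 1 MB = 1024**2 bytes
    return value * 1024 ** exponent

daily_bytes = to_bytes(2.5, "EB")            # 2.5 EB created per day
daily_tb = daily_bytes / to_bytes(1, "TB")   # the same volume expressed in TB
per_second_tb = daily_tb / (24 * 60 * 60)    # average rate in TB per second

print(f"{daily_tb:,.0f} TB per day")
print(f"{per_second_tb:.1f} TB per second on average")
```

Under these assumptions, 2.5 EB per day works out to about 2.6 million TB per day, or roughly 30 TB every second.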

Growth of geospatial data is outpacing both software and services and is set to become a major contributor to the overall growth of the industry
* Estimated revenue FY 2013


100% security is a myth


Increasing attack surface

No one has said this!!! But it remains a fact

The technology is ready.

But are we ready?


DISASTER RELIEF, RETAIL, UTILITIES, FINANCIAL, FRAUD DETECTION, ECO-ROUTING, DISEASE SURVEILLANCE, TELECOMMUNICATIONS, INSURANCE, CALL CENTER REQUESTS


The other side of the story


Security challenges before we adopt Big spatial data


ONE
Distributed programming frameworks

Utilise parallelism in computation and storage to process massive amounts of data

[Diagram: Input file -> Map -> Local Reduce (intermediate combining) -> Shuffle -> Reduce -> Output file]

The mapper performs computation and outputs key/value pairs

The reducer combines the values belonging to each distinct key and outputs the result
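The mapper and reducer roles can be sketched in a few lines of plain Python. This is a single-process illustration, not a distributed framework; the 1-degree grid cell used as the key and the sample points are assumptions made for the example.

```python
import math
from collections import defaultdict

def mapper(point):
    """Map one (lon, lat) point to a (grid_cell, 1) key/value pair."""
    lon, lat = point
    cell = (math.floor(lon), math.floor(lat))  # 1-degree cell (assumption)
    return cell, 1

def shuffle(pairs):
    """Group values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reducer(key, values):
    """Combine the values belonging to one distinct key."""
    return key, sum(values)

# Hypothetical sample input: three points near Delhi, one near Mumbai.
points = [(77.2, 28.6), (77.9, 28.1), (72.8, 19.0), (77.3, 28.9)]
mapped = [mapper(p) for p in points]
counts = dict(reducer(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

Each mapper call is independent of the others, which is what lets a real framework run them on different machines.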


MAP
Splits the input data-set into independent chunks which are processed in a completely parallel manner

REDUCE
Aggregates results from the map phase and performs a summary operation

FRAMEWORK
- Schedules and re-runs tasks
- Splits the input
- Moves map outputs to reduce inputs
- Receives the results


Read 1 TB

- One machine, 4 I/O channels, each channel 100 MB/s: about 45 min
- 10 machines, 4 I/O channels each, each channel 100 MB/s: about 4.5 min

So the challenge is not storage but I/O speed
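The read-time figures follow from dividing the data volume by the aggregate channel throughput. A minimal sketch, assuming a binary terabyte (1,024 x 1,024 MB) and perfectly parallel channels; the function name `read_minutes` is illustrative.

```python
def read_minutes(total_mb, machines, channels_per_machine, mb_per_s):
    """Minutes to read total_mb when all channels stream in parallel."""
    throughput = machines * channels_per_machine * mb_per_s  # aggregate MB/s
    return total_mb / throughput / 60

ONE_TB_IN_MB = 1024 * 1024  # binary terabyte (assumption)

one_machine = read_minutes(ONE_TB_IN_MB, 1, 4, 100)
ten_machines = read_minutes(ONE_TB_IN_MB, 10, 4, 100)
print(f"1 machine:   {one_machine:.1f} min")
print(f"10 machines: {ten_machines:.1f} min")
```

The computed values land a little under the slide's rounded 45 and 4.5 minutes. The tenfold speed-up comes entirely from adding I/O channels, not from adding storage.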


Untrusted Mappers

Securing the data in the presence of an untrusted mapper
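One mitigation sometimes discussed for this problem, though not stated on the slide, is to run the same input split on two independent mappers and accept the output only if both agree. A minimal sketch under that assumption; `honest_mapper`, `tampering_mapper` and `verified_map` are hypothetical names, and the records are made up.

```python
import hashlib
import json

def digest(pairs):
    """Stable SHA-256 digest of a mapper's key/value output."""
    canonical = json.dumps(sorted(pairs), separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def honest_mapper(records):
    return [(r["city"], 1) for r in records]

def tampering_mapper(records):
    # Simulates an untrusted mapper that silently drops the last record.
    return [(r["city"], 1) for r in records[:-1]]

def verified_map(records, mapper_a, mapper_b):
    """Run the split on two mappers; reject the output if digests differ."""
    out_a, out_b = mapper_a(records), mapper_b(records)
    if digest(out_a) != digest(out_b):
        raise ValueError("mapper outputs disagree; split must be re-run")
    return out_a

split = [{"city": "Delhi"}, {"city": "Mumbai"}, {"city": "Delhi"}]
print(verified_map(split, honest_mapper, honest_mapper))

try:
    verified_map(split, honest_mapper, tampering_mapper)
except ValueError as err:
    print("rejected:", err)
```

This only detects disagreement: it doubles the compute cost and does not help if both mappers are compromised in the same way.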


TWO

NoSQL ISSUES


First off, the name: NoSQL is not "Never SQL". NoSQL is not "No to SQL".


NoSQL is simply "Not Only SQL"!


NoSQL databases (e.g. MongoDB, Redis) are still evolving with respect to security infrastructure


Data storage & transaction logs

STORAGE TIERS

- Multi-tiered storage media
- Necessitated by scalable size
- Different categories of data
- Different types of storage


Keeping track of data location

A lower tier means reduced security and looser access controls


INPUT VALIDATION/FILTERING

How can we trust data?

How do we validate data when the source of the input data is not reliable?

Filtering malicious data @ BYOD
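For geo data, the most basic filter is a plausibility check on each incoming coordinate. A minimal sketch, assuming records arrive as dictionaries with `lat` and `lon` fields in WGS 84 degrees; the field names and the sample records are assumptions.

```python
def is_valid_point(record):
    """Accept a record only if it carries a plausible WGS 84 coordinate."""
    try:
        lat = float(record["lat"])
        lon = float(record["lon"])
    except (KeyError, TypeError, ValueError):
        return False
    # NaN fails both comparisons, so it is rejected as well.
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0

incoming = [
    {"lat": 28.61, "lon": 77.21},      # plausible
    {"lat": "19.07", "lon": "72.88"},  # numeric strings are accepted
    {"lat": 128.0, "lon": 77.0},       # latitude out of range
    {"lat": "abc", "lon": 10},         # not a number
    {"lon": 77.0},                     # missing field
]
accepted = [r for r in incoming if is_valid_point(r)]
print(len(accepted), "of", len(incoming), "records accepted")
```

A range check like this catches malformed input only. A well-formed but deliberately false position passes it, which is why the trust question above remains open.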


REAL TIME MONITORING

A humongous number of alerts!

False positives
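One simple way to tame an alert flood is to collapse repeats and escalate only what crosses a threshold. A minimal sketch; the alert fields, node names and the threshold of 3 are assumptions chosen for illustration.

```python
from collections import Counter

def escalate(alerts, threshold=3):
    """Return (source, rule) pairs seen at least `threshold` times."""
    counts = Counter((a["source"], a["rule"]) for a in alerts)
    return sorted(key for key, n in counts.items() if n >= threshold)

# Hypothetical alert stream.
alerts = [
    {"source": "node-07", "rule": "failed-login"},
    {"source": "node-07", "rule": "failed-login"},
    {"source": "node-12", "rule": "port-scan"},
    {"source": "node-07", "rule": "failed-login"},
    {"source": "node-03", "rule": "failed-login"},
]
print(escalate(alerts))
```

Raising the threshold cuts false positives but makes it easier for a slow, low-volume attack to stay below it, so the value is a trade-off rather than a fixed setting.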



Secure communication

End-to-end security?

Data encryption: attribute-based encryption, which still has to be made richer


Granular audits

New attacks will keep happening, and to find out about them we need detailed audit logs

Missed true positives
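Audit logs are only useful after an attack if they can be trusted. One common technique, offered here as an illustration rather than something the slide prescribes, is to chain each entry to the hash of the previous one so that later edits are detectable. The entry fields and function names below are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry (assumption)

def entry_hash(event, prev_hash):
    """SHA-256 over the event plus the previous entry's hash."""
    body = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append_entry(log, event):
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash,
                "hash": entry_hash(event, prev_hash)})

def verify(log):
    """Recompute every hash; an edited entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry_hash(entry["event"], prev_hash) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_entry(log, {"user": "analyst1", "action": "read", "layer": "roads"})
append_entry(log, {"user": "analyst2", "action": "export", "layer": "borders"})
print(verify(log))                     # chain intact
log[0]["event"]["action"] = "delete"   # tamper with the first entry
print(verify(log))                     # chain broken
```

The chain detects edits and removals from the start or middle of the log, but not truncation at the end; a real deployment would also need to store the latest hash somewhere the attacker cannot reach.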


PRIVACY ISSUES

E.g.: how a retailer was able to identify that a teenager was pregnant before her father knew

In the world of big data, privacy invasion is a business model


And... we also have the cloud with us?

At 1.4% in 2011-12, cloud was a very small percentage of the total IT spend

The pace of Big Spatial Data adoption has been sluggish

There is unlikely to be a day in the near future when we have a

FIND TERRORIST BUTTON

We have mostly been reactive to date.

- USE KERBEROS FOR NODE AUTHENTICATION (BUT WE KNOW IT'S A PAIN TO SET UP)
- STRINGENT POLICIES, STANDARD TO INTRA-COUNTRY LAWS
- SECURE COMMUNICATION
- EXHAUSTIVE LOGS
