Bem-vindo(a) ao Scribd!

Pular no carrossel

Introduction To MapReduce - Geoinsyssoft

Enviado por

Anandh Kumar

0% acharam este documento útil (0 voto)

26 visualizações17 páginas

Bgdata training in chennai

Título original

Introduction to MapReduce -Geoinsyssoft

Direitos autorais

Formatos disponíveis

PPT, PDF, TXT ou leia online no Scribd

Compartilhar este documento

Compartilhar ou incorporar documento

Opções de compartilhamento

Você considera este documento útil?

Este conteúdo é inapropriado?

Denunciar este documento

Bgdata training in chennai

Direitos autorais:

Attribution Non-Commercial (BY-NC)

Formatos disponíveis

Baixe no formato PPT, PDF, TXT ou leia online no Scribd

Sinalizar o conteúdo como inadequado

0% acharam este documento útil (0 voto)

26 visualizações17 páginas

Introduction To MapReduce - Geoinsyssoft

Enviado por

Anandh Kumar

Bgdata training in chennai

Direitos autorais:

Attribution Non-Commercial (BY-NC)

Formatos disponíveis

Baixe no formato PPT, PDF, TXT ou leia online no Scribd

Sinalizar o conteúdo como inadequado

Pular para a página

Você está na página 1de 17

Pesquisar no documento

Introduction to Google MapReduce

WING Group Meeting 13 Oct 2006 Hendra Setiawan

What is MapReduce?
A programming model (& its associated
implementation) For processing large data set Exploits large set of commodity computers Executes process in distributed manner Offers high degree of transparencies In other words:
simple and maybe suitable for your tasks !!!

Distributed Grep
Split data

Very big data

Split data Split data

grep grep grep grep

matches matches

matches

cat

All matches

Split data

matches

Distributed Word Count

Split data

Very big data

Split data Split data

count count count count

count count

count

merge

merged count

Split data

count

Map Reduce
Very big data M A P Partitioning Function R E D U C E Result

Map:
Accepts input key/value pair Emits intermediate key/value pair

Reduce :
Accepts intermediate key/value* pair Emits output key/value pair

Partitioning Function

Partitioning Function (2)

Default : hash(key) Guarantee:
mod R

Relatively well-balanced partitions Ordering guarantee within partition

Distributed Sort
Map:
emit(key,value)

Reduce (with R=1):

emit(key,value)

MapReduce
Distributed Grep
Map:
if match(value,pattern) emit(value,1)

Reduce:
emit(key,sum(value*))

Distributed Word Count

Map:
for all w in value do emit(w,1)

Reduce:
emit(key,sum(value*))

MapReduce Transparencies
Plus Google Distributed File System : Parallelization Fault-tolerance Locality optimization Load balancing

Suitable for your task if

Have a cluster Working with large dataset Working with independent data (or
assumed) Can be cast into map and reduce

MapReduce outside Google

Hadoop (Java)
Emulates MapReduce and GFS

The architecture of Hadoop MapReduce

and DFS is master/slave
Master Slave MapReduce jobtracker tasktracker DFS namenode datanode

Example Word Count (1)

Map
public static class MapClass extends MapReduceBase implements Mapper { private final static IntWritable one = new IntWritable(1); private Text word = new Text(); public void map(WritableComparable key, Writable value, OutputCollector output, Reporter reporter) throws IOException { String line = ((Text)value).toString(); StringTokenizer itr = new StringTokenizer(line); while (itr.hasMoreTokens()) { word.set(itr.nextToken()); output.collect(word, one); } } }

Example Word Count (2)

Reduce
public static class Reduce extends MapReduceBase implements Reducer { public void reduce(WritableComparable key, Iterator values, OutputCollector output, Reporter reporter) throws IOException { int sum = 0; while (values.hasNext()) { sum += ((IntWritable) values.next()).get(); } output.collect(key, new IntWritable(sum)); } }

Example Word Count (3)

Main
public static void main(String[] args) throws IOException { //checking goes here JobConf conf = new JobConf();
conf.setOutputKeyClass(Text.class); conf.setOutputValueClass(IntWritable.class); conf.setMapperClass(MapClass.class); conf.setCombinerClass(Reduce.class); conf.setReducerClass(Reduce.class); conf.setInputPath(new Path(args[0])); conf.setOutputPath(new Path(args[1])); JobClient.runJob(conf); }

One time setup

set hadoop-site.xml and slaves Initiate namenode Run Hadoop MapReduce and DFS Upload your data to DFS Run your process Download your data from DFS

Summary
A simple programming model for
processing large dataset on large set of computer cluster Fun to use, focus on problem, and let the library deal with the messy detail

References
Original paper
(http://labs.google.com/papers/mapreduce .html) On wikipedia (http://en.wikipedia.org/wiki/MapReduce) Hadoop MapReduce in Java (http://lucene.apache.org/hadoop/) Starfish - MapReduce in Ruby (http://rufy.com/starfish/)

Você também pode gostar

The Subtle Art of Not Giving a F*ck: A Counterintuitive Approach to Living a Good Life
No Everand
The Subtle Art of Not Giving a F*ck: A Counterintuitive Approach to Living a Good Life
Mark Manson
Nota: 4 de 5 estrelas
4/5 (5794)
Map Reduce Architecture: Adapted From Lectures by
Documento37 páginas
Map Reduce Architecture: Adapted From Lectures by
Anandh Kumar
Ainda não há avaliações
The Yellow House: A Memoir (2019 National Book Award Winner)
No Everand
The Yellow House: A Memoir (2019 National Book Award Winner)
Sarah M. Broom
Nota: 4 de 5 estrelas
4/5 (98)
Linux Shell Scripting Awk Sed
Documento45 páginas
Linux Shell Scripting Awk Sed
Anandh Kumar
Ainda não há avaliações
A Heartbreaking Work Of Staggering Genius: A Memoir Based on a True Story
No Everand
A Heartbreaking Work Of Staggering Genius: A Memoir Based on a True Story
Dave Eggers
Nota: 3.5 de 5 estrelas
3.5/5 (231)
Python Exercises
Documento1 página
Python Exercises
Anandh Kumar
Ainda não há avaliações
Hidden Figures: The American Dream and the Untold Story of the Black Women Mathematicians Who Helped Win the Space Race
No Everand
Hidden Figures: The American Dream and the Untold Story of the Black Women Mathematicians Who Helped Win the Space Race
Margot Lee Shetterly
Nota: 4 de 5 estrelas
4/5 (895)
Understanding Hadoop Clusters and The Network-Slides and Text Bradhedlund Com
Documento26 páginas
Understanding Hadoop Clusters and The Network-Slides and Text Bradhedlund Com
Shiva Kumar
Ainda não há avaliações
The Little Book of Hygge: Danish Secrets to Happy Living
No Everand
The Little Book of Hygge: Danish Secrets to Happy Living
Meik Wiking
Nota: 3.5 de 5 estrelas
3.5/5 (400)
Practice Project - Tutorial Microstrategy
Documento38 páginas
Practice Project - Tutorial Microstrategy
Anandh Kumar
50% (2)
Shoe Dog: A Memoir by the Creator of Nike
No Everand
Shoe Dog: A Memoir by the Creator of Nike
Phil Knight
Nota: 4.5 de 5 estrelas
4.5/5 (537)
Data Visualization - KPN - Tools v1.3
Documento50 páginas
Data Visualization - KPN - Tools v1.3
Anandh Kumar
Ainda não há avaliações
Never Split the Difference: Negotiating As If Your Life Depended On It
No Everand
Never Split the Difference: Negotiating As If Your Life Depended On It
Chris Voss
Nota: 4.5 de 5 estrelas
4.5/5 (838)
Geoinsyssoft - Business Intelligence Intro 2
Documento3 páginas
Geoinsyssoft - Business Intelligence Intro 2
Anandh Kumar
Ainda não há avaliações
Elon Musk: Tesla, SpaceX, and the Quest for a Fantastic Future
No Everand
Elon Musk: Tesla, SpaceX, and the Quest for a Fantastic Future
Ashlee Vance
Nota: 4.5 de 5 estrelas
4.5/5 (474)
Map Reduce Basics
Documento20 páginas
Map Reduce Basics
Anandh Kumar
Ainda não há avaliações
Grit: The Power of Passion and Perseverance
No Everand
Grit: The Power of Passion and Perseverance
Angela Duckworth
Nota: 4 de 5 estrelas
4/5 (588)
Rstats
Documento42 páginas
Rstats
Anandh Kumar
Ainda não há avaliações
Yes Please
No Everand
Yes Please
Amy Poehler
Nota: 4 de 5 estrelas
4/5 (1891)
Geoinsyssoft BI Introduction Document - Datawarehosuing Introduction
Documento3 páginas
Geoinsyssoft BI Introduction Document - Datawarehosuing Introduction
Anandh Kumar
Ainda não há avaliações
The Emperor of All Maladies: A Biography of Cancer
No Everand
The Emperor of All Maladies: A Biography of Cancer
Siddhartha Mukherjee
Nota: 4.5 de 5 estrelas
4.5/5 (271)
Five Number Theory
Documento3 páginas
Five Number Theory
Anandh Kumar
Ainda não há avaliações
On Fire: The (Burning) Case for a Green New Deal
No Everand
On Fire: The (Burning) Case for a Green New Deal
Naomi Klein
Nota: 4 de 5 estrelas
4/5 (74)
Bigdata Interview Questions-Geoinsyssoft
Documento1 página
Bigdata Interview Questions-Geoinsyssoft
Anandh Kumar
Ainda não há avaliações
Team of Rivals: The Political Genius of Abraham Lincoln
No Everand
Team of Rivals: The Political Genius of Abraham Lincoln
Doris Kearns Goodwin
Nota: 4.5 de 5 estrelas
4.5/5 (234)
Data Science Links-Geoinsyssoft
Documento1 página
Data Science Links-Geoinsyssoft
Anandh Kumar
Ainda não há avaliações
Devil in the Grove: Thurgood Marshall, the Groveland Boys, and the Dawn of a New America
No Everand
Devil in the Grove: Thurgood Marshall, the Groveland Boys, and the Dawn of a New America
Gilbert King
Nota: 4.5 de 5 estrelas
4.5/5 (266)
Obiee Resumes 2011dec
Documento30 páginas
Obiee Resumes 2011dec
Anandh Kumar
Ainda não há avaliações
The Hard Thing About Hard Things: Building a Business When There Are No Easy Answers
No Everand
The Hard Thing About Hard Things: Building a Business When There Are No Easy Answers
Ben Horowitz
Nota: 4.5 de 5 estrelas
4.5/5 (344)
BI Intro Geoinsys
Documento4 páginas
BI Intro Geoinsys
Anandh Kumar
Ainda não há avaliações
Rise of ISIS: A Threat We Can't Ignore
No Everand
Rise of ISIS: A Threat We Can't Ignore
Jay Sekulow
Nota: 3.5 de 5 estrelas
3.5/5 (137)
02 Oracle BI Architect
Documento85 páginas
02 Oracle BI Architect
Anandh Kumar
Ainda não há avaliações
The World Is Flat 3.0: A Brief History of the Twenty-first Century
No Everand
The World Is Flat 3.0: A Brief History of the Twenty-first Century
Thomas L. Friedman
Nota: 3.5 de 5 estrelas
3.5/5 (2259)
Definition
Documento1 página
Definition
Anandh Kumar
Ainda não há avaliações
Fear: Trump in the White House
No Everand
Fear: Trump in the White House
Bob Woodward
Nota: 3.5 de 5 estrelas
3.5/5 (738)
Informatica Resume Dec2011
Documento46 páginas
Informatica Resume Dec2011
Anandh Kumar
Ainda não há avaliações
The Gifts of Imperfection: Let Go of Who You Think You're Supposed to Be and Embrace Who You Are
No Everand
The Gifts of Imperfection: Let Go of Who You Think You're Supposed to Be and Embrace Who You Are
Brené Brown
Nota: 4 de 5 estrelas
4/5 (1090)
Big Data Proof of Concept
Documento2 páginas
Big Data Proof of Concept
Anandh Kumar
0% (1)
Principles: Life and Work
No Everand
Principles: Life and Work
Ray Dalio
Nota: 4 de 5 estrelas
4/5 (599)
Bo 3.1
Documento3 páginas
Bo 3.1
Anandh Kumar
Ainda não há avaliações
John Adams
No Everand
John Adams
David McCullough
Nota: 4.5 de 5 estrelas
4.5/5 (2409)
MicroStrategy 9 Vs QlikTech 11
Documento40 páginas
MicroStrategy 9 Vs QlikTech 11
Anandh Kumar
Ainda não há avaliações
The Unwinding: An Inner History of the New America
No Everand
The Unwinding: An Inner History of the New America
George Packer
Nota: 4 de 5 estrelas
4/5 (45)
R Intro
Documento103 páginas
R Intro
tajjj9
Ainda não há avaliações
The Glass Castle: A Memoir
No Everand
The Glass Castle: A Memoir
Jeannette Walls
Nota: 4.5 de 5 estrelas
4.5/5 (1712)
What Application Content Is Available For The BI Apps? ERP Analytics
Documento4 páginas
What Application Content Is Available For The BI Apps? ERP Analytics
Anandh Kumar
Ainda não há avaliações
Angela's Ashes: A Memoir
No Everand
Angela's Ashes: A Memoir
Frank McCourt
Nota: 4.5 de 5 estrelas
4.5/5 (440)
DWH Questions With Subtopics
Documento16 páginas
DWH Questions With Subtopics
Anandh Kumar
Ainda não há avaliações
Steve Jobs
No Everand
Steve Jobs
Walter Isaacson
Nota: 4.5 de 5 estrelas
4.5/5 (806)
Basic Unix Handout
Documento1 página
Basic Unix Handout
Anandh Kumar
Ainda não há avaliações
Bad Feminist: Essays
No Everand
Bad Feminist: Essays
Roxane Gay
Nota: 4 de 5 estrelas
4/5 (1015)
Texmo Pumps
Documento41 páginas
Texmo Pumps
sivanarayana nallagatla
Ainda não há avaliações
The Outsider: A Novel
No Everand
The Outsider: A Novel
Stephen King
Nota: 4 de 5 estrelas
4/5 (1839)
Ccproxy Manual
Documento85 páginas
Ccproxy Manual
vctior1
Ainda não há avaliações
The Light Between Oceans: A Novel
No Everand
The Light Between Oceans: A Novel
M.L. Stedman
Nota: 4.5 de 5 estrelas
4.5/5 (789)
FSM Application
Documento21 páginas
FSM Application
Mani Bharath Nuti
Ainda não há avaliações
The Sympathizer: A Novel (Pulitzer Prize for Fiction)
No Everand
The Sympathizer: A Novel (Pulitzer Prize for Fiction)
Viet Thanh Nguyen
Nota: 4.5 de 5 estrelas
4.5/5 (121)
CD Practical File (6th Sem)
Documento24 páginas
CD Practical File (6th Sem)
Nitish Bansal
Ainda não há avaliações
Brooklyn: A Novel
No Everand
Brooklyn: A Novel
Colm Toibin
Nota: 3.5 de 5 estrelas
3.5/5 (1937)
Installation Guide
Documento1 página
Installation Guide
CamiloRamirezSanabria
Ainda não há avaliações
The Woman in Cabin 10
No Everand
The Woman in Cabin 10
Ruth Ware
Nota: 3.5 de 5 estrelas
3.5/5 (2322)
Hyperion Planning System Tables
Documento2 páginas
Hyperion Planning System Tables
Subhakar Madasu
Ainda não há avaliações
A Man Called Ove: A Novel
No Everand
A Man Called Ove: A Novel
Fredrik Backman
Nota: 4.5 de 5 estrelas
4.5/5 (4609)
CL Commands Rtvdtaara AS400 OS400 RUNQRY
Documento836 páginas
CL Commands Rtvdtaara AS400 OS400 RUNQRY
beach20508217
100% (3)
The Perks of Being a Wallflower
No Everand
The Perks of Being a Wallflower
Stephen Chbosky
Nota: 4.5 de 5 estrelas
4.5/5 (2103)
Class 24: Case Study: Building A Simple Postfix Calculator: Introduction To Computation and Problem Solving
Documento16 páginas
Class 24: Case Study: Building A Simple Postfix Calculator: Introduction To Computation and Problem Solving
yekych
Ainda não há avaliações
Wolf Hall: A Novel
No Everand
Wolf Hall: A Novel
Hilary Mantel
Nota: 4 de 5 estrelas
4/5 (3811)
Advance Java Practical File
Documento24 páginas
Advance Java Practical File
aman singh
Ainda não há avaliações
Little Women
No Everand
Little Women
Louisa May Alcott
Nota: 4 de 5 estrelas
4/5 (104)
Spy Glass
Documento31 páginas
Spy Glass
Agnathavasi
Ainda não há avaliações
Manhattan Beach: A Novel
No Everand
Manhattan Beach: A Novel
Jennifer Egan
Nota: 3.5 de 5 estrelas
3.5/5 (792)
Tes PDF
Documento66 páginas
Tes PDF
MRR Nebula
Ainda não há avaliações
The Art of Racing in the Rain: A Novel
No Everand
The Art of Racing in the Rain: A Novel
Garth Stein
Nota: 4 de 5 estrelas
4/5 (4200)
CAP 531 1.6 Configuration and Programming Tool: User's Manual
Documento184 páginas
CAP 531 1.6 Configuration and Programming Tool: User's Manual
zerara12
Ainda não há avaliações
The Constant Gardener: A Novel
No Everand
The Constant Gardener: A Novel
John le Carré
Nota: 3.5 de 5 estrelas
3.5/5 (104)
Commonly Asked MongoDB Interview Questions (2023) - Interviewbit
Documento21 páginas
Commonly Asked MongoDB Interview Questions (2023) - Interviewbit
Sagar Chaudhari
Ainda não há avaliações
A Tree Grows in Brooklyn
No Everand
A Tree Grows in Brooklyn
Betty Smith
Nota: 4.5 de 5 estrelas
4.5/5 (1929)
Session 2
Documento42 páginas
Session 2
Isaac Kariuki
Ainda não há avaliações
Her Body and Other Parties: Stories
No Everand
Her Body and Other Parties: Stories
Carmen Maria Machado
Nota: 4 de 5 estrelas
4/5 (821)
Https Es Scribd Com Upload-Document Archive Doc 350977240&escape False&metadata (Context: Archive View Restricted, Page: Read, Action :false, Logged in :true, Platform: Web)
Documento3 páginas
Https Es Scribd Com Upload-Document Archive Doc 350977240&escape False&metadata (Context: Archive View Restricted, Page: Read, Action :false, Logged in :true, Platform: Web)
Sergio Fernandez
Ainda não há avaliações
Sing, Unburied, Sing: A Novel
No Everand
Sing, Unburied, Sing: A Novel
Jesmyn Ward
Nota: 4 de 5 estrelas
4/5 (1103)
GOLDMINE Automated Processes
Documento30 páginas
GOLDMINE Automated Processes
minutemen_us
Ainda não há avaliações
Netbackup For Hyper-V Guide
Documento60 páginas
Netbackup For Hyper-V Guide
Zakaria Doumbia
Ainda não há avaliações
A Primer On Common Gis Data Formats: Contributed by Glenn Letham (Gisuser Editor) 13 September 2004
Documento7 páginas
A Primer On Common Gis Data Formats: Contributed by Glenn Letham (Gisuser Editor) 13 September 2004
cristofer osvalo peñaloza arevalo
Ainda não há avaliações
LAMW Getting Started
Documento5 páginas
LAMW Getting Started
jmpessoa
Ainda não há avaliações
Pandoc Mode Manual
Documento11 páginas
Pandoc Mode Manual
norbulinuks
Ainda não há avaliações
CV
Documento3 páginas
CV
Nina Miličević
Ainda não há avaliações
Workshop 19 - Bucket Excavator (Nested Motions and DEM-FEA Coupling in Workbench) Part B: Mechanical Coupling R4.3
Documento32 páginas
Workshop 19 - Bucket Excavator (Nested Motions and DEM-FEA Coupling in Workbench) Part B: Mechanical Coupling R4.3
Houssam BEN SALAH
Ainda não há avaliações
Installation Instructions (Corrected)
Documento1 página
Installation Instructions (Corrected)
Sicein Sas
Ainda não há avaliações
Implementation Documents
Documento4 páginas
Implementation Documents
SANJAY
Ainda não há avaliações
Correspondence Archiving & Tracking System
Documento45 páginas
Correspondence Archiving & Tracking System
ahmednaxim
Ainda não há avaliações
Users Manual Acobri UK
Documento212 páginas
Users Manual Acobri UK
Ioana Codrea Ortelecan
Ainda não há avaliações
EFI MP命令表
Documento2 páginas
EFI MP命令表
skizoufri
Ainda não há avaliações
Serial Number AutoCAD 2014
Documento5 páginas
Serial Number AutoCAD 2014
Punith Ky
67% (9)
ARW Graphics Bruyere
Documento78 páginas
ARW Graphics Bruyere
Mutahar_Hezam
Ainda não há avaliações
AR Purge Notes 160212
Documento2 páginas
AR Purge Notes 160212
Harry Taieb
Ainda não há avaliações
You Can't Joke About That: Why Everything Is Funny, Nothing Is Sacred, and We're All in This Together
No Everand
You Can't Joke About That: Why Everything Is Funny, Nothing Is Sacred, and We're All in This Together
Kat Timpf
Ainda não há avaliações
Burn Book: A Tech Love Story
No Everand
Burn Book: A Tech Love Story
Kara Swisher
Nota: 4.5 de 5 estrelas
4.5/5 (41)
Defensive Cyber Mastery: Expert Strategies for Unbeatable Personal and Business Security
No Everand
Defensive Cyber Mastery: Expert Strategies for Unbeatable Personal and Business Security
Denys Samsonov
Nota: 5 de 5 estrelas
5/5 (1)
Mooses with Bazookas
No Everand
Mooses with Bazookas
S. D. Smith
Nota: 5 de 5 estrelas
5/5 (8)
Cyber War: The Next Threat to National Security and What to Do About It
No Everand
Cyber War: The Next Threat to National Security and What to Do About It
Richard A. Clarke
Nota: 3.5 de 5 estrelas
3.5/5 (66)
Algorithms to Live By: The Computer Science of Human Decisions
No Everand
Algorithms to Live By: The Computer Science of Human Decisions
Brian Christian
Nota: 4.5 de 5 estrelas
4.5/5 (722)
Building large scale web apps
No Everand
Building large scale web apps
Addy Osmani
Ainda não há avaliações