Hadoop Architecture
[Figure: Hadoop architecture. A client submits a job, and the job is assigned to the cluster.]
Hadoop is a software framework for distributed processing of large data sets on a computer cluster. Map and Reduce share a general interface: each receives a sequence of records and produces records in response. A record consists of a key and a value.
[Figure: several Map tasks run in parallel and feed their output to Reduce tasks; the final result is produced by the reducers.]
Map operation: a map task keys its output so that the system places in the same bin the records that should come together in the reduce phase.
Reduce operation: a reduce task receives all the records that share a key and merges their values into the final output.
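The keyed binning described here is typically done by hashing. A minimal sketch in Python (the function names are illustrative, not Hadoop's API):

```python
# Illustrative hash partitioner: records with the same key always land
# in the same reducer's bin, so they meet again in the reduce phase.
def partition(key, num_reducers):
    return hash(key) % num_reducers

def bin_records(records, num_reducers):
    bins = [[] for _ in range(num_reducers)]
    for key, value in records:
        bins[partition(key, num_reducers)].append((key, value))
    return bins
```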
[Figure: data flow through the framework. Mappers (M1, M2, M3) feed a Combiner, then a Partitioner; the Sorter groups the partitioned records before they reach the Reducers (R1, R2, R3).]
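The combiner's role, local aggregation of one mapper's output before the shuffle, can be sketched as follows (illustrative Python, not the Hadoop API):

```python
from collections import defaultdict

# Illustrative combiner: sums counts per key locally on a single mapper,
# shrinking the data that must travel across the network to the reducers.
def combine(mapper_output):
    totals = defaultdict(int)
    for word, count in mapper_output:
        totals[word] += count
    return sorted(totals.items())
```

For example, `combine([("car", 1), ("river", 1), ("car", 1)])` returns `[("car", 2), ("river", 1)]`, so only two records are shuffled instead of three.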
Problem statement for word count: given a huge file, determine the count of each word in the file. Approach: MapReduce takes advantage of the large number of nodes present in the cluster; the map and reduce phases run in parallel on each node.
map(key, value):
    // key: document name; value: text of document
    for each word w in value:
        emit(w, 1)

reduce(key, values):
    // key: a word; values: an iterator over counts
    result = 0
    for each count v in values:
        result += v
    emit(key, result)
map(key=url, val=contents):
    for each word w in contents:
        emit(w, 1)

reduce(key=word, values=uniq_counts):
    sum all 1s in values list
    emit(word, sum)
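The pseudocode can be simulated end to end in plain Python. This is a sketch of the semantics only, not of Hadoop's distributed execution:

```python
from collections import defaultdict

def map_fn(doc_name, text):
    # key: document name; value: text of document
    for word in text.split():
        yield (word, 1)

def reduce_fn(word, counts):
    yield (word, sum(counts))

def word_count(documents):
    # Shuffle: gather every count emitted for the same word.
    groups = defaultdict(list)
    for name, text in documents.items():
        for word, count in map_fn(name, text):
            groups[word].append(count)
    # Reduce: collapse each group of counts into a total.
    result = {}
    for word, counts in sorted(groups.items()):
        for w, total in reduce_fn(word, counts):
            result[w] = total
    return result
```

Running `word_count({"d1": "deer bear river", "d2": "car car river"})` yields `{"bear": 1, "car": 2, "deer": 1, "river": 2}`.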
Worked example with the input splits "Deer Bear River", "Car Car River", "Deer Car Bear":
Map:     (Deer,1) (Bear,1) (River,1) | (Car,1) (Car,1) (River,1) | (Deer,1) (Car,1) (Bear,1)
Shuffle: Bear -> [1,1]; Car -> [1,1,1]; Deer -> [1,1]; River -> [1,1]
Reduce:  Bear 2, Car 3, Deer 2, River 2
[Figure: dataflow comparison. In batch MapReduce, a map task reads from HDFS, stores its output in a local store, and the reduce task pulls that output before writing its result back to HDFS. In pipelined MapReduce, map output is pushed to the reduce task instead of being pulled from local storage.]
[Figure: shuffle example. Mappers M1, M2, M3, and M4 emit keyed records such as (See,1), (Bob,1), and (Run,1); the Job Tracker routes each record by key to one of the reducers R1 through R4.]
The reduce task accepts the pipelined data and stores it in an in-memory buffer. The buffered spills are then merged, and the final output is written to HDFS.
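The merge step can be sketched with sorted spills combined lazily into one stream (illustrative Python using heapq.merge, not Hadoop's implementation):

```python
import heapq
from itertools import groupby
from operator import itemgetter

# Illustrative merge: each spill is a sorted list of (key, value) records
# flushed from the in-memory buffer; merging them yields a single sorted
# stream the reducer can consume one key group at a time.
def merge_spills(spills):
    merged = heapq.merge(*spills, key=itemgetter(0))
    for key, group in groupby(merged, key=itemgetter(0)):
        yield key, [v for _, v in group]
```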
Pipelining allows data to be sent and received between tasks and between jobs without intermediate disk I/O. This reduces completion time and enables the user to take snapshots of approximate output.
In this seminar, we studied Pipelined MapReduce in the Hadoop environment. It extends the MapReduce programming model, is superior to the batch model, and reduces the completion time of tasks. Pipelined MapReduce can process large datasets effectively. In future work, we will study the applicability of the MapReduce technique in cloud computing environments.
References:
1. J. Dean and S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," Proc. of Operating Systems Design and Implementation (OSDI), San Francisco, CA, pp. 137-150, 2004.
2. T. Hey, S. Tansley, and K. Tolle, "The Fourth Paradigm: Data-Intensive Scientific Discovery," Microsoft Research, Redmond, Washington, 2009.
3. C. Ranger, R. Raghuraman, A. Penmetsa, G. Bradski, and C. Kozyrakis, "Evaluating MapReduce for Multi-core and Multiprocessor Systems," Proc. of the 13th Symposium on High-Performance Computer Architecture (HPCA), Phoenix, AZ, 2007.
4. Hadoop, http://hadoop.apache.org/core/