Ebook552 pages3 hours

Learning YARN

Name: Learning YARN
Author: Akhil Arora
ISBN: 9781784394585

By Akhil Arora and Shrey Mehrotra

Rating: 0 out of 5 stars

()

Read preview

About this ebook

Moving beyond MapReduce - learn resource management and big data processing using YARN

About This Book

Deep dive into YARN components, schedulers, life cycle management and security architecture
Create your own Hadoop-YARN applications and integrate big data technologies with YARN
Step-by-step guide to provision, manage, and monitor Hadoop-YARN clusters with ease

Who This Book Is For

This book is intended for those who want to understand what YARN is and how to efficiently use it for the resource management of large clusters. For cluster administrators, this book gives a detailed explanation of provisioning and managing YARN clusters. If you are a Java developer or an open source contributor, this book will help you to drill down the YARN architecture, write your own YARN applications and understand the application execution phases. This book will also help big data engineers explore YARN integration with real-time analytics technologies such as Spark and Storm.

What You Will Learn

Explore YARN features and offerings
Manage big data clusters efficiently using the YARN framework
Create single as well as multi-node Hadoop-YARN clusters on Linux machines
Understand YARN components and their administration
Gain insights into application execution flow over a YARN cluster
Write your own distributed application and execute it over YARN cluster
Work with schedulers and queues for efficient scheduling of applications
Integrate big data projects like Spark and Storm with YARN

In Detail

Today enterprises generate huge volumes of data. In order to provide effective services and to make smarter and more intelligent decisions from these huge volumes of data, enterprises use big-data analytics. In recent years, Hadoop has been used for massive data storage and efficient distributed processing of data. The Yet Another Resource Negotiator (YARN) framework solves the design problems related to resource management faced by the Hadoop 1.x framework by providing a more scalable, efficient, flexible, and highly available resource management framework for distributed data processing.

This book starts with an overview of the YARN features and explains how YARN provides a business solution for growing big data needs. You will learn to provision and manage single, as well as multi-node, Hadoop-YARN clusters in the easiest way. You will walk through the YARN administration, life cycle management, application execution, REST APIs, schedulers, security framework and so on. You will gain insights about the YARN components and features such as ResourceManager, NodeManager, ApplicationMaster, Container, Timeline Server, High Availability, Resource Localisation and so on.

The book explains Hadoop-YARN commands and the configurations of components and explores topics such as High Availability, Resource Localization and Log aggregation. You will then be ready to develop your own ApplicationMaster and execute it over a Hadoop-YARN cluster.

Towards the end of the book, you will learn about the security architecture and integration of YARN with big data technologies like Spark and Storm. This book promises conceptual as well as practical knowledge of resource management using YARN.

Style and approach

Starting with the basics and covering the core concepts with the practical usage, this tutorial is a complete guide to learn and explore YARN offerings.

Skip carousel

LanguageEnglish

PublisherPackt Publishing

Release dateAug 28, 2015

ISBN9781784394585

Author

Akhil Arora

Akhil Arora works as a senior software engineer with Impetus Infotech and has around 5 years of extensive research and development experience. He joined Impetus Infotech in October 2012 and is working with the innovation labs team. He is a technology expert, good learner, and creative thinker. He is also passionate and enthusiastic about application development in Hadoop and other big data technologies. He loves to explore new technologies and is always ready to work on new challenges. Akhil attained a BE degree in computer science from the Apeejay College of Engineering in Sohna, Haryana, India. A beginning for a new voyage, A first step towards my passion and to gain recognition, My first book Learning YARN..!! -- Akhil Arora

Related authors

Skip carousel

Related to Learning YARN

Related ebooks

Skip carousel

HDInsight Essentials - Second Edition
Ebook
HDInsight Essentials - Second Edition
byRajesh Nadipalli
Rating: 0 out of 5 stars
0 ratings
Learning Apache Spark 2
Ebook
Learning Apache Spark 2
byMuhammad Asif Abbasi
Rating: 0 out of 5 stars
0 ratings
Big Data Analytics
Ebook
Big Data Analytics
byVenkat Ankam
Rating: 0 out of 5 stars
0 ratings
Apache Solr Search Patterns
Ebook
Apache Solr Search Patterns
byJayant Kumar
Rating: 0 out of 5 stars
0 ratings
Scala for Data Science
Ebook
Scala for Data Science
byBugnion Pascal
Rating: 0 out of 5 stars
0 ratings
Learning Hadoop 2
Ebook
Learning Hadoop 2
byGarry Turkington
Rating: 4 out of 5 stars
4/5
Fast Data Processing with Spark 2 - Third Edition
Ebook
Fast Data Processing with Spark 2 - Third Edition
byKrishna Sankar
Rating: 0 out of 5 stars
0 ratings
Scalatra in Action
Ebook
Scalatra in Action
byRoss Baker
Rating: 0 out of 5 stars
0 ratings
YARN Essentials
Ebook
YARN Essentials
byAmol Fasale
Rating: 0 out of 5 stars
0 ratings
Learning ELK Stack
Ebook
Learning ELK Stack
byChhajed Saurabh
Rating: 0 out of 5 stars
0 ratings
Fast Data Processing Systems with SMACK Stack
Ebook
Fast Data Processing Systems with SMACK Stack
byRaúl Estrada
Rating: 0 out of 5 stars
0 ratings
Spark in Action: Covers Apache Spark 3 with Examples in Java, Python, and Scala
Ebook
Spark in Action: Covers Apache Spark 3 with Examples in Java, Python, and Scala
byJean-Georges Perrin
Rating: 0 out of 5 stars
0 ratings
Cloudera Administration Handbook
Ebook
Cloudera Administration Handbook
byRohit Menon
Rating: 0 out of 5 stars
0 ratings
Big Data Forensics – Learning Hadoop Investigations
Ebook
Big Data Forensics – Learning Hadoop Investigations
byJoe Sremack
Rating: 0 out of 5 stars
0 ratings
Mastering Spark for Data Science
Ebook
Mastering Spark for Data Science
byDavid George
Rating: 0 out of 5 stars
0 ratings
Scaling Big Data with Hadoop and Solr - Second Edition
Ebook
Scaling Big Data with Hadoop and Solr - Second Edition
byHrishikesh Vijay Karambelkar
Rating: 0 out of 5 stars
0 ratings
Scala Data Analysis Cookbook
Ebook
Scala Data Analysis Cookbook
byManivannan Arun
Rating: 0 out of 5 stars
0 ratings
Azure Databricks A Complete Guide - 2019 Edition
Ebook
Azure Databricks A Complete Guide - 2019 Edition
byGerardus Blokdyk
Rating: 0 out of 5 stars
0 ratings
Elasticsearch 8 for Developers - 2nd Edition: A beginner's guide to indexing, analyzing, searching, and aggregating data (English Edition)
Ebook
Elasticsearch 8 for Developers - 2nd Edition: A beginner's guide to indexing, analyzing, searching, and aggregating data (English Edition)
byAnurag Srivastava
Rating: 0 out of 5 stars
0 ratings
Metadata Repositories Complete Self-Assessment Guide
Ebook
Metadata Repositories Complete Self-Assessment Guide
byGerardus Blokdyk
Rating: 0 out of 5 stars
0 ratings
Managing Multimedia and Unstructured Data in the Oracle Database
Ebook
Managing Multimedia and Unstructured Data in the Oracle Database
byMarcelle Kratochvil
Rating: 0 out of 5 stars
0 ratings
Hadoop: Data Processing and Modelling
Ebook
Hadoop: Data Processing and Modelling
byGarry Turkington
Rating: 0 out of 5 stars
0 ratings
Hadoop Blueprints
Ebook
Hadoop Blueprints
byAnurag Shrivastava
Rating: 0 out of 5 stars
0 ratings
Learning Azure DocumentDB
Ebook
Learning Azure DocumentDB
byBecker Riccardo
Rating: 0 out of 5 stars
0 ratings
Learning HBase
Ebook
Learning HBase
byShashwat Shriparv
Rating: 0 out of 5 stars
0 ratings
Hadoop 2.x Administration Cookbook
Ebook
Hadoop 2.x Administration Cookbook
byGurmukh Singh
Rating: 0 out of 5 stars
0 ratings
Digital Rights Management: A Librarian’s Guide to Technology and Practise
Ebook
Digital Rights Management: A Librarian’s Guide to Technology and Practise
byGrace Agnew
Rating: 3 out of 5 stars
3/5
Apache Oozie Essentials
Ebook
Apache Oozie Essentials
bySingh Jagat Jasjit
Rating: 0 out of 5 stars
0 ratings
Hadoop MapReduce v2 Cookbook - Second Edition
Ebook
Hadoop MapReduce v2 Cookbook - Second Edition
byThilina Gunarathne
Rating: 0 out of 5 stars
0 ratings
IPv6 Complete Self-Assessment Guide
Ebook
IPv6 Complete Self-Assessment Guide
byGerardus Blokdyk
Rating: 0 out of 5 stars
0 ratings

Intelligence (AI) & Semantics For You

Skip carousel

ChatGPT For Dummies
Ebook
ChatGPT For Dummies
byPam Baker
Rating: 0 out of 5 stars
0 ratings
Midjourney Mastery - The Ultimate Handbook of Prompts
Ebook
Midjourney Mastery - The Ultimate Handbook of Prompts
byAndreea Todinca
Rating: 5 out of 5 stars
5/5
Creating Online Courses with ChatGPT | A Step-by-Step Guide with Prompt Templates
Ebook
Creating Online Courses with ChatGPT | A Step-by-Step Guide with Prompt Templates
byCea West
Rating: 4 out of 5 stars
4/5
AI for Educators: AI for Educators
Ebook
AI for Educators: AI for Educators
byMatt Miller
Rating: 5 out of 5 stars
5/5
80 Ways to Use ChatGPT in the Classroom
Ebook
80 Ways to Use ChatGPT in the Classroom
byStan Skrabut
Rating: 5 out of 5 stars
5/5
101 Midjourney Prompt Secrets
Ebook
101 Midjourney Prompt Secrets
byMarcus Byrne
Rating: 3 out of 5 stars
3/5
Data Science from Scratch: The #1 Data Science Guide for Everything A Data Scientist Needs to Know: Python, Linear Algebra, Statistics, Coding, Applications, Neural Networks, and Decision Trees
Ebook
Data Science from Scratch: The #1 Data Science Guide for Everything A Data Scientist Needs to Know: Python, Linear Algebra, Statistics, Coding, Applications, Neural Networks, and Decision Trees
bySteven Cooper
Rating: 4 out of 5 stars
4/5
ChatGPT For Fiction Writing: AI for Authors
Ebook
ChatGPT For Fiction Writing: AI for Authors
byNova Leigh
Rating: 5 out of 5 stars
5/5
Mastering ChatGPT: 21 Prompts Templates for Effortless Writing
Ebook
Mastering ChatGPT: 21 Prompts Templates for Effortless Writing
byCea West
Rating: 5 out of 5 stars
5/5
ChatGPT Money Machine 2024 - The Ultimate Chatbot Cheat Sheet to Go From Clueless Noob to Prompt Prodigy Fast! Complete AI Beginner’s Course to Catch the GPT Gold Rush Before It Leaves You Behind
Ebook
ChatGPT Money Machine 2024 - The Ultimate Chatbot Cheat Sheet to Go From Clueless Noob to Prompt Prodigy Fast! Complete AI Beginner’s Course to Catch the GPT Gold Rush Before It Leaves You Behind
byAlec Rowe
Rating: 0 out of 5 stars
0 ratings
AI Crash Course: A fun and hands-on introduction to machine learning, reinforcement learning, deep learning, and artificial intelligence with Python
Ebook
AI Crash Course: A fun and hands-on introduction to machine learning, reinforcement learning, deep learning, and artificial intelligence with Python
byHadelin de Ponteves
Rating: 0 out of 5 stars
0 ratings
Dark Aeon: Transhumanism and the War Against Humanity
Ebook
Dark Aeon: Transhumanism and the War Against Humanity
byJoe Allen
Rating: 5 out of 5 stars
5/5
ChatGPT Millionaire 2024 - Bot-Driven Side Hustles, Prompt Engineering Shortcut Secrets, and Automated Income Streams that Print Money While You Sleep. The Ultimate Beginner’s Guide for AI Business
Ebook
ChatGPT Millionaire 2024 - Bot-Driven Side Hustles, Prompt Engineering Shortcut Secrets, and Automated Income Streams that Print Money While You Sleep. The Ultimate Beginner’s Guide for AI Business
byAlec Rowe
Rating: 0 out of 5 stars
0 ratings
Rise of Generative AI and ChatGPT: Understand how Generative AI and ChatGPT are transforming and reshaping the business world (English Edition)
Ebook
Rise of Generative AI and ChatGPT: Understand how Generative AI and ChatGPT are transforming and reshaping the business world (English Edition)
byUtpal Chakraborty
Rating: 0 out of 5 stars
0 ratings
Artificial Intelligence: A Guide for Thinking Humans
Ebook
Artificial Intelligence: A Guide for Thinking Humans
byMelanie Mitchell
Rating: 4 out of 5 stars
4/5
Python Machine Learning - Third Edition: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow 2, 3rd Edition
Ebook
Python Machine Learning - Third Edition: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow 2, 3rd Edition
bySebastian Raschka
Rating: 5 out of 5 stars
5/5
A Quickstart Guide To Becoming A ChatGPT Millionaire: The ChatGPT Book For Beginners (Lazy Money Series®)
Ebook
A Quickstart Guide To Becoming A ChatGPT Millionaire: The ChatGPT Book For Beginners (Lazy Money Series®)
byS M Howard
Rating: 4 out of 5 stars
4/5
ChatGPT: The Future of Intelligent Conversation
Ebook
ChatGPT: The Future of Intelligent Conversation
byCea West
Rating: 4 out of 5 stars
4/5
The Secrets of ChatGPT Prompt Engineering for Non-Developers
Ebook
The Secrets of ChatGPT Prompt Engineering for Non-Developers
byCea West
Rating: 5 out of 5 stars
5/5
Dancing with Qubits: How quantum computing works and how it can change the world
Ebook
Dancing with Qubits: How quantum computing works and how it can change the world
byRobert S. Sutor
Rating: 5 out of 5 stars
5/5
ChatGPT
Ebook
ChatGPT
byRobert Conway
Rating: 1 out of 5 stars
1/5
Enterprise AI For Dummies
Ebook
Enterprise AI For Dummies
byZachary Jarvinen
Rating: 3 out of 5 stars
3/5
Hacking With Linux 2020:A Complete Beginners Guide to the World of Hacking Using Linux - Explore the Methods and Tools of Ethical Hacking with Linux
Ebook
Hacking With Linux 2020:A Complete Beginners Guide to the World of Hacking Using Linux - Explore the Methods and Tools of Ethical Hacking with Linux
byJoseph Kenna
Rating: 0 out of 5 stars
0 ratings
Chat-GPT Income Ideas: Pioneering Monetization Concepts Utilizing Conversational AI for Profitable Ventures
Ebook
Chat-GPT Income Ideas: Pioneering Monetization Concepts Utilizing Conversational AI for Profitable Ventures
byThe Passive Income Strategist
Rating: 4 out of 5 stars
4/5
ChatGPT for Beginners: How to Make Money Online and 10x Your Productivity Using ChatGPT Even if You’re an Absolute Beginner (The Complete Up-to-Date ChatGPT Guide)
Ebook
ChatGPT for Beginners: How to Make Money Online and 10x Your Productivity Using ChatGPT Even if You’re an Absolute Beginner (The Complete Up-to-Date ChatGPT Guide)
byMatthew Hayes
Rating: 0 out of 5 stars
0 ratings
ChatGPT Ultimate User Guide - How to Make Money Online Faster and More Precise Using AI Technology
Ebook
ChatGPT Ultimate User Guide - How to Make Money Online Faster and More Precise Using AI Technology
byMaximus Wilson
Rating: 0 out of 5 stars
0 ratings
Mastering ChatGPT: Create Highly Effective Prompts, Strategies, and Best Practices to Go From Novice to Expert
Ebook
Mastering ChatGPT: Create Highly Effective Prompts, Strategies, and Best Practices to Go From Novice to Expert
byTJ Books
Rating: 3 out of 5 stars
3/5
The Dangers of Automation in Airliners: Accidents Waiting to Happen
Ebook
The Dangers of Automation in Airliners: Accidents Waiting to Happen
byJack J. Hersch
Rating: 5 out of 5 stars
5/5
TensorFlow in 1 Day: Make your own Neural Network
Ebook
TensorFlow in 1 Day: Make your own Neural Network
byKrishna Rungta
Rating: 4 out of 5 stars
4/5
The Algorithm of the Universe (A New Perspective to Cognitive AI)
Ebook
The Algorithm of the Universe (A New Perspective to Cognitive AI)
byAncient Philosophy
Rating: 5 out of 5 stars
5/5

Related podcast episodes

Skip carousel

Beam and Spark with Holden Karau: This week our colleague, Holden Karau, joins us to talk about Spark and Beam.
Podcast episode
Beam and Spark with Holden Karau: This week our colleague, Holden Karau, joins us to talk about Spark and Beam.
byGoogle Cloud Platform Podcast
0 ratings
0% found this document useful
Speed Up And Simplify Your Streaming Data Workloads With Red Panda - Episode 152: An interview with Vectorized founder Alexander Gallego about the Red Panda streaming engine and building a drop-in replacement for Kafka with better performance and throughput.
Podcast episode
Speed Up And Simplify Your Streaming Data Workloads With Red Panda - Episode 152: An interview with Vectorized founder Alexander Gallego about the Red Panda streaming engine and building a drop-in replacement for Kafka with better performance and throughput.
byData Engineering Podcast
0 ratings
0% found this document useful
Build Your Data Analytics Like An Engineer - Episode 81: An interview about how dbt enables your data teams to build better analytics in your data warehouse
Podcast episode
Build Your Data Analytics Like An Engineer - Episode 81: An interview about how dbt enables your data teams to build better analytics in your data warehouse
byData Engineering Podcast
0 ratings
0% found this document useful
Ali Ghodsi – The Past, Present, and Future of Big Data – [Founder’s Field Guide, EP.18]: My Guest today is Ali Ghodsi, founder and CEO of Databricks, a data analytics platform for data scientists and developers. He's also the founder of Apache Spark, the open-source project that Databricks is built on, and is an accomplished researcher at...
Podcast episode
Ali Ghodsi – The Past, Present, and Future of Big Data – [Founder’s Field Guide, EP.18]: My Guest today is Ali Ghodsi, founder and CEO of Databricks, a data analytics platform for data scientists and developers. He's also the founder of Apache Spark, the open-source project that Databricks is built on, and is an accomplished researcher at...
byInvest Like the Best with Patrick O'Shaughnessy
0 ratings
0% found this document useful
All Things Azure with Dwayne Monroe: Dwayne Monroe is a senior cloud architect at Cloudreach, an organization that helps enterprises maximize their cloud investments, who’s focused on Azure. Prior to joining Cloudreach, Dwayne worked as a senior Microsoft and cloud architect at High Availabi
Podcast episode
All Things Azure with Dwayne Monroe: Dwayne Monroe is a senior cloud architect at Cloudreach, an organization that helps enterprises maximize their cloud investments, who’s focused on Azure. Prior to joining Cloudreach, Dwayne worked as a senior Microsoft and cloud architect at High Availabi
byScreaming in the Cloud
0 ratings
0% found this document useful
Level Up Your Data Platform With Active Metadata: A conversation with Atlan co-founder Prukalpa Sankar about the idea of active metadata and how it can reduce the toil involved in managing a data platform
Podcast episode
Level Up Your Data Platform With Active Metadata: A conversation with Atlan co-founder Prukalpa Sankar about the idea of active metadata and how it can reduce the toil involved in managing a data platform
byData Engineering Podcast
0 ratings
0% found this document useful
Reflections On Designing A Data Platform From Scratch: A monologue by Tobias Macey, the host of the show, about the design considerations involved in building a data platform and how the lessons learned from running the Data Engineering Podcast are influencing the choices made.
Podcast episode
Reflections On Designing A Data Platform From Scratch: A monologue by Tobias Macey, the host of the show, about the design considerations involved in building a data platform and how the lessons learned from running the Data Engineering Podcast are influencing the choices made.
byData Engineering Podcast
100%
100% found this document useful
Data Security in Snowflake’s Data Cloud with Dan Myers: Snowflake went public last year and is one of the fastest growing companies in the data cloud space. Businesses from all over the world are utilizing Snowflake for data storage, processing, and analytics. Businesses using Snowflake are storing massive am...
Podcast episode
Data Security in Snowflake’s Data Cloud with Dan Myers: Snowflake went public last year and is one of the fastest growing companies in the data cloud space. Businesses from all over the world are utilizing Snowflake for data storage, processing, and analytics. Businesses using Snowflake are storing massive am...
byPartially Redacted: Data Privacy, Security & Compliance
0 ratings
0% found this document useful
Stackdriver Diagnostics Tools with Sharat Shroff and Morgan McLean: Sharat Shroff and Morgan McLean, product managers at Google Cloud, cover some of the Stackdriver tools at your disposal when you're investigating an issue on your application.
Podcast episode
Stackdriver Diagnostics Tools with Sharat Shroff and Morgan McLean: Sharat Shroff and Morgan McLean, product managers at Google Cloud, cover some of the Stackdriver tools at your disposal when you're investigating an issue on your application.
byGoogle Cloud Platform Podcast
0 ratings
0% found this document useful
Episode 534: Load Testing Rails Apps with JMeter ft. Milap Neupane - RUBY 509
Podcast episode
Episode 534: Load Testing Rails Apps with JMeter ft. Milap Neupane - RUBY 509
byRuby Rogues
0 ratings
0% found this document useful
Package Management in Elixir vs. JavaScript with Wojtek Mach & Amal Hussein: Wojtek Mach of HexPM and Amal Hussein, engineering leader and former NPM team member, join Owen Bickford to compare notes on package management in Elixir vs. JavaScript.
Podcast episode
Package Management in Elixir vs. JavaScript with Wojtek Mach & Amal Hussein: Wojtek Mach of HexPM and Amal Hussein, engineering leader and former NPM team member, join Owen Bickford to compare notes on package management in Elixir vs. JavaScript.
byElixir Wizards
0 ratings
0% found this document useful
Serverless Data APIs
Podcast episode
Serverless Data APIs
byThe Cloudcast
0 ratings
0% found this document useful
RAILS_ENV - Ruby 556
Podcast episode
RAILS_ENV - Ruby 556
byRuby Rogues
0 ratings
0% found this document useful
Spanner Myths Busted with Pritam Shah and Vaibhav Govil: This week, we’re busting myths around Google Cloud Spanner with our guests Pritam Shah and Vaibhav Govil. and host this episode and learn about the fantastic capabilities of Cloud Spanner. Our guests give us a quick run-down of Spanner database...
Podcast episode
Spanner Myths Busted with Pritam Shah and Vaibhav Govil: This week, we’re busting myths around Google Cloud Spanner with our guests Pritam Shah and Vaibhav Govil. and host this episode and learn about the fantastic capabilities of Cloud Spanner. Our guests give us a quick run-down of Spanner database...
byGoogle Cloud Platform Podcast
0 ratings
0% found this document useful
Birthday tin, logging HTTP clients, and Inertia's head: Jake and Michael discuss all the latest Laravel releases, tutorials, and happenings in the community.
Podcast episode
Birthday tin, logging HTTP clients, and Inertia's head: Jake and Michael discuss all the latest Laravel releases, tutorials, and happenings in the community.
byLaravel News Podcast
0 ratings
0% found this document useful
SAP Devtoberfest How to simplify your data fetching life with RTK Query: We are super excited to announce that Devtoberfest returns this year running from October 1st - 31st 2022. It's a celebration for SAP developers and an excellent way to get ready for SAP TechEd.
Podcast episode
SAP Devtoberfest How to simplify your data fetching life with RTK Query: We are super excited to announce that Devtoberfest returns this year running from October 1st - 31st 2022. It's a celebration for SAP developers and an excellent way to get ready for SAP TechEd.
bySAP Developers
0 ratings
0% found this document useful
Databases on Kubernetes, Why Database-as-a-Service matters
Podcast episode
Databases on Kubernetes, Why Database-as-a-Service matters
byKubernetes Bytes
0 ratings
0% found this document useful
"Saga of a Gnarly Report" with Owen and Dan: Elixir Wizards Owen and Dan delve into the complexities of building advanced reporting features within software applications. They share personal insights and challenges encountered while developing reporting solutions for user-generated data, leveraging both Elixir/Phoenix and Ruby on Rails.
Podcast episode
"Saga of a Gnarly Report" with Owen and Dan: Elixir Wizards Owen and Dan delve into the complexities of building advanced reporting features within software applications. They share personal insights and challenges encountered while developing reporting solutions for user-generated data, leveraging both Elixir/Phoenix and Ruby on Rails.
byElixir Wizards
0 ratings
0% found this document useful
S2 E19 - Dynamic Module + Component Loading Using any Observables: Single Page Applications (SPA) have many advantages, including increased interactivity, responsiveness, and user experience. However, a SPA often requires sending large chunks of JavaScript code to the client. This code must be downloaded and parsed...
Podcast episode
S2 E19 - Dynamic Module + Component Loading Using any Observables: Single Page Applications (SPA) have many advantages, including increased interactivity, responsiveness, and user experience. However, a SPA often requires sending large chunks of JavaScript code to the client. This code must be downloaded and parsed...
byThe Angular Plus Show
0 ratings
0% found this document useful
Yaniv Tal: The Graph – A Marketplace for Web3 Data Indexes Based on GraphQL: We're joined by Yaniv Tal, Project Lead at The Graph. The project aims to create a scalable marketplace for high-availability blockchain data indexes.
Podcast episode
Yaniv Tal: The Graph – A Marketplace for Web3 Data Indexes Based on GraphQL: We're joined by Yaniv Tal, Project Lead at The Graph. The project aims to create a scalable marketplace for high-availability blockchain data indexes.
byEpicenter - Learn about Crypto, Blockchain, Ethereum, Bitcoin and Distributed Technologies
0 ratings
0% found this document useful
MySQL on Kubernetes
Podcast episode
MySQL on Kubernetes
byKubernetes Bytes
0 ratings
0% found this document useful
Faster asset compilation, faster pagination, and query correlation: Jake and Michael discuss all the latest Laravel releases, tutorials, and happenings in the community.
Podcast episode
Faster asset compilation, faster pagination, and query correlation: Jake and Michael discuss all the latest Laravel releases, tutorials, and happenings in the community.
byLaravel News Podcast
0 ratings
0% found this document useful
Grabbing a Pint, dry requests, and supercharging your pipelines: Jake and Michael discuss all the latest Laravel releases, tutorials, and happenings in the community.
Podcast episode
Grabbing a Pint, dry requests, and supercharging your pipelines: Jake and Michael discuss all the latest Laravel releases, tutorials, and happenings in the community.
byLaravel News Podcast
0 ratings
0% found this document useful
SAP Developer News February 8th, 2024: Join some of our SAP Developer Advocates Team, whilst they summarize their highlights of 2023, in part two of this special edition.
Podcast episode
SAP Developer News February 8th, 2024: Join some of our SAP Developer Advocates Team, whilst they summarize their highlights of 2023, in part two of this special edition.
bySAP Developers
0 ratings
0% found this document useful
Tinkering well, bouncing users, and validating configs: Jake and Michael discuss all the latest Laravel releases, tutorials, and happenings in the community.
Podcast episode
Tinkering well, bouncing users, and validating configs: Jake and Michael discuss all the latest Laravel releases, tutorials, and happenings in the community.
byLaravel News Podcast
0 ratings
0% found this document useful
Episode 465 - Functions on Azure Container Apps: Evan and Sujit talk to Principal PM Ramya Oruganti about the release of Functions as a first class managed service on Azure Container Apps. She tells us how this opens up possibilities for event-driven applications on ACA with full support for KEDA, Bindings, Dapr, GitHub Actions etc.   Media file: https://azpodcast.blob.core.windows.net/episodes/Episode465.mp3 YouTube: https://youtu.be/0XR0PJuB_ko Resources: https://github.com/Azure/azure-functions-on-container-apps https://techcommunity.microsoft.com/t5/apps-on-azure-blog/azure-functions-for-cloud-native-microservices-public-preview/ba-p/3826800 https://learn.microsoft.com/en-us/azure/azure-functions/functions-deploy-container-apps?tabs=acr%2Cbash&pivots=programming-language-csharp   Other updates: Understanding Service Health communications for Azure vulnerabilities | Azure Blog | Microsoft Azure Public preview: Assess impact of service retirements workbook tem
Podcast episode
Episode 465 - Functions on Azure Container Apps: Evan and Sujit talk to Principal PM Ramya Oruganti about the release of Functions as a first class managed service on Azure Container Apps. She tells us how this opens up possibilities for event-driven applications on ACA with full support for KEDA, Bindings, Dapr, GitHub Actions etc.   Media file: https://azpodcast.blob.core.windows.net/episodes/Episode465.mp3 YouTube: https://youtu.be/0XR0PJuB_ko Resources: https://github.com/Azure/azure-functions-on-container-apps https://techcommunity.microsoft.com/t5/apps-on-azure-blog/azure-functions-for-cloud-native-microservices-public-preview/ba-p/3826800 https://learn.microsoft.com/en-us/azure/azure-functions/functions-deploy-container-apps?tabs=acr%2Cbash&pivots=programming-language-csharp   Other updates: Understanding Service Health communications for Azure vulnerabilities | Azure Blog | Microsoft Azure Public preview: Assess impact of service retirements workbook tem
byThe Azure Podcast
0 ratings
0% found this document useful
Sailing, Rays, and Socialstreams: Jake and Michael discuss all the latest Laravel releases, tutorials, and happenings in the community.
Podcast episode
Sailing, Rays, and Socialstreams: Jake and Michael discuss all the latest Laravel releases, tutorials, and happenings in the community.
byLaravel News Podcast
0 ratings
0% found this document useful
59 ngAir - Angular 2 testing using Protractor 2C Karma and more with Julie Ralph: Angular 2 testing using Protractor, Karma and more withJulie Ralph Panelists: PatrickJS, Ed Conolly Guests: Julie RalphOutline Background Testing landscape Firstquestion...Jasmine or Mocha? j/k Seriously, though, give us a layout of the testinglandsc...
Podcast episode
59 ngAir - Angular 2 testing using Protractor 2C Karma and more with Julie Ralph: Angular 2 testing using Protractor, Karma and more withJulie Ralph Panelists: PatrickJS, Ed Conolly Guests: Julie RalphOutline Background Testing landscape Firstquestion...Jasmine or Mocha? j/k Seriously, though, give us a layout of the testinglandsc...
byAngular Air
0 ratings
0% found this document useful
How Decentralized Social Network Farcaster Hopes to Eventually Get to One Billion Users - Ep. 607: Dan Romero, the co-founder of the decentralized social media network, discusses why Frames has become so popular, the central ideas behind Farcaster, and what’s in store for Farcaster’s future.
Podcast episode
How Decentralized Social Network Farcaster Hopes to Eventually Get to One Billion Users - Ep. 607: Dan Romero, the co-founder of the decentralized social media network, discusses why Frames has become so popular, the central ideas behind Farcaster, and what’s in store for Farcaster’s future.
byUnchained
0 ratings
0% found this document useful
Cassandra on Kubernetes using K8ssandra
Podcast episode
Cassandra on Kubernetes using K8ssandra
byKubernetes Bytes
0 ratings
0% found this document useful

Skip carousel

Build Your Own URL Shortening Service
Linux Format
Article
Build Your Own URL Shortening Service
May 4, 2021
7 min read
Build A Static Analysis Development Pipeline
Linux Format
Article
Build A Static Analysis Development Pipeline
Jul 27, 2021
9 min read
The Razor’s Edge
Linux Format
Article
The Razor’s Edge
Mar 10, 2020
10 min read
Perl at 34
Linux Format
Article
Perl at 34
Feb 8, 2022
7 min read
ManageEngine OpManager Professional 12.7
PC Pro Magazine
Article
ManageEngine OpManager Professional 12.7
Feb 8, 2024
2 min read
Poisoning The Well
Linux Format
Article
Poisoning The Well
Jan 11, 2022
4 min read
Beef Up Network Security
MacLife
Article
Beef Up Network Security
May 29, 2018
2 min read
The Network NAS appliances 2024
PC Pro Magazine
Article
The Network NAS appliances 2024
Apr 4, 2024
4 min read
Five Key VPNs for Your Apple Gear
Essential Apple User Magazine
Article
Five Key VPNs for Your Apple Gear
Aug 1, 2021
5 min read
Accurate, Open Source IP-based Localisation
Linux Format
Article
Accurate, Open Source IP-based Localisation
Dec 14, 2021
8 min read
MacOS Monterey 12.5 Is Now Available And Full Of Security Updates
Macworld UK
Article
MacOS Monterey 12.5 Is Now Available And Full Of Security Updates
Aug 19, 2022
4 min read
Observability Of The Kernel And Containers
Linux Format
Article
Observability Of The Kernel And Containers
Apr 4, 2023
Mihalis Tsoukalos is currently working on Time Series. You can reach him at: @mactsouk. For our final delve into eBPF, we’re tackling applications, the kernel and Docker containers. At the end of the day, all Linux machines execute code for applicat
10 min read
LATEST NEWS & GEAR
RotorDrone Pro
Article
LATEST NEWS & GEAR
Mar 17, 2020
Terra Drone Europe, a Japan-based corporation, has joined forces with Unilever, the British-Dutch consumer goods giant, to explore drone delivery service of Unilever’s Ben & Jerry’s ice cream brand in New York. To introduce the delivery service—Ice C
8 min read
Leveling Up Remote Management
Residential Tech Today
Article
Leveling Up Remote Management
Sep 29, 2023
2 min read
Network Issues
MacLife
Article
Network Issues
May 29, 2018
These days, many wireless networks overlap due to their number and extended range, which can cause interference that manifests with poor performance and connection issues at shorter-than-expected ranges. Add in an ever-growing number of devices tryin
4 min read
Other Pros And Cons
Linux Format
Article
Other Pros And Cons
Feb 7, 2023
1 min read
Networking
MacFormat
Article
Networking
Mar 5, 2024
> This is likely to result from that Mac having two network connections. When macOS checks every couple of days to ensure that its name is different from others on that local network, it sees itself through the other interface, assumes it’s another M
1 min read
Lag Is Killing Games
Linux Format
Article
Lag Is Killing Games
Jan 11, 2022
8 min read
Optimising Your Quality Of Service
TechLife
Article
Optimising Your Quality Of Service
May 3, 2021
4 min read
Ring The Plumber!
Linux Format
Article
Ring The Plumber!
Aug 25, 2020
“A key improvement that PipeWire brings over PulseAudio is that the policy is separate from the media handling. The policy is the code that decides which audio device to use, which software has access to them and how it’s all connected. In PipeWire,
1 min read
Quick–fire Questions & Answers
MacLife
Article
Quick–fire Questions & Answers
May 25, 2021
The way to do this is changing, and currently depends on where in the world you are. In the United States, Apple now provides an online self–service system at bit.ly/mac364 activlock1. For other countries, you can enter the system with its guidance t
2 min read
Hammerhead Karoo2
Australian Mountain Bike
Article
Hammerhead Karoo2
May 8, 2022
5 min read
Pull, Configure And Run
Linux Format
Article
Pull, Configure And Run
Apr 7, 2020
Guacamole offers ready-to-run installation packages that are available for Linux distros such as CentOS or Debian. However, the thrust of this article is to illustrate running Guacamole in a Docker container context. Fire up an environment where you
8 min read
Yarp: A Similar Framework
Linux Format
Article
Yarp: A Similar Framework
Jan 12, 2021
YARP is the framework for communications within robotics. It can replace the ROS master as a name server. You can also do this the other way around using the YARP as a name server. The name server will support the nodes and protocols across your syst
1 min read
What’s Killing Your Wi-fi?
PC Pro Magazine
Article
What’s Killing Your Wi-fi?
May 13, 2021
9 min read
Art Beyond The Canvas
Linux Format
Article
Art Beyond The Canvas
May 2, 2023
9 min read
Synology RT2600ac
MacLife
Article
Synology RT2600ac
May 11, 2017
2 min read
Private Internet Access
Linux Format
Article
Private Internet Access
Dec 12, 2023
2 min read
Network-monitoring software 2024
PC Pro Magazine
Article
Network-monitoring software 2024
Feb 8, 2024
4 min read
Set Up A Production- Ready Web Server
APC
Article
Set Up A Production- Ready Web Server
Nov 4, 2019
8 min read

Related categories

Skip carousel

Reviews for Learning YARN

Rating: 0 out of 5 stars

0 ratings

0 ratings0 reviews

Book preview

Learning YARN - Akhil Arora

Learning YARN

Credits

About the Authors

Acknowledgments

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers, and more

Why subscribe?

Free access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

1. Starting with YARN Basics

Introduction to MapReduce v1

Shortcomings of MapReducev1

An overview of YARN components

ResourceManager

NodeManager

ApplicationMaster

Container

The YARN architecture

How YARN satisfies big data needs

Projects powered by YARN

Summary

2. Setting up a Hadoop-YARN Cluster

Starting with the basics

Supported platforms

Hardware requirements

Software requirements

Basic Linux commands / utilities

Sudo

Nano editor

Source

Jps

Netstat

Man

Preparing a node for a Hadoop-YARN cluster

Install Java

Create a Hadoop dedicated user and group

Disable firewall or open Hadoop ports

Configure the domain name resolution

Install SSH and configure passwordless SSH from the master to all slaves

The Hadoop-YARN single node installation

Prerequisites

Installation steps

Step 1 – Download and extract the Hadoop bundle

Step 2 – Configure the environment variables

Step 3 – Configure the Hadoop configuration files

The core-site.xml file

The hdfs-site.xml file

The mapred-site.xml file

The yarn-site.xml file

The hadoop-env.sh and yarn-env.sh files

The slaves file

Step 4 – Format NameNode

Step 5 – Start Hadoop daemons

An overview of web user interfaces

Run a sample application

The Hadoop-YARN multi-node installation

Prerequisites

Installation steps

Step 1 – Configure the master node as a single-node Hadoop-YARN installation

Step 2 – Copy the Hadoop folder to all the slave nodes

Step 3 – Configure environment variables on slave nodes

Step 4 – Format NameNode

Step 5 – Start Hadoop daemons

An overview of the Hortonworks and Cloudera installations

Summary

3. Administering a Hadoop-YARN Cluster

Using the Hadoop-YARN commands

The user commands

Jar

Application

Command options

Sample output

Node

Command options

Sample output

Logs

Command options

Classpath

Version

Administration commands

ResourceManager / NodeManager / ProxyServer

RMAdmin

Command options

DaemonLog

Command options

Configuring the Hadoop-YARN services

The ResourceManager service

The NodeManager service

The Timeline server

The web application proxy server

Ports summary

Managing the Hadoop-YARN services

Managing service logs

Managing pid files

Monitoring the YARN services

JMX monitoring

The ResourceManager JMX beans

The NodeManager JMX beans

Ganglia monitoring

Ganglia daemons

Integrating Ganglia with Hadoop

Understanding ResourceManager's High Availability

Architecture

Failover mechanisms

Configuring ResourceManager's High Availability

Define nodes

The RM state store mechanism

The failover proxy provider

Automatic failover

High Availability admin commands

Monitoring NodeManager's health

The health checker script

Summary

4. Executing Applications Using YARN

Understanding application execution flow

Phase 1 – Application initialization and submission

Phase 2 – Allocate memory and start ApplicationMaster

Phase 3 – ApplicationMaster registration and resource allocation

Phase 4 – Launch and monitor containers

Phase 5 – Application progress report

Phase 6 – Application completion

Submitting a sample MapReduce application

Submitting an application to the cluster

Updates in the ResourceManager web UI

Understanding the application process

Tracking application details

The ApplicationMaster process

Cluster nodes information

Node's container list

YARN child processes

Application details after completion

Handling failures in YARN

The container failure

The NodeManager failure

The ResourceManager failure

YARN application logging

Services logs

Application logs

Summary

5. Understanding YARN Life Cycle Management

An introduction to state management analogy

The ResourceManager's view

View 1 – Node

View 2 – Application

View 3 – An application attempt

View 4 – Container

The NodeManager's view

View 1 – Application

View 2 – Container

View 3 – A localized resource

Analyzing transitions through logs

NodeManager registration with ResourceManager

Application submission

Container resource allocation

Resource localization

Summary

6. Migrating from MRv1 to MRv2

Introducing MRv1 and MRv2

High-level changes from MRv1 to MRv2

The evolution of the MRApplicationMaster service

Resource capability

Pluggable shuffle

Hierarchical queues and fair scheduler

Task execution as containers

The migration steps from MRv1 to MRv2

Configuration changes

The binary / source compatibility

Running and monitoring MRv1 apps on YARN

Summary

7. Writing Your Own YARN Applications

An introduction to the YARN API

YARNConfiguration

Load resources

Final properties

Variable expansion

ApplicationSubmissionContext

ContainerLaunchContext

Communication protocols

ApplicationClientProtocol

ApplicationMasterProtocol

ContainerManagementProtocol

ApplicationHistoryProtocol

YARN client API

Writing your own application

Step 1 – Create a new project and add Hadoop-YARN JAR files

Step 2 – Define the ApplicationMaster and client classes

Define an ApplicationMaster

Define a YARN client

Step 3 – Export the project and copy resources

Step 4 – Run the application using bin or the YARN command

Summary

8. Dive Deep into YARN Components

Understanding ResourceManager

The client and admin interfaces

The core interfaces

The NodeManager interfaces

The security and token managers

Understanding NodeManager

Status updates

State and health management

Container management

The security and token managers

The YARN Timeline server

The web application proxy server

YARN Scheduler Load Simulator (SLS)

Handling resource localization in YARN

Resource localization terminologies

The resource localization directory structure

Summary

9. Exploring YARN REST Services

Introduction to YARN REST services

HTTP request and response

Successful response

Response with an error

ResourceManager REST APIs

The cluster summary

Scheduler details

Nodes

Applications

NodeManager REST APIs

The node summary

Applications

Containers

MapReduce ApplicationMaster REST APIs

ApplicationMaster summary

Jobs

Tasks

MapReduce HistoryServer REST APIs

How to access REST services

RESTClient plugins

Curl command

Java API

Summary

10. Scheduling YARN Applications

An introduction to scheduling in YARN

An overview of queues

Types of queues

CapacityScheduler Queue (CSQueue)

FairScheduler Queue (FSQueue)

An introduction to schedulers

Fair scheduler

Hierarchical queues

Schedulable

Scheduling policy

Configuring a fair scheduler

CapacityScheduler

Configuring CapacityScheduler

Summary

11. Enabling Security in YARN

Adding security to a YARN cluster

Using a dedicated user group for Hadoop-YARN daemons

Validating permissions to YARN directories

Enabling the HTTPS protocol

Enabling authorization using Access Control Lists

Enabling authentication using Kerberos

Working with ACLs

Defining an ACL value

Type of ACLs

The administration ACL

The service-level ACL

The queue ACL

The application ACL

Other security frameworks

Apache Ranger

Apache Knox

Summary

12. Real-time Data Analytics Using YARN

The integration of Spark with YARN

Running Spark on YARN

The integration of Storm with YARN

Running Storm on YARN

Create a Zookeeper quorum

Download, extract, and prepare the Storm bundle

Copy Storm ZIP to HDFS

Configuring the storm.yaml file

Launching the Storm-YARN cluster

Managing Storm on YARN

The integration of HAMA and Giraph with YARN

Summary

Index

Learning YARN

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: August 2015

Production reference: 1210815

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78439-396-0

www.packtpub.com

Credits

Authors

Akhil Arora

Shrey Mehrotra

Reviewers

P. Tomas Delvechio

Swapnil Salunkhe

Parambir Singh

Jenny (Xiao) Zhang

Commissioning Editor

Sarah Crofton

Acquisition Editor

Kevin Colaco

Content Development Editor

Susmita Sabat

Technical Editor

Deepti Tuscano

Copy Editors

Merilyn Pereira

Alpha Singh

Laxmi Subramanian

Project Coordinator

Milton Dsouza

Proofreader

Safis Editing

Indexer

Tejal Soni

Production Coordinator

Melwyn D'sa

Cover Work

Melwyn D'sa

About the Authors

I dedicate this book to my parents, who are always an inspiration for me; my wife, who is my strength; and my family and friends for their faith. Last but not least, thanks to my MacBook Pro for adding the fun element and making the entire process trouble-free.

Shrey Mehrotra has more than 5 years of IT experience, and in the past 4 years, he has gained experience in designing and architecting solutions for cloud and big data domains.

Working with big data R&D Labs, he has gained insights into Hadoop, focusing on HDFS, MapReduce, and YARN. His technical strengths also include Hive, PIG, ElasticSearch, Kafka, Sqoop, Flume, and Java. During his free time, he listens to music, watches movies, and enjoys going out with friends.

I would like to thank my mom and dad for giving me support to accomplish anything I wanted. Also, I would like to thank my friends, who bear with me while I am busy writing.

Acknowledgments

The three Is—idea, intelligence, and invention correlate to an idea given by Shrey, implemented by Akhil, and which has been used in this book.

We are glad that we approached Packt Publishing for this book and they agreed. We would like to take this opportunity to thank Packt Publishing for providing us the platform and support to write this book.

Words can't express our gratitude to the editors for their professional advice and assistance in polishing the content. A special thanks to Susmita for her support and patience during the entire process. We would also like to thank the reviewers and the technical editor, who not only helped us improve the content, but also enabled us to think better.

About the Reviewers

P. Tomas Delvechio is an IT programmer with nearly 10 years of experience. He completed his graduation at Luján University, Argentina. For his thesis, he started to research big data trends and gained a lot of deep knowledge of the MapReduce approach and the Hadoop environment. He participated in many projects as a web developer and software designing, with PHP and Python as the main languages. In his university, he worked as an assistant for the subject computer networks and taught courses on Hadoop and MapReduce for distributed systems subjects and academic conferences. Also, he is a regular member of the staff of programmers in the same institution. In his free time, he is an enthusiastic user of free software and assists in the organization of conferences of diffusion on it.

Swapnil Salunkhe is a passionate software developer who works on big data. He has a keen interest in learning and implementing new technologies. He also has a passion for functional programming, machine learning, and working with complex datasets. He can be contacted via his Twitter handle at @swapnils10.

I'd like to thank Packt Publishing and their staff for providing me with an opportunity to contribute to this book.

Parambir Singh is a JVM/frontend programmer who has worked on a variety of applications in his 10 years of experience. He's currently employed as a senior developer with Atlassian and is working on building their cloud infrastructure.

I would like to thank my wife, Gurleen, for her support while I was busy reviewing different chapters for this book.

Jenny (Xiao) Zhang is a technology professional in business analytics, KPIs, and big data. She helps businesses better manage, measure, report, and analyze big data to answer critical business questions and give better experiences to customers. She has written a number of blog posts at jennyxiaozhang.com on big data, Hadoop, and YARN. She constantly shares insights on big data on Twitter at @smallnaruto. She previously reviewed another YARN book called YARN Essentials.

I would like to thank my dad, Michael (Tiegang) Zhang, for providing technical insights in the process of reviewing this book. Special thanks to Packt Publishing for this great opportunity.

www.PacktPub.com

Support files, eBooks, discount offers, and more

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

Free access for Packt account holders

If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.

Preface

Today enterprises generate huge volumes of data. In order to provide effective services and to make smarter and intelligent decisions from these huge volumes of data, enterprises use big data analytics. In recent years, Hadoop is used for massive data storage and efficient distributed processing of data. YARN framework solves design problems faced by Hadoop 1.x framework by providing a more scalable, efficient, flexible, and highly available resource management framework for distributed data processing. It provides efficient scheduling algorithms and utility components for optimized use of resources of cluster with thousands of nodes, running millions of jobs in parallel.

In this book, you'll explore what YARN provides as a business solution for distributed resource management. You will learn to configure and manage single as well as multi-node Hadoop-YARN clusters. You will also learn about the YARN daemons – ResourceManager, NodeManager, ApplicationMaster, Container, and TimeLine server, and so on.

In subsequent chapters, you will walk through YARN application life cycle management, scheduling and application execution over a Hadoop-YARN cluster. It also covers a detailed explanation of features such as High Availability, Resource Localization, and Log Aggregation. You will learn to write and manage YARN applications with ease.

Toward the end, you will learn about the security architecture and integration of YARN with big data technologies such as Spark and Storm. This book promises conceptual as well as practical knowledge of resource management using YARN.

What this book covers

Chapter 1, Starting with YARN Basics, gives a theoretical overview of YARN, its background, and need. This chapter starts with the limitations in Hadoop 1.x that leads to the evolution of a resource management framework YARN. It also covers features provided by YARN, its architecture, and advantages of using YARN as a cluster ResourceManager for a variety of batch and real-time frameworks.

Chapter 2, Setting up a Hadoop-YARN Cluster, provides a step-by-step process to set up Hadoop-YARN single-node and multi-node clusters, configuration of different YARN components and an overview of YARN's web user interface.

Chapter 3, Administering a Hadoop-YARN Cluster, provides a detailed explanation of the administrative and user commands provided by YARN. It also provides how to guides for configuring YARN, enable log aggregation, auxiliary services, Ganglia integration, JMX monitoring, and health management, and so on.

Chapter 4, Executing Applications Using YARN, explains the process of executing a YARN application over Hadoop-YARN cluster and monitoring it. This chapter describes the application flow and how the components interact during an application execution in a cluster.

Chapter 5, Understanding YARN Life Cycle Management, gives a detailed description of internal classes involved and their core functionalities. It will help readers to understand internals of state transitions of services involved in the YARN application. It will also help in troubleshooting the failures and examining the current application state.

Chapter 6, Migrating from MRv1 to MRv2, involves the steps and configuration

Enjoying the preview?

Page 1 of 1

Learning YARN

About this ebook

Akhil Arora

Related authors

Related to Learning YARN

Related ebooks

Intelligence (AI) & Semantics For You

Related podcast episodes

Related articles

Related categories

Reviews for Learning YARN

What did you think?

Book preview

Learning YARN - Akhil Arora

Table of Contents

Learning YARN

Learning YARN

Credits

About the Authors

Acknowledgments

About the Reviewers

Support files, eBooks, discount offers, and more

Why subscribe?

Preface

What this book covers