Alexey Filanovskiy
Cloudera certified developer
Scalability of the JobTracker.
Split it into two components:
- ResourceManager, handling cluster resources (CPU, RAM). One per cluster.
- ApplicationMaster, coordinating a dedicated MR job. One per MR job.
Move from the slot-based approach to resource management to a physical-resource approach (memory, CPU, disk).
Determine the amount of resources that each process can use on each node (for example, Impala can use 4 cores and 16 GB RAM, MapReduce 12 cores and 32 GB RAM).
Each map or reduce task is allocated some amount of RAM, a number of cores, and a weight for I/O operations.
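A minimal sketch of what the move from slots to physical resources means for one node: the number of tasks that fit is bounded by whichever resource runs out first. The MapReduce share (12 cores, 32 GB) comes from the example above; the per-task sizes (2 GB and 1 core, or 4 GB and 1 core) and the function name are assumptions made for illustration only.

```python
def containers_per_node(node_mem_mb, node_vcores, task_mem_mb, task_vcores):
    """Return how many identical task containers fit on one node.

    The node is full as soon as either memory or CPU is exhausted,
    so the answer is the smaller of the two per-resource limits.
    """
    by_memory = node_mem_mb // task_mem_mb
    by_cpu = node_vcores // task_vcores
    return min(by_memory, by_cpu)


# MapReduce share of a node from the example: 12 cores, 32 GB RAM.
MR_MEM_MB = 32 * 1024
MR_VCORES = 12

# Hypothetical task sizes, not taken from the slides.
small_tasks = containers_per_node(MR_MEM_MB, MR_VCORES, 2048, 1)  # CPU-bound limit
large_tasks = containers_per_node(MR_MEM_MB, MR_VCORES, 4096, 1)  # memory-bound limit
print(small_tasks, large_tasks)
```

With fixed slots the node would run the same number of tasks in both cases; with resource-based allocation the count follows the actual task size.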
Copyright 2014 Oracle and/or its affiliates. All rights reserved. |
YARN.
YARN: Yet-Another-Resource-Negotiator
Scheduler
Let's zoom in
Scheduler
FIFO Scheduler
DEMO
Fair Scheduler
All hardware resources are shared by all applications according to a config file (a set of policies).
The portion of hardware resources dedicated to each job is determined by its queue.
Each application is placed in some queue.
If the queue is not specified explicitly, the application is put in the default queue
(parameter yarn.scheduler.fair.allow-undeclared-pools should be set to false)
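The placement rule above can be sketched as a small function. This is a simplified model, not the Fair Scheduler's actual code: the function name, the `declared_queues` set and the `user_as_default_queue` flag (modelled on the `yarn.scheduler.fair.user-as-default-queue` setting) are assumptions made for illustration.

```python
def place_application(requested_queue, user, declared_queues,
                      allow_undeclared_pools=False,
                      user_as_default_queue=True):
    """Pick the queue an application lands in (simplified model).

    requested_queue: queue named by the submitter, or None if not specified.
    declared_queues: queues that exist in the allocation (config) file.
    """
    if requested_queue is not None:
        candidate = requested_queue
    elif user_as_default_queue:
        # No queue given: try a queue named after the submitting user.
        candidate = f"root.{user}"
    else:
        candidate = "root.default"

    # An undeclared queue is only usable if undeclared pools are allowed;
    # otherwise the application falls back to the default queue.
    if candidate in declared_queues or allow_undeclared_pools:
        return candidate
    return "root.default"


declared = {"root.default", "root.hdfs"}
print(place_application(None, "hdfs", declared))    # user queue is declared
print(place_application(None, "alice", declared))   # falls back to default
```

Under this model, setting `allow_undeclared_pools` to false is what keeps an application without a matching declared queue in `root.default` instead of creating a new pool for it.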
When there is a single job running, that job uses the entire cluster.
When other jobs are submitted, task slots that free up are assigned to the new jobs, so that each job gets roughly the same amount of CPU time.
The Fair Scheduler arose out of Facebook's need to share its data warehouse between multiple users.
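The sharing behaviour described above (one job gets the whole cluster, several jobs get roughly equal parts) can be illustrated with a max-min fairness sketch. Treating "fair" as max-min fairness is an assumption here, as are the function name and the abstract capacity units; the real scheduler works incrementally as slots free up rather than recomputing everything at once.

```python
def max_min_fair_shares(capacity, demands):
    """Split `capacity` among jobs so that no job gets more than it asks for
    and leftover capacity is divided equally among still-unsatisfied jobs.

    demands: dict mapping job name -> amount of capacity the job could use.
    Returns a dict mapping job name -> allocated capacity.
    """
    allocation = {job: 0.0 for job in demands}
    remaining = float(capacity)
    unsatisfied = {job for job, need in demands.items() if need > 0}

    while unsatisfied and remaining > 1e-9:
        equal_share = remaining / len(unsatisfied)
        # Jobs whose outstanding demand fits inside an equal share.
        done = {j for j in unsatisfied
                if demands[j] - allocation[j] <= equal_share}
        if not done:
            # Everyone wants more than an equal share: split evenly and stop.
            for j in unsatisfied:
                allocation[j] += equal_share
            remaining = 0.0
            break
        for j in done:
            need = demands[j] - allocation[j]
            allocation[j] += need
            remaining -= need
        unsatisfied -= done
    return allocation


print(max_min_fair_shares(12, {"job1": 100}))               # single job: whole cluster
print(max_min_fair_shares(12, {"job1": 100, "job2": 100}))  # two large jobs: half each
print(max_min_fair_shares(12, {"job1": 100, "job2": 2}))    # small job frees capacity
```

The third call shows why the shares are only "roughly" equal: a job that needs less than its share leaves the remainder to the others.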
ed user
queue you can log on as hdfs Linux user:
23(hadoop),1001(oinstall),1003(hdfs)
ob.queue.name during running MR job
yarn)
mples-2.3.0-cdh5.0.0.jar
red.job.queue.name=root.hdfs 1000000000 /tmp/test2
mple). This HQL will use root.hdfs queue
DEMO
on root.someuser pool
U resource, because it is the only job in the cluster
d) in root.hdfs pool
g file. The hdfs pool takes half of the resources; root and someuser take a quarter each
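A half / quarter / quarter split like the one above is what a weighted share produces when the pools carry weights in the ratio 2 : 1 : 1. The config file itself is not shown here, so the weights below are an assumption chosen to reproduce that split, and the function is an illustrative sketch rather than the scheduler's own logic.

```python
def weighted_shares(total, weights):
    """Divide `total` among pools in proportion to their weights."""
    weight_sum = sum(weights.values())
    return {pool: total * w / weight_sum for pool, w in weights.items()}


# Hypothetical weights; only the resulting 50/25/25 split comes from the demo.
pool_weights = {"hdfs": 2, "root": 1, "someuser": 1}
shares = weighted_shares(100, pool_weights)  # shares as a percentage of the cluster
print(shares)
```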
CapacityScheduler
CapacityScheduler is designed to allow sharing a large cluster while giving each organization a minimum capacity guarantee.
The central idea is that the available resources in the Hadoop Map-Reduce cluster are partitioned among multiple organizations who collectively fund the cluster based on computing needs.
The Capacity Scheduler from Yahoo offers similar functionality to the Fair Scheduler but takes a somewhat different philosophy.
In the Capacity Scheduler, you define a number of named queues. Each queue has a configurable number of map and reduce slots. The scheduler gives each queue its capacity when it contains jobs, and shares any unused capacity between the queues.
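The queue behaviour just described (a guaranteed capacity per queue, with unused capacity lent to busy queues) can be sketched as follows. The queue names, percentages and demands are hypothetical, and handing out spare capacity in proportion to each queue's configured capacity is an assumption; the slides only say that unused capacity is shared.

```python
def capacity_allocation(total_slots, capacity_pct, demand):
    """Allocate slots to queues: guaranteed share first, then spare capacity.

    capacity_pct: dict queue -> configured capacity as a percentage of the cluster.
    demand:       dict queue -> number of slots the queue's jobs could use.
    """
    guaranteed = {q: total_slots * pct / 100 for q, pct in capacity_pct.items()}
    # Step 1: each queue gets what it needs, up to its guaranteed capacity.
    alloc = {q: min(demand.get(q, 0), guaranteed[q]) for q in capacity_pct}
    spare = total_slots - sum(alloc.values())
    hungry = {q for q in capacity_pct if demand.get(q, 0) > alloc[q]}

    # Step 2: lend unused capacity to queues that still have pending work.
    while hungry and spare > 1e-9:
        weight_sum = sum(capacity_pct[q] for q in hungry)
        if weight_sum == 0:
            break
        handed_out = 0.0
        for q in list(hungry):
            offer = spare * capacity_pct[q] / weight_sum
            take = min(offer, demand[q] - alloc[q])
            alloc[q] += take
            handed_out += take
            if demand[q] - alloc[q] <= 1e-9:
                hungry.discard(q)
        spare -= handed_out
    return alloc


queues = {"prod": 60, "dev": 40}  # hypothetical queue capacities, in percent
print(capacity_allocation(100, queues, {"prod": 200, "dev": 200}))  # both busy
print(capacity_allocation(100, queues, {"prod": 10, "dev": 200}))   # prod mostly idle
```

When both queues are busy each keeps its guarantee; when one is mostly idle, the other borrows the unused part.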