Editor's Note: In this week's Whiteboard Walkthrough, Abizer Adenwala, Technical Support
Engineer at MapR, walks you through what a storage pool is, why disks are striped, reasons a disk
would be marked as failed, what happens when a disk is marked failed, what to watch out for before
reformatting/re-adding a disk back, and what is the best path to recover from disk failure. Here's the
video:
https://www.mapr.com/blog/handlingdiskfailuremaprfs%E2%80%93whiteboardwalkthrough#.VXY8V8qrCe 1/13
6/9/2015 HandlingDiskFailureinMapRFS#WhiteboardWalkthrough|MapR
Welcome, everyone, to a MapR Whiteboard Session. I'm going to talk about disk failures in the MapR File
System (https://www.mapr.com/blog/comparing-mapr-fs-and-hdfs-nfs-and-snapshots).
My name is Abizer Adenwala. I work in the MapR support group as a technical support engineer.
To begin with, what are storage pools? In the MapR-FS architecture, each node has multiple disks, and
all these disks are divided into storage pools. Why do we need storage pools? Say you have 10
disks: you can have all those disks in one storage pool, or you can have multiple storage pools. By
default, we have 3 disks per storage pool, and all the disks in a storage pool are striped with RAID
0. Why do we need striping? We need disk striping to get better read and write performance.
That's the reason we stripe the disks across the whole storage pool (SP).
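To make the grouping and striping concrete, here is a minimal Python sketch. It is only an illustration: the device names, the function names, and the way leftover disks end up in a smaller last pool are assumptions, not how MapR-FS actually assigns disks.

```python
from typing import List


def group_into_storage_pools(disks: List[str], disks_per_sp: int = 3) -> List[List[str]]:
    """Split a node's disks into consecutive pools of at most disks_per_sp disks."""
    return [disks[i:i + disks_per_sp] for i in range(0, len(disks), disks_per_sp)]


def disk_for_stripe(sp: List[str], stripe_index: int) -> str:
    """RAID 0 style round-robin: stripe unit n lands on disk n mod (disks in the pool)."""
    return sp[stripe_index % len(sp)]


# Ten hypothetical disks on one node.
disks = [f"/dev/sd{c}" for c in "bcdefghijk"]
pools = group_into_storage_pools(disks)
print(pools)

# Consecutive stripe units rotate across all disks of the first pool,
# which is why reads and writes can use every disk in parallel.
print([disk_for_stripe(pools[0], n) for n in range(6)])
```

Because every disk in a pool holds part of every striped region, the sketch also hints at why losing one disk affects the whole pool.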
Now, what are the reasons a disk would be marked as failed, and how do I know the reason why it
failed? Under /opt/mapr/logs, there's a faileddisk.log where, if a disk is taken offline, you will see
the reason why that disk was taken offline. There are common scenarios why disks are taken offline.
One is a CRC error, which means that a block on that disk has been corrupted, so MapR software has
taken the disk offline. It could also be that there was an I/O error or an I/O timeout. An I/O timeout
happens if the disk is slower than MapR software expects it to be. There is a property,
mfs.io.disk.timeout, in mfs.conf, where you set how slow a disk can be before we stop tolerating it.
By default, it's 60 seconds; if an I/O takes more than 60 seconds, the disk is taken offline.
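As a rough illustration of the timeout check, the sketch below parses a key=value style configuration and compares an I/O duration against mfs.io.disk.timeout. The key=value layout, the sample text, and the helper names are assumptions made for the example; only the property name and the 60-second default come from the talk.

```python
def parse_conf(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def io_timed_out(io_seconds: float, settings: dict, default_timeout: int = 60) -> bool:
    """Return True when an I/O took longer than the configured timeout (in seconds)."""
    timeout = int(settings.get("mfs.io.disk.timeout", default_timeout))
    return io_seconds > timeout


# Hypothetical excerpt, not copied from a real mfs.conf.
sample = """
# disk timeout in seconds
mfs.io.disk.timeout=60
"""
settings = parse_conf(sample)
print(io_timed_out(45.0, settings))  # within the limit
print(io_timed_out(75.0, settings))  # slower than the limit
```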
Another reason why you would see that a disk has gone offline is if, for any reason, the disk disappears
from the OS. You will see "No disk found" or "Device not found." If the disk disappears from the OS
because of a hardware problem or a disk controller fault, MapR software doesn't see that disk either,
so the disk is taken offline, and the SP itself goes offline.
So what happens when a disk fails, given that all the disks are part of a storage pool? If one of the
disks in a storage pool fails, all of the disks in that SP are taken offline. The reason is this: think
of all your data as being striped across one storage pool. When one disk fails, it's like a hole in the
SP, and the data is no longer consistent. That's why the whole SP, with all the disks on it, is taken
offline. Once the whole SP is offline, you might see a disk failure alarm. Even if you do some
When a disk is taken offline, what do you expect, apart from a failed disk alarm? You should also see
a volume under replicated alarm, or a volume data unavailable alarm. If you see a data unavailable
alarm, that's a red flag: it means at least one container of that volume doesn't have a valid
master copy. What does that mean? Say there was one disk which had some data on it, and this was
the only disk which had that data, and that disk has gone offline. That means you don't have that
data. The possibility of that is very low, because all the data across the file system has three
replicas. But if three disks on different nodes fail at the same time, then you will see a data
unavailable alarm, which is risky. That's a red flag, and we should stop there.
Another alarm you might see is data under replicated, which means that if I had three copies of my
data and one copy goes offline, I now have two copies. Because my replication count (three) is not
met, we raise that alarm. In that case, what I will do is nothing. I just have to wait
until this data gets replicated somewhere else, on some other SP on some other node.
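The difference between the two alarms can be sketched in a few lines of Python. This is a simplified model, not MapR's alarm logic: it treats a container as unavailable when none of its copies sits on an online SP (the talk's actual condition is "no valid master copy"), and the container and SP identifiers are made up.

```python
from typing import Dict, List, Set


def volume_alarms(containers: Dict[str, List[str]],
                  offline_sps: Set[str],
                  replication: int = 3) -> Set[str]:
    """Return the alarms implied by which storage pools are offline."""
    alarms = set()
    for replica_sps in containers.values():
        live = [sp for sp in replica_sps if sp not in offline_sps]
        if not live:
            # No copy of this container is reachable at all.
            alarms.add("data unavailable")
        elif len(live) < replication:
            # Some copies survive, but fewer than the replication count.
            alarms.add("data under replicated")
    return alarms


# Hypothetical layout: each container has three copies on different nodes.
containers = {
    "c1": ["nodeA:SP1", "nodeB:SP2", "nodeC:SP1"],
    "c2": ["nodeB:SP1", "nodeC:SP2", "nodeD:SP1"],
}
print(volume_alarms(containers, {"nodeA:SP1"}))
print(volume_alarms(containers, {"nodeA:SP1", "nodeB:SP2", "nodeC:SP1"}))
```

With one SP offline only the under-replicated alarm appears; the data unavailable alarm needs every copy of some container to be offline at once.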
OK, the path to recovery: how do I recover if I see a data unavailable alarm? Say I have one disk
which has failed for some reason. I first have to go and check my faileddisk.log to see the
reason my disk has failed, whether it was an I/O error or a CRC error. There are ways to bring that
SP back online. The first thing I want to do, on that node, is run mrconfig sp list and find the SP
which is offline. Now that I have an SP name, because the whole SP is offline, now
I want to run the fsck utility on it. The reason I want to do that is, when this SP was taken offline
because one of its disks failed, it was marked with an error signal. So MapR-FS doesn't bring this
SP back online, because we know that there is some problem or there was some inconsistency with it.
So, depending on what error I get in faileddisk.log, I have to run fsck with different flags.
In most cases, if I run fsck with the repair option, it will go and check each block for
consistency across the SP, and it will remove that error signal once fsck has completed successfully.
After that, all I have to do is use mrconfig with the SP name to bring that SP online. That's it. Once
I see that the SP is online, that means my data came back. Now, I want to wait until my alarm clears,
because if I just had one copy of the data, we want to wait until it replicates to some other nodes
as well, before we do
What is the other scenario? The other scenario is when you have data under replicated. With data
under replicated, there is no system admin activity that is very urgent. All you have to
do is wait for the MapR file system to self-heal. What that means is, if one of the disks
failed, and one SP was taken offline because of that, you know that the same data is there on some
other SP on some other node. Once MapR software realizes that one copy of my data is not available,
after, say, one hour, it will by default go ahead and replicate the same data to some other SP on
some other node. After it completes that replication, you won't see those alarms anymore. All you will
see is the failed disk alarm. After that, you can take the disk out and do some hardware tests, and if
you find out that there really was a problem with the disk, you can replace it,
reformat the new disk, and get that MapR-FS storage online again.
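The two recovery paths described above can be summarized as a small triage function. This is only a sketch of the decision order from the talk; the alarm strings and the function name are made up for illustration, and it does not call any MapR tool.

```python
from typing import Set


def next_step(alarms: Set[str]) -> str:
    """Pick the next admin action from the active alarms, most severe first."""
    if "data unavailable" in alarms:
        # Red flag: do not reformat anything; recover the offline SP first.
        return ("stop: check faileddisk.log, run fsck on the offline SP, "
                "bring the SP online, then wait for re-replication")
    if "data under replicated" in alarms:
        # Other copies exist, so the file system heals itself.
        return "wait: let re-replication finish"
    if "disk failure" in alarms:
        # Only the failed disk alarm is left, so the disk can be handled safely.
        return "replace: test the disk, replace it if faulty, reformat and re-add it"
    return "nothing to do"


print(next_step({"disk failure", "data unavailable"}))
print(next_step({"disk failure", "data under replicated"}))
print(next_step({"disk failure"}))
```

The ordering is the point of the sketch: the disk is only pulled and reformatted once neither data alarm is active.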
That's it for my whiteboard walkthrough. You can comment or post any questions to the link below.
You can also follow us on Twitter, @MapR, #WhiteboardWalkthrough. Thanks for watching.
Abizer Adenwala, Sr. Technical Support Engineer, MapR