3/30/2017 Big data - Wikipedia

Big data
From Wikipedia, the free encyclopedia

Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them. Challenges include capture, storage, analysis, data curation, search, sharing, transfer, visualization, querying, updating and information privacy. The term "big data" often refers simply to the use of predictive analytics, user behavior analytics, or certain other advanced data analytics methods that extract value from data, and seldom to a particular size of data set. "There is little doubt that the quantities of data now available are indeed large, but that's not the most relevant characteristic of this new data ecosystem."[2] Analysis of data sets can find new correlations to "spot business trends, prevent diseases, combat crime and so on."[3] Scientists, business executives, practitioners of medicine, advertising and governments alike regularly meet difficulties with large data sets in areas including Internet search, finance, urban informatics, and business informatics. Scientists encounter limitations in e-Science work, including meteorology, genomics,[4] connectomics, complex physics simulations, biology and environmental research.[5]

Figure: Growth of and digitization of global information storage capacity.[1]

Data sets grow rapidly in part because they are increasingly gathered by cheap and numerous information-sensing mobile devices, aerial (remote sensing), software logs, cameras, microphones, radio-frequency identification (RFID) readers and wireless sensor networks.[6][7] The world's technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s;[8] as of 2012, every day 2.5 exabytes (2.5 × 10^18) of data are generated.[9] One question for large enterprises is determining who should own big data initiatives that affect the entire organization.[10]
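
The growth figures in the paragraph above can be turned into a small worked calculation. The sketch below only assumes that "doubling every 40 months" means steady exponential growth and that an exabyte is 10^18 bytes; the function and constant names are illustrative.

```python
# Worked arithmetic for the storage-growth figures quoted above.
# Assumption: steady exponential growth with a 40-month doubling period.
DOUBLING_PERIOD_MONTHS = 40
EXABYTE = 10 ** 18  # bytes, decimal definition

def growth_factor(months: float, doubling_period: float = DOUBLING_PERIOD_MONTHS) -> float:
    """Return how many times capacity multiplies over `months`."""
    return 2 ** (months / doubling_period)

# 2.5 exabytes per day, as quoted for 2012.
daily_bytes_2012 = 2.5 * EXABYTE

if __name__ == "__main__":
    print(f"growth per year:      {growth_factor(12):.3f}x")
    print(f"growth per ten years: {growth_factor(120):.1f}x")
    print(f"bytes per day (2012): {daily_bytes_2012:.2e}")
```

Under this assumption a 40-month doubling period corresponds to roughly 23% growth per year and an eightfold increase per decade.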

Relational database management systems and desktop statistics and visualization packages often have difficulty handling big data. The work may require "massively parallel software running on tens, hundreds, or even thousands of servers".[11] What counts as "big data" varies depending on the capabilities of the users and their tools, and expanding capabilities make big data a moving target. "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."[12]

Contents
1 Definition
2 Characteristics
3 Architecture
4 Technologies
5 Applications
https://en.wikipedia.org/wiki/Big_data 1/20
5.1 Government
5.1.1 United States of America
5.1.2 India
5.1.3 United Kingdom
5.2 International development
5.3 Manufacturing
5.3.1 Cyber-physical models
5.4 Healthcare
5.5 Education
5.6 Media
5.6.1 Internet of Things (IoT)
5.6.2 Technology
5.7 Information Technology
5.7.1 Retail
5.7.2 Retail banking
5.7.3 Real estate
5.8 Science
5.8.1 Science and research
5.9 Sports
6 Research activities
6.1 Sampling big data
7 Critique
7.1 Critiques of the big data paradigm
7.2 Critiques of big data execution
8 See also
9 References
10 Further reading
11 External links

Definition
The term has been in use since the 1990s, with some giving credit to John Mashey for coining or at least making it popular.[13][14] Big data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process data within a tolerable elapsed time.[15] Big data "size" is a constantly moving target, as of 2012 ranging from a few dozen terabytes to many petabytes of data.[16] Big data requires a set of techniques and technologies with new forms of integration to reveal insights from datasets that are diverse, complex, and of a massive scale.[17]

Figure: Visualization of daily Wikipedia edits created by IBM. At multiple terabytes in size, the text and images of Wikipedia are an example of big data.

In a 2001 research report[18] and related lectures, META Group (now Gartner) defined data growth challenges and opportunities as being three-dimensional, i.e. increasing volume (amount of data), velocity (speed of data in and out), and variety (range of data types and sources). Gartner, and now much of the industry, continue to use this "3Vs" model for describing big data.[19] In 2012, Gartner updated its definition as follows: "Big data is high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization." Gartner's definition of the 3Vs is still widely used, and in agreement with a consensual definition that states that "Big Data represents the Information assets characterized by such a High Volume, Velocity and Variety to require specific Technology and Analytical Methods for its transformation into Value".[20] Additionally, a new V, "Veracity", is added by some organizations to describe it,[21] revisionism challenged by some industry authorities.[22] The 3Vs have been expanded to other complementary characteristics of big data:[23][24]

Volume: big data doesn't sample; it just observes and tracks what happens
Velocity: big data is often available in real-time
Variety: big data draws from text, images, audio, video; plus it completes missing pieces through data fusion
Machine learning: big data often doesn't ask why and simply detects patterns[25]
Digital footprint: big data is often a cost-free byproduct of digital interaction[24][26]

The growing maturity of the concept more starkly delineates the difference between big data and Business Intelligence:[27]

Business Intelligence uses descriptive statistics with data with high information density to measure things, detect trends, etc.
Big data uses inductive statistics and concepts from nonlinear system identification[28] to infer laws (regressions, nonlinear relationships, and causal effects) from large sets of data with low information density[29] to reveal relationships and dependencies, or to perform predictions of outcomes and behaviors.[28][30]
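
The contrast drawn above between descriptive and inductive statistics can be illustrated with a toy example. This is a minimal sketch under simple assumptions: the "descriptive" side is reduced to a summary of observed values, and the "inductive" side to an ordinary least-squares line, which is only one of many ways to infer a relationship. All function names are illustrative.

```python
from statistics import mean

def descriptive_summary(values):
    """Descriptive view: summarise what was actually measured."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}

def fit_line(xs, ys):
    """Inductive view: infer a law y ~ slope * x + intercept by least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def predict(model, x):
    """Use the inferred law to predict an outcome that was never observed."""
    slope, intercept = model
    return slope * x + intercept
```

The summary answers "what happened?", while the fitted model is used to answer "what is likely to happen for an input we have not seen?".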

Characteristics
Big data can be described by the following characteristics:[23][24]

Volume
The quantity of generated and stored data. The size of the data determines the value and potential insight, and whether it can actually be considered big data or not.

Variety
The type and nature of the data. This helps people who analyze it to effectively use the resulting insight.

Velocity
In this context, the speed at which the data is generated and processed to meet the demands and challenges that lie in the path of growth and development.

Variability
Inconsistency of the data set can hamper processes to handle and manage it.

Veracity
The quality of captured data can vary greatly, affecting accurate analysis.

Factory work and cyber-physical systems may have a 6C system:

Connection (sensor and networks)
Cloud (computing and data on demand)[31][32]
Cyber (model and memory)
Content/context (meaning and correlation)
Community (sharing and collaboration)
Customization (personalization and value)

Data must be processed with advanced tools (analytics and algorithms) to reveal meaningful information. For example, to manage a factory one must consider both visible and invisible issues with various components. Information generation algorithms must detect and address invisible issues such as machine degradation, component wear, etc. on the factory floor.[33][34]

Architecture
In 2000, Seisint Inc. (now LexisNexis Group) developed a C++-based distributed file-sharing framework for data storage and query. The system stores and distributes structured, semi-structured, and unstructured data across multiple servers. Users can build queries in a C++ dialect called ECL. ECL uses an "apply schema on read" method to infer the structure of stored data when it is queried, instead of when it is stored. In 2004, LexisNexis acquired Seisint Inc.[35] and in 2008 acquired ChoicePoint, Inc.[36] and their high-speed parallel processing platform. The two platforms were merged into HPCC (or High-Performance Computing Cluster) Systems and in 2011, HPCC was open-sourced under the Apache v2.0 License. Quantcast File System was available about the same time.[37]
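
The "apply schema on read" idea described above can be sketched in a few lines. The example is written in Python rather than ECL and is not how HPCC implements it; the record contents, the `RAW_RECORDS` name and the `read_with_schema` helper are all hypothetical. The point is only that raw records are stored untouched and a structure is imposed at query time.

```python
import json

# Raw records are stored as-is; nothing validates or reshapes them on write.
# (Hypothetical sample data for illustration.)
RAW_RECORDS = [
    '{"name": "Ada", "age": "36"}',
    '{"name": "Grace", "city": "Arlington"}',
]

def read_with_schema(raw_lines, schema):
    """Apply `schema` (field name -> converter) only when records are read.

    Fields missing from a record come back as None; fields that are not in
    the schema are ignored for this query but remain in storage.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            field: (convert(record[field]) if field in record else None)
            for field, convert in schema.items()
        }
```

Two queries can read the same stored lines with different schemas, for example `{"name": str, "age": int}` for one and `{"name": str, "city": str}` for another, without rewriting the stored data.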

In 2004, Google published a paper on a process called MapReduce that uses a similar architecture. The MapReduce concept provides a parallel processing model, and an associated implementation was released to process huge amounts of data. With MapReduce, queries are split and distributed across parallel nodes and processed in parallel (the Map step). The results are then gathered and delivered (the Reduce step). The framework was very successful,[38] so others wanted to replicate the algorithm. Therefore, an implementation of the MapReduce framework was adopted by an Apache open-source project named Hadoop.[39]
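
The Map and Reduce steps described above can be sketched with the classic word-count example. This is a single-process illustration, not the Google or Hadoop implementation: in a real deployment each chunk would be mapped on a different node and the grouping ("shuffle") would happen over the network. The function names are illustrative.

```python
from collections import defaultdict

def map_step(chunk):
    """Map: turn one chunk of text into (key, value) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Group all values by key so each key can be reduced independently."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_step(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

def word_count(chunks):
    """Run map over every chunk, then shuffle and reduce the results."""
    mapped = [pair for chunk in chunks for pair in map_step(chunk)]
    return dict(reduce_step(key, values) for key, values in shuffle(mapped).items())
```

Because `map_step` looks at one chunk only and `reduce_step` at one key only, both steps can be spread across many machines without coordination beyond the shuffle.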

MIKE2.0 is an open approach to information management that acknowledges the need for revisions due to big data implications identified in an article titled "Big Data Solution Offering".[40] The methodology addresses handling big data in terms of useful permutations of data sources, complexity in interrelationships, and difficulty in deleting (or modifying) individual records.[41]

2012 studies showed that a multiple-layer architecture is one option to address the issues that big data presents. A distributed parallel architecture distributes data across multiple servers; these parallel execution environments can dramatically improve data processing speeds. This type of architecture inserts data into a parallel DBMS, which implements the use of MapReduce and Hadoop frameworks. This type of framework looks to make the processing power transparent to the end user by using a front-end application server.[42]

Big data analytics for manufacturing applications is marketed as a 5C architecture (connection, conversion, cyber, cognition, and configuration).[43]

The data lake allows an organization to shift its focus from centralized control to a shared model to respond to the changing dynamics of information management. This enables quick segregation of data into the data lake, thereby reducing the overhead time.[44][45]

Technologies
A 2011 McKinsey Global Institute report characterizes the main components and ecosystem of big data as follows:[46]

Techniques for analyzing data, such as A/B testing, machine learning and natural language processing
Big data technologies, like business intelligence, cloud computing and databases
Visualization, such as charts, graphs and other displays of the data

Multidimensional big data can also be represented as tensors, which can be more efficiently handled by tensor-based computation,[47] such as multilinear subspace learning.[48] Additional technologies being applied to big data include massively parallel-processing (MPP) databases, search-based applications, data mining,[49] distributed file systems, distributed databases, cloud-based infrastructure (applications, storage and computing resources) and the Internet.

Some but not all MPP relational databases have the ability to store and manage petabytes of data. Implicit is the ability to load, monitor, back up, and optimize the use of the large data tables in the RDBMS.[50]

DARPA's Topological Data Analysis program seeks the fundamental structure of massive data sets, and in 2008 the technology went public with the launch of a company called Ayasdi.[51]

The practitioners of big data analytics processes are generally hostile to slower shared storage,[52] preferring direct-attached storage (DAS) in its various forms, from solid state drive (SSD) to high-capacity SATA disk buried inside parallel processing nodes. The perception of shared storage architectures, storage area network (SAN) and network-attached storage (NAS), is that they are relatively slow, complex, and expensive. These qualities are not consistent with big data analytics systems that thrive on system performance, commodity infrastructure, and low cost.

Real or near-real time information delivery is one of the defining characteristics of big data analytics. Latency is therefore avoided whenever and wherever possible. Data in memory is good; data on spinning disk at the other end of a FC SAN connection is not. The cost of a SAN at the scale needed for analytics applications is very much higher than other storage techniques.

There are advantages as well as disadvantages to shared storage in big data analytics, but big data analytics practitioners as of 2011 did not favour it.[53]

Applications
Big data has increased the demand for information management specialists so much so that Software AG, Oracle Corporation, IBM, Microsoft, SAP, EMC, HP and Dell have spent more than $15 billion on software firms specializing in data management and analytics. In 2010, this industry was worth more than $100 billion and was growing at almost 10 percent a year: about twice as fast as the software business as a whole.[3]

Figure: Bus wrapped with SAP Big data parked outside IDF13.

Developed economies increasingly use data-intensive technologies. There are 4.6 billion mobile-phone subscriptions worldwide, and between 1 billion and 2 billion people accessing the internet.[3] Between 1990 and 2005, more than 1 billion people worldwide entered the middle class, which means more people became more literate, which in turn led to information growth. The world's effective capacity to exchange information through telecommunication networks was 281 petabytes in 1986, 471 petabytes in 1993, 2.2 exabytes in 2000, 65 exabytes in 2007,[8] and predictions put the amount of internet traffic at 667 exabytes annually by 2014.[3] According to one estimate, one third of the globally stored information is in the form of alphanumeric text and still image data,[54] which is the format most useful for most big data applications. This also shows the potential of yet unused data (i.e. in the form of video and audio content).

While many vendors offer off-the-shelf solutions for big data, experts recommend the development of in-house solutions custom-tailored to solve the company's problem at hand if the company has sufficient technical capabilities.[55]

Government

The use and adoption of big data within governmental processes allows efficiencies in terms of cost, productivity, and innovation,[56] but does not come without its flaws. Data analysis often requires multiple parts of government (central and local) to work in collaboration and create new and innovative processes to deliver the desired outcome. Below are some examples of initiatives in the governmental big data space.

United States of America

In 2012, the Obama administration announced the Big Data Research and Development Initiative, to explore how big data could be used to address important problems faced by the government.[57] The initiative is composed of 84 different big data programs spread across six departments.[58]
Big data analysis played a large role in Barack Obama's successful 2012 re-election campaign.[59]
The United States Federal Government owns six of the ten most powerful supercomputers in the world.[60]
The Utah Data Center has been constructed by the United States National Security Agency. When finished, the facility will be able to handle a large amount of information collected by the NSA over the Internet. The exact amount of storage space is unknown, but more recent sources claim it will be on the order of a few exabytes.[61][62][63]

India

Big data analysis was in part responsible for the BJP winning the Indian General Election 2014.[64]
The Indian government utilizes numerous techniques to ascertain how the Indian electorate is responding to government action, as well as ideas for policy augmentation.

United Kingdom

Examples of uses of big data in public services:

Data on prescription drugs: by connecting origin, location and the time of each prescription, a research unit was able to exemplify the considerable delay between the release of any given drug, and a UK-wide adaptation of the National Institute for Health and Care Excellence guidelines. This suggests that new or most up-to-date drugs take some time to filter through to the general patient.[65]
Joining up data: a local authority blended data about services, such as road gritting rotas, with services for people at risk, such as 'meals on wheels'. The connection of data allowed the local authority to avoid any weather-related delay.[66]

International development

Research on the effective usage of information and communication technologies for development (also known as ICT4D) suggests that big data technology can make important contributions but also present unique challenges to international development.[67][68] Advancements in big data analysis offer cost-effective opportunities to improve decision-making in critical development areas such as health care, employment, economic productivity, crime, security, and natural disaster and resource management.[69][70][71] Additionally, user-generated data offers new opportunities to give the unheard a voice.[72] However, longstanding challenges for developing regions such as inadequate technological infrastructure and economic and human resource scarcity exacerbate existing concerns with big data such as privacy, imperfect methodology, and interoperability issues.[69]

Manufacturing

Based on the TCS 2013 Global Trend Study, improvements in supply planning and product quality provide the greatest benefit of big data for manufacturing. Big data provides an infrastructure for transparency in the manufacturing industry, which is the ability to unravel uncertainties such as inconsistent component performance and availability. Predictive manufacturing, as an applicable approach toward near-zero downtime and transparency, requires vast amounts of data and advanced prediction tools to systematically process data into useful information.[73] A conceptual framework of predictive manufacturing begins with data acquisition, where different types of sensory data are available to acquire, such as acoustics, vibration, pressure, current, voltage and controller data. Vast amounts of sensory data, in addition to historical data, make up the big data in manufacturing. The generated big data acts as the input into predictive tools and preventive strategies such as Prognostics and Health Management (PHM).[74][75]

Cyber-physical models

Current PHM implementations mostly use data during the actual usage, while analytical algorithms can perform more accurately when more information throughout the machine's lifecycle, such as system configuration, physical knowledge and working principles, is included. There is a need to systematically integrate, manage and analyze machinery or process data during different stages of the machine life cycle to handle data/information more efficiently and further achieve better transparency of machine health condition for the manufacturing industry.

With such motivation a cyber-physical (coupled) model scheme has been developed. The coupled model is a digital twin of the real machine that operates in the cloud platform and simulates the health condition with integrated knowledge from both data-driven analytical algorithms and other available physical knowledge. It can also be described as a 5S systematic approach consisting of sensing, storage, synchronization, synthesis and service. The coupled model first constructs a digital image from the early design stage. System information and physical knowledge are logged during product design, based on which a simulation model is built as a reference for future analysis. Initial parameters may be statistically generalized, and they can be tuned using data from testing or the manufacturing process using parameter estimation. After that step, the simulation model can be considered a mirrored image of the real machine, able to continuously record and track machine condition during the later utilization stage. Finally, with the increased connectivity offered by cloud computing technology, the coupled model also provides better accessibility of machine condition for factory managers in cases where physical access to actual equipment or machine data is limited.[34]

Healthcare

Big data analytics has helped healthcare improve by providing personalized medicine and prescriptive analytics, clinical risk intervention and predictive analytics, waste and care variability reduction, automated external and internal reporting of patient data, standardized medical terms and patient registries and fragmented point solutions.[76] Some areas of improvement are more aspirational than actually implemented. The level of data generated within healthcare systems is not trivial. With the added adoption of mHealth, eHealth and wearable technologies the volume of data will continue to increase. This includes electronic health record data, imaging data, patient-generated data, sensor data, and other forms of difficult-to-process data. There is now an even greater need for such environments to pay greater attention to data and information quality.[77] "Big data very often means 'dirty data' and the fraction of data inaccuracies increases with data volume growth." Human inspection at the big data scale is impossible and there is a desperate need in health service for intelligent tools for accuracy and believability control and handling of information missed.[78] While extensive information in healthcare is now electronic, it fits under the big data umbrella as most is unstructured and difficult to use.[79]

Education

A McKinsey Global Institute study found a shortage of 1.5 million highly trained data professionals and managers,[46] and a number of universities,[80] including University of Tennessee and UC Berkeley, have created master's programs to meet this demand. Private bootcamps have also developed programs to meet that demand, including free programs like The Data Incubator or paid programs like General Assembly.[81]

Media

To understand how the media utilises big data, it is first necessary to provide some context into the mechanism used for media process. It has been suggested by Nick Couldry and Joseph Turow that practitioners in Media and Advertising approach big data as many actionable points of information about millions of individuals. The industry appears to be moving away from the traditional approach of using specific media environments such as newspapers, magazines, or television shows and instead taps into consumers with technologies that reach targeted people at optimal times in optimal locations. The ultimate aim is to serve, or convey, a message or content that is (statistically speaking) in line with the consumer's mindset. For example, publishing environments are increasingly tailoring messages (advertisements) and content (articles) to appeal to consumers that have been exclusively gleaned through various data-mining activities.[82]

Targeting of consumers (for advertising by marketers)
Data capture
Data journalism: publishers and journalists use big data tools to provide unique and innovative insights and infographics.

Internet of Things (IoT)

Big data and the IoT work in conjunction. Data extracted from IoT devices provides a mapping of device inter-connectivity. Such mappings have been used by the media industry, companies and governments to more accurately target their audience and increase media efficiency. IoT is also increasingly adopted as a means of gathering sensory data, and this sensory data has been used in medical[83] and manufacturing[84] contexts.

Technology

eBay.com uses two data warehouses at 7.5 petabytes and 40 PB as well as a 40 PB Hadoop cluster for search, consumer recommendations, and merchandising.[85]
Amazon.com handles millions of back-end operations every day, as well as queries from more than half a million third-party sellers. The core technology that keeps Amazon running is Linux-based and as of 2005 they had the world's three largest Linux databases, with capacities of 7.8 TB, 18.5 TB, and 24.7 TB.[86]
Facebook handles 50 billion photos from its user base.[87]
Google was handling roughly 100 billion searches per month as of August 2012.[88]
Oracle NoSQL Database has been tested to past the 1M ops/sec mark with 8 shards and proceeded to hit 1.2M ops/sec with 10 shards.[89]
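
The shard figures in the last item of the list above lend themselves to a quick per-shard calculation. The sketch treats "past the 1M ops/sec mark" as exactly 1,000,000 operations per second, which is an assumption, and the helper names are illustrative.

```python
def ops_per_shard(total_ops_per_sec: float, shards: int) -> float:
    """Average throughput contributed by each shard."""
    return total_ops_per_sec / shards

def scaling_efficiency(base_total, base_shards, new_total, new_shards) -> float:
    """Per-shard throughput after scaling, relative to before (1.0 = linear)."""
    return ops_per_shard(new_total, new_shards) / ops_per_shard(base_total, base_shards)

# Figures quoted above; 1.0M is assumed to be exact for illustration.
per_shard_8 = ops_per_shard(1_000_000, 8)
per_shard_10 = ops_per_shard(1_200_000, 10)
efficiency = scaling_efficiency(1_000_000, 8, 1_200_000, 10)
```

Under that assumption each shard handles about 125,000 ops/sec at 8 shards and 120,000 ops/sec at 10 shards, so adding shards scales throughput at close to, but slightly below, a linear rate.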

Information Technology

Especially since 2015, big data has come to prominence within Business Operations as a tool to help employees work more efficiently and streamline the collection and distribution of Information Technology (IT). The use of big data to resolve IT and data collection issues within an enterprise is called IT Operations Analytics (ITOA).[90] By applying big data principles into the concepts of machine intelligence and deep computing, IT departments can predict potential issues and move to provide solutions before the problems even happen.[90] In this time, ITOA businesses were also beginning to play a major role in systems management by offering platforms that brought individual data silos together and generated insights from the whole of the system rather than from isolated pockets of data.

Retail

Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes (2560 terabytes) of data, the equivalent of 167 times the information contained in all the books in the US Library of Congress.[3]

Retail banking

FICO Card Detection System protects accounts worldwide.[91]
The volume of business data worldwide, across all companies, doubles every 1.2 years, according to estimates.[92][93]

Real estate

Windermere Real Estate uses anonymous GPS signals from nearly 100 million drivers to help new home buyers determine their typical drive times to and from work throughout various times of the day.[94]

Science

The Large Hadron Collider experiments represent about 150 million sensors delivering data 40 million times per second. There are nearly 600 million collisions per second. After filtering and refraining from recording more than 99.99995%[95] of these streams, there are 100 collisions of interest per second.[96][97][98]

As a result, only working with less than 0.001% of the sensor stream data, the data flow from all four LHC experiments represents 25 petabytes annual rate before replication (as of 2012). This becomes nearly 200 petabytes after replication.
If all sensor data were recorded in LHC, the data flow would be extremely hard to work with. The data flow would exceed 150 million petabytes annual rate, or nearly 500 exabytes per day, before replication. To put the number in perspective, this is equivalent to 500 quintillion (5 × 10^20) bytes per day, almost 200 times more than all the other sources combined in the world.
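
The LHC figures above can be cross-checked with a short calculation. The sketch assumes decimal units (1 PB = 10^15 bytes, 1 EB = 10^18 bytes) and a 365-day year, neither of which is stated in the text; the function names are illustrative.

```python
PETABYTE = 10 ** 15  # bytes (decimal units assumed)
EXABYTE = 10 ** 18

def exabytes_per_day(petabytes_per_year: float, days_per_year: int = 365) -> float:
    """Convert an annual rate in petabytes into a daily rate in exabytes."""
    return petabytes_per_year * PETABYTE / days_per_year / EXABYTE

def discarded_percent(kept_per_second: float, total_per_second: float) -> float:
    """Share of events that the filters throw away, in percent."""
    return (1 - kept_per_second / total_per_second) * 100

# 150 million PB per year of unfiltered data, as quoted above.
unfiltered_eb_per_day = exabytes_per_day(150e6)

# 100 collisions of interest out of roughly 600 million per second.
filtered_out = discarded_percent(100, 600e6)

# 500 EB/day compared with the 2.5 EB/day figure quoted for 2012.
ratio_to_world = 500 / 2.5
```

With these assumptions the unfiltered flow comes to roughly 410 exabytes per day, the same order of magnitude as the rounded "nearly 500 exabytes" in the text; the filter discards about 99.99998% of collisions, consistent with "more than 99.99995%"; and 500 EB/day is 200 times the 2.5 EB/day figure.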

The Square Kilometre Array is a radio telescope built of thousands of antennas. It is expected to be operational by 2024. Collectively, these antennas are expected to gather 14 exabytes and store one petabyte per day.[99][100] It is considered one of the most ambitious scientific projects ever undertaken.[101]

Science and research

When the Sloan Digital Sky Survey (SDSS) began to collect astronomical data in 2000, it amassed more in its first few weeks than all data collected in the history of astronomy previously. Continuing at a rate of about 200 GB per night, SDSS has amassed more than 140 terabytes of information.[3] When the Large Synoptic Survey Telescope, successor to SDSS, comes online in 2020, its designers expect it to acquire that amount of data every five days.[3]
Decoding the human genome originally took 10 years to process; now it can be achieved in less than a day. The DNA sequencers have divided the sequencing cost by 10,000 in the last ten years, which is 100 times cheaper than the reduction in cost predicted by Moore's Law.[102]
The NASA Center for Climate Simulation (NCCS) stores 32 petabytes of climate observations and simulations on the Discover supercomputing cluster.[103][104]
Google's DNAStack compiles and organizes DNA samples of genetic data from around the world to identify diseases and other medical defects. These fast and exact calculations eliminate any 'friction points,' or human errors that could be made by one of the numerous science and biology experts working with the DNA. DNAStack, a part of Google Genomics, allows scientists to use the vast sample of resources from Google's search server to scale social experiments that would usually take years, instantly.[105][106]
23andMe's DNA database contains genetic information of over 1,000,000 people worldwide.[107] The company explores selling the "anonymous aggregated genetic data" to other researchers and pharmaceutical companies for research purposes if patients give their consent.[108][109][110][111][112] Ahmad Hariri, professor of psychology and neuroscience at Duke University, who has been using 23andMe in his research since 2009, states that the most important aspect of the company's new service is that it makes genetic research accessible and relatively cheap for scientists.[108] A study that identified 15 genome sites linked to depression in 23andMe's database led to a surge in demands to access the repository, with 23andMe fielding nearly 20 requests to access the depression data in the two weeks after publication of the paper.[113]
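
The sequencing-cost comparison with Moore's Law in the list above can be reproduced with a short calculation. The text does not say which doubling period it has in mind; the sketch assumes the commonly cited 18-month period and also shows the 24-month variant, so the outcome depends on that assumption. Names are illustrative.

```python
def moore_factor(years: float, doubling_months: float = 18) -> float:
    """Improvement factor after `years` if performance doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)

SEQUENCING_FACTOR = 10_000  # cost reduction over ten years, as quoted above

# How much faster sequencing cost fell than a Moore's-Law-style curve would predict.
ratio_18_months = SEQUENCING_FACTOR / moore_factor(10, doubling_months=18)
ratio_24_months = SEQUENCING_FACTOR / moore_factor(10, doubling_months=24)
```

With an 18-month doubling period the ten-year factor is about 100, so a 10,000-fold drop is roughly 100 times larger, in line with the figure quoted; with a 24-month period the factor is 32 and the gap would be closer to 300 times.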

Sports

Big data can be used to improve training and understanding competitors, using sport sensors. It is also possible to predict winners in a match using big data analytics.[114] Future performance of players could be predicted as well. Thus, players' value and salary is determined by data collected throughout the season.[115]

The movie MoneyBall demonstrates how big data could be used to scout players and also identify undervalued players.[116]

In Formula One races, race cars with hundreds of sensors generate terabytes of data. These sensors collect data points from tire pressure to fuel burn efficiency.[117] Based on the data, engineers and data analysts decide whether adjustments should be made in order to win a race. Besides, using big data, race teams try to predict the time they will finish the race beforehand, based on simulations using data collected over the season.[118]

Research activities
Encrypted search and cluster formation in big data was demonstrated in March 2014 at the American Society of Engineering Education. Gautam Siwach engaged at Tackling the challenges of Big Data by MIT Computer Science and Artificial Intelligence Laboratory and Dr. Amir Esmailpour at UNH Research Group investigated the key features of big data as formation of clusters and their interconnections. They focused on the security of big data and the actual orientation of the term towards the presence of different types of data in an encrypted form at the cloud interface by providing the raw definitions and real-time examples within the technology. Moreover, they proposed an approach for identifying the encoding technique to advance towards an expedited search over encrypted text leading to the security enhancements in big data.[119]

In March 2012, The White House announced a national "Big Data Initiative" that consisted of six Federal departments and agencies committing more than $200 million to big data research projects.[120]

The initiative included a National Science Foundation "Expeditions in Computing" grant of $10 million over 5 years to the AMPLab[121] at the University of California, Berkeley.[122] The AMPLab also received funds from DARPA, and over a dozen industrial sponsors and uses big data to attack a wide range of problems from predicting traffic congestion[123] to fighting cancer.[124]

The White House Big Data Initiative also included a commitment by the Department of Energy to provide $25 million in funding over 5 years to establish the Scalable Data Management, Analysis and Visualization (SDAV) Institute,[125] led by the Energy Department's Lawrence Berkeley National Laboratory. The SDAV Institute aims to bring together the expertise of six national laboratories and seven universities to develop new tools to help scientists manage and visualize data on the Department's supercomputers.

The U.S. state of Massachusetts announced the Massachusetts Big Data Initiative in May 2012, which provides funding from the state government and private companies to a variety of research institutions.[126] The Massachusetts Institute of Technology hosts the Intel Science and Technology Center for Big Data in the MIT Computer Science and Artificial Intelligence Laboratory, combining government, corporate, and institutional funding and research efforts.[127]

The European Commission is funding the 2-year-long Big Data Public Private Forum through their Seventh Framework Program to engage companies, academics and other stakeholders in discussing big data issues. The project aims to define a strategy in terms of research and innovation to guide supporting actions from the European Commission in the successful implementation of the big data economy. Outcomes of this project will be used as input for Horizon 2020, their next framework program.[128]

The British government announced in March 2014 the founding of the Alan Turing Institute, named after the computer pioneer and code-breaker, which will focus on new ways to collect and analyse large data sets.[129]

At the University of Waterloo Stratford Campus Canadian Open Data Experience (CODE) Inspiration Day, participants demonstrated how using data visualization can increase the understanding and appeal of big data sets and communicate their story to the world.[130]

To make manufacturing more competitive in the United States (and globe), there is a need to integrate more American ingenuity and innovation into manufacturing. Therefore, National Science Foundation has granted the Industry University cooperative research center for Intelligent Maintenance Systems (IMS) at University of Cincinnati to focus on developing advanced predictive tools and techniques to be applicable in a big data environment.[131] In May 2013, IMS Center held an industry advisory board meeting focusing on big data where presenters from various industrial companies discussed their concerns, issues and future goals in big data environment.

Computational social sciences: anyone can use Application Programming Interfaces (APIs) provided by big data holders, such as Google and Twitter, to do research in the social and behavioral sciences.[132] Often these APIs are provided for free.[132] Tobias Preis et al. used Google Trends data to demonstrate that Internet users from countries with a higher per capita gross domestic product (GDP) are more likely to search for information about the future than information about the past. The findings suggest there may be a link between online behaviour and real-world economic indicators.[133][134][135] The authors of the study examined Google query logs, taking the ratio of the volume of searches for the coming year ('2011') to the volume of searches for the previous year ('2009'), which they call the 'future orientation index'.[136] They compared the future orientation index to the per capita GDP of each country, and found a strong tendency for countries where Google users inquire more about the future to have a higher GDP. The results hint that there may potentially be a relationship between the economic success of a country and the information-seeking behavior of its citizens captured in big data.
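
The 'future orientation index' described above is a simple ratio, which the following sketch computes. The search volumes in the usage note are hypothetical placeholders, not Google Trends data, and the function name is illustrative rather than taken from the study.

```python
def future_orientation_index(searches_next_year: float, searches_previous_year: float) -> float:
    """Ratio of search volume for the coming year to that for the previous year.

    Values above 1 mean users searched more for the future than for the past.
    """
    if searches_previous_year <= 0:
        raise ValueError("previous-year search volume must be positive")
    return searches_next_year / searches_previous_year
```

For example, with hypothetical volumes of 120 searches for '2011' and 100 searches for '2009', `future_orientation_index(120, 100)` gives 1.2; the study then compares such per-country values with per capita GDP.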

https://en.wikipedia.org/wiki/Big_data 11/20
3/30/2017 BigdataWikipedia

Tobias Preis and his colleagues Helen Susannah Moat and H. Eugene Stanley introduced a method to identify
online precursors for stock market moves, using trading strategies based on search volume data provided by
Google Trends.[137] Their analysis of Google search volume for 98 terms of varying financial relevance, published
in Scientific Reports,[138] suggests that increases in search volume for financially relevant search terms tend to
precede large losses in financial markets.[139][140][141][142][143][144][145][146]

Big data sets come with algorithmic challenges that previously did not exist. Hence, there is a need to
fundamentally change the way such data are processed.[147]

The Workshops on Algorithms for Modern Massive Data Sets (MMDS) bring together computer scientists,
statisticians, mathematicians, and data analysis practitioners to discuss algorithmic challenges of big data.[148]

Sampling big data

An important research question that can be asked about big data sets is whether the full data must be examined to
draw certain conclusions about the properties of the data, or whether a sample is good enough. The name big data
itself contains a term related to size, and this is an important characteristic of big data. But sampling
(statistics) enables the selection of the right data points from within the larger data set to estimate the
characteristics of the whole population. For example, there are about 600 million tweets produced every day. Is it
necessary to look at all of them to determine the topics that are discussed during the day? Is it necessary to look
at all the tweets to determine the sentiment on each of the topics? In manufacturing, different types of sensory
data such as acoustics, vibration, pressure, current, voltage and controller data are available at short time
intervals. To predict downtime it may not be necessary to look at all the data; a sample may be sufficient. Big
data can be broken down by various data point categories such as demographic, psychographic, behavioral, and
transactional data. With large sets of data points, marketers are able to create and use more customized segments
of consumers for more strategic targeting.

There has been some work done on sampling algorithms for big data. A theoretical formulation for sampling
Twitter data has been developed.[149]
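
The sampling question raised above can be sketched with reservoir sampling (Algorithm R), a standard way to draw a uniform random sample from a stream that is too large to hold in memory. It is used here purely as an illustration and is not claimed to be the method of the cited Twitter work; the function name, the synthetic stream and the sample size are assumptions.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int,
                     seed: Optional[int] = None) -> List[T]:
    """Return a uniform random sample of up to k items from a stream of unknown length."""
    if k <= 0:
        raise ValueError("k must be positive")
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1),
            # replacing a uniformly chosen slot.
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Synthetic stand-in for a large stream (for example, message identifiers).
stream = range(100_000)
sample = reservoir_sample(stream, k=1_000, seed=42)

# Estimate a population property from the sample: the share of even identifiers.
estimated_share = sum(1 for x in sample if x % 2 == 0) / len(sample)
print(f"estimated share of even identifiers: {estimated_share:.3f}")
```

The sample uses memory proportional to k rather than to the stream length, which is the design reason for preferring it when the full data cannot be stored or scanned repeatedly.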

Critique
Critiques of the big data paradigm come in two flavors: those that question the implications of the approach itself,
and those that question the way it is currently done.[150] One approach to this criticism is the field of critical
data studies.

Critiques of the big data paradigm

"A crucial problem is that we do not know much about the underlying empirical micro-processes that lead to the
emergence of the[se] typical network characteristics of Big Data".[15] In their critique, Snijders, Matzat, and Reips
point out that often very strong assumptions are made about mathematical properties that may not at all reflect
what is really going on at the level of micro-processes. Mark Graham has leveled broad critiques at Chris
Anderson's assertion that big data will spell the end of theory,[151] focusing in particular on the notion that big
data must always be contextualized in their social, economic, and political contexts.[152] Even as companies invest
eight- and nine-figure sums to derive insight from information streaming in from suppliers and customers, less than
40% of employees have sufficiently mature processes and skills to do so. To overcome this insight deficit, big data,
no matter how comprehensive or well analysed, must be complemented by "big judgment," according to an article
in the Harvard Business Review.[153]

Much in the same line, it has been pointed out that decisions based on the analysis of big data are inevitably
"informed by the world as it was in the past, or, at best, as it currently is".[69] Fed by a large amount of data on
past experiences, algorithms can predict future development if the future is similar to the past.[154] If the system
dynamics of the future change (if it is not a stationary process), the past can say little about the future. In order
to make predictions in changing environments, it would be necessary to have a thorough understanding of the
system's dynamics, which requires theory.[154] As a response to this critique it has been suggested to combine big
data approaches with computer simulations, such as agent-based models[69] and complex systems. Agent-based
models are getting increasingly better at predicting the outcome of social complexities of even unknown future
scenarios through computer simulations that are based on a collection of mutually interdependent
algorithms.[155][156] In addition, the use of multivariate methods that probe for the latent structure of the data,
such as factor analysis and cluster analysis, has proven useful as an analytic approach that goes well beyond the
bivariate approaches (cross-tabs) typically employed with smaller data sets.

In health and biology, conventional scientific approaches are based on experimentation. For these approaches, the
limiting factor is the relevant data that can confirm or refute the initial hypothesis.[157] A new postulate is now
accepted in biosciences: the information provided by data in huge volumes (omics) without a prior hypothesis is
complementary and sometimes necessary to conventional approaches based on experimentation.[158][159] In the
massive approaches it is the formulation of a relevant hypothesis to explain the data that is the limiting
factor.[160] The search logic is reversed and the limits of induction ("Glory of Science and Philosophy scandal",
C. D. Broad, 1926) are to be considered.

Privacy advocates are concerned about the threat to privacy represented by increasing storage and integration of
personally identifiable information; expert panels have released various policy recommendations to conform
practice to expectations of privacy.[161][162][163]

Critiques of big data execution

Ulf-Dietrich Reips and Uwe Matzat wrote in 2014 that big data had become a "fad" in scientific research.[132]
Researcher Danah Boyd has raised concerns that the use of big data in science neglects principles such as
choosing a representative sample, because researchers are too concerned with actually handling the huge amounts
of data.[164] This approach may lead to results that are biased in one way or another. Integration across
heterogeneous data resources (some that might be considered big data and others not) presents formidable
logistical as well as analytical challenges, but many researchers argue that such integrations are likely to
represent the most promising new frontiers in science.[165] In the provocative article "Critical Questions for Big
Data",[166] the authors call big data a part of mythology: "large data sets offer a higher form of intelligence and
knowledge [...], with the aura of truth, objectivity, and accuracy". Users of big data are often "lost in the sheer
volume of numbers", and "working with Big Data is still subjective, and what it quantifies does not necessarily
have a closer claim on objective truth".[166] Recent developments in the BI domain, such as pro-active reporting,
especially target improvements in the usability of big data through automated filtering of non-useful data and
correlations.[167]

[Image: Danah Boyd]

Big data analysis is often shallow compared to analysis of smaller data sets.[168] In many big data projects, there
is no large data analysis happening; the challenge is the extract, transform, load part of data preprocessing.[168]

Big data is a buzzword and a "vague term",[169][170] but at the same time an "obsession"[170] with entrepreneurs,
consultants, scientists and the media. Big data showcases such as Google Flu Trends failed to deliver good
predictions in recent years, overstating the flu outbreaks by a factor of two. Similarly, Academy Awards and
election predictions based solely on Twitter were more often off than on target. Big data often poses the same
challenges as small data; adding more data does not solve problems of bias, but may emphasize other
problems. In particular, data sources such as Twitter are not representative of the overall population, and results
drawn from such sources may then lead to wrong conclusions. Google Translate, which is based on big data
statistical analysis of text, does a good job at translating web pages. However, results from specialized domains
may be dramatically skewed. On the other hand, big data may also introduce new problems, such as the multiple
comparisons problem: simultaneously testing a large set of hypotheses is likely to produce many false results that
mistakenly appear significant. Ioannidis argued that "most published research findings are false"[171] due to
essentially the same effect: when many scientific teams and researchers each perform many experiments (i.e.
process a big amount of scientific data, although not with big data technology), the likelihood of a "significant"
result being actually false grows fast, even more so when only positive results are published. Furthermore, big
data analytics results are only as good as the model on which they are predicated. In one example, big data took
part in attempting to predict the results of the 2016 U.S. Presidential Election[172] with varying degrees of success.
Forbes predicted "If you believe in Big Data analytics, it's time to begin planning for a Hillary Clinton presidency
and all that entails.".[173]
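
The multiple comparisons problem mentioned above can be made concrete with a little arithmetic. The sketch below assumes that every test is independent, that every null hypothesis is true, and that each test uses the same significance level; these simplifying assumptions, the function names and the default level of 0.05 are illustrative choices, not taken from the cited sources.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among independent tests.

    Assumes all null hypotheses are true and tests are independent.
    """
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return 1.0 - (1.0 - alpha) ** num_tests


def expected_false_positives(num_tests: int, alpha: float = 0.05) -> float:
    """Expected number of tests that appear significant purely by chance."""
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    return num_tests * alpha


# Show how quickly spurious "significant" findings accumulate.
for m in (1, 10, 100, 1000):
    print(
        f"{m:>5} tests: P(at least one false positive) = "
        f"{familywise_error_rate(m):.3f}, "
        f"expected false positives = {expected_false_positives(m):.1f}"
    )
```

Under these assumptions the chance of at least one spurious result rises from 5% for a single test to more than 99% for one hundred tests, which is why testing many hypotheses at once calls for a correction of the significance level.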

See also
Big memory
Datafication
Data defined storage
Data journalism
Data lineage
Data philanthropy
Data science
Statistics
Surveillance capitalism
Small data
Urban informatics

References
1. "The World's Technological Capacity to Store, Communicate, and Compute Information". MartinHilbert.net. Retrieved 13 April 2016.
2. boyd, dana; Crawford, Kate (September 21, 2011). "Six Provocations for Big Data". Social Science Research Network: A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society. doi:10.2139/ssrn.1926431.
3. "Data, data everywhere". The Economist. 25 February 2010. Retrieved 9 December 2012.
4. "Community cleverness required". Nature. 455 (7209): 1. 4 September 2008. doi:10.1038/455001a.
5. Reichman, O.J.; Jones, M.B.; Schildhauer, M.P. (2011). "Challenges and Opportunities of Open Data in Ecology". Science. 331 (6018): 703-5. doi:10.1126/science.1197962. PMID 21311007.
6. Hellerstein, Joe (9 November 2008). "Parallel Programming in the Age of Big Data". Gigaom Blog.
7. Segaran, Toby; Hammerbacher, Jeff (2009). Beautiful Data: The Stories Behind Elegant Data Solutions. O'Reilly Media. p. 257. ISBN 9780596157111.
8. Hilbert, Martin; López, Priscila (2011). "The World's Technological Capacity to Store, Communicate, and Compute Information". Science. 332 (6025): 60-65. doi:10.1126/science.1200970. PMID 21310967.
9. "IBM What is big data? Bringing big data to the enterprise". www.ibm.com. Retrieved 2013-08-26.
10. Oracle and FSN, "Mastering Big Data: CFO Strategies to Transform Insight into Opportunity" (http://www.fsn.co.uk/channel_bi_bpm_cpm/mastering_big_data_cfo_strategies_to_transform_insight_into_opportunity#.UO2AcTTuys), December 2012
11. Jacobs, A. (6 July 2009). "The Pathologies of Big Data". ACM Queue.
12. Magoulas, Roger; Lorica, Ben (February 2009). "Introduction to Big Data". Release 2.0. Sebastopol CA: O'Reilly Media (11).
13. John R. Mashey (25 April 1998). "Big Data ... and the Next Wave of InfraStress" (PDF). Slides from invited talk. Usenix. Retrieved 28 September 2016.
14. Steve Lohr (1 February 2013). "The Origins of 'Big Data': An Etymological Detective Story". New York Times. Retrieved 28 September 2016.
15. Snijders, C.; Matzat, U.; Reips, U.-D. (2012). "'Big Data': Big gaps of knowledge in the field of Internet". International Journal of Internet Science. 7: 1-5.
16. Everts, Sarah (2016). "Information Overload". Distillations. 2 (2): 26-33. Retrieved 17 February 2017.
17. Ibrahim Targio Hashem, Abaker; Yaqoob, Ibrar; Badrul Anuar, Nor; Mokhtar, Salimah; Gani, Abdullah; Ullah Khan, Samee (2015). ""big data" on cloud computing: Review and open research issues". Information Systems. 47: 98-115. doi:10.1016/j.is.2014.07.006.
18. Laney, Douglas. "3D Data Management: Controlling Data Volume, Velocity and Variety" (PDF). Gartner. Retrieved 6 February 2001.
19. Beyer, Mark. "Gartner Says Solving 'Big Data' Challenge Involves More Than Just Managing Volumes of Data". Gartner. Archived from the original on 10 July 2011. Retrieved 13 July 2011.
20. De Mauro, Andrea; Greco, Marco; Grimaldi, Michele (2016). "A Formal definition of Big Data based on its essential Features". Library Review. 65: 122-135. doi:10.1108/LR-06-2015-0061.
21. "What is Big Data?". Villanova University.
22. Grimes, Seth. "Big Data: Avoid 'Wanna V' Confusion". InformationWeek. Retrieved 5 January 2016.
23. Hilbert, Martin. "Big Data for Development: A Review of Promises and Challenges. Development Policy Review.". martinhilbert.net. Retrieved 2015-10-07.
24. DT&SC 7-3: What is Big Data?. 12 August 2015, via YouTube.
25. Mayer-Schönberger, V., & Cukier, K. (2013). Big data: a revolution that will transform how we live, work and think. London: John Murray.
26. "Digital Technology & Social Change".
27. http://www.bigdataparis.com/presentation/mercredi/PDelort.pdf?PHPSESSID=tv7k70pcr3egpi2r6fi3qbjtj6#page=4
28. Billings S.A. "Nonlinear System Identification: NARMAX Methods in the Time, Frequency, and Spatio-Temporal Domains". Wiley, 2013
29. "le Blog ANDSI DSI Big Data".
30. Les Echos (3 April 2013). "Les Echos Big Data car Low Density Data? La faible densité en information comme facteur discriminant Archives". lesechos.fr.
31. Wu, D., Liu. X., Hebert, S., Gentzsch, W., Terpenny, J. (2015). Performance Evaluation of Cloud-Based High Performance Computing for Finite Element Analysis. Proceedings of the ASME 2015 International Design Engineering Technical Conference & Computers and Information in Engineering Conference (IDETC/CIE2015), Boston, Massachusetts, U.S.
32. Wu, D.; Rosen, D.W.; Wang, L.; Schaefer, D. (2015). "Cloud-Based Design and Manufacturing: A New Paradigm in Digital Manufacturing and Design Innovation". Computer-Aided Design. 59 (1): 1-14. doi:10.1016/j.cad.2014.07.006.
33. Lee, Jay; Bagheri, Behrad; Kao, Hung-An (2014). "Recent Advances and Trends of Cyber-Physical Systems and Big Data Analytics in Industrial Informatics". IEEE Int. Conference on Industrial Informatics (INDIN) 2014.
34. Lee, Jay; Lapira, Edzel; Bagheri, Behrad; Kao, Hung-an. "Recent advances and trends in predictive manufacturing systems in big data environment". Manufacturing Letters. 1 (1): 38-41. doi:10.1016/j.mfglet.2013.09.005.
35. "LexisNexis To Buy Seisint For $775 Million". Washington Post. Retrieved 15 July 2004.
36. "LexisNexis Parent Set to Buy ChoicePoint". Washington Post. Retrieved 22 February 2008.
37. "Quantcast Opens Exabyte-Ready File System". www.datanami.com. Retrieved 1 October 2012.
38. Bertolucci, Jeff. "Hadoop: From Experiment To Leading Big Data Platform" (http://www.informationweek.com/bigdata/news/softwareplatforms/hadoopfromexperimenttoleadingbigd/240157176), "Information Week", 2013. Retrieved on 14 November 2013.
39. Webster, John. "MapReduce: Simplified Data Processing on Large Clusters" (http://research.google.com/archive/mapreduceosdi04.pdf), "Search Storage", 2004. Retrieved on 25 March 2013.
40. "Big Data Solution Offering". MIKE2.0. Retrieved 8 December 2013.
41. "Big Data Definition". MIKE2.0. Retrieved 9 March 2013.
42. Boja, C; Pocovnicu, A; Bătăgan, L. (2012). "Distributed Parallel Architecture for Big Data". Informatica Economica. 16 (2): 116-127.
43. "IMS_CPS IMS Center". Retrieved 16 June 2016.
44. http://www.hcltech.com/sites/default/files/solving_key_businesschallenges_with_big_data_lake_0.pdf
45. "Method for testing the fault tolerance of MapReduce frameworks" (PDF). Computer Networks. 2015.
46. Manyika, James; Chui, Michael; Bughin, Jaques; Brown, Brad; Dobbs, Richard; Roxburgh, Charles; Byers, Angela Hung (May 2011). "Big Data: The next frontier for innovation, competition, and productivity". McKinsey Global Institute. Retrieved January 16, 2016.
47. "Future Directions in Tensor-Based Computation and Modeling" (PDF). May 2009.
48. Lu, Haiping; Plataniotis, K.N.; Venetsanopoulos, A.N. (2011). "A Survey of Multilinear Subspace Learning for Tensor Data" (PDF). Pattern Recognition. 44 (7): 1540-1551. doi:10.1016/j.patcog.2011.01.004.
49. Pllana, Sabri; Janciak, Ivan; Brezany, Peter; Wöhrer, Alexander. "A Survey of the State of the Art in Data Mining and Integration Query Languages". 2011 International Conference on Network-Based Information Systems (NBIS 2011). IEEE Computer Society. Retrieved 2 April 2016.
50. Monash, Curt (30 April 2009). "eBay's two enormous data warehouses".
Monash, Curt (6 October 2010). "eBay followup - Greenplum out, Teradata > 10 petabytes, Hadoop has some value, and more".
51. "Resources on how Topological Data Analysis is used to analyze big data". Ayasdi.
52. CNET News (1 April 2011). "Storage area networks need not apply".
53. "How New Analytic Systems will Impact Storage". September 2011.
54. "An Error Occurred Setting Your User Cookie". The Information Society. 30: 127-143. doi:10.1080/01972243.2013.873748.
55. Rajpurohit, Anmol (11 July 2014). "Interview: Amy Gershkoff, Director of Customer Analytics & Insights, eBay on How to Design Custom In-House BI Tools". KDnuggets. Retrieved 2014-07-14. "Dr. Amy Gershkoff: "Generally, I find that off-the-shelf business intelligence tools do not meet the needs of clients who want to derive custom insights from their data. Therefore, for medium-to-large organizations with access to strong technical talent, I usually recommend building custom, in-house solutions.""
56. "The Government and big data: Use, problems and potential". Computerworld. 21 March 2012. Retrieved 12 September 2016.
57. Kalil, Tom. "Big Data is a Big Deal". White House. Retrieved 26 September 2012.
58. Executive Office of the President (March 2012). "Big Data Across the Federal Government" (PDF). White House. Retrieved 26 September 2012.
59. Lampitt, Andrew. "The real story of how big data analytics helped Obama win". Infoworld. Retrieved 31 May 2014.
60. Hoover, J. Nicholas. "Government's 10 Most Powerful Supercomputers". Information Week. UBM. Retrieved 26 September 2012.
61. Bamford, James (15 March 2012). "The NSA Is Building the Country's Biggest Spy Center (Watch What You Say)". Wired Magazine. Retrieved 2013-03-18.
62. "Groundbreaking Ceremony Held for $1.2 Billion Utah Data Center". National Security Agency Central Security Service. Retrieved 2013-03-18.
63. Hill, Kashmir. "Blueprints of NSA's Ridiculously Expensive Data Center in Utah Suggest It Holds Less Info Than Thought". Forbes. Retrieved 2013-10-31.
64. "News: Live Mint". Are Indian companies making enough sense of Big Data?. Live Mint. 23 June 2014. Retrieved 2014-11-22.
65. "Survey on Big Data Using Data Mining" (PDF). International Journal of Engineering Development and Research. 2015. Retrieved 14 September 2016.
66. "Recent advances delivered by Mobile Cloud Computing and Internet of Things for Big Data applications: a survey". International Journal of Network Management. 11 March 2016. Retrieved 14 September 2016.
67. "White Paper: Big Data for Development: Opportunities & Challenges (2012) United Nations Global Pulse". Retrieved 13 April 2016.
68. "WEF (World Economic Forum), & Vital Wave Consulting. (2012). Big Data, Big Impact: New Possibilities for International Development". World Economic Forum. Retrieved 24 August 2012.
69. "Big Data for Development: From Information- to Knowledge Societies". SSRN 2205145.
70. "Elena Kvochko, Four Ways To talk About Big Data (Information Communication Technologies for Development Series)". worldbank.org. Retrieved 2012-05-30.
71. "Daniele Medri: Big Data & Business: An on-going revolution". Statistics Views. 21 October 2013.
72. Tobias Knobloch and Julia Manske (11 January 2016). "Responsible use of data". D+C, Development and Cooperation.
73. Lee, Jay; Wu, F.; Zhao, W.; Ghaffari, M.; Liao, L (January 2013). "Prognostics and health management design for rotary machinery systems - Reviews, methodology and applications". Mechanical Systems and Signal Processing. 42 (1).
74. "Tutorials". PHM Society. Retrieved 27 September 2016.
75. "Prognostic and Health Management Technology for MOCVD Equipment". Industrial Technology Research Institute. Retrieved 27 September 2016.
76. "Impending Challenges for the Use of Big Data". International Journal of Radiation Oncology*Biology*Physics. doi:10.1016/j.ijrobp.2015.10.060.
77. O'Donoghue, John; Herbert, John (1 October 2012). "Data Management Within mHealth Environments: Patient Sensors, Mobile Devices, and Databases". Journal of Data and Information Quality. 4 (1): 5:1-5:20. doi:10.1145/2378016.2378021. Retrieved 16 June 2016, via ACM Digital Library.
78. Mirkes, E.M.; Coats, T.J.; Levesley, J.; Gorban, A.N. (2016). "Handling missing data in large healthcare dataset: A case study of unknown trauma outcomes". Computers in Biology and Medicine. 75: 203-216. doi:10.1016/j.compbiomed.2016.06.004.
79. Murdoch, Travis B.; Detsky, Allan S. (2013-04-03). "The Inevitable Application of Big Data to Health Care". JAMA. 309 (13): 1351. doi:10.1001/jama.2013.393. ISSN 0098-7484.
80. "Degrees in Big Data: Fad or Fast Track to Career Success". Forbes. Retrieved 2016-02-21.
81. "NY gets new bootcamp for data scientists: It's free, but harder to get into than Harvard". Venture Beat. Retrieved 2016-02-21.
82. Couldry, Nick; Turow, Joseph (2014). "Advertising, Big Data, and the Clearance of the Public Realm: Marketers' New Approaches to the Content Subsidy". International Journal of Communication. 8: 1710-1726.
83. http://www.businesswire.com/news/home/20170109006500/en/QuiONamedInnovationChampionAccentureHealthTechInnovation
84. https://www.predix.com/sites/default/files/IDC_OT_Final_whitepaper_249120.pdf
85. Tay, Liz. "Inside eBay's 90PB data warehouse". ITNews. Retrieved 2016-02-12.
86. Layton, Julia. "Amazon Technology". Money.howstuffworks.com. Retrieved 2013-03-05.
87. "Scaling Facebook to 500 Million Users and Beyond". Facebook.com. Retrieved 2013-07-21.
88. "Google Still Doing at Least 1 Trillion Searches Per Year". Search Engine Land. 16 January 2015. Retrieved 15 April 2015.
89. Lamb, Charles. "Oracle NoSQL Database Exceeds 1 Million Mixed YCSB Ops/Sec".
90. Solnik, Ray. "The Time Has Come: Analytics Delivers for IT Operations". Data Center Journal. Retrieved June 21, 2016.
91. "FICO Falcon Fraud Manager". Fico.com. Retrieved 2013-07-21.
92. "eBay Study: How to Build Trust and Improve the Shopping Experience". Knowwpcarey.com. 8 May 2012. Retrieved 2015-12-20.
93. Leading Priorities for Big Data for Business and IT (http://www.statista.com/statistics/280444/globalleadingprioritiesforbigdataaccordingtobusinessanditexecutives/). eMarketer. October 2013. Retrieved January 2014.
94. Wingfield, Nick (12 March 2013). "Predicting Commutes More Accurately for Would-Be Home Buyers NYTimes.com". Bits.blogs.nytimes.com. Retrieved 2013-07-21.
95. Alexandru, Dan. "Prof" (PDF). cds.cern.ch. CERN. Retrieved 24 March 2015.
96. "LHC Brochure, English version. A presentation of the largest and the most powerful particle accelerator in the world, the Large Hadron Collider (LHC), which started up in 2008. Its role, characteristics, technologies, etc. are explained for the general public.". CERN-Brochure-2010-006-Eng. LHC Brochure, English version. CERN. Retrieved 20 January 2013.
97. "LHC Guide, English version. A collection of facts and figures about the Large Hadron Collider (LHC) in the form of questions and answers.". CERN-Brochure-2008-001-Eng. LHC Guide, English version. CERN. Retrieved 20 January 2013.
98. Brumfiel, Geoff (19 January 2011). "High-energy physics: Down the petabyte highway". Nature. 469. pp. 282-83. doi:10.1038/469282a.
99. http://www.zurich.ibm.com/pdf/astron/CeBIT%202013%20Background%20DOME.pdf
100. "Future telescope array drives development of exabyte processing". Ars Technica. Retrieved 15 April 2015.
101. "Australia's bid for the Square Kilometre Array - an insider's perspective". The Conversation. 1 February 2012. Retrieved 27 September 2016.
102. Delort P., OECD ICCP Technology Foresight Forum, 2012. (http://www.oecd.org/sti/ieconomy/Session_3_Delort.pdf#page=6)
103. "NASA - NASA Goddard Introduces the NASA Center for Climate Simulation". Retrieved 13 April 2016.
104. Webster, Phil. "Supercomputing the Climate: NASA's Big Data Mission". CSC World. Computer Sciences Corporation. Retrieved 2013-01-18.
105. "These six great neuroscience ideas could make the leap from lab to market". The Globe and Mail. 20 November 2014. Retrieved 1 October 2016.
106. "DNAstack tackles massive, complex DNA datasets with Google Genomics". Google Cloud Platform. Retrieved 1 October 2016.
107. "23andMe Ancestry". 23andme.com. Retrieved 29 December 2016.
108. Potenza, Alessandra (13 July 2016). "23andMe wants researchers to use its kits, in a bid to expand its collection of genetic data". The Verge. Retrieved 29 December 2016.
109. "This Startup Will Sequence Your DNA, So You Can Contribute To Medical Research". Fast Company. 23 December 2016. Retrieved 29 December 2016.
110. Seife, Charles. "23andMe Is Terrifying, but Not for the Reasons the FDA Thinks". Scientific American. Retrieved 29 December 2016.
111. Zaleski, Andrew (22 June 2016). "This biotech start-up is betting your genes will yield the next wonder drug". CNBC. Retrieved 29 December 2016.
112. Regalado, Antonio. "How 23andMe turned your DNA into a $1 billion drug discovery machine". MIT Technology Review. Retrieved 29 December 2016.
113. "23andMe reports jump in requests for data in wake of Pfizer depression study | FierceBiotech". fiercebiotech.com. Retrieved 29 December 2016.
114. Admire Moyo. "Data scientists predict Springbok defeat". www.itweb.co.za. Retrieved 12 December 2015.
115. Regina Pazvakavambwa. "Predictive analytics, big data transform sports". www.itweb.co.za. Retrieved 12 December 2015.
116. Rich Miller. "The Lessons of Moneyball for Big Data Analysis". www.datecenterknowledge.com. Retrieved 12 December 2015.
117. Dave Ryan. "Sports: Where Big Data Finally Makes Sense". www.huffingtonpost.com. Retrieved 12 December 2015.
118. Frank Bi. "How Formula One Teams Are Using Big Data To Get The Inside Edge". www.forbes.com. Retrieved 12 December 2015.
119. Siwach, Gautam; Esmailpour, Amir (March 2014). Encrypted Search & Cluster Formation in Big Data (PDF). ASEE 2014 Zone I Conference. University of Bridgeport, Bridgeport, Connecticut, US.
120. "Obama Administration Unveils "Big Data" Initiative: Announces $200 Million In New R&D Investments" (PDF). The White House.
121. "AMPLab at the University of California, Berkeley". Amplab.cs.berkeley.edu. Retrieved 2013-03-05.
122. "NSF Leads Federal Efforts in Big Data". National Science Foundation (NSF). 29 March 2012.
123. Timothy Hunter; Teodor Moldovan; Matei Zaharia; Justin Ma; Michael Franklin; Pieter Abbeel; Alexandre Bayen (October 2011). Scaling the Mobile Millennium System in the Cloud.
124. David Patterson (5 December 2011). "Computer Scientists May Have What It Takes to Help Cure Cancer". The New York Times.
125. "Secretary Chu Announces New Institute to Help Scientists Improve Massive Data Set Research on DOE Supercomputers". "energy.gov".
126. "Governor Patrick announces new initiative to strengthen Massachusetts' position as a World leader in Big Data". Commonwealth of Massachusetts.
127. "Big Data @ CSAIL". Bigdata.csail.mit.edu. 22 February 2013. Retrieved 2013-03-05.
128. "Big Data Public Private Forum". Cordis.europa.eu. 1 September 2012. Retrieved 2013-03-05.
129. "Alan Turing Institute to be set up to research big data". BBC News. 19 March 2014. Retrieved 2014-03-19.
130. "Inspiration day at University of Waterloo, Stratford Campus". betakit.com/. Retrieved 2014-02-28.
131. Lee, Jay; Lapira, Edzel; Bagheri, Behrad; Kao, Hung-An (2013). "Recent Advances and Trends in Predictive Manufacturing Systems in Big Data Environment". Manufacturing Letters. 1 (1): 38-41. doi:10.1016/j.mfglet.2013.09.005.
132. Reips, Ulf-Dietrich; Matzat, Uwe (2014). "Mining "Big Data" using Big Data Services". International Journal of Internet Science. 1 (1): 1-8.
133. Preis, Tobias; Moat, Helen Susannah; Stanley, H. Eugene; Bishop, Steven R. (2012). "Quantifying the Advantage of Looking Forward". Scientific Reports. 2: 350. doi:10.1038/srep00350. PMC 3320057. PMID 22482034.
134. Marks, Paul (5 April 2012). "Online searches for future linked to economic success". New Scientist. Retrieved 9 April 2012.
135. Johnston, Casey (6 April 2012). "Google Trends reveals clues about the mentality of richer nations". Ars Technica. Retrieved 9 April 2012.
136. Tobias Preis (24 May 2012). "Supplementary Information: The Future Orientation Index is available for download" (PDF). Retrieved 2012-05-24.
137. Philip Ball (26 April 2013). "Counting Google searches predicts market movements". Nature. Retrieved 9 August 2013.
138. Tobias Preis, Helen Susannah Moat and H. Eugene Stanley (2013). "Quantifying Trading Behavior in Financial Markets Using Google Trends". Scientific Reports. 3: 1684. doi:10.1038/srep01684. PMC 3635219. PMID 23619126.
139. Nick Bilton (26 April 2013). "Google Search Terms Can Predict Stock Market, Study Finds". New York Times. Retrieved 9 August 2013.
140. Christopher Matthews (26 April 2013). "Trouble With Your Investment Portfolio? Google It!". TIME Magazine. Retrieved 9 August 2013.
141. Philip Ball (26 April 2013). "Counting Google searches predicts market movements". Nature. Retrieved 9 August 2013.
142. Bernhard Warner (25 April 2013). "'Big Data' Researchers Turn to Google to Beat the Markets". Bloomberg Businessweek. Retrieved 9 August 2013.
143. Hamish McRae (28 April 2013). "Hamish McRae: Need a valuable handle on investor sentiment? Google it". The Independent. London. Retrieved 9 August 2013.
144. Richard Waters (25 April 2013). "Google search proves to be new word in stock market prediction". Financial Times. Retrieved 9 August 2013.
145. David Leinweber (26 April 2013). "Big Data Gets Bigger: Now Google Trends Can Predict The Market". Forbes. Retrieved 9 August 2013.
146. Jason Palmer (25 April 2013). "Google searches predict market moves". BBC. Retrieved 9 August 2013.
147. E. Sejdić, "Adapt current tools for use with big data," Nature, vol. 507, no. 7492, pp. 306, Mar. 2014.
148. Stanford. "MMDS. Workshop on Algorithms for Modern Massive Data Sets" (http://web.stanford.edu/group/mmds/).
149. Deepan Palguna; Vikas Joshi; Venkatesan Chakaravarthy; Ravi Kothari & L. V. Subramaniam (2015). Analysis of Sampling Algorithms for Twitter. International Joint Conference on Artificial Intelligence.
150. Kimble, C.; Milolidakis, G. (2015). "Big Data and Business Intelligence: Debunking the Myths". Global Business and Organizational Excellence. 35 (1): 23-34. doi:10.1002/joe.21642.
151. Chris Anderson (23 June 2008). "The End of Theory: The Data Deluge Makes the Scientific Method Obsolete". WIRED.
152. Graham M. (9 March 2012). "Big data and the end of theory?". The Guardian. London.
153. "Good Data Won't Guarantee Good Decisions. Harvard Business Review". Shah, Shvetank; Horne, Andrew; Capellá, Jaime. HBR.org. Retrieved 8 September 2012.
154. Big Data requires Big Visions for Big Change. (https://www.youtube.com/watch?v=UXef6yfJZAI), Hilbert, M. (2014). London: TEDxUCL, x=independently organized TED talks
155. Jonathan Rauch (1 April 2002). "Seeing Around Corners". The Atlantic.
156. Epstein, J. M., & Axtell, R. L. (1996). Growing Artificial Societies: Social Science from the Bottom Up. A Bradford Book.
157. Delort P., Big data in Biosciences, Big Data Paris, 2012 (http://www.bigdataparis.com/documents/PierreDelortINSERM.pdf#page=5)
158. "Next-generation genomics: an integrative approach" (PDF). nature. July 2010. Retrieved 18 October 2016.
159. "BIG DATA IN BIOSCIENCES". ResearchGate. October 2015. Retrieved 18 October 2016.
160. "Big data: are we making a big mistake?". Financial Times. 28 March 2014. Retrieved 20 October 2016.
161. Ohm, Paul. "Don't Build a Database of Ruin". Harvard Business Review.
162. Darwin Bond-Graham, Iron Cagebook - The Logical End of Facebook's Patents (http://www.counterpunch.org/2013/12/03/ironcagebook/), Counterpunch.org, 2013.12.03
163. Darwin Bond-Graham, Inside the Tech industry's Startup Conference (http://www.counterpunch.org/2013/09/11/insidethetechindustrysstartupconference/), Counterpunch.org, 2013.09.11
164. danah boyd (29 April 2010). "Privacy and Publicity in the Context of Big Data". WWW 2010 conference. Retrieved 2011-04-18.
165. Jones, MB; Schildhauer, MP; Reichman, OJ; Bowers, S (2006). "The New Bioinformatics: Integrating Ecological Data from the Gene to the Biosphere" (PDF). Annual Review of Ecology, Evolution, and Systematics. 37 (1): 519-544. doi:10.1146/annurev.ecolsys.37.091305.110031.
166. Boyd, D.; Crawford, K. (2012). "Critical Questions for Big Data". Information, Communication & Society. 15 (5): 662-679. doi:10.1080/1369118X.2012.678878.
167. Failure to Launch: From Big Data to Big Decisions (http://www.fortewares.com/Administrator/userfiles/Banner/fortewaresproactivereporting_EN.pdf), Forte Wares.
168. Gregory Piatetsky (12 August 2014). "Interview: Michael Berthold, KNIME Founder, on Research, Creativity, Big Data, and Privacy, Part 2". KDnuggets. Retrieved 2014-08-13.
169. Pelt, Mason. ""Big Data" is an overused buzzword and this Twitter bot proves it". siliconangle.com. SiliconANGLE. Retrieved 4 November 2015.
170. Harford, Tim (28 March 2014). "Big data: are we making a big mistake?". Financial Times. Financial Times. Retrieved 2014-04-07.
171. Ioannidis, J.P.A. (2005). "Why Most Published Research Findings Are False". PLoS Medicine. 2 (8): e124. doi:10.1371/journal.pmed.0020124. PMC 1182327. PMID 16060722.
172. Lohr, Steve; Singer, Natasha (2016-11-10). "How Data Failed Us in Calling an Election". The New York Times. ISSN 0362-4331. Retrieved 2016-11-27.
173. Markman, Jon. "Big Data And The 2016 Election". Forbes. Retrieved 2016-11-27.

Further reading
Peter Kinnaird; Inbal Talgam-Cohen, eds. (2012). "Big Data". XRDS: Crossroads, The ACM Magazine for Students. No. 19 (1). Association for Computing Machinery. ISSN 1528-4980. OCLC 779657714.
Jure Leskovec; Anand Rajaraman; Jeffrey D. Ullman (2014). Mining of massive datasets. Cambridge University Press. ISBN 9781107077232. OCLC 888463433.
Viktor Mayer-Schönberger; Kenneth Cukier (2013). Big Data: A Revolution that Will Transform how We Live, Work, and Think. Houghton Mifflin Harcourt. ISBN 9781299903029. OCLC 828620988.
Press, Gil (2013-05-09). "A Very Short History Of Big Data". forbes.com. Jersey City, NJ: Forbes Magazine. Retrieved 2016-09-17.

External links
Media related to Big data at Wikimedia Commons
The dictionary definition of big data at Wiktionary

Retrieved from "https://en.wikipedia.org/w/index.php?title=Big_data&oldid=772100883"

Categories: Big data, Data management, Distributed computing problems, Technology forecasting, Transaction processing

This page was last modified on 25 March 2017, at 09:13.
Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply.
