DIKU Machine Learning Lab
Machine Learning and Data Mining Research at DIKU
The amount and complexity of available data are steadily increasing. To make use of this wealth of
information, computing systems are needed that turn the data into knowledge. Machine learning is
about developing the required software that automatically analyses data for making predictions,
categorizations, and recommendations. Machine learning algorithms are already an integral part of
today's computing systems, for example in search engines, recommender systems, or biometrical
applications, and have reached superhuman performance in some domains. DIKU's research
pushes the boundaries and aims at more robust, more efficient, and more widely applicable
machine learning techniques.
State-of-the-art machine learning
Machine learning is a branch of computer science and applied statistics covering software that
improves its performance at a given task based on sample data or experience. The machine
learning research at DIKU, the Department of Computer Science at the University of Copenhagen,
is concerned with the design and analysis of adaptive systems for pattern recognition and
behaviour generation.
We develop machine learning algorithms for making new discoveries in science
[image from SkyML project].
Our fields of expertise are
classification, regression, and density estimation techniques for data mining and modelling,
pattern recognition, and time series prediction, and
computational intelligence methods for nonlinear optimisation, including vector optimisation and
multi-criteria decision making.
Successful real-world applications include the design of biometric and medical image processing
systems, chemical processes and plants, advanced driver assistance systems, robot controllers,
time series predictors for physical processes, systems for sports analytics, acoustic signal
classification systems, automatic quality control for production lines, and sequence analysis in
bioinformatics.
http://image.diku.dk/MLLab/ 24/12/2014
To build efficient and autonomous machine learning systems we draw inspiration from optimisation
and computing theory as well as biological information processing. We analyse our algorithms
theoretically and critically evaluate them on real-world problems. Increasing the robustness and
improving scalability of self-adaptive, learning computer systems are cross-cutting issues in our
work. The following sections highlight some of our research activities.
Efficient autonomous machine learning
Medical image analysis is a major application area
[taken from Prasoon et al., 2012 (top) and Winter et al., 2008 (bottom)].
We strive for computer systems that can deal autonomously and flexibly with our needs. They must
work in scenarios that have not been fully specified and must be able to cope with unpredicted
situations. Incomplete descriptions of application scenarios are inevitable because we need
algorithms for domains where the designer's knowledge is not perfect, the solutions to particular
problems are simply unknown, and/or the sheer complexity and variability of the task and the
environment preclude a sufficiently accurate domain description. Although such systems are in
general too complex to be designed manually, large amounts of data describing the task and the
environment are often available or can be automatically obtained. To take proper advantage of this
available information, we need to develop systems that self-adapt and automatically improve
based on sample data: systems that learn.
Machine learning algorithms are already an integral part of today's computing systems, for example
in internet search engines, recommender systems, or biometrical applications. Highly specialised
technical solutions for restricted task domains exist that have reached superhuman performance.
Despite these successes, there are fundamental challenges that must be met if we are to develop
more general learning systems.
We apply machine learning algorithms for hydroacoustic signal classification to support the
verification of the Comprehensive Nuclear-Test-Ban Treaty [Tuma et al., 2012].
First, present adaptive systems often lack autonomy and robustness. For example, they usually
require a human expert to select the training examples, the learning method and its parameters,
and an appropriate representation or structure for the learning system. This dependence on expert
supervision is retarding the ubiquitous deployment of adaptive software systems. We therefore work
on algorithms that can handle large multi-modal data sets, that actively select training patterns,
and that autonomously build appropriate internal representations based on data from different
sources. These representations should foster learning, generalisation, and communication.
Second, current adaptive systems succumb to scalability problems.
On the one hand, the ever-growing amounts of data require highly efficient large-scale learning
algorithms. On the other hand, learning and generalisation from very few examples is also a
challenging problem. This scenario often occurs in man-machine interaction, for example in
software personalisation or when generalisation from few database queries is required. We address
the scaling problems by using task-specific architectures incorporating both new concepts inspired
by natural adaptive systems as well as recent methods from algorithmic engineering and
mathematical programming.
Selected methods
We address all major learning paradigms: unsupervised, supervised, and reinforcement learning.
These are closely connected. For instance, unsupervised learning can be used to find appropriate
representations for supervised learning, and reliable supervised learning techniques are the
prerequisite for successful reinforcement learning. Over the years, we used, analysed, and refined
a broad spectrum of machine learning techniques. Currently our methodological research focuses
on the following methods.
Supervised learning
Schema of multi-class support vector machine classification
[taken from Dogan et al., 2011].
Support vector machines (SVMs) and other kernel-based algorithms are state-of-the-art in pattern
recognition. They perform well in many applications, especially in classification tasks. The kernel
trick allows for an easy handling of non-standard data (e.g., biological sequences, multi-modal data)
and permits a better mathematical analysis of the adaptive system because of the convenient
structure of the hypothesis space. Developing and analysing kernel-based methods, in particular
increasing autonomy and improving scalability of SVMs, is currently one of the most active
branches of our research.
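As a rough illustration of how the kernel trick lets a learner work directly on non-standard data such as biological sequences, the sketch below trains a kernel perceptron with a k-mer spectrum kernel. The kernel perceptron is a simpler stand-in chosen for brevity, not an SVM and not the lab's method. The toy sequences, labels, and all function names are assumptions made for this example.

```python
from collections import Counter


def spectrum_kernel(s, t, k=2):
    """k-mer spectrum kernel: inner product of the k-mer count vectors of s and t."""
    counts_s = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    counts_t = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    return sum(counts_s[kmer] * counts_t[kmer] for kmer in counts_s)


def decision_value(x, xs, ys, alpha, kernel):
    """Dual-form score: the learner only ever sees kernel values, never feature vectors."""
    return sum(a * y_j * kernel(x_j, x) for a, x_j, y_j in zip(alpha, xs, ys))


def train_kernel_perceptron(xs, ys, kernel, epochs=10):
    """Count, per training example, how often it was misclassified (the dual weights)."""
    alpha = [0] * len(xs)
    for _ in range(epochs):
        mistakes = 0
        for i, (x, y) in enumerate(zip(xs, ys)):
            if y * decision_value(x, xs, ys, alpha, kernel) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # training data separated, stop early
            break
    return alpha


def predict(x, xs, ys, alpha, kernel):
    return 1 if decision_value(x, xs, ys, alpha, kernel) > 0 else -1


# Hypothetical toy data: AT-rich sequences are labelled +1, GC-rich sequences -1.
train_x = ["ATATAT", "TATATA", "GCGCGC", "CGCGCG"]
train_y = [1, 1, -1, -1]
alpha = train_kernel_perceptron(train_x, train_y, spectrum_kernel)

label_at = predict("ATATTA", train_x, train_y, alpha, spectrum_kernel)
label_gc = predict("GCGGCG", train_x, train_y, alpha, spectrum_kernel)
```

Swapping in a different kernel function changes the data type the learner can handle without touching the training loop, which is the property the paragraph above refers to.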
Reinforcement learning
The feedback in today's most challenging applications for adaptive
systems is sparse, unspecific, and/or delayed, for instance in
autonomous robotics or in man-machine interaction. Supervised learning
cannot be used directly in such a case, but the task can be cast into a
reinforcement learning (RL) problem. Reinforcement learning is learning
from the consequences of interactions with an environment without being
explicitly taught. Because the performance of standard RL techniques is
falling short of expectations, we are developing new RL algorithms
relying on gradient-based and evolutionary direct policy search.
Covariance matrix adaptation evolution strategy (CMA-ES).
Direct policy search for adaptation in intelligent driver assistance systems
[taken from Pellecchia et al., 2005].
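To make the idea of evolutionary direct policy search with delayed feedback more concrete, here is a minimal sketch under strong simplifying assumptions. The one-dimensional task, the linear policy, the constants, and all names are hypothetical. The search is a plain elitist random perturbation of the policy parameters, far simpler than CMA-ES or the algorithms developed at the lab.

```python
import random

GOAL = 0.5       # target position of the hypothetical toy task
STEPS = 10       # number of actions per episode
STEP_SIZE = 0.1  # distance moved per unit of action


def episode_return(theta):
    """Run one episode with a linear policy; the only feedback comes after the last step."""
    w, b = theta
    x = 0.0
    for _ in range(STEPS):
        action = max(-1.0, min(1.0, w * (GOAL - x) + b))  # clip to [-1, 1]
        x += STEP_SIZE * action
    return -abs(x - GOAL)  # delayed reward: negative distance to the goal


def direct_policy_search(theta0, sigma=0.5, iterations=200, seed=0):
    """Perturb the policy parameters and keep a candidate only if its return improves."""
    rng = random.Random(seed)
    best, best_return = tuple(theta0), episode_return(theta0)
    for _ in range(iterations):
        candidate = tuple(p + sigma * rng.gauss(0.0, 1.0) for p in best)
        candidate_return = episode_return(candidate)
        if candidate_return > best_return:
            best, best_return = candidate, candidate_return
    return best, best_return


initial_theta = (0.0, 0.0)  # this policy never moves
initial_return = episode_return(initial_theta)
best_theta, best_return = direct_policy_search(initial_theta)
```

The search never looks at individual state-action pairs; it judges a whole policy by the return of a complete episode, which is why this family of methods copes with sparse and delayed feedback.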
Unsupervised and deep learning
We employ probabilistic generative models to learn and to describe probability
distributions. Our research focuses on Markov random fields, in which the
conditional independence structure between random variables is described by
an undirected graph. We are particularly interested in models that allow for
learning hierarchical representations of data in an unsupervised manner.
Markov random field for re-representing data.
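The link between an undirected graph and conditional independence can be checked numerically on a very small example. The sketch below assumes a three-variable chain 0 - 1 - 2 with binary states and hypothetical coupling weights, enumerates the joint distribution by brute force, and measures how far variables 0 and 2 are from being independent given variable 1. All names and numbers are illustrative assumptions.

```python
import itertools
import math

# Hypothetical pairwise couplings on the chain 0 - 1 - 2 (no edge between 0 and 2).
EDGES = {(0, 1): 1.0, (1, 2): -0.5}


def unnormalised(x):
    """Unnormalised probability exp(-energy) of a state x in {-1, +1}^3."""
    energy = -sum(weight * x[i] * x[j] for (i, j), weight in EDGES.items())
    return math.exp(-energy)


STATES = list(itertools.product([-1, 1], repeat=3))
Z = sum(unnormalised(x) for x in STATES)  # partition function
P = {x: unnormalised(x) / Z for x in STATES}


def marginal(fixed):
    """Probability that the variables listed in `fixed` take the given values."""
    return sum(p for x, p in P.items() if all(x[i] == v for i, v in fixed.items()))


def conditional_independence_gap():
    """Largest deviation of P(x0, x2 | x1) from P(x0 | x1) * P(x2 | x1)."""
    gap = 0.0
    for a, b, c in STATES:
        p_b = marginal({1: b})
        joint = marginal({0: a, 1: b, 2: c}) / p_b
        product = (marginal({0: a, 1: b}) / p_b) * (marginal({1: b, 2: c}) / p_b)
        gap = max(gap, abs(joint - product))
    return gap


gap_given_middle = conditional_independence_gap()
# Without conditioning, variables 0 and 2 remain dependent through variable 1.
marginal_gap = abs(marginal({0: 1, 2: 1}) - marginal({0: 1}) * marginal({2: 1}))
```

In this toy model the conditional gap is zero up to rounding, while the marginal gap is clearly non-zero, matching the graph: the only path between variables 0 and 2 runs through variable 1.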
Nonlinear optimisation
Learning is closely linked to optimisation. Thus, we are also
working on general gradient-based and direct search and
optimisation algorithms. This includes randomised methods,
especially evolutionary algorithms (EAs), which are inspired
by neo-Darwinian evolution theory. Efficient evolutionary
optimisation can be achieved by an automatic adjustment of
the search strategy. We are developing EAs with this ability,
especially real-valued EAs that learn the metric underlying the
problem at hand (e.g., dependencies between variables).
Currently, we are working on variable-metric EAs for RL and
for efficient vector (multi-objective) optimisation. The latter
will become increasingly relevant for industrial and scientific
applications in the future, because many problems are
inherently multi-objective.
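One of the simplest examples of an evolutionary algorithm that adjusts its own search strategy is the (1+1) evolution strategy with the classic 1/5 success rule, sketched below. It adapts only a single global step size, whereas variable-metric methods such as CMA-ES additionally learn a full covariance matrix; the sketch is therefore a simplified illustration and not the lab's algorithm. The test function, the starting point, and the adaptation factors are assumptions chosen for this example.

```python
import random


def sphere(x):
    """Simple convex test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


def one_plus_one_es(f, x0, sigma=1.0, iterations=500, seed=0):
    """(1+1)-ES: one parent, one Gaussian offspring, step size set by the 1/5 rule."""
    rng = random.Random(seed)
    parent, f_parent = list(x0), f(x0)
    for _ in range(iterations):
        child = [v + sigma * rng.gauss(0.0, 1.0) for v in parent]
        f_child = f(child)
        if f_child <= f_parent:
            parent, f_parent = child, f_child
            sigma *= 1.5           # success: try larger steps
        else:
            sigma *= 1.5 ** -0.25  # failure: shrink; sigma is stationary at a 1/5 success rate
    return parent, f_parent, sigma


start = [3.0, -2.0, 1.0]
best_x, best_f, final_sigma = one_plus_one_es(sphere, start)
```

The two factors are balanced so that the step size stays constant when one in five offspring is an improvement, grows when successes are more frequent, and shrinks when they are rarer.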
Contributing hypervolume of candidate solutions in multi-objective optimization
[Suttorp et al., 2006].
Team
Pengfei Diao
Fabian Gieseke
Oswin Krause
Christian Igel
Michiel Kallenberg
Jan Kremer
Dídac Rodríguez Arbonès
Yevgeny Seldin
Kristoffer Stensbo-Smidt
Lauge Sørensen
Matthias Tuma
Selected Publications
Fabian Gieseke, Justin Heinermann, Cosmin Oancea, and Christian Igel. Buffer k-d Trees:
Processing Massive Nearest Neighbor Queries on GPUs. JMLR W&CP 32 (ICML), pp. 172-180,
2014
Yevgeny Seldin, Peter L. Bartlett, Koby Crammer, and Yasin Abbasi-Yadkori. Prediction with limited
advice and multiarmed bandits with paid observations. In JMLR W&CP 32 (ICML), 2014
Yevgeny Seldin and Aleksandrs Slivkins. One practical algorithm for both stochastic and adversarial
bandits. In JMLR W&CP 32 (ICML), 2014
Kai Brügge, Asja Fischer, and Christian Igel. The flip-the-state transition operator for restricted
Boltzmann machines. Machine Learning 13, pp. 53-69, 2013
Fabian Gieseke, Christian Igel, and Tapio Pahikkala. Polynomial runtime bounds for fixed-rank
unsupervised least-squares classification. JMLR W&CP 29 (ACML), pp. 62-71, 2013
Oswin Krause, Asja Fischer, Tobias Glasmachers, and Christian Igel. Approximation properties of
DBNs with binary hidden units and real-valued visible units. JMLR W&CP 28 (ICML), pp. 419-426,
2013
Ilya Tolstikhin and Yevgeny Seldin. PAC-Bayes-Empirical-Bernstein Inequality. In Advances in
Neural Information Processing Systems (NIPS), 2013
Kim Steenstrup Pedersen, Kristoffer Stensbo-Smidt, Andrew Zirm, and Christian Igel. Shape Index
Descriptors Applied to Texture-Based Galaxy Analysis. International Conference on Computer
Vision (ICCV), pp. 2440-2447, IEEE Press, 2013
Yevgeny Seldin, François Laviolette, Nicolò Cesa-Bianchi, John Shawe-Taylor, and Peter Auer.
PAC-Bayesian inequalities for martingales. IEEE Transactions on Information Theory 58(12), pp.
7086-7093, 2012
Asja Fischer and Christian Igel. Bounding the Bias of Contrastive Divergence Learning. Neural
Computation 23, pp. 664-673, 2011
Yevgeny Seldin, Peter Auer, François Laviolette, John Shawe-Taylor, and Ronald Ortner.
PAC-Bayesian analysis of contextual bandits. In Advances in Neural Information Processing
Systems (NIPS), 2011
Tobias Glasmachers and Christian Igel. Maximum Likelihood Model Selection for 1-Norm Soft
Margin SVMs with Multiple Parameters. IEEE Transactions on Pattern Analysis and Machine
Intelligence 32(8), pp. 1522-1528, 2010 [source code]
Yevgeny Seldin and Naftali Tishby. PAC-Bayesian analysis of co-clustering and beyond. Journal of
Machine Learning Research 11, pp. 3595-3646, 2010
Thorsten Suttorp, Nikolaus Hansen, and Christian Igel. Efficient Covariance Matrix Update for
Variable Metric Evolution Strategies. Machine Learning 75, pp. 167-197, 2009 [source code]
Verena Heidrich-Meisner and Christian Igel. Hoeffding and Bernstein Races for Selecting Policies in
Evolutionary Direct Policy Search. In L. Bottou and M. Littman, eds.: Proceedings of the
International Conference on Machine Learning (ICML 2009), pp. 401-408, 2009
Christian Igel, Verena Heidrich-Meisner, and Tobias Glasmachers. Shark. Journal of Machine
Learning Research 9, pp. 993-996, 2008 [source code]
Tobias Glasmachers and Christian Igel. Maximum-Gain Working Set Selection for SVMs. Journal of
Machine Learning Research 7, pp. 1437-1466, 2006 [source code]
Contact
Christian Igel, Professor mso, Dr. habil.
University of Copenhagen
Universitetsparken 5
2100 København
Email: igel@diku.dk
Office: HC Building E, Office 4.0.2
Phone: (+45) 21849673
Department of Computer Science
University of Copenhagen
Universitetsparken 1
2100 København
Contact:
The Image Group / Machine Learning Lab
webmaster@diku.dk