Informatics for Materials Science and Engineering: Data-driven Discovery for Accelerated Experimentation and Application
Ebook · 937 pages · 30 hours


About this ebook

Materials informatics, a ‘hot topic’ area in materials science, aims to combine traditionally bio-led informatics with computational methodologies, supporting more efficient research by identifying strategies for time- and cost-effective analysis.

The discovery and maturation of new materials have been outpaced by the thicket of data created by new combinatorial and high-throughput analytical techniques. The elaboration of this "quantitative avalanche"—and the resulting complex, multi-factor analyses required to understand it—means that interest, investment, and research are revisiting informatics approaches as a solution.

This work, from Krishna Rajan, the leading expert on the informatics approach to materials, seeks to break down the barriers between data management, quality standards, data mining, exchange, storage, and analysis, as a means of accelerating scientific research in materials science.

This solutions-based reference synthesizes foundational physical, statistical, and mathematical content with emerging experimental and real-world applications, for interdisciplinary researchers and those new to the field.

  • Identifies and analyzes interdisciplinary strategies (including combinatorial and high-throughput approaches) that accelerate materials development cycle times and reduce associated costs
  • Shows how mathematical and computational analysis aids the formulation of new structure–property correlations among large, heterogeneous, and distributed data sets
  • Provides practical examples, computational tools, and software analysis that support the rapid identification of critical data and the analysis of theoretical needs for future problems
Language: English
Release date: Jul 10, 2013
ISBN: 9780123946140


    Book preview

    Informatics for Materials Science and Engineering - Krishna Rajan

    1

    Materials Informatics

    An Introduction

    Krishna Rajan,    Dept. of Materials Science & Eng. and Bioinformatics & Computational Biology Program – Iowa State University, Ames IA, USA

    1 The What and Why of Informatics

    The search for new or alternative materials or processing strategies, whether through experiment or simulation, has been a slow and arduous task, punctuated by infrequent and often unexpected discoveries. Each of these findings prompts a flurry of studies to better understand the underlying science governing the behavior of these materials. While informatics is well established in fields such as biology, drug discovery, astronomy, and the quantitative social sciences, materials informatics is still in its infancy. The few systematic efforts that have been made to analyze trends in data as a basis for predictions have in large part been inconclusive, due not least to the lack of large amounts of organized data and, even more importantly, to the challenge of sifting through them in a timely and efficient manner.

    When the huge combinatorial space of chemistries defined by even a small portion of the periodic table is taken into account, it is clear that searching for new materials with tailored properties is a prohibitive task. Hence the search for new materials for new applications is largely limited to educated guesses. The data that do exist are often limited to small regions of compositional space. Experimental data are dispersed in the literature, and computationally derived data are limited to the few systems for which reliable inputs for calculations exist. Even with recent advances in high-speed computing, there are limits to how the structure and properties of many new materials can be calculated. This poses both a challenge and an opportunity. The challenge is to deal with extremely large, disparate databases and large-scale computation. It is here that knowledge discovery in databases, or data mining—an interdisciplinary field merging ideas from statistics, machine learning, databases, and parallel and distributed computing—provides a unique tool to integrate scientific information and theory for materials discovery. The key challenge in data mining is the extraction of knowledge and insight from massive databases; it takes the form of discovering new patterns or building models from a given data set. The opportunity is to take advantage of recent advances in data mining and apply them to state-of-the-art computational and experimental approaches for materials discovery.
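    To make the scale of this combinatorial space concrete, a short back-of-envelope calculation (the element count and composition grid below are illustrative choices, not from the text):

```python
from math import comb

# Illustrative numbers only: even a modest palette of candidate elements
# yields thousands of ternary systems before composition fractions,
# processing routes, or crystal structures are considered.
n_elements = 30
ternary_systems = comb(n_elements, 3)
print(ternary_systems)  # 4060 distinct element triples

# A coarse 10% composition grid within each ternary multiplies the
# search space further (grid points with fractions a + b <= 100, step 10).
grid_points_per_ternary = sum(1 for a in range(0, 101, 10)
                                for b in range(0, 101 - a, 10))
print(ternary_systems * grid_points_per_ternary)  # 267960 candidate points
```

    Even this crude estimate, restricted to three-component systems and a 10% grid, produces hundreds of thousands of candidates; finer grids or more components quickly make exhaustive search prohibitive.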

    A complete materials informatics program will have an information technology (IT)-based component that is linked to classical materials science research strategies. The former includes a number of features that make informatics critical to materials research (Figure 1.1):

    • Data warehousing and data management: This involves a science-based selection and organization of data that is linked to a reliable data searching and management system.

    • Data mining: Providing an accelerated analysis of large multivariate correlations.

    • Scientific visualization: A key area of scientific research that allows high dimensional information to be assessed.

    • Cyber infrastructure: An information technology infrastructure that can accelerate the sharing of information, data, and most importantly knowledge discovery.
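    A minimal sketch of the data warehousing and management idea, using an in-memory SQLite table; the schema, materials, and property values are illustrative stand-ins, not drawn from any real materials database:

```python
import sqlite3

# Records carry provenance so searches can filter on data source and
# quality -- the "science-based selection and organization" above.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE measurements (
    material TEXT, property TEXT, value REAL, units TEXT, source TEXT)""")
rows = [
    ("Al2O3", "fracture_toughness", 3.5, "MPa*m^0.5", "experiment"),
    ("Al2O3", "youngs_modulus", 370.0, "GPa", "handbook"),
    ("SiC",   "fracture_toughness", 4.6, "MPa*m^0.5", "experiment"),
    ("SiC",   "youngs_modulus", 410.0, "GPa", "simulation"),
]
conn.executemany("INSERT INTO measurements VALUES (?,?,?,?,?)", rows)

# Reliable searching: all experimental toughness values, ranked.
tough = conn.execute(
    "SELECT material, value FROM measurements "
    "WHERE property='fracture_toughness' AND source='experiment' "
    "ORDER BY value DESC").fetchall()
print(tough)  # [('SiC', 4.6), ('Al2O3', 3.5)]
```

    The same provenance column would let a data-quality policy exclude, say, unvalidated simulation results from a screening query.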

    Figure 1.1 The role of materials informatics is pervasive across all aspects of materials science and engineering. The mathematical tools based on data mining provide the computational engine for integrating materials science information across length scales. Informatics provides an accelerated means of fusing data and recognizing structure–property relationships between disparate length and time scales in a rapid yet robust manner. (From Rajan, 2005.)

    2 Learning from Systems Biology: An Omics Approach to Materials Design

    The concept of complexity in biology, and how to assess the links between information at the molecular level and that at the level of the living organism (e.g. genomics, proteomics, etc.), is the foundation of systems biology. Systems biology provides an excellent paradigm for the materials scientist. Ultimately one would like to take an atoms-to-applications approach to materials design. How do we organize atoms and systematically build structural units at increasing length scales, up to the final engineering component or structure? At present we need to rely on extensive prior knowledge from experiments, computation, and ultimately even failure analysis to understand the complex network of interactions of materials behavior that governs the performance of an engineering system. The problem is that even with advanced experimental and computational tools, the rate of discovery is still slow, punctuated only by unexpected findings (e.g. superconducting ceramics, conducting polymers) that stimulate new areas of research and development. The iterative approach shown in Figure 1.2 is common to many fields as one tries to link observations with models. The challenge is to develop models that capture the system behavior by accounting for all the different levels of information that contribute to it.

    Figure 1.2 Logic of information flow and knowledge discovery in classical research methodology. The example provided here addresses the use of qualitative reasoning to simulate and identify metabolic pathways. (From King et al., 2005.)

    The goal of modern systems biology is to understand physiology and disease from the level of molecular pathways, regulatory networks, cells, tissues, organs and ultimately the whole organism (Butcher et al., 2004). As currently employed, the term systems biology encompasses many different approaches and models for probing and understanding biological complexity, and studies of many organisms from bacteria to man. A similar paradigm exists for materials, e.g. atoms to airplanes (Figure 1.3).

    Figure 1.3 Comparison of length scale challenges in designing drugs for the life sciences (a; from Butcher et al., 2004) and designing materials for the engineering sciences (b; from Noor et al., 2000). Note the overlap in the length scales that govern engineering design and the human body.

    As aptly described by Butcher et al., the omics (the bottom-up approach) focuses on the identification and global measurement of molecular components. Modeling (the top-down approach) attempts to form integrative (across scales) models of human physiology and disease although, with current technologies, such modeling focuses on relatively specific questions at particular scales, e.g. at the pathway or organ levels. An intermediate approach, with the potential to bridge the two, is to generate profiling data from high-throughput experiments designed to incorporate biological complexity at multiple levels: multiple interacting active pathways, multiple intercommunicating cell types, and multiple different environments.

    A similar challenge occurs in materials science, identifying pathways of how chemistry, crystal structure, microstructure, processing variables, and component design and manufacturing communicate with each other to ultimately define performance. This forms the materials science equivalent of the biological regulatory network (Figure 1.4).

    Figure 1.4 An example of a regulatory network linking diverse sets of information from both theory and experiment in the study of materials degradation due to irradiation. (From Wirth et al., 2001.)

    Because biological complexity is an exponential function of the number of system components and the interactions between them, and escalates at each additional level of organization (Figure 1.5), such efforts are currently limited to simple organisms or to specific minimal pathways (and generally in very specific cell and environmental contexts) in higher organisms. The same can be said of complexity in materials science.

    Figure 1.5 Identification of regulatory pathways using network graphing and simulated annealing methods. (This figure has been reprinted from Ideker et al., 2002, by permission of Oxford University Press, and Aitchison and Galitski, 2003.)

    Even if our ability to measure molecules and their functional states and interactions were adequate to the task, computational limitations alone would prohibit our understanding of cell and tissue behavior at the molecular level. Thus, methodologies that filter information for relevance, such as biological context and experimental knowledge of cellular and higher level system responses, will be critical for successful understanding of different levels of organization in systems biology research.

    Informatics is the enabling tool to facilitate this process. For instance, Csete and Doyle (2002) have provided a very apt analogy between biological and engineering systems that helps to put materials informatics in perspective. A striking example of converging information across length scales is shown in Figure 1.6, which compares cruise speed to mass M over 12 orders of magnitude, from Boeing 747s and 777s down to fruit flies. This provides a good example of how, if one can integrate and identify key metrics (data), functional relations between variables can be developed across many scales. Here, a well-known elementary argument shows good correspondence with the data and yields explanations for deviations.

    Figure 1.6 A scaling diagram of the design behavior of aeronautical systems (both biological and artificial). (Adapted from Csete and Doyle, 2002.)

    Such theories are largely irrelevant to complexity directly, but an understanding of them leads to what is relevant. The scaling theory described by the above figure does not distinguish between flight in the atmosphere and in a laboratory wind tunnel. In the latter context, a much simpler mutant 777 with nearly all of its 150,000-count aeronome knocked out would have roughly the same lift, mass, and cruise speed, and thus (from an allometric scaling viewpoint) would exhibit no deleterious laboratory phenotype. Redundancy does not explain this finding. Rather, the mutant has lost the control systems and robustness required for real flight outside the lab. Allometric scaling emphasizes the essential similarities between these 777 variants and a toy scale model (and a fruit fly), whereas our interest is in their huge differences in complexity. Similarly, minimal cellular life requires a few hundred genes, yet even Escherichia coli has ~4000 genes, fewer than 300 of which have been classified as essential. The likely reason for this excess complexity is again the presence of complex regulatory networks for robustness. In technology as well as in organisms, such robustness tradeoffs drive the evolution of spiraling complexity.
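    The allometric scaling argument can be made concrete with a small sketch: assuming the classic U ∝ M^(1/6) flight-scaling exponent, the exponent is recovered from synthetic noisy (mass, speed) data by linear regression in log-log space. All numbers below are invented for illustration:

```python
import math
import random

# Synthetic data following speed ~ M^(1/6) over 12 orders of magnitude
# in mass, with multiplicative noise.
random.seed(0)
true_exponent = 1.0 / 6.0
masses = [10.0 ** k for k in range(-4, 8)]          # 12 orders of magnitude
speeds = [5.0 * m ** true_exponent * math.exp(random.gauss(0, 0.05))
          for m in masses]

# Linear least-squares fit in log-log space recovers the exponent.
xs = [math.log10(m) for m in masses]
ys = [math.log10(u) for u in speeds]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
print(round(slope, 3))   # close to 1/6 ~ 0.167
```

    Deviations of real vehicles or animals from the fitted line are exactly the residuals that, as the text notes, the scaling theory can then help explain.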

    3 Where Do We Get the Information?

    One may naturally assume that having large amounts of data is critical for any serious informatics study. What constitutes enough data in materials science applications, however, can vary significantly. In studying structural ceramics, for instance, fracture toughness measurements are difficult to make, and for some of the more complex materials just a few careful measurements can be of great value. Similarly, obtaining reliable measurements of fundamental constants or properties for a given material involves very detailed measurement and/or computational techniques. In essence, data sets in materials science fall into two broad categories. The first comprises data on a given material’s behavior, such as its mechanical or physical properties. The second comprises intrinsic information based on the chemical characteristics of the material, such as thermodynamic data sets.

    Crystallographic and thermochemical databases have historically been two of the best established in the materials science community. The former serve as the foundation for interpreting crystal structure data of metals, alloys, and inorganic materials. The latter involve the compilation of fundamental thermochemical information in terms of heat capacity and calorimetric data. While crystallographic databases are primarily used as a reference source, thermodynamic databases were actually one of the earliest examples of informatics, as they were integrated into thermochemical computations to map phase stability in binary and ternary alloys. This led to the development of computationally derived phase diagrams, a classic example of integrating information in databases into data models. The two kinds of database have evolved independently, although in terms of their scientific value they are extraordinarily intertwined. Phase diagrams map out regimes of crystal structure in temperature–composition or temperature–pressure space, yet crystal structure databases have been developed totally independently. At present the community has to work with each database separately; information searches are cumbersome, and data analysis and interpretation involving both are very difficult. Researchers integrate such information only for one very specific system at a time, based on their individual interests. Hence there is at present no unified way to explore patterns of behavior across both databases, even though they are scientifically related. Examples do exist in the biological and chemical sciences and provide useful templates for materials science (Figure 1.7).

    Figure 1.7 Example of integrating databases into knowledge discovery in drug discovery from information at the genomic level. (Adapted from Strausberg and Schreiber, 2003.)
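    A toy sketch of the kind of cross-database integration argued for here: two tiny dictionaries stand in for a crystallographic and a thermochemical database, merged on composition so that a single query can couple structure and thermochemistry. The handful of property values are standard handbook numbers, but the "databases" themselves are purely illustrative:

```python
# Hypothetical stand-ins for two independently developed databases.
crystal_db = {
    "NaCl": {"structure": "rocksalt", "a_angstrom": 5.64},
    "CsCl": {"structure": "cesium_chloride", "a_angstrom": 4.12},
    "MgO":  {"structure": "rocksalt", "a_angstrom": 4.21},
}
thermo_db = {
    "NaCl": {"delta_Hf_kJ_mol": -411.1},
    "MgO":  {"delta_Hf_kJ_mol": -601.6},
    # CsCl missing here: real databases overlap only partially.
}

# Merge on composition keys present in both sources.
merged = {k: {**crystal_db[k], **thermo_db[k]}
          for k in crystal_db.keys() & thermo_db.keys()}

# One query can now couple both, e.g. all rocksalt compounds with
# formation enthalpy below -500 kJ/mol:
hits = sorted(k for k, v in merged.items()
              if v["structure"] == "rocksalt"
              and v["delta_Hf_kJ_mol"] < -500)
print(hits)  # ['MgO']
```

    The incomplete overlap (CsCl drops out of the merged view) illustrates why ad hoc, per-system integration is the current norm and why a unified schema would be valuable.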

    One of the more systematic efforts to address this challenge has been that of Ashby (see, for example, Figure 1.8; Ashby, 2011), who showed how, by merging phenomenological relationships in materials properties with discrete data on specific material characteristics, one can begin to develop patterns of classification of materials behavior. The visualization of multivariate data was managed by using normalization schemes, which permitted the development of maps that, in turn, provided a way of capturing new clusterings of materials properties. The approach also provided a methodology for establishing common structure–property relationships across seemingly different classes of materials. While very valuable, it is limited in its predictive value and is ultimately based on utilizing prior models to build and seek relationships. The informatics strategy of studying materials behavior approaches the problem from a broader perspective. By exploring all types of data that may have varying degrees of influence on a given property (or properties), with no prior assumptions, one utilizes data-mining techniques to establish both classification and predictive assessments of materials behavior. This is not done from a purely statistical perspective, however, but from one in which a physics-driven approach to data collection is carefully integrated with data mining and then validated or analyzed with theory-based computation and/or experiments.

    Figure 1.8 An example of data integration in materials engineering, mapping correlations between mechanical properties over a wide range of materials: fracture toughness and modulus for metals, alloys, ceramics, glasses, polymers, and metallic glasses. The contours show the toughness Gc in kJ/m². (From Ashby and Greer, 2006.)
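    The contour variable in such a map can be sketched directly: assuming the usual plane-stress relation Gc ≈ K_Ic²/E, the code below computes the toughness contour value for a few materials. The property values are rough handbook-order-of-magnitude numbers chosen for illustration, not authoritative data:

```python
# Illustrative property values (order-of-magnitude only).
materials = {
    #                 K_Ic (MPa*m^0.5)   E (GPa)
    "steel":         (50.0,             210.0),
    "alumina":       (3.5,              370.0),
    "polycarbonate": (2.2,              2.4),
}

def toughness_Gc_kJ_per_m2(k_ic_mpa_sqrt_m, e_gpa):
    # K^2/E in these units: (MPa^2 * m) / GPa = 1e12 / 1e9 J/m^2 = kJ/m^2
    return k_ic_mpa_sqrt_m ** 2 / e_gpa

gc = {name: toughness_Gc_kJ_per_m2(k, e)
      for name, (k, e) in materials.items()}

# Materials on the same Gc contour share the same K^2/E even when K and
# E individually differ by orders of magnitude -- the normalization that
# lets very different material classes sit on one map.
print(sorted(gc, key=gc.get))  # lowest to highest Gc
```

    Ranking by the derived index rather than by either raw property is exactly the kind of normalization scheme the Ashby maps exploit.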

    The origins of the data can be either from experiment or computation and the former, when organized in terms of combinatorial experiments, can provide an opportunity to screen large amounts of data in a high-throughput fashion (see Figure 1.9).

    Figure 1.9 Two examples of assessing large arrays of data in a microarray format.

    (A) A correlation matrix showing effects of blood pressure control. (From Stoll et al., 2001.)

    (B) An experimental array of thin-film chemistries showing empirical correlations between optical behavior and chemistry. (From Liu and Schultz, 1999.)

    The materials informatics pathway to knowledge discovery, however, is not a linear one but rather an iterative process that can provide new information at each information cycle (Figure 1.10).

    Figure 1.10 A comparison of a hypothesis-driven strategy with a systems biology approach (A; adapted from Kitano, 2002) with a systems engineering approach for materials development (B; from Noor et al., 2000).

    As described by Kitano (2002), in terms of biological research, a cycle of research begins with the selection of contradictory issues of biological significance and the creation of a model representing the phenomenon. Models can be created either automatically or manually. The model represents a computable set of assumptions and hypotheses that need to be tested or supported experimentally. A similar analogy may be applied to materials science in trying to explain unexpected or unusual materials behavior such as the discovery, nearly two decades ago, of high-temperature superconductivity exhibited by oxide systems. Up to that point, the majority (but not all) of research in this field was focused on intermetallics. The new discovery at the time spawned a vast array of studies both experimental and theoretical to gain better understanding of the causes of this important materials behavior. This of course was part of a cycle of hypothesis-driven research in superconductivity that has had a long and distinguished history. The computational simulations (biologists refer to them as dry experiment) on models reveal computational adequacy of the assumptions and hypotheses embedded in each model. Inadequate models would expose inconsistencies with established experimental facts, and thus need to be rejected or modified. Models that pass this test become subjects of a thorough system analysis where a number of predictions may be made. A set of predictions that can distinguish a correct model among competing models is selected for experimental validation (called wet experiments by biologists). Successful experiments are those that eliminate inadequate models. Models that survive this cycle are deemed to be consistent with existing experimental evidence. 
While this is an idealized process of systems biology research, the hope is that advancement of research in computational science, analytical methods, technologies for measurements, and genomics/material informatics can transform research to fit this cycle for more systematic and hypothesis-driven science.
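    The reject-or-survive cycle described above can be caricatured in a few lines: candidate models are scored against "experimental" observations, and those inconsistent beyond a tolerance are eliminated. Models, data, and tolerance here are all invented for illustration:

```python
# Toy "wet experiment" data: (x, y) observations.
observations = [(1.0, 2.1), (2.0, 4.0), (3.0, 5.9), (4.0, 8.1)]

# Competing hypotheses, each a computable model ("dry experiments").
candidate_models = {
    "linear":    lambda x: 2.0 * x,
    "quadratic": lambda x: x ** 2,
    "constant":  lambda x: 3.0,
}

def max_residual(model):
    # Worst-case disagreement between model prediction and observation.
    return max(abs(model(x) - y) for x, y in observations)

tolerance = 0.5
surviving = sorted(name for name, model in candidate_models.items()
                   if max_residual(model) <= tolerance)
print(surviving)  # models consistent with existing evidence survive
```

    Models that survive one cycle would then be used to generate new, discriminating predictions for the next round of experiments, closing the loop sketched in Figure 1.10.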

    4 Data Mining: Data-Driven Materials Research

    Broadly speaking, data-mining techniques have two primary functions, pattern recognition and prediction, both of which form the foundations for understanding materials behavior. Based on the treatment by Tan et al. (2004), the former, which is more descriptive in scope, serves as a basis for deriving correlations, trends, clusters, trajectories, and anomalies among disparate data. The interpretation of these patterns is intrinsically tied to an understanding of materials physics and chemistry. In many ways this role of data mining is similar to the phenomenological structure–property paradigms that play a central role in the study of engineering materials, except that now we will be able to recognize these relationships with far greater speed and without necessarily depending on a priori models, provided of course we have the relevant data. The predictive aspect of data-mining tasks can serve both classification and regression operations. Data mining, which is an interdisciplinary blend of statistics, machine learning, artificial intelligence, and pattern recognition, is considered to have a few core tasks:

    • Cluster analysis: Seeks to find groups of closely related observations and is valuable in targeting groups of data that may have well-behaved correlations and can form the basis of physics-based as well as statistically-based models. Cluster analysis, when integrated with high-throughput experimentation, can serve as a powerful tool for rapidly screening combinatorial libraries.

    • Predictive modeling: Helps build models for targeted objectives (e.g. a specific materials property) as a function of input or exploratory variables. The success of these models also helps refine the usefulness and relevance of the input parameters.

    • Association analysis: Used to discover patterns that describe strongly associated features in data (e.g. the frequency of association of a specific materials property to materials chemistry). Such an analysis over extremely large data sets is made possible with the development of very-high-speed search algorithms and can help to develop heuristic rules for materials behavior governed by many factors.

    • Anomaly detection: Does the opposite by identifying data or observations significantly different from the norm. The ability to identify such anomalies or outliers is critical in materials since it can serve to identify a new class of materials with an unusual property (e.g. superconducting ceramics as opposed to insulating ceramics) or anticipate potentially harmful effects that are often identified through a retrospective analysis after an engineering failure (e.g. ductile–brittle transition).
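    Two of these core tasks can be sketched together on synthetic two-dimensional "property" data: a minimal k-means implementation finds the clusters, and points far from every cluster center are flagged as anomalies. The data, the planted outlier, and the distance threshold are all illustrative choices:

```python
import math
import random

# Two tight synthetic clusters plus one deliberately planted outlier
# (think: a material with an unusual property).
random.seed(1)
cluster_a = [(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(30)]
cluster_b = [(random.gauss(5, 0.3), random.gauss(5, 0.3)) for _ in range(30)]
outlier = (10.0, -4.0)
points = cluster_a + cluster_b + [outlier]

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def kmeans(points, centers, iters=20):
    # Lloyd's algorithm: assign each point to its nearest center, then
    # move each center to the mean of its assigned points.
    for _ in range(iters):
        groups = {i: [] for i in range(len(centers))}
        for p in points:
            groups[min(range(len(centers)),
                       key=lambda i: dist(p, centers[i]))].append(p)
        centers = [tuple(sum(c) / len(g) for c in zip(*g)) if g
                   else centers[i]
                   for i, g in groups.items()]
    return centers

centers = kmeans(points, centers=[(1.0, 1.0), (4.0, 4.0)])

# Anomaly detection: flag points far from every cluster center.
anomalies = [p for p in points if min(dist(p, c) for c in centers) > 3.0]
print(anomalies)  # the planted outlier
```

    The same distance-to-nearest-center score that defines the clusters doubles as the anomaly criterion, which is why the two tasks are often run together when screening combinatorial libraries.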

    In most materials science studies, we identify a priori likely variables or parameters that affect a set of properties. This is usually based on theoretical considerations and/or heuristic analysis based on prior experience. It is, however, difficult to integrate information simultaneously from multivariate data and especially when phenomenological relationships cannot always be explained in advance.

    As suggested by Ideker and Lauffenburger (2003), relationships between different components of information can be extracted from the scaffold using high-level computational models, which identify the key components, interactions, and influences required for more detailed low-level models. Large-scale experimental measurements validate high-level models, whereas targeted experimental manipulations and measurements test low-level models. The ultimate goal of knowledge discovery is achieved through the systematic integration of data and of correlation analyses developed with data-mining tools, validated, most importantly, by fundamental theory- and experiment-based materials science. The sources of data can be varied and numerous, ranging from computer simulations and high-throughput combinatorial experimentation to large-scale databases of legacy information. The application of advanced data-mining tools permits the processing of very large sets of information in a robust yet rapid manner. The collective integration of statistical learning tools (the high-level models shown in Figure 1.11) with experimental and computational materials science allows for an informatics-driven strategy for materials design.

    Figure 1.11 Computational models of cellular processes span a wide range of levels of abstraction. At the highest level, statistical data-mining approaches correlate dependent and independent variables, elucidating model components and their potential interrelationships. At a somewhat lower level, Bayesian networks expand on these relationships by modeling conditional dependencies of child nodes on their parents in the network, whereas Boolean and fuzzy logic models dictate logical rules governing these dependencies. Finally, at a relatively detailed level, Markov chains allow probabilistic production, loss and interconversion among molecular species and states, and complex systems of differential equations explicitly model a wide range of materials science behavior ranging from electronic structure calculations (e.g. Schrödinger equation) to transport behavior (diffusion equations). (From Ideker and Lauffenburger, 2003.)

    Ultimately, the processing–structure–properties paradigm that forms the core of materials development is based on understanding multivariate correlations and their interpretation in terms of the fundamental physics, chemistry, and engineering of materials. The field of materials informatics can advance that paradigm in a significant manner. A few critical questions may be helpful to keep in mind in building the informatics infrastructure for materials science:

    • How can data mining/machine learning best be used to discover which attributes (or combinations of attributes) in a material govern specific properties? Using information from different databases, we can compare and search for associations and patterns that can lead to ways of relating information among these different data sets.

    • What are the most interesting patterns that can be extracted from the existing material science data? Such a pattern search process can potentially yield associations between seemingly disparate data sets, as well as establish possible correlations between parameters that are not easily studied experimentally in a coupled manner.

    • How can we use mined associations from large volumes of data to guide future experiments and simulations? How does one select, from a materials library, the compounds most likely to have the desired properties? Data-mining methods should be incorporated as part of design and testing methodologies to increase the efficiency of the materials application process. For instance, a possible test bed for materials discovery could involve the use of massive databases on crystal structure, electronic structure, and thermochemistry. Each of these databases by itself can provide information on hundreds of binary, ternary, and multicomponent systems. Coupled to electronic structure and thermochemical calculations, one can enlarge this library to permit a wide array of simulations for thousands of combinations of materials chemistries. Such a massively parallel approach to generating new virtual data would be a daunting if not impossible task were it not for data-mining tools such as those proposed here.
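    One hedged sketch of such library screening, with every composition, descriptor, and property value invented for illustration: a nearest-neighbor model predicts a property for unmeasured candidates from the closest measured point in a simple descriptor space, and the library is ranked against a target:

```python
# Hypothetical measured data: descriptor -> property value (say, a band
# gap in eV). The descriptor here is an invented pair
# (mean atomic number, electronegativity difference).
measured = {
    (12.0, 1.2): 4.1,
    (30.0, 0.4): 1.1,
    (50.0, 2.0): 5.8,
    (20.0, 0.9): 2.9,
}

def predict(descriptor):
    # 1-nearest-neighbor regression: inherit the value of the closest
    # measured point in descriptor space.
    nearest = min(measured,
                  key=lambda d: sum((a - b) ** 2
                                    for a, b in zip(d, descriptor)))
    return measured[nearest]

# Unmeasured candidates in the library (names and descriptors invented).
library = {"cand_A": (13.0, 1.1), "cand_B": (48.0, 1.9), "cand_C": (29.0, 0.5)}

target = 5.0  # screen for candidates predicted closest to a target value
ranked = sorted(library, key=lambda c: abs(predict(library[c]) - target))
print(ranked)  # most promising candidate first
```

    In practice the "measured" set would come from the crystal-structure, electronic-structure, and thermochemical databases mentioned above, and the model would be far richer, but the rank-then-test loop is the same.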

    In this, the first chapter of the book, we have essentially provided a brief summary of some of the topics to be covered in the following chapters. We begin by providing an overview of some of the data-mining tools and vocabulary of the field of informatics. The following chapters examine how informatics or information processing of discrete data is being or can be applied to the field of materials science.

    References

    1. Aitchison JD, Galitski T. Inventories to insights. J Cell Biol. 2003;161(3):465–469.

    2. Ashby MF. Materials Selection in Mechanical Design Burlington, MA: Elsevier; 2011.

    3. Ashby MF, Greer AL. Metallic glasses as structural materials. Scr Mater. 2006;54:321–326.

    4. Butcher EC, Berg EL, Kunkel EJ. Systems biology in drug discovery. Nat Biotechnol. 2004; http://dx.doi.org/10.1038/nbt1017.

    5. Csete ME, Doyle JC. Reverse engineering of biological complexity. Science. 2002;295:1664–1669.

    6. Ideker T, Lauffenburger D. Building with a scaffold: Emerging strategies for high- to low-level cellular modeling. Trends Biotechnol. 2003;21:255–262.

    7. Ideker T, et al. Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics. 2002;18:S233–S240.

    8. King RD, Garrett SM, Coghill GM. On the use of qualitative reasoning to simulate and identify metabolic pathways. Bioinformatics. 2005;21:2017–2026.

    9. Kitano H. Systems biology: a brief overview. Science. 2002;295:1662–1664.

    10. Liu DR, Schultz PG. Generating new molecular function: a lesson from nature. Angew Chem Int Ed. 1999;38:36–54.

    11. Noor AK, Venneri SL, Paul DB, Hopkins MA. Structures technology for future aerospace systems. Comput Struct. 2000;74:507–519.

    12. Rajan K. Materials informatics. Mater Today. 2005;8:35–39.

    13. Stoll M, Cowley Jr AW, Tonellato PJ, et al. A genomic-systems biology map for cardiovascular function. Science. 2001;294:1723–1726.

    14. Strausberg RL, Schreiber SL. From knowing to controlling: a path from genomics to drugs using small molecule probes. Science. 2003;300:294–295.

    15. Tan P-N, Steinbach M, Kumar V. Introduction to Data Mining. Addison-Wesley; 2004.

    16. Wirth BD, Caturla MJ, Diaz de la Rubia T, Khraishi T, Zbib H. Mechanical property degradation in irradiated materials: a multiscale modeling approach. Nucl Instrum Methods Phys Res B. 2001;180:23–31.

    Chapter 2

    Data Mining in Materials Science and Engineering

    Chandrika Kamath and Ya Ju Fan,    Center for Applied Scientific Computing, Lawrence Livermore National Laboratory

    1 Introduction

    Data mining techniques are increasingly being applied to data from scientific simulations, experiments, and observations with the aim of finding useful information in these data. These data sets are often massive and can be quite complex, taking the form of structured or unstructured mesh data from simulations or sequences of images from experiments. They are often multivariate, such as data from different sensors monitoring a process or an experiment. The data can be at different spatial and temporal scales, for example when a material is modeled at different scales to understand its behavior. In the case of experiments and observations, the data often have missing values and may be of low quality, for example images with low contrast or noisy sensor readings. Beyond the size and complexity of the data, further challenges arise when the data are analyzed for scientific discovery or for decision making. In the former case, the domain scientists may not have a well-formulated question they want addressed, or they may want the data explored to determine whether any insights can be gained. In the latter case, in addition to the results of the analysis, scientists may need to know how far they can trust those results, since decisions will be based on them.

    In this chapter, we provide a brief introduction to the field of data mining, focusing on techniques that are useful in the analysis of data from materials science and engineering applications. Data mining is the process of uncovering patterns, associations, anomalies, and statistically significant structures and events in data. It borrows and builds on ideas from a diverse collection of fields, including statistics, machine learning, pattern recognition, mathematical optimization, and signal, image, and video analysis. The literature on data analysis techniques is therefore enormous, and solution approaches are often rediscovered in different fields, where they may be known by different names. In addition, scientists often create their own solutions when they can exploit properties of the data to reduce the time for analysis or improve its accuracy.

    In light of this, we view this chapter as a starting point for someone interested in learning about the field of data mining and understanding the different categories of techniques available to address their data analysis problems in materials science. We begin the chapter by discussing the types of analysis problems often encountered in scientific domains, followed by a brief description of the analysis process. The bulk of the chapter focuses on different categories of analysis algorithms, including image analysis, dimension reduction, and the building of descriptive and predictive models. We conclude with some suggestions for further reading.

    First, some caveats. This chapter is written from the viewpoint of a data miner, not a materials scientist. The focus therefore is on algorithms rather than the insights into materials science one might gain from the application of these algorithms. Further, for the reasons mentioned earlier, the chapter barely scratches the surface of a broad, multidisciplinary field. Therefore, we recommend that the interested reader learn more about various techniques available to address their problem before selecting one for use with their data.

    2 Analysis Needs of Science Applications

    There are many ways in which scientific data-mining techniques are being used to analyze data from scientific simulations, experiments, and observations in a variety of domains ranging from astronomy to combustion, plasma physics, wind energy, and materials science. The tasks being addressed in these domains are often very similar, and data analysis problems in materials science can frequently be addressed using approaches developed in the context of a related problem in a different application domain.

    In the case of experimental data, the data are often in the form of one-dimensional signals or two-dimensional images. The data may be multivariate, for example the signals from several sensors monitoring a process, or may have a time component, such as a sequence of images taken over time. Data in the form of images are often analyzed to extract objects of interest and their characteristics, such as galaxies in astronomical surveys (Kamath et al., 2002) or fragments in images of material fragmentation (Kamath and Hurricane, 2011). Streaming data from sensors monitoring a process may be analyzed to determine if the process is evolving as expected; if something untoward is about to happen, prompting a shutdown; or if the process is moving from one normal state to another, requiring a change in the control parameters. Data analysis techniques are being used in manufacturing (Harding et al., 2006), in materials development (Morgan and Ceder, 2005), and in the intelligent processing of materials (Wadley and Vancheeswaran, 1998), which integrates advanced sensors, process models, and feedback control concepts.
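    As an illustration of the streaming case, the check for untoward behavior can be as simple as flagging a reading that deviates sharply from recent history. The sketch below, with hypothetical sensor values, flags any reading that lies more than three standard deviations from the mean of a trailing window; real monitoring systems are of course more sophisticated.

```python
from collections import deque
from statistics import mean, stdev

def monitor(stream, window=20, threshold=3.0):
    """Yield (index, value, is_anomaly) for each reading in the stream.

    A reading is flagged when it lies more than `threshold` standard
    deviations from the mean of the trailing window of prior readings.
    """
    history = deque(maxlen=window)
    for i, x in enumerate(stream):
        if len(history) >= 2:
            mu, sigma = mean(history), stdev(history)
            is_anomaly = sigma > 0 and abs(x - mu) > threshold * sigma
        else:
            is_anomaly = False   # not enough history yet
        yield i, x, is_anomaly
        history.append(x)

# A steady hypothetical signal with one sudden excursion at index 7:
readings = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 5.0, 1.0, 1.1]
flags = [i for i, x, bad in monitor(readings, window=5) if bad]
```

In a real setting, a flagged reading would prompt a shutdown or a change in the control parameters, as described above.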

    In the case of simulation data, the output of the simulations can be analyzed to extract information on the phenomenon being modeled. A common task is to identify coherent structures in the data and extract statistics on these structures as they evolve over time. Other tasks include: sensitivity analysis (Saltelli et al., 2009) to understand how sensitive the outputs are to changes in the inputs of the simulation; uncertainty quantification (Committee on Mathematical Foundations of Verification, 2012) to understand how uncertainty in the inputs affects the outputs; and design and analysis of computer experiments (Fang et al., 2005), which uses the simulations to better understand the input space of the phenomenon. For example, if we are interested in creating materials with certain properties, we could use computer simulations to generate the properties for a sample of compounds. We could then analyze the inputs and outputs of these simulations to determine which inputs are more sensitive (and therefore must be sampled more finely), to build a data-driven model to predict the output given the inputs, and to place additional sample points at appropriate input values to create a more accurate predictive model.
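    A minimal sketch of this workflow, using a linear least-squares surrogate on synthetic input-output samples (both the model form and the data are illustrative assumptions, not from the text): the coefficient magnitudes give a crude sensitivity ranking, and the fitted surrogate predicts the output for unsimulated inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 50 simulation runs, each with two input parameters.
# The output depends strongly on the first input and weakly on the second.
X = rng.uniform(0.0, 1.0, size=(50, 2))
y = 5.0 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(0.0, 0.01, size=50)

# Fit a linear surrogate y ~ X @ w + b by least squares.
A = np.column_stack([X, np.ones(len(X))])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)

# Coefficient magnitudes as a crude sensitivity measure: the first input
# dominates, so it is the one to sample more finely.
sensitivity = np.abs(coeffs[:2])

# The surrogate can then predict the output at new, unsimulated inputs.
x_new = np.array([0.5, 0.5, 1.0])   # two inputs plus the intercept term
y_pred = x_new @ coeffs
```

For nonlinear responses one would substitute a more flexible data-driven model (for example a Gaussian process or a random forest), but the sample-fit-predict-refine loop is the same.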

    In some problems, we may need to analyze both simulation and experimental data. This is often the case in validation, where we compare how close a simulation is to an experiment by extracting statistics from both (Kamath and Miller, 2007). Or we may want to use the simulations to guide an experiment or understand the results of an experiment better.
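    One way to sketch such a comparison is to measure the gap between the empirical distributions of a statistic extracted from the simulation and from the experiment. Below, a two-sample Kolmogorov-Smirnov statistic on hypothetical values (the data and the choice of statistic are assumptions for illustration):

```python
import bisect

def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical cumulative distribution functions of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)

    def ecdf(sorted_sample, x):
        # Fraction of the sample with values <= x.
        return bisect.bisect_right(sorted_sample, x) / len(sorted_sample)

    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in sorted(set(a) | set(b)))

# Hypothetical values of some statistic from simulation and experiment:
simulated = [1.0, 1.2, 1.1, 0.9, 1.05]
measured = [1.0, 1.15, 1.1, 0.95, 1.0]
gap = ks_distance(simulated, measured)   # 0 = identical ECDFs, 1 = disjoint
```

A small gap suggests the simulation reproduces the experimental distribution of that statistic; whether the gap is small enough is a judgment left to the domain scientist.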

    Though there is a wide variety of analysis tasks encountered in scientific data sets, the techniques used in the analysis are often very similar. For example, methods used to extract information from images may be very similar regardless of whether the images were obtained using a hand-held camera or a scanning electron microscope. Also, techniques to identify coherent structures in simulation output may be similar for structured and unstructured grids. We next discuss some of the analysis techniques relevant to problems in materials science and engineering. A detailed discussion of all techniques is beyond the scope of this chapter; instead, brief descriptions are provided, followed by suggestions for further reading.

    3 The Scientific Data-Mining Process

    The process of scientific data mining is usually a multistep process, with the techniques used in each step motivated by the type of data and the type of analysis being performed (Kamath, 2009). At a high level, we can consider the process to consist of five steps, as shown in Figure 2.1. The first step is to identify and extract the objects of interest in the data. In some problems this is relatively easy, for example when the objects are chemical compounds or proteins and we are provided data on each of the compounds or proteins. In other cases, it can be more complicated, for example when the data are in the form of images and we need to identify the objects (say, galaxies in astronomical images) and extract them from the background. Once the objects have been identified, we need to describe them using features or descriptors. These should reflect the analysis task. For example, if the task focuses on the structure or the shape of the objects, the descriptors must reflect the structure or the shape respectively. In many cases, one may extract far more features than is necessary, requiring a reduction in the number of features or the dimension of the problem. These key features are then used in the pattern recognition step and the patterns extracted are visualized for validation by the domain scientists.

    Figure 2.1 Scientific data analysis: an iterative and interactive process.
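    The dimension reduction step of this process can be sketched with principal component analysis computed from the singular value decomposition. The feature matrix below is synthetic (six correlated descriptors driven by two latent factors), purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic objects-by-features matrix: 100 objects, 6 descriptors that
# are driven by only 2 latent factors plus a little noise.
latent = rng.normal(size=(100, 2))
features = latent @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(100, 6))

# Principal component analysis via the SVD of the centered data.
centered = features - features.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
explained = (s ** 2) / (s ** 2).sum()   # variance fraction per component

# Keep just enough components to capture 99% of the variance.
k = int(np.searchsorted(np.cumsum(explained), 0.99)) + 1
reduced = centered @ Vt[:k].T           # objects in the reduced space
```

The reduced representation then feeds the pattern recognition step, where far fewer dimensions make both model building and visualization more tractable.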

    The data analysis process is iterative and interactive; any step may lead to a refinement of one or more of the previous steps and not all steps may be used during the analysis of a particular data set. For example, in some problems, such as the analysis of coherent structures (Kamath et al., 2009) or images of material fragmentation (Kamath and Hurricane, 2011), the analysis task may be to extract the objects and statistics on them. This would require only the first two steps of the process. In other problems, we may be given the data set in the form of objects described by a set of features and the task may be to identify the important features using dimension reduction techniques. A few problems may require the complete end-to-end process, for example finding galaxies with a particular shape in astronomical images.

    In our experience, data analysis is a close collaboration between the analysis expert and the domain scientist who is actively involved in all steps, starting from the initial description of the data and the problem, the extraction of potentially relevant features, the identification of the training set where necessary in pattern recognition, and the validation of the results from each step.

    We next discuss various analysis techniques we have found to be broadly useful in analysis of data from scientific simulations, experiments, and observations.

    4 Image Analysis

    In this section, we use some of our prior work to describe the tasks in image analysis. We consider the analysis of images obtained from experiments investigating the fragmentation of materials (Kamath and Hurricane, 2007, 2011). Figure 2.2A shows a subset of a larger image of a material as it fragments. The lighter regions are the fragments of the material, while the darker areas are the gaps between the fragments. The images were analyzed to obtain statistics for both the fragments (such as their size) and the gaps (such as their length and width). The distributions of these characteristics, in the form of histograms, were then used to provide a concise summary of each image.

    Figure 2.2 Processing of an image resulting from the fragmentation of a material. (A) Original image. (B) After the application of the Retinex algorithm to make the illumination uniform. (C) After smoothing to reduce the noise. (D) After segmentation to isolate the fragments from the background. (E) A zoomed-in view after cleanup following segmentation. (F) Identifying the skeleton of the gap regions.
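    Once an image has been segmented into fragment and gap pixels, the fragment-size statistics can be obtained by labeling connected components and counting pixels per label. A minimal pure-Python sketch on a toy binary image (not the experimental data; production code would use an image processing library):

```python
from collections import Counter

def label_components(binary):
    """Label the 4-connected components of a binary image (a list of lists
    of 0/1) by flood fill; return the label grid and the component count."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    current = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not labels[r][c]:
                current += 1               # start a new component
                stack = [(r, c)]
                labels[r][c] = current
                while stack:
                    i, j = stack.pop()
                    for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                        if (0 <= ni < rows and 0 <= nj < cols
                                and binary[ni][nj] and not labels[ni][nj]):
                            labels[ni][nj] = current
                            stack.append((ni, nj))
    return labels, current

# A toy segmented image: 1 = fragment pixel, 0 = gap.
image = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
]
labels, n = label_components(image)
sizes = Counter(v for row in labels for v in row if v)   # pixels per fragment
```

The resulting per-fragment sizes can then be binned into the histograms that summarize each image, as described above.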

    The approach used in the analysis was to first segment the fragments from the background (that is, the gaps) and then extract the statistics on both the fragments and the gaps. This is challenging for several reasons. First, there is a large variation in the intensity, with the top left corner being brighter than the lower right region. The intensity of fragments in the darker regions is similar to the intensity of the gaps in the brighter regions. Some fragments, especially those in the lower right corner, have a range of intensity values. The images are also quite grainy and there is no clear demarcation between the fragments and the
