The Synthesis Genome: Data Mining for Synthesis of New Materials

Project Personnel

Elsa Olivetti

Principal Investigator

Massachusetts Institute of Technology

Yuriy Román-Leshkov

Massachusetts Institute of Technology

Gerbrand Ceder

University of California, Berkeley

Andrew McCallum

University of Massachusetts Amherst

Funding Divisions

Division of Materials Research (DMR), Civil, Mechanical and Manufacturing Innovation (CMMI), Office of Advanced Cyberinfrastructure (OAC)

Development of new materials is the key to addressing many of the technical challenges our society faces, from energy storage to water treatment and purification. To offer just a few examples: in the oil industry, new materials are needed to withstand aggressive conditions, where failure comes at tremendous cost; electrified vehicle drivetrains will be advanced by higher-performing battery electrodes; and carbon dioxide capture requires inexpensive new materials with the proper thermodynamic and kinetic behavior for absorption and release. The rapid design of novel materials has been transformed by computational approaches in which the properties of many tens of thousands of materials can be predicted or inferred. The pace at which advanced materials reach commercial use now appears to be limited by trial-and-error synthesis techniques. In other words, researchers have accelerated the process of knowing what to make to the point that the bottleneck is now how to make it. This research will learn from existing knowledge to develop insight into the synthesis of inorganic compounds. The analytical foundation of these activities stems from advances in machine learning that have allowed computers to excel at typically "human" tasks such as health care diagnoses and game show participation. This research will further accelerate the goals of efforts such as the Materials Genome Initiative for Global Competitiveness by enabling efficient synthesis of novel materials, thereby speeding up the evaluation of newly suggested materials.

Publications

Machine-Learning Rationalization and Prediction of Solid-State Synthesis Conditions
H. Huo, C. J. Bartel, T. He, A. Trewartha, A. Dunn, B. Ouyang, A. Jain, and G. Ceder
8/5/2022
Dataset of solution-based inorganic materials synthesis procedures extracted from the scientific literature
Z. Wang, O. Kononova, K. Cruse, T. He, H. Huo, Y. Fei, Y. Zeng, Y. Sun, Z. Cai, W. Sun, and G. Ceder
5/25/2022
Aluminum alloy compositions and properties extracted from a corpus of scientific manuscripts and US patents
O. P. Pfeiffer, H. Liu, L. Montanelli, M. I. Latypov, F. G. Sen, V. Hegadekatte, E. A. Olivetti, and E. R. Homer
3/30/2022
ULSA: unified language of synthesis actions for the representation of inorganic synthesis protocols
Z. Wang, K. Cruse, Y. Fei, A. Chia, Y. Zeng, H. Huo, T. He, B. Deng, O. Kononova, and G. Ceder
1/1/2022
A priori control of zeolite phase competition and intergrowth with high-throughput simulations
D. Schwalbe-Koda, S. Kwon, C. Paris, E. Bello-Jurado, Z. Jensen, E. Olivetti, T. Willhammar, A. Corma, Y. Román-Leshkov, M. Moliner, and R. Gómez-Bombarelli
10/15/2021
Discovering Relationships between OSDAs and Zeolites through Data Mining and Generative Neural Networks
Z. Jensen, S. Kwon, D. Schwalbe-Koda, C. Paris, R. Gómez-Bombarelli, Y. Román-Leshkov, A. Corma, M. Moliner, and E. A. Olivetti
4/16/2021
Opportunities and challenges of text mining in materials research
O. Kononova, T. He, H. Huo, A. Trewartha, E. A. Olivetti, and G. Ceder
3/1/2021
Literature mining for alternative cementitious precursors and dissolution rate modeling of glassy phases
H. Uvegi, Z. Jensen, T. N. Hoang, B. Traynor, T. Aytaş, R. T. Goodwin, and E. A. Olivetti
2/12/2021
MS-Mentions: Consistently Annotating Entity Mentions in Materials Science Procedural Text
T. O’Gorman, Z. Jensen, S. Mysore, K. Huang, R. Mahbub, E. Olivetti, and A. McCallum
1/1/2021
Data-driven materials research enabled by natural language processing and information extraction
E. A. Olivetti, J. M. Cole, E. Kim, O. Kononova, G. Ceder, T. Y. Han, and A. M. Hiszpanski
12/1/2020
Text mining for processing conditions of solid-state battery electrolytes
R. Mahbub, K. Huang, Z. Jensen, Z. D. Hood, J. L. M. Rupp, and E. A. Olivetti
12/1/2020
Similarity of Precursors in Solid-State Synthesis as Text-Mined from Scientific Literature
T. He, W. Sun, H. Huo, O. Kononova, Z. Rong, V. Tshitoyan, T. Botari, and G. Ceder
8/19/2020
Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks
E. Kim, Z. Jensen, A. van Grootel, K. Huang, M. Staib, S. Mysore, H. Chang, E. Strubell, A. McCallum, S. Jegelka, and E. Olivetti
1/7/2020
An Instance Level Approach for Shallow Semantic Parsing in Scientific Procedural Text
D. Swarup, A. Bajaj, S. Mysore, T. O’Gorman, R. Das, and A. McCallum
1/1/2020
A Machine Learning Approach to Zeolite Synthesis Enabled by Automatic Literature Data Extraction
Z. Jensen, E. Kim, S. Kwon, T. Z. H. Gani, Y. Román-Leshkov, M. Moliner, A. Corma, and E. Olivetti
4/19/2019

Designing Materials to Revolutionize and Engineer our Future (DMREF)