The Synthesis Genome: Data Mining for Synthesis of New Materials

Project Personnel

Elsa Olivetti

Principal Investigator

Massachusetts Institute of Technology (MIT)


Gerbrand Ceder

University of California, Berkeley


Yuriy Roman

Massachusetts Institute of Technology (MIT)


Andrew McCallum

University of Massachusetts, Amherst


Funding Divisions

Division of Materials Research (DMR), Civil, Mechanical and Manufacturing Innovation (CMMI), Office of Advanced Cyberinfrastructure (OAC)

Development of new materials is key to addressing many of the technical challenges our society faces, from energy storage to water treatment and purification. To offer just a few examples: in the oil industry, new materials are needed to withstand aggressive conditions, where failure comes at tremendous cost; electrified vehicle drive trains will be advanced by higher-performing battery electrodes; and carbon dioxide capture requires inexpensive new materials with the proper thermodynamic and kinetic behavior for absorption and release. The rapid design of novel materials has been transformed by approaches in which properties of many tens of thousands of materials can be predicted or inferred by a computer. The pace at which advanced materials are commercially realized now seems to be limited by trial-and-error synthesis techniques. In other words, researchers have accelerated the process of knowing what to make, so the bottleneck is now how to make it. This research will mine existing knowledge to develop insight into the synthesis of inorganic compounds. The analytical foundation of these activities stems from advances in machine learning that have allowed computers to excel at typically "human" tasks such as health care diagnosis and game show participation. This research will further advance the goals of efforts such as the Materials Genome Initiative for Global Competitiveness by enabling efficient synthesis of novel materials, thereby speeding up the evaluation of newly suggested materials.


Machine-Learning Rationalization and Prediction of Solid-State Synthesis Conditions
H. Y. Huo, C. J. Bartel, T. J. He, A. Trewartha, A. Dunn, B. Ouyang, A. Jain, and G. Ceder
Aug 2022
Dataset of Solution-based Inorganic Materials Synthesis Procedures Extracted from the Scientific Literature
Z. R. Wang, O. Kononova, K. Cruse, T. N. He, H. Y. Huo, Y. X. Fei, Y. Zeng, Y. Z. Sun, Z. J. Cai, W. H. Sun, and G. Ceder
May 2022
ULSA: Unified language of synthesis actions for the representation of inorganic synthesis protocols
Zheren Wang, Kevin Cruse, Yuxing Fei, Ann Chia, Yan Zeng, Haoyan Huo, Tanjin He, Bowen Deng, Olga Kononova, and Gerbrand Ceder
Apr 2022
Aluminum Alloy Compositions and Properties Extracted from a Corpus of Scientific Manuscripts and US Patents
O. P. Pfeiffer, H. H. Liu, L. Montanelli, M. I. Latypov, F. G. Sen, V. Hegadekatte, E. A. Olivetti, and E. R. Homer
Mar 2022
A priori control of zeolite phase competition and intergrowth with high-throughput simulations
Schwalbe-Koda, D., Kwon, S., Paris, C., Bello-Jurado, E., Jensen, Z., Olivetti, E., Willhammar, T., Corma, A., Román-Leshkov, Y., Moliner, M. and Gómez-Bombarelli, R.
Sep 2021
Discovering relationships between OSDAs and zeolites through data mining and generative neural networks
Jensen, Z., Kwon, S., Schwalbe-Koda, D., Paris, C., Gómez-Bombarelli, R., Román-Leshkov, Y., Corma, A., Moliner, M. and Olivetti, E.A.
Apr 2021
Opportunities and challenges of text mining in materials research
Olga Kononova, Tanjin He, Haoyan Huo, Amalie Trewartha, Elsa A. Olivetti, and Gerbrand Ceder
Mar 2021
Data-driven materials research enabled by natural language processing and information extraction
Elsa A. Olivetti, Jacqueline M. Cole, Edward Kim, Olga Kononova, Gerbrand Ceder, Thomas Yong-Jin Han, and Anna M. Hiszpanski
Dec 2020
Text Mining for Processing Conditions of Solid-state Battery Electrolyte
R. Mahbub, K. Huang, Z. Jensen, Z. D. Hood, J. L. M. Rupp, and E. A. Olivetti
Dec 2020
Literature Mining for Alternative Cementitious Precursors and Dissolution Rate Modeling of Glassy Phases
H. Uvegi, Z. Jensen, T. N. Hoang, B. Traynor, T. Aytas, R. T. Goodwin, and E. A. Olivetti
Dec 2020
Similarity of precursors in solid-state synthesis as text-mined from scientific literature
Tanjin He, Wenhao Sun, Haoyan Huo, Olga Kononova, Ziqin Rong, Vahe Tshitoyan, Tiago Botari, and Gerbrand Ceder
Aug 2020
Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks
E. Kim, Z. Jensen, A. van Grootel, K. Huang, M. Staib, S. Mysore, H.-S. Chang, E. Strubell, A. McCallum, S. Jegelka, and E. Olivetti
Jan 2020

Research Highlights