The Synthesis Genome: Data Mining for Synthesis of New Materials

Project Personnel

Elsa Olivetti

Principal Investigator

Massachusetts Institute of Technology


Yuriy Roman

Massachusetts Institute of Technology


Gerbrand Ceder

University of California, Berkeley


Andrew McCallum

University of Massachusetts Amherst


Funding Divisions

Division of Materials Research (DMR), Civil, Mechanical and Manufacturing Innovation (CMMI), Office of Advanced Cyberinfrastructure (OAC)

Developing new materials is key to addressing many of the technical challenges our society faces, from energy storage to water treatment and purification. To offer just a few examples: in the oil industry, new materials are needed to withstand aggressive conditions, where failure comes at tremendous cost; electrified vehicle drivetrains will be advanced by higher-performing battery electrodes; and carbon dioxide capture requires inexpensive new materials with the proper thermodynamic and kinetic behavior for absorption and release. The rapid design of novel materials has been transformed by computational approaches that predict or infer the properties of tens of thousands of materials. The pace at which advanced materials reach commercial use now appears to be limited by trial-and-error synthesis techniques. In other words, researchers have accelerated the process of knowing what to make, so the bottleneck is now how to make it. This research will learn from existing knowledge to develop insight into the synthesis of inorganic compounds. The analytical foundation of these activities stems from advances in machine learning that have allowed computers to excel at typically "human" tasks such as health care diagnosis and game show participation. This research will further advance the goals of efforts such as the Materials Genome Initiative for Global Competitiveness by enabling efficient synthesis of novel materials, thereby speeding up the evaluation of newly suggested materials.


Publications

Machine-Learning Rationalization and Prediction of Solid-State Synthesis Conditions
H. Huo, C. J. Bartel, T. He, A. Trewartha, A. Dunn, B. Ouyang, A. Jain, and G. Ceder
Dataset of solution-based inorganic materials synthesis procedures extracted from the scientific literature
Z. Wang, O. Kononova, K. Cruse, T. He, H. Huo, Y. Fei, Y. Zeng, Y. Sun, Z. Cai, W. Sun, and G. Ceder
Aluminum alloy compositions and properties extracted from a corpus of scientific manuscripts and US patents
O. P. Pfeiffer, H. Liu, L. Montanelli, M. I. Latypov, F. G. Sen, V. Hegadekatte, E. A. Olivetti, and E. R. Homer
ULSA: unified language of synthesis actions for the representation of inorganic synthesis protocols
Z. Wang, K. Cruse, Y. Fei, A. Chia, Y. Zeng, H. Huo, T. He, B. Deng, O. Kononova, and G. Ceder
A priori control of zeolite phase competition and intergrowth with high-throughput simulations
D. Schwalbe-Koda, S. Kwon, C. Paris, E. Bello-Jurado, Z. Jensen, E. Olivetti, T. Willhammar, A. Corma, Y. Román-Leshkov, M. Moliner, and R. Gómez-Bombarelli
Discovering Relationships between OSDAs and Zeolites through Data Mining and Generative Neural Networks
Z. Jensen, S. Kwon, D. Schwalbe-Koda, C. Paris, R. Gómez-Bombarelli, Y. Román-Leshkov, A. Corma, M. Moliner, and E. A. Olivetti
Opportunities and challenges of text mining in materials research
O. Kononova, T. He, H. Huo, A. Trewartha, E. A. Olivetti, and G. Ceder
Literature mining for alternative cementitious precursors and dissolution rate modeling of glassy phases
H. Uvegi, Z. Jensen, T. N. Hoang, B. Traynor, T. Aytaş, R. T. Goodwin, and E. A. Olivetti
MS-Mentions: Consistently Annotating Entity Mentions in Materials Science Procedural Text
T. O’Gorman, Z. Jensen, S. Mysore, K. Huang, R. Mahbub, E. Olivetti, and A. McCallum
Data-driven materials research enabled by natural language processing and information extraction
E. A. Olivetti, J. M. Cole, E. Kim, O. Kononova, G. Ceder, T. Y. Han, and A. M. Hiszpanski
Text mining for processing conditions of solid-state battery electrolytes
R. Mahbub, K. Huang, Z. Jensen, Z. D. Hood, J. L. M. Rupp, and E. A. Olivetti
Similarity of Precursors in Solid-State Synthesis as Text-Mined from Scientific Literature
T. He, W. Sun, H. Huo, O. Kononova, Z. Rong, V. Tshitoyan, T. Botari, and G. Ceder
Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks
E. Kim, Z. Jensen, A. van Grootel, K. Huang, M. Staib, S. Mysore, H. Chang, E. Strubell, A. McCallum, S. Jegelka, and E. Olivetti
An Instance Level Approach for Shallow Semantic Parsing in Scientific Procedural Text
D. Swarup, A. Bajaj, S. Mysore, T. O’Gorman, R. Das, and A. McCallum
A Machine Learning Approach to Zeolite Synthesis Enabled by Automatic Literature Data Extraction
Z. Jensen, E. Kim, S. Kwon, T. Z. H. Gani, Y. Román-Leshkov, M. Moliner, A. Corma, and E. Olivetti

Designing Materials to Revolutionize and Engineer our Future (DMREF)