Learning Molecular Mixture Property using Chemistry-Aware Graph Neural Network


James Rondinelli and Wei Cheng (Northwestern University)

Figure: The MolSets model architecture for predicting molecular mixture properties.

Recent advances in machine learning (ML) are expediting materials discovery and design. One significant challenge facing ML for materials is the expansive combinatorial space of potential materials formed by diverse constituents and their flexible configurations.

This complexity is particularly evident in molecular mixtures, a frequently explored class of materials that includes battery electrolytes. Because molecules have complex structures and a mixture has no inherent ordering of its components, conventional ML methods struggle to model such systems.

Here, MolSets, a specialized ML model for molecular mixtures, is presented to overcome these difficulties. MolSets represents individual molecules as graphs and their mixture as a set. It combines a graph neural network with the deep sets architecture to extract information at the molecular level and aggregate it at the mixture level, thus addressing local complexity while retaining global flexibility.
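The two-level idea can be illustrated with a small, dependency-free sketch. This is not the MolSets implementation: the single message-passing round, the mean pooling, the fraction-weighted sum, the fixed readout weights, and all function names are assumptions chosen only to show how per-molecule graph embeddings can be pooled into a mixture-level prediction that does not depend on the order in which components are listed.

```python
import math

def message_pass(node_feats, edges):
    """One round of sum-aggregation message passing over an undirected graph."""
    dim = len(node_feats[0])
    agg = [list(h) for h in node_feats]  # start from each node's own features
    for i, j in edges:
        for k in range(dim):
            agg[i][k] += node_feats[j][k]
            agg[j][k] += node_feats[i][k]
    return [[math.tanh(x) for x in h] for h in agg]

def embed_molecule(node_feats, edges):
    """Molecule-level embedding: mean of node states after message passing."""
    h = message_pass(node_feats, edges)
    return [sum(col) / len(h) for col in zip(*h)]

def embed_mixture(molecules, fractions):
    """Set-level pooling: fraction-weighted sum of molecule embeddings."""
    embs = [embed_molecule(feats, edges) for feats, edges in molecules]
    dim = len(embs[0])
    return [sum(w * e[k] for w, e in zip(fractions, embs)) for k in range(dim)]

def predict(molecules, fractions, weights=(0.5, -0.25), bias=0.1):
    """Toy linear readout; the weights are arbitrary placeholders, not trained."""
    z = embed_mixture(molecules, fractions)
    return sum(w * x for w, x in zip(weights, z)) + bias

# Two made-up molecules: (node feature vectors, undirected edge list).
mol_a = ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [(0, 1), (1, 2)])
mol_b = ([[0.5, 0.5], [1.0, 0.0]], [(0, 1)])

# Listing the same mixture in a different order gives the same prediction.
y_ab = predict([mol_a, mol_b], [0.7, 0.3])
y_ba = predict([mol_b, mol_a], [0.3, 0.7])

# Changing the composition changes the prediction.
y_swapped_fractions = predict([mol_a, mol_b], [0.3, 0.7])
```

Because the mixture embedding is a sum over components, reordering the inputs cannot change the output, while the graph step still lets each molecule's structure shape its own embedding.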

The efficacy of MolSets is demonstrated by predicting the conductivity of lithium battery electrolytes, which benefits virtual screening of the combinatorial chemical space.

The MolSets open-source code is available at: https://github.com/Henrium/MolSets

Designing Materials to Revolutionize and Engineer our Future (DMREF)