General reactive element-based machine learning potentials for heterogeneous catalysis - Nature Catalysis

In this work, we introduce our element-based machine learning potential (EMLP) approach, which focuses on learning the diverse interactions between elements rather than relying on fixed structural arrangements or predefined reaction coordinates. Central to our approach is random exploration via imaginary chemical optimization (REICO), a sampling procedure that constructs datasets that are detached from structural space and focus solely on atomic interactions. As previous works by Deringer et al. (refs. 31,32) and Smith et al. (ref. 33) reported, adding a level of randomness can enhance the robustness and generality of a machine learning potential (MLP). The REICO method avoids the traditional dependence on a specific system; instead, it uses completely random sampling. By leveraging small systems composed of randomly generated structures and their subsequent optimization trajectories, our EMLP can proficiently navigate a vast array of unique local atomic environments. Such a strategy significantly enhances the diversity and representativeness of training datasets, thereby extending the applicability of MLPs across a broader spectrum of chemical systems. The resulting EMLP can tackle arbitrary reactive catalytic processes without needing to sample the reaction pathways. To demonstrate this, we built an EMLP that contains five elements: palladium and silver, which are commercially used as metal catalysts, along with carbon, hydrogen and oxygen, which form various reactants in heterogeneous catalysis. We conducted a comprehensive set of generality tests and benchmarks, from data diversity analysis to MD behaviours. We have also applied the EMLP to various reaction systems in heterogeneous catalysis and beyond. Across this broad spectrum of applications, the EMLP predicts results that are consistent with chemical intuition and DFT calculations, without the need for retraining or fine-tuning.
In addition, we have benchmarked our potential against an MD-MLP trained using a traditional MD sampling method, and against MACE-mp (ref. 34), EquiformerV2 (ref. 35) and M3GNet (ref. 36), three well-recognized foundation models trained on millions of DFT data points. Across the wide array of tested systems, the EMLP consistently outperforms the other models.
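The random-sampling idea can be illustrated with a minimal Python sketch. It is not the REICO implementation: it only generates one small random structure from the five elements and rejects overlapping atoms. The function name `random_structure`, the box size, the minimum distance `d_min` and the retry limit are assumptions chosen for illustration; the 5-30 atom range mirrors the training-structure sizes mentioned later in the text. The subsequent optimization step, whose trajectory would supply further training frames, is not shown.

```python
import math
import random

ELEMENTS = ["Ag", "Pd", "C", "H", "O"]  # the five elements named in the text


def random_structure(rng, n_min=5, n_max=30, box=10.0, d_min=0.8, max_tries=10_000):
    """Return (symbols, positions) for one random, non-periodic structure.

    All numeric defaults are illustrative assumptions, not REICO settings.
    """
    n_atoms = rng.randint(n_min, n_max)
    symbols = [rng.choice(ELEMENTS) for _ in range(n_atoms)]
    positions = []
    tries = 0
    while len(positions) < n_atoms:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place atoms; enlarge box or lower d_min")
        trial = tuple(rng.uniform(0.0, box) for _ in range(3))
        # Reject trial positions that overlap an already placed atom.
        if all(math.dist(trial, p) >= d_min for p in positions):
            positions.append(trial)
    return symbols, positions


rng = random.Random(0)  # fixed seed so the sketch is reproducible
symbols, positions = random_structure(rng)
print(len(symbols), "atoms:", "".join(symbols))
```

In a real workflow each such structure would be passed to a DFT code for relaxation, and the frames along that relaxation would be collected as training data.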

Building an EMLP requires only knowledge of the involved elements, with no domain expertise necessary. Figure 1 outlines the EMLP training workflow, designed to efficiently map and adapt to various local atomic environments within the chemical space (see also Supplementary Figs. 1-5). To verify that our approach achieves the intended diversity and generality, it is crucial to quantitatively assess the generated datasets. We first analysed the diversity and completeness of our dataset compared to traditional methods. We created a benchmark MLP dataset using conventional MD techniques, selecting initial structures from the M3GNet database that contained different combinations of the elements silver, palladium, carbon, hydrogen and oxygen. Figure 2a,b and Supplementary Fig. 7 visualize these high-dimensional datasets in terms of atomic energy and local atomic environments, comparing the datasets generated by our REICO (112,732 structures) and by traditional MD (104,104 structures). The REICO dataset not only encompasses most of the MD dataset but also covers a significantly larger region of chemical space than MD sampling does. The atomic interactions within the MD-sampled structures are similar to one another, especially for metal elements, where the periodic and symmetric nature of stable metal structures from public databases prevails. This can ultimately lead to overfitting, as the model becomes overly specialized in recognizing these familiar patterns and fails to generalize well to new, unseen systems.

The EMLP is designed for broad applicability across diverse chemical systems, extending beyond the specific configurations on which it was trained. To demonstrate its range, we conducted a series of benchmarks.

First, we started with basic atomic interactions. As shown in Fig. 2c, we evaluated the performance of the EMLP by comparing the interactions between all dimer combinations of silver, palladium, carbon, hydrogen and oxygen (see Supplementary Fig. 8 for the rest). These dimers, which are not specifically present in our training dataset, test the EMLP across a range of bond lengths. Our EMLP predictions align closely with DFT calculations, effectively capturing the repulsion at shorter interatomic distances and accurately depicting the decreasing force as the atoms move apart, something that other models fail to achieve. These characteristics are crucial to allow the EMLP to perform robust and accurate MD simulations without experiencing unphysical holes (see below). In comparison, M3GNet fails to predict the forces in these dimer systems; EquiformerV2's predictions deviate substantially from DFT; and although MACE-mp performs better than the other two and approaches EMLP-like performance, it still shows an average error of around 10% relative to DFT.
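A dimer test of this kind can be sketched in a few lines of Python. The sketch below scans a bond length, obtains the radial force as -dE/dr by central finite differences and flags a "hole", taken here to mean a non-repulsive force at short range. The Morse potential and the deliberately broken `holed` variant are toy stand-ins with placeholder parameters; they do not represent the EMLP, DFT or any of the benchmarked models, and all function names are hypothetical.

```python
import math


def morse(r, d_e=1.0, a=1.5, r_e=2.0):
    """Toy Morse pair energy (eV); placeholder parameters, not fitted to any dimer."""
    return d_e * (1.0 - math.exp(-a * (r - r_e))) ** 2 - d_e


def holed(r):
    """Morse curve with an artificial Gaussian well near 1.2 A, mimicking a hole."""
    return morse(r) - 5.0 * math.exp(-(((r - 1.2) / 0.2) ** 2))


def force(energy, r, h=1e-4):
    """Radial force -dE/dr by central finite difference; positive means repulsive."""
    return -(energy(r + h) - energy(r - h)) / (2.0 * h)


def scan_dimer(energy, r_start=1.0, r_stop=6.0, n_points=51):
    """Return a list of (r, E, F) tuples on an evenly spaced bond-length grid."""
    step = (r_stop - r_start) / (n_points - 1)
    grid = [r_start + i * step for i in range(n_points)]
    return [(r, energy(r), force(energy, r)) for r in grid]


def has_short_range_hole(curve, r_cut):
    """True if any scanned point with r < r_cut feels a non-repulsive force."""
    return any(f <= 0.0 for r, _, f in curve if r < r_cut)


good_curve = scan_dimer(morse)
bad_curve = scan_dimer(holed)
print("Morse has hole:", has_short_range_hole(good_curve, r_cut=1.8))
print("Holed model has hole:", has_short_range_hole(bad_curve, r_cut=1.8))
```

In practice the `energy` callable would wrap a single-point calculation from the potential under test, and the same scan would be repeated with DFT for comparison.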

Next, we wanted to assess whether the predicted forces yield physically meaningful dynamics. We performed MD simulations on free clusters of AgO and PdO at 700 K for 500 ps (15 ps for DFT). The snapshots are shown in Fig. 3a. Notably, only the EMLP's trajectories align well with the DFT reference, while the other models show unphysical holes (oxygen atoms form O₂ and leave the cluster), causing deformations. Furthermore, to investigate the EMLP's ability to capture subtle energetic changes in finite systems, a critical requirement for studying catalyst behaviour under reaction conditions, we performed MD simulations on three distinct cluster models: a pure silver cluster, a pure palladium cluster and a bimetallic Pd/Ag cluster. As shown in Supplementary Fig. 9, the EMLP consistently reproduces the DFT energy trends across the snapshots, demonstrating its ability to handle subtle structural rearrangements and energetic changes. In contrast, the other models show notable deviations for one or more of the tested cluster systems.

Finally, we turned to large and complex adsorbates. Many reaction systems are simply too hard to sample using system-dependent sampling methods, especially for large reactants such as perylenetetracarboxylic dianhydride (PTCDA), which have more than 100 atoms. Previous studies, including experimental work by Hauschild et al. and DFT analysis by Ruiz et al., established consistent configurations of PTCDA adsorbed on the Ag(111) surface. However, building an MLP that provides accurate results for such a system often involves complex and computationally intensive sampling. To demonstrate the robust learning capabilities of our EMLP, whose training data consist mostly of structures with 5-30 atoms, we addressed the challenge of the PTCDA system containing 364 atoms. When relaxing PTCDA on Ag(111), shown in Fig. 3b, EMLP-produced geometries align closely with DFT results, accurately capturing both the shape of the adsorbed molecule and its vertical distance to the surface. In contrast, EquiformerV2 yields a nearly flat PTCDA structure but places it at an incorrect distance from the surface. M3GNet and MACE-mp generate structures with noticeable distortions (for example, bending or arching of PTCDA), suggesting that these models struggle with complex, flexible adsorbates.

In this section, we focus on three particularly challenging aspects of reactions in heterogeneous catalysis: complex catalyst surfaces, surface coverage and extended reaction networks. To demonstrate our EMLP's advantages for describing reactivity, we selected several classic reaction systems, including CO oxidation, acetylene hydrogenation, the Fischer-Tropsch process and ethylene epoxidation (Supplementary Fig. 10). We also evaluated M3GNet, EquiformerV2 and MACE-mp, paying special attention to transition state searches.

CO oxidation, one of the most widely used model systems in the study of heterogeneous catalysis, has greatly contributed to advances in fundamental catalytic research. Recent studies have highlighted CO-induced, reaction-driven metal-metal bond breaking in metal catalytic surfaces even under relatively mild conditions. These findings emphasize the need for theoretical predictions of potential surface reconstructions under reactive conditions, which would enhance our understanding of active sites on metal catalysts and guide future experimental efforts. To test whether our EMLP can recognize different surface facets, we built three kinds of surfaces to test the reactivity of the models, namely, simple surfaces (Pd(100), Pd(110), Pd(111)), special surfaces (twinned boundary palladium and nanorod structure palladium), and various palladium-on-Ag(111) surfaces (Pd/Ag(111) single-atom catalyst, palladium adatom on Ag(111), palladium clusters on Ag(111) and palladium-line/Ag(111)). As shown in Fig. 4, the EMLP consistently delivers the most accurate reaction barriers and enthalpy changes. In contrast, MACE-mp and M3GNet often show significant deviations, especially near transition states. EquiformerV2 can occasionally match the EMLP's accuracy for barriers on symmetric and bimetallic surfaces but struggles with enthalpy predictions.

Simulating surface coverage effects, a critical factor in surface reaction kinetics, using DFT approaches is notably time consuming. The iterative method, currently considered state-of-the-art for calculating reaction kinetics, demands accurate knowledge of surface coverage effects, necessitating detailed configurations of all surface species interactions. To validate the reliability of our EMLP for such a system, we calculated the reaction pathways for acetylene hydrogenation on both clean Pd(111) and hydrogen-covered Pd(111) surfaces. As shown in Fig. 5, the energy profiles calculated by the EMLP for both coverage-independent and coverage-dependent reaction systems align closely with those calculated by DFT. Additionally, we tested the EMLP on metal alloys, particularly PdAg alloys, which are also highly regarded for acetylene hydrogenation, following a previous work by Li et al. They found that the Pd/PdAg(111) surface with a hydrogen coverage of 1 ML (ML, monolayer) and the PdAg/PdAg(111) surface with a hydrogen coverage of 0.25 ML exhibited the lowest reaction energy barriers. Using the EMLP, we reproduced their DFT results for these surfaces, while the other models struggle to match this performance. EquiformerV2 and MACE-mp are better than M3GNet in terms of energy predictions. This demonstrates that the EMLP is not only capable of effectively capturing complex interspecies interactions on catalyst surfaces but also provides a robust platform for predicting reaction kinetics across various catalytic systems.

The Fischer-Tropsch process, one of the most important and complex reaction systems in heterogeneous catalysis, involves the elementary steps of chain initiation, chain growth and chain termination. The reaction system includes a large number of possible elementary steps and reaction intermediates. Here, we have selected a reaction pathway from C1 to C6. Although this process normally makes use of iron-based catalysts, for benchmarking purposes we use Pd(100) as the catalyst in this case. The list of the elementary steps considered is shown in Supplementary Table 8. In Fig. 6a,c, we compare the entire energy profile predicted by our EMLP with the energy profile computed using DFT. The close agreement between the EMLP and DFT curves across numerous steps and complex intermediates highlights the generality and accuracy of the EMLP in capturing the energetics of intricate catalytic reaction networks. As for the success rate of finding transition states, Fig. 6b,d shows that the EMLP and MACE-mp found the transition state in every step of the reaction pathway, whereas EquiformerV2 failed to find the transition states in three reactions: R7, R12 and R21.

We have summarized the performance of our EMLP alongside two of the stronger foundation models, MACE-mp and EquiformerV2, across a range of heterogeneous catalysis reactions. In addition to evaluating energy predictions, we rigorously assessed the validity of the transition state (TS) structures predicted by each model through frequency calculations and structural similarity analysis. The average values for each system are listed in Table 1; details are shown in Supplementary Fig. 11 and Supplementary Tables 4-7 and 9. The EMLP achieves ~0.1 eV mean absolute error (MAE) for both Eₐ and ΔE across all the systems while maintaining a tight worst-case deviation (maximum deviation (MaxDev) ≤ 0.34 eV). In contrast, MACE-mp exhibits MAEs up to 0.59 eV and large outliers (MaxDev > 1.5 eV), and EquiformerV2 shows intermediate performance (MAE = 0.07-0.36 eV, MaxDev > 1.4 eV). Equally important, the EMLP achieves 100% transition-state localization success and ≥96% geometric fidelity in every case, whereas MACE-mp and EquiformerV2 struggle. Taking the Fischer-Tropsch example, Supplementary Fig. 11 shows that the EMLP is the best in both energy error and transition-state structure similarity and, more importantly, that it maintains a high degree of consistency in such a complex catalytic reaction network; that is, there is a certain correspondence between its transition-state energy error and its structure error. In contrast, both MACE-mp and EquiformerV2 are unable to maintain this consistency: either the structures are very similar to the DFT structures but the energies differ by >0.5 eV, such as TS6, TS11 and TS17 for MACE-mp, or the energies are accurate but the structures differ by >20%, such as TS13, TS18 and TS19. That is, even where some of the energies are correct, this reflects coincidental error cancellation rather than the generality of the model itself, which is also consistent with our previous conclusions.
The weaker performance of MACE-mp can be attributed to the fact that catalytic activity is often surface and interface dependent, which is simply not captured in bulk datasets, thus imposing a hard ceiling on predictive performance. It is also interesting that EquiformerV2, which is trained on a dataset focused on surfaces with adsorbed species, failed most test cases. This might suggest that slab models are indeed not ideal training data, as they contain too many similar atomic arrangements, limiting the model's ability to generalize.

To fully demonstrate the capability and generality of our EMLP, we extended its application beyond heterogeneous catalysis and tested a broader range of systems, from organic chemistry to surface dynamics. We were pleasantly surprised by how well the EMLP performed across all test cases, showing its versatility and potential across diverse domains.

We have adapted the standard test reactions given by Baker and Chan, which cover a range of different reaction types and have long been used as a benchmark for transition-state searching. Nine out of the 25 reactions were chosen, as our EMLP only covers carbon, hydrogen and oxygen, as shown in Fig. 7a. For simpler or smaller molecular systems (for example, HCCH ↔ CCH₂, H₂CO ↔ H₂ + CO), Eₐ and ΔH values from the EMLP differ from DFT calculations by only around 0.1-0.2 eV. For more complex reactions, such as the cyclization of butadiene + ethylene → cyclohexene, the predicted activation barrier (0.67 eV) from the EMLP overshoots the DFT value (0.46 eV) by about 0.2 eV, which remains a modest deviation. The ΔE predictions match even more closely, indicating that the EMLP maintains good fidelity for moderately complicated transformations. In the case of CH₃CH₃ → CH₂CH₂ + H₂, which exhibits a higher activation barrier, DFT gives an Eₐ of 4.70 eV, while the EMLP predicts 4.23 eV, an absolute difference of ∼0.5 eV. However, this is still on the order of a 10% relative error, which can be considered reasonable for high-energy transition states. For reactions sensitive to conformational shifts, such as trans-butadiene ↔ cis-butadiene isomerization or vinyl alcohol ↔ acetaldehyde, the EMLP also provides Eₐ and ΔH values that closely align with DFT. This implies that the EMLP can effectively capture nuanced changes in molecular geometry and their corresponding energy variations.
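The error figures quoted in this paragraph can be checked with a short Python calculation. The two barrier pairs below are the values stated in the text; the helper names are hypothetical, and the MAE and MaxDev printed at the end cover only these two quoted reactions, so they are not the values reported in Table 1.

```python
def abs_error(pred, ref):
    """Absolute deviation of a prediction from its reference (eV)."""
    return abs(pred - ref)


def rel_error(pred, ref):
    """Absolute deviation as a fraction of the reference value."""
    return abs(pred - ref) / abs(ref)


# (label, DFT barrier in eV, EMLP barrier in eV), as quoted in the text
barriers = [
    ("butadiene + ethylene -> cyclohexene", 0.46, 0.67),
    ("high-barrier case (4.70 eV by DFT)", 4.70, 4.23),
]

errors = [abs_error(emlp, dft) for _, dft, emlp in barriers]
mae = sum(errors) / len(errors)   # mean absolute error over the two cases
max_dev = max(errors)             # worst-case deviation over the two cases

for (label, dft, emlp), err in zip(barriers, errors):
    print(f"{label}: |error| = {err:.2f} eV, relative = {rel_error(emlp, dft):.0%}")
print(f"MAE = {mae:.2f} eV, MaxDev = {max_dev:.2f} eV")
```

The 0.47 eV deviation on the 4.70 eV barrier corresponds to the roughly 10% relative error mentioned above, whereas the 0.21 eV deviation on the 0.46 eV barrier is small in absolute terms but large in relative terms, which is why absolute errors are the more informative measure for low barriers.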

Predicting the free energy accurately with MLPs, especially in complex reaction environments such as water, presents significant challenges. Luo et al. showed that their system-specific MLP can be used for enhanced MD simulations, namely, umbrella sampling, to compute free-energy profiles for the reaction of CO with atomic oxygen on the Pt(111) surface at both solid-gas and solid-liquid interfaces. We applied the EMLP to replicate and extend these findings to the Pd(111) surface, by performing enhanced MD simulations to obtain free-energy profiles for CO oxidation on Pd(111) under similar environmental conditions, including challenging aqueous environments. Figure 7b shows the structures and the free-energy profile versus the reaction coordinates for CO* + O* in the presence of water molecules, while Fig. 7c illustrates the CO* + O* reaction in the absence of water molecules at 300 K from umbrella sampling simulations. The free-energy barrier computed by the EMLP closely matches the energy barrier calculated by DFT, and the overall shape of the free-energy curve is almost identical, indicating that our general-purpose EMLP was able to capture the relevant chemistry as well as the system-specific MLP.

We also examined H₂ adsorption and dissociation dynamics on the Ag(100) surface. To map the potential energy surface (PES) of this system and enable a direct comparison with our EMLP predictions, we conducted 854 single-point DFT and EMLP calculations at various combinations of vertical distance (z) and lateral spacing (r), while keeping the molecular centre and orientation fixed at the hollow site. Supplementary Note 11 shows structures from the dynamics simulations of H₂ adsorption and dissociation on Ag(100). The resulting two-dimensional contour plots (Fig. 7d) reveal that the EMLP-derived PES closely mirrors the DFT reference. The smooth progression of energy contours indicates that the EMLP correctly captures the interplay between vertical adsorption height and lateral site variation. The overall similarity in contour shapes, spacing and energy gradients demonstrates that the EMLP provides a near-DFT-level representation of the potential energy surface.
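A two-dimensional scan of this kind can be set up as in the following Python sketch. It builds the coordinates of a two-atom molecule for each (z, r) pair with the molecular centre fixed above one site, and compares two energy grids by their mean absolute difference. The bond axis lying parallel to the surface along x, the grid ranges, the grid size and all function names are assumptions for illustration; the source does not state the orientation or the grid actually used for the 854 points.

```python
import math


def dimer_positions(z, r, site=(0.0, 0.0), surface_z=0.0):
    """Two atoms centred above `site` at height z, separated by r.

    Assumes the bond axis is parallel to the surface and points along x.
    """
    x0, y0 = site
    height = surface_z + z
    return [(x0 - r / 2.0, y0, height), (x0 + r / 2.0, y0, height)]


def linspace(start, stop, n):
    """Evenly spaced values from start to stop inclusive (n >= 2)."""
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


def scan_grid(z_values, r_values):
    """All (z, r, positions) combinations for single-point calculations."""
    return [(z, r, dimer_positions(z, r)) for z in z_values for r in r_values]


def grid_mae(e_ref, e_model):
    """Mean absolute difference between two energy lists on the same grid."""
    return sum(abs(a - b) for a, b in zip(e_ref, e_model)) / len(e_ref)


# Small illustrative grid: 4 heights x 5 separations (placeholder ranges, in angstrom).
grid = scan_grid(linspace(0.5, 3.0, 4), linspace(0.6, 2.2, 5))
z, r, (atom_a, atom_b) = grid[7]
print(len(grid), "points; example separation:", round(math.dist(atom_a, atom_b), 3))
```

Each entry of `grid` would be evaluated once with DFT and once with the potential under test, and the two resulting energy lists could then be passed to `grid_mae` or drawn as contour plots.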

We also tested our EMLP's performance for liquid-phase simulations (Supplementary Fig. 13). Figure 7e presents six radial distribution function (RDF) plots, O-O, O-H, H-H, C-C, C-H and C-O, illustrating how closely the EMLP matches the DFT reference. Overall, the EMLP successfully replicates the peak positions and approximate intensities, demonstrating its ability to capture the key structural features of the potential energy surface of methanol. Some discrepancies in the C-C or C-H RDFs may result from limited sampling of these interactions because the EMLP is primarily designed for heterogeneous catalysis. The RDF results of liquid methanol confirm that the EMLP-based MD simulation successfully reproduces the well-defined hydrogen-bonding network (indicated by the O-O and O-H peaks), the arrangement of methyl groups captured by the C-C peaks and the intermediate ordering reflected in the H-H, C-H and C-O correlations.
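For readers unfamiliar with how such curves are obtained, the following Python sketch computes a pair RDF for a single frame in a cubic periodic box using the minimum-image convention. It is a generic textbook construction, not the analysis code used for Fig. 7e; the cubic box, the function name `rdf` and its parameters are assumptions, and a production analysis would average over many MD frames.

```python
import math


def rdf(pos_a, pos_b, box, r_max, n_bins, same_species=False):
    """Return (bin centres, g(r)) between species A and B for one frame.

    Assumes a cubic periodic box of edge `box` and r_max <= box / 2.
    Set same_species=True when pos_a and pos_b are the same list.
    """
    if r_max > box / 2.0:
        raise ValueError("r_max must not exceed half the box length")
    dr = r_max / n_bins
    counts = [0] * n_bins
    for i, a in enumerate(pos_a):
        for j, b in enumerate(pos_b):
            if same_species and i == j:
                continue  # skip self-pairs
            d2 = 0.0
            for k in range(3):
                d = a[k] - b[k]
                d -= box * round(d / box)  # minimum-image convention
                d2 += d * d
            dist = math.sqrt(d2)
            if dist < r_max:
                counts[min(int(dist / dr), n_bins - 1)] += 1
    n_partners = len(pos_b) - 1 if same_species else len(pos_b)
    n_pairs = len(pos_a) * n_partners
    volume = box ** 3
    g = []
    for k, c in enumerate(counts):
        # Normalize by the pair count expected in this shell for an ideal gas.
        shell = 4.0 / 3.0 * math.pi * (((k + 1) * dr) ** 3 - (k * dr) ** 3)
        g.append(c * volume / (n_pairs * shell))
    centres = [(k + 0.5) * dr for k in range(n_bins)]
    return centres, g


# Two atoms that are neighbours only across the periodic boundary (1.0 apart).
atoms = [(0.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
centres, g = rdf(atoms, atoms, box=10.0, r_max=5.0, n_bins=10, same_species=True)
print([round(c, 2) for c, v in zip(centres, g) if v > 0.0])
```

The O-O curve, for example, would be obtained by passing the oxygen positions of each frame as both `pos_a` and `pos_b` with `same_species=True`, and then averaging g(r) over frames.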

We also carried out further validation to demonstrate the EMLP's broad applicability, including lattice constant calculations (Supplementary Table 11), global optimization of Ag-AgO (Supplementary Fig. 14), relaxation of rattled surface structures (Supplementary Fig. 15) and high-temperature MD (Supplementary Fig. 16).
