Thesis information

Chinese title:

 CO2电还原反应催化剂的理论设计与生物催化反应的机器学习预测    

Author name:

 裴书鑫    

Confidentiality level:

 Public

Thesis language:

 chi    

Discipline code:

 070304    

Discipline:

 Physical Chemistry (including Chemical Physics)

Student type:

 Master's

Degree:

 Master of Science

Degree type:

 Academic degree

Degree year:

 2023    

Campus:

 Beijing campus

School:

 College of Chemistry

Research area:

 Theoretical and Computational Chemistry

First supervisor:

 陈雪波    

First supervisor's affiliation:

 College of Chemistry

Submission date:

 2023-06-05    

Defense date:

 2023-05-30    

English title:

 THEORETICAL DESIGN OF CATALYST FOR CO2 ELECTROREDUCTION REACTION AND MACHINE LEARNING PREDICTION OF BIOCATALYTIC REACTIONS    

Chinese keywords:

 CO2电催化还原反应 ; 密度泛函理论计算 ; 酶催化反应 ; 对映选择性 ; 机器学习    

English keywords:

 CO2 electrocatalytic reduction reaction ; Density functional theory calculation ; Enzymatic reaction ; Enantioselectivity ; Machine learning    

Chinese abstract:

催化化学对人类社会的发展和进步有着深远的影响。本文分别利用密度泛函理论和机器学习模型,通过理论计算研究了两个催化体系:电催化CO2还原反应和酶催化反应。

化石能源枯竭带来的能源问题和以CO2为主的温室气体过量排放导致的环境问题是当今社会面临的两个全球性问题。如果能够实现将CO2转化为碳氢化合物则可以同时解决这两个问题,而电化学就是极具前景的方法之一。电催化反应具有反应条件温和、可控性强、环保、应用广泛等特点,其反应性能很大程度上依赖于电催化剂的性能,因此寻找合适的用于将CO2转化为碳氢化合物的电还原催化剂仍是这一研究领域的热点。为尝试解决这一问题,本文设计了一种新型双原子电催化剂,通过密度泛函理论(DFT)计算对其催化CO2还原反应的性能和反应机理进行了理论研究。

手性化合物在医药、化工以及生命科学等领域具有重要的应用和研究价值。生物催化反应具有高效、高催化活性等特点,利用该方法合成具有高对映选择性的手性化合物具有巨大的应用潜力。然而,通过传统的实验方法和量化计算方法来确定生物催化反应的活性和对映选择性效率低下,造成大量人力物力的浪费。因此,亟需新的方法实现对具有高活性和高对映选择性反应的快速预测。本文选择酰胺酶(amidase)和胺转氨酶(ATAs)催化的反应作为研究对象,使用机器学习(ML)方法分别研究了底物结构对反应对映选择性以及酶的氨基酸序列对反应活性和立体选择性的影响,实现了对具有高活性和高对映选择性反应的预测。主要工作如下:

(1)构建负载在石墨烯上的氮杂Cu/Ru双金属电催化剂(CuRuN6C)结构模型,通过进行第一性原理分子动力学(AIMD)模拟并计算结合能评估该结构的稳定性,结果表明在500 K下该催化剂是热力学稳定的。DFT计算结果表明,在该催化剂表面发生CO2还原反应(CO2RR)的路径为*CO2→*COOH→*CO→*CHO→*CHOH→*CH2OH→*CH2→*CH3→*CH4,最终生成CH4。这一过程的决速步为*CO→*CHO,热力学能垒为0.57 eV。计算过程中发现中间体*CO对CuRuN6C的吸附太强,因此构建了*CO共吸附结构模型。在该结构上发生CO2RR的反应路径改变为*CO2→*OCHO→*HCOOH→*CHO→*CH2O→*CH3O→*CH3OH→*CH3→*CH4,决速步为*OCHO→*HCOOH,热力学能垒降低至0.39 eV。这表明,*CO的共吸附削弱了CO2RR过程中各中间体在活性位点的结合。

(2)根据实验合作者提供的实验数据构建了包含240个酰胺酶催化反应的数据集,选择基于分子片段的独热编码和基于加权对称函数(wACSF)的直方图分布作为描述符,分别反映底物分子所包含的“化学”信息和“几何”信息。以ee = 80%、90%、95%为分类标准,训练了3个随机森林(RF)分类器用于预测反应的对映选择性范围,最终模型的F得分分别为0.862、0.862和0.711。对特征重要性进行分析,发现底物分子包含的一些原子类型周围的局域结构对于对映选择性可能有更加重要的影响。与实验合作者对14种新的底物开展“背靠背”的双盲验证,结果显示,其中11种底物的ML模型预测结果能够与实验测量结果吻合,这说明ML模型的可靠性。

(3)以实验合作者提供的转氨酶3FCR实验数据构建数据集,设计了一种改进的独热编码作为描述符用于描述ATAs催化反应中底物与关键氨基酸残基的空间效应和电子效应。随后,训练出可以预测反应催化活性和立体选择性的梯度提升回归树(GBRT)模型。实验合作者将该模型用于优化突变体的设计,得到的最佳突变体催化活性相比于之前提高了3倍。将合作者提供的新数据添加到数据集重新训练模型,实现了GBRT模型的迭代更新,最终模型测试集R2达到0.905。此外,通过在数据集中添加少量转氨酶3HMU数据重新训练该模型,可以较好地预测3HMU突变体的催化活性。

English abstract:

Catalytic chemistry has a profound impact on the development and progress of human society. In this thesis, two catalytic systems, the electrocatalytic CO2 reduction reaction and enzyme-catalyzed reactions, were studied computationally using density functional theory and machine learning models, respectively.

The energy problem caused by the depletion of fossil fuels and the environmental problem caused by excessive emissions of greenhouse gases, chiefly CO2, are two global issues facing society today. Converting CO2 into hydrocarbons could address both problems at once, and electrochemistry is one of the most promising routes. Electrocatalytic reactions offer mild reaction conditions, good controllability, environmental friendliness, and wide applicability, and their performance depends largely on the electrocatalyst. Finding suitable electroreduction catalysts for converting CO2 into hydrocarbons therefore remains a hot topic in this field. To address this problem, a novel dual-atom electrocatalyst was designed, and its performance and reaction mechanism in the CO2 reduction reaction were studied theoretically using density functional theory (DFT) calculations.

Chiral compounds have important applications and research value in medicine, the chemical industry, and the life sciences. Biocatalytic reactions are characterized by high efficiency and high catalytic activity, and their use in synthesizing chiral compounds with high enantioselectivity has great application potential. However, determining the activity and enantioselectivity of biocatalytic reactions by traditional experimental and quantum chemical methods is inefficient and wastes considerable manpower and material resources. New methods are therefore urgently needed for the rapid prediction of reactions with high activity and high enantioselectivity. In this thesis, reactions catalyzed by amidase and amine transaminases (ATAs) were selected as the research objects. Machine learning (ML) methods were used to study the effect of substrate structure on enantioselectivity and the effect of the enzyme's amino acid sequence on activity and stereoselectivity, respectively, enabling the prediction of reactions with high activity and high enantioselectivity. The main work is as follows:

(1) A structural model of a nitrogen-doped Cu/Ru dual-metal electrocatalyst supported on graphene (CuRuN6C) was constructed, and its stability was evaluated by ab initio molecular dynamics (AIMD) simulations and binding energy calculations. The results showed that the catalyst is thermodynamically stable at 500 K. DFT calculations showed that the CO2 reduction reaction (CO2RR) on the surface of CuRuN6C follows the pathway *CO2→*COOH→*CO→*CHO→*CHOH→*CH2OH→*CH2→*CH3→*CH4, finally yielding CH4. The rate-determining step of this process is *CO→*CHO, with a thermodynamic barrier of 0.57 eV. During the calculations, the intermediate *CO was found to adsorb too strongly on CuRuN6C, so a *CO co-adsorption structural model was constructed. On this structure, the CO2RR pathway changes to *CO2→*OCHO→*HCOOH→*CHO→*CH2O→*CH3O→*CH3OH→*CH3→*CH4, the rate-determining step becomes *OCHO→*HCOOH, and the thermodynamic barrier decreases to 0.39 eV. This indicates that co-adsorbed *CO weakens the binding of the intermediates at the active site during CO2RR.
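
As a minimal illustration of how a thermodynamic barrier of this kind can be read off a free-energy profile, the sketch below finds the elementary step with the largest free-energy increase along a pathway. The function name `limiting_step` and the free-energy values are placeholders chosen for illustration; they are not data from the thesis, and the thesis may define its barrier differently.

```python
from typing import List, Tuple


def limiting_step(intermediates: List[str],
                  free_energies: List[float]) -> Tuple[str, float]:
    """Return the step with the largest free-energy increase (in eV)."""
    if len(intermediates) != len(free_energies) or len(intermediates) < 2:
        raise ValueError("need matching lists with at least two intermediates")
    best_label, best_dg = "", float("-inf")
    for i in range(len(intermediates) - 1):
        # Free-energy change of the step from intermediate i to i + 1.
        dg = free_energies[i + 1] - free_energies[i]
        if dg > best_dg:
            best_label = f"{intermediates[i]}->{intermediates[i + 1]}"
            best_dg = dg
    return best_label, best_dg


# Placeholder free energies in eV (hypothetical, NOT values from the thesis).
path = ["*CO2", "*COOH", "*CO", "*CHO", "*CHOH"]
g = [0.00, 0.30, -0.40, 0.10, -0.20]

step, barrier = limiting_step(path, g)
print(step, round(barrier, 2))
```

With real data, the list would hold the computed free energy of every intermediate on the pathway, and the returned step would correspond to what the abstract calls the rate-determining step.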

(2) A dataset of 240 amidase-catalyzed reactions was constructed from experimental data provided by experimental collaborators. One-hot encoding based on molecular fragments and histograms of weighted atom-centered symmetry functions (wACSF) were selected as descriptors to represent the "chemical" and "geometrical" information contained in the substrate molecules, respectively. Using ee = 80%, 90%, and 95% as classification thresholds, three random forest (RF) classifiers were trained to predict the enantioselectivity range of a reaction. The F-scores of the final models were 0.862, 0.862, and 0.711, respectively. Feature importance analysis suggested that the local structure around certain atom types in the substrate molecules may have a more important influence on enantioselectivity. In a "back-to-back" double-blind validation on 14 new substrates carried out with the experimental collaborators, the ML predictions for 11 substrates agreed with the experimental measurements, supporting the reliability of the ML models.
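
The pieces described in (2) can be sketched in plain Python: a fragment-based one-hot vector, binary labels at the three ee thresholds, and an F-score. This is only an illustration under assumptions: the fragment vocabulary `FRAGMENTS` is hypothetical, "at or above the threshold" is assumed as the positive class, and the F-score is assumed to be the usual F1. The wACSF histograms and the random forest itself are not reproduced here.

```python
from typing import Dict, List, Sequence

# Hypothetical fragment vocabulary; the thesis's actual fragments are not given.
FRAGMENTS = ["phenyl", "methyl", "halogen", "hydroxyl"]


def one_hot_fragments(present: Sequence[str],
                      vocab: Sequence[str] = FRAGMENTS) -> List[int]:
    """Mark which vocabulary fragments occur in a substrate."""
    found = set(present)
    return [1 if frag in found else 0 for frag in vocab]


def ee_labels(ee_percent: float,
              thresholds: Sequence[float] = (80.0, 90.0, 95.0)) -> Dict[float, int]:
    """One binary label per threshold (assumed rule: 1 if ee >= threshold)."""
    return {t: int(ee_percent >= t) for t in thresholds}


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


features = one_hot_fragments(["phenyl", "halogen"])
labels = ee_labels(92.0)
score = f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(features, labels, round(score, 3))
```

Each threshold yields its own label column, which matches the abstract's description of three separate classifiers rather than one multi-class model.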

(3) A dataset was constructed from experimental data on the transaminase 3FCR provided by the experimental collaborators. A modified one-hot encoding was designed as a descriptor to represent the steric and electronic effects between substrates and key amino acid residues in ATA-catalyzed reactions. A gradient boosting regression tree (GBRT) model was then trained to predict both catalytic activity and stereoselectivity. The experimental collaborators used this model to optimize the design of mutants, and the best mutant obtained showed a 3-fold increase in catalytic activity over the earlier value. New data provided by the collaborators were added to the dataset and the model was retrained, so that the GBRT model was updated iteratively; the test-set R² of the final model reached 0.905. In addition, after a small amount of data on the transaminase 3HMU was added to the dataset and the model was retrained, the catalytic activity of 3HMU mutants could be predicted well.
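
To make the descriptor and the reported metric in (3) concrete, the sketch below shows a plain one-hot encoding of the residues at a few chosen positions and the coefficient of determination R². It is an assumption-laden illustration: the positions 10 and 20 are hypothetical, the encoding is the standard one-hot scheme rather than the thesis's modified version (whose details are not given in the abstract), and the GBRT model is not implemented.

```python
from typing import Dict, List, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def encode_variant(residues: Dict[int, str],
                   positions: Sequence[int]) -> List[int]:
    """Concatenate a 20-element one-hot vector for each listed position."""
    vector: List[int] = []
    for pos in positions:
        aa = residues[pos]
        if aa not in AMINO_ACIDS:
            raise ValueError(f"unknown residue {aa!r} at position {pos}")
        vector.extend(1 if aa == code else 0 for code in AMINO_ACIDS)
    return vector


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical key positions and residues, for illustration only.
positions = [10, 20]
x = encode_variant({10: "W", 20: "F"}, positions)
r2 = r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(len(x), sum(x), r2)
```

Vectors of this kind would serve as inputs to a regressor such as GBRT, and retraining on an enlarged dataset corresponds to the iterative update the abstract describes.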

Total number of references:

 145    

Call number:

 硕070304/23001    

Release date:

 2024-06-04    
