AI model enables reliable and accurate protein-ligand complex prediction

Understanding protein–ligand interactions is fundamental to molecular biology and biochemistry. These interactions are at the heart of many cellular processes, from enzyme catalysis to signal transduction. The foundational knowledge of protein–ligand interactions has paved the way for structure-based drug design (SBDD), a pivotal area in pharmaceutical research.
In a study in Nature Methods, a research team led by Prof. Zheng Mingyue from the Shanghai Institute of Materia Medica (SIMM) of the Chinese Academy of Sciences introduced a new deep learning method, SurfDock, which demonstrates good performance in both retrospective and prospective virtual screening tasks.
SurfDock is a geometric diffusion network designed to generate reliable and accurate binding ligand poses. The diffusion process is conditioned on a protein pocket and a random starting ligand conformation. SurfDock incorporates an internal scoring module, SurfScore, which is trained on crystal protein–ligand complexes to estimate pose confidence.
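The sample-then-rank workflow described above can be illustrated with a toy sketch. It is not SurfDock's code: `random_pose`, `denoise_step`, `confidence` and `sample_and_rank` are hypothetical names. The refinement loop stands in for the learned, pocket-conditioned reverse diffusion, and the distance-based score stands in for SurfScore.

```python
import math
import random


def random_pose(n_atoms, rng, spread=5.0):
    """Random starting ligand coordinates (toy stand-in for a random conformation)."""
    return [tuple(rng.uniform(-spread, spread) for _ in range(3))
            for _ in range(n_atoms)]


def denoise_step(pose, pocket_center, step):
    """Hypothetical stand-in for one learned, pocket-conditioned update.

    Each atom moves a fraction `step` of the way toward the pocket center.
    A real model would predict the update with a neural network.
    """
    return [tuple(c + step * (p - c) for c, p in zip(atom, pocket_center))
            for atom in pose]


def confidence(pose, pocket_center):
    """Hypothetical stand-in for a learned pose scorer (higher is better)."""
    mean_dist = sum(math.dist(atom, pocket_center) for atom in pose) / len(pose)
    return -mean_dist


def sample_and_rank(pocket_center, n_atoms=4, n_samples=8, n_steps=10, seed=0):
    """Generate several poses from random starts and sort them by confidence."""
    rng = random.Random(seed)
    poses = []
    for _ in range(n_samples):
        pose = random_pose(n_atoms, rng)
        for _ in range(n_steps):
            pose = denoise_step(pose, pocket_center, step=0.2)
        poses.append(pose)
    return sorted(poses, key=lambda p: confidence(p, pocket_center), reverse=True)


ranked = sample_and_rank(pocket_center=(1.0, 2.0, 3.0))
best_pose = ranked[0]  # pose with the highest toy confidence
```

The point of the sketch is the control flow only: many candidate poses are generated from different random starts, and a separate scoring function decides which one to report.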
In addition, SurfDock integrates multimodal protein information, such as surface features, residue structure features and pre-trained sequence-level features, into a surface-node-level representation.
SurfDock achieved top docking success rates across several benchmarks and outperformed existing deep learning methods in the plausibility of its generated poses. An optional force-field-based relaxation step, which optimizes the ligand while keeping the protein fixed, further improved the accuracy and validity of the poses.
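For readers unfamiliar with the metric, a docking success rate is commonly computed as the fraction of complexes whose predicted ligand pose lies within a root-mean-square deviation (RMSD) cutoff of the crystal pose. The sketch below assumes the widely used 2 Å cutoff, which the article does not state, and a plain atom-by-atom RMSD with no symmetry correction. The function names `rmsd` and `success_rate` are illustrative.

```python
import math


def rmsd(pred, ref):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum((a - b) ** 2
                  for p_atom, r_atom in zip(pred, ref)
                  for a, b in zip(p_atom, r_atom))
    return math.sqrt(squared / len(pred))


def success_rate(pairs, threshold=2.0):
    """Fraction of (predicted, reference) pose pairs with RMSD <= threshold.

    The 2.0 angstrom default is an assumed convention, not a value from the article.
    """
    if not pairs:
        raise ValueError("need at least one pose pair")
    hits = sum(1 for pred, ref in pairs if rmsd(pred, ref) <= threshold)
    return hits / len(pairs)


# Toy data: one pose close to its reference, one far from it.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
close_pose = [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)]  # every atom shifted by 0.5
far_pose = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]    # every atom shifted by 3.0

rate = success_rate([(close_pose, reference), (far_pose, reference)])
```

With these toy coordinates, one of the two poses falls under the cutoff, so `rate` is 0.5.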
Moreover, SurfDock exhibited remarkable generalizability to new proteins, pockets and apo structures, while being robust against varying ligand flexibility. In virtual screening scenarios, it matched or even exceeded the performance of existing docking methods.
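Virtual screening performance is often summarized with an enrichment factor, which compares the share of known actives among the top-ranked compounds with their share in the whole library. The article does not say which metrics the study used, so the sketch below is an illustrative assumption. It also assumes that a higher score means a better-ranked compound, and `enrichment_factor` is a hypothetical helper name.

```python
import math


def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor for the top `fraction` of a score-ranked library.

    scores: one score per compound (assumed: higher means more likely active).
    labels: 1 for a known active, 0 for an inactive or decoy.
    """
    n_total = len(scores)
    n_actives = sum(labels)
    if n_total == 0 or n_total != len(labels) or n_actives == 0:
        raise ValueError("need matching, non-empty inputs with at least one active")
    n_top = max(1, math.ceil(n_total * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_actives = sum(label for _, label in ranked[:n_top])
    return (top_actives / n_top) / (n_actives / n_total)


# Toy library of 10 compounds with 2 actives, both ranked at the top.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
ef_top_20_percent = enrichment_factor(toy_scores, toy_labels, fraction=0.2)
```

A value of 1 means the ranking is no better than random selection, and larger values mean actives are concentrated near the top of the list.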
The researchers demonstrated the practical utility of SurfDock in a real-world small-molecule discovery project targeting aldehyde dehydrogenase 1B1 (ALDH1B1), in which seven hit molecules with novel scaffolds were quickly identified. This performance, combined with its practicality and reliability, makes SurfDock a valuable contribution to pharmaceutical research.
The ability to accurately predict protein–ligand complexes could significantly improve the understanding of protein biology and assist in designing new therapeutic agents. The researchers envision that, with continual improvements, SurfDock will become an essential tool in the SBDD community.
More information: Duanhua Cao et al., SurfDock is a surface-informed diffusion generative model for reliable and accurate protein–ligand complex prediction, Nature Methods (2024).
Journal information: Nature Methods
Provided by Chinese Academy of Sciences