AutoDock Vina 1.2.0 docking calculation (large batch)

Example applications of AutoDock Vina 1.2.0: A) docking of multiple ligands (PDB 5x72); B) docking with water molecules using the hydrated docking protocol of AutoDock4 (PDB 4ykq); C) docking with the AutoDock4Zn force field in the presence of zinc (PDB 1s63); D) docking of flexible macrocycles.

1. AutoDock Vina 1.2.0 Introduction

AutoDock Vina 1.2.0 is an upgraded release of AutoDock Vina, arguably one of the fastest and most widely used open-source programs for molecular docking. Its key features and functionality are:

  • Simultaneous Multiple Ligand Docking
    Vina is now able to dock multiple ligands simultaneously. This capability may have applications in fragment-based drug design, where small molecules that bind the same target can be grown or combined into larger compounds with potentially better affinity.

  • Hydrated Docking
    Hydrated docking explicitly models the water molecules involved in ligand-receptor interactions. The method docks ligands that are explicitly hydrated with spherical water molecules. It can be used to predict the position and role (bridging or displaced) of individual water molecules, and it can generally improve ligand conformation predictions.

  • AutoDock4_Zn
    AutoDock4_Zn is a specialized force field, based on AD4, for docking ligands that coordinate zinc ions. It uses pseudoatoms to describe the optimal tetrahedral coordination geometry of zinc ions complexed in proteins, together with an improved definition of the potentials that describe the coordinating elements in the ligand (nitrogen, oxygen, and sulfur). This method was able to reproduce the docking performance of AD4, and the results showed good overlap between the crystal conformation of the ligand and the optimal zinc coordination geometry.

  • Macrocycle Conformational Sampling
    Docking macrocyclic compounds is challenging because ring flexibility is hard to sample: the torsional changes that lead to different ring conformations are coupled. AD4 has a dedicated protocol that docks macrocycles while modeling their flexibility on the fly. One bond in the ring is broken, which produces an open form of the macrocycle whose torsions can be explored independently. During docking, a linear potential is applied to restore the broken bond, which yields a closed ring. Macrocycle conformations are therefore sampled while the ligand is fitted to the binding pocket, at the cost of a more complex search because of the extra rotatable bonds.

  • Python API
    Taking advantage of the popularity and practicality of the Python language, a Python API was added to AutoDock Vina 1.2.0. In order to produce a Python interface that is as pythonic as possible, the Vina code was refactored into a Python library. Most of its functionality is implemented through direct binding to existing C++ code, or through additional functions to simplify access to the Python environment. The provided Python API has the following features:

    • Create an instance of the AutoDock Vina engine (scoring function selection, CPU cores and random seeds)
    • Read/write one or more PDBQT files
    • Compute Vina affinity maps
    • Read/write Vina affinity maps and read AutoDock affinity maps
    • Randomize the orientation and position of input ligands (randomize_only)
    • Evaluate the energy of the current pose or poses (score_only)
    • Perform local optimization (local_only)
    • Set Monte Carlo global search parameters (exhaustiveness, number of output poses, maximum number of evaluations, etc.)

Therefore, a basic Vina calculation can be configured and executed as follows:

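#!/usr/bin/env python
# Simple example with Vina Python bindings
from vina import Vina

v = Vina()

# Load the rigid receptor and the ligand (both in PDBQT format)
v.set_receptor("protein.pdbqt")
v.set_ligand_from_file("ligand.pdbqt")

# Compute affinity maps for a box centered at the origin, 30 Angstroms per side
v.compute_vina_maps([0., 0., 0.], [30, 30, 30])
v.dock(exhaustiveness=32)

# Write the ranked poses to a single multi-model PDBQT file
v.write_poses("docking_results.pdbqt")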
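The scoring and local optimization operations listed above can be chained before a full docking run. The following is a minimal sketch in the same style as the basic example; the function name, the output file names, and the default values are placeholders chosen for illustration, and the `vina` import is deferred into the function so that the file can be loaded on a machine where the bindings are not installed.

```python
def score_minimize_dock(receptor, ligand, center, box_size,
                        out_prefix="ligand", exhaustiveness=32, n_poses=9):
    """Score the input pose, minimize it locally, then run a global search."""
    from vina import Vina  # deferred import: requires the vina Python package

    v = Vina(sf_name="vina")
    v.set_receptor(receptor)
    v.set_ligand_from_file(ligand)
    v.compute_vina_maps(center=center, box_size=box_size)

    # Energy of the pose exactly as given in the input file (score_only).
    start_energy = v.score()[0]

    # Local optimization of that pose (local_only), saved to its own file.
    minimized_energy = v.optimize()[0]
    v.write_pose(f"{out_prefix}_minimized.pdbqt", overwrite=True)

    # Global Monte Carlo search, then write the ranked poses.
    v.dock(exhaustiveness=exhaustiveness, n_poses=n_poses)
    v.write_poses(f"{out_prefix}_out.pdbqt", n_poses=n_poses, overwrite=True)
    return start_energy, minimized_energy
```

A call such as `score_minimize_dock("protein.pdbqt", "ligand.pdbqt", [0., 0., 0.], [30, 30, 30])` would reuse the inputs of the basic example. Comparing the two returned energies gives a quick indication of how far the input pose is from a local minimum.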

2. AutoDock Vina 1.2.0 installation

2.1 Download

Download the build that matches your hardware and operating system from the official download website.

2.2 Linux installation

Virtual screening involves large-scale or ultra-large-scale docking runs, so a Linux server is the most suitable place to install Vina. The steps below cover installation, testing, and setting environment variables. The commands use the archive name autodock_vina_1_1_2_linux_x86.tgz; substitute the name of the file you actually downloaded.
1. Installation

tar xzvf autodock_vina_1_1_2_linux_x86.tgz

2. Test

./autodock_vina_1_1_2_linux_x86/bin/vina --help

The help information is output as follows:

Input:
  --receptor arg        rigid part of the receptor (PDBQT)
  --flex arg            flexible side chains, if any (PDBQT)
  --ligand arg          ligand (PDBQT)

Search space (required):
  --center_x arg        X coordinate of the center
  --center_y arg        Y coordinate of the center
  --center_z arg        Z coordinate of the center
  --size_x arg          size in the X dimension (Angstroms)
  --size_y arg          size in the Y dimension (Angstroms)
  --size_z arg          size in the Z dimension (Angstroms)

Output (optional):
  --out arg             output models (PDBQT), the default is chosen based on the ligand file name
  --log arg             optionally, write log file

Misc (optional):
  --cpu arg                 the number of CPUs to use (the default is to try to detect the number of CPUs or, failing that, use 1)
  --seed arg                explicit random seed
  --exhaustiveness arg (=8) exhaustiveness of the global search (roughly proportional to time): 1+
  --num_modes arg (=9)      maximum number of binding modes to generate
  --energy_range arg (=3)   maximum energy difference between the best binding mode and the worst one displayed (kcal/mol)

Configuration file (optional):
  --config arg          the above options can be put here

Information (optional):
  --help                display usage summary
  --help_advanced       display usage summary with advanced options
  --version             display program version

3. Add to environment variables
To make vina easier to call, add its installation directory to the PATH environment variable by editing a shell configuration file such as ~/.bashrc or ~/.bash_profile.

# Replace the path below with your vina installation path, then add this line to ~/.bashrc
export PATH=/home/.../autodock_vina_1_1_2_linux_x86/bin/:$PATH

Note: Other programs are needed for file preparation before docking and for visual analysis afterwards. It is recommended to install ADFR and add it to the PATH environment variable as well.
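Before starting a long screening run, it can be worth confirming that every program is actually reachable through PATH. The sketch below uses only the Python standard library; the list of tool names simply collects the commands used in this tutorial, and `missing_tools` is a hypothetical helper name.

```python
import shutil


def missing_tools(names):
    """Return the executable names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Commands used in this tutorial; adjust to what you actually installed.
    required = ["vina", "vina_split", "prepare_receptor", "prepare_ligand"]
    missing = missing_tools(required)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found.")
```

If a tool is reported as missing, re-check the export line in ~/.bashrc and open a new shell so that the change takes effect.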

3. Receptor and ligand preparation before docking

This part requires ADFR; please download and install it yourself and add it to the PATH environment variable. The following example assumes that you are in a directory that contains the receptor receptor.pdb and a set of ligands named ligand_01.mol2, ligand_02.mol2, and so on.

3.1 Receptor processing (converting PDB to PDBQT format)

Before converting the receptor from PDB to PDBQT format, check each of the following: whether the receptor is protonated (structures downloaded directly from the RCSB database have not gone through a protonation step); whether it contains heteroatoms (solvent, ions, cofactors, and other molecules that vina cannot include in the calculation); missing side chains; missing loops; binding-site mutations (pathological or otherwise); and whether any water molecules should be retained. I often use the Protein Preparation Wizard in Schrödinger's Maestro to normalize proteins; it handles protonation, missing side chains, missing loops, and energy minimization in one pass.
This script assumes that vina and ADFR are already in the PATH environment variable (see step 3 of section 2.2). Otherwise, please modify accordingly.

#!/bin/bash
prepare_receptor -r receptor.pdb -o receptor.pdbqt
#-r is the input receptor PDB format file
#-o is the output receptor PDBQT format file
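Some of the checks described above (protonation, heteroatoms, retained waters) can be partly automated before running prepare_receptor. The following is a rough sketch that reads the fixed-column PDB format and counts hydrogens, water residues, and other HETATM residues. The function name, the list of water residue names, and the fallback that guesses hydrogens from the atom name are assumptions made for illustration; the sketch does not detect missing side chains or loops.

```python
from collections import Counter

WATER_NAMES = {"HOH", "WAT", "H2O"}  # common water residue names (assumed list)


def summarize_pdb(lines):
    """Count hydrogens, waters and other HETATM residues in PDB text lines."""
    hydrogens = 0
    waters = set()
    hetero = Counter()
    for line in lines:
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        atom_name = line[12:16].strip()
        res_name = line[17:20].strip()
        residue_id = (line[21:22], line[22:26].strip())
        element = line[76:78].strip().upper()
        if not element:
            # Heuristic fallback when the element column is empty.
            guess = atom_name.lstrip("0123456789")
            element = "H" if guess.startswith("H") else ""
        if element == "H":
            hydrogens += 1
        if record == "HETATM":
            if res_name in WATER_NAMES:
                waters.add(residue_id)      # count each water residue once
            else:
                hetero[res_name] += 1       # atoms per non-water hetero residue
    return {"hydrogens": hydrogens, "waters": len(waters), "hetero": dict(hetero)}


if __name__ == "__main__":
    with open("receptor.pdb") as handle:
        print(summarize_pdb(handle))
```

A hydrogen count of zero suggests that the structure still needs protonation, and any entry under `hetero` is a molecule to review before conversion.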

3.2 Ligand processing (converting MOL2 to PDBQT format)

As with the receptor, before converting the format, check that the molecular structure is correct, check the protonation state, and confirm that the structure is 3D. A 2D structure must first be converted to 3D (see "Detailed explanation of LigPrep small-molecule 3D structure generation in Schrödinger"). It is strongly recommended not to use the PDB format for small molecules, because it does not carry the bond connectivity information that the conversion needs.
Currently, prepare_ligand in ADFR converts MOL2 format to PDBQT format, and a for loop converts all small molecules in one batch.
This script assumes that vina and ADFR are already in the PATH environment variable (see step 3 of section 2.2). Otherwise, please modify accordingly.

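#!/bin/bash

# Convert every MOL2 file in the current directory to PDBQT
for mol2 in ./*.mol2
do
    # Skip the unexpanded pattern when no MOL2 files are present
    [ -e "${mol2}" ] || continue
    prepare_ligand -l "${mol2}"
done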
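For large ligand sets it helps to make the conversion restartable and to record which files failed. The sketch below is one possible Python version of the loop above; it assumes prepare_ligand is on PATH and that its -o option names the output file. The helper names are hypothetical.

```python
import os
import subprocess


def pending_conversions(filenames):
    """Return (mol2, pdbqt) name pairs for MOL2 files that have no PDBQT yet."""
    present = set(filenames)
    pairs = []
    for name in sorted(present):
        stem, ext = os.path.splitext(name)
        if ext.lower() == ".mol2" and stem + ".pdbqt" not in present:
            pairs.append((name, stem + ".pdbqt"))
    return pairs


def convert_all(directory="."):
    """Run prepare_ligand on each pending file; return the names that failed."""
    failed = []
    for mol2, pdbqt in pending_conversions(os.listdir(directory)):
        # Run inside the ligand directory so that relative names resolve.
        cmd = ["prepare_ligand", "-l", mol2, "-o", pdbqt]
        result = subprocess.run(cmd, cwd=directory, capture_output=True, text=True)
        if result.returncode != 0 or not os.path.exists(os.path.join(directory, pdbqt)):
            failed.append(mol2)
    return failed


if __name__ == "__main__":
    print("Failed conversions:", convert_all("."))
```

Because files that already have a PDBQT counterpart are skipped, the script can be rerun after an interruption without repeating finished conversions.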

3.3 Docking pocket parameter generation

Create a configuration file conf.txt as shown below:

receptor = receptor.pdbqt

center_x = 2
center_y = 6
center_z = -7

size_x = 25
size_y = 25
size_z = 25

num_modes = 9

I usually generate the pocket parameters with the PyMOL plug-in GetBox-PyMOL-Plugin. After it is installed in PyMOL, the binding-site coordinates can be generated by selecting the ligand molecule or specified amino acids. Besides the vina pocket parameters shown above, the plug-in also generates the pocket parameters required by LeDock and AutoDock, which is very convenient for cross-validating results across several docking programs.
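When PyMOL is not available, a box can also be derived directly from the coordinates of a reference ligand. The sketch below takes the axis-aligned bounding box of the atoms and pads it on every side; this is a simple stand-in and not the algorithm used by GetBox-PyMOL-Plugin. The function names and the default padding of 5 Angstroms are assumptions.

```python
def read_pdb_coordinates(lines):
    """Extract (x, y, z) tuples from ATOM/HETATM records of PDB-style lines."""
    coords = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


def box_from_coordinates(coords, padding=5.0):
    """Return (center, size) of a box enclosing coords plus padding per side."""
    if not coords:
        raise ValueError("no coordinates given")
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = [round((lo + hi) / 2.0, 3) for lo, hi in zip(mins, maxs)]
    size = [round((hi - lo) + 2.0 * padding, 3) for lo, hi in zip(mins, maxs)]
    return center, size


def vina_config(receptor, center, size, num_modes=9):
    """Format the values as the text of a vina configuration file."""
    lines = [f"receptor = {receptor}", ""]
    lines += [f"center_{axis} = {value}" for axis, value in zip("xyz", center)]
    lines.append("")
    lines += [f"size_{axis} = {value}" for axis, value in zip("xyz", size)]
    lines += ["", f"num_modes = {num_modes}"]
    return "\n".join(lines) + "\n"
```

For example, `vina_config("receptor.pdbqt", *box_from_coordinates(coords))` returns text that can be written to conf.txt. It is still worth inspecting the resulting box visually before launching a screen.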

4. Large-scale virtual screening

The following example assumes that you are in a directory (/home/…/vina_dock/) that contains the receptor receptor.pdbqt, the configuration file conf.txt, and a set of ligands named ligand_01.pdbqt, ligand_02.pdbqt, and so on. The script run_vina.sh assumes that vina is already in the PATH environment variable (see step 3 of section 2.2). Otherwise, please modify accordingly.

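#!/bin/bash

cd /home/.../vina_dock/
# Match only ligand files so that receptor.pdbqt is not docked against itself
for pdbqt in ./ligand_*.pdbqt
do
    # Skip result files left over from a previous run
    case "${pdbqt}" in *_out.pdbqt) continue ;; esac
    vina --config conf.txt --ligand "${pdbqt}" --out "${pdbqt%.pdbqt}_out.pdbqt"
done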

When the loop finishes, one result file per ligand has been generated: ligand_01_out.pdbqt, ligand_02_out.pdbqt, and so on.
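On a multi-core server the same screen can be run with several vina processes side by side. The sketch below is one way to do that from Python: it skips receptor.pdbqt and any ligand that already has a result file, and it gives each vina process one CPU through the --cpu option. The function names, the worker count, and the one-CPU-per-job split are assumptions to adapt to your hardware.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def select_ligands(filenames, receptor="receptor.pdbqt"):
    """Pick ligand PDBQT names that are not the receptor, not results, not yet docked."""
    present = set(filenames)
    ligands = []
    for name in sorted(present):
        if not name.endswith(".pdbqt") or name == receptor:
            continue
        stem = name[: -len(".pdbqt")]
        if stem.endswith("_out") or stem + "_out.pdbqt" in present:
            continue
        ligands.append(name)
    return ligands


def vina_command(ligand, config="conf.txt", cpu=1):
    """Build the vina command line for one ligand file name."""
    stem = ligand[: -len(".pdbqt")]
    return ["vina", "--config", config, "--ligand", ligand,
            "--out", stem + "_out.pdbqt", "--cpu", str(cpu)]


def run_screen(directory=".", workers=4):
    """Dock all pending ligands; return (ligand, return code) pairs."""
    def dock(ligand):
        result = subprocess.run(vina_command(ligand), cwd=directory,
                                capture_output=True, text=True)
        return ligand, result.returncode

    ligands = select_ligands(os.listdir(directory))
    # Threads suffice here: each one only waits for an external vina process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(dock, ligands))
```

Running many single-CPU jobs tends to use the machine more evenly than a few multi-CPU jobs when ligands differ a lot in size, but the best split depends on the hardware and is worth measuring.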

5. Result processing

5.1 Extract docking scores

After docking is complete, extract the docking scores from the result files (ligand_01_out.pdbqt, ligand_02_out.pdbqt, and so on). The following script extracts the top-ranked docking score for each small molecule.

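#!/bin/bash

# Write one "name score" line per result file; vina_score.txt is rebuilt on every run
for i in ./*_out.pdbqt
do
    name=$(basename "$i" _out.pdbqt)
    # The first "REMARK VINA RESULT" line belongs to the top-ranked pose; field 4 is its score
    grep -m 1 "REMARK VINA RESULT" "$i" | awk -v id="$name" '{print id " " $4}'
done > vina_score.txt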

What the script does: it saves the top-ranked docking score of each small molecule in the vina_score.txt file.

The contents of vina_score.txt look like this:

ligand_01 -10.267
ligand_02 -10.146
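The same extraction can be done in Python, which is convenient when the scores feed into further analysis. The sketch below reads the "REMARK VINA RESULT" lines of a result file; the helper names are hypothetical, and the parsing assumes that the score is the first number after the colon, as in the grep and awk pipeline above.

```python
import os

RESULT_TAG = "REMARK VINA RESULT"


def scores_in(pdbqt_text):
    """Return the scores of all poses in a result file, in file order."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith(RESULT_TAG):
            fields = line.split(":", 1)[1].split()
            if fields:
                scores.append(float(fields[0]))
    return scores


def ligand_name(filename, suffix="_out.pdbqt"):
    """Strip the result suffix: ligand_01_out.pdbqt becomes ligand_01."""
    return filename[: -len(suffix)] if filename.endswith(suffix) else filename


def collect_top_scores(directory=".", suffix="_out.pdbqt"):
    """Return (ligand name, top score) pairs for every result file found."""
    rows = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(suffix):
            continue
        with open(os.path.join(directory, name)) as handle:
            scores = scores_in(handle.read())
        if scores:
            rows.append((ligand_name(name, suffix), scores[0]))
    return rows
```

Since poses are written in ranked order, the first score of each file is the top-ranked one, which matches what `grep -m 1` selects.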

5.2 Docking scoring and sorting

Sort vina_score.txt in ascending order by the score in the second column, so that the most negative (best) scores come first. The shell's sort command does this, and the sorted result can also be saved to the file vina_score_sort.txt.

# Print the sorted results
sort -k2,2n vina_score.txt
# Save the sorted results
sort -k2,2n vina_score.txt > vina_score_sort.txt

5.3 Extract scoring results less than a certain value

When positive or negative control molecules have been docked as well, their scores can serve as thresholds: filtering vina_score_sort.txt against such a threshold keeps only the molecules that score better. For example, with a threshold of -8:

awk '$2 < -8' vina_score_sort.txt
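Instead of typing the threshold by hand, it can be looked up from the control molecule's own line in the score file. The sketch below assumes the two-column "name score" layout of vina_score.txt; the function names are hypothetical, and any ligand named in a call is only an example.

```python
def read_scores(lines):
    """Parse 'name score' lines into (name, score) tuples, skipping blanks."""
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            rows.append((parts[0], float(parts[1])))
    return rows


def better_than_control(rows, control):
    """Return ligands scoring strictly below the control, best first."""
    threshold = dict(rows)[control]  # raises KeyError if the control is absent
    hits = [(name, score) for name, score in rows
            if score < threshold and name != control]
    return sorted(hits, key=lambda row: row[1])


if __name__ == "__main__":
    with open("vina_score.txt") as handle:
        rows = read_scores(handle)
    # "ligand_02" stands in for whichever molecule is your control.
    for name, score in better_than_control(rows, "ligand_02"):
        print(name, score)
```

Keep in mind that small score differences are within the noise of the scoring function, so a ligand that barely beats the control deserves the same scrutiny as one that barely misses it.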

5.4 Conformational splitting after docking

Take ligand_01_out.pdbqt as an example. It contains multiple docked conformations, ordered by score. The program vina_split, which ships with vina, splits them into separate files for subsequent analysis. This command assumes that vina is already in the PATH environment variable (see step 3 of section 2.2). Otherwise, please modify accordingly.

vina_split --input ligand_01_out.pdbqt --ligand ligand_01_split

The individual conformations are written to separate files:
ligand_01_split1.pdbqt
ligand_01_split2.pdbqt
ligand_01_split3.pdbqt
ligand_01_split4.pdbqt
ligand_01_split5.pdbqt
ligand_01_split6.pdbqt
ligand_01_split7.pdbqt
ligand_01_split8.pdbqt
ligand_01_split9.pdbqt
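If only a few poses are needed, or the split has to happen inside a larger Python workflow, the multi-model file can also be separated by its MODEL and ENDMDL records. The sketch below is a simplified stand-in for vina_split rather than a reimplementation of it; the function names and the output naming are assumptions.

```python
def split_models(pdbqt_text):
    """Split a multi-model PDBQT string into one string per MODEL/ENDMDL block."""
    models = []
    current = None
    for line in pdbqt_text.splitlines():
        if line.startswith("MODEL"):
            current = []                      # start collecting a new pose
        elif line.startswith("ENDMDL"):
            if current is not None:
                models.append("\n".join(current) + "\n")
            current = None
        elif current is not None:
            current.append(line)              # keep lines inside the block
    return models


def write_models(pdbqt_path, prefix, limit=None):
    """Write the first `limit` poses to prefix1.pdbqt, prefix2.pdbqt, and so on."""
    with open(pdbqt_path) as handle:
        models = split_models(handle.read())
    written = []
    for index, model in enumerate(models[:limit], start=1):
        out_path = f"{prefix}{index}.pdbqt"
        with open(out_path, "w") as out:
            out.write(model)
        written.append(out_path)
    return written
```

For instance, `write_models("ligand_01_out.pdbqt", "ligand_01_split", limit=3)` would write only the three top-ranked poses.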

5.5 Visual Analysis

Conformations docked with vina can be loaded directly into PMV in MGLTools for visualization. For those who prefer PyMOL for visual analysis, the PDBQT format cannot be loaded directly; it first has to be converted to MOL2, SDF, or a similar format. The conversion can be done with the babel program that ships with ADFR.

babel -ipdbqt ligand_01_split1.pdbqt -osdf ligand_01_split1.sdf