Variant Effect Predictor Download and install
Download
Download the ensembl-vep package using one of the methods below, then follow the installation instructions.
Using Git
Clone the Git repository
Use git to download the ensembl-vep package:
git clone https://github.com/Ensembl/ensembl-vep.git
cd ensembl-vep
Update to a newer version
To update from a previous version:
cd ensembl-vep
git pull
git checkout release/104
perl INSTALL.pl
Use an older version
To use an older version (this example shows how to set up release 87):
cd ensembl-vep
git checkout release/87
perl INSTALL.pl
Download the Zipped package file
Users without the git utility installed may download a zip file from GitHub, though we would always recommend using git if possible.
curl -L -O https://github.com/Ensembl/ensembl-vep/archive/release/104.zip
unzip 104.zip
cd ensembl-vep-release-104/
Previous versions (ensembl-tools)
Previously VEP was available as part of the ensembl-tools package (see the Ensembl archive site for documentation). The following downloads are available for archival purposes.
What's new?
New in version 104 (May 2021)
- Human GRCh37 cache files now include dbSNP 154!
- --var_synonyms output structure has been altered when used with --json
Requirements
VEP requires:
- gcc, g++ and make
- Perl version 5.10 or above recommended (tested on 5.10, 5.14, 5.18, 5.22, 5.26)
- Perl packages:
See this guide for more information on how to install perl modules.
Additional libraries can be installed for extra features and enhancements, but they are not required for most use cases.
VEP's INSTALL.pl script will install the required components of the Ensembl API for you, but VEP may also be used with any pre-existing API installations you have, provided their versions match the version of VEP you are using.
VEP has been developed for UNIX-like environments and works well on Linux (e.g. Ubuntu, Debian, Mint) and Mac OSX.
It can also be used on Windows systems with a more involved installation process.
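Before installing, it can help to confirm that the build tools and Perl version listed above are present. The following is a minimal sketch of such a check; the function names have_tool and perl_is_recent are illustrative and not part of VEP.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if a command is available on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: return 0 if the perl on PATH is version 5.10 or newer.
perl_is_recent() {
    have_tool perl && perl -e 'exit($] >= 5.010 ? 0 : 1)'
}

# Print one status line per required tool; this loop never aborts.
for tool in gcc g++ make perl; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done

if perl_is_recent; then
    echo "perl is version 5.10 or newer"
else
    echo "perl is missing or older than 5.10"
fi
```

This only checks that the tools can be found; it does not verify the Perl packages that VEP also needs.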
Installation
VEP's INSTALL.pl makes it easy to set up your environment for using the VEP. It will download and configure a minimal set of the Ensembl API for use by the VEP, and can also download cache files, FASTA files and plugins.
Run the following, and follow any prompts as they appear:
perl INSTALL.pl
Additional non-essential components and enhancements must be installed manually.
Software components installed
If you already have the latest version of the API installed you do not need to run the installer, although it can be used to simply update your API version (with post-release patches applied), and retrieve cache and FASTA files. The installer downloads the API within the VEP directory and will not affect any other Ensembl API installations.
The script will also attempt to install a Perl::XS module, Bio::DB::HTS, for rapid access to bgzipped FASTA files. If this fails, you may add the --NO_HTSLIB flag when running the installer; VEP will fall back to using Bio::DB::Fasta for this functionality (more details).
Running the installer
The installer is run on the command line as follows:
perl INSTALL.pl [options]
Follow the on-screen prompts and note any warnings about files that will be deleted or overwritten.
You should not need to add any options, but configuration of the installer is possible with the following flags:
Flag | Alternate | Description |
---|---|---|
--ASSEMBLY | -y | Assembly version to use when using --AUTO. Most species have only one assembly available on each software release; currently this is only required for human on release 76 onwards. |
--AUTO | -a | Run installer without prompts. Use the following options to specify parts to install: a (API), c (cache), f (FASTA), p (plugins); e.g. for API and cache: perl INSTALL.pl --AUTO ac |
--CACHE_VERSION [version] | | By default the installer will download the latest version of VEP caches and FASTA files (currently 104). You can force the script to install a different version, but there is no guarantee that a version of the API will be compatible with a different version of the cache. |
--CACHEDIR [dir] | -c | By default the script will install the cache files in the ".vep" subdirectory in your home area. This option configures where cache files are installed. The --dir_cache flag must be passed when running the VEP if a non-default cache directory is given: ./vep --dir_cache [dir] |
--DESTDIR [dir] | -d | By default the script will install the API modules in a subdirectory of the current directory named "Bio". Using this option you can configure where the Bio directory is created. If something other than the default is used, this directory must either be added to your PERL5LIB environment variable when running the VEP, or included using perl's -I flag: perl -I [dir] vep |
--NO_HTSLIB | -l | Don't attempt to install Bio::DB::HTS/htslib |
--NO_TEST | | Don't run API tests - useful if you know a harmless failure will prevent continuation of the installer |
--NO_UPDATE | -n | By default the script will check for new versions or updates of the VEP. Using this option will skip this check. |
--PLUGINS | -g | Comma-separated list of plugins to install when using --AUTO. List the available plugins: perl INSTALL.pl -a p --PLUGINS list. Download/install all the available plugins: perl INSTALL.pl -a p --PLUGINS all. Download/install a defined list of plugins, e.g.: perl INSTALL.pl -a p --PLUGINS dbNSFP,CADD,G2P |
--PLUGINSDIR [dir] | -r | By default the script will install the plugins files in the "Plugins" subdirectory of the cache directory (see --CACHEDIR). The --dir_plugins flag must be passed when running the VEP if a non-default plugins directory is given: ./vep --dir_plugins [dir] |
--PREFER_BIN | -p | Use this if the installer fails with out of memory errors. |
--SPECIES | -s | Comma-separated list of species to install when using --AUTO. To install the RefSeq cache, add "_refseq" to the species name, e.g. "homo_sapiens_refseq", or "_merged" to install the merged Ensembl/RefSeq cache. Remember to use --refseq or --merged when running the VEP with the relevant cache! |
--QUIET | -q | Don't write any status output when using --AUTO. |
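The flags above can be combined for an unattended installation. The following is a minimal sketch that only prints the resulting command (a dry run) so it can be reviewed first; the helper name vep_install_cmd and its argument order are assumptions made for illustration, while the flags themselves are those listed in the table.

```shell
#!/bin/sh
# Hypothetical helper: print the INSTALL.pl command for an unattended install.
# Arguments: parts to install (e.g. "cf"), species, assembly, cache directory.
vep_install_cmd() {
    printf 'perl INSTALL.pl --AUTO %s --SPECIES %s --ASSEMBLY %s --CACHEDIR %s\n' \
        "$1" "$2" "$3" "$4"
}

# Dry run: show the command for cache + FASTA, human GRCh38, default cache path.
vep_install_cmd cf homo_sapiens GRCh38 "$HOME/.vep"
```

Once the printed line looks right, it can be copied and run from within the ensembl-vep directory.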
Additional components
INSTALL.pl will set up the minimum requirements for VEP. Some features and enhancements, however, require the installation of additional components. Most are perl modules that are easily installed using cpanm; see this guide for more information on how to install perl modules.
Typically, you will use cpanm to install modules locally in your home directories; this shows how to set up a path for perl modules and install one there:
mkdir -p $HOME/cpanm
export PERL5LIB=$PERL5LIB:$HOME/cpanm/lib/perl5
cpanm -l $HOME/cpanm Set::IntervalTree
To make the change to PERL5LIB permanent, it is recommended to add the export line to your $HOME/.bashrc or $HOME/.profile.
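If the export line is added by a script rather than by hand, it is worth guarding against appending it more than once. The following is a minimal sketch; the helper name add_line_once is hypothetical, and the demonstration writes to a scratch file instead of a real startup file.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if it is not already there.
add_line_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Demonstrate on a scratch file; calling twice still leaves a single line.
demo_rc=$(mktemp)
add_line_once "$demo_rc" 'export PERL5LIB=$PERL5LIB:$HOME/cpanm/lib/perl5'
add_line_once "$demo_rc" 'export PERL5LIB=$PERL5LIB:$HOME/cpanm/lib/perl5'
cat "$demo_rc"
rm -f "$demo_rc"
```

To apply it for real, pass $HOME/.bashrc or $HOME/.profile as the first argument.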
- Additional features
  - JSON - required to produce JSON format output
  - Set::IntervalTree - used to find overlaps between entities in coordinate space. Required to use --nearest
  - Bio::DB::BigFile - required to use bigWig format custom annotation files. See Bio::DB::BigFile instructions.
- Speed enhancements - these modules can improve VEP runtime
  - PerlIO::gzip - marginal gains in compressed file parsing as used by VEP cache
  - ensembl-xs - provides pre-compiled replacements for frequently used routines in VEP. Requires manual installation, see README for details
Bio::DB::BigFile
In order for VEP to be able to access bigWig format custom annotation files, the Bio::DB::BigFile perl module is required. Installation involves downloading and compiling the kent source tree. The current version of the kent source tree does not work correctly with Bio::DB::BigFile, so it is necessary to install an archive version known to work (v335).
- Download and unpack the kent source tree
  wget https://github.com/ucscGenomeBrowser/kent/archive/v335_base.tar.gz
  tar xzf v335_base.tar.gz
- Set up some environment variables; these are required only temporarily for this installation process
  export KENT_SRC=$PWD/kent-335_base/src
  export MACHTYPE=$(uname -m)
  export CFLAGS="-fPIC"
  export MYSQLINC=`mysql_config --include | sed -e 's/^-I//g'`
  export MYSQLLIBS=`mysql_config --libs`
- Modify kent build parameters
  cd $KENT_SRC/lib
  echo 'CFLAGS="-fPIC"' > ../inc/localEnvironment.mk
- Build kent source
  make clean && make
  cd ../jkOwnLib
  make clean && make
  If either of these steps fails, you may have some missing dependencies. Known common missing dependencies are libpng and libssl; these may be installed, for example, with apt-get on Ubuntu. If you do not have sudo access you may have to ask your sysadmin to install any missing dependencies.
  sudo apt-get install libpng-dev libssl-dev
  On Mac OSX you may use brew; the openssl libraries also need to be symbolically linked to a different path:
  brew install libpng openssl
  cd /usr/local/include
  ln -s ../opt/openssl/include/openssl .
  cd -
- On some systems (e.g. Mac OSX), a compiled file is placed in a path that Bio::DB::BigFile cannot find. You can correct this with:
  ln -s $KENT_SRC/lib/x86_64/* $KENT_SRC/lib/
- We'll now use cpanm to install the perl module for Bio::DB::BigFile itself. See above for guidance on this. In this example we're going to install the module to a path within your home directory. To do this we must modify the paths that perl searches for modules by adding to the PERL5LIB environment variable. To make this change permanent you must add the export line to your $HOME/.bashrc or $HOME/.profile.
  mkdir -p $HOME/cpanm
  export PERL5LIB=$PERL5LIB:$HOME/cpanm/lib/perl5
  cpanm -l $HOME/cpanm Bio::DB::BigFile
  If you are prompted for the path to the kent source tree, something went wrong in the compilation above. Check that $KENT_SRC/lib/jkweb.a exists and is not found instead at e.g. $KENT_SRC/lib/x86_64/jkweb.a. If it is, you may copy or link the file (and the other files in that directory) to the former path.
  ln -s $KENT_SRC/lib/x86_64/* $KENT_SRC/lib/
- You should now be able to successfully run the appropriate test in the VEP package:
  perl -Imodules t/AnnotationSource_File_BigWig.t
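The jkweb.a check described in the steps above can be scripted. The following is a minimal sketch; the helper name find_jkweb is hypothetical, and using the output of uname -m as the architecture subdirectory is an assumption based on the MACHTYPE setting and the x86_64 example above.

```shell
#!/bin/sh
# Hypothetical helper: report where jkweb.a ended up under a kent "lib" directory.
# Prints the path and returns 0 if found; returns 1 otherwise.
find_jkweb() {
    lib_dir=$1
    arch_dir="$lib_dir/$(uname -m)"   # assumed architecture subdirectory
    if [ -e "$lib_dir/jkweb.a" ]; then
        echo "$lib_dir/jkweb.a"
    elif [ -e "$arch_dir/jkweb.a" ]; then
        echo "$arch_dir/jkweb.a"
    else
        return 1
    fi
}

# Only run the check if KENT_SRC has been set as in the steps above.
if [ -n "${KENT_SRC:-}" ]; then
    find_jkweb "$KENT_SRC/lib" || echo "jkweb.a not found under $KENT_SRC/lib"
fi
```

If the printed path is inside the architecture subdirectory, the ln -s command above links the files into the location that Bio::DB::BigFile expects.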
Using VEP in Mac OS
Installing VEP on Mac OS is slightly trickier than on Linux systems, and requires additional dependencies.
These instructions will guide you through the setup of Perlbrew, Homebrew, MySQL and other dependencies that will allow for a clean installation of VEP on your Mac OS system.
These instructions have been tested on macOS High Sierra (10.13) and macOS Sierra (10.12).
Older versions may require additional tweaks; however, we shall endeavor to keep these instructions up to date for future versions of macOS.
Prerequisite Setup
List of prerequisites: XCode, GCC, Perlbrew, Cpanm, Homebrew, mysql, DBI, DBD::mysql
XCode and GCC
VEP requires XCode and GCC for installation purposes. Fortunately, recent versions of macOS will look for (and attempt to install if required) both of these when you run the following command:
gcc -v
Perlbrew
We recommend using Perlbrew to install a new version of Perl on your Mac, to avoid interfering with the vendor Perl. This can be done with the following commands:
curl -L http://install.perlbrew.pl | bash
echo 'source $HOME/perl5/perlbrew/etc/bashrc' >> ~/.bash_profile
At this point, PLEASE RESTART YOUR TERMINAL WINDOW to allow for the perlbrew changes to take effect.
We recommend installing Perl version 5.26.2 to run VEP, and installing cpanm to handle the installation of perl modules.
These steps can be completed with the commands:
perlbrew install -j 5 --as 5.26.2 --thread --64all -Duseshrplib perl-5.26.2 --notest
perlbrew switch 5.26.2
perlbrew install-cpanm
Homebrew
Homebrew is a package manager for macOS that makes installing the next prerequisite (i.e. xz) easier.
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
xz
VEP requires the installation of xz, a data-compression utility. The easiest way to install the xz package is through homebrew:
brew install xz
MySQL
In order to connect to the Ensembl databases, a collection of MySQL-related dependencies is required. Fortunately, these can be installed neatly with Homebrew and Cpanm:
brew install mysql
cpanm DBI
cpanm DBD::mysql
Installing BioPerl
On some versions of macOS, the VEP installer fails to cleanly install BioPerl, so a manual install will prevent issues:
curl -O https://cpan.metacpan.org/authors/id/C/CJ/CJFIELDS/BioPerl-1.6.924.tar.gz
tar zxvf BioPerl-1.6.924.tar.gz
echo 'export PERL5LIB=${PERL5LIB}:##PATH_TO##/bioperl-1.6.924' >> ~/.bash_profile
where ##PATH_TO##/bioperl-1.6.924 refers to the location of the newly unzipped BioPerl directory.
Final Dependencies
Installing the following Perl modules with cpanm will allow for full VEP functionality:
cpanm Test::Differences Test::Exception Test::Perl::Critic Archive::Zip PadWalker Error Devel::Cycle Role::Tiny::With Module::Build
export DYLD_LIBRARY_PATH=/usr/local/mysql/lib/:$DYLD_LIBRARY_PATH
Installing VEP
And that should be that! You should now be able to install VEP using the installer:
git clone https://github.com/ensembl/ensembl-vep
cd ensembl-vep
perl INSTALL.pl --NO_TEST
Using VEP in Windows
VEP was developed as a command-line tool, and as a Perl script its natural environment is a Linux system. However, there are several ways you can use VEP on a Windows machine.
You may also consider using VEP's web or REST interfaces.
Virtual machines
Using a virtual machine you can run a virtual Linux system in a window on your machine. There are two ways to do this:
- Use the Ensembl virtual machine image
- Use Docker
Perl
If Perl is installed on Windows, VEP can be set up directly, although this may require installing dependent modules. We recommend using Docker to run VEP on Windows.
- Check Perl is installed
- Download and unpack the zip of the ensembl-vep package
- Open a Command Prompt (search for Command Prompt in the Start Menu)
- Navigate to the directory where you unpacked the VEP package, e.g.
  cd Downloads/ensembl-vep-release-104
- Run INSTALL.pl with --NO_HTSLIB and --NO_TEST; you will see some warnings about the "which" command not being available (these will also appear when running VEP and can be ignored).
  perl INSTALL.pl --NO_HTSLIB --NO_TEST
Docker
Docker allows you to run applications in virtualised "containers". A docker image for VEP is available from DockerHub: VEP in DockerHub
The VEP Docker image uses ubuntu:18.04 as base image.
Commands to download the VEP Docker image and run VEP (the Docker client must be downloaded and installed beforehand):
docker pull ensemblorg/ensembl-vep
docker run -t -i ensemblorg/ensembl-vep ./vep
Currently no volumes are pre-configured for the container; you need to mount one if you wish to download data (e.g. cache files) that persists across sessions.
The following is a brief example showing how to use a directory on your local (host) machine to store cache data for VEP.
# Create a directory on your machine:
mkdir $HOME/vep_data
# Make sure that the created directory on your machine has read and write access granted
# so the docker container can write in the directory (VEP output):
chmod a+rwx $HOME/vep_data
docker run -t -i -v $HOME/vep_data:/opt/vep/.vep ensemblorg/ensembl-vep
Cache and Plugins installation
The installer will now ask whether you wish to re-install the API. Type "n" followed by enter to skip to cache installation. You will be presented with a list of species; type the number for your species/assembly of interest and press enter. Your data will now download and unpack; this may take a while.
If you wish to retrieve HGVS annotations it is recommended to also download the FASTA file for your species. To do this, at the next prompt type "0" and press enter. You may skip the plugin installation also.
The above process may also be performed in one command; for example, to set up the cache and corresponding FASTA for human GRCh38:
docker run -t -i -v $HOME/vep_data:/opt/vep/.vep ensemblorg/ensembl-vep perl INSTALL.pl -a cf -s homo_sapiens -y GRCh38
If you wish to include the VEP plugins, add the 'p' value to the -a flag and the --PLUGINS (or -g) flag as well:
# Install all the available plugins:
docker run -t -i -v $HOME/vep_data:/opt/vep/.vep ensemblorg/ensembl-vep perl INSTALL.pl -a cfp -s homo_sapiens -y GRCh38 -g all
# or install a defined list of plugins:
docker run -t -i -v $HOME/vep_data:/opt/vep/.vep ensemblorg/ensembl-vep perl INSTALL.pl -a cfp -s homo_sapiens -y GRCh38 -g dbNSFP,CADD,G2P
The installer has now downloaded this data to $HOME/vep_data (and $HOME/vep_data/Plugins for the VEP plugins). VEP will automatically detect caches downloaded in this folder as it is mapped to VEP's default directory within the Docker instance.
docker run -t -i -v $HOME/vep_data:/opt/vep/.vep ensemblorg/ensembl-vep ./vep -i examples/homo_sapiens_GRCh38.vcf --cache
Read/Write access from the Docker container
Mounted volume - recommended data structure
i.e. VEP data structure outside the Docker container
Diagram representing a recommended data file structure for the mounted volume
Here is an example of how you can run VEP using the setup presented in the image above (providing that the cache, the dbNSFP plugin and its data file have been downloaded):
docker run -t -i -v $HOME/vep_data:/opt/vep/.vep ensemblorg/ensembl-vep
# Example of VEP command line:
./vep --cache --offline --format vcf --vcf --force_overwrite \
  --dir_cache /opt/vep/.vep/ \
  --dir_plugins /opt/vep/.vep/Plugins/ \
  --input_file /opt/vep/.vep/input/my_input.vcf \
  --output_file /opt/vep/.vep/output/my_output.vcf \
  --custom /opt/vep/.vep/custom/my_extra_data.bed,BED_DATA,bed,exact,1 \
  --plugin dbNSFP,/opt/vep/.vep/Plugins/dbNSFP.gz,ALL
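The subdirectories referenced in the command above (input, output, custom and Plugins) need to exist on the host inside the mounted directory. The following is a minimal sketch that creates them; the helper name make_vep_layout is hypothetical, and the directory names are taken from the paths in the example command rather than from the diagram itself.

```shell
#!/bin/sh
# Hypothetical helper: create the subdirectories used in the example command
# (input, output, custom, Plugins) under a host data directory.
make_vep_layout() {
    root=$1
    for sub in input output custom Plugins; do
        mkdir -p "$root/$sub" || return 1
    done
    # Grant read/write access so the container can write its output.
    chmod -R a+rwx "$root"
}

# Demonstrate in a scratch location rather than in $HOME/vep_data.
demo_root=$(mktemp -d)
make_vep_layout "$demo_root/vep_data"
ls "$demo_root/vep_data"
rm -rf "$demo_root"
```

To set up the real layout, call make_vep_layout "$HOME/vep_data" before starting the container.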
Update from a previous version
Update your docker container
# List containers
docker ps -a
# e.g.
CONTAINER ID   IMAGE                    COMMAND       CREATED              STATUS                      PORTS   NAMES
d64055ffe9e9   ensemblorg/ensembl-vep   "/bin/bash"   About a minute ago   Exited (0) 59 seconds ago           tender_ritchie
# Stop and remove old container
docker stop tender_ritchie
docker rm tender_ritchie
# Update the container
docker pull ensemblorg/ensembl-vep
Update your cache
# Install the new cache through the VEP INSTALL.pl script (see "Cache and Plugins installation" section above)
docker run -t -i -v $HOME/vep_data:/opt/vep/.vep ensemblorg/ensembl-vep perl INSTALL.pl -a c
# Or you can install the cache manually
cd $HOME/vep_data
curl -O http://ftp.ensembl.org/pub/release-104/variation/indexed_vep_cache/homo_sapiens_vep_104_GRCh38.tar.gz
tar xzf homo_sapiens_vep_104_GRCh38.tar.gz
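When scripting the manual download for other species or releases, the archive name can be assembled from its parts. The following is a minimal sketch; the helper name vep_cache_file is hypothetical, and the species_vep_release_assembly.tar.gz pattern is inferred from the single file name shown above, so it should be checked against the actual download listing.

```shell
#!/bin/sh
# Hypothetical helper: build a cache archive name from species, release and
# assembly. The naming pattern is an assumption inferred from one example.
vep_cache_file() {
    printf '%s_vep_%s_%s.tar.gz\n' "$1" "$2" "$3"
}

vep_cache_file homo_sapiens 104 GRCh38
# prints: homo_sapiens_vep_104_GRCh38.tar.gz
```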