Non-Ensembl Applications Build/Install

Ensembl is built on the following applications:

  • Git - versioning system for downloading Ensembl code
  • MySQL - Open source database server
  • Perl - Open source scripting language
  • Apache and mod_perl - Open source web server
  • Optional components:
    • SQLite - alternative open source database server for session information (if not installing MySQL)
    • hubCheck - track hub validation utility

These applications are not version-specific for Ensembl; that is, if you upgrade your Ensembl installation to a newer version when one becomes available, you probably won't need to install new versions of these applications.

All of this software, like all of Ensembl, is Open Source software and can be downloaded and used free of charge. You should, however, check the documentation for each application to see what license it has been released under, particularly if you are installing Ensembl in a commercial environment.

The following instructions assume you have root access to the installation machine. If you do not, ask your system administrator to install this software for you.

You may have some or all of this software installed already. If you have any problems getting the site running with pre-installed software (in particular Apache with mod_perl installed from RPMs), we recommend simply installing the latest version using the following instructions.
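
To see which of these applications are already present, a quick check of your PATH can help. The following is a minimal sketch, not part of the Ensembl distribution: the helper name have_tool is hypothetical, and the command names checked (git, mysql, perl, httpd) are assumptions that may differ on your system (for example, some distributions call the Apache binary apache2).

```shell
#!/bin/sh
# have_tool NAME: succeed if NAME resolves to a command on the current PATH.
# (Hypothetical helper for illustration; not part of Ensembl.)
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report each prerequisite. The command names are assumptions and may
# differ between distributions.
for tool in git mysql perl httpd; do
    if have_tool "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found"
    fi
done
```

A tool being found says nothing about whether its version or build options suit Ensembl, so treat the output only as a starting point.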

Git

Git is a software version control system that we use for storing the source code to Ensembl. You will need Git installed if you want to download the latest Ensembl source code. It will also help you keep up-to-date with any bug fixes. Our repositories are stored on GitHub.

Git installation instructions can be found on the Git website.

We formerly developed the Ensembl codebase in CVS, but in December 2013 it was moved to Git, where releases from 32 onwards are available.

MySQL

MySQL is a very popular Open Source relational database system. The easiest way to install MySQL is to use the pre-compiled binaries from http://dev.mysql.com. You can also get source from http://dev.mysql.com if you wish to compile MySQL yourself.

To install MySQL:

  1. Download the appropriate standard binaries from http://dev.mysql.com/downloads/mysql. Get the current stable version - at the time of writing, this is 5.0.51.
  2. Create a directory for MySQL to be installed into. A subdirectory of this will hold the databases, so choose somewhere that has sufficient free space - at least 2.7 TB for the complete set. We will use /data/ as an example; when following these instructions, replace /data/ with whatever path you choose.
  3. Move the binary tarball to /data/
  4. Unpack the tarball with:

    gunzip < mysql-WHATEVER.tar.gz | tar xvf -
  5. Follow the straightforward setup instructions in the INSTALL-BINARY file that comes with MySQL. It should be located in the "mysql-WHATEVER" directory you just unpacked.
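
Because the complete set of databases needs at least 2.7 TB, it may be worth checking free space on the target filesystem before unpacking anything. The following is a rough sketch under stated assumptions: the helper names free_kb and has_space and the MYSQL_PARENT_DIR variable are hypothetical, /data is the example path used above, and 1 TB is taken to mean 1024^4 bytes.

```shell
#!/bin/sh
# free_kb DIR: print the free space, in 1024-byte blocks, on the
# filesystem that holds DIR. Prints nothing if DIR does not exist.
free_kb() {
    # -P selects the portable one-line-per-filesystem layout, in which
    # column 4 is "Available" (assumes no spaces in the device name).
    df -Pk "$1" 2>/dev/null | awk 'NR == 2 { print $4 }'
}

# has_space DIR REQUIRED_KB: succeed if DIR has at least REQUIRED_KB free.
has_space() {
    avail=$(free_kb "$1") || return 1
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# MYSQL_PARENT_DIR is a hypothetical variable; /data is the example path.
target=${MYSQL_PARENT_DIR:-/data}
# 2.7 TB expressed in 1024-byte blocks (needs 64-bit shell arithmetic).
required_kb=$((27 * 1024 * 1024 * 1024 / 10))

if has_space "$target" "$required_kb"; then
    echo "$target has room for the complete set of databases"
else
    echo "$target has less than 2.7 TB free (or does not exist)"
fi
```

If you only plan to install a subset of the databases, lower required_kb accordingly.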

Perl

If you are on a Unix-based OS such as Mac OS X or Linux, you will almost certainly have Perl installed already. You need Perl 5, version 5.14.2 or higher, to run the website.

To see if you have Perl installed, and/or to check its version number, type:

perl -v

If you don't have Perl installed, or need to upgrade, go to www.cpan.org/ and choose the 'source code' install. Follow the installation instructions on the web site.
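
The version requirement can also be checked from a script rather than by reading the output of perl -v. This is a minimal sketch, assuming a POSIX shell; the helper name version_ge is hypothetical, and the minimum of 5.14.2 is taken from the text above.

```shell
#!/bin/sh
# version_ge A B: succeed if dotted numeric version A is >= B,
# comparing field by field (missing fields count as 0).
version_ge() {
    left=$1
    right=$2
    while [ -n "$left" ] || [ -n "$right" ]; do
        l=${left%%.*}
        r=${right%%.*}
        [ -n "$l" ] || l=0
        [ -n "$r" ] || r=0
        if [ "$l" -gt "$r" ]; then return 0; fi
        if [ "$l" -lt "$r" ]; then return 1; fi
        # Drop the field just compared (or everything, if no dot remains).
        case $left in *.*) left=${left#*.} ;; *) left= ;; esac
        case $right in *.*) right=${right#*.} ;; *) right= ;; esac
    done
    return 0
}

if command -v perl >/dev/null 2>&1; then
    # %vd prints the interpreter version ($^V) as dotted integers.
    installed=$(perl -e 'printf "%vd", $^V')
    if version_ge "$installed" 5.14.2; then
        echo "Perl $installed meets the 5.14.2 minimum"
    else
        echo "Perl $installed is older than 5.14.2"
    fi
else
    echo "perl not found on PATH"
fi
```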

Apache & mod_perl

Apache is the web server that the Ensembl site runs on. mod_perl is a module for Apache that enables it to compile Perl scripts once rather than each time they are requested, and so makes everything run a lot faster.

Please follow these instructions precisely, as the default version of Apache or mod_perl often does not work correctly for Ensembl.

To install Apache with mod_perl:

  1. Download the Apache2 source tarball from http://httpd.apache.org/dist/httpd/. Please note that mod_perl does not work reliably with Apache 2.4 at the moment, so we recommend using the 2.2.x legacy version - at the time of writing, this is 2.2.25.
  2. Download the mod_perl source from http://www.cpan.org/modules/by-module/Apache2/ . Again, get the latest version; currently this is 2.0.3, and the file to download is mod_perl-2.0.3.tar.gz.
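  3. Unpack all the sources in a working directory, then configure, build and install Apache. In the commands in this and the following steps, "Apache directory" stands for the path you want Apache installed into:

    tar zxf httpd-2.2.25.tar.gz
    tar zxf mod_perl-2.0.3.tar.gz
    cd httpd-2.2.25
    ./configure --enable-deflate --prefix=Apache directory
    # Build and install Apache so that bin/apxs exists for the mod_perl build
    make
    make install
    cd ../mod_perl-2.0.3

     The default installation directory is an apache2 subdirectory of your website's server root; if you want to put it elsewhere, you will need to override the $APACHE_DIR setting in your plugin.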
  4. Build the perl makefile:

    perl Makefile.PL PREFIX=Apache directory MP_APXS=Apache directory/bin/apxs
  5. Run the 'make' utility:

    make
  6. ...and install

    make install
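
After the build steps above, a quick sanity check is to confirm that the expected files landed in the Apache directory. The following is a sketch only: the function name check_apache_install is hypothetical, the relative paths (bin/httpd, bin/apxs, modules/mod_perl.so) assume Apache's default directory layout under the chosen prefix, and APACHE_DIR here is an ordinary shell environment variable used for illustration, not the plugin setting of the same name.

```shell
#!/bin/sh
# check_apache_install DIR: list the files a finished Apache + mod_perl
# build is expected to leave under DIR; fail if any are missing.
check_apache_install() {
    dir=$1
    status=0
    # Paths assume Apache's default layout relative to --prefix.
    for f in bin/httpd bin/apxs modules/mod_perl.so; do
        if [ -e "$dir/$f" ]; then
            echo "ok       $dir/$f"
        else
            echo "missing  $dir/$f"
            status=1
        fi
    done
    return $status
}

# Only run the check if the caller has exported APACHE_DIR.
if [ -n "${APACHE_DIR:-}" ]; then
    check_apache_install "$APACHE_DIR" || echo "Apache/mod_perl build looks incomplete"
fi
```

For example, running the script with APACHE_DIR set to the path you passed to --prefix prints one line per expected file.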

Perl modules

The Ensembl website needs quite a few Perl modules to be installed in order for it to run. Many will be included by default in more recent versions of Perl; listed below are some you may have to install separately.

These modules can all be downloaded from www.cpan.org, and are all installed in much the same way. Download the module tarball, unpack it in a working directory, and install the module:

gunzip < module.tar.gz | tar xvf -
cd module
perl Makefile.PL
make
make test
make install

The modules that are required are listed below - we recommend searching CPAN for the appropriate version for your current version of Perl.

Ensembl website

These modules are mandatory for any website based on EnsEMBL::Web code. Some plugins may have additional dependencies, e.g. public-plugins/users requires the Rose database abstraction suite.

  • Archive::Zip - OO interface for creating zip files
  • Bio::BigFile - Low-level interface to BigWig & BigBed files
  • Bio::DB::HTS - Interface to BAM and CRAM files
  • CGI::Session - Persistent session data in CGI applications
  • Class::Accessor - Automated accessor generation
  • Class::DBI::Sweet - Database abstraction layer, used to access non-genomic databases such as ensembl_accounts
  • Clone - Recursively copies Perl datatypes; used for DOM manipulation
  • Compress::Bzip2 - Used to handle uploaded bz2 files
  • CSS::Minifier - Reduces CSS files for faster page loading
  • DB_File - Caches data from the database to a file
  • DBI - A common database interface for Perl
  • DBD::mysql - The MySQL drivers for the DBI interface
  • Digest::MD5 - Perl interface to the MD5 algorithm
  • File::Spec::Functions - Performs operations on file names
  • GD - A graphics library. Note: may require additional modules; please read the install docs.
  • Hash::Merge - Used to merge data from multiple configuration files
  • HTML::Entities - Encodes or decodes strings with HTML entities
  • HTTP::Date - Converts datetimes into HTTP header formats
  • Image::Size - Used for getting the size of images
  • Inline - Uses inline C for fast parsing of large data files
  • IO::Scalar - Used to format compara API output
  • IO::Socket, IO::Socket::INET, IO::Socket::UNIX - Object interfaces to socket communications
  • IO::String - Used for sequence handling
  • IO::Uncompress::Bunzip2 - Used to handle uploaded bz2 files (bundled with recent versions of IO::Compress)
  • JavaScript::Minifier - Reduces JS files for faster page loading
  • libwww-perl - We use LWP, LWP::RobotUA and LWP::UserAgent extensively to communicate with other web services
  • List::MoreUtils - Utility functions for handling arrays
  • Mail::Mailer - Used by web forms to send email
  • Math::Bezier - Used by drawing code
  • MIME::Types - Used to automatically identify the correct MIME type of static files
  • PDF::API2 - Used by the image exporter for exporting as PDF
  • Role::Tiny - Used to dynamically alter Perl objects
  • RTF::Writer - Outputs web content in RTF format
  • Spreadsheet::WriteExcel - Used for exporting Excel spreadsheets
  • Sys::Hostname::Long - Used by the website startup process
  • Text::ParseWords - Parses text into an array of tokens or array of arrays
  • URI and URI::Escape - Used extensively to percent-encode and percent-decode unsafe characters
  • XML::Atom - Atom feed parser, used to embed Ensembl blog entries on the home page
  • XML::Writer - Used when writing exported data files from Ensembl pages
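
To find out which of the modules listed above still need installing, you can ask Perl to load each one in turn. This is a minimal sketch rather than an official Ensembl tool: the function name missing_modules is hypothetical, and only a small sample of module names is shown (note that libwww-perl is a distribution, so the sketch tests for its LWP module instead).

```shell
#!/bin/sh
# missing_modules MODULE...: print the name of every module that
# `perl -MMODULE -e 1` fails to load (all of them, if perl is absent).
missing_modules() {
    for mod in "$@"; do
        perl -M"$mod" -e 1 >/dev/null 2>&1 || echo "$mod"
    done
}

# A sample of the required modules; extend the list as needed.
missing=$(missing_modules DBI DBD::mysql GD Hash::Merge LWP)

if [ -z "$missing" ]; then
    echo "All sampled modules can be loaded"
else
    echo "Modules still to install:"
    echo "$missing"
fi
```

A module that loads may still be too old for your needs, so this check complements, rather than replaces, searching CPAN for the appropriate version.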

Optional components

SQLite

SQLite is only required if you are running a bare-bones site that uses our public MySQL server, as you will still need a session database to store configuration changes.

hubCheck

This utility is produced by UCSC and is used to validate track hubs. It is not required by the Ensembl website, but it does provide additional error-checking and is therefore recommended if you know that you or your users are going to be working with experimental hubs.

To incorporate hubCheck into your site, you simply need to download the binary file and put it somewhere on your system where Apache can see it, then add that path to your mirror site configuration once you have installed the Ensembl web code.