The Exelixis Lab

new: A hybrid MPI/OpenMP version of MrBayes v3.1.2 by Alexis Stamatakis and Wayne Pfeiffer

Download a hybrid MPI/OpenMP parallelization of MrBayes. DNA and protein models work correctly; you will probably need an Intel compiler (icc) to produce fast code. By using this component you agree to cite it as:
F. Pratas, P. Trancoso, A. Stamatakis, L. Sousa:  "Fine-grain parallelism using Multi-core, Cell/BE, and GPU systems: Accelerating the Phylogenetic Likelihood Function". Proceedings of ICPP 2009, accepted for publication, Vienna, Austria, September 2009.  PDF
and
F. Ronquist, J.P. Huelsenbeck "MrBayes  3: Bayesian Phylogenetic Inference  under mixed models",  Bioinformatics 19(12):1572-1574,  2003.

Some performance data: PDF

new: An IEEE-754 compliant logarithm approximation unit for FPGAs by Nikos Alachiotis

Download an open-source VHDL implementation of a fast, space- and resource-efficient logarithm approximation unit for FPGAs.
By using this component you agree to cite it as: "Efficient Floating-Point Logarithm Unit for FPGAs", by Nikos Alachiotis and Alexandros Stamatakis, accepted for publication at the RAW workshop, held in conjunction with IPDPS 2010. PDF

new:  UDP Transceiver Core by Nikos Alachiotis and Simon A. Berger

Download an open-source VHDL implementation of a component that can be connected to the input port of the Virtex-5 Ethernet MAC Local Link Wrapper and that allows for sending and receiving IPv4 Ethernet packets. The archive contains a Java test application and is also available at opencores.org.
By using this component, you agree to cite it as:  "Efficient PC-FPGA Communication over Gigabit Ethernet", by Nikos Alachiotis, Simon A. Berger, and Alexandros Stamatakis, Exelixis Rapid Research Dissemination Report, Exelixis-RRDR-2010-4, TU Munich, February 2010.  PDF

some useful slides by Nick Pattengale explaining the bootstrap convergence criteria implemented in RAxML

raxml v 7.2.5 (alpha) source code now available for download here and here is a windows executable

new features
  • Significantly accelerated and SSE3-vectorized parsimony functions for DNA data: if your alignment consists only of DNA data partitions, RAxML will automatically invoke these new fast routines. The SSE3-based implementation is about 20 times faster than the previous parsimony implementation.
  • Significantly accelerated routines for MRE consensus tree building: this is now more than 7 times faster than in version 7.2.4, e.g., computing the MRE consensus of 10,000 trees with approximately 2,500 taxa now takes less than a minute on my laptop.

raxml v 7.2.4 (alpha) now available for download here

new features
  • Thanks to Wayne Pfeiffer from SDSC, RAxML now offers a hybrid MPI-Pthreads/coarse-grain-fine-grain parallelization of the most important and time-consuming algorithms: rapid bootstrapping, rapid bootstrapping with subsequent ML search, standard bootstrapping, and standard tree searches. At present, only OpenMPI seems to be able to compile the mixed MPI/Pthreads code correctly without any trouble!
  • Partial Pthreads-parallelization of operations on bipartitions of trees such as drawing support values on ML trees, etc. (in progress)
  • Optimized performance of tree parsing routines

raxml v 7.2.3 (alpha) now available for download here

new features
  • Offers consensus tree building methods (majority rule and majority rule extended)
  • Full efficient Pthreads-based parallelization of evolutionary placement algorithm for metagenomics data
  • Implementation of multi-state models using MK, ordered likelihood, and GTR substitution models; up to 32 characters can be used to encode multi-state regions
  • Fixed a bug in the parsimony component for protein data (should only affect previous results to a small extent)
  • Offers a morphological weight calibration mechanism to determine sites that are congruent to some reference tree

some new RAxML wrapper scripts

Apurva Narechania at the American Museum of Natural History has kindly put together a couple of wrapper scripts for RAxML :-)

raxml_launch_serially.sh: A simple shell script that launches one job after the other, waiting for each job to complete before starting the next.

raxml_nexusPartConvert.pl: A Perl script that parses a partitioned alignment in Nexus format with charsets and produces a partition guide file to be fed to RAxML with -q. Preliminary - works with DNA or AA, but not the two together yet, so not suitable for mixed-molecule data. Unless the output gets redirected to a file with ">", it will appear on screen.
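To illustrate what such a conversion involves, here is a minimal Python sketch (not the Perl script itself). It assumes charsets are written as "charset name = ranges;" and that every partition has the same data type. The function name charsets_to_partitions and the data_type parameter are hypothetical; for protein data RAxML expects a substitution model name (e.g. WAG) in that position.

```python
import re

# Matches Nexus lines such as "charset gene1 = 1-500;" (case-insensitive).
CHARSET_RE = re.compile(
    r"^\s*charset\s+([^\s=]+)\s*=\s*([^;]+);",
    re.IGNORECASE | re.MULTILINE,
)


def charsets_to_partitions(nexus_text, data_type="DNA"):
    """Turn Nexus charset definitions into RAxML-style partition lines.

    data_type is an assumption of this sketch: "DNA" for nucleotide data,
    or a protein model name for amino acid data. Mixed data is not handled.
    """
    lines = []
    for name, ranges in CHARSET_RE.findall(nexus_text):
        # Nexus separates multiple ranges with whitespace; join them with commas.
        ranges = ", ".join(ranges.split())
        lines.append(f"{data_type}, {name} = {ranges}")
    return lines


if __name__ == "__main__":
    example = """
    begin sets;
      charset gene1 = 1-500;
      charset gene2 = 501-900 1200-1300;
    end;
    """
    # Prints one partition line per charset; redirect to a file to use with -q.
    print("\n".join(charsets_to_partitions(example)))
```

As with the Perl script, the output goes to the screen unless it is redirected to a file.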

raxml_wrapper.pl: A Perl script that reads a raxml.config file with common run parameters and executes a directory of Phylip alignment files in batch, then outputs the results in another directory. See the documentation with "perldoc ./raxml_wrapper.pl".

updated version of easyRax

Guy Leonard at Exeter has put together an updated version of the easyRax wrapper program that works with the new RAxML version 7.2.2 below. You can download the code here.

raxml v 7.2.2 now available for download here

download windows executable

new features
  • Full implementation of all criteria for Bootstopping (MR, MRE, approximate MRE ignoring compatibility) as described in the 2009 RECOMB paper
  • Addition of fast MP and fast ML heuristics for evolutionary placement of short reads under the slow insertion criterion

raxml v 7.2.1 (alpha) for windows available for download here

Simon Berger betrayed all his principles and compiled Windows executables for the current RAxML release. Both the sequential as well as the Pthreads-based version seem to work under Windows XP. Please note that it has become really really easy to use Linux with Ubuntu now and that we will only provide extremely limited support for the Windows executables.

raxml v 7.2.1 (alpha) now available for download here

new features
  • Improved ML search convergence mechanism (the -D option)
  • Full SSE3 vectorization of AA and DNA models
  • Full single-precision implementation for all AA and DNA models
  • Contrary to what was previously stated, performance advantages can be achieved by using the single-precision version on large phylogenomic alignments, i.e., it's worth a try and can yield speedups of more than 50%
  • Usage of single precision likelihood function implementation is not recommended for datasets with more than 500-1000 taxa because of potential numerical instability
  • Implemented the efficient method for computing the likelihood function on gappy multi-gene alignments (-C option) described in the following paper: A. Stamatakis, M. Ott: “Efficient Computation of the Phylogenetic Likelihood Function on Multi-Gene Alignments and Multi-Core Architectures”. In Philosophical Transactions of the Royal Society B, 363: 3977-3984, 2008.
  • WARNING: The new -C option only works for scoring trees (no tree searches so far), and in combination with -M (per-partition branch length estimate) it also does not yet work for the PTHREADS version! However, it will only allocate as much memory as is needed to hold the actual sequence data and omit the memory space for the missing sequences.

raxml v 7.2.0 (alpha) now available for download here

cautionary note: this is the alpha release and will probably still be full of bugs; the manual is still under preparation

to report bugs send me an email and please send me all input files, the exact invocation, details of the HW and operating system,
as well as all error messages printed to screen.

new features
  • The DNA and Protein Likelihood functions have been accelerated using SSE3 vector instructions; this will yield speedups between 10% and 50% compared to the non-vectorized version. If you are experiencing problems compiling the SSE3 code, please ask your local computer nerd for help first.
  • Slight improvement of the numerical scaling procedure used to avoid numerical underflow, following a method proposed by BUI Quang Minh, a postdoc at the CIBIV in Vienna. This can yield speed improvements of up to 7% on multi-gene datasets.
  • Implementation of single-precision versions for DNA and Protein models. While those actually execute 30-50% slower than the standard double-precision implementations they can help to save almost 50% of memory consumption on large alignments which is increasingly becoming an issue. The numerical stability of the single precision version needs further testing though, i.e., usage is currently only recommended when you run out of memory.
  • New -F option that stops ML searches under CAT or GAMMA after the specified number of trees has been computed without doing a more thorough search on the best-scoring final tree under GAMMA. If you are experiencing memory shortages you should do ML searches under CAT with -F since RAxML running in this mode will only assign the memory it needs for CAT (4 times less than for GAMMA).
  • New -D option: This option further helps to accelerate ML searches on the original tree on datasets with several thousands of taxa. It will stop the ML search much earlier during the "asymptotic convergence phase" of the likelihood score, if the relative RF distance between the trees generated by two successive cycles of Lazy Subtree Rearrangements is smaller than 1%. On datasets with more than 1,000 taxa this yields run-time improvements of 50%, while returning almost equally good trees.
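The stopping rule behind -D can be sketched in a few lines of Python. This is an illustration only, not RAxML's implementation: trees are given as nested tuples of taxon names, the RF distance is normalized by 2(n-3) (the maximum for two fully resolved unrooted trees with n taxa, an assumption about what "relative" means here), and the names bipartitions, relative_rf and should_stop are hypothetical.

```python
def bipartitions(tree):
    """Return (non-trivial bipartitions, taxon set) of a nested-tuple tree."""

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        return frozenset().union(*(leaves(child) for child in node))

    all_taxa = leaves(tree)
    anchor = min(all_taxa)  # each split is stored as the side WITHOUT this taxon
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = all_taxa - below if anchor in below else below
        # Skip trivial splits (a single leaf versus everything else).
        if 1 < len(side) < len(all_taxa) - 1:
            splits.add(side)
        return below

    visit(tree)
    return splits, all_taxa


def relative_rf(tree_a, tree_b):
    """Robinson-Foulds distance divided by its maximum, 2 * (n - 3)."""
    splits_a, taxa_a = bipartitions(tree_a)
    splits_b, taxa_b = bipartitions(tree_b)
    if taxa_a != taxa_b or len(taxa_a) < 4:
        raise ValueError("trees must share the same set of at least 4 taxa")
    return len(splits_a ^ splits_b) / (2 * (len(taxa_a) - 3))


def should_stop(previous_tree, current_tree, threshold=0.01):
    """True if two successive trees differ by less than the threshold (1%)."""
    return relative_rf(previous_tree, current_tree) < threshold


if __name__ == "__main__":
    cycle_1 = (("A", "B"), "C", ("D", "E"))
    cycle_2 = (("A", "C"), "B", ("D", "E"))
    print(relative_rf(cycle_1, cycle_2), should_stop(cycle_1, cycle_2))
```

With a 1% threshold the rule only starts to matter on large trees, since a single differing bipartition on a small tree already exceeds it.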
release notes
  • MPI version not ready yet (neither fine-grained nor coarse-grained)
  • Manual not ready yet
  • Fixed various bugs

perl script for computing bootstrap branch lengths with raxml

This script can be used to perform the following task with RAxML: Given a best-known ML tree, generate a number of Bootstrap replicates and just re-estimate the branch lengths for that given fixed tree topology on each Bootstrap replicate.
To invoke the script, call it as follows: "perl bsBranchLengths.pl alignmentFileName treeFileName numberOfReplicates". The script assumes that the RAxML executable is located in the directory where you execute it. Otherwise, if RAxML is located in your Linux/Unix path, just replace every occurrence of "./raxmlHPC" with "raxmlHPC" in the script. The bootstrapped trees with branch lengths will be written into a file called "bsTrees". This script is intended for use with programs that infer divergence time estimates.
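For readers unfamiliar with the resampling step, here is a minimal Python sketch of how a single bootstrap replicate of an alignment can be generated. It is not taken from bsBranchLengths.pl; the function name bootstrap_replicate and the dictionary-based alignment representation are assumptions made for illustration. Each replicate would then be handed to RAxML together with the fixed tree topology to re-estimate branch lengths, a step that is not shown here.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement.

    alignment maps taxon names to aligned sequences of equal length;
    rng is a random.Random instance, so that runs can be reproduced.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    n_sites = lengths.pop()
    # Draw n_sites column indices; the same columns are used for every taxon.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        taxon: "".join(seq[i] for i in columns)
        for taxon, seq in alignment.items()
    }


if __name__ == "__main__":
    toy_alignment = {
        "taxon1": "ACGTACGT",
        "taxon2": "ACGTTCGA",
        "taxon3": "ACCTACGA",
    }
    rng = random.Random(12345)  # fixed seed, chosen arbitrarily
    for replicate in range(3):
        print(replicate, bootstrap_replicate(toy_alignment, rng))
```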

raxml v 7.1.0 (alpha) now available for download here

cautionary note: this is the alpha release and will probably still be full of bugs; the manual is still under preparation

to report bugs send me an email and please send me all input files, the exact invocation, details of the HW and operating system,
as well as all error messages printed to screen.

new features
  • Improved parallel load balance for Pthreads version when conducting a per-partition branch length estimate
  • Binary (Morphological) and Secondary Structure models implemented
  • Estimate of GTR model of amino acid substitution
  • Added LG model of protein substitution
  • Computation of RF and WRF tree distances
  • Implementation of significantly faster methods to operate on bipartitions of trees
  • Implementation of the WC and FC Bootstrap convergence criteria, which can be applied on the fly or a posteriori
  • For details on the bootstop method see N.D. Pattengale, M. Alipour, O.R.P. Bininda-Emonds, B.M.E. Moret, A. Stamatakis: "How Many Bootstrap Replicates are Necessary?". Proceedings of RECOMB 2009 PDF
  • Rapid Bootstraps now feasible under CAT, GAMMA, as well as GAMMA+P-Invar 
  • Parsimony ratchet implementation
  • Four algorithms to classify sequences from environmental samples into a given reference tree (to be described in more detail soon)
release notes
  • MPI version not ready yet
  • Manual not ready yet
  • Fixed various bugs
  • Slightly changed search algorithms

LG protein substitution model for raxml.

Fabien Burki from the University of Geneva has kindly helped me to put together a protein model file for the new LG model of amino acid substitution.
You can download it here
Note that external substitution models need to be read into RAxML with, e.g., "-P LGmodel". This is a CAPITAL P; -p stands for something different!

raxml memory requirements.

Since datasets are getting larger, here is a formula to estimate RAxML memory requirements.
Given an alignment of n taxa and m distinct patterns, the memory consumption is approximately:
  • MEM(AA+GAMMA) = (n-2) * m * (80 * 8) bytes
  • MEM(AA+CAT) = (n-2) * m * (20 * 8) bytes
  • MEM(DNA+GAMMA) = (n-2) * m * (16 * 8) bytes
  • MEM(DNA+CAT) = (n-2) * m * (4 * 8) bytes
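The formulas above are easy to wrap in a small helper. The following Python sketch simply evaluates them; the function name estimate_memory_bytes and the example dataset are hypothetical. The per-pattern factors appear to be the number of states (20 for AA, 4 for DNA) times the number of rate categories (4 for GAMMA, 1 for CAT), with 8 bytes per double-precision value, but that reading is an assumption.

```python
# Per-pattern factors copied from the formulas above.
FACTORS = {
    ("AA", "GAMMA"): 80,
    ("AA", "CAT"): 20,
    ("DNA", "GAMMA"): 16,
    ("DNA", "CAT"): 4,
}

BYTES_PER_VALUE = 8  # the "* 8" in the formulas (double precision)


def estimate_memory_bytes(n_taxa, n_patterns, data_type, rate_model):
    """Approximate memory use in bytes: (n-2) * m * (factor * 8)."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    factor = FACTORS[(data_type.upper(), rate_model.upper())]
    return (n_taxa - 2) * n_patterns * factor * BYTES_PER_VALUE


if __name__ == "__main__":
    # Hypothetical dataset: 1,000 taxa and 10,000 distinct patterns.
    for data_type, rate_model in FACTORS:
        size = estimate_memory_bytes(1000, 10000, data_type, rate_model)
        print(f"{data_type}+{rate_model}: {size / 2**30:.2f} GiB")
```

These are rough estimates of the likelihood arrays only, so the actual footprint of a run will be somewhat larger.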

raxml web-servers.

co-maintained by the exelixis lab.

Vital IT unit of the Swiss Institute of Bioinformatics



CIPRES portal at San Diego Supercomputer Center

New beta-version of the CIPRES portal that provides a full workbench



not maintained by the exelixis lab.

Bioportal in Norway (University of Oslo)