The Nexus standard is a more complex, plain text-based file format for phylogenetic data. Nexus files can contain other data besides trees, such as sequence alignments or other types of character state matrices, distance matrices, and so on. Many commonly used programs read and/or write Nexus data, including FigTree, Mesquite, TreeAnnotator, MrBayes, MacClade, and PAUP*. Nexus files often, but by no means always, have the extension *.nex or *.nxs. You can recognize whether your file is a Nexus file by opening it in a text editor such as Notepad on Windows or TextEdit on Mac OS X. If the very first word of the file is #nexus, in either lower or upper case, then the file is most likely a Nexus file.
Regrettably, the Nexus "standard" is not followed too carefully by different programs. We have attempted to be permissive in what the monophylizer accepts, but there are some caveats to keep in mind:
- Nexus files can contain multiple tree descriptions. The monophylizer only reads the first tree.
- The monophylizer operates strictly on topology. Branch lengths are therefore optional: you can use either cladograms or phylograms.
If you do provide branch lengths, they can be integers of arbitrary length, floating point numbers, or numbers in scientific notation.
- Nexus files can contain data "blocks" besides trees. The monophylizer ignores all non-tree blocks.
- Tree descriptions in Nexus files follow the same syntax as Newick. However, in some cases the taxon names are replaced with integers that correspond with a numbered list
of taxon names, the so-called "translation table". This is done to save space and cut down on redundant names in files that contain many trees for the same
taxa (for example in Bayesian analyses). The monophylizer accepts either usage: names embedded inside the tree statements or a separate translation table are both fine.
- The tips in your tree should have names that consist of the species name (Genus species or Genus species subspecies), then some kind of separator (such as the | symbol), and then some kind of unique identifier such as a specimen or sequence identifier. In total, each name should therefore look something like 'Genus species|ID2347'.
You can specify which record separator symbol you're using in the 'Tree reading' tab. You can use other symbols besides | as long as they don't have special meaning in Nexus.
The following are therefore disallowed: ,'"[]();:_
- BOLD has a tendency to break the standard, especially when you export trees with more metadata than strictly needed. If you export trees from BOLD, make sure the trees contain
species names and unique identifiers, but nothing else. Metadata such as names of collection localities or taxonomic authorities are quite likely to contain characters that
have special meanings in Nexus trees, yet BOLD does nothing to prevent their unsafe inclusion. For example, a place name such as Côte d'Azur is problematic, firstly because the circumflex accent might not transfer correctly (this has to do with character encoding) and secondly because the apostrophe is interpreted as an opening quote for which a file reader that follows the standard will expect a closing quote. Likewise, any text that is [inside square brackets] is interpreted in a special way: file readers that follow the standard assume that such text is a comment that should normally be stripped out of the tree description (some programs insert special data in these comments, though this is "non-standard"). When BOLD inserts square brackets into your tree description this can break the tree reading, especially when there is an opening bracket but no closing one, which we've seen "in the wild".
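The naming rules and the BOLD pitfalls described above can be checked before you upload a tree. The following Python sketch is only an illustration, not the monophylizer's own code: the function names split_tip_label and metadata_risks are hypothetical, and the sketch assumes tip labels have already been extracted from the tree and unquoted.

```python
# Symbols the text above lists as having special meaning in Nexus;
# none of them can serve as the record separator.
DISALLOWED_SEPARATORS = set(",'\"[]();:_")


def split_tip_label(label, separator="|"):
    """Split 'Genus species|ID' into (species_name, identifier).

    Hypothetical helper: raises ValueError if the label does not fit
    the pattern described above.
    """
    if not separator or separator in DISALLOWED_SEPARATORS:
        raise ValueError(f"separator {separator!r} cannot be used in Nexus")
    if label.count(separator) != 1:
        raise ValueError(f"expected exactly one {separator!r} in {label!r}")
    name, identifier = (part.strip() for part in label.split(separator))
    # Underscores commonly stand in for spaces in Nexus names.
    words = name.replace("_", " ").split()
    if len(words) not in (2, 3):
        raise ValueError(f"expected 'Genus species [subspecies]' in {label!r}")
    if not identifier:
        raise ValueError(f"missing identifier in {label!r}")
    return " ".join(words), identifier


def metadata_risks(text):
    """List the kinds of problem described above for exported metadata."""
    risks = []
    if text.count("'") % 2:
        risks.append("unmatched apostrophe")
    if text.count("[") != text.count("]"):
        risks.append("unbalanced square brackets")
    if not text.isascii():
        risks.append("non-ASCII characters")
    return risks


if __name__ == "__main__":
    print(split_tip_label("Genus species|ID2347"))
    print(metadata_risks("Côte d'Azur"))
```

The checks are deliberately coarse: they flag labels that are likely to cause trouble rather than implementing the full Nexus quoting rules.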
Here are examples of usable Nexus files:
- Nexus files with underscored whitespace:
- Nexus files without a translation table:
- Nexus files with quoted whitespace:
- Nexus cladograms:
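To make the points about the #nexus header, the first tree, and the translation table concrete, here is a rough Python sketch. It is not how the monophylizer itself reads files. The names looks_like_nexus, read_first_tree and EXAMPLE are hypothetical, the embedded file content is made up for illustration, and the parsing assumes there are no quoted names or [bracketed comments] in the trees block.

```python
import re

# A made-up Nexus file with a translation table and two trees.
EXAMPLE = """#NEXUS
begin trees;
    translate
        1 Genus_alpha|ID1,
        2 Genus_alpha|ID2,
        3 Genus_beta|ID3;
    tree t1 = ((1,2),3);
    tree t2 = ((1,3),2);
end;
"""


def looks_like_nexus(text):
    """True if the very first word is #nexus, in lower or upper case."""
    words = text.split(None, 1)
    return bool(words) and words[0].lower() == "#nexus"


def read_first_tree(text):
    """Return the first tree as Newick, with translated tip names.

    Simplifying assumption: no quotes or comments in the trees block.
    """
    table = {}
    translate = re.search(r"\btranslate\b(.*?);", text,
                          flags=re.IGNORECASE | re.DOTALL)
    if translate:
        for entry in translate.group(1).split(","):
            number, name = entry.split(None, 1)
            table[number] = name.strip()
    tree = re.search(r"\btree\b[^=;]*=\s*(.*?;)", text,
                     flags=re.IGNORECASE | re.DOTALL)
    if tree is None:
        return None
    # Tip labels directly follow '(' or ','; branch lengths follow ':'
    # and are left alone. Labels missing from the table stay unchanged.
    return re.sub(r"(?<=[(,])\s*([^(),:;\s]+)",
                  lambda m: table.get(m.group(1), m.group(1)),
                  tree.group(1))


if __name__ == "__main__":
    print(looks_like_nexus(EXAMPLE))
    print(read_first_tree(EXAMPLE))
```

Because labels that are not in the table are left untouched, the same function also handles files that embed the names directly in the tree statement.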