Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32606, USA
Received Date: October 18, 2016; Accepted Date: October 20, 2016; Published Date: October 26, 2016
Citation: Brocchieri L (2016) Functional and Phylogenetic Diversity. J Phylogenetics Evol Biol 4:e122. doi: 10.4172/2329-9002.1000e122
Copyright: © 2016 Brocchieri L. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Functional Diversity (FD) is a concept developed and widely explored in theoretical ecology (for critical reviews, see [1,2]). Measures of FD have been developed with the general idea of characterizing the functional aspects of biodiversity that inform the functioning of ecosystems in relation to environmental constraints. The simplest approach to describing FD is possibly the catalog of species present in a community. The assumption is that "species richness" (the number of species in the community) correlates positively with the number of functionalities expressed by the community. Further developments can include accounting for the abundance of each species, and accounting for how similar species are in terms of their functionalities, i.e., describing the distribution of functional units rather than of species within the community. When a community is described in terms of functional units, it can be represented by a distribution of points in a functional-trait space, with as many dimensions as the number of traits considered. Thus, functional space can be represented by the values of one specific functionality of interest (a one-dimensional functional space), or as a multi-dimensional space of many continuous or discrete variables. Indices of FD are constructed to represent how the functional space is occupied by the community. Among the many proposed indexes, some are based on pairwise diversities between all species [3,4] or between functional units, and are independent of the abundance of each species or functional unit. Rao's quadratic entropy, $Q = \sum_{i}\sum_{j} d_{ij}\,p_i\,p_j$, instead takes into account the frequencies $p_i$ and $p_j$ of species in the community and their pairwise functional distance $d_{ij}$. Functional distances can be derived from the position of points in the functional-trait space, or from representations of functional differences summarized by phenograms [8,9], in which the length of branches represents functional distance.
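As an illustration of how an abundance-weighted index occupies the functional-trait space, the following minimal sketch computes Rao's quadratic entropy from species abundances and trait coordinates. The Euclidean distance, the function name and the three-species community are assumptions made for the example and are not prescribed by the references above.

```python
import math

def rao_quadratic_entropy(abundances, traits):
    """Rao's Q = sum_i sum_j d_ij * p_i * p_j.

    abundances: raw counts per species (normalised here to p_i).
    traits: one coordinate tuple per species in the trait space.
    Assumption: d_ij is the Euclidean distance between trait points.
    """
    total = sum(abundances)
    p = [a / total for a in abundances]
    q = 0.0
    for i, xi in enumerate(traits):
        for j, xj in enumerate(traits):
            q += math.dist(xi, xj) * p[i] * p[j]
    return q

# Hypothetical community: species 1 and 3 are functionally identical,
# species 2 sits at distance 5 from both.
example_abundances = [10, 30, 60]
example_traits = [(0.0, 0.0), (3.0, 4.0), (0.0, 0.0)]
q_example = rao_quadratic_entropy(example_abundances, example_traits)
```

Because species 1 and 3 occupy the same point, only the pairs involving species 2 contribute to the index, which shows how Q describes functional units rather than species counts.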
FD indexes have been characterized as measuring three different qualities of diversity [2,7,10]: functional richness, i.e., how much of the niche space is occupied by species or functional units; functional evenness, i.e., how evenly traits are distributed within the occupied functional space; and functional divergence, i.e., how differentiated functionalities are within the trait space.
In contrast to indexes of FD, which operate on the observed distribution of functional features, indexes of Phylogenetic Diversity (PD) infer diversities among the species present in the community based on their phylogenetic relations represented by a taxonomic or phylogenetic tree. Genetic relatedness is expected to result in different species sharing functional traits inherited by descent from their common ancestors. It has been proposed that measures of PD can be used as a proxy for measures of FD (e.g., ref. ). I argue that, in fact, PD indexes differ from indexes of FD only in that instead of operating on the observed distribution of functional features, they estimate this distribution based on the tree, and that any index of PD can be used as an index of FD by letting it operate on observed frequencies.
Commonly used indices of PD include Faith's PD,

$$PD = \sum_{i} L_i,$$

where $L_i$ is the length of the $i$-th branch of the tree, so that PD corresponds to the total length of the phylogenetic tree. PD is interpreted as the total amount of evolutionary history among the species considered in the tree. Implicit in this measure is the idea that the 'total amount of evolutionary history' is directly proportional to the total number of functional features. In contrast to Faith's PD, other measures of PD take into consideration the relative abundance of each species in the community. Among these is Rao's quadratic entropy,

$$Q = \sum_{i}\sum_{j} d_{ij}\,p_i\,p_j.$$

As in its use as a measure of FD, $p_i$ and $p_j$ are the relative abundances of species $i$ and $j$, but $d_{ij}$ is a measure of 'phylogenetic distance', defined using a rooted phylogenetic tree as the difference between the depth of the tree and the sum of the branch lengths common to both species. Phylogenetic entropy,

$$H_P = -\sum_{i} L_i\,a_i \ln a_i,$$

considers the aggregate abundance $a_i$ of all species descended from branch $i$. The parametric index of PD of Pavoine et al. is based on a rooted ultrametric tree and species abundances. In constructing this index the tree is partitioned into $N$ time intervals $(t_k - t_{k-1})$ separating consecutive bifurcation events. If $P_{ik}$ is the aggregate frequency of the $i$-th character among the $n$ characters descended from the branches represented in the time interval $(t_k - t_{k-1})$, then for $q \geq 0$ and $q \neq 1$:

$$I_q = \sum_{k=1}^{N} (t_k - t_{k-1})\,\frac{1 - \sum_{i=1}^{n} P_{ik}^{\,q}}{q - 1},$$

and when $q = 1$:

$$I_1 = -\sum_{k=1}^{N} (t_k - t_{k-1}) \sum_{i=1}^{n} P_{ik} \ln P_{ik}.$$

$I_q$ is related to the previous measures of PD by the relations $I_0 = PD - T$, $I_1 = H_P$ and $I_2 = Q$, where $T$ is the depth of the tree. Chao and collaborators derived a general class of measures called mean PD of order $q$, taking into consideration species abundances and phylogenetic relations, related to Jost's diversity measures [15,16] based on Hill's numbers. When $q \geq 0$ and $q \neq 1$, mean PD is defined as:

$$\bar{D}_q = \left[\sum_{i} \frac{L_i}{\bar{T}}\,a_i^{\,q}\right]^{1/(1-q)},$$

and when $q = 1$:

$$\bar{D}_1 = \exp\left[-\sum_{i} \frac{L_i}{\bar{T}}\,a_i \ln a_i\right].$$

In these formulations $\bar{T} = \sum_i L_i a_i$ is the average depth of a rooted tree. With the appropriate choice of order $q$, mean phylogenetic diversities have simple relations with other diversity indexes: with Faith's PD when $q = 0$, $\bar{D}_0 = PD/\bar{T}$; with phylogenetic entropy $H_P$ when $q = 1$, $\bar{D}_1 = \exp(H_P/\bar{T})$; and with Rao's quadratic entropy $Q$ when $q = 2$, $\bar{D}_2 = 1/(1 - Q/\bar{T})$.
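The tree-based indexes discussed here can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the tree ((A:1,B:1):1,C:2), the abundances, the branch-list representation and all function names are hypothetical, and the tree is taken to be rooted and ultrametric so that the mean depth equals the depth of the tree.

```python
import math

# Hypothetical rooted ultrametric tree ((A:1,B:1):1,C:2), written as
# (branch length L_i, set of species descended from that branch).
BRANCHES = [
    (1.0, {"A"}),
    (1.0, {"B"}),
    (1.0, {"A", "B"}),
    (2.0, {"C"}),
]
ABUNDANCE = {"A": 0.25, "B": 0.25, "C": 0.5}  # relative abundances p_i

def branch_abundances(branches, p):
    """Pair each branch length L_i with its aggregate abundance a_i."""
    return [(length, sum(p[s] for s in tips)) for length, tips in branches]

def faith_pd(branches):
    """Faith's PD: total branch length of the tree."""
    return sum(length for length, _ in branches)

def mean_depth(branches, p):
    """Average depth: sum of L_i * a_i (the tree depth if ultrametric)."""
    return sum(L * a for L, a in branch_abundances(branches, p))

def phylogenetic_entropy(branches, p):
    """H_P = -sum_i L_i * a_i * ln(a_i)."""
    return -sum(L * a * math.log(a)
                for L, a in branch_abundances(branches, p) if a > 0)

def rao_q(branches, p):
    """Rao's Q with d_ij = tree depth minus branch length shared by i and j."""
    depth = mean_depth(branches, p)  # equals the tree depth here (ultrametric)
    q = 0.0
    for i in p:
        for j in p:
            shared = sum(L for L, tips in branches if i in tips and j in tips)
            q += (depth - shared) * p[i] * p[j]
    return q

def mean_pd(branches, p, q):
    """Mean PD of order q, with the q = 1 case handled separately."""
    t_bar = mean_depth(branches, p)
    la = [(L, a) for L, a in branch_abundances(branches, p) if a > 0]
    if q == 1:
        return math.exp(-sum(L / t_bar * a * math.log(a) for L, a in la))
    return sum(L / t_bar * a ** q for L, a in la) ** (1.0 / (1.0 - q))

t_bar = mean_depth(BRANCHES, ABUNDANCE)
pd_total = faith_pd(BRANCHES)
h_p = phylogenetic_entropy(BRANCHES, ABUNDANCE)
q_rao = rao_q(BRANCHES, ABUNDANCE)
```

On this example the mean PD of orders 0, 1 and 2 coincides with PD divided by the mean depth, with the exponential of phylogenetic entropy divided by the mean depth, and with the reciprocal of one minus Q divided by the mean depth, respectively.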
Implicit in the expression of PD indexes is an estimate of the frequency of phenotypic (or functional) features in a community. This is accomplished by assuming Camin-Sokal parsimony as a model for the evolution of traits, by which traits evolve only once in evolutionary history and persist through all descending lineages. With this assumption, each OTU is on average represented by a number of features proportional to the average depth $\bar{T}$ of the tree, the length $L_i$ of a branch $i$ is proportional to the number of features common to all OTUs descended from branch $i$, and $a_i$ is the relative aggregate abundance of those features in the community. These estimated frequencies are then used by the index to evaluate diversity. In the case of mean PD of order $q \geq 0$ and $q \neq 1$, this can be more clearly seen by rewriting it as:

$$\bar{D}_q = \left[\sum_{i} \frac{L_i a_i}{\bar{T}}\,a_i^{\,q-1}\right]^{1/(1-q)},$$

where $L_i a_i/\bar{T}$ is the estimated fraction of all features in the community that are shared by a fraction $a_i$ of all individuals. If $E[f(a)]$ is the mean value of a transformation $f(a)$ of the frequency of features, the above relation can be expressed in terms of this mean. For the example above:

$$\bar{D}_q = \left(E\left[a^{\,q-1}\right]\right)^{1/(1-q)}.$$

Similarly, when $q = 1$:

$$\bar{D}_1 = \exp\left(-E[\ln a]\right).$$
Since other PD indexes can be expressed as functions of mean phylogenetic diversities, they also operate on the corresponding means. Each index of PD can thus be associated with a corresponding index of FD. While the index of PD operates on estimated frequencies of features, the corresponding index of FD can be applied to observed frequencies of functional features.
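To make the correspondence concrete, the sketch below applies the mean-diversity form to observed feature frequencies rather than to frequencies estimated from a tree. It is a minimal illustration under assumptions: the feature table, the abundances and the function names are hypothetical, and each feature is weighted by its share of all feature occurrences in the community, in analogy with the weight given to each branch in the tree-based index.

```python
import math

def feature_frequencies(presence, abundance):
    """Fraction of individuals carrying each observed feature.

    presence: species -> set of features observed in that species.
    abundance: species -> count of individuals.
    """
    total = sum(abundance.values())
    freq = {}
    for species, features in presence.items():
        for f in features:
            freq[f] = freq.get(f, 0.0) + abundance[species] / total
    return freq

def mean_functional_diversity(freq, q):
    """(E[a^(q-1)])^(1/(1-q)), or exp(-E[ln a]) when q = 1.

    The mean is taken with weights a_f / sum(a), i.e. the share of all
    feature occurrences in the community contributed by feature f.
    """
    s = sum(freq.values())  # mean number of features per individual
    if q == 1:
        return math.exp(-sum(a / s * math.log(a) for a in freq.values()))
    return sum(a / s * a ** (q - 1) for a in freq.values()) ** (1.0 / (1.0 - q))

# Hypothetical observations: f3 is shared by species A and B.
PRESENCE = {"A": {"f1", "f3"}, "B": {"f2", "f3"}, "C": {"f4", "f5"}}
COUNTS = {"A": 25, "B": 25, "C": 50}

observed = feature_frequencies(PRESENCE, COUNTS)
fd_order0 = mean_functional_diversity(observed, 0)
fd_order1 = mean_functional_diversity(observed, 1)
fd_order2 = mean_functional_diversity(observed, 2)
```

With order 0 the index reduces to the number of distinct features divided by the mean number of features per individual, which plays the role that PD divided by the mean depth plays for a tree.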
Unique to PD indexes is how they infer frequencies of features from the phylogenetic tree. To obtain estimates for the frequency of features, many alternative evolutionary models could be utilized, and the choice of the Camin-Sokal parsimony model implicit in the current formulation of PD indexes can be questioned. With the availability of deep-sequencing technologies, molecular sequences (most often conserved proteins or ribosomal-RNA genes) are currently used to identify species within a community (as, for example, for human or environmental microbiomes), and to obtain evolutionary trees that are then used to evaluate the PD of the community. In these cases, the evolutionary history of protein or DNA sequences is used as a marker of the evolutionary history of the corresponding species. The possibility of using PD indexes on observed frequencies of features suggests that the multiple alignment of protein or DNA sequences used to obtain the phylogenetic tree could be used directly as a marker of phenotypic diversity within the community, bypassing the need to first infer a tree and then to use the tree to infer the distribution of features. If the alignment is not available, its diversity could be inferred using the same probabilistic evolutionary models used to produce the phylogenetic tree. It may, however, be questioned whether the evolution and diversification of protein or DNA sequences is a useful representative of the diversification of functionalities within the community. For example, proteins of obligate parasites such as Mycoplasma or Rickettsia often evolve at higher rates than in other species, which is reflected in longer terminal branches in the evolutionary tree. Under a Camin-Sokal parsimony model, these long branches would be interpreted as accumulation of functional features.
However, these organisms have at the same time experienced genome-size reduction and loss of functionalities, and the higher rate of evolution of some of their proteins may in fact more likely represent loss rather than accumulation of functionalities. Phylogenetic and FD indexes have been scrutinized for how useful they are in describing the functional properties of a community (e.g., refs. [3,20]; Flynn et al. 2011; Srivastava et al. 2012). I am not aware of any study to date investigating the usefulness of different markers (e.g., protein or DNA sequences) in predicting the differentiation of functional features across organisms, or how the choice of alternative evolutionary models can improve accuracy in estimating FD. I believe these will be interesting topics of future investigation.
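The suggestion made earlier, that a multiple alignment could serve directly as a marker of diversity without first inferring a tree, might be sketched as follows. This is one possible reading and not a method defined in this article: treating each (column, residue) pair as a feature, ignoring gaps, and the toy alignment, abundances and function names are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def column_state_frequencies(alignment, abundance):
    """Fraction of individuals carrying each (column, residue) state.

    alignment: sequence name -> aligned sequence (equal lengths).
    abundance: sequence name -> count of individuals.
    Assumption: gap characters '-' are not counted as features.
    """
    total = sum(abundance.values())
    freq = defaultdict(float)
    for name, seq in alignment.items():
        for col, residue in enumerate(seq):
            if residue != "-":
                freq[(col, residue)] += abundance[name] / total
    return dict(freq)

def alignment_diversity(alignment, abundance, q):
    """Mean diversity of order q computed on alignment-derived features."""
    freq = column_state_frequencies(alignment, abundance)
    s = sum(freq.values())  # mean number of states per individual
    if q == 1:
        return math.exp(-sum(a / s * math.log(a) for a in freq.values()))
    return sum(a / s * a ** (q - 1) for a in freq.values()) ** (1.0 / (1.0 - q))

# Hypothetical toy alignment of three sequences and their abundances.
ALIGNMENT = {"sp1": "ACGT", "sp2": "ACGA", "sp3": "TCGA"}
READS = {"sp1": 1, "sp2": 1, "sp3": 2}

div_order0 = alignment_diversity(ALIGNMENT, READS, 0)
div_order2 = alignment_diversity(ALIGNMENT, READS, 2)
```

A community of identical sequences yields a diversity of 1 for any order, and the value grows as alignment columns become more variable. Whether such sequence-level variability tracks functional differentiation is exactly the open question raised above.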