Scientists at NYU’s Center for Genomics and Systems Biology, the American Museum of Natural History, Cold Spring Harbor Laboratory, and the New York Botanical Garden have created the largest genome-based tree of life for seed plants to date.

Their findings, published in the journal PLoS Genetics, plot the evolutionary relationships of 150 different species of plants based on advanced genome-wide analysis of gene structure and function. This new approach, called “functional phylogenomics,” allows scientists to reconstruct the pattern of events that led to the vast number of plant species and could help identify genes that can be used to improve seed quality for agriculture.

The research, performed by members of the New York Plant Genomics Consortium, was funded by the National Science Foundation (NSF) Plant Genome Program to identify the genes that caused the evolution of seeds, a trait of important economic interest. The group selected 150 representative species from all of the major seed plant groups to include in the study. The species range from flowering plants, such as peanuts and dandelions, to non-flowering cone-bearing plants like spruce and pine. The sequences of the plants’ genomes (all of the biological information needed to build and maintain an organism, encoded in DNA) were either culled from pre-existing databases or generated, in the field and at the New York Botanical Garden in the Bronx, from live specimens.

“Previously, phylogenetic trees were constructed from standard sets of genes and were used to identify the relationships of species,” explains Gloria Coruzzi, a professor in NYU’s Center for Genomics and Systems Biology and the principal investigator of the NSF grant. “In our novel approach, we create the phylogeny based on all the genes in a genome, and then use the phylogeny to identify which genes provide positive support for the divergence of species.”

With new algorithms developed at the museum and NYU, and the processing power of supercomputers at Cold Spring Harbor Laboratory and overseas, the sequences—nearly 23,000 sets of genes (specific sections of DNA that code for certain proteins)—were grouped, ordered, and organized in a tree according to their evolutionary relationships. Algorithms that determine similarities of biological processes were used to identify the genes underlying species diversity.

The results support major hypotheses about evolutionary relationships in seed plants. Among the most interesting findings is that gnetophytes, a group that consists mostly of shrubs and woody vines, are the most primitive living, non-flowering seed plants—present since the late Mesozoic era, or the age of dinosaurs.
