New York, NY - A multidisciplinary group of scientists at New York University, led by Bud Mishra, professor of computer science and mathematics at NYU's Courant Institute of Mathematical Sciences, has developed a mathematical method for analyzing genetic data that could substantially improve the reliability of research findings. Scientists using the new method can interpret experimental data more accurately, saving valuable resources and potentially accelerating the pace of research.

Since the human genome was fully mapped, the ongoing challenge for scientists has been to analyze exactly how the different genes and their products interact to give individuals their unique traits or give rise to genetic diseases such as cancer. Much of this research relies on gene expression microarrays, or gene chips, which arrange DNA in a grid on a solid surface in order to monitor interactions among hundreds or thousands of genes simultaneously.

While microarrays have vastly improved the effectiveness of genetics research, these experiments have proven time-consuming and expensive because they generate "relatively small" data sets and therefore "noisy," or unreliable, results. The new method of data analysis developed by Mishra's team introduces a novel statistical treatment of the data that reduces the effect of noise, potentially reducing both the need to generate additional data and the time lost to inaccurate results. The finding was published in the August 11-15 issue of the Proceedings of the National Academy of Sciences (PNAS).

"In spite of the fact that mathematics has been around for thousands of years, it is extremely new to biology, and our research in this area has focused on how best to leverage quantitative thinking in order to improve biological research," said Mishra. "This is not about data mining, or computation dealing with large amounts of data; it's about developing a better, more intelligent way of looking at things."

The current standard method used in the analysis of microarray data, pioneered by Eisen et al., is predicated on an arbitrary formulation. The algorithm developed by Mishra's team replaces this method with a mathematically rigorous correlation coefficient of two gene expression vectors, based on James-Stein shrinkage estimators. The team's research shows that the new algorithm corrects for many kinds of errors.
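As a rough illustration of what a shrinkage-based correlation coefficient between two gene expression vectors can look like, the Python sketch below computes a correlation in which each vector is offset by a fraction of its own mean before being normalized. The function name and the shrinkage factor gamma are assumptions made for this illustration. Here gamma is simply passed in, whereas a James-Stein approach would estimate it from the data, and that estimation step is not shown. This is not the team's implementation.

```python
import math


def shrinkage_similarity(x, y, gamma):
    """Correlation of two expression vectors with a shrunken mean offset.

    gamma is a hypothetical shrinkage factor in [0, 1]:
      gamma = 0 gives an uncentered (zero-offset) correlation,
      gamma = 1 gives the ordinary Pearson correlation.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    n = len(x)

    # Shrink each vector's mean toward zero by the factor gamma.
    x_offset = gamma * sum(x) / n
    y_offset = gamma * sum(y) / n

    dx = [v - x_offset for v in x]
    dy = [v - y_offset for v in y]

    # Normalizers: root-mean-square deviation from the chosen offset.
    phi_x = math.sqrt(sum(d * d for d in dx) / n)
    phi_y = math.sqrt(sum(d * d for d in dy) / n)
    if phi_x == 0.0 or phi_y == 0.0:
        raise ValueError("a vector has no variation around its offset")

    return sum(a * b for a, b in zip(dx, dy)) / (n * phi_x * phi_y)


if __name__ == "__main__":
    gene_a = [1.0, 2.0, 3.0]
    gene_b = [3.0, 2.0, 1.0]
    for g in (0.0, 0.5, 1.0):
        print(g, shrinkage_similarity(gene_a, gene_b, g))
```

With the two short vectors above, the same pair of genes looks positively correlated at gamma = 0 and perfectly anti-correlated at gamma = 1, which shows how strongly the choice of offset can affect any clustering built on the metric.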

Given that close to 29,000 of the 30,000 genes found in the human genome have yet to be fully characterized, analysis of gene expression microarrays is a key component of overall biological research. The algorithm developed by Mishra's team is likely to be widely adopted, particularly in the field of gene-related cancer research.

The PNAS paper, entitled "Shrinkage-based Similarity Metric for Cluster Analysis of Microarray Data," was coauthored by NYU students and researchers Vera Cherepinsky, Jiawu Feng, and Marc Rejali. For a copy of the paper, contact

Bud Mishra, who is also a professor at Cold Spring Harbor Lab, is the director of the NYU Bioinformatics Group, which combines computer science, statistics, biology, and applied mathematics to improve the efficacy of biomedical research. The group receives funding from NYSTAR, DARPA's BioSpice program, the National Science Foundation's Qubic Program, the Department of Energy, the US Air Force Research Lab, the National Institutes of Health, and the Howard Hughes Medical Institute.

The Courant Institute, a division of New York University, is one of the world's leading centers of research and instruction in mathematical analysis, applied mathematics, and scientific computation. It was founded in 1952 by Richard Courant.
