Informatics Tools


Metabolomics, the global study of small molecules in a particular system, has in the last few years become a primary -omics platform for the study of metabolic processes. As the pool of quantitative data yielded by metabolomic research grows, specialized methods and tools for analyzing these data and extracting meaningful conclusions from them are becoming increasingly crucial. Furthermore, the depth of knowledge and expertise required to undertake a metabolomics-oriented study is a daunting obstacle to novices. We have therefore created a new statistical analysis workflow, MetaboLyzer, which was developed with the metabolomics novice in mind but is also useful to experts. MetaboLyzer's workflow is tailored to the unique characteristics and idiosyncrasies of post-processed liquid chromatography/mass spectrometry (LC/MS) based metabolomic datasets. It draws on a wide range of statistical tests, procedures, and methodologies from classical biostatistics, as well as several novel statistical techniques that we have developed specifically for metabolomics data. MetaboLyzer also conducts rapid putative ion identification and biologically relevant analysis by incorporating four major small molecule databases: KEGG, HMDB, Lipid Maps, and BioCyc. It combines these aspects into a comprehensive workflow that outputs easy-to-understand, statistically and biologically significant information in the form of heat maps, volcano plots, 3D visualization plots, correlation maps, and metabolic pathway hit histograms. MetaboLyzer's unique ability to distinguish mammalian metabolites from those of bacterial and plant origin provides an extra degree of confidence in these results.

Related Publications

MetaboLyzer: a novel statistical workflow for analyzing postprocessed LC-MS metabolomics data. Mak TD, Laiakis EC, Goudarzi M, Fornace AJ Jr. Anal Chem 2014.

Selective paired ion contrast analysis: a novel algorithm for analyzing post-processed LC-MS metabolomics data possessing high experimental noise. Mak TD, Laiakis EC, Goudarzi M, Fornace AJ. Anal Chem 2015; 87:3177-3186.

Exposure to ionizing radiation reveals global dose- and time-dependent changes in the urinary metabolome of rat. Mak TD, Tyburski JB, Krausz KW, Kalinich JF, Gonzalez FJ, Fornace AJ. Metabolomics 2015; 11:1082-1094.

For more information and to download the software please visit:



One consequence of analyzing biological data from noisy sources, such as human subjects, is the sheer variability introduced by experimentally irrelevant factors that cannot be controlled for. This holds especially true in metabolomics, the global study of small molecules in a particular system. While metabolomics can offer deep quantitative insight into the metabolome via easy-to-acquire biofluid samples such as urine and blood, these confounding factors can easily overwhelm attempts to extract relevant information, which can mar potentially crucial applications such as biomarker discovery. For this reason, a new algorithm, Selective Paired Ion Contrast Analysis (SPICA), has been developed to extract biologically relevant information from the noisiest of metabolomic datasets. SPICA is built upon redefining the fundamental unit of statistical analysis. Whereas the vast majority of algorithms analyze metabolomics data on a single-ion basis, SPICA analyzes ion-pairs. A standard metabolomic data set is reinterpreted by exhaustively considering all possible ion-pair combinations. Statistical comparisons between sample groups are made only by analyzing the differences in these pairs, which may be crucial in situations where no single metabolite can be used for normalization. With SPICA, human urine data sets from patients undergoing total body irradiation (TBI) and from a colorectal cancer (CRC) relapse study were analyzed in a statistically rigorous manner not possible with conventional methods. In the TBI study, 3530 statistically significant ion-pairs were identified, from which numerous putative radiation-specific metabolite-pair biomarkers that mapped to potentially perturbed metabolic pathways were elucidated. In the CRC study, SPICA identified 6461 statistically significant ion-pairs, several of which putatively mapped to folic acid biosynthesis, a key pathway in colorectal cancer. When used with support vector machines (SVMs), SPICA also unequivocally outperformed binary classifiers built from classical single-ion-feature-based SVMs.
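
The ion-pair idea can be illustrated with a small sketch. This is not the published SPICA implementation: the function name, the toy abundances, and the choice of a mean log2-ratio difference as the per-pair contrast are all assumptions made for illustration, and a real analysis would apply a proper statistical test to each pair.

```python
from itertools import combinations
from math import log2
from statistics import mean

def pair_contrasts(group_a, group_b):
    """Contrast every ion-pair between two sample groups.

    Each sample is a dict mapping an ion label to a positive abundance.
    For each pair (i, j) the per-sample log2 ratio of abundances is
    computed, and the contrast is the difference of the group means.
    (Illustrative statistic only; not the test used by SPICA.)
    """
    ions = sorted(group_a[0])
    contrasts = {}
    for i, j in combinations(ions, 2):  # all possible ion-pairs
        ratios_a = [log2(s[i] / s[j]) for s in group_a]
        ratios_b = [log2(s[i] / s[j]) for s in group_b]
        contrasts[(i, j)] = mean(ratios_a) - mean(ratios_b)
    return contrasts

# Hypothetical toy data. The second sample in each group is uniformly
# more concentrated than the first, which mimics dilution differences.
control = [
    {"ion1": 100.0, "ion2": 50.0, "ion3": 25.0},
    {"ion1": 200.0, "ion2": 100.0, "ion3": 50.0},
]
treated = [
    {"ion1": 100.0, "ion2": 100.0, "ion3": 25.0},
    {"ion1": 400.0, "ion2": 400.0, "ion3": 100.0},
]

contrasts = pair_contrasts(control, treated)
for pair, value in sorted(contrasts.items()):
    print(pair, round(value, 3))
```

Because each sample contributes only within-sample ratios, a uniform scaling of all abundances in a sample (for example, a more dilute urine sample) leaves every pair value unchanged, which is why a pair-based unit needs no single reference metabolite for normalization.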


Metabolomics shows great promise as a tool for the discovery of biomarkers of mutagen exposure and early tumorigenesis. As the pool of quantitative data yielded by metabolomic research grows, methods and tools for analyzing these data and extracting meaningful conclusions from them are becoming increasingly crucial. Methods currently in wide use are either limited in the scope and depth of their analysis or not specifically tailored to metabolomic datasets.

In light of this, a new methodology has been developed and is presented here that is specifically tailored to the characteristics unique to post-processed LC/MS metabolomic data acquired from biofluid samples. Twenty-four-hour urine samples were collected from male rats before and after whole-body exposure to doses of gamma radiation ranging from 0.5 to 10 Gy (n = 20 per dose). Urine metabolomics data were acquired by Ultra-Performance Liquid Chromatography coupled to time-of-flight mass spectrometry, operated in both positive and negative electrospray ionization modes, and pre-processed using MarkerLynx software (Waters, Inc.). Relative abundances of urinary ions were normalized to the corresponding creatinine relative abundances to reduce the probability of confounding by variation in renal function. A visualization tool was developed that qualitatively illustrates the overall metabolome response to radiation exposure. It produces 3D plots on which each point represents a metabolite, specified by its mass/charge (X axis) and retention time (Y axis), with its pre- to post-exposure change in regulation (Z axis) colorized according to the magnitude of change. Data filters and data augmentation schemes were developed to yield more quantitatively robust results, from which stronger conclusions can be drawn about the experiment, such as the identification of specific metabolites of interest. This methodology enhances the repertoire of analytical tools available to metabolomics researchers, facilitating more rapid and meaningful biomarker discovery.
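
The creatinine normalization and the coordinates of the 3D plot can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, ion labels, m/z and retention time values, and abundances are hypothetical, and expressing the change in regulation as a log2 fold change is an assumption (the text above does not specify the measure).

```python
from math import log2

def normalize_to_creatinine(sample, creatinine_key="creatinine"):
    """Divide each ion's relative abundance by the sample's creatinine abundance."""
    creatinine = sample[creatinine_key]
    return {
        ion: abundance / creatinine
        for ion, abundance in sample.items()
        if ion != creatinine_key
    }

def plot_points(pre, post, ion_info):
    """Return (m/z, retention time, change) triples for a 3D plot.

    The change is taken here as the log2 fold change of the
    creatinine-normalized abundance from pre- to post-exposure
    (an assumed measure, chosen for illustration).
    """
    pre_norm = normalize_to_creatinine(pre)
    post_norm = normalize_to_creatinine(post)
    return [
        (mz, rt, log2(post_norm[ion] / pre_norm[ion]))
        for ion, (mz, rt) in ion_info.items()
    ]

# Hypothetical ions: label -> (m/z, retention time in minutes).
ion_info = {"ionA": (180.07, 1.2), "ionB": (132.08, 0.6)}

# Hypothetical relative abundances for one animal before and after exposure.
pre = {"creatinine": 1000.0, "ionA": 200.0, "ionB": 100.0}
post = {"creatinine": 500.0, "ionA": 200.0, "ionB": 25.0}

points = plot_points(pre, post, ion_info)
for mz, rt, change in points:
    print(mz, rt, round(change, 3))
```

In this toy case the raw abundance of ionA is identical before and after exposure, yet its normalized value doubles because creatinine halved. The X, Y, and Z values of each triple could then be passed to any 3D scatter plotting routine, with the Z value also driving the color scale.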