Reproducible Research
A research team led by Duke University has been studying the host (body)
response to viruses, measured in terms of gene expression. In the course of
that research we have collected blood samples in human challenge studies. These
challenge studies were approved by four different Institutional Review Boards.
To encourage other investigators to reproduce our results, build upon them, and
make new discoveries, this webpage provides all of the data and software used
to produce every figure in our papers. The software is written in Matlab, and
the data are provided as associated .mat files. The data posted here have been
normalized from the raw expression values.
Interested individuals may contact Lawrence Carin (lcarin@duke.edu) for details
of the normalization, which used standard techniques. Those who wish to start
from the raw data can obtain them from GEO (accession no. GSE17156).
The three papers are:
A.K. Zaas, M. Chen, J. Varkey, T. Veldman, A.O. Hero III, J. Lucas, R. Turner, A. Gilbert, C. Oien, B. Nicholson, S. Kingsmore, L. Carin, C.W. Woods, and G.S. Ginsburg, Gene Expression Signatures Diagnose Influenza and Other Symptomatic Respiratory Viral Infections in Humans, Cell Host and Microbe, 2009.
M. Chen, D. Carlson, A. Zaas, C. Woods, G. Ginsburg, A.O. Hero III, J. Lucas, and L. Carin, Detection of Viruses via Statistical Gene-Expression Analysis, IEEE Transactions on Biomedical Engineering, 2010.
B. Chen, M. Chen, J. Paisley, A. Zaas, C. Woods, G.S. Ginsburg, A. Hero III, J. Lucas, D. Dunson, and L. Carin, Bayesian inference of the number of factors in gene-expression analysis: application to human virus challenge studies, BMC Bioinformatics, 2010.
The data and Matlab software needed to reproduce every figure in these
papers are here (517 MB).
We also present an example here, based on toy data, that demonstrates the
model's ability to infer the number of factors present. In this example the
data are pure noise, and the model correctly infers that no factors are
present.
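
For readers who want a feel for the pure-noise check before downloading the toy example, the idea can be sketched in a few lines. This is not the Bayesian factor model from the papers, and it is written in Python rather than the Matlab of the posted software. As a stand-in, it compares the largest eigenvalue of the sample covariance with the Marchenko-Pastur upper edge expected for unit-variance noise. The function names, the problem sizes, the planted-factor strength, and the 1.2 safety margin are all assumptions chosen for illustration.

```python
import math
import random


def simulate(num_vars, num_samples, factor_strength, rng):
    """Draw samples x = factor_strength * z * v + e, with unit-variance Gaussian noise e.

    Setting factor_strength to 0 gives pure noise (no factors).
    """
    loading = [1.0 / math.sqrt(num_vars)] * num_vars  # unit-norm loading vector v
    data = []
    for _ in range(num_samples):
        score = rng.gauss(0.0, 1.0)  # factor score z for this sample
        data.append([factor_strength * score * w + rng.gauss(0.0, 1.0)
                     for w in loading])
    return data


def sample_covariance(data):
    """Covariance of zero-mean data (rows are samples), without centering."""
    n, p = len(data), len(data[0])
    cov = [[0.0] * p for _ in range(p)]
    for row in data:
        for i in range(p):
            for j in range(p):
                cov[i][j] += row[i] * row[j] / n
    return cov


def largest_eigenvalue(matrix, rng, iterations=500):
    """Approximate the top eigenvalue of a symmetric matrix by power iteration.

    The returned Rayleigh quotient never exceeds the true top eigenvalue.
    """
    p = len(matrix)
    vec = [rng.gauss(0.0, 1.0) for _ in range(p)]  # random starting direction
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    mat_vec = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
    return sum(v * m for v, m in zip(vec, mat_vec))


def has_factor(data, rng, margin=1.2):
    """Flag a factor if the top eigenvalue clearly exceeds the noise-only edge."""
    n, p = len(data), len(data[0])
    edge = (1.0 + math.sqrt(p / n)) ** 2  # Marchenko-Pastur edge, unit noise variance
    return largest_eigenvalue(sample_covariance(data), rng) > margin * edge


rng = random.Random(0)  # fixed seed so the run is repeatable
noise_only = simulate(20, 400, 0.0, rng)
one_factor = simulate(20, 400, 2.0, rng)
print("pure noise, factor detected:", has_factor(noise_only, rng))
print("one planted factor, factor detected:", has_factor(one_factor, rng))
```

The eigenvalue threshold only answers whether any factor is present, and it assumes the noise variance is known to be one. The model in the papers instead infers the number of factors from the data.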