Reproducible Research
A research team led by Duke University has studied the host (body) response to
viruses. In the course of that research we collected blood samples in human
challenge studies, which were approved by four different Institutional Review
Boards. The host response has been investigated in terms of gene expression.
To encourage other investigators to reproduce our results, build upon them,
and make new discoveries, this webpage provides all data and the software used
to produce every figure in our papers. The software is written in Matlab, and
the data are provided as associated .mat files. The data posted here have been
normalized from the raw expression values using standard techniques;
interested individuals may contact Lawrence Carin (lcarin@duke.edu) for
details of the normalization. Those who wish to start from the raw data can
find them in GEO (accession no. GSE17156).
For the following three papers:
A.K. Zaas, M. Chen, J. Varkey, T. Veldman, A.O. Hero III, J. Lucas, R. Turner, A. Gilbert, C. Oien, B. Nicholson, S. Kingsmore, L. Carin, C.W. Woods, and G.S. Ginsburg, Gene Expression Signatures Diagnose Influenza and Other Symptomatic Respiratory Viral Infections in Humans, Cell Host and Microbe, 2009.
M. Chen, D. Carlson, A. Zaas, C. Woods, G. Ginsburg, A.O. Hero III, J. Lucas, and L. Carin, Detection of Viruses via Statistical Gene-Expression Analysis, IEEE Transactions on Biomedical Engineering, 2010.
B. Chen, M. Chen, J. Paisley, A. Zaas, C. Woods, G.S. Ginsburg, A. Hero III, J. Lucas, D. Dunson, and L. Carin, Bayesian Inference of the Number of Factors in Gene-Expression Analysis: Application to Human Virus Challenge Studies, BMC Bioinformatics, 2010.
The data and Matlab software needed to reproduce every figure in these papers
are here (517 MB).
We also present an example here, based on toy data, showing that the model can
infer the number of factors present. In this example the data are pure noise,
and we demonstrate that the model infers that no factors are present.
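To give a rough sense of why pure-noise data should support no factors, the sketch below simulates two toy datasets and compares them with a simple correlation diagnostic. It is written in Python using only the standard library; it is not the Matlab code distributed on this page, and it does not implement the Bayesian factor model from the papers. The function names (simulate, mean_abs_offdiag_corr), the dataset sizes, and the diagnostic itself are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate(n_samples, n_genes, n_factors, rng, noise_sd=1.0):
    """Toy data: each gene is a linear mix of latent factors plus Gaussian noise.

    With n_factors=0 the factor term vanishes and the data are pure noise.
    (Hypothetical generator for illustration, not the authors' toy data.)
    """
    loadings = [[rng.gauss(0, 1) for _ in range(n_factors)]
                for _ in range(n_genes)]
    data = []
    for _ in range(n_samples):
        scores = [rng.gauss(0, 1) for _ in range(n_factors)]
        row = [sum(l * s for l, s in zip(loadings[g], scores))
               + rng.gauss(0, noise_sd)
               for g in range(n_genes)]
        data.append(row)
    return data


def mean_abs_offdiag_corr(data):
    """Average absolute Pearson correlation over all distinct gene pairs."""
    n = len(data)
    p = len(data[0])
    # Standardize each gene (column) to zero mean and unit variance.
    z = []
    for j in range(p):
        col = [row[j] for row in data]
        m = statistics.fmean(col)
        s = statistics.pstdev(col)
        z.append([(v - m) / s for v in col])
    total = 0.0
    count = 0
    for i in range(p):
        for j in range(i + 1, p):
            r = sum(a * b for a, b in zip(z[i], z[j])) / n
            total += abs(r)
            count += 1
    return total / count


rng = random.Random(0)  # fixed seed so the sketch is repeatable
noise_score = mean_abs_offdiag_corr(simulate(200, 20, 0, rng))
structured_score = mean_abs_offdiag_corr(simulate(200, 20, 2, rng))

print("pure noise:  ", round(noise_score, 3))
print("two factors: ", round(structured_score, 3))
```

For pure noise the gene-to-gene correlations are only sampling fluctuations, on the order of 1/sqrt(n_samples), so there is no shared structure for a factor to explain. With two latent factors many gene pairs are clearly correlated. A factor model that infers the number of factors is, loosely speaking, deciding whether such shared structure is present; the diagnostic here only hints at that and makes no attempt to count factors.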
______
For the following paper:
C.W. Woods, M.T. McClain, M. Chen, A.K. Zaas, B.P. Nicholson, J. Varkey, T. Veldman, S.F. Kingsmore, Y. Huang, R. Lambkin-Williams, A.G. Gilbert, A.O. Hero III, E. Ramsburg, S. Glickman, J.E. Lucas, L. Carin, and G.S. Ginsburg, A Host Transcriptional Signature for Presymptomatic Detection of Infection in Humans Exposed to Influenza H1N1 or H3N2, PLOS ONE, 2013.
The data and the code are here (205 MB; the main function to run is
Flu_Validate.m).