REPRODUCIBLE STATISTICAL RESEARCH IN PRACTICE

Friedrich Leisch
Department of Statistics, University of Munich

Scientific progress, for which statistical analyses often provide
important supporting evidence, requires the reproduction of research
results.  This applies to statistical research done as part of a
collaborative scientific team as well as the development of
statistical theory and methods.  Communication of results is also
important, as required by the peer-review process and needed by the
end consumers of the conducted research.  Proper scientific review and
evaluation should check the correctness of both theoretical evidence,
through the checking of proofs and assumptions, and computational
results as described by numerical studies and concrete implementations
of the proposed methods.

As motivating examples we will use two case studies: PhD students and
postdocs attending a summer school of the German Biometric Society on
"Reproducible Research and Software Validation" were asked to
reproduce the results of several biostatistical papers in groups of
2-4 students in a hands-on session. The students were given all data
necessary to reproduce the results, together with the code provided as
online supplements to the journal articles. In an ongoing research
project, a random sample of one hundred articles from one of the
leading journals of the field has been analyzed for availability of
data, code and other instructions to reproduce results.

This talk will discuss the general problem of making computational
statistical research reproducible, and show which tools in R are
available to assist the analysis. One possible solution is to write
parts of manuscripts using Sweave and hence combine text and analysis
into one entity. But in many cases providing validated code and data
would be sufficient, with emphasis on code validation. In many cases
mere availability of code gives others the feeling that results are
reproducible, even if they are not. So the future challenge will be to
create mechanisms for automatic checking of analysis code beyond the
review process of a paper.
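
As a rough illustration of what combining text and analysis into one
entity can look like, the following is a minimal sketch of a Sweave
source file (LaTeX with embedded R code chunks). It is not taken from
the talk: the file name example.Rnw, the chunk names and the simulated
data are assumptions chosen purely for illustration.

```latex
\documentclass{article}
\begin{document}

% Code chunk: runs in R when the file is processed; echo=FALSE hides the code.
<<setup, echo=FALSE>>=
set.seed(1)                 # fix the RNG seed so the simulated data are repeatable
x <- rnorm(50, mean = 10)   # hypothetical data standing in for a real study
@

% \Sexpr{} inserts values computed in R directly into the running text.
The sample mean of the \Sexpr{length(x)} observations is
\Sexpr{round(mean(x), 2)}.

% Code chunk: both the R code and its printed output appear in the document.
<<ttest>>=
t.test(x, mu = 10)
@

\end{document}
```

Assuming the file is saved as example.Rnw, running Sweave("example.Rnw")
in R (or R CMD Sweave example.Rnw from a shell) produces example.tex,
which is then compiled with LaTeX as usual. Because the numbers in the
text are computed when the document is built, they cannot drift out of
sync with the code that produced them.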