Tutorial: An Introduction to High-Performance R
Computing resources are more abundant than ever in absolute terms thanks to the continued improvements in processing power that were predicted decades ago by Moore's law.
However, despite these advances, work in applied and computational statistics is constrained by the seemingly parallel growth in data sets. As ever larger amounts of data offset the faster computers, any relative improvement in computing power is difficult to appreciate. Hence, R users still feel limited by their available computing resources, be it memory, CPU power, or both, and complain that "it still takes too long" or "it still uses all my memory".
The tutorial will introduce a number of available options to address these computing constraints in order to enhance or accelerate R processing.
The following topics will be covered in the tutorial:
- the inline package and the RCppTemplate package for C/C++ interfaces
- the snow package with Rmpi or Rpvm
- the nws package with its python-nwsserver backend
- the pnmath and pnmath0 packages for multithreaded math functions
- the biglm and ff packages
- the Rscript and littler wrappers
The tutorial is aimed at R users wishing to learn about different methods of extending R for more efficient data processing.
Basic R and programming knowledge will be required to take advantage of all the examples. Likewise, some understanding of general computing concepts will be helpful. For the examples involving C and C++, some familiarity with these languages is required.
Dr. Dirk Eddelbuettel has been using S+ and R for quantitative analysis for over a decade. He is the author of several CRAN packages, the maintainer of R and numerous other packages for the Debian Linux distribution, and the author of the Quantian computing environment.