Tutorial: Introduction to high-performance computing with R
Dirk Eddelbuettel,
Debian, Chicago, USA
Abstract
R users are often limited by available memory, CPU power or both. The
tutorial covers a number of available options to enhance or accelerate R
processing.
Compared to the previous tutorials at
useR! 2008 and
useR! 2009,
we will cover recently added packages (e.g. multicore and foreach / iterators
/ doMC / doSNOW / doMPI) and new methodologies
(e.g. GPU computing with R), while still covering parallel computing
(Rmpi, snow, nws), interfacing with compiled code (Rcpp,
inline, RInside) and 'large memory' approaches (biglm, bigmemory, ff).
Outline
Topics will include:
- Tools for automation and scripting with R
- Measuring and profiling to identify bottlenecks
- Performance enhancements using vectorization, just-in-time compilation, BLAS and GPUs
- Extending R with compiled code, and embedding R in C++ applications
- An overview of parallel computing (explicitly and implicitly) with R
- Out-of-memory processing using packages biglm, ff and bigmemory
Time will be allotted for discussion and questions. We also
hope to prepare another 'ready-to-run' live CD-ROM containing most of the packages discussed in
the tutorial, and to show its use in virtual machines or dual-boot setups.
Potential attendees
R users wishing to learn about measuring / profiling performance, running R in parallel or extending
R by means of compiled code.
Required knowledge
There is no formal requirement, though basic knowledge of R and a basic programming background in C
or C++ are beneficial.
Tutorial Materials
Slides are here.