rmr {rmr2} | R Documentation |
Running on top of Hadoop, this package lets you define and run mapreduce jobs, including specifying the mapper and the reducer as R functions, and move data between R and Hadoop in a mostly transparent way. The aim is to make writing mapreduce jobs very similar to, and just as easy as, writing an lapply and a tapply. Additional features provide easy job composition, transparent management of intermediate results, support for different data formats, and more.
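
The lapply/tapply analogy can be made concrete with a small sketch. The runnable part below uses only base R; the commented part is a rough rmr2 counterpart, assuming rmr2 is installed and that its local backend is acceptable for experimentation. The variable names (ints, squares, groups, counts) are illustrative and not part of the package.

```r
# Base R: the two idioms the description compares mapreduce jobs to.
ints <- 1:10
squares <- lapply(ints, function(v) v^2)    # apply a function to every element ("map")

groups <- c("a", "b", "a", "c", "b", "a")
counts <- tapply(groups, groups, length)    # group by key, then summarise each group ("reduce")

# A rough rmr2 counterpart of the tapply call. It is left commented out
# because it assumes rmr2 is installed and a backend is configured.
# library(rmr2)
# rmr.options(backend = "local")            # assumption: run without a Hadoop cluster
# out <- mapreduce(
#   input  = to.dfs(groups),
#   map    = function(k, v) keyval(v, 1),           # emit each value as a key
#   reduce = function(k, vv) keyval(k, length(vv))) # count the values per key
# from.dfs(out)                             # bring the key-value result back into R
```

In this sketch the map function plays the role of the function passed to lapply, while the reduce function plays the role of the summary function passed to tapply, applied once per key.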