Wednesday, May 29, 2019

Pipes and connections between functions in R

A relatively recent addition to R is the forward pipe operator (%>%) from the magrittr package. It is especially useful with the dplyr, ggvis, and tidyr packages, among others. Piping is an alternative to nesting function calls: each step passes its result on to the next, so the code reads from top to bottom. This post compares traditional (nested) dplyr function calls with the pipe operator.
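As a minimal sketch of that comparison, the snippet below runs the same summary twice, once with nested calls and once with pipes. The built-in mtcars dataset, the hp > 100 filter, and the mean_mpg column are assumptions chosen only for illustration; they are not taken from the original post.

```r
library(dplyr)  # dplyr re-exports %>% from magrittr

# Nested form: the calls have to be read from the inside out
nested <- arrange(
  summarise(
    group_by(filter(mtcars, hp > 100), cyl),
    mean_mpg = mean(mpg)
  ),
  desc(mean_mpg)
)

# Piped form: the same steps, read from top to bottom
piped <- mtcars %>%
  filter(hp > 100) %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg)) %>%
  arrange(desc(mean_mpg))

# Both forms should produce the same summary table
all.equal(nested, piped)
```

The piped version lists the operations in the order they are applied, which is usually what makes it easier to follow and to extend.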


Thursday, May 23, 2019

Summary of Matrix Operators in R

Parallel with R. Example with snow

The snow package provides support for easily executing R functions in parallel. Most of its parallel execution functions are variations of the standard lapply() function, which makes snow fairly easy to learn. To implement these parallel operations, snow uses a master/worker architecture: the master sends tasks to the workers, and the workers execute the tasks and return the results to the master.


The basic cluster creation function is makeCluster(), which can create any type of cluster that snow supports. The package includes a number of functions that we could use, including clusterApply(), clusterApplyLB(), and parLapply(). For this example, we’ll use clusterApply(). You call it the same way as lapply(), except that it takes a snow cluster object as its first argument. We also need to load MASS on the workers, rather than on the master, since it’s the workers that use the “Boston” dataset.
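The steps just described can be sketched as follows. This is an illustration under stated assumptions, not the post's own code: the four local socket workers and the k-means task on Boston (4 centers, 25 random starts per worker) are hypothetical choices made for the example.

```r
library(snow)

# Assumption: four socket workers on the local machine
cl <- makeCluster(4, type = "SOCK")

# Load MASS on every worker so that each one can see the Boston dataset
invisible(clusterEvalQ(cl, library(MASS)))

# Hypothetical task: each worker runs k-means on Boston with 25 random starts.
# The call mirrors lapply(), with the cluster object as the first argument.
results <- clusterApply(cl, rep(25, 4), function(nstart) {
  kmeans(Boston, centers = 4, nstart = nstart)
})

# Keep the fit with the smallest total within-cluster sum of squares
wss <- vapply(results, function(fit) fit$tot.withinss, numeric(1))
best <- results[[which.min(wss)]]

# Always shut the workers down when finished
stopCluster(cl)
```

Note that library(MASS) is never called on the master here: the anonymous function runs on the workers, so that is where Boston has to be available.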

We’ll use snow.time() to gather timing information about the overall execution. We will also use snow.time()’s plotting capability to visualize the task execution on the workers.
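A minimal sketch of the timing step might look like the following. As before, the four local socket workers and the k-means task are assumptions made for illustration; only snow.time() and its print and plot methods come from the package itself.

```r
library(snow)

# Assumption: four socket workers on the local machine
cl <- makeCluster(4, type = "SOCK")
invisible(clusterEvalQ(cl, library(MASS)))

# Wrap the parallel call in snow.time() to record timing information
# for the whole computation and for each worker
tm <- snow.time(
  clusterApply(cl, rep(25, 4), function(nstart) {
    kmeans(Boston, centers = 4, nstart = nstart)
  })
)

# Text summary of elapsed, send, receive, and per-node times
print(tm)

# Chart of task execution on each worker over time
plot(tm)

stopCluster(cl)
```

In the plot, each row corresponds to a worker, so gaps or uneven bars give a quick impression of how well the work was balanced across the cluster.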