Accelerating R Code with Fast Loops and Parallel Processing#

Authors:

Kyle Monahan

Shirley Li, Bioinformatician, TTS Research Technology (xue.li37@tufts.edu)

Overview#

In R, processing large datasets can become time-consuming, especially when using for loops that execute sequentially. This tutorial explores techniques to speed up computations by leveraging parallel processing. We introduce the R libraries parallel, doParallel, foreach, and parallelly, each offering tools to efficiently distribute tasks across multiple cores.

We’ll cover:

  • How to implement basic parallelization using for and foreach loops.

  • Leveraging foreach and doParallel for more complex parallel workflows.

  • Using parLapply and parallelly for straightforward parallel operations.

Through examples and comparisons, this tutorial provides a practical guide to optimizing R code for faster execution, with insights on when to choose for, foreach, parLapply, and parallelly for different scenarios.


Libraries Required#

In this tutorial, we’ll use the following R libraries for parallel processing:

  • parallel: The base library in R for general parallel processing tasks.

  • doParallel: Provides the parallel backend that lets foreach loops run in parallel.

  • foreach: Provides a more flexible looping option that can run in parallel.

  • parallelly: Contains helper functions that simplify cluster management.


# Load necessary libraries
library(parallel)    # Base parallel library for R
library(doParallel)  # Enables parallelized foreach loops
library(foreach)     # Defines foreach loops for parallel processing
library(parallelly)  # Helper functions for cluster management

Introduction to for Loops in R#

for loops are commonly used in R for iterating over data but can be slow with large datasets because they run sequentially. Here’s an example:

# Basic for loop example
result <- 0
for (i in 1:100000) {
  result <- result + i
}
print(result)

This loop iterates from 1 to 100,000, summing the values sequentially. Parallel processing can improve the runtime by distributing tasks across multiple cores.
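
Before parallelizing, it helps to measure the sequential baseline. The sketch below times the loop with system.time(); note that summing integers is so cheap that the vectorized sum() will beat any loop, so the loop here only stands in for heavier per-iteration work:

# Time the sequential loop as a baseline
seq_time <- system.time({
  result <- 0
  for (i in 1:100000) {
    result <- result + i
  }
})
print(seq_time)

# The vectorized equivalent is near-instant (as.numeric avoids integer overflow)
print(system.time(sum(as.numeric(1:100000))))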


Implementing foreach and doParallel for Parallel Loops#

Using foreach with doParallel runs loop iterations across multiple cores and gives you control over how results are combined via the .combine argument.

Example: Parallel Summation with Two Cores#

In this example, we split the summation of numbers from 1 to 100,000 between two cores. Each core calculates the sum of an assigned range, and then the results are combined.

# Load necessary libraries
library(doParallel)
library(foreach)

# Set up parallel backend with 2 cores
registerDoParallel(cores = 2)

# Define the two ranges for summation
ranges <- list(1:50000, 50001:100000)

# Use foreach to process each range in parallel
partial_sums <- foreach(range = ranges, .combine = '+') %dopar% {
  sum(range)
}

# Print the final result after combining the two partial sums
print(partial_sums)

# Stop the parallel backend
stopImplicitCluster()

Tip: With more cores, split the range into one chunk per core to increase efficiency, as the sketch below shows.
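
The sketch uses splitIndices() from the base parallel package to build the chunks; the value of n_cores is an arbitrary choice for illustration, and any count up to detectCores() works the same way.

library(parallel)    # for splitIndices()
library(doParallel)
library(foreach)

n_cores <- 4  # illustrative; choose a value no larger than detectCores()
registerDoParallel(cores = n_cores)

# Split 1:100000 into one contiguous chunk of indices per core
chunks <- splitIndices(100000, n_cores)

# Each core sums its chunk; '+' combines the partial sums into one total
total <- foreach(chunk = chunks, .combine = '+') %dopar% {
  sum(chunk)
}
print(total)

stopImplicitCluster()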


Another for Loop Example#

# Initialize a vector to store the results
results <- numeric(100000)

# Run tasks sequentially using a for loop
for (i in 1:100000) {
  results[i] <- i * 2
}

# Print or further process the results
print(results)

In this example, each iteration is independent of the others, and the result is a vector of values rather than a single number. For this kind of task, we can use parLapply from the base parallel package together with the parallelly package for cluster setup.


Using parallelly for Enhanced Cluster Management#

library(parallel)    # provides parLapply() and stopCluster()
library(parallelly)  # provides makeClusterPSOCK()

# Set up a PSOCK cluster, leaving one core free for the main session
cluster <- makeClusterPSOCK(detectCores() - 1)

# Run tasks in parallel using parLapply
results <- parLapply(cluster, 1:100000, function(x) x * 2)

# Stop the cluster to free up resources
stopCluster(cluster)

The makeClusterPSOCK function provides more control over cluster configuration, making it useful for customized parallel environments.

In this example, parLapply applies a function to each element of the list (1 to 100,000) in parallel.
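
One practical note: PSOCK workers start as fresh R sessions, so any variables, functions, or packages your function relies on must be shipped to the workers explicitly with clusterExport() or clusterEvalQ() from the parallel package. A minimal sketch (the multiplier variable and scale_value helper are hypothetical examples):

library(parallel)
library(parallelly)

cluster <- makeClusterPSOCK(2)

# Hypothetical objects that the worker function depends on
multiplier <- 2
scale_value <- function(x) x * multiplier

# PSOCK workers do not inherit the main session's workspace,
# so export the objects the workers will need
clusterExport(cluster, c("multiplier", "scale_value"))

results <- parLapply(cluster, 1:10, function(x) scale_value(x))
print(unlist(results))

stopCluster(cluster)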

How the Work Is Distributed#

If you have 10 cores available and set up a PSOCK cluster with detectCores() - 1, the code will create a cluster using 9 cores. Here’s how the job distribution will work:

  1. Task Distribution Across Cores:

    • The parLapply function will split the task (multiplying each number in 1:100000 by 2) across the 9 cores.

    • R will automatically divide the 1:100000 range into chunks, and each core will be assigned a subset of these numbers to process.

  2. Chunking and Parallel Processing:

    • The 100,000 tasks will be divided into 9 chunks, with each core processing approximately 11,111 numbers.

    • For example, Core 1 might handle numbers 1:11111, Core 2 11112:22222, and so on.

    • parLapply splits the input into fixed chunks up front (static scheduling), so each core processes a roughly equal number of elements; for tasks with uneven runtimes, parLapplyLB assigns work dynamically instead, as shown in the sketch after this list.

  3. Results Collection:

    • Each core computes the results for its assigned subset and returns a list of results.

    • Once all cores finish, parLapply will combine these lists into a single result vector (or list) containing all processed values.

  4. Core Utilization:

    • The task distribution makes efficient use of the 9 cores, allowing each core to operate independently on its chunk, thereby reducing the overall runtime compared to sequential processing.
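
To see the chunking concretely, splitIndices() from the parallel package performs the same kind of static split that parLapply uses internally, and parLapplyLB() offers the load-balanced alternative mentioned above. A brief sketch (the 9-worker split mirrors the scenario described in this list):

library(parallel)

# Inspect a static split of 100,000 indices across 9 workers
chunks <- splitIndices(100000, 9)
lengths(chunks)  # roughly 11,111 indices per chunk

cluster <- makeCluster(2)  # a small cluster is enough for a quick test

# Static scheduling: each worker receives its chunk up front
res_static <- parLapply(cluster, 1:100, function(x) x * 2)

# Dynamic load balancing: tasks are handed out as workers become free
res_balanced <- parLapplyLB(cluster, 1:100, function(x) x * 2)

stopCluster(cluster)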



Comparison: for vs. foreach Loops in R#

Choosing between for and foreach depends on the need for parallel processing, speed, and task complexity.

| Feature | for Loop | foreach Loop |
| --- | --- | --- |
| Execution | Sequential, single-core only | Parallel using %dopar%, multi-core capable |
| Performance | Slower for large tasks | Faster for large, independent tasks |
| Syntax | Straightforward | Requires setup but allows flexible result combining |
| Memory Usage | Lower | Higher per core |
| Ideal Use Case | Small, sequential tasks | Large, independent, repetitive tasks |

Conclusion: Use a for loop for simple sequential tasks and foreach with %dopar% for large-scale parallel processing.
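
A quick way to verify this trade-off on your own machine is to time both versions with system.time(). A sketch follows; for per-iteration work as cheap as this, the communication overhead of %dopar% usually makes the parallel version slower, so treat it as a template for heavier workloads rather than a guaranteed speedup:

library(doParallel)
library(foreach)

registerDoParallel(cores = 2)

# Sequential for loop
t_for <- system.time({
  res_for <- numeric(10000)
  for (i in 1:10000) res_for[i] <- sqrt(i)
})

# Parallel foreach; overhead dominates for iterations this cheap
t_foreach <- system.time({
  res_foreach <- foreach(i = 1:10000, .combine = c) %dopar% sqrt(i)
})

print(rbind(t_for, t_foreach))

stopImplicitCluster()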


foreach vs. parLapply#

The choice between foreach and parLapply depends on the operation type, required flexibility, and desired output structure.

Key Differences#

| Feature | parLapply | foreach |
| --- | --- | --- |
| Package | parallel | foreach, doParallel, etc. |
| Setup | Cluster-based; explicit setup required | Backend-based; no explicit cluster setup needed |
| Result Combination | Returns a list; post-processing may be needed | .combine option for direct result control |
| Flexibility | Best for simple, repetitive tasks | Ideal for complex, customizable tasks |
| Error Handling | Limited | More robust error-handling options |

Examples#

Using parLapply:

library(parallel)
cluster <- makeCluster(detectCores() - 1)
result <- parLapply(cluster, 1:100000, function(x) x * 2)
stopCluster(cluster)

Using foreach:

library(doParallel)
library(foreach)
registerDoParallel(cores = detectCores() - 1)
result <- foreach(i = 1:100000, .combine = c) %dopar% {
  i * 2
}
stopImplicitCluster()

Conclusion:

  • Use parLapply for simple parallel tasks with independent iterations.

  • Use foreach for tasks that require customized result aggregation, error handling, or flexible backend support (see the error-handling sketch below).
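
To illustrate the error-handling point, foreach accepts an .errorhandling argument ("stop", "remove", or "pass"). A minimal sketch in which one task deliberately fails:

library(doParallel)
library(foreach)

registerDoParallel(cores = 2)

# With .errorhandling = "pass", a failing task returns its error object
# instead of aborting the entire loop
results <- foreach(i = 1:5, .errorhandling = "pass") %dopar% {
  if (i == 3) stop("simulated failure on task 3")  # deliberate failure
  i * 2
}

str(results)  # element 3 holds a simpleError; the others hold values

stopImplicitCluster()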


Final Thoughts#

Parallel processing in R can significantly improve runtime for data-intensive tasks. Understanding when to use for loops, foreach, parLapply, or parallelly can help you optimize your code’s performance. This tutorial provides a foundation for efficient parallelization in R.

References#

https://www.rdocumentation.org/packages/foreach/versions/1.5.2/topics/foreach

https://cran.r-project.org/web/packages/doParallel/index.html

https://gradientdescending.com/simple-parallel-processing-in-r/

https://davidzeleny.net/wiki/doku.php/recol:parallel

https://dept.stat.lsa.umich.edu/~jerrick/courses/stat701/notes/parallel.html