Accelerating R Code with Fast Loops and Parallel Processing#

Authors:

Kyle Monahan

Shirley Li, Bioinformatician, TTS Research Technology (xue.li37@tufts.edu)

Overview#

In R, processing large datasets can become time-consuming, especially when using for loops that execute sequentially. This tutorial explores techniques to speed up computations by leveraging parallel processing. We introduce the R libraries parallel, doParallel, foreach, and parallelly, each offering tools to efficiently distribute tasks across multiple cores.

We’ll cover:

  • How to implement basic parallelization using for and foreach loops.

  • Leveraging foreach and doParallel for more complex parallel workflows.

  • Using parLapply and parallelly for straightforward parallel operations.

Through examples and comparisons, this tutorial provides a practical guide to optimizing R code for faster execution, with insights on when to choose for, foreach, parLapply, and parallelly for different scenarios.


Libraries Required#

In this tutorial, we’ll use the following R libraries for parallel processing:

  • parallel: The base library in R for general parallel processing tasks.

  • doParallel: Provides the parallel backend that lets foreach loops run in parallel.

  • foreach: Provides a more flexible looping option that can run in parallel.

  • parallelly: Contains helper functions that simplify cluster management.


# Load necessary libraries
library(parallel)    # Base parallel library for R
library(doParallel)  # Enables parallelized foreach loops
library(foreach)     # Defines foreach loops for parallel processing
library(parallelly)  # Helper functions for cluster management

Introduction to for Loops in R#

for loops are commonly used in R for iterating over data but can be slow with large datasets because they run sequentially. Here’s an example:

# Basic for loop example
result <- 0
for (i in 1:100000) {
  result <- result + i
}
print(result)

This loop iterates from 1 to 100,000, summing the values sequentially. Parallel processing can improve the runtime by distributing tasks across multiple cores.
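
Before parallelizing, it helps to measure the sequential baseline. The sketch below times the loop with system.time(); note that summing integers is so cheap that the vectorized sum() will beat any loop, so the loop here only stands in for heavier per-iteration work:

# Time the sequential loop as a baseline
seq_time <- system.time({
  result <- 0
  for (i in 1:100000) {
    result <- result + i
  }
})
print(seq_time)

# The vectorized equivalent is near-instant (as.numeric avoids integer overflow)
print(system.time(sum(as.numeric(1:100000))))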


Implementing foreach and doParallel for Parallel Loops#

Using foreach with doParallel runs loop iterations across multiple cores and gives you control over how results are combined via the .combine argument.

Example: Parallel Summation with Two Cores#

In this example, we split the summation of numbers from 1 to 100,000 between two cores. Each core calculates the sum of an assigned range, and then the results are combined.

# Load necessary libraries
library(doParallel)
library(foreach)

# Set up parallel backend with 2 cores
registerDoParallel(cores = 2)

# Define the two ranges for summation
ranges <- list(1:50000, 50001:100000)

# Use foreach to process each range in parallel
partial_sums <- foreach(range = ranges, .combine = '+') %dopar% {
  sum(range)
}

# Print the final result after combining the two partial sums
print(partial_sums)

# Stop the parallel backend
stopImplicitCluster()

Tip: With more cores, split the range into one chunk per core to increase efficiency, as the sketch below shows.
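
The sketch uses splitIndices() from the base parallel package to build the chunks; the value of n_cores is an arbitrary choice for illustration, and any count up to detectCores() works the same way.

library(parallel)    # for splitIndices()
library(doParallel)
library(foreach)

n_cores <- 4  # illustrative; choose a value no larger than detectCores()
registerDoParallel(cores = n_cores)

# Split 1:100000 into one contiguous chunk of indices per core
chunks <- splitIndices(100000, n_cores)

# Each core sums its chunk; '+' combines the partial sums into one total
total <- foreach(chunk = chunks, .combine = '+') %dopar% {
  sum(chunk)
}
print(total)

stopImplicitCluster()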


Another for Loop Example#

# Initialize a vector to store the results
results <- numeric(100000)

# Run tasks sequentially using a for loop
for (i in 1:100000) {
  results[i] <- i * 2
}

# Print or further process the results
print(results)

In this example, each iteration is independent of the others, and the result is a vector of values rather than a single number. For this kind of task, we can use parLapply from the base parallel package together with the parallelly package for cluster setup.


Using parallelly for Enhanced Cluster Management#

library(parallel)    # provides parLapply() and stopCluster()
library(parallelly)  # provides makeClusterPSOCK()

# Set up a PSOCK cluster, leaving one core free for the main session
cluster <- makeClusterPSOCK(detectCores() - 1)

# Run tasks in parallel using parLapply
results <- parLapply(cluster, 1:100000, function(x) x * 2)

# Stop the cluster to free up resources
stopCluster(cluster)

The makeClusterPSOCK function provides more control over cluster configuration, making it useful for customized parallel environments.

In this example, parLapply applies a function to each element of the list (1 to 100,000) in parallel.
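
One practical note: PSOCK workers start as fresh R sessions, so any variables, functions, or packages your function relies on must be shipped to the workers explicitly with clusterExport() or clusterEvalQ() from the parallel package. A minimal sketch (the multiplier variable and scale_value helper are hypothetical examples):

library(parallel)
library(parallelly)

cluster <- makeClusterPSOCK(2)

# Hypothetical objects that the worker function depends on
multiplier <- 2
scale_value <- function(x) x * multiplier

# PSOCK workers do not inherit the main session's workspace,
# so export the objects the workers will need
clusterExport(cluster, c("multiplier", "scale_value"))

results <- parLapply(cluster, 1:10, function(x) scale_value(x))
print(unlist(results))

stopCluster(cluster)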

How the Work Is Distributed#

If you have 10 cores available and set up a PSOCK cluster with detectCores() - 1, the code will create a cluster using 9 cores. Here’s how the job distribution will work:

  1. Task Distribution Across Cores:

    • The parLapply function will split the task (multiplying each number in 1:100000 by 2) across the 9 cores.

    • R will automatically divide the 1:100000 range into chunks, and each core will be assigned a subset of these numbers to process.

  2. Chunking and Parallel Processing:

    • The 100,000 tasks will be divided into 9 chunks, with each core processing approximately 11,111 numbers.

    • For example, Core 1 might handle numbers 1:11111, Core 2 11112:22222, and so on.

    • parLapply splits the input into fixed chunks up front (static scheduling), so each core processes a roughly equal number of elements; for tasks with uneven runtimes, parLapplyLB assigns work dynamically instead, as shown in the sketch after this list.

  3. Results Collection:

    • Each core computes the results for its assigned subset and returns a list of results.

    • Once all cores finish, parLapply will combine these lists into a single result vector (or list) containing all processed values.

  4. Core Utilization:

    • The task distribution makes efficient use of the 9 cores, allowing each core to operate independently on its chunk, thereby reducing the overall runtime compared to sequential processing.
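
To see the chunking concretely, splitIndices() from the parallel package performs the same kind of static split that parLapply uses internally, and parLapplyLB() offers the load-balanced alternative mentioned above. A brief sketch (the 9-worker split mirrors the scenario described in this list):

library(parallel)

# Inspect a static split of 100,000 indices across 9 workers
chunks <- splitIndices(100000, 9)
lengths(chunks)  # roughly 11,111 indices per chunk

cluster <- makeCluster(2)  # a small cluster is enough for a quick test

# Static scheduling: each worker receives its chunk up front
res_static <- parLapply(cluster, 1:100, function(x) x * 2)

# Dynamic load balancing: tasks are handed out as workers become free
res_balanced <- parLapplyLB(cluster, 1:100, function(x) x * 2)

stopCluster(cluster)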



Comparison: for vs. foreach Loops in R#

Choosing between for and foreach depends on the need for parallel processing, speed, and task complexity.

| Feature | for Loop | foreach Loop |
| --- | --- | --- |
| Execution | Sequential, single-core only | Parallel using %dopar%, multi-core capable |
| Performance | Slower for large tasks | Faster for large, independent tasks |
| Syntax | Straightforward | Requires setup but allows flexible result combining |
| Memory Usage | Lower | Higher per core |
| Ideal Use Case | Small, sequential tasks | Large, independent, repetitive tasks |

Conclusion: Use a for loop for simple sequential tasks and foreach with %dopar% for large-scale parallel processing.
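
A quick way to verify this trade-off on your own machine is to time both versions with system.time(). A sketch follows; for per-iteration work as cheap as this, the communication overhead of %dopar% usually makes the parallel version slower, so treat it as a template for heavier workloads rather than a guaranteed speedup:

library(doParallel)
library(foreach)

registerDoParallel(cores = 2)

# Sequential for loop
t_for <- system.time({
  res_for <- numeric(10000)
  for (i in 1:10000) res_for[i] <- sqrt(i)
})

# Parallel foreach; overhead dominates for iterations this cheap
t_foreach <- system.time({
  res_foreach <- foreach(i = 1:10000, .combine = c) %dopar% sqrt(i)
})

print(rbind(t_for, t_foreach))

stopImplicitCluster()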


foreach vs. parLapply#

The choice between foreach and parLapply depends on the operation type, required flexibility, and desired output structure.

Key Differences#

| Feature | parLapply | foreach |
| --- | --- | --- |
| Package | parallel | foreach, doParallel, etc. |
| Setup | Cluster-based; explicit setup required | Backend-based; no explicit cluster setup needed |
| Result Combination | Returns a list; post-processing may be needed | .combine option for direct result control |
| Flexibility | Best for simple, repetitive tasks | Ideal for complex, customizable tasks |
| Error Handling | Limited | More robust error-handling options |

Examples#

Using parLapply:

library(parallel)
cluster <- makeCluster(detectCores() - 1)
result <- parLapply(cluster, 1:100000, function(x) x * 2)
stopCluster(cluster)

Using foreach:

library(doParallel)
library(foreach)
registerDoParallel(cores = detectCores() - 1)
result <- foreach(i = 1:100000, .combine = c) %dopar% {
  i * 2
}
stopImplicitCluster()

Conclusion:

  • Use parLapply for simple parallel tasks with independent iterations.

  • Use foreach for tasks that require customized result aggregation, error handling, or flexible backend support (see the error-handling sketch below).
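
To illustrate the error-handling point, foreach accepts an .errorhandling argument ("stop", "remove", or "pass"). A minimal sketch in which one task deliberately fails:

library(doParallel)
library(foreach)

registerDoParallel(cores = 2)

# With .errorhandling = "pass", a failing task returns its error object
# instead of aborting the entire loop
results <- foreach(i = 1:5, .errorhandling = "pass") %dopar% {
  if (i == 3) stop("simulated failure on task 3")  # deliberate failure
  i * 2
}

str(results)  # element 3 holds a simpleError; the others hold values

stopImplicitCluster()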


Final Thoughts#

Parallel processing in R can significantly improve runtime for data-intensive tasks. Understanding when to use for loops, foreach, parLapply, or parallelly can help you optimize your code’s performance. This tutorial provides a foundation for efficient parallelization in R.

References#

https://www.rdocumentation.org/packages/foreach/versions/1.5.2/topics/foreach

https://cran.r-project.org/web/packages/doParallel/index.html

https://gradientdescending.com/simple-parallel-processing-in-r/

https://davidzeleny.net/wiki/doku.php/recol:parallel

https://dept.stat.lsa.umich.edu/~jerrick/courses/stat701/notes/parallel.html