Most (legacy) software is written for serial computation:
(Figure: serial computation)

(Figure: a classification of computer architectures, Flynn 1972)
There are many R packages for parallelization; check out the CRAN Task View on High-Performance and Parallel Computing for an overview. For example:
ForEach Package

The foreach package has numerous advantages, including:

- a for()-loop-like syntax
- support for multiple parallel backends (multicore, parallel, snow, Rmpi, etc.)
- automatic choice of backend for the system (fork on linux/mac and snow on windows)
for() loops
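A minimal sketch of the kind of for() loop that would produce the output below (assuming it squares the integers 1 to 3):

x <- vector("numeric", 3)   # pre-allocate the output vector
for (i in 1:3) {
  x[i] <- i^2               # square each value of i
}
x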
## [1] 1 4 9
foreach() loop
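For comparison, a sketch of the foreach() version, assuming the same i^2 computation; note that foreach() returns a list by default:

library(foreach)
x <- foreach(i = 1:3) %do% i^2   # returns a list by default
x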
## [[1]]
## [1] 1
##
## [[2]]
## [1] 4
##
## [[3]]
## [1] 9
x is a list with one element for each value of the iterator variable (i). You can also specify a function to combine the outputs with .combine. Let's concatenate the results into a vector with c.
foreach() loop with .combine
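A sketch of such a call, assuming the same i^2 computation with .combine = 'c':

x <- foreach(i = 1:3, .combine = 'c') %do% i^2   # combine the results with c()
x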
## [1] 1 4 9
The .combine = 'c' argument tells foreach() to first calculate each iteration, then combine the results with c(...).
foreach() loop with .combine
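The row-named matrix below is what you get when the results are bound row-wise; a sketch assuming .combine = 'rbind':

x <- foreach(i = 1:3, .combine = 'rbind') %do% i^2   # bind the results as rows
x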
## [,1]
## result.1 1
## result.2 4
## result.3 9
Parallel foreach() loop

So far we've used %do%, which uses a single processor.
You must first register a parallel backend with one of the do* packages. On most multicore systems, the easiest backend is typically doParallel. On linux and mac it uses the fork system call, and on Windows machines it uses a snow backend. The nice thing is that it chooses automatically for the system.
library(doParallel)    # provides registerDoParallel(); also loads foreach
registerDoParallel(3)  # register specified number of workers
# registerDoParallel() # or, reserve all available cores
getDoParWorkers()      # check registered cores
## [1] 3
Parallel foreach() loop

To run in parallel, simply change the %do% to %dopar%. Wasn't that easy?
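A sketch of the parallel version of the same loop, assuming the i^2 example again:

x <- foreach(i = 1:3, .combine = 'c') %dopar% i^2   # now runs on the registered workers
x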
## [1] 1 4 9
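The four timings that follow appear to compare a small task and a slow task under %do% and %dopar%. A hedged sketch of such a comparison, using tictoc for timing and Sys.sleep as a stand-in for a slow task (these exact calls are assumptions, not the original code):

library(tictoc)

# Trivial task: the overhead of sending work to the workers dominates
tic(); x <- foreach(i = 1:3, .combine = 'c') %do% i^2; toc()
tic(); x <- foreach(i = 1:3, .combine = 'c') %dopar% i^2; toc()

# Slow task (~3 seconds per iteration): three workers give roughly a 3x speedup
tic(); tmp <- foreach(i = 1:3) %do% Sys.sleep(3); toc()
tic(); tmp <- foreach(i = 1:3) %dopar% Sys.sleep(3); toc()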
## 0.002 sec elapsed
## 0.011 sec elapsed
## 9.019 sec elapsed
## 3.018 sec elapsed
# Example task: Simulate a large number of random normal distributions and calculate their means
num_simulations <- 1000
sample_size <- 1e6 # Size of each random sample
# sequential foreach loop
tic()
results <- foreach(i = 1:num_simulations, .combine = 'c') %do% {
  # Generate a random sample and calculate the mean
  sample_data <- rnorm(sample_size, mean = 0, sd = 1)
  mean(sample_data)
}
toc()
## 31.091 sec elapsed
# Parallel foreach loop
tic()
results <- foreach(i = 1:num_simulations, .combine = 'c') %dopar% {
  # Generate a random sample and calculate the mean
  sample_data <- rnorm(sample_size, mean = 0, sd = 1)
  mean(sample_data)
}
toc()
## 10.672 sec elapsed
Example from the foreach vignette
avec = 1:3
bvec = 1:4
sim <- function(a, b)   # example function
  10 * a + b ^ 2
# use a standard nested for() loop:
x <- matrix(0, length(avec), length(bvec))
for (j in 1:length(bvec)) {
  for (i in 1:length(avec)) {
    x[i, j] <- sim(avec[i], bvec[j])
  }
}
x
## [,1] [,2] [,3] [,4]
## [1,] 11 14 19 26
## [2,] 21 24 29 36
## [3,] 31 34 39 46
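The result.1 through result.4 columns below match the nested foreach() version from the vignette, which uses the %:% nesting operator with .combine = 'cbind'; a sketch of that call:

x <- foreach(b = bvec, .combine = 'cbind') %:%
  foreach(a = avec, .combine = 'c') %do% {
    sim(a, b)
  }
x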
## result.1 result.2 result.3 result.4
## [1,] 11 14 19 26
## [2,] 21 24 29 36
## [3,] 31 34 39 46
Again, simply change %do% to %dopar% to execute in parallel.
Message Passing Interface (MPI): a specification for an API for passing messages between different computers.
See here for details on using MPI on UB’s High Performance Computer Cluster.
Most parallel computing boils down to three pieces:

- an iterator (i = 1:3)
- a parallel operator (%dopar%)
- a way to combine the results (.combine)

Other foreach parameters:

- .inorder (true/false): should results be combined in the same order that they were submitted?
- .errorhandling (stop/remove/pass): what to do when a task throws an error
- .packages: packages to be made available to sub-processes
- .export: variables to export to sub-processes
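A hypothetical example (not from the original material) showing several of these parameters together; the variable my_offset and the toy computation are made up for illustration:

my_offset <- 10   # a local variable the workers will need

results <- foreach(i = 1:3, .combine = 'c',
                   .inorder = TRUE,           # keep results in submission order
                   .errorhandling = 'pass',   # return errors instead of stopping
                   .packages = 'stats',       # attach this package on each worker
                   .export = 'my_offset') %dopar% {
  sd(rnorm(100)) + my_offset   # toy computation using the exported variable
}
results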
# Load necessary libraries
library(multidplyr)
library(dplyr)
# Create a sample data frame
set.seed(42)
data <- tibble(
  group = rep(1:4, each = 100000),
  value = rnorm(400000)
) %>%
  group_by(group)
# Start the cluster with 4 cores
cluster <- new_cluster(4)
# Copy the data frame to each node in the cluster
cluster_copy(cluster, "data")
Use dplyr to apply an operation in series, then multidplyr to apply the same dplyr operation in parallel.
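A sketch of the serial dplyr version (result1), inferred from the parallel version and the identical() check below:

result1 <- data %>%
  mutate(
    mean_value = mean(value),   # mean of 'value' within each group
    value_squared = value^2     # square the 'value' column
  )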
result2 <- data %>%
  partition(cluster = cluster) %>%  # Partition the data into groups for parallel processing
  mutate(
    mean_value = mean(value),       # Calculate mean of 'value' within each group
    value_squared = value^2         # Square the 'value' column
  ) %>%
  collect()                         # Combine the results from each partition
identical(result1,result2)
## [1] TRUE
Some functions in the terra (and raster) packages are also easy to parallelize.
library(terra)
ncores <- 2   # number of worker cores to use

# Define the function to work with vectors
fn <- function(x) {
  # Apply the calculation for each value in the vector 'x'
  sapply(x, function(xi) mean(rnorm(1000, mean = xi, sd = abs(xi)) > 3))
}

r <- rast(nrows = 1e3, ncols = 1e3)   # make an empty raster
values(r) <- rnorm(ncell(r))          # fill it with random numbers
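A sketch of the comparison the timings below presumably refer to, using terra::app() and its cores argument (an assumption about how the raster was processed):

tic()
out_serial <- app(r, fn)                    # apply fn to every cell on one core
toc()

tic()
out_parallel <- app(r, fn, cores = ncores)  # same computation spread over ncores workers
toc()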
## 35.559 sec elapsed
## 23.406 sec elapsed
Each task should involve computationally intensive work. If the tasks are very small, the overhead of sending data to and from the workers can make the parallel version slower.