Most (legacy) software is written for serial computation:

[Figure]

[Figure]

A classification of computer architectures (Flynn, 1972):

[Figure]
There are many R packages for parallelization; see the CRAN Task View on High-Performance and Parallel Computing for an overview. For example:
The foreach package

The foreach package has numerous advantages, including:
- for() loop-like syntax
- support for multiple parallel backends (multicore, parallel, snow, Rmpi, etc.)
- automatic backend selection for your system (typically fork on Linux/Mac and snow on Windows)

Compare a standard for() loop:

x <- vector()
for (i in 1:3) x[i] <- i^2
x
## [1] 1 4 9

with the equivalent foreach() loop:

library(foreach)
x <- foreach(i = 1:3) %do% i^2
x
## [[1]]
## [1] 1
## 
## [[2]]
## [1] 4
## 
## [[3]]
## [1] 9

x is a list with one element for each value of the iterator variable (i). You can also specify a function to combine the outputs with .combine. Let's concatenate the results into a vector with c.
A foreach() loop with .combine:

x <- foreach(i = 1:3, .combine = 'c') %do% i^2
x
## [1] 1 4 9

This tells foreach() to first calculate each iteration, then .combine them with c(...).
A foreach() loop with a matrix-building .combine:

x <- foreach(i = 1:3, .combine = 'rbind') %do% i^2
x
##          [,1]
## result.1    1
## result.2    4
## result.3    9

Building a matrix column by column with .combine = "cbind":

x <- seq(-8, 8, by=0.2)
v <- foreach(y=x, .combine="cbind") %do% {
    r <- sqrt(x^2 + y^2)
    sin(r) / r 
}
persp(x, x, v)
Parallelizing the foreach() loop

So far we've used %do%, which runs on a single processor.
To run in parallel you must first register a parallel backend with one of the do* packages (doParallel, doMC, doSNOW, etc.). On most multicore systems, the easiest backend is typically doParallel: on Linux and Mac it uses the fork system call, while on Windows it uses a snow backend, and the nice thing is that it chooses the appropriate mechanism automatically for your system.
library(doParallel)
registerDoParallel(3)   # register specified number of workers
# registerDoParallel()  # or, reserve all available cores
getDoParWorkers() # check registered cores
## [1] 3

A parallel foreach() loop

To run in parallel, simply change the %do% to %dopar%. Wasn't that easy?

x <- foreach(i = 1:3, .combine = 'c') %dopar% i^2
x
## [1] 1 4 9

Parallelization only pays off when each task is substantial. In one timing comparison, a trivial loop took 0.002 sec with %do% but 0.011 sec with %dopar%, while a heavier loop dropped from 9.019 sec sequentially to 3.018 sec in parallel. Here is a larger example:

# Example task: simulate a large number of random normal samples and calculate their means
num_simulations <- 1000
sample_size <- 1e6  # Size of each random sample
# Sequential foreach loop
library(tictoc)  # provides the tic()/toc() timers
tic()
results <- foreach(i = 1:num_simulations, .combine = 'c') %do% {
  # Generate a random sample and calculate the mean
  sample_data <- rnorm(sample_size, mean = 0, sd = 1)
  mean(sample_data)
}
toc()
## 31.091 sec elapsed

# Parallel foreach loop
tic()
results <- foreach(i = 1:num_simulations, .combine = 'c') %dopar% {
  # Generate a random sample and calculate the mean
  sample_data <- rnorm(sample_size, mean = 0, sd = 1)
  mean(sample_data)
}
toc()
## 10.672 sec elapsed

An example from the foreach vignette: nested loops
avec = 1:3
bvec = 1:4
sim <- function(a, b)  # example function
  10 * a + b ^ 2
# use a standard nested for() loop:
x <- matrix(0, length(avec), length(bvec))
for (j in 1:length(bvec)) {
  for (i in 1:length(avec)) {
    x[i, j] <- sim(avec[i], bvec[j])
  }
}
x
##      [,1] [,2] [,3] [,4]
## [1,]   11   14   19   26
## [2,]   21   24   29   36
## [3,]   31   34   39   46

The same computation with nested foreach() loops, using the %:% nesting operator:

x <- foreach(b = bvec, .combine = 'cbind') %:%
  foreach(a = avec, .combine = 'c') %do% {
    sim(a, b)
  }
x
##      result.1 result.2 result.3 result.4
## [1,]       11       14       19       26
## [2,]       21       24       29       36
## [3,]       31       34       39       46

Again, simply change %do% to %dopar% to execute in parallel.
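For concreteness, the parallel version of the nested loop changes only the operator (a minimal sketch, assuming a backend is registered as above):

x <- foreach(b = bvec, .combine = 'cbind') %:%
  foreach(a = avec, .combine = 'c') %dopar% {
    sim(a, b)   # each (a, b) pair becomes a task for the workers
  }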
Message Passing Interface (MPI): a specification for an API for passing messages between different computers.
See here for details on using MPI on UB's High Performance Computing Cluster.
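foreach can also run on top of MPI via the doMPI package. The following is a minimal sketch, assuming doMPI and Rmpi are installed on the cluster and the script is launched through the MPI launcher (e.g., mpirun -n 4 Rscript script.R):

library(doMPI)
cl <- startMPIcluster()   # workers are provided by the MPI launcher
registerDoMPI(cl)         # make %dopar% use the MPI backend
res <- foreach(i = 1:8, .combine = 'c') %dopar% sqrt(i)
closeCluster(cl)          # shut down the MPI workers
mpi.quit()                # exit MPI cleanly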
Most parallel computing boils down to three steps:
- split the problem into pieces (e.g., the iterator i = 1:3)
- apply the operation to each piece in parallel (%dopar%)
- combine the results (.combine)

Other useful foreach parameters:
- .inorder (TRUE/FALSE): should results be combined in the same order that they were submitted?
- .errorhandling (stop/remove/pass): what to do when a task throws an error
- .packages: packages to be made available to sub-processes
- .export: variables to export to sub-processes

These parameters are illustrated in the sketch below.
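A minimal sketch (not from the original notes) combining these parameters; run_tasks() and myData are hypothetical names. Calling foreach() inside a function is where .export matters, since objects from the global environment are not always found automatically:

myData <- data.frame(x = rnorm(100))  # hypothetical input data

run_tasks <- function() {
  foreach(i = 1:4,
          .combine       = 'c',
          .inorder       = TRUE,      # combine results in submission order
          .errorhandling = 'remove',  # drop results from tasks that error
          .packages      = 'stats',   # load 'stats' on each worker
          .export        = 'myData'   # ship 'myData' to each worker
          ) %dopar% {
    sd(myData$x) + i                  # uses the exported object
  }
}
run_tasks()

multidplyr: parallel dplyr

# Load necessary libraries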
library(multidplyr)
library(dplyr)
# Create a sample data frame
set.seed(42)
data <- tibble(
  group = rep(1:4, each = 100000),
  value = rnorm(400000)
) %>% 
  group_by(group)
# Start the cluster with 4 cores
cluster <- new_cluster(4)
# Copy the data to each worker in the cluster
cluster_copy(cluster, "data")

Using dplyr to apply an operation in series:

result1 <- data %>%
  mutate(
    mean_value = mean(value),   # Calculate mean of 'value' within each group
    value_squared = value^2     # Square the 'value' column
  )

Using multidplyr to apply the same dplyr operation in parallel:

result2 <- data %>%
  partition(cluster = cluster) %>%   # Partition the data into groups for parallel processing
  mutate(
    mean_value = mean(value),   # Calculate mean of 'value' within each group
    value_squared = value^2     # Square the 'value' column
  ) %>%
  collect()  # Combine the results from each partition
identical(result1, result2)
## [1] TRUE
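Note that the cluster workers are fresh R sessions: if the expression inside mutate() used functions from another package, that package would need to be loaded on every worker first. multidplyr provides cluster_library() for this (shown here with a hypothetical example package):

cluster_library(cluster, "lubridate")  # load a package on every worker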
Parallelized raster functions

Some functions in the raster and terra packages are also easy to parallelize.

library(terra)
ncores <- 2   # number of cores to use
# Define the function to work with vectors
fn <- function(x) {
  # Apply the calculation for each value in the vector 'x'
  sapply(x, function(xi) mean(rnorm(1000, mean = xi, sd = abs(xi)) > 3))
}
r <- rast(nrows=1e3, ncol=1e3)  # make an empty raster
values(r) <- rnorm(ncell(r))    # fill it with random numbers

One way to produce timings like those below is terra's app(), which applies a function over the cells of a raster and takes a cores argument for parallel execution (the exact call used in the original notes was lost in formatting):

tic()
out1 <- app(r, fn)                  # sequential
toc()
## 35.559 sec elapsed

tic()
out2 <- app(r, fn, cores = ncores)  # parallel, using 2 workers
toc()
## 23.406 sec elapsed

Each task should involve computationally-intensive work: if the tasks are very small, it can take longer to run in parallel than sequentially.
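To see that overhead directly, compare a tiny task under %do% and %dopar% (a minimal sketch, assuming the doParallel backend registered earlier is still active):

tic()
small1 <- foreach(i = 1:1000, .combine = 'c') %do% i^2     # trivial work per task
toc()

tic()
small2 <- foreach(i = 1:1000, .combine = 'c') %dopar% i^2  # same work, plus communication overhead
toc()

# The %dopar% version is typically slower here: every tiny task pays a
# fixed cost to ship inputs and outputs between the master and the workers.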
 