A Future for batchtools
Henrik Bengtsson
TL;DR
Here is an example of how to evaluate an R expression on a Slurm high-performance compute (HPC) cluster.
library(future)
# Limit runtime to 10 minutes and memory to 400 MiB per future,
# request a parallel environment with four slots on a single host.
# On this system, R is available via environment module 'r'. By
# specifying 'r/4.5.1', 'module load r/4.5.1' will be added to
# the submitted job script.
plan(future.batchtools::batchtools_slurm, resources = list(
  time = "00:10:00", mem = "400M", nodes = 1, ntasks = 4,
  modules = c("r/4.5.1")
))
# Give it a spin
f <- future({
  data.frame(
    hostname = Sys.info()[["nodename"]],
    os = Sys.info()[["sysname"]],
    cores = unname(parallelly::availableCores()),
    modules = Sys.getenv("LOADEDMODULES")
  )
})
info <- value(f)
print(info)
#>   hostname    os cores modules
#> 1      n12 Linux     4 r/4.5.1
Introduction
The future package provides a generic API for using futures in R. A future is a simple yet powerful mechanism to evaluate an R expression and retrieve its value at some point in time. Futures can be resolved in many different ways depending on which strategy is used. There are various types of synchronous and asynchronous futures to choose from in the future package.
This package, future.batchtools, provides a type of futures that utilizes the batchtools package. This means that any type of backend that the batchtools package supports can be used to resolve futures. More specifically, future.batchtools allows you, or users of your package, to leverage the compute power of high-performance computing (HPC) clusters via a simple switch in settings, without having to change any code at all.
For instance, if batchtools is properly configured, the two futures below will be submitted as two separate jobs, which the scheduler may run on two different compute nodes:
library(future)
plan(future.batchtools::batchtools_slurm)
f_x <- future({ Sys.sleep(5); 3.14 })
f_y <- future({ Sys.sleep(5); 2.71 })
x <- value(f_x)
y <- value(f_y)
x + y
#> [1] 5.85
This is just a toy example to illustrate what futures look like and how to work with them.
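On a shared cluster a job may wait in the queue before it runs, and `value()` blocks until the result is available. A minimal sketch of checking on a future without blocking, using `resolved()` from the future package, could look as follows. It assumes that future.batchtools is installed and uses `batchtools_local` only as a stand-in so that the example runs without a job scheduler; the polling interval is an arbitrary choice.

```r
library(future)

# Assumption: 'batchtools_local' stands in for a scheduler backend such as
# 'batchtools_slurm', so that the sketch runs on a single machine.
plan(future.batchtools::batchtools_local)

f <- future({ Sys.sleep(1); 42 })

# Poll until the future is resolved; other work could be done in this loop.
while (!resolved(f)) {
  Sys.sleep(0.5)  # arbitrary polling interval
}

# The future is resolved, so value() returns without further waiting.
v <- value(f)

# Reset to the default sequential backend.
plan(sequential)
```

With a scheduler backend, the same loop would keep polling while the job is queued or running.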
For an introduction as well as full details on how to use futures, please see https://www.futureverse.org or consult the package vignettes of the future package.
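To illustrate that switching backend requires no changes to the code that creates the futures, here is a small sketch. The helper `sum_of_squares()` is hypothetical and exists only for this illustration; it never refers to a backend. The sketch assumes that future.batchtools is installed and uses `batchtools_local` so that it can be run without an HPC scheduler.

```r
library(future)

# Hypothetical helper (not part of any package): creates one future per
# element and collects the values. It does not mention any backend.
sum_of_squares <- function(xs) {
  fs <- lapply(xs, function(x) future(x^2))
  sum(vapply(fs, value, numeric(1)))
}

# Resolve the futures sequentially in the current R session.
plan(sequential)
res_seq <- sum_of_squares(1:3)

# Resolve the same futures via batchtools, without touching the helper.
plan(future.batchtools::batchtools_local)
res_bt <- sum_of_squares(1:3)

# Reset to the default sequential backend.
plan(sequential)

print(c(sequential = res_seq, batchtools = res_bt))
```

On a cluster, only the `plan()` call would change, for example to `plan(future.batchtools::batchtools_slurm)`.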
Demos
The future package provides a demo using futures for calculating a set of Mandelbrot planes. The demo does not assume anything about what type of futures are used. The user has full control of how futures are evaluated. For instance, to use local batchtools futures, run the demo as:
library(future)
plan(future.batchtools::batchtools_local)
demo("mandelbrot", package = "future", ask = FALSE)
Available batchtools backends
The future.batchtools package implements a generic future wrapper for all batchtools backends. Below are the most common types of batchtools backends. For other types of parallel and distributed backends, please see https://www.futureverse.org/backends.html.
Backend | Description | Alternative in future package |
---|---|---|
batchtools_lsf | Futures are evaluated via a Load Sharing Facility (LSF) job scheduler | N/A |
batchtools_openlava | Futures are evaluated via an OpenLava job scheduler | N/A |
batchtools_sge | Futures are evaluated via a Sun/Son of/Oracle/Univa/Altair Grid Engine (SGE) job scheduler | N/A |
batchtools_slurm | Futures are evaluated via a Slurm job scheduler | N/A |
batchtools_torque | Futures are evaluated via a TORQUE / PBS job scheduler | N/A |
batchtools_custom | Futures are evaluated via a custom batchtools configuration R script or via a set of cluster functions | N/A |
batchtools_multicore | parallel evaluation by forking the current R process | plan(multicore) |
batchtools_local | sequential evaluation in a separate R process (on current machine) | plan(cluster, workers = I(1)) |
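The last row of the table can be tried out without access to a job scheduler. The following is a minimal sketch, assuming that future.batchtools is installed and that a local background R worker can be launched on the current machine. It compares the process ID of the main R session with the process ID seen by a future under `batchtools_local` and under the alternative from the future package listed in the table.

```r
library(future)

# Process ID of the main R session.
pid_main <- Sys.getpid()

# batchtools_local: the future is evaluated in a separate R process.
plan(future.batchtools::batchtools_local)
pid_bt <- value(future(Sys.getpid()))

# Alternative from the future package, as listed in the table.
plan(cluster, workers = I(1))
pid_cl <- value(future(Sys.getpid()))

# Reset to the default sequential backend, which also shuts down the worker.
plan(sequential)

print(c(main = pid_main, batchtools_local = pid_bt, cluster = pid_cl))
```

Both futures should report a process ID that differs from that of the main session, since in both cases the expression is evaluated outside of it.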