quick()
can accelerate any R function, with some restrictions:
- Function arguments must have their types and shapes declared using
declare()
.- Only atomic vectors, matrices, and array are currently supported:
integer
,double
,logical
, andcomplex
.- The return value must be an atomic array (e.g., not a list)
- Named variables must have consistent shapes throughout their lifetimes.
NA
values are not supported.- Only a subset of R’s vocabulary is currently supported.
The author finds that such R function can be sped up 50-200 times by translating them to Fortran. Here is an example of what the transpiled code looks like. Running the R script
library(quickr)
# R function to be transpiled
slow_viterbi <- function(observations, states, initial_probs, transition_probs, emission_probs) {
declare(
type(observations = integer(num_steps)),
type(states = integer(num_states)),
type(initial_probs = double(num_states)),
type(transition_probs = double(num_states, num_states)),
type(emission_probs = double(num_states, num_obs)),
)
trellis <- matrix(0, nrow = length(states), ncol = length(observations))
backpointer <- matrix(0L, nrow = length(states), ncol = length(observations))
trellis[, 1] <- initial_probs * emission_probs[, observations[1]]
for (step in 2:length(observations)) {
for (current_state in 1:length(states)) {
probabilities <- trellis[, step - 1] * transition_probs[, current_state]
trellis[current_state, step] <- max(probabilities) * emission_probs[current_state, observations[step]]
backpointer[current_state, step] <- which.max(probabilities)
}
}
path <- integer(length(observations))
path[length(observations)] <- which.max(trellis[, length(observations)])
for (step in seq(length(observations) - 1, 1)) {
path[step] <- backpointer[path[step + 1], step + 1]
}
out <- states[path]
out
}
cat(quickr:::r2f((slow_viterbi)), "\n") # print the Fortran code
gives a Fortran subroutine
subroutine anonymous_function(observations, states, initial_probs, transition_probs, emission_probs, out, emission_probs__dim_2_, &
observations__len_, states__len_) bind(c)
use iso_c_binding, only: c_double, c_int, c_ptrdiff_t
implicit none
! manifest start
! sizes
integer(c_ptrdiff_t), intent(in), value :: observations__len_
integer(c_ptrdiff_t), intent(in), value :: states__len_
integer(c_int), intent(in), value :: emission_probs__dim_2_
! args
integer(c_int), intent(in) :: observations(observations__len_)
integer(c_int), intent(in) :: states(states__len_)
real(c_double), intent(in) :: initial_probs(states__len_)
real(c_double), intent(in) :: transition_probs(states__len_, states__len_)
real(c_double), intent(in) :: emission_probs(states__len_, emission_probs__dim_2_)
integer(c_int), intent(out) :: out(observations__len_)
! locals
integer(c_int) :: current_state
integer(c_int) :: step
integer(c_int) :: backpointer(states__len_, observations__len_)
real(c_double) :: trellis(states__len_, observations__len_)
integer(c_int) :: path(observations__len_)
real(c_double) :: probabilities(states__len_)
! manifest end
trellis = 0.0_c_double
backpointer = 0_c_int
trellis(:, 1_c_int) = (initial_probs * emission_probs(:, observations(1_c_int)))
do step = 2_c_int, size(observations), sign(1, size(observations)-2_c_int)
do current_state = 1_c_int, size(states), sign(1, size(states)-1_c_int)
probabilities = (trellis(:, (step - 1_c_int)) * transition_probs(:, current_state))
trellis(current_state, step) = (maxval(probabilities) * emission_probs(current_state, observations(step)))
backpointer(current_state, step) = maxloc(probabilities, 1)
end do
end do
path = 0
path(size(observations)) = maxloc(trellis(:, size(observations)), 1)
do step = (size(observations) - 1_c_int), 1_c_int, sign(1, 1_c_int-(size(observations) - 1_c_int))
path(step) = backpointer(path((step + 1_c_int)), (step + 1_c_int))
end do
out = states(path)
end subroutine
that the author says is about 50 times faster. There is a vast amount of R code out there, with base R having much functionality, and 22,000 packages on CRAN. It would be great if a large fraction could be automatically translated to Fortran, although R functions typically do not have a declare
block as the function above does. R and Fortran have some similarities, both supporting 1-based multidimensional arrays with operations on whole arrays and array sections.
Another package of the author is
This R package provides
.ModernFortran()
, an interface similar to.Fortran()
but for Fortran 2018.The 2018 Fortran language standard expanded support for interfacing C and Fortran. One of the additions is the introduction of C descriptors, a data structure for passing arrays between C and Fortran. This package takes advantage of that.
In contrast with
.Fortran
, R arrays are not passed as naked pointers, but as C descriptors that contain information about rank, shape, element size, type, and memory stride of the array that the Fortran routine can access directly. This means that additional arguments for passing the size or rank of arrays are no longer needed, which should lead to cleaner, simpler Fortran code. Additionally, logical and raw types are now supported directly.