#### Renjin: put some Java in your R

##### Quick look at using R with some Java 'under the hood'

`R`

`data science`

`statistics`

`computing`

R is a great, highly flexible language for statistical computing, but it can suffer from performance issues. As I’ve steadily increased my use of R, I became aware that I would one day have to learn to integrate R with a faster programming language, the main choice here being C++. To integrate R with C++, the `Rcpp` framework (and R package) was created, allowing parts of the R code of a given package or project to be rewritten in C++ and easily integrated with R. Using `Rcpp` comes with great advantages in terms of R code performance; however, it obviously requires that one learn C++. I was about to devote a great deal of time to doing so when, fortuitously, I came across the rather new `renjin` project. Renjin is a new (in-development) interpreter for the R language that runs on the Java Virtual Machine (JVM) to enhance R’s performance. The idea seems to be that it can eventually serve as a drop-in replacement for GNU R. It seems that the `renjin` R package can be used to provide performance gains via interfacing with the JVM, just by wrapping standard R code.

## Minimal example

For now, I just thought I would try the example from the `renjin` R package documentation; more involved examples might be added to this post later or come in separate blog posts of their own. Here we go:

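Let’s make sure the `renjin` package is installed (version 0.8.2404, the newest at the time of writing):

```
if (!require(renjin)) {
  # Install the source tarball directly from its URL rather than from CRAN
  install.packages(
    "https://nexus.bedatadriven.com/content/groups/public/org/renjin/renjin-gnur-package/0.8.2404/renjin-gnur-package-0.8.2404.tar.gz",
    repos = NULL,
    type = "source"
  )
}
```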

`## Loading required package: renjin`

`library(renjin)`

Let’s define a function to simply add by iteration:

```
bigsum <- function(n) {
  sum <- 0
  for (i in seq(from = 1, to = n)) {
    sum <- sum + i
  }
  sum
}
```
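Since `bigsum(n)` just adds up 1 through `n`, its result can be checked against the closed form `n * (n + 1) / 2`. The following is a minimal sanity-check sketch; the names `n_check` and `closed_form` are illustrative and not part of the original example.

```r
# Same loop as in the post, reproduced so this block runs on its own.
bigsum <- function(n) {
  sum <- 0
  for (i in seq(from = 1, to = n)) {
    sum <- sum + i
  }
  sum
}

n_check <- 1e5                                # illustrative size, smaller than the 1e7 timed later
closed_form <- n_check * (n_check + 1) / 2    # 1 + 2 + ... + n

# The loop and the closed form should agree.
stopifnot(isTRUE(all.equal(bigsum(n_check), closed_form)))
```

A check like this is handy before benchmarking, since a faster variant is only interesting if it still returns the same answer.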

We can improve the speed of this function by pre-compiling it to bytecode using R’s native bytecode compiler. We’d expect this to save us some time relative to the naive implementation.

`bigsumc <- compiler::cmpfun(bigsum) # GNU R's byte code compiler`
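One caveat: GNU R ships with a just-in-time (JIT) compiler which, from R 3.4.0 onward, is enabled by default and byte-compiles functions automatically, so an explicit `cmpfun` call may add little on a recent installation. The sketch below queries the current JIT level and confirms that the compiled and uncompiled functions agree; the variable name `jit_level` is illustrative.

```r
library(compiler)

# A negative argument queries the current JIT level without changing it
# (0 means the JIT is off; 1 to 3 enable increasingly broad compilation).
jit_level <- enableJIT(-1)
print(jit_level)

# Same function as in the post, reproduced so this block runs on its own.
bigsum <- function(n) {
  sum <- 0
  for (i in seq(from = 1, to = n)) {
    sum <- sum + i
  }
  sum
}
bigsumc <- cmpfun(bigsum)

# Compilation should change only how the function is evaluated, not its result.
stopifnot(identical(bigsum(1000), bigsumc(1000)))
```

If the reported level is greater than 0, the "naive" function is likely to be byte-compiled behind the scenes after its first uses, which narrows any gap between the two versions.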

Alright, now we’re ready to compare the performances of the naive and bytecode-compiled implementations:

```
time_norm <- system.time(bigsum(1e7))
time_comp <- system.time(bigsumc(1e7))
```
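A single `system.time` call can be noisy for computations that take well under a second. As a rough sketch (not part of the original example), each version could be timed several times and the median elapsed time compared; `median_elapsed` is a hypothetical helper introduced here, and the smaller input size is an assumption chosen to keep the sketch quick.

```r
# Functions from the post, reproduced so this block runs on its own.
bigsum <- function(n) {
  sum <- 0
  for (i in seq(from = 1, to = n)) {
    sum <- sum + i
  }
  sum
}
bigsumc <- compiler::cmpfun(bigsum)

# Hypothetical helper: run f(n) `reps` times and return the median
# elapsed (wall-clock) time in seconds.
median_elapsed <- function(f, n, reps = 5) {
  elapsed <- vapply(
    seq_len(reps),
    function(k) system.time(f(n))[["elapsed"]],
    numeric(1)
  )
  median(elapsed)
}

n_small <- 1e6  # smaller than the 1e7 used in the post
repeated <- c(
  naive  = median_elapsed(bigsum, n_small),
  cmpfun = median_elapsed(bigsumc, n_small)
)
print(repeated)
```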

Notice that, in this run, directly using R’s native bytecode compiler does not noticeably improve the performance of our `bigsum` function: considering the time the system spends on the computation, we save about 0 seconds, (roughly) a factor of 1, so the explicit compilation step buys us essentially nothing here. Maybe `renjin` can help us out more?

`time_renjin <- system.time(renjin(bigsum(1e7)))`

`print(table)`

```
## user system total
## naive 0.422 0.014 0.441
## cmpfun 0.430 0.008 0.444
## renjin 0.430 0.024 0.241
```
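The post does not show how the object passed to `print` was built. One way such a table could be assembled from `system.time` results is sketched below, assuming the `total` column corresponds to the elapsed time; `timings`, `timing_table` and `speedup` are illustrative names, and the `renjin` row is left out so the block runs without the package installed.

```r
# Functions from the post, reproduced so this block runs on its own.
bigsum <- function(n) {
  sum <- 0
  for (i in seq(from = 1, to = n)) {
    sum <- sum + i
  }
  sum
}
bigsumc <- compiler::cmpfun(bigsum)

n_small <- 1e6  # smaller than the 1e7 used in the post
time_norm <- system.time(bigsum(n_small))
time_comp <- system.time(bigsumc(n_small))

# With the renjin package installed, a third entry such as
# `renjin = time_renjin` could be added to this list.
timings <- list(naive = time_norm, cmpfun = time_comp)

# Keep the first three fields of each result: user CPU, system CPU, elapsed.
timing_table <- t(vapply(timings, function(tm) as.numeric(tm)[1:3], numeric(3)))
colnames(timing_table) <- c("user", "system", "total")
print(timing_table)

# Speed-up of each row relative to the naive run, based on the total column.
speedup <- timing_table["naive", "total"] / timing_table[, "total"]
print(round(speedup, 2))
```

Dividing by the naive row makes the "factor of" comparisons explicit rather than leaving them to be read off the raw timings.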

Wow. For so little effort, the gain here is striking! Using `renjin`, even just as a wrapper, cuts the total time from about 0.44 to 0.24 seconds, **a factor of roughly 2** (0.441 / 0.241 ≈ 1.8), relative to both the naive implementation and the bytecode-compiled version of our function. The user CPU time, on the other hand, is essentially unchanged at about 0.43 seconds. This was just a simple example, timed only once, but we were able to save a good share of the running time just by naively calling `renjin`, and it took just a few extra characters to call it as a wrapper.

Although Renjin is still in its infancy, I can’t help but be excited for the future of R, and statistical computing in general, given how well it’s already performing. We’re going to be able to (try to) do great things with these new tools ✨
