Trick or tips 004 {R}

August 13, 2019

Trick or Tips

Ever stumbled on a code chunk that made you say: "I should have known this f_ piece of code long ago!" Chances are you have, frustratingly, just like we have, and on multiple occasions too. In comes Trick or Tips!

Trick or Tips is a series of blog posts that each present 5 -- hopefully helpful -- coding tips for a specific programming language. Each tip should be short (i.e. no more than 5 lines of code, max 80 characters per line, except when appropriate) and tips can be of many kinds: a function, a way of combining functions, a single argument, a note about the philosophy of the language and its practical consequences, tricks to improve the way you code, good practices, etc.

Note that while some tips might be obvious to careful documentation readers (God bless them for their wisdom), we do our best to present what we find very useful and underrated. By the way, there are undoubtedly similar initiatives on the web (e.g. the "One R Tip a Day" Twitter account). Last, feel free to share tip ideas, or a post of code tips of your own, in the comments below; we will be happy to incorporate them into our next post.

Enjoy and get ready to frustratingly appreciate our tips!

Subset an array with a matrix

Let’s consider two arrays of letters: the first has two dimensions (i.e. a matrix) and the second one has 3.

(arr2 <- array(LETTERS[1:9], dim = c(3,3)))
#R>       [,1] [,2] [,3]
#R>  [1,] "A"  "D"  "G" 
#R>  [2,] "B"  "E"  "H" 
#R>  [3,] "C"  "F"  "I"
class(arr2)
#R>  [1] "matrix" "array"
(arr3 <- array(LETTERS[1:18], dim = c(3, 3, 2)))
#R>  , , 1
#R>  
#R>       [,1] [,2] [,3]
#R>  [1,] "A"  "D"  "G" 
#R>  [2,] "B"  "E"  "H" 
#R>  [3,] "C"  "F"  "I" 
#R>  
#R>  , , 2
#R>  
#R>       [,1] [,2] [,3]
#R>  [1,] "J"  "M"  "P" 
#R>  [2,] "K"  "N"  "Q" 
#R>  [3,] "L"  "O"  "R"
class(arr3)
#R>  [1] "array"

Say you need to subset a specific set of values based on the position of the elements. To subset a single element, say "G", there are a couple of options, but I guess the most common approach is to use [ with one value per dimension:

arr2[1,3]
#R>  [1] "G"
arr3[1,3,1]
#R>  [1] "G"

or with a single value giving the position of the element:

arr2[7]
#R>  [1] "G"
arr3[7]
#R>  [1] "G"

Now consider the case where you have a vector of positions (one value per dimension of the array). In this case, beware of the orientation of the vector!

# with the line below, we get the 1st and 3rd elements because a plain vector is read as linear positions
arr2[c(1,3)]
#R>  [1] "A" "C"
# whereas with a row vector (a 1 x 2 matrix), we obtain the element of the 1st row and the 3rd column
arr2[t(c(1,3))]
#R>  [1] "G"
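
To see what "orientation" means here, the sketch below inspects the dimensions of the index itself. It relies only on base R; the names `v`, `row_idx` and `col_idx` are ours and purely illustrative. As far as we understand the rule documented in `?Extract`, an index is read as coordinates only when it is a matrix with one column per dimension of the array.

```r
arr2 <- array(LETTERS[1:9], dim = c(3, 3))
v <- c(1, 3)

dim(v)     # NULL: a plain vector has no orientation
dim(t(v))  # 1 2: t() turns it into a 1 x 2 matrix

# One column per dimension of `arr2`: read as (row, column)
row_idx <- arr2[matrix(v, nrow = 1)]
row_idx    # "G"

# A 2 x 1 matrix does not have one column per dimension,
# so it is read as linear positions, like the plain vector
col_idx <- arr2[matrix(v, ncol = 1)]
col_idx    # "A" "C"
```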

And for more than one element, you need to use a matrix with one row per element to be subset:

(mat <- rbind(c(1,3), c(2,2)))
#R>       [,1] [,2]
#R>  [1,]    1    3
#R>  [2,]    2    2
arr2[mat]
#R>  [1] "G" "E"

Similarly, with an array of 3 dimensions, the matrix will have three columns and as many rows as there are elements:

# Let us subset `E`, `C`, `O` and `C` (again)
(msub <- rbind(c(2,2,1), c(3, 1, 1), c(3, 2, 2), c(3, 1, 1)))
#R>       [,1] [,2] [,3]
#R>  [1,]    2    2    1
#R>  [2,]    3    1    1
#R>  [3,]    3    2    2
#R>  [4,]    3    1    1
arr3[msub]
#R>  [1] "E" "C" "O" "C"
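
If you need to go the other way, i.e. from a linear position or from a value to the matrix of per-dimension indices, base R provides `arrayInd()` and `which(arr.ind = TRUE)`. The sketch below is only an illustration; `idx` and `pos` are names we picked for the example.

```r
arr3 <- array(LETTERS[1:18], dim = c(3, 3, 2))

# From a linear position to one index per dimension
idx <- arrayInd(7, dim(arr3))
idx        # a 1 x 3 matrix: row 1, column 3, slice 1
arr3[idx]  # "G"

# From a value to its position(s)
pos <- which(arr3 == "O", arr.ind = TRUE)
pos        # row 3, column 2, slice 2
arr3[pos]  # "O"
```

Both results are matrices with one column per dimension, so they can be passed straight back to `[`.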

Two additional comments. First, we should always keep in mind that data frames and arrays are different:

# this gives you the 1st and 3rd **entire columns**
as.data.frame(arr2)[c(1,3)]
#R>    V1 V3
#R>  1  A  G
#R>  2  B  H
#R>  3  C  I
# this still gives you the element on the 1st row and the 3rd column
as.data.frame(arr2)[t(c(1,3))]
#R>  [1] "G"
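
The difference comes from the fact that a data frame is a list of columns, so a single vector index selects list elements (columns) rather than cells. A minimal sketch, with `df`, `cols` and `cell` as illustrative names of ours:

```r
arr2 <- array(LETTERS[1:9], dim = c(3, 3))
df <- as.data.frame(arr2)

is.list(df)   # TRUE: a data frame is a list of columns
length(df)    # 3: the number of columns, not of cells
length(arr2)  # 9: the number of cells

cols <- df[c(1, 3)]      # list-style subsetting: columns V1 and V3
cell <- df[cbind(1, 3)]  # matrix index: row 1, column 3
cell                     # "G"
```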

Second, if you are a tidyverse user, there is a new article dealing with subassignment with tibble 😎.
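
Whatever the approach on the tibble side, note that in base R the same matrix index also works on the left-hand side of an assignment. The sketch below reuses `arr2` and `mat` from above; the lowercase replacement values are arbitrary and only there to make the change visible.

```r
arr2 <- array(LETTERS[1:9], dim = c(3, 3))
mat <- rbind(c(1, 3), c(2, 2))

# One replacement value per row of `mat`
arr2[mat] <- c("g", "e")
arr2[1, 3]  # "g"
arr2[2, 2]  # "e"
```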

`nzchar()`

You may already be aware of nchar(), a function that returns the number of characters of each element of a character vector:

nchar(c("insil", "eco", ""))
#R>  [1] 5 3 0

nzchar() returns TRUE for every character string in the vector that has at least 1 character:

vec <- c("insil", "eco", "")
nzchar(vec)
#R>  [1]  TRUE  TRUE FALSE
# is there any empty character string in `vec`?
any(!nzchar(vec))
#R>  [1] TRUE
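
A common use is dropping empty strings from a vector. The sketch below also shows a caveat worth knowing: by default nzchar() treats a missing string as non-empty, unless `keepNA = TRUE` is set (an argument we believe is available since R 3.3.0). The name `non_empty` is ours.

```r
vec <- c("insil", "eco", "")

# Keep only the non-empty strings
non_empty <- vec[nzchar(vec)]
non_empty  # "insil" "eco"

# Missing values: nzchar() and nchar() do not agree by default
nzchar(NA_character_)                 # TRUE
nzchar(NA_character_, keepNA = TRUE)  # NA
nchar(NA_character_) > 0              # NA
```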

Interesting, but let’s dig deeper: I can think of no fewer than 3 equivalent expressions built on nchar(), each only one character longer than nzchar(vec) (ignoring spaces):

nchar(vec) > 0
#R>  [1]  TRUE  TRUE FALSE
!! nchar(vec)
#R>  [1]  TRUE  TRUE FALSE
nchar(vec) & 1
#R>  [1]  TRUE  TRUE FALSE
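
The last two forms rely on the way R coerces numbers to logicals: 0 becomes FALSE and any other number becomes TRUE. A short sketch of that coercion, where `n` is just an illustrative name:

```r
n <- nchar(c("insil", "eco", ""))

as.logical(n)  # TRUE TRUE FALSE: only 0 maps to FALSE
!n             # FALSE FALSE TRUE: negation coerces first
!!n            # TRUE TRUE FALSE: negating twice gives back the coerced value
n & 1          # TRUE TRUE FALSE: `&` coerces both sides, and 1 is TRUE
```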

One more character… so why bother? 💡 It should be a matter of performance! Let’s check that out with the cool 📦 microbenchmark:

library(microbenchmark)
microbenchmark(nchar(vec) > 0, !! nchar(vec), nchar(vec) & 1, nzchar(vec),
  times = 1000L)
#R>  Unit: nanoseconds
#R>             expr min   lq     mean median   uq   max neval
#R>   nchar(vec) > 0 871  912  965.792    932  962  2164  1000
#R>     !!nchar(vec) 942  992 1034.221   1012 1032  1973  1000
#R>   nchar(vec) & 1 962 1012 1096.261   1032 1072 10620  1000
#R>      nzchar(vec)  70   81   93.850     90  100   531  1000

Yep yep! nzchar() is indeed way faster 🚀!

Do you need to use `return()`?

If you have already written your own function, you have probably used return() to specify what your function should return. In some programming languages this instruction is mandatory, but not in R! Check out the documentation ?return:

If the end of a function is reached without calling ‘return’, the value of the last evaluated expression is returned.
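
To make "last evaluated expression" concrete, here is a small sketch with two hypothetical functions of ours (`last_value()` and `only_if()`). The second one shows a consequence that is easy to overlook: an `if` without `else` whose condition is FALSE evaluates to NULL.

```r
last_value <- function(x) {
  x + 1  # evaluated, but its value is discarded
  x * 2  # last evaluated expression: this is what is returned
}
last_value(5)  # 10

# An `if` without `else` yields NULL when the condition is FALSE
only_if <- function(x) if (x > 0) "positive"
only_if(1)            # "positive"
is.null(only_if(-1))  # TRUE
```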

Let me write 2 functions:

add_v <- function(x, y) {
   x + y
}
add_v2 <- function(x, y) {
   return(x + y)
}

add_v() and add_v2() are equivalent! So… do we care? Well, you must bear in mind that whenever return() is encountered, the evaluation of the set of expressions within the function is stopped and therefore some time can be saved:

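# in `foo()` the checks run from the lowest to the highest threshold, so that
# `foo()` and `foo2()` return the same value (e.g. 3 for x = 4)
foo <- function(x) {
  out <- 0
  if (x > 1) out <- 1
  if (x > 2) out <- 2
  if (x > 3) out <- 3
  return(out)
}
foo2 <- function(x) {
  out <- 0
  if (x > 3) return(3)
  if (x > 2) return(2)
  if (x > 1) return(1)
  return(out)
}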
microbenchmark(foo(4), foo2(4), times = 1e5)
#R>  Unit: nanoseconds
#R>      expr min  lq     mean median  uq      max neval
#R>    foo(4) 441 471 593.3889    481 501  3820331 1e+05
#R>   foo2(4) 330 361 543.5766    380 391 12679347 1e+05
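
Beyond speed, an early return() is handy for guard clauses: deal with the special cases first, then let the last expression be the implicit return value. A minimal sketch, where `safe_mean()` is a hypothetical helper of ours and not something from base R:

```r
# Hypothetical helper, for illustration only
safe_mean <- function(x) {
  if (!is.numeric(x)) return(NA_real_)  # guard 1: wrong type, stop here
  if (length(x) == 0) return(NA_real_)  # guard 2: nothing to average
  mean(x)                               # last expression: implicit return
}

safe_mean("a")         # NA
safe_mean(numeric(0))  # NA
safe_mean(c(1, 2, 3))  # 2
```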

`invisible()`

Let’s keep talking about what functions return. The function invisible() allows you to return an invisible copy of an object, meaning that nothing is (apparently) returned if the result is not assigned:

add_v <- function(x, y) {
  x + y
}
##
add_i <- function(x, y) {
  invisible(x + y)
}
add_v(2, 3)
#R>  [1] 5
add_i(2, 3)
res <- add_i(2, 3)
res
#R>  [1] 5
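
The value is returned in both cases; only its visibility flag differs. One way to check this, as a sketch, is base R's withVisible(). Wrapping the call in parentheses is another way to force printing.

```r
add_v <- function(x, y) x + y
add_i <- function(x, y) invisible(x + y)

withVisible(add_v(2, 3))$visible  # TRUE
withVisible(add_i(2, 3))$visible  # FALSE
withVisible(add_i(2, 3))$value    # 5

# Parentheses make the invisible result print
(add_i(2, 3))
```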

But… why? As explained in the documentation (?invisible):

This function can be useful when it is desired to have functions return values which can be assigned, but which do not print when they are not assigned.

This is indeed helpful when you have a function that creates a plot (and you don’t normally need to assign the result) but for which you sometimes need an object that was created during the evaluation of the function:

plot_logy <- function(x, y) {
  # create ty
  ty <- log10(y + 1)
  plot(x, ty)
  invisible(ty)
}
plot_logy(0:10, 0:10)

# get ty
ty <- plot_logy(0:10, 0:10)

ty
#R>   [1] 0.0000000 0.3010300 0.4771213 0.6020600 0.6989700 0.7781513 0.8450980 0.9030900 0.9542425
#R>  [10] 1.0000000 1.0413927

`bquote()` and `substitute()`

When using mathematical annotations, we sometimes need to include the value of a variable. In such cases, bquote() or substitute() are the functions you need (rather than expression(), which you may already be familiar with).

If you opt for bquote(), then variables to be evaluated must be wrapped in parentheses and preceded by a dot, e.g. .(var). If you choose substitute(), then the variables replaced are the ones included in the list passed as the env argument (which can also be an environment).
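
It can help to look at the expressions these functions build, independently of any plot. In the sketch below (`e1`, `e2` and `e3` are illustrative names of ours), note that substitute() only replaces the symbols found in `env`: a symbol that is not listed, such as `delta` in `e2`, stays a symbol and would therefore be drawn as the Greek letter.

```r
delta <- 1.5

# bquote(): only what is wrapped in .() is replaced by its value
e1 <- bquote(beta^j == .(delta) + bold("h"))
e1  # beta^j == 1.5 + bold("h")

# substitute(): only the symbols listed in `env` are replaced
e2 <- substitute(alpha[i] == a + delta, env = list(a = 2))
e2  # alpha[i] == 2 + delta

# To insert the value of `delta` too, list it as well
e3 <- substitute(alpha[i] == a + delta, env = list(a = 2, delta = delta))
e3  # alpha[i] == 2 + 1.5
```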

Let’s use both functions to add mathematical expressions to an empty plot:

delta <- 1.5
plot(c(0,1), c(0,1), type = "n", axes = FALSE, ann = FALSE)
text(0.5, .75, labels = bquote(beta^j == .(delta) + bold("h")), cex = 3)
text(0.5, .25, labels = substitute(alpha[i] == a + delta, env = list(a = 2)), cex = 3)
box()

That’s all folks!

Display information relative to the R session used to render this post.

sessionInfo()
#R>  R version 4.5.0 (2025-04-11)
#R>  Platform: x86_64-pc-linux-gnu
#R>  Running under: Ubuntu 24.04.2 LTS
#R>  
#R>  Matrix products: default
#R>  BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#R>  LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
#R>  
#R>  locale:
#R>   [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8    
#R>   [5] LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8    LC_PAPER=C.UTF-8       LC_NAME=C             
#R>   [9] LC_ADDRESS=C           LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
#R>  
#R>  time zone: UTC
#R>  tzcode source: system (glibc)
#R>  
#R>  attached base packages:
#R>  [1] stats     graphics  grDevices utils     datasets  methods   base     
#R>  
#R>  other attached packages:
#R>  [1] microbenchmark_1.5.0 inSilecoRef_0.1.1   
#R>  
#R>  loaded via a namespace (and not attached):
#R>   [1] sass_0.4.10       generics_0.1.4    xml2_1.3.8        blogdown_1.21     stringi_1.8.7    
#R>   [6] httpcode_0.3.0    digest_0.6.37     magrittr_2.0.3    evaluate_1.0.3    bookdown_0.43    
#R>  [11] fastmap_1.2.0     plyr_1.8.9        jsonlite_2.0.0    backports_1.5.0   crul_1.5.0       
#R>  [16] promises_1.3.3    bibtex_0.5.1      jquerylib_0.1.4   cli_3.6.5         shiny_1.10.0     
#R>  [21] rlang_1.1.6       cachem_1.1.0      yaml_2.3.10       tools_4.5.0       dplyr_1.1.4      
#R>  [26] httpuv_1.6.16     DT_0.33           rcrossref_1.2.0   curl_6.3.0        vctrs_0.6.5      
#R>  [31] R6_2.6.1          mime_0.13         lifecycle_1.0.4   stringr_1.5.1     fs_1.6.6         
#R>  [36] htmlwidgets_1.6.4 miniUI_0.1.2      pkgconfig_2.0.3   pillar_1.10.2     bslib_0.9.0      
#R>  [41] later_1.4.2       glue_1.8.0        Rcpp_1.0.14       xfun_0.52         tibble_3.3.0     
#R>  [46] tidyselect_1.2.1  knitr_1.50        xtable_1.8-4      htmltools_0.5.8.1 rmarkdown_2.29   
#R>  [51] compiler_4.5.0

Edits

Feb 6, 2023 -- Remove redundant headers.
