Trick or Tips 002 {R}
November 12, 2017
R tips trickortips
base
utils
graphics
magrittr
raster
knitr
Ever stumbled on a code chunk that made you say: "I should have known this f_ piece of code long ago!" Chances are you have, frustratingly, just like we have, and on multiple occasions too. In comes Trick or Tips!
Trick or Tips is a series of blog posts that each present 5 -- hopefully helpful -- coding tips for a specific programming language. Posts should be short (i.e. no more than 5 lines of code, max 80 characters per line, except when appropriate) and provide tips of many kinds: a function, a way of combining functions, a single argument, a note about the philosophy of the language and its practical consequences, tricks to improve the way you code, good practices, etc.
Note that while some tips might be obvious to careful documentation readers (God bless them for their wisdom), we do our best to present what we find very useful and underrated. By the way, there are undoubtedly similar initiatives on the web (e.g. the "One R Tip a Day" Twitter account). Last, feel free to comment below with tip ideas or a post of code tips of your own, which we will be happy to incorporate into our next post.
Enjoy and get ready to frustratingly appreciate our tips!
The drop argument of the [] operator
This is not obvious and is poorly known, but there is a logical argument drop
that can be passed to the [] operator, and I’ll try to explain why it can be
useful! Let’s first create a data frame with ten rows and three columns:
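A minimal sketch of such a data frame, assuming hypothetical column names var1, var2 and var3 (var1 is the name used later in this post) and made-up values:

```r
# Hypothetical data: 10 rows, 3 columns (the values are made up for illustration)
df <- data.frame(
  var1 = 1:10,
  var2 = letters[1:10],
  var3 = (1:10) / 10
)
dim(df)  # number of rows and columns
```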
To extract the first column, I use the [] operator and either type the number
of the column or the name of the column to be extracted:
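Both forms could look like the sketch below, which assumes the same hypothetical data frame df with columns var1, var2 and var3:

```r
# Hypothetical data frame (names and values are assumptions)
df <- data.frame(var1 = 1:10, var2 = letters[1:10], var3 = (1:10) / 10)

by_number <- df[, 1]       # select the first column by position
by_name   <- df[, "var1"]  # select the same column by name

identical(by_number, by_name)  # both calls return the same object
```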
Interestingly enough, this returns a vector, not a data frame, while if I
extract two columns, I get a data frame:
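A quick way to see the difference is to compare classes, again on a hypothetical df (column names and values are assumptions):

```r
# Hypothetical data frame (names and values are assumptions)
df <- data.frame(var1 = 1:10, var2 = letters[1:10], var3 = (1:10) / 10)

class(df[, 1])        # a single column is simplified to a vector
class(df[, c(1, 2)])  # two columns are returned as a data frame
```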
This behavior is actually very useful in many cases, as we are often happy to
deal with a vector when we extract only one column. However, it might become
an issue when we do extractions without knowing beforehand the number of
columns to be extracted (typically when extracting according to a request
that can return any number of columns). In such a case, if the number is one,
we end up with a vector instead of a data frame. The argument drop provides
a workaround. By default it is set to TRUE and a 1-column data frame becomes
a vector, but using drop = FALSE prevents this from happening. Let’s try this
and check the class of the result:
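The two steps above can be sketched as follows, assuming the same hypothetical df:

```r
# Hypothetical data frame (names and values are assumptions)
df <- data.frame(var1 = 1:10, var2 = letters[1:10], var3 = (1:10) / 10)

one_col <- df[, 1, drop = FALSE]  # keep the data frame structure
class(one_col)                    # class of the result
dim(one_col)                      # still two-dimensional
```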
You can actually obtain the same result using the name of the column or its number without a comma (a data frame is a list of vectors that all have the same length, so you can basically subset the list!).
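A short sketch of this list-style subsetting, on the same hypothetical df:

```r
# Hypothetical data frame (names and values are assumptions)
df <- data.frame(var1 = 1:10, var2 = letters[1:10], var3 = (1:10) / 10)

by_name   <- df["var1"]  # no comma: subset the list of columns by name
by_number <- df[1]       # no comma: subset the list of columns by position

class(by_name)
class(by_number)
```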
But if you need a specific selection of rows, you had better use drop!
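For instance, selecting a few rows of a single column while keeping a data frame might look like this (the row range 2:5 is an arbitrary choice, and df is hypothetical):

```r
# Hypothetical data frame (names and values are assumptions)
df <- data.frame(var1 = 1:10, var2 = letters[1:10], var3 = (1:10) / 10)

subset_rows <- df[2:5, 1, drop = FALSE]  # rows 2 to 5 of the first column
class(subset_rows)
dim(subset_rows)
```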
Now you know ;-)
Get the citation of a package
Many researchers (this is especially TRUE in ecology) use R to carry out the
analyses for their research and write papers based on them. When the time
comes to cite a package, I guess they wonder how to do it. However, package
authors actually provide this information in their package! Let’s have a look
at the reference for the package knitr as of version 1.17 using the function
citation():
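A minimal sketch of the call. It assumes knitr is installed; the fallback to "base" is my addition so that the snippet still runs without it, and the printed reference depends on the installed version:

```r
# Assumption: knitr is installed; otherwise fall back to the citation of R itself
pkg <- if (requireNamespace("knitr", quietly = TRUE)) "knitr" else "base"

ref <- citation(pkg)  # citation information provided by the package authors
print(ref)
```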
As suggested in the message, we can even retrieve the reference list in BibTeX
format with the toBibtex() function. Let’s do this:
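This step could be sketched as follows, with the same assumption that knitr is installed (the fallback to "base" is only there to keep the snippet runnable):

```r
# Assumption: knitr is installed; otherwise fall back to the citation of R itself
pkg <- if (requireNamespace("knitr", quietly = TRUE)) "knitr" else "base"

bib <- toBibtex(citation(pkg))  # convert the citation to BibTeX entries
print(bib)
```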
Even if you are not a LaTeX user, this could be very helpful as such a file can be read by reference management software such as Zotero. So now let’s say I use the following command line:
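One possible way to write such a file is sketched below. The use of writeLines() is an assumption, not necessarily the command used originally, and the file is written to the current working directory:

```r
# Assumption: knitr is installed; otherwise fall back to the citation of R itself
pkg <- if (requireNamespace("knitr", quietly = TRUE)) "knitr" else "base"

# Write the BibTeX entries, one line of text per line of the file
writeLines(toBibtex(citation(pkg)), "biblio.bib")
file.exists("biblio.bib")
```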
Then the biblio.bib file just created can be imported into your favorite
reference manager.
Using namespace
In R, functions are stored in packages and adding a package is like adding a collection of functions. As you get more experienced with R, you likely know and use more and more packages. You might even come to the point where you have functions that have the same name but originate from different packages. If not, let me show you something:
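A sketch of such a call, assuming that magrittr is installed and that df is the hypothetical data frame with a column var1:

```r
library(magrittr)  # assumption: magrittr is installed

# Hypothetical data frame (names and values are assumptions)
df <- data.frame(var1 = 1:10, var2 = letters[1:10], var3 = (1:10) / 10)

res <- extract(df, "var1")  # extract() is an alias of `[`, so this is df["var1"]
class(res)
```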
Here I use the function extract() from the magrittr package, which acts as [],
and I extract the column var1 from df. This function is actually designed
to be used with pipes (if this sounds weird, have a look at the
magrittr package); for instance, when piping
you can write df %$% extract(var1) or even df %>% '['('var1') and this will
do the same. So far, so good. Now I load the raster package and try the same
extraction.
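The situation can be sketched as follows, assuming both magrittr and raster are installed. The call is wrapped in tryCatch() so that the snippet keeps running and the error message is stored instead of stopping the script:

```r
library(magrittr)  # assumption: magrittr is installed
library(raster)    # assumption: raster is installed; attached last, so it comes first

# Hypothetical data frame (names and values are assumptions)
df <- data.frame(var1 = 1:10, var2 = letters[1:10], var3 = (1:10) / 10)

find("extract")  # attached packages that define extract(), in search order

# Capture the error message raised by the call instead of stopping
res <- tryCatch(extract(df, "var1"), error = function(e) conditionMessage(e))
res
```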
It does not work… Why?? Briefly, extract() from raster is now called (this
is what the message on load said) and it does not get along well with data
frames (this is the meaning of the error message). To overcome this you can
use an explicit namespace. To do so, you put the name of the package followed
by :: before the name of the function; this is basically the unique identifier
of the function. Indeed, within a given package, functions have different
names, and on CRAN packages must have different names, so the combination of
the two is unique (this holds true if you only use packages from CRAN). Let’s
use it:
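With an explicit namespace the call could look like this, assuming magrittr is installed and df is the hypothetical data frame used above:

```r
# Hypothetical data frame (names and values are assumptions)
df <- data.frame(var1 = 1:10, var2 = letters[1:10], var3 = (1:10) / 10)

# pkg::fun picks the function from that package, whatever else is attached
res <- magrittr::extract(df, "var1")
class(res)
```

Note that no library() call is needed here: the :: operator loads the namespace on demand.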
Using this is also very helpful when you develop a package that uses functions from different packages. Even when you write scripts that use a large number of functions from various packages, it can be better to keep track of which package each function comes from. Finally, note that this is not R specific at all; it is actually something very common in programming languages.
How to use non-exported functions?
Packages often contain functions that are not exported. These are often
functions called by the exported functions, and they help structure the code
of the package. However, when you try to understand how a package works, you
may want to spend some time understanding how they work (especially given that
they are not documented). There is actually a way to call them! Instead of
using two colons (::), use three (:::)! Let’s have a look at the code of one of
these functions from the knitr package (again version 1.17):
knitr:::.color.block
Interesting, isn’t it! To give you an idea of how frequent this can be, in this package there are 103 exported functions and 425 non-exported ones. Below are a few examples of exported functions followed by non-exported ones.
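One way to list both groups is sketched below. It assumes knitr is installed (the fallback to "utils" is my addition so that the snippet runs anyway). The counts depend on the installed version and include all objects of the namespace, not only functions, so they will differ from the figures above:

```r
# Assumption: knitr is installed; otherwise fall back to the utils package
pkg <- if (requireNamespace("knitr", quietly = TRUE)) "knitr" else "utils"

exported    <- getNamespaceExports(pkg)                  # names of exported objects
all_objects <- ls(asNamespace(pkg), all.names = TRUE)    # everything in the namespace
internal    <- setdiff(all_objects, exported)            # objects that are not exported

length(exported)
length(internal)
head(sort(exported))
head(sort(internal))
```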
I think that this could be very helpful when you want to understand exactly how a package works!
The las argument of par()
I really enjoy using graphics to create plots in R. That being said, the
default values always puzzle me! One I especially dislike is that the values on
the y-axis are written parallel to the axis, i.e. rotated by 90 degrees…
Fortunately this can readily be changed using the las argument of the par()
function, which can take 4 values: 0 (default, always parallel to the axis),
1 (always horizontal), 2 (always perpendicular to the axis) or 3 (always
vertical). Let’s plot and see the differences:
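A minimal sketch comparing the settings side by side. The plotted points, panel layout and margins are arbitrary choices made for illustration:

```r
old_par <- par(no.readonly = TRUE)             # save the current settings
par(mfrow = c(2, 2), mar = c(4, 4, 2, 1))      # 2 x 2 grid of panels

for (i in 0:3) {
  par(las = i)                                 # orientation of the axis labels
  plot(c(0, 100), c(0, 100), main = paste("las =", i), xlab = "x", ylab = "y")
}

par(old_par)                                   # restore the previous settings
```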
So, I personally prefer and use las=1!
That’s all for number 2 of this series, see you for the next tips!
Display information about the R session used to render this post.
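This information is typically obtained with sessionInfo(). The output depends on your own installation, so it will differ from the session used for this post:

```r
si <- sessionInfo()  # R version, platform and attached packages
print(si)
```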
Edits
Apr 23, 2022 -- Fix typos.
May 24, 2022 -- Add session info section.
Feb 4, 2023 -- Edit headers.