Creating an R package (on a Mac, in 10 mins)

Once you know how to do it, creating your own R package is a snip – and it’s well worth the effort as it protects your ‘helper’ functions from your cleaning instinct … I myself am over-prone to:

 
R> rm(list=ls())

Which also cleans out the functions that live in the global environment. The solution? Stuff said functions into a namespace, so they do not get cleaned out when you clean up.
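To see the problem concretely, here is a minimal sketch. The `helper` function is a made-up stand-in, not part of the package we build below. After the clean-up the workspace function is gone, while `head`, which lives in the `utils` namespace, is untouched:

```r
# A hypothetical helper defined in the workspace (the global environment)
helper <- function(x) x + 1

# The over-eager clean-up: removes every object in the workspace
rm(list = ls())

exists("helper")  # FALSE: the workspace copy has been wiped
exists("head")    # TRUE: head() lives in a package namespace, so it survives
```

Our own functions will behave like `head` once they are installed as a package.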

To create an R package, we will spend a little time in R setting up the package (R commands will be preceded by `R>`), and a little time in the shell (I use BASH; BASH commands will be preceded by `$`).

Our package will be very simple: we will package our combined head and tail method (described here, improved with thanks to stackoverflow) and quicker-quit (described here).

So, here we go (set the timer):

1/ fire up R …

2/ load the functions you want to put into the package into your R session. I’m going to assume that you will type them in – but you might source() them from a file (see below for how to do that):

R> ht <- function(d, m=5, n=m){
  # print the head and tail together
  list=NULL
  list[[paste0('HEAD #', m)]] <- head(d,m)
  list[[paste0('TAIL #', n)]] <- tail(d,n)
  return(list)
}

R> qw <- function(save='no', ...){
  # quit without saving the workspace
  quit(save=save, ...)
}

If you have the functions in a separate file (a good idea if you are going to grow this package), you’d save your fingers and load the functions using the source command:

R> source("~/R/RAhelper.r")
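Before packaging anything, it is worth a quick check that `ht` does what you expect. A small sketch (the definition is repeated only so the snippet runs on its own; `x` and `res` are just illustrative names):

```r
# Same definition as above, repeated so this snippet is self-contained
ht <- function(d, m = 5, n = m) {
  list <- NULL
  list[[paste0("HEAD #", m)]] <- head(d, m)
  list[[paste0("TAIL #", n)]] <- tail(d, n)
  return(list)
}

x <- 1:100
res <- ht(x, m = 4, n = 2)

names(res)        # "HEAD #4" "TAIL #2"
res[["HEAD #4"]]  # 1 2 3 4
res[["TAIL #2"]]  # 99 100
```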

3/ Tell R that you want to put these functions into a package – using the `package.skeleton` function. At this point, you must pick the name of your package (I’m going for ‘RApack’) and specify the path (if you don’t have an ‘~/R/wd/raRlibrary/’ folder this will fail … create one, or change the destination to something that suits you better … such as “~/R/suitsMeBetter/”).

R> package.skeleton(name='RApack', list=c('ht', 'qw'), path='~/R/wd/raRlibrary/')

If you have a clean R session with just these functions in it, there is a neat trick that will save your fingers. You are probably aware that `ls()` returns all the objects in the specified environment – there is an equivalent that returns only functions, it is `lsf.str()`.

If you have been following along, the only functions in your environment will be `ht` and `qw` – so we can just dump all the functions into the build.
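A small sketch of the difference between `ls()` and `lsf.str()`. To keep it from touching your real workspace, it uses a scratch environment `e` with simplified stand-in functions and a made-up data object `some_data`; all of these are assumptions for illustration only:

```r
# A scratch environment standing in for a clean workspace
e <- new.env()
assign("ht", function(d, m = 5, n = m) list(head(d, m), tail(d, n)), envir = e)
assign("qw", function(save = "no", ...) quit(save = save, ...), envir = e)
assign("some_data", 1:10, envir = e)   # not a function

# ls() lists every object, functions and data alike
all_names <- ls(envir = e)
all_names

# lsf.str() keeps only the functions
fun_names <- as.character(lsf.str(envir = e))
fun_names   # "ht" "qw"
```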

Like the source() trick above, this trick will reward you when you have a larger package – as you can use this command to put all the functions loaded via the source command (and any others in the environment) into your package:

R> package.skeleton(name='RApack', list=lsf.str(), path='~/R/wd/raRlibrary')

If all has gone well, it’ll return:

Creating directories …
Creating DESCRIPTION …
Creating NAMESPACE …
Creating Read-and-delete-me …
Saving functions and data …
Making help files …
Done.
Further steps are described in ‘~/R/wd/raRlibrary/RApack/Read-and-delete-me’.
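If you are curious about what was just created, you can list the generated files from R. The sketch below builds a throwaway skeleton in a temporary directory instead of ‘~/R/wd/raRlibrary/’, and uses a scratch environment `e` with simplified stand-ins for `ht` and `qw`; the directory name `skeleton-demo` is made up for the example:

```r
# Simplified stand-ins for the two functions, kept in a scratch environment
e <- new.env()
assign("ht", function(d, m = 5, n = m) {
  list(head = head(d, m), tail = tail(d, n))
}, envir = e)
assign("qw", function(save = "no", ...) quit(save = save, ...), envir = e)

# A temporary location, so nothing in your real library folder is touched
scratch <- file.path(tempdir(), "skeleton-demo")
dir.create(scratch, showWarnings = FALSE)

package.skeleton(name = "RApack", list = c("ht", "qw"),
                 environment = e, path = scratch, force = TRUE)

# Everything that was generated, relative to the package root
pkg_files <- list.files(file.path(scratch, "RApack"), recursive = TRUE)
pkg_files
```

You should see the DESCRIPTION and NAMESPACE files, the Read-and-delete-me note, one `.R` file per function under `R/`, and the `.Rd` help templates under `man/`.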

4/ Now we edit the documentation. If you are *REALLY* in a rush, you can get around this step by deleting a few files … but even though they are your own functions, I recommend that you document them (what’s obvious now may not be in a few years’ time).

If you want to delete them, do the following:

$ cd ~/R/wd/raRlibrary/RApack
$ rm -r man Read-and-delete-me

I think you should document your work … so do the following:

$ cd ~/R/wd/raRlibrary/RApack
$ vi Read-and-delete-me
$ rm Read-and-delete-me

okay, read it? now you can delete it.

We will follow the instructions step-by-step … (If you don’t know how to get out of vi, type `:q!`, and see this vi/vim intro, because I’m going to assume you can edit using vim)

a/ editing ‘man’:


$ cd man
$ vi RApack-package.Rd

Now edit it so that it looks something like this (note, I’ve shown the tabs and carriage-return markers):

\name{RApack-package}
\alias{RApack-package}
\alias{RApack}
\docType{package}
\title{
a home for miscellaneous R functions
}
\description{
A head+tail method; qwick quitting; and later I'll add some POSIXct helpers ... 
}
\details{
\tabular{ll}{
Package: \tab RApack\cr
Type: \tab Package\cr
Version: \tab 0.1\cr
Date: \tab 2012-07-22\cr
License: \tab GPL (>=3)\cr
}

}
\author{
Maintainer: Ricardo <ricardianambivalence@gmail.com>
}
\references{
~~ i didn't invent any of this - except the bugs, they are all mine.
}

Next, edit the function docs. First we’ll edit `ht`:

$ vi ht.Rd

make it look something like this:

\name{ht}
\alias{ht}
\title{
A function to reveal the top and bottom of a data-set 
}
\description{
==> returns head(d,m) + tail(d,n) -- plus some formatting.
}
\usage{
ht(d, m = 5, n=m)
}
\arguments{
  \item{d}{
    the data you want to inspect 
}
  \item{m}{
    the number of rows to use in head(d,m):
    defaults to m=5 if unspecified. 

}
  \item{n}{
    the number of rows to use in tail(d,n):
    defaults to n=m if unspecified. 
}
}
\value{
 the return value is a list containing the head and tail, with HEAD #m and TAIL #n the slot names. 
}
\references{
 https://www.stackoverflow.com/questions/11600391/combining-head-and-tail-methods-in-r 
}

\examples{
x <- 1:100
ht(x, m=4, n=2)

## The function is currently defined as:

function (d, m = 5, n = m) 
{
    list <- NULL
    list[[paste0('HEAD #', m)]] <- head(d,m)
    list[[paste0('TAIL #', n)]] <- tail(d,n)
    return(list)
  }
}

% Add one or more standard keywords, see file 'KEYWORDS' in the
% R documentation directory.
\keyword{ utilities }
\keyword{ misc }

Finally, we’ll edit `qw`:

$ vi qw.Rd

make it look something like this:


\name{qw}
\alias{qw}
\title{
qwick quit - without saving the workspace
}
\description{
wrapper for quit('no')
}
\usage{
qw(save = "no", ...)
}
\arguments{
  \item{save}{
argument to save workspace - defaults to 'no'
}
  \item{\dots}{
...
}
}
\value{
quits R without saving the workspace - OR asking if you'd like it saved. 
}
\references{

}
\author{
Ricardo
}
\note{
careful with that qwick-quit - it WILL delete your workspace!
}

\examples{
# exit R without saving your workspace
# (not run when the examples are checked, since it would end the R session)
\dontrun{qw()}

}
% Add one or more standard keywords, see file 'KEYWORDS' in the
% R documentation directory.
\keyword{ misc }
\keyword{ utilities }% __ONLY ONE__ keyword per line

and you are done editing ‘man’.
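If you want some reassurance that an edited help file is still valid before building, the `tools` package that ships with R can parse and check Rd files. A minimal sketch, assuming a cut-down version of `ht.Rd` written to a temporary file (the shortened text is for illustration only):

```r
# A cut-down ht.Rd, written to a temporary file for the demonstration
rd_lines <- c(
  "\\name{ht}",
  "\\alias{ht}",
  "\\title{A function to reveal the top and bottom of a data-set}",
  "\\description{Returns head(d, m) and tail(d, n) together.}",
  "\\usage{ht(d, m = 5, n = m)}",
  "\\arguments{",
  "  \\item{d}{the data you want to inspect}",
  "  \\item{m}{the number of rows to use in head(d, m)}",
  "  \\item{n}{the number of rows to use in tail(d, n)}",
  "}"
)
rd_file <- tempfile(fileext = ".Rd")
writeLines(rd_lines, rd_file)

# Parsing fails with an error if braces or macros are malformed
rd <- tools::parse_Rd(rd_file)

# checkRd() reports any problems it finds with the file
print(tools::checkRd(rd_file))

# Render the help page as plain text to preview it
tools::Rd2txt(rd)
```

For the real files you would pass the path instead, for example `tools::checkRd("ht.Rd")` from inside the `man` folder.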

so go back up one directory


$ cd ..

b/ edit NAMESPACE:

you can probably skip this, but it’s worth covering.

$ vi NAMESPACE

when vi opens, you’ll see:

exportPattern("^[[:alpha:]]+")

What does this mean? It means export all functions that have names beginning with a letter. To see this, do the following in R:

R> lLn <- c(letters, LETTERS, 1:26)
R> grep("^[[:alpha:]]+", lLn, value=TRUE)

Note that lLn is 78 elements long, but that the `grep` command returns only the first 52 items (as the final 26 are not of type `alpha`).
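The same pattern can be tried against names that look more like real objects. In this sketch only `ht` and `qw` come from our package; the other names are hypothetical, chosen to show what the pattern leaves out:

```r
# Candidate object names: two real ones and three made-up ones
candidates <- c("ht", "qw", ".hidden_helper", "2fast", "_tmp")

# TRUE where the name starts with a letter, i.e. where exportPattern would match
exported <- grepl("^[[:alpha:]]+", candidates)
setNames(exported, candidates)
```

Names that start with a dot, a digit or an underscore do not match, which is why starting a name with a dot is a common way to keep an internal helper out of the exports.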

if you only wanted to export a subset of the functions, or felt like typing them all out, you’d replace the regex with:

export('ht', 'qw')

finally, we must edit the DESCRIPTION file.

$ vi DESCRIPTION

I made mine look like this:

Package: RApack
Type: Package
Title: Misc utility functions
Version: 0.1
Date: 2012-07-22
Author: Ricardo 
Maintainer: Ricardo <ricardianambivalence@gmail.com>
Description: qwick-quit, head_tail 
License: GPL-3
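The DESCRIPTION file is a plain list of `Field: value` lines, which R can read with `read.dcf()`. A small sketch that writes the fields above to a temporary file (so your real DESCRIPTION is not touched) and reads them back as a quick sanity check:

```r
# The fields from above, written to a temporary DESCRIPTION file
desc_lines <- c(
  "Package: RApack",
  "Type: Package",
  "Title: Misc utility functions",
  "Version: 0.1",
  "Date: 2012-07-22",
  "Author: Ricardo",
  "Maintainer: Ricardo <ricardianambivalence@gmail.com>",
  "Description: qwick-quit, head_tail",
  "License: GPL-3"
)
desc_file <- file.path(tempdir(), "DESCRIPTION")
writeLines(desc_lines, desc_file)

# read.dcf() returns a character matrix with one column per field
desc <- read.dcf(desc_file)
colnames(desc)

# package_version() raises an error if the version string is malformed
pkg_version <- package_version(desc[1, "Version"])
pkg_version
```

On the real file you would run `read.dcf("DESCRIPTION")` from the package folder.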

c/ we don’t have any C/C++/Fortran code …

d/ we have not compiled our code, so we can leave useDynLib() out.

e/ running R CMD build (note we invoke `--vanilla`, to stop your tweaks from loading and messing things up):

I’m told that we cannot direct the output of this step, so we need to move to our desired location before we build.

$ mkdir -p ~/R/wd/raRpackages
$ cd ~/R/wd/raRpackages
$ R --vanilla CMD build ~/R/wd/raRlibrary/RApack

which should return the following:

* checking for file ‘~/R/wd/raRlibrary/RApack/DESCRIPTION’ … OK
* preparing ‘RApack’:
* checking DESCRIPTION meta-information … OK
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* building ‘RApack_0.1.tar.gz’

and drop a .tar.gz file in your current working directory (you can check with `$ ls`).

f/ R CMD check(ing): note that you need LaTeX installed for this step (it is used to build the PDF manual).

$ R --vanilla CMD check RApack_0.1.tar.gz

This will output lots of tests, each of which should be followed with … OK

Now for the good bit — we install our package:

$ R CMD INSTALL -l ~/R/wd/raRpackages ~/R/wd/raRpackages/RApack_0.1.tar.gz

See `R CMD INSTALL --help` for details: the pattern is `R CMD INSTALL -l [destination] [package]`, so you can install it somewhere else as follows:

$ R CMD INSTALL -l ~/R/otherdir/ ~/R/wd/raRpackages/RApack_0.1.tar.gz

or from R:

R> install.packages(pkgs='~/R/wd/raRpackages/RApack_0.1.tar.gz', type='source', repos=NULL)

now you should be able to load R and attach your package. If not, you may need to add the folder you R CMD INSTALLed into to your library path.
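One thing to watch for: `.libPaths()` silently ignores folders that do not exist, so the folder must be there before you add it. A minimal sketch using a temporary folder (the name `raRpackages-demo` is made up) instead of the real install location:

```r
# A temporary stand-in for the folder you installed into
lib_dir <- file.path(tempdir(), "raRpackages-demo")
dir.create(lib_dir, showWarnings = FALSE)

# Remember the current library paths so they can be restored afterwards
old_paths <- .libPaths()

# Put the new folder at the front of the library search path
.libPaths(c(lib_dir, old_paths))
new_paths <- .libPaths()
new_paths

# Restore the original library paths
.libPaths(old_paths)
```

Putting your own folder first while keeping the existing entries means the packages in your usual library stay visible.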

I’ll assume we need to do so, then attach with `require(RApack)` and finally check that it’s on the search path with search()

R> .libPaths("~/R/wd/raRpackages/")
R> require(RApack)
R> search()

… you should see “package:RApack” amongst the returned packages on the search-path. Try typing in the names of the functions, without brackets, to check that their definitions copied across faithfully:

R> qw
function (save = 'no', ...)
{
  quit(save = save, ...)
}
<environment: namespace:RApack>

And now they won’t go away when you rm(list=ls()) …

To get rid of it, try:

R> detach(name=package:RApack)