Introduction to annotater

annotater came to be while teaching workshops or helping peers and realizing that many issues relate to package installation failures and dependency issues for packages that were not even used in a problematic script. Scripts get passed around, code is copied and pasted, and we might not know what certain packages are for. Additionally, it is often useful to specify a package’s source and version within a script, for reproducibility purposes and to keep a record of where any of the packages can be obtained.

Package functions

This package works around a suite of functions that match package load calls (i.e. library and require) in a character string with one line per element, and replace them with annotated versions. As of version 0.2.0, loading packages with pacman::p_load is also supported.

First, match_pkg_names produces a tibble of package load calls and package names.

pkgs_string <- c("library(boot)\nrequire(Matrix)")
match_pkg_names(pkgs_string)
#> # A tibble: 2 × 3
#>   call            package_name pkgname_clean
#>   <chr>           <chr>        <chr>        
#> 1 library(boot)   boot         boot         
#> 2 require(Matrix) Matrix       Matrix

The values in this tabular output are then passed to utils::packageDescription, which parses and returns the ‘DESCRIPTION’ file of an installed package. Fields of interest from these descriptions are then used to build the annotations.

The ‘Title’ field from a package’s description makes for a good summary of what it does, so annotate_pkg_calls uses it to build annotations. These titles are inserted after each package load call, separated by a commenting symbol.

pkgs_string <- c("library(boot)\nrequire(tools)")
annotate_pkg_calls(pkgs_string)
#> [1] "library(boot) # Bootstrap Functions (Originally by Angelo Canty for S)\nrequire(tools) # Tools for Package Development"

A similar approach is used by annotate_repo_source to paste the repository source and version number.

pkgs_string <- c("library(boot)\nrequire(lattice)")
annotate_repo_source(pkgs_string)
#> [1] "library(boot)    # CRAN v1.3-31\nrequire(lattice) # CRAN v0.22-5"

Titles, repositories and version numbers can be annotated together in the output from annotate_repostitle.

pkgs_string <- c("library(boot)\nrequire(lattice)")
annotate_repostitle(pkgs_string)
#> [1] "library(boot) # Bootstrap Functions (Originally by Angelo Canty for S) CRAN v1.3-31\nrequire(lattice) # Trellis Graphics for R CRAN v0.22-5"

To annotate which functions are being called from each loaded package, use annotate_fun_calls.

testcode <- c('library(purrr) 
x <- list("a", 1, c("bo","bi","bu"))
pluck(x, 1)
map(x, pluck, 2)')
annotate_fun_calls(testcode)
#> [1] "library(purrr) # pluck map \nx <- list(\"a\", 1, c(\"bo\",\"bi\",\"bu\"))\npluck(x, 1)\nmap(x, pluck, 2)"

To annotate which datasets are being called from each loaded package, use annotate_pkg_datasets.

testcode <- c('library(tidyr) 
summary(household)
plot(fish_encounters)')
annotate_pkg_datasets(testcode)
#> [1] "library(tidyr) # fish_encounters household \nsummary(household)\nplot(fish_encounters)"

A note on the tidyverse and other metapackages

The tidyverse package is a meta-package with few exported functions of its own, so the annotation tools provided here (annotate_fun_calls) will not match the functions from the various individual packages (such as dplyr or readr) that get attached when loading tidyverse. As of version 0.2.2, load calls for metapackages can be replaced with separate calls to each of the core metapackage components as defined by their respective attach.R files.

Usage in RStudio

These main package functions can be called through their respective RStudio addins, written to work on the active .R or .Rmd file open in the Source pane.

Annotate package calls in active file

Describes the packages being loaded by calling annotate_pkg_calls.

Annotate package repository sources in active file

Adds the source and version by calling annotate_repo_source. This addin aligns the commenting symbols vertically for aesthetic purposes.

Annotate titles and repository sources in active file

Adds titles, sources, and versions by calling annotate_repostitles.

Annotate each package’s function calls

Adds all the unique functions being called by each loaded package, calls annotate_fun_calls.

Annotate each package’s datasets

Adds all the datasets being called by each loaded package, calls annotate_pkg_datasets. Works for lazy loaded data, namespaced data, or objects called with data.

Expand metapackages

Replace a call to a metapackage with multiple calls to each of its core components.