Using different/modified version of the same package

Sometimes a package on CRAN doesn’t behave quite as you would like. Maybe a change introduced broke your workflow so you want an older version, maybe you want to add additional functionality and the maintainer is uninterested in incorporating your changes which may not fit the scope or they are simply inactive and inaccessible. Now you have a version of the package that you need that conflicts with CRAN.

To make my life easier in a case like this I have written a small function forkCRAN (code below). So I can simply do

# read my github token for authentication
token = readLines('~/auth/gh_token')
forkCRAN('ontologyIndex', version = '2.5', token=token, newname = 'ontologyIndexOgfork')

and have a this github repository that contains ontologyIndex, version 2.5 with the name ontologyIndexOgfork. I can use this repository like any other github package

devtools::install_github('oganm/ontologyIndexOgfork')
library(ontologyIndexOgfork)

In this particular case, the original ontologyIndex package could not parse cyclic ontologies. I have introduced some changes so that it can parse such ontologies and my current pipeline depends on this change. Having this version allows me to write unambigious code that still can be replicated by third parties, especially since including remote package dependencies is fairly easy when using remotes/devtools family of install functions.

You can find forkCRAN function here. Can be safely pasted elsewhere since it doesn’t depend on anything else in the package it’s in.

Note that this isn’t entirely safe to do. Especially when all you want to do is to use an older version. The code of the package will be preserved but it’s dependencies will continue to move forward in time. There are more robust solutions out there for long term reproducibility.

Another problem is that this doesn’t always behave nicely with pacakges that require compilation. In these cases the package name seems to be hard coded into the code. I initially noticed this when I tried to get a copy of the glue package. For that particular case, all it took to fix it was to change the useDynLib commands in the NAMESPACE and roxygen comments to use the new package. This change now happens automatically when the function is called but I have failed to make a generalized solution. dplyr package for instance will still fail to work.