Check for installed packages before running install.packages()

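I have an R script that is shared with several users on different computers. One of its lines contains the install.packages("xtable") command.

The problem is that every time someone runs the script, R spends a great deal of apparent time reinstalling the package (and it actually does take some time, since the real case has a vector of several packages).

How can I first check whether the packages are installed and then run install.packages() only for the ones that are not?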


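Why not just delete that line from the script? If the end user doesn't have the smarts to install xtable as needed, you've got bigger problems :-). That said, have a look at installed.packages().

Edit: Dang, ninja'd by one minute!

Edit: A general suggestion: load the sos package, and you'll find it very easy to get answers to a lot of "is there a function that does XXXXX" questions.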

Try: require("xtable") or "xtable" %in% rownames(installed.packages())
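
The two checks behave differently: require() loads and attaches the package as a side effect, while the installed.packages() lookup only reports whether it is present. A minimal sketch, assuming you want a check that attaches nothing; the helper name is_available is hypothetical, and requireNamespace() is the base R call that loads a namespace without attaching it:

```r
# Hypothetical helper: TRUE if the package can be loaded, without attaching
# it to the search path.
is_available <- function(pkg) {
  isTRUE(requireNamespace(pkg, quietly = TRUE))
}

is_available("stats")                        # "stats" ships with R
"stats" %in% rownames(installed.packages())  # the lookup-table variant
```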

Here is a function I often use to check for a package, install it if it is missing, and then load it again:

pkgTest <- function(x)
{
  if (!require(x, character.only = TRUE))
  {
    install.packages(x, dependencies = TRUE)
    if (!require(x, character.only = TRUE)) stop("Package not found")
  }
}

Use it like pkgTest("xtable"). It only works if a mirror is set, though, but you could enter that in the require calls.
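
One way around the mirror caveat is to pass repos explicitly to install.packages(). A sketch under that assumption; the name pkg_test and the choice of https://cloud.r-project.org as the default mirror are illustrative and not part of the answer above:

```r
# Variant of pkgTest with an explicit repository (assumed default: CRAN cloud).
pkg_test <- function(x, repos = "https://cloud.r-project.org") {
  if (!require(x, character.only = TRUE)) {
    install.packages(x, dependencies = TRUE, repos = repos)
    if (!require(x, character.only = TRUE)) {
      stop("Package not found: ", x)
    }
  }
  invisible(TRUE)
}

pkg_test("stats")  # already installed, so nothing is downloaded
```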

This will do it. If you need to check for more than one package, you can make required.packages a vector.

required.packages <- "data.table"
new.packages <- required.packages[!(required.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
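
To illustrate the vector case, the filtering step can be pulled into a small helper that is easy to check without installing anything. This is only a sketch; missing_packages and its installed argument are hypothetical names introduced here:

```r
# Return the subset of `required` that is not in `installed`.
# `installed` defaults to the names reported by installed.packages().
missing_packages <- function(required,
                             installed = rownames(installed.packages())) {
  setdiff(required, installed)
}

# Demonstration with a made-up "installed" vector, so no real lookup is needed:
missing_packages(c("data.table", "xtable"), installed = c("xtable", "stats"))
# returns "data.table"

# In a real script the result would feed install.packages():
# new.packages <- missing_packages(c("data.table", "xtable"))
# if (length(new.packages)) install.packages(new.packages)
```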

The solution I use is derived from Sacha Epskamp's and Shuguang's input. Here is the function:

instalaPacotes <- function(pacote) {
  # compare against package names only, not every cell of the matrix
  if (!pacote %in% rownames(installed.packages())) install.packages(pacote)
}

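It works silently: it echoes nothing if the package "pacote" is already installed, and installs it otherwise. Don't forget to write the package name between quotes!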

# Function to check whether package is installed
is.installed <- function(mypkg){
is.element(mypkg, installed.packages()[,1])
}


# check if package "hydroGOF" is installed
if (!is.installed("hydroGOF")){
install.packages("hydroGOF")
}

How about trying this?

#will install the pROC library if you don't have it
if(!is.element('pROC', installed.packages()[,1]))
{install.packages('pROC')
}else {print("pROC library already installed")}

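Or a massively overwrought example, glibrary, from drknexus/repsych on GitHub. There are almost certainly more efficient and better ways to do this, but I programmed it a long while back and it basically works.

  • It works even if no repo has been selected, by taking the default cloud option if one is available. If you are on an older version of R, it falls back and picks a mirror based on the country code.
  • It tries to load the library (this step could be made more efficient using some of the methods above)
    • If that fails, it tries to install it
    • If the install fails, it notifies you which packages failed to install
  • That's right, packages: multiple packages can be loaded/installed in a single pass along with their dependencies (at least usually; there may be a bug here).

For example: glibrary(xtable,sos,data.table), but I don't think it will freak out if you call glibrary("xtable","sos","data.table") instead.

Function code: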

#' Try to load a library, if that fails, install it, then load it.
#'
#' glibrary short for (get)library.
#' The primary aim of this function is to make loading packages more transparent.  Given that we know we want to load a given package, actually fetching it is a formality.  glibrary skims past this formality to install the requested package.
#'
#' @export
#' @param ... comma separated package names
#' @param lib.loc See \code{\link{require}}
#' @param quietly See \code{\link{require}}
#' @param warn.conflicts See \code{\link{require}}
#' @param pickmirror If TRUE, glibrary allows the user to select the mirror, otherwise it auto-selects on the basis of the country code
#' @param countrycode This option is ignored and the first mirror with the substring "Cloud", e.g. the RStudio cloud, is selected.  If no mirrors with that substring are identified, glibrary compares this value to results from getCRANmirrors() to select a mirror in the specified country.
#' @return logical; TRUE if glibrary was a success, an error if a package failed to load
#' @note keep.source was an argument to require that was deprecated in R 2.15
#' @note This warning \code{Warning in install.packages: InternetOpenUrl failed: 'The operation timed out'} indicates that the randomly selected repository is not available.  Check your internet connection.  If your internet connection is fine, set pickmirror=TRUE and manually select an operational mirror.
#' @examples
#' #glibrary(lattice,MASS) #not run to prevent needless dependency
glibrary <- function(..., lib.loc = NULL, quietly = FALSE, warn.conflicts = TRUE, pickmirror = FALSE, countrycode = "us") {
warningHandle <- function(w) {
if (grepl("there is no package called",w$message,fixed=TRUE)) {
return(FALSE) #not-loadable
} else {
return(TRUE) #loadable
}
}


character.only <- TRUE  #this value is locked to TRUE so that the function passes the character value to require and not the variable name thislib
librarynames <- unlist(lapply(as.list(substitute(.(...)))[-1],as.character))
#if package already loaded, remove it from librarynames before processing further
si.res <- sessionInfo()
cur.loaded <- c(si.res$basePkgs,names(si.res$otherPkgs)) #removed names(si.res$loadedOnly) because those are loaded, but not attached, so glibrary does need to handle them.
librarynames <- librarynames[!librarynames %in% cur.loaded] #the original used %!in%, an operator not defined in this listing
success <- vector("logical", length(librarynames))
if (length(success)==0) {return(invisible(TRUE))} #everything already loaded, end.


alreadyInstalled <- installed.packages()[,"Package"]
needToInstall <- !librarynames %in% alreadyInstalled


if (any(needToInstall)) {
if (pickmirror) {chooseCRANmirror()}
if (getOption("repos")[["CRAN"]] == "@CRAN@") {
#Select the first "Cloud" if available
m <- getCRANmirrors(all = FALSE, local.only = FALSE)
URL <- m[grepl("Cloud",m$Name),"URL"][1] #get the first repos with "cloud" in the name
if (is.na(URL)) { #if we did not find the cloud,
#Fall back and use the previous method
message("\nIn repsych:glibrary:  Now randomly selecting a CRAN mirror. You may reselect your CRAN mirror with chooseCRANmirror().\n")
#if there is no repository set pick a random one by country code
getCRANmirrors.res <- getCRANmirrors()
foundone <- FALSE  #have we found a CRAN mirror yet?
#is it a valid country code?
if (!countrycode %in% getCRANmirrors.res$CountryCode) {
stop("In repsych::glibrary:  Invalid countrycode argument")
}
ticker <- 0
while (!foundone) {
ticker <- ticker + 1
URL <- getCRANmirrors.res$URL[sample(grep(countrycode, getCRANmirrors.res$CountryCode), 1)]
host.list <- strsplit(URL, "/")
host.clean <- unlist(lapply(host.list, FUN = function(x) {return(x[3])}))
#make sure we can actually access the package list
if (nrow(available.packages(contrib.url(URL)))!=0) {foundone <- TRUE}
if (ticker > 5) {stop("In repsych::glibrary:  Unable to access valid repository.  Is the internet connection working?")}
} #end while
} #end if no cloud mirror was found
repos <- getOption("repos")
repos["CRAN"] <- gsub("/$", "", URL[1L])
options(repos = repos)
} #done setting CRAN mirror
#installing packages
installResults <- sapply(librarynames[needToInstall],install.packages)
#checking for successful install
needToInstall <- !librarynames %in% installed.packages()[,"Package"]
if (any(needToInstall)) {
stop(paste("In repsych::glibrary: Could not download and/or install: ",paste(librarynames[needToInstall],collapse=", "),"... glibrary stopped.",sep=""))
} # done reporting any failure to install
} #done if any needed to install


#message("In repsych::glibrary:  Attempting to load requested packages...\n")
#success <- tryCatch(
success <- sapply(librarynames,require, lib.loc = lib.loc, quietly = FALSE, warn.conflicts = warn.conflicts, character.only = TRUE)
#, warning=warningHandle) #end tryCatch
if(length(success) != length(librarynames)) {stop("A package failed to return a success in glibrary.")}




if (all(success)) {
#message("In repsych::glibrary:  Success!")
return(invisible(TRUE))
} else {
stop(paste("\nIn repsych::glibrary, unable to load: ", paste(librarynames[!success]),
collapse = " "))
}
stop("A problem occurred in glibrary") #shouldn't get this far down, all returns should be made.
}
NULL
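
The mirror fallback is the part of glibrary that is easiest to reuse on its own. A much smaller sketch of the same idea, assuming a fixed fallback URL instead of the country-code search; ensure_cran_mirror and its fallback argument are hypothetical names, and "@CRAN@" is the placeholder R uses when no mirror has been chosen:

```r
# Set a CRAN mirror only if none has been configured yet.
ensure_cran_mirror <- function(fallback = "https://cloud.r-project.org") {
  repos <- getOption("repos")
  unset <- is.null(repos) ||
    !"CRAN" %in% names(repos) ||
    identical(repos[["CRAN"]], "@CRAN@")
  if (unset) {
    repos["CRAN"] <- fallback
    options(repos = repos)
  }
  invisible(getOption("repos"))
}

ensure_cran_mirror()  # leaves an already configured mirror untouched
```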

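I found a packages script somewhere that I always put in every script that loads libraries. It does all the library handling (downloading, installing and loading), and only when needed.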

# Install function for packages
packages<-function(x){
x<-as.character(match.call()[[2]])
if (!require(x,character.only=TRUE)){
install.packages(pkgs=x,repos="http://cran.r-project.org")
require(x,character.only=TRUE)
}
}
packages(ggplot2)
packages(reshape2)
packages(plyr)
# etc etc
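
The unquoted calls work because the function captures the argument expression instead of evaluating it. A sketch of that capture step in isolation, using substitute() in place of the match.call() used above; pkg_name is a hypothetical name:

```r
# Turn an unquoted (or quoted) package name into a character string.
pkg_name <- function(x) {
  as.character(substitute(x))
}

pkg_name(ggplot2)    # returns "ggplot2"
pkg_name("ggplot2")  # returns "ggplot2" as well
```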

Reading everyone's responses, I took some hints here and there and created my own. It is actually very similar to most of them.

## These codes are used for installing packages
# function for installing needed packages
installpkg <- function(x){
if(x %in% rownames(installed.packages())==FALSE) {
if(x %in% rownames(available.packages())==FALSE) {
paste(x,"is not a valid package - please check again...")
} else {
install.packages(x)
}


} else {
paste(x,"package already installed...")
}
}


# install necessary packages
required_packages  <- c("sqldf","car")
lapply(required_packages,installpkg)

If you want to keep it as simple as possible:

packages <- c("ggplot2", "dplyr", "Hmisc", "lme4", "arm", "lattice", "lavaan")


install.packages(setdiff(packages, rownames(installed.packages())))

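Replace the packages listed on the first line with the ones your code needs, and voilà!

Note: Edited to remove the conditional wrapper, thanks to Artem's comment below.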

Looking at my old function, I updated it to use the tips above, and this is what I got.

# VERSION 1.0
assign("installP", function(pckgs){
ins <- function(pckg, mc){
add <- paste(c(" ", rep("-", mc+1-nchar(pckg)), " "), collapse = "");
if( !require(pckg,character.only=TRUE) ){
reps <- c("http://lib.stat.cmu.edu/R/CRAN","http://cran.uk.R-project.org");
for (r in reps) try(utils::install.packages(pckg, repos=r), silent=TRUE);
if(!require(pckg,character.only = TRUE)){   cat("Package: ",pckg,add,"not found.\n",sep="");
}else{                                      cat("Package: ",pckg,add,"installed.\n",sep="");}
}else{                                          cat("Package: ",pckg,add,"is loaded.\n",sep=""); } }
invisible(suppressMessages(suppressWarnings(lapply(pckgs,ins, mc=max(nchar(pckgs)))))); cat("\n");
}, envir=as.environment("dg_base"))


installP(c("base","a","TFX"))
Package: base ------------------- is loaded.
Package: a ---------------------- not found.
Package: TFX -------------------- installed.

I have implemented this function to silently install and load the required R packages. Hope it helps. Here is the code:

# Function to Install and Load R Packages
Install_And_Load <- function(Required_Packages)
{
Remaining_Packages <- Required_Packages[!(Required_Packages %in% installed.packages()[,"Package"])];


if(length(Remaining_Packages))
{
install.packages(Remaining_Packages);
}
for(package_name in Required_Packages)
{
library(package_name,character.only=TRUE,quietly=TRUE);
}
}


# Specify the list of required packages to be installed and load
Required_Packages=c("ggplot2", "Rcpp");


# Call the Function
Install_And_Load(Required_Packages);

There is also the CRAN package pacman, which has the p_load function to install one or more packages (but only if necessary) and then load them.
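
A short sketch of how that might look in a shared script, assuming pacman itself may not be installed yet; load_with_pacman is a hypothetical wrapper, and char is the p_load argument that accepts package names as a character vector:

```r
# Hypothetical wrapper: make sure pacman is present, then let p_load
# install (if needed) and load the requested packages.
load_with_pacman <- function(pkgs) {
  if (!requireNamespace("pacman", quietly = TRUE)) {
    install.packages("pacman")
  }
  pacman::p_load(char = pkgs)
}

# Example call (installs anything missing, so it needs a configured CRAN mirror):
# load_with_pacman(c("ggplot2", "Rcpp"))
```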

I suggest a more lightweight solution using system.file.

is_inst <- function(pkg) {
nzchar(system.file(package = pkg))
}


is_inst2 <- function(pkg) {
pkg %in% rownames(installed.packages())
}


library(microbenchmark)
microbenchmark(is_inst("aaa"), is_inst2("aaa"))
## Unit: microseconds
##            expr      min        lq       mean    median       uq       max neval
##  is_inst("aaa")   22.284   24.6335   42.84806   34.6815   47.566   252.568   100
## is_inst2("aaa") 1099.334 1220.5510 1778.57019 1401.5095 1829.973 17653.148   100
microbenchmark(is_inst("ggplot2"), is_inst2("ggplot2"))
## Unit: microseconds
##                expr      min       lq     mean   median       uq      max neval
##  is_inst("ggplot2")  336.845  386.660  459.243  431.710  483.474  867.637   100
## is_inst2("ggplot2") 1144.613 1276.847 1507.355 1410.054 1656.557 2747.508   100
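
The same system.file check extends to a whole vector of names. A sketch, assuming you want a named logical vector back; installed_flags is a hypothetical helper name:

```r
# For each package name, TRUE if system.file() can locate the package.
installed_flags <- function(pkgs) {
  vapply(pkgs, function(p) nzchar(system.file(package = p)), logical(1))
}

flags <- installed_flags(c("stats", "thisPackageDoesNotExist12345"))
names(flags)[!flags]  # the names that would still need install.packages()
```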

requiredPackages = c('plyr','ggplot2','ggtern')
for(p in requiredPackages){
  if(!require(p,character.only = TRUE)) install.packages(p)
  library(p,character.only = TRUE)
}