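Here is a script that finds packages you have loaded but not used in your script. It needs to be run in a clean session, because there is no way to verify that the state of your current session is the same as the state the script would create. It assumes packages are loaded only with library or require, which is good practice anyway. I have not tested it extensively, but it seems reasonably sound. The comments explain how the code works. It was a fun exercise to write it entirely in base R, so that the script itself does not have to load any packages. The idea of using getParseData as a starting point came from Eric Green's answer to this related question.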
# Define the file to test in the line below. That is the only per-run configuration needed.
fileToTest <- "Plot.R"
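# Get the parse data for the file. keep.source = TRUE is set explicitly because
# non-interactive sessions (e.g. Rscript) do not keep source references by default,
# and getParseData() returns NULL without them.
parseData <- getParseData(parse(fileToTest, keep.source = TRUE), includeText = TRUE)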
# Extract all the function calls and keep a unique list of them.
functionCalls <- unique(parseData[parseData$token == "SYMBOL_FUNCTION_CALL", "text"])
# Look for any calls to `library` or `require` and go two steps up the
# call tree to find the complete call (with arguments).
libraryCalls <- parseData[parseData$token == "SYMBOL_FUNCTION_CALL" & parseData$text %in% c("library", "require"),]
libraryCalls <- parseData[parseData$id %in% libraryCalls$parent,]
libraryCalls <- parseData[parseData$id %in% libraryCalls$parent,]
libraryCalls <- libraryCalls$text
# Execute all the library/require calls to attach them to this session
eval(parse(text = libraryCalls))
# For each function called,
# * Use `getAnywhere` to find out where it is found. That information is in a character
# vector which is the `where` component of the returned list.
# * From that vector of locations, keep only the ones starting with "package:",
# getting rid of those starting with "namespace:".
# * Take the first one of these which should be the first package that the
# function is found in and thus would be the one used.
names(functionCalls) <- functionCalls
matchPkg <- vapply(functionCalls,
                   FUN = (\(f) grep("^package:", getAnywhere(f)$where, value = TRUE)[1]),
                   FUN.VALUE = character(1))
# Get a list of all packages from the search path, keep only those that are
# actually packages (not .GlobalEnv, Autoloads, etc.), ignore those that are
# automatically attached (base, methods, datasets, utils, grDevices, graphics, stats),
# and then see which of the remaining ones did not show up in the list of
# packages used by the functions.
packages <- search()
packages <- grep("^package:", packages, value = TRUE)
packages <- setdiff(packages, c("package:base", "package:methods", "package:datasets", "package:utils", "package:grDevices", "package:graphics", "package:stats"))
packages <- setdiff(packages, unique(matchPkg))
# Report results
if (length(packages) > 0) {
  cat("Unused packages: \n")
  print(packages)
} else {
  cat("No unused packages found.\n")
}
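
To see what the parse data looks like, here is a minimal sketch that runs the same extraction on an inline snippet instead of a file. The `code` string and the variable names `pd` and `calls` are made up for illustration; only `parse`, `getParseData` and the `SYMBOL_FUNCTION_CALL` token are taken from the script above.

```r
# Hypothetical three-line script, kept as a string so no file is needed
code <- 'library(stats)\nx <- mean(c(1, 2, 3))\ny <- paste("a", "b")'

# keep.source = TRUE makes sure the parse data is recorded
pd <- getParseData(parse(text = code, keep.source = TRUE), includeText = TRUE)

# Every name that is used as a function call, without duplicates
calls <- unique(pd[pd$token == "SYMBOL_FUNCTION_CALL", "text"])
print(calls)
```

Note that `library` itself shows up as a function call, which is why the script above can locate the loading commands by filtering the same token type.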
library("stringr")

script_path <- "/path/to/your/script.R"
# Matches library("package_name"); package names may contain letters, digits and dots
load_command_pattern <- "library\\(\"[A-Za-z0-9.]+\"\\)"
# Read the whole script into a single string
text <- readChar(script_path, file.info(script_path)$size)

# Find all instances where packages are loaded
pck <- str_extract_all(text, pattern = load_command_pattern)

# Extract the bare package name from each library("...") call
packages <- list()
for (i in seq_along(pck[[1]])) {
  p <- pck[[1]][i]
  name <- str_extract(sub("^library", "", p), "[A-Za-z0-9.]+")
  packages <- append(packages, name, after = length(packages))
}
# Load packages (the loop is skipped if no packages were found)
for (p in packages) {
  library(p, character.only = TRUE)
}
# Make a list to store packages from which no function is called
remove <- list()
for (p in packages) {
  # list all functions contained in the package
  funs <- ls(paste0("package:", p))
  # add an opening bracket to make sure to only find functions, not comments etc.
  functions <- paste0(funs, "(")
  # for every function in the package, check whether its name appears in the script;
  # fixed = TRUE matches literally, so names containing regex metacharacters
  # (e.g. `[` or `.`) cannot break or distort the search
  in_script <- vapply(functions, grepl, logical(1), x = text, fixed = TRUE)
  # if none of the functions are contained in the script, add the package to the list
  if (!any(in_script)) {
    remove <- append(remove, p)
  }
}
# Remove loading commands for all unused packages
# (iterating over the list directly also works when nothing needs to be removed)
for (p in remove) {
  to_remove <- paste0("library(\"", p, "\")")
  text <- gsub(to_remove, "", text, fixed = TRUE)
}
# Save output (to a new file! Don't overwrite your existing script without testing!)
sink(file = "/path/to/your/new_script.R")
cat(gsub("\\r", "", text))
sink()
3 answers

#1
The first answer is the getParseData script at the top of the page, together with its explanation.

#2
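If you attach the libraries via library or require, it is easiest to search your code for those calls. If you call libraries without attaching them, via the <library>::<export> syntax, then search for ::.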
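
One way to run that search without a regular expression is to reuse the parse data from the first answer: the parser labels the name in front of `::` with the token `SYMBOL_PACKAGE`. This is only a sketch. The `code` string is a made-up example, and the packages it mentions do not need to be installed, because the code is parsed and never run.

```r
# Hypothetical script text that uses packages without attaching them
code <- 'x <- stringr::str_detect("abc", "a")\ny <- stats::median(c(1, 2, 3))'

# Parse only; nothing in `code` is evaluated
pd <- getParseData(parse(text = code, keep.source = TRUE))

# Package names that appear on the left-hand side of `::`
qualified_pkgs <- unique(pd$text[pd$token == "SYMBOL_PACKAGE"])
print(qualified_pkgs)
```

Unlike a plain text search for `::`, this ignores occurrences inside comments and string literals.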
If you are worried about transitive dependencies, or just want to create a reproducible environment, look at the packrat package: http://rstudio.github.io/packrat/

#3
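This is not particularly pretty or efficient, but it should do the job in most cases. It is the second script above, the one that begins with library("stringr").
Note that I assume you load packages with library("package_name"); you may need to adjust the regex pattern. What the code is supposed to do: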
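1. Read the R script as text.
2. Find all instances where packages are loaded. In this case, I specifically search for calls to library(...). Here we extract the package name, assuming it consists of simple name characters such as letters and digits.
3. Load the packages and list the functions they contain. If none of a package's functions can be found in the script, append the package name to a list of packages to be removed.
4. Replace all instances where unnecessary packages are loaded. (You could also remove the newline.)
5. Write the script text to a new file. Check that the output is what you expect, and test the new script.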
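
If your scripts also use unquoted names or require, step 2 can be adapted. The following is a minimal base R sketch, not part of the original answer: the pattern, the sample `script_text` and the variable names are assumptions chosen for illustration.

```r
# Hypothetical script text with three different loading styles
script_text <- 'library("stringr")\nlibrary(data.table)\nrequire(ggplot2)\nx <- 1'

# library(...) or require(...), with optional quotes around the package name;
# the second capture group holds the name itself
load_pattern <- "(library|require)\\([[:space:]]*[\"']?([A-Za-z][A-Za-z0-9.]*)[\"']?[[:space:]]*\\)"

# All loading commands found in the text
matches <- regmatches(script_text, gregexpr(load_pattern, script_text))[[1]]

# Reduce each command to the captured package name
pkg_names <- sub(load_pattern, "\\2", matches)
print(pkg_names)
```

Because this works on raw text, it also picks up loading commands that sit inside comments or strings.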
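Note that this is not perfect. For example, functions with the same name can appear in more than one package. In addition, the code does not currently distinguish between a match of the full function name and a match at the end of a longer function name: searching for my_function( gives a false positive for another_my_function(. You could add an extra check for whether the function name is preceded by a symbol, a newline, or a space. Still, I think the code should work for most cases. Of course, if you load all your packages at the start of the script, you could create the list of loaded packages by hand; likewise, you could simply print the list of unused packages and remove them manually.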
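
The extra check mentioned above could look roughly like this. It is a sketch under assumptions: `calls_function` is a hypothetical helper, and it relies on Perl-compatible regular expressions (perl = TRUE) for the negative lookbehind and for `\Q...\E`, which quotes the function name literally.

```r
# TRUE if `fun_name` is called in `script_text` as a complete name,
# i.e. not merely as the tail of a longer name
calls_function <- function(fun_name, script_text) {
  # (?<![A-Za-z0-9._]) : the previous character must not be part of an R name
  # \Q...\E            : treat the function name as literal text
  pattern <- paste0("(?<![A-Za-z0-9._])\\Q", fun_name, "\\E\\(")
  grepl(pattern, script_text, perl = TRUE)
}

print(calls_function("my_function", "y <- another_my_function(2)"))
print(calls_function("my_function", "y <- my_function(2)"))
```

The first call should report no match and the second a match. Names that appear inside comments or string literals are still counted, so the result remains a heuristic.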