How to get the line number of an error in an R script?

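If I run a long R script from the command line (R --slave script.R), how can I get it to report line numbers when an error occurs?

If at all possible, I don't want to add debugging commands to the script; I just want R to behave like most other scripting languages.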


This won't give you the line number, but it will tell you where in the call stack the failure happened, which is very helpful:

traceback()

[Edit:] When running a script from the command line you will have to skip one or two calls, see traceback() for interactive and non-interactive R sessions

I'm not aware of another way to do this without the usual debugging tools:

  1. debug()
  2. browser()
  3. options(error = recover) [followed by options(error = NULL) to revert it]

You might want to look at this related post.

[Edit:] Sorry...just saw that you're running this from the command line. In that case I would suggest working with the options(error) functionality. Here's a simple example:

options(error = quote({dump.frames(to.file=TRUE); q()}))

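You can make the error handler as elaborate as you want, so you only need to decide what information you need for debugging.

Otherwise, if there are specific areas you're concerned about (e.g. connecting to a database), wrap them in a tryCatch() call.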
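As a rough sketch of the tryCatch() suggestion above: the function connect_to_database() and the host name below are hypothetical stand-ins for whatever risky step you care about, not part of any real package. The handler only reports which step failed and returns NULL so the script can decide what to do next.

```r
# Hypothetical stand-in for a risky step such as opening a database connection.
connect_to_database <- function(host) {
  stop("could not reach host '", host, "'")
}

result <- tryCatch(
  connect_to_database("db.example.invalid"),
  error = function(e) {
    # conditionMessage() extracts the text of the error condition.
    message("Step 'connect' failed: ", conditionMessage(e))
    NULL  # value returned by tryCatch() when the step fails
  }
)

is.null(result)  # TRUE here, because the stand-in always fails
```

This does not give a line number either, but naming the step in the message narrows down where to look.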

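Support for this will be available in R 2.10 and later. Duncan Murdoch just posted to r-devel on Sep 10 2009 about findLineNum and setBreakpoint:

I've just added a couple of functions to R-devel to help with debugging. findLineNum() finds which line of which function corresponds to a particular line of source code; setBreakpoint() takes the output of findLineNum, and calls trace() to set a breakpoint there.

These rely on having source reference debug information in the code. This is the default for code read by source(), but not for packages. To get the source references in package code, set the environment variable R_KEEP_PKG_SOURCE=yes, or within R, set options(keep.source.pkgs=TRUE), then install the package from source code. Read ?findLineNum for details on how to tell it to search within packages, rather than limiting the search to the global environment.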

For example,

x <- " f <- function(a, b) {
if (a > b)  {
a
} else {
b
}
}"




eval(parse(text=x))  # Normally you'd use source() to read a file...


findLineNum("<text>#3")   # <text> is a dummy filename used by parse(text=)

This will print

 f step 2,3,2 in <environment: R_GlobalEnv>

and you can use

setBreakpoint("<text>#3")

to set a breakpoint there.

There are still some limitations (and probably bugs) in the code; I'll be fixing those.

Doing options(error=traceback) provides a little more information about the content of the lines leading up to the error. It causes a traceback to appear if there is an error, and for some errors it has the line number, prefixed by #. But it's hit or miss, many errors won't get line numbers.

You can do it by setting

options(show.error.locations = TRUE)

I just wonder why this setting is not the default in R. It should be, as it is in other languages.

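Specifying the global R option for handling non-catastrophic errors worked for me, along with a customized workflow for retaining information about the error and examining it after the failure. I am currently running R version 3.4.1. Below, I describe the workflow I use, along with some code for setting the global error handling option in R.

As I have it configured, the error handling also creates an RData file containing all objects in working memory at the time of the error. This dump can be read back into R using load(), and the various environments as they existed at the time of the error can then be inspected interactively using debugger(errorDump).

I will note that I was able to get line numbers in the traceback() output for any custom functions in the stack, but only if I used the keep.source=TRUE option when calling source() for the custom functions used in my script. Without this option, setting the global error handling option as below still sent the full output of traceback() to an error log named error.log, but line numbers were not available.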
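To illustrate why keep.source=TRUE matters, here is a minimal sketch that sources the same file twice and checks whether the resulting function carries a source reference (the "srcref" attribute that line numbers are derived from). The file contents and the name addOne are made up for the illustration.

```r
# Write a small hypothetical function to a temporary file.
script <- tempfile(fileext = ".R")
writeLines(c(
  "# a comment on line 1",
  "addOne <- function(x) {",
  "  x + 1",
  "}"
), script)

# Source the file into two separate environments, with and without source references.
with_refs <- new.env()
source(script, local = with_refs, keep.source = TRUE)

without_refs <- new.env()
source(script, local = without_refs, keep.source = FALSE)

has_srcref <- function(f) !is.null(attr(f, "srcref"))

has_srcref(with_refs$addOne)     # TRUE: line information is attached
has_srcref(without_refs$addOne)  # FALSE: nothing for traceback() to report

# Line in the file where the function definition starts.
utils::getSrcLocation(with_refs$addOne, "line")
```

Note that the keep.source option defaults to interactive(), so it is FALSE under Rscript unless you set it yourself.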

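Here are the general steps in my workflow, and how I access the memory dump and error log after a non-interactive R failure.

  1. I put the following at the top of the main script that I call from the command line. This sets the global error handling option for the R session. My main script is called myMainScript.R. The lines in the code have comments after them describing what they do. Basically, with this option, when R encounters an error that triggers stop(), it will create an RData (*.rda) dump file of working memory across all active environments in the directory ~/myUsername/directoryForDump, and will also write an error log named error.log with some useful information to the same directory. You can modify this snippet to add other handling on error (e.g., add a timestamp to the dump file and error log filenames, etc.).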

    options(error = quote({
    setwd('~/myUsername/directoryForDump'); # Set working directory where you want the dump to go, since dump.frames() doesn't seem to accept absolute file paths.
    dump.frames("errorDump", to.file=TRUE, include.GlobalEnv=TRUE); # First dump to file; this dump is not accessible by the R session.
    sink(file="error.log"); # Specify sink file to redirect all output.
    dump.frames(); # Dump again to be able to retrieve error message and write to error log; this dump is accessible by the R session since not dumped to file.
    cat(attr(last.dump,"error.message")); # Print error message to file, along with simplified stack trace.
    cat('\nTraceback:');
    cat('\n');
    traceback(2); # Print full traceback of function calls with all parameters. The 2 passed to traceback omits the outermost two function calls.
    sink();
    q()}))
    
  2. Make sure that from the main script and any subsequent function calls, anytime a function is sourced, the option keep.source=TRUE is used. That is, to source a function, you would use source('~/path/to/myFunction.R', keep.source=TRUE). This is required for the traceback() output to contain line numbers. It looks like you may also be able to set this option globally using options( keep.source=TRUE ), but I have not tested this to see if it works. If you don't need line numbers, you can omit this option.

  3. From the terminal (outside R), call the main script in batch mode using Rscript myMainScript.R. This starts a new non-interactive R session and runs the script myMainScript.R. The code snippet given in step 1 that has been placed at the top of myMainScript.R sets the error handling option for the non-interactive R session.
  4. Encounter an error somewhere within the execution of myMainScript.R. This may be in the main script itself, or nested several functions deep. When the error is encountered, handling will be performed as specified in step 1, and the R session will terminate.
  5. An RData dump file named errorDump.rda and an error log named error.log are created in the directory specified by '~/myUsername/directoryForDump' in the global error handling option setting.
  6. At your leisure, inspect error.log to review information about the error, including the error message itself and the full stack trace leading to the error. Here's an example of the log that's generated on error; note the numbers after the # character are the line numbers of the error at various points in the call stack:

    Error in callNonExistFunc() : could not find function "callNonExistFunc"
    Calls: test_multi_commodity_flow_cmd -> getExtendedConfigDF -> extendConfigDF
    
    
    Traceback:
    3: extendConfigDF(info_df, data_dir = user_dir, dlevel = dlevel) at test_multi_commodity_flow.R#304
    2: getExtendedConfigDF(config_file_path, out_dir, dlevel) at test_multi_commodity_flow.R#352
    1: test_multi_commodity_flow_cmd(config_file_path = config_file_path,
    spot_file_path = spot_file_path, forward_file_path = forward_file_path,
    data_dir = "../", user_dir = "Output", sim_type = "spot",
    sim_scheme = "shape", sim_gran = "hourly", sim_adjust = "raw",
    nsim = 5, start_date = "2017-07-01", end_date = "2017-12-31",
    compute_averages = opt$compute_averages, compute_shapes = opt$compute_shapes,
    overwrite = opt$overwrite, nmonths = opt$nmonths, forward_regime = opt$fregime,
    ltfv_ratio = opt$ltfv_ratio, method = opt$method, dlevel = 0)
    
  7. At your leisure, you may load errorDump.rda into an interactive R session using load('~/path/to/errorDump.rda'). Once loaded, call debugger(errorDump) to browse all R objects in memory in any of the active environments. See the R help on debugger() for more info.
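The post-mortem step can also be tried without a real failure. The sketch below is only a demonstration under stated assumptions: it calls dump.frames() directly from inside a function rather than from an error handler, writes to tempdir(), and uses the made-up names inner_step, outer_step and demoDump. It then loads the dump and reads a local variable out of the innermost saved frame, which is what debugger() lets you do interactively.

```r
# dump.frames(to.file = TRUE) writes <name>.rda into the working directory,
# so switch to a temporary directory for the demonstration.
dump_dir <- tempdir()
old_wd <- setwd(dump_dir)

inner_step <- function(x) {
  local_value <- x * 2
  dump.frames("demoDump", to.file = TRUE)  # writes demoDump.rda
  invisible(NULL)
}
outer_step <- function() inner_step(21)
outer_step()
setwd(old_wd)

# Post-mortem: load the dump into its own environment and inspect it.
dump_env <- new.env()
load(file.path(dump_dir, "demoDump.rda"), envir = dump_env)
frames <- dump_env$demoDump

class(frames)   # "dump.frames"
names(frames)   # one label per call that was on the stack

# The last element is the frame of the function that called dump.frames().
innermost <- frames[[length(frames)]]
get("local_value", envir = innermost)  # 42
```

In an interactive session, debugger(frames) would offer the same frames as a menu to browse.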

This workflow is enormously helpful when running R in some type of production environment where you have non-interactive R sessions being initiated at the command line and you want information retained about unexpected errors. The ability to dump memory to a file you can use to inspect working memory at the time of the error, along with having the line numbers of the error in the call stack, facilitates speedy post-mortem debugging of what caused the error.

First set options(show.error.locations = TRUE), then call traceback(). The line number of the error will be displayed after the #.