Removing multiple columns from a data.table

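What is the correct way to remove multiple columns from a data.table? I am currently using the code below, but I got unexpected behavior when I accidentally repeated one of the column names. I am not sure whether this is a bug, or whether I simply should not be removing columns this way.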

library(data.table)
DT <- data.table(x = letters, y = letters, z = letters)
DT[ ,c("x","y") := NULL]
names(DT)
[1] "z"

The above works fine, but

DT <- data.table(x = letters, y = letters, z = letters)
DT[ ,c("x","x") := NULL]
names(DT)
[1] "z"

This looks like a solid, reproducible bug. It's been filed as Bug #2791.

It appears that repeating a column name causes the column that follows it to be deleted as well.
If no columns remain, R crashes.


UPDATE: Now fixed in v1.8.11. From NEWS:

Assigning to the same column twice in the same query is now an error rather than a crash in some circumstances; e.g., DT[,c("B","B"):=NULL] (delete by reference the same column twice). Thanks to Ricardo (#2751) and matt_k (#2791) for reporting. Tests added.

This question has already been answered, but consider this a side note.

I prefer the following syntax to drop multiple columns:

DT[ ,`:=`(x = NULL, y = NULL)]

because it matches the syntax used to add multiple columns (variables):

DT[ ,`:=`(x = letters, y = "Male")]

This also checks for duplicated column names, so trying to drop x twice will throw an error.
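
If the column names come from a character vector that might contain repeats, one way to sidestep the duplicate problem is to deduplicate the vector before deleting. This is only a minimal sketch: drop_cols is a hypothetical name used for illustration and does not appear in the question.

```r
library(data.table)

DT <- data.table(x = letters, y = letters, z = letters)

# Hypothetical vector of columns to remove; "x" is repeated on purpose.
drop_cols <- c("x", "x", "y")

# unique() removes the repeated name, so each column is deleted only once.
# The parentheses make data.table treat the expression as column names.
DT[, (unique(drop_cols)) := NULL]

names(DT)
```

Wrapping the expression in parentheses on the left-hand side of := tells data.table to evaluate it as a vector of column names rather than as a single literal name.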