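Move a column to the end using dplyr

For a data.frame with n columns, I would like to be able to move a column from any of positions 1 to (n-1) to position n (that is, make a column that is not last become the last column). I would like to do this with dplyr, and without simply typing out the names of all the columns.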

For example:

data<-data.frame(a=1:5, b=6:10, c=11:15)

This works, but it is not the dplyr way:

data[,c(colnames(data)[colnames(data)!='b'],'b')]

This is the dplyr way to make b the first column:

data%>%select(b, everything())

But the same idea does not move b to the end:

data%>%select(everything(), b)

This works, but requires me to type out every column:

data%>%select(a,c,b)

So is there an elegant dplyr way to do this?


We can either use

data %>%
  select(-one_of('b'), one_of('b'))
#  a  c  b
#1 1 11  6
#2 2 12  7
#3 3 13  8
#4 4 14  9
#5 5 15 10

Or

data %>%
  select(matches("[^b]"), matches("b"))

or with select_() (deprecated in current dplyr)

data %>%
  select_(.dots = c(setdiff(names(.), 'b'), 'b'))
#  a  c  b
#1 1 11  6
#2 2 12  7
#3 3 13  8
#4 4 14  9
#5 5 15 10
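
For readers on a current dplyr, where one_of() is superseded and select_() is deprecated, the same idea can be sketched with all_of(). This is only an illustration: the variable name col is hypothetical and stands for a column name held as a string.

```r
library(dplyr)

data <- data.frame(a = 1:5, b = 6:10, c = 11:15)

# Hypothetical variable holding the name of the column to move
col <- "b"

# Drop the column first, then add it back so it lands at the end
res_all_of <- data %>%
  select(-all_of(col), all_of(col))

names(res_all_of)
# [1] "a" "c" "b"
```

all_of() raises an error if the named column does not exist; any_of() could be used instead if missing names should be ignored silently.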

Since there's no ready-made solution to this in dplyr you could define your own little function to do it for you:

move_last <- function(DF, last_col) {
  # positions of all other columns first, then the column(s) to move last
  match(c(setdiff(names(DF), last_col), last_col), names(DF))
}

You can then use it easily in a normal select call:

mtcars %>% select(move_last(., "mpg")) %>% head()

You can also move multiple columns to the end:

mtcars %>% select(move_last(., c("mpg", "cyl"))) %>% head()

And you can still supply other arguments to select, for example to remove a column:

mtcars %>% select(move_last(., "mpg"), -carb) %>% head()

After some tinkering, the following works and requires very little typing.

data %>% select(-b,b)


UPDATE: dplyr 1.0.0

dplyr 1.0.0 introduces the relocate verb:

data %>% relocate(b, .after = last_col())

I still prefer the old "hacky" way.

Update:

dplyr::relocate, a new verb introduced in dplyr 1.0.0, is now my preferred solution. It is explicit about what you are doing, you can continue to pick variables using tidyselect helpers, and you can specify exactly where to put the columns with .before or .after:

data %>% relocate(b, .after = last_col())

(same as dule arnaux's update)
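
As a small sketch of the points above (several columns at once, and picking columns with a tidyselect helper), assuming dplyr 1.0.0 or later. The data frame df2 and its column names are made up for illustration.

```r
library(dplyr)

# Made-up example data; column names are illustrative only
df2 <- data.frame(a = 1:3, b1 = 4:6, c = 7:9, b2 = 10:12)

# Move several named columns to the end; they keep the order given in the call
res_multi <- df2 %>% relocate(b1, a, .after = last_col())
names(res_multi)
# [1] "c"  "b2" "b1" "a"

# Pick the columns to move with a tidyselect helper
res_helper <- df2 %>% relocate(starts_with("b"), .after = last_col())
names(res_helper)
# [1] "a"  "c"  "b1" "b2"
```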

Original answer

data%>%select(-b,everything())

will move variable b to the end.

This works because a negative selection in the first position triggers a special behavior in select(): it starts from all the variables. It then removes b, and the everything() part adds b back at the end.

Explained by Hadley himself: https://github.com/tidyverse/dplyr/issues/2838
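
A quick way to see the difference, sketched on the question's example data (the result names below are only for illustration):

```r
library(dplyr)

data <- data.frame(a = 1:5, b = 6:10, c = 11:15)

# b is already selected by everything(), so naming it again does not move it
res_unchanged <- data %>% select(everything(), b)
names(res_unchanged)
# [1] "a" "b" "c"

# A leading negative selection starts from all columns and drops b;
# everything() then adds b back, after the columns already selected
res_moved <- data %>% select(-b, everything())
names(res_moved)
# [1] "a" "c" "b"
```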

Also see this other answer for other examples of how to move some columns to the end and other columns to the beginning: How does dplyr's select helper function everything() differ from copying?

df <- df[, c(which(colnames(df) != "YourColumnName"), which(colnames(df) == "YourColumnName"))]