Use the pipe operator %>% with replacement functions like colnames()<-

How can I use the pipe operator to pipe into a replacement function like colnames()<- ?

Here's what I'm doing:

library(dplyr)
averages_df <-
  group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp))
colnames(averages_df) <- c("cyl", "disp_mean", "hp_mean")
averages_df


# Source: local data frame [3 x 3]
#
#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429

But ideally it would be something like:

averages_df <-
  group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  add_colnames(c("cyl", "disp_mean", "hp_mean"))

Is there a way to do this without writing a special function each time?

The answers here are a start, but not exactly my question: Chaining arithmetic operators in dplyr


You could use colnames<- or setNames (thanks to @David Arenburg)

group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  `colnames<-`(c("cyl", "disp_mean", "hp_mean"))
# or
# `names<-`(c("cyl", "disp_mean", "hp_mean"))
# setNames(., c("cyl", "disp_mean", "hp_mean"))


#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429
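The backtick form works because a replacement function such as `colnames<-` is an ordinary function that takes the object as its first argument and returns the modified copy, which is exactly the shape a pipe needs. As a minimal base R sketch (the data frame df and the new names are made up for illustration):

```r
# A small illustrative data frame (hypothetical data)
df <- data.frame(a = 1:2, b = 3:4)

# Assignment form: modifies a copy in place
renamed_assign <- df
colnames(renamed_assign) <- c("x", "y")

# Direct call: the replacement function returns the modified object
renamed_call <- `colnames<-`(df, c("x", "y"))

# Both forms give the same result, and df itself is left untouched
same_result <- identical(renamed_assign, renamed_call)
original_names <- names(df)
```

Since the direct call returns a value rather than assigning, it can sit at the end of a %>% chain, with the piped object supplied as the first argument.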

Or pick an alias (set_colnames) from magrittr:

library(magrittr)
group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  set_colnames(c("cyl", "disp_mean", "hp_mean"))

dplyr::rename may be more convenient if you are only (re)naming a few out of many columns. It requires writing both the old and the new name; see @Richard Scriven's answer.

In dplyr, there are a couple of different ways to rename the columns.

One is to use the rename() function. In this example you'd need to back-tick the names created by summarise(), since they are expressions.

group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  rename(disp_mean = `mean(disp)`, hp_mean = `mean(hp)`)
#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429

You could also use select(). This is a bit easier because we can use the column number, eliminating the need to mess around with back-ticks.

group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  select(1, disp_mean = 2, hp_mean = 3)

But for this example, the best way would be to do what @thelatemail mentioned in the comments, and that is to go back one step and name the columns in summarise().

group_by(mtcars, cyl) %>%
  summarise(disp_mean = mean(disp), hp_mean = mean(hp))

With dplyr, we can add a suffix to the summarised variables by passing a named function to the .funs argument of summarise_at(), as in the code below.

library(dplyr)


# summarise_at with dplyr
mtcars %>%
  group_by(cyl) %>%
  summarise_at(
    .vars = c("disp", "hp"),   # columns to summarise (.cols is a deprecated alias)
    .funs = c(mean = "mean")   # the name "mean" becomes the column suffix
  )
# A tibble: 3 × 3
# cyl disp_mean   hp_mean
# <dbl>     <dbl>     <dbl>
# 1     4  105.1364  82.63636
# 2     6  183.3143 122.28571
# 3     8  353.1000 209.21429
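In dplyr 1.0.0 and later, summarise_at() is superseded by across(), so the same suffixing can probably be written as below. This is a sketch that assumes a recent dplyr is installed; the "{.col}_mean" naming pattern is chosen here to match the column names used in the question.

```r
library(dplyr)

# Summarise disp and hp per cylinder group; .names controls the output
# column names, with {.col} standing for the original column name
averages_df <- mtcars %>%
  group_by(cyl) %>%
  summarise(across(c(disp, hp), mean, .names = "{.col}_mean"))

names(averages_df)
```

This keeps the naming inside summarise(), so no renaming step is needed after the pipe.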

Also, we can set column names in several ways.

# set_names with magrittr
mtcars %>%
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  magrittr::set_names(c("cyl", "disp_mean", "hp_mean"))


# set_names with purrr
mtcars %>%
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  purrr::set_names(c("cyl", "disp_mean", "hp_mean"))


# setNames with stats
mtcars %>%
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  stats::setNames(c("cyl", "disp_mean", "hp_mean"))


# A tibble: 3 × 3
# cyl disp_mean   hp_mean
# <dbl>     <dbl>     <dbl>
# 1     4  105.1364  82.63636
# 2     6  183.3143 122.28571
# 3     8  353.1000 209.21429

This would also work:

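# Look up the replacement function that belongs to `fun`,
# e.g. set(colnames) returns `colnames<-`
set <- function(fun) {
  match.fun(paste0(deparse(substitute(fun)), "<-"))
}


library(dplyr, warn.conflicts = FALSE)
group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  set(colnames)(c("cyl", "disp_mean", "hp_mean"))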
#> # A tibble: 3 × 3
#>     cyl disp_mean hp_mean
#>   <dbl>     <dbl>   <dbl>
#> 1     4      105.    82.6
#> 2     6      183.   122.
#> 3     8      353.   209.

Created on 2022-11-23 with reprex v2.0.2
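To see what the set() helper above does, the following base R sketch repeats its definition and checks that it simply returns the matching replacement function. The small data frame df is hypothetical and only there for illustration.

```r
# Same helper as above: build the name "<fun><-" and look the function up
set <- function(fun) {
  match.fun(paste0(deparse(substitute(fun)), "<-"))
}

# set(colnames) is the very same function as `colnames<-`
is_colnames_setter <- identical(set(colnames), `colnames<-`)

# Called directly, it takes the object first and the new value second
df <- data.frame(a = 1:2, b = 3:4)
renamed <- set(names)(df, c("x", "y"))
names(renamed)
```

Because set() only works for bare function names that deparse to a single string, something like set(base::colnames) would not resolve; that limitation is an observation about this sketch, not about the original answer.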