Cartesian product data frame

Best answer

You can use expand.grid(A, B, C).
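As a minimal sketch of what expand.grid returns, assuming the example vectors A, B and C that appear further down in this thread (repeated here so the snippet runs on its own). Naming the columns A, B, C is a choice made for this illustration only. Note that expand.grid varies its first argument fastest.

```r
# Example inputs, copied from a later answer in this thread
A <- c(1, 2, 3)
B <- factor(c('x', 'y'))
C <- c(.1, .5)

# One row per combination of the three inputs
g <- expand.grid(A = A, B = B, C = C)

nrow(g)      # 3 * 2 * 2 combinations
head(g, 4)   # A cycles fastest, then B, then C
```

If the last column should vary fastest instead, the rows can be reordered afterwards with order().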

Edit: As an alternative to using do.call for the second part, you can use the function mdply from the plyr package:

library(plyr)


d = expand.grid(x = A, y = B, z = C)
d = mdply(d, f)

To illustrate its use with a simple paste function, you can try:

d = mdply(d, 'paste', sep = '+');

I can never remember that standard function expand.grid. So here's another version.

crossproduct <- function(..., FUN = 'data.frame') {
  args <- list(...)
  n1 <- names(args)
  n2 <- sapply(match.call()[1+1:length(args)], as.character)
  nn <- if (is.null(n1)) n2 else ifelse(n1 != '', n1, n2)
  dims <- sapply(args, length)
  dimtot <- prod(dims)
  reps <- rev(cumprod(c(1, rev(dims))))[-1]
  cols <- lapply(1:length(dims), function(j)
    args[[j]][1+((1:dimtot-1) %/% reps[j]) %% dims[j]])
  names(cols) <- nn
  do.call(match.fun(FUN), cols)
}


A <- c(1,2,3)
B <- factor(c('x','y'))
C <- c(.1,.5)


crossproduct(A,B,C)


crossproduct(A,B,C, FUN=function(...) paste(...,sep='_'))
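The index arithmetic inside crossproduct can be hard to read, so here is a small sketch that isolates it for the example sizes (3, 2 and 2 elements). The variable idx is introduced only for this illustration. Each column of idx holds the positions picked from one input vector, and the last input varies fastest.

```r
# Lengths of the three example inputs A, B, C
dims <- c(3, 2, 2)
dimtot <- prod(dims)

# How many consecutive rows each input keeps the same value
reps <- rev(cumprod(c(1, rev(dims))))[-1]

# Row k (0-based) picks element ((k %/% reps[j]) %% dims[j]) + 1 of input j
idx <- sapply(seq_along(dims), function(j)
  1 + ((seq_len(dimtot) - 1) %/% reps[j]) %% dims[j])

reps
idx
```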

Here is a way to do both, using Ramnath's suggestion of expand.grid:

f <- function(x,y,z) paste(x,y,z,sep="+")
d <- expand.grid(x=A, y=B, z=C)
d$D <- do.call(f, d)

Note that do.call works on d "as is" because a data.frame is a list. However, do.call expects the column names of d to match the argument names of f.
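The name-matching point can be sketched as follows. The data frame d2 and its column names p, q, r are hypothetical, chosen only to show what happens when the names do not match. One possible workaround, assuming positional matching is acceptable, is to drop the names before calling do.call.

```r
f <- function(x, y, z) paste(x, y, z, sep = "+")

# Column names match the argument names of f, so do.call works directly
d <- expand.grid(x = 1:2, y = c("a", "b"), z = 0.5, stringsAsFactors = FALSE)
res1 <- do.call(f, d)

# Hypothetical data frame whose column names do NOT match f's arguments
d2 <- setNames(d, c("p", "q", "r"))
failed <- inherits(try(do.call(f, d2), silent = TRUE), "try-error")

# Dropping the names makes do.call match the arguments by position
res2 <- do.call(f, unname(as.list(d2)))
```

Here res1 and res2 hold the same strings, while the call on d2 with its names intact fails with an "unused arguments" error.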

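There is a function that operates on data frames, merge, which is helpful in this case.

It can produce various kinds of joins (in SQL terms), and the Cartesian product is a special case.

You must first convert the variables to data frames, because merge takes data frames as arguments.

So something like this will do: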

A.B=merge(data.frame(A=A), data.frame(B=B),by=NULL);
A.B.C=merge(A.B, data.frame(C=C),by=NULL);

The only thing to watch out for is that the rows are not ordered the way you described. You can sort them manually as needed.
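The manual sorting mentioned above could be sketched like this, assuming the example vectors A, B and C from the earlier answer (repeated so the snippet is self-contained). The name sorted is introduced only for this illustration.

```r
A <- c(1, 2, 3)
B <- factor(c('x', 'y'))
C <- c(.1, .5)

# Cartesian product via two merges with by = NULL
A.B <- merge(data.frame(A = A), data.frame(B = B), by = NULL)
A.B.C <- merge(A.B, data.frame(C = C), by = NULL)

# Reorder so that A varies slowest and C varies fastest
sorted <- A.B.C[with(A.B.C, order(A, B, C)), ]
rownames(sorted) <- NULL   # reset row names after reordering
sorted
```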

merge(x, y, by = intersect(names(x), names(y)),
      by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all,
      sort = TRUE, suffixes = c(".x",".y"),
      incomparables = NULL, ...)

If by or both by.x and by.y are of length 0 (a length-zero vector or NULL), the result, r, is the Cartesian product of x and y.

For details, see this URL: http://stat.ethz.ch/R-manual/R-patched/library/base/html/merge.html

Consider using the wonderful data.table library for expressiveness and speed. It handles many plyr use-cases (relational group by), along with transform, subset and relational join using a fairly simple uniform syntax.

library(data.table)
d <- CJ(x=A, y=B, z=C)  # Cross join
d[, w:=f(x,y,z)]  # Mutates the data.table

Or as a one-liner:

d <- CJ(x=A, y=B, z=C)[, w:=f(x,y,z)]
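For completeness, a self-contained sketch of the snippet above, assuming the data.table package is installed. A, B, C and f are not defined in this answer, so they are copied here from the earlier answers in the thread. By default CJ returns its rows sorted, so the last column varies fastest.

```r
library(data.table)

# Inputs and helper copied from earlier answers in this thread
A <- c(1, 2, 3)
B <- factor(c('x', 'y'))
C <- c(.1, .5)
f <- function(x, y, z) paste(x, y, z, sep = "+")

d <- CJ(x = A, y = B, z = C)   # cross join of the three vectors
d[, w := f(x, y, z)]           # adds column w by reference
head(d)
```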

With the tidyr library, you can use tidyr::crossing (the row order will be the same as in the OP):

library(tidyr)
crossing(A,B,C)
# A tibble: 12 x 3
#        A B         C
#    <dbl> <fct> <dbl>
#  1     1 x       0.1
#  2     1 x       0.5
#  3     1 y       0.1
#  4     1 y       0.5
#  5     2 x       0.1
#  6     2 x       0.5
#  7     2 y       0.1
#  8     2 y       0.5
#  9     3 x       0.1
# 10     3 x       0.5
# 11     3 y       0.1
# 12     3 y       0.5

The next step is to use the tidyverse, in particular the purrr::pmap* family:

library(tidyverse)
crossing(A,B,C) %>% mutate(D = pmap_chr(.,paste,sep="_"))
# A tibble: 12 x 4
#        A B         C D
#    <dbl> <fct> <dbl> <chr>
#  1     1 x       0.1 1_1_0.1
#  2     1 x       0.5 1_1_0.5
#  3     1 y       0.1 1_2_0.1
#  4     1 y       0.5 1_2_0.5
#  5     2 x       0.1 2_1_0.1
#  6     2 x       0.5 2_1_0.5
#  7     2 y       0.1 2_2_0.1
#  8     2 y       0.5 2_2_0.5
#  9     3 x       0.1 3_1_0.1
# 10     3 x       0.5 3_1_0.5
# 11     3 y       0.1 3_2_0.1
# 12     3 y       0.5 3_2_0.5
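In the output above, column D contains the integer codes of the factor B (1 and 2) rather than its labels (x and y). If the labels are wanted, one option is sketched below in base R instead of purrr, relying on the fact that paste converts factors to their labels. expand.grid stands in for crossing here, so the row order differs (the first column varies fastest).

```r
A <- c(1, 2, 3)
B <- factor(c('x', 'y'))
C <- c(.1, .5)

d <- expand.grid(A = A, B = B, C = C)

# Paste all columns row-wise; factors are pasted by label, not by code
d$D <- do.call(paste, c(d, sep = "_"))
head(d)
```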

Using a cross join in sqldf:

library(sqldf)


A <- data.frame(c1 = c(1,2,3))
B <- data.frame(c2 = factor(c('x','y')))
C <- data.frame(c3 = c(0.1,0.5))


result <- sqldf('SELECT * FROM (A CROSS JOIN B) CROSS JOIN C')