Create a data.frame where a column is a list

I know how to add a list column:

> df <- data.frame(a=1:3)
> df$b <- list(1:1, 1:2, 1:3)
> df
  a       b
1 1       1
2 2    1, 2
3 3 1, 2, 3

This works, but the following does not:

> df <- data.frame(a=1:3, b=list(1:1, 1:2, 1:3))
Error in data.frame(1L, 1:2, 1:3, check.names = FALSE, stringsAsFactors = TRUE) :
arguments imply differing number of rows: 1, 2, 3

Why?

Also, is there a way to create df (above) in a single call to data.frame?


Slightly obscurely, from ?data.frame:

If a list or data frame or matrix is passed to ‘data.frame’ it is as if each component or column had been passed as a separate argument (except for matrices of class ‘"model.matrix"’ and those protected by ‘I’).

(emphasis added).

So

data.frame(a=1:3,b=I(list(1,1:2,1:3)))

seems to work.
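
To make the quoted behaviour concrete, here is a minimal sketch in base R (the names spread and df are only illustrative, not from the documentation). It suggests why the original call fails: a bare list is spread into one column per component, so components of lengths 1, 2 and 3 cannot be lined up, whereas I() keeps the list together as a single column.

```r
# A bare list is treated as if each component were a separate argument.
# With equal-length components this succeeds, but yields one column per component.
spread <- data.frame(a = 1:3, b = list(1:3, 4:6, 7:9))
n_spread <- ncol(spread)        # 4 columns: a plus one per list component

# Wrapping the list in I() protects it, so it stays a single list column.
df <- data.frame(a = 1:3, b = I(list(1, 1:2, 1:3)))
n_protected <- ncol(df)         # 2 columns: a and b
elem_lengths <- lengths(df$b)   # 1 2 3, one entry per row
```

Individual cells can then be reached with df$b[[2]], which returns the vector stored in the second row.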

If you are working with data.tables, then you can avoid the call to I():

library(data.table)
# the following works as intended
data.table(a=1:3,b=list(1,1:2,1:3))


   a     b
1: 1     1
2: 2   1,2
3: 3 1,2,3

Tibbles (variously called data_frames, tbl_df, tbl) natively support the creation of list columns using the data_frame constructor (deprecated in current versions of tibble in favor of tibble()). To use them, load one of the packages that provide them, such as tibble, dplyr or tidyverse.

> data_frame(abc = letters[1:3], lst = list(1:3, 1:3, 1:3))
# A tibble: 3 × 2
    abc       lst
  <chr>    <list>
1     a <int [3]>
2     b <int [3]>
3     c <int [3]>

They are actually data.frames under the hood, but somewhat modified. They can almost always be used as normal data.frames. The only exception I've found is that when people do inappropriate class checks, they cause problems:

> #no problem
> data.frame(x = 1:3, y = 1:3) %>% class
[1] "data.frame"
> data.frame(x = 1:3, y = 1:3) %>% class == "data.frame"
[1] TRUE
> #uh oh
> data_frame(x = 1:3, y = 1:3) %>% class
[1] "tbl_df"     "tbl"        "data.frame"
> data_frame(x = 1:3, y = 1:3) %>% class == "data.frame"
[1] FALSE FALSE  TRUE
> #don't use if with improper testing!
> if(data_frame(x = 1:3, y = 1:3) %>% class == "data.frame") "something"
Warning message:
In if (data_frame(x = 1:3, y = 1:3) %>% class == "data.frame") "something" :
the condition has length > 1 and only the first element will be used
> #proper
> data_frame(x = 1:3, y = 1:3) %>% inherits("data.frame")
[1] TRUE
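
The same contrast can be sketched in base R alone. The object x below is a hand-built stand-in that merely carries the class vector shown above; it is an assumption for illustration, not a real tibble, so the snippet runs without any package installed.

```r
# Hand-built object carrying the same class vector as a tibble,
# so the sketch runs without any package installed.
x <- structure(
  list(x = 1:3, y = 1:3),
  class = c("tbl_df", "tbl", "data.frame"),
  row.names = c(NA, -3L)
)

# Comparing class() with == returns one result per class name.
by_equality <- class(x) == "data.frame"   # FALSE FALSE TRUE

# inherits() and is.data.frame() each return a single TRUE/FALSE,
# which is what if() expects.
by_inherits <- inherits(x, "data.frame")  # TRUE
by_helper <- is.data.frame(x)             # TRUE
```

Either of the last two forms is safe inside if(), because the result always has length one regardless of how many classes the object carries.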

I recommend reading about them in R for Data Science (free).

You can use list2DF to create a data.frame where a column is a list.

x <- list2DF(list(a=1:3, b=list(1:1, 1:2, 1:3)))
#x <- data.frame(a=1:3, list2DF(list(b=list(1:1, 1:2, 1:3)))) #Alternative


x
#  a       b
#1 1       1
#2 2    1, 2
#3 3 1, 2, 3


str(x)
#'data.frame':   3 obs. of  2 variables:
# $ a: int  1 2 3
# $ b:List of 3
#  ..$ : int 1
#  ..$ : int  1 2
#  ..$ : int  1 2 3

With this, the list column does not carry the "AsIs" class in the data.frame, which it would when using I().
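
A short sketch of that difference, assuming R 4.0.0 or later (where list2DF is available); the names with_I and with_list2DF are only illustrative:

```r
# Build the same data two ways.
with_I <- data.frame(a = 1:3, b = I(list(1L, 1:2, 1:3)))
with_list2DF <- list2DF(list(a = 1:3, b = list(1L, 1:2, 1:3)))

# The I() version keeps the protective class on the column,
# while the list2DF() version stores a plain list.
class_I <- class(with_I$b)              # "AsIs"
class_list2DF <- class(with_list2DF$b)  # "list"

# Dropping the class afterwards gives a plain list column as well.
with_I$b <- unclass(with_I$b)
class_after <- class(with_I$b)          # "list"
```

Whether the extra class matters depends on what later code does with the column; for most indexing and lapply-style work the two behave the same.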