How to get the second sub-element of every element in a list

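I know I've come across this problem before, but I'm having a bit of a mental block at the moment. Since I can't find it online, I'm posting it here so that I can find it next time.

I have a data frame that includes a field representing an ID label. The label has two parts, an alphabetic prefix and a numeric suffix. I want to split it apart and create two new fields containing those two parts.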

df <- structure(list(lab = c("N00", "N01", "N02", "B00", "B01", "B02",
"Z21", "BA01", "NA03")), .Names = "lab", row.names = c(NA, -9L
), class = "data.frame")


df$pre<-strsplit(df$lab, "[0-9]+")
df$suf<-strsplit(df$lab, "[A-Z]+")

This gives:

   lab pre  suf
1  N00   N , 00
2  N01   N , 01
3  N02   N , 02
4  B00   B , 00
5  B01   B , 01
6  B02   B , 02
7  Z21   Z , 21
8 BA01  BA , 01
9 NA03  NA , 03

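So the first strsplit works fine, but the second returns a list in which each element has two sub-elements, an empty string and the result I want, and stuffs both of them into the data frame column.

How can I select the second sub-element from each element of the list? (Or is there a better way to do this?)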


To select the second element of each list item:

R> sapply(df$suf, "[[", 2)
[1] "00" "01" "02" "00" "01" "02" "21" "01" "03"
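A closely related variant, sketched here with vapply instead of sapply, also checks that every extracted value is a single string. The labels are copied from the question; the names parts and suf are only illustrative.

```r
# Sample labels from the question
lab <- c("N00", "N01", "N02", "B00", "B01", "B02", "Z21", "BA01", "NA03")

# Splitting on the letters leaves an empty string in front of each number
parts <- strsplit(lab, "[A-Z]+")

# Take the second sub-element of each list item; vapply raises an error
# if any result is not a length-one character vector
suf <- vapply(parts, `[[`, character(1), 2)
suf
```

Unlike sapply, vapply always returns a character vector here, so the result can be assigned to a data frame column without ending up as a list.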

An alternative approach using regular expressions:

df$pre <- sub("^([A-Z]+)[0-9]+", "\\1", df$lab)
df$suf <- sub("^[A-Z]+([0-9]+)", "\\1", df$lab)
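If it helps, the two capture groups can also be pulled out in a single pass with base R's regexec and regmatches. This is only a sketch of that idea, not part of the answer above; the names m, pre and suf are illustrative.

```r
# Sample labels from the question
lab <- c("N00", "N01", "N02", "B00", "B01", "B02", "Z21", "BA01", "NA03")

# Each list element holds the full match followed by the two capture groups
m <- regmatches(lab, regexec("^([A-Z]+)([0-9]+)$", lab))

# Group 1 sits at position 2 and group 2 at position 3;
# a label that does not match yields NA in both vectors
pre <- vapply(m, `[`, character(1), 2)
suf <- vapply(m, `[`, character(1), 3)
data.frame(lab, pre, suf)
```

Note that the prefix of the last label is the string "NA", not a missing value.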

First of all: if you run str(df) you'll see that df$pre is a list. I think you want a vector (but I might be wrong).
Back to the problem: in this case I would use gsub:

df$pre <- gsub("[0-9]", "", df$lab)
df$suf <- gsub("[A-Z]", "", df$lab)

This guarantees that both columns are vectors, but it gives wrong results if a label does not follow the letters-then-digits pattern (e.g. 'AB01B', where pre would become "ABB").
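One way to guard against such labels, sketched under the assumption that a valid label is one or more capital letters followed only by digits, is to test the pattern first and leave NA where it does not match. The vector lab and the name ok are made up for illustration.

```r
# Two well-formed labels and the malformed example from above
lab <- c("N00", "BA01", "AB01B")

# TRUE where the label is capital letters followed by digits and nothing else
ok <- grepl("^[A-Z]+[0-9]+$", lab)

# Strip the digits or the letters only for valid labels; others become NA
pre <- ifelse(ok, sub("[0-9]+$", "", lab), NA_character_)
suf <- ifelse(ok, sub("^[A-Z]+", "", lab), NA_character_)
data.frame(lab, pre, suf)
```

Whether NA, an error, or something else is the right response to a malformed label depends on the data, so treat this as a starting point.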

With purrr::map_chr this would be

library(purrr)
df$suf %>% map_chr(2)

For further information, see the documentation for purrr::map.