N <- 1e4 # total number of rows to preallocate--possibly an overestimate
DF <- data.frame(num=rep(NA, N), txt=rep("", N), # as many cols as you need
stringsAsFactors=FALSE) # you don't know levels yet
然后在操作期间一次插入一行
DF[i, ] <- list(1.4, "foo")
That should work for arbitrary data.frame and be much more efficient. If you overshot N you can always shrink empty rows out at the end.
If R Core wanted to do this, underlying vector storage could function like a union mount. One reference to the vector storage might be valid for subscripts 1:N, while another reference to the same storage is valid for subscripts 1:(N+1). There could be reserved storage not yet validly referenced by anything but convenient for a quick push_back(). You don't violate the functional concept when appending outside the range that any existing reference considers valid.
Eventually appending rows incrementally, you run out of reserved storage. You'll need to create new copies of everything, with the storage multiplied by some increment. The STL implementations I've use tend to multiply storage by 2 when extending allocation. I thought I read in R Internals that there is a memory structure where the storage increments by 20%. Either way, growth operations occur with logarithmic frequency relative to the total number of elements appended. On an amortized basis, this is usually acceptable.
Admittedly, I see 2 major limitations: (1) this only works with single-mode data, and (2) you must know your final # columns for this to work (i.e., I'm assuming that you're not working with a ragged array whose greatest row length is unknown 先验的).
这个解决方案看起来很简单,但是根据我在 R 中进行类型转换的经验,我确信它会带来新的挑战。有人对此有何评论吗?
row1<-list("a",1,FALSE) #use 'list', not 'c' or 'cbind'!
row2<-list("b",2,TRUE)
df<-data.frame(row1,stringsAsFactors = F) #first row
df<-rbind(df,row2) #now this works as you'd expect.