Get column index from label in a data frame

Suppose we have the following data frame:

> df
A B C
1 1 2 3
2 4 5 6
3 7 8 9

We can select column 'B' by its index:

> df[,2]
[1] 2 5 8

Is there a way to get the index (2) from the column label ('B')?

You can get the index via grep and colnames:

grep("B", colnames(df))
[1] 2

or use

grep("^B$", colnames(df))
[1] 2

to get only the columns named exactly "B", excluding those whose names merely contain a B, e.g. "ABC".
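To make the difference between the two patterns concrete, here is a minimal sketch. The data frame `df2` and its `ABC` column are made up for illustration and do not come from the question.

```r
# Hypothetical data frame with one column whose name merely contains "B"
df2 <- data.frame(A = 1:3, B = 4:6, ABC = 7:9)

# Unanchored pattern: matches every column name that contains "B"
loose <- grep("B", colnames(df2))

# Anchored pattern: matches only the column named exactly "B"
exact <- grep("^B$", colnames(df2))

print(loose)  # [1] 2 3
print(exact)  # [1] 2
```

If the label contains regex metacharacters (a dot, for instance), passing `fixed = TRUE` to `grep` avoids surprises, though it also gives up the anchors.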

The following will do it:

which(colnames(df)=="B")
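As a quick check, the sketch below rebuilds the question's data frame (the construction call is an assumption, since the question only shows the printed output) and also shows what happens when the label does not exist.

```r
# Rebuild the example data frame from the question
df <- data.frame(A = c(1, 4, 7), B = c(2, 5, 8), C = c(3, 6, 9))

# Exact string comparison, so "ABC" would not match "B"
idx <- which(colnames(df) == "B")
print(idx)        # [1] 2
print(df[, idx])  # [1] 2 5 8

# A label that is not present yields an empty integer vector, not an error
missing_idx <- which(colnames(df) == "Z")
print(length(missing_idx))  # [1] 0
```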

I wanted to see all the indices for the colnames because I needed to do a complicated column rearrangement, so I printed the colnames as a dataframe. The rownames are the indices.

as.data.frame(colnames(df))


1 A
2 B
3 C

This seems to be an efficient way to list variables with their column numbers:

cbind(names(df))

Output:

     [,1]
[1,] "A"
[2,] "B"
[3,] "C"

Sometimes I like to copy variables with position into my code so I use this function:

varnums <- function(x) {
  # Row names hold the commented variable names; the single column holds positions
  w <- data.frame(seq_along(colnames(x)),
                  row.names = paste0("# ", colnames(x)))
  names(w) <- "# Var/Pos"
  w
}
varnums(df)

Output:

    # Var/Pos
# A         1
# B         2
# C         3

Following on from chimeric's answer above:

To get all the column indices in the df, I used:

which(!names(df)%in%c())

or store the result (an integer vector) in a variable:

indexLst<-which(!names(df)%in%c())
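A possibly simpler route to the same result is `seq_along`, sketched below. The `named_idx` lookup vector is a hypothetical helper added for illustration and is not part of the answer above.

```r
df <- data.frame(A = 1:3, B = 4:6, C = 7:9)

# The expression from the answer above: every name passes the filter
via_which <- which(!names(df) %in% c())

# seq_along on a data frame counts its columns
all_idx <- seq_along(df)
print(all_idx)  # [1] 1 2 3

# Hypothetical helper: a named vector maps each label to its position
named_idx <- setNames(seq_along(df), names(df))
print(named_idx[["B"]])  # [1] 2
```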

Use the t function:

t(colnames(df))


     [,1]   [,2]   [,3]   [,4]   [,5]   [,6]
[1,] "var1" "var2" "var3" "var4" "var5" "var6"

match("B", names(df))

This also works if you have a vector of names.
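For instance, a minimal sketch with several labels at once. The `wanted` vector is illustrative, and "Z" is deliberately a name that does not exist in the data frame.

```r
df <- data.frame(A = 1:3, B = 4:6, C = 7:9)

# Labels to look up; "Z" is absent on purpose
wanted <- c("C", "A", "Z")

# match returns one position per requested label, in the requested order,
# and NA for any label that is not found
pos <- match(wanted, names(df))
print(pos)  # [1]  3  1 NA
```

Because the result lines up one-to-one with `wanted`, a missing label is easy to detect with `is.na(pos)`.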

Here is an answer that will generalize Henrik's answer.

df <- data.frame(A = rnorm(100), B = rnorm(100), C = rnorm(100))
numeric_columns <- c('A', 'B', 'C')
# Note: the patterns are unanchored, so a name such as 'AB' would also match 'A'
numeric_index <- sapply(seq_along(numeric_columns), function(i)
  grep(numeric_columns[i], colnames(df)))

To generalize @NPE's answer slightly:

which(colnames(dat) %in% var)

where var is of the form

c("colname1","colname2",...,"colnamen")

returns the indices of whichever column names one needs.
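One detail worth knowing is the order of the result. The sketch below uses made-up contents for `dat` and `var`, and suggests that `%in%` reports positions in column order, whereas `match` follows the order of `var`.

```r
dat <- data.frame(A = 1:3, B = 4:6, C = 7:9)
var <- c("C", "A")

# Positions come back in the order the columns appear in dat
by_in <- which(colnames(dat) %in% var)
print(by_in)     # [1] 1 3

# match keeps the order in which the names were requested
by_match <- match(var, colnames(dat))
print(by_match)  # [1] 3 1
```

Which one is preferable depends on whether the indices are used to subset in place or to reorder columns.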

I wanted the column index instead of the column name. This line of code worked for me:

which(data.frame(colnames(datE)) == colnames(datE[c(1:15)]), arr.ind = TRUE)[, 1]

with datE being a regular data frame with 15 columns (variables):


data.frame(colnames(datE))
#>    colnames.datE.
#> 1              Ce
#> 2              Eu
#> 3              La
#> 4              Pr
#> 5              Nd
#> 6              Sm
#> 7              Gd
#> 8              Tb
#> 9              Dy
#> 10             Ho
#> 11             Er
#> 12              Y
#> 13             Tm
#> 14             Yb
#> 15             Lu


which(data.frame(colnames(datE)) == colnames(datE[c(1:15)]), arr.ind = TRUE)[, 1]
#> [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15