Extracting the year from a date

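How can I remove the first part of each value in a variable, especially when the variable contains special characters? For example, I have the following column: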

Date
01/01/2009
01/01/2010
01/01/2011
01/01/2012

I need a new column like the following:

Date
2009
2010
2011
2012

If all your dates are the same width, you can put them in a vector and use substring:

a <- c("01/01/2009", "01/01/2010", "01/01/2011")
# Keep only the characters from position 7 through position 10 of each string
substring(a, 7, 10)

Output:

[1] "2009" "2010" "2011"
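If the dates are not all the same width (for example "1/1/2009" next to "01/01/2010"), fixed positions no longer line up. A minimal sketch of one alternative, assuming the year is always whatever follows the last slash; the vector b is made up for illustration:

```r
# Hypothetical input with inconsistent widths
b <- c("1/1/2009", "01/01/2010", "12/31/2011")

# The greedy ".*/" matches everything up to and including the last "/",
# so replacing it with "" leaves only the trailing year
years <- sub(".*/", "", b)
print(years)
```

Like substring, this returns character strings; wrap the result in as.integer() if you need numbers.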

As discussed in the comments, this can be achieved by converting the entry into Date format and extracting the year, for instance like this:

format(as.Date(df1$Date, format="%d/%m/%Y"),"%Y")
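The one-liner above assumes a data frame called df1 that the question never shows. A small self-contained sketch, with df1 built by hand from the question's values and a new column name (Year) chosen here for illustration:

```r
# Hypothetical data frame standing in for df1
df1 <- data.frame(
  Date = c("01/01/2009", "01/01/2010", "01/01/2011", "01/01/2012"),
  stringsAsFactors = FALSE
)

# Parse the strings as day/month/year, then format the year back out
parsed <- as.Date(df1$Date, format = "%d/%m/%Y")

# format() returns character, so convert if a numeric year is wanted
df1$Year <- as.integer(format(parsed, "%Y"))
print(df1$Year)
```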

When you convert your variable to Date:

date <-  as.Date('10/30/2018','%m/%d/%Y')

you can then cut out the elements you want and make new variables, like year:

year <- as.numeric(format(date,'%Y'))

or month:

month <- as.numeric(format(date,'%m'))

This is more advice than a specific answer, but my suggestion is to convert dates to date variables immediately, rather than keeping them as strings. This way you can use date (and time) functions on them, rather than trying to use very troublesome workarounds.

As pointed out, the lubridate package has nice extraction functions.
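A rough sketch of what that might look like, assuming lubridate is installed and that the strings are in day/month/year order (the question does not say which, so dmy() is a guess; use mdy() otherwise):

```r
library(lubridate)  # assumption: the package is available

a <- c("01/01/2009", "01/01/2010", "01/01/2011")

# dmy() parses day/month/year strings into Date objects
d <- dmy(a)

# year() and month() pull out the components as numbers
yrs <- year(d)
mons <- month(d)
print(yrs)
print(mons)
```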

For some projects, I have found that piecing dates out from the start is helpful: create year, month, day (of month) and day (of week) variables to start with. This can simplify summaries, tables and graphs, because the extraction code is separate from the summary/table/graph code, and because if you need to change it, you don't have to roll out those changes in multiple spots.
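As an illustration of that idea, here is a minimal base R sketch that splits the dates once, up front. The data frame name (parts) and its column names are made up for this example, and day/month/year order is assumed:

```r
dates <- as.Date(
  c("01/01/2009", "01/01/2010", "01/01/2011", "01/01/2012"),
  format = "%d/%m/%Y"
)

# Do all the extraction in one place
parts <- data.frame(
  date  = dates,
  year  = as.integer(format(dates, "%Y")),
  month = as.integer(format(dates, "%m")),
  mday  = as.integer(format(dates, "%d")),   # day of month
  wday  = as.POSIXlt(dates)$wday             # day of week, 0 = Sunday ... 6 = Saturday
)

# Summary code can then use the prepared columns directly
print(table(parts$year))
```

If the extraction rule changes later (say, a different input format), only the block that builds parts needs to change.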

If you are using the date package, this can be done fairly easily.

library(date)
Date <- c("01/01/2009", "01/01/2010", "01/01/2011", "01/01/2012")
Date <- as.date(Date)
Date
# [1] 1Jan2009 1Jan2010 1Jan2011 1Jan2012
date.mdy(Date)$year
# [1] 2009 2010 2011 2012


## be aware that these are now integers and thus different methods may be invoked:
str(date.mdy(Date)$year)
# int [1:4] 2009 2010 2011 2012
summary(Date)
#     First      Last
# "1Jan2009" "1Jan2012"
summary(date.mdy(Date)$year)
#    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
#    2009    2010    2010    2010    2011    2012

For some time now, it has also been possible to rely solely on the data.table package and its IDate class plus associated functions (see ?as.IDate).

library(data.table)
a <- c("01/01/2009", "01/01/2010" , "01/01/2011")
year(as.IDate(a, '%d/%m/%Y')) # year() and as.IDate() both come from data.table