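Iterating through a range of dates in Python

I have the following code to do this, but how can I do it better? Right now I think it is better than nested loops, but it starts to look like a Perl one-liner when you have a generator inside a list comprehension.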

day_count = (end_date - start_date).days + 1
for single_date in [d for d in (start_date + timedelta(n) for n in range(day_count)) if d <= end_date]:
    print(strftime("%Y-%m-%d", single_date.timetuple()))

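Notes

  • I'm not actually using this to print. That's just for demo purposes.
  • The start_date and end_date variables are datetime.date objects because I don't need the timestamps. (They're going to be used to generate a report.)

Sample output

For a start date of 2009-05-30 and an end date of 2009-06-09: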

2009-05-30
2009-05-31
2009-06-01
2009-06-02
2009-06-03
2009-06-04
2009-06-05
2009-06-06
2009-06-07
2009-06-08
2009-06-09

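Why are there two nested iterations? For me, a single iteration produces the same list of data:

for single_date in (start_date + timedelta(n) for n in range(day_count)):
    print(...)

No list gets stored; only one generator is iterated over. The "if" in the generator also seems to be unnecessary.

After all, a linear sequence should need only one iterator, not two.

Update after discussion with John Machin:

Maybe the most elegant solution is to use a generator function that completely hides/abstracts the iteration over the range of dates: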

from datetime import date, timedelta


def daterange(start_date, end_date):
    for n in range(int((end_date - start_date).days)):
        yield start_date + timedelta(n)


start_date = date(2013, 1, 1)
end_date = date(2015, 6, 2)
for single_date in daterange(start_date, end_date):
    print(single_date.strftime("%Y-%m-%d"))

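Note: For consistency with the built-in range() function, this iteration stops before reaching end_date. For inclusive iteration, pass the day after end_date, as you would with range().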
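To make the half-open convention concrete, here is a minimal sketch that repeats the daterange generator above and passes end_date + timedelta(days=1) to include the last day. The dates are the sample dates from the question; the names exclusive_days and inclusive_days are illustrative only.

```python
from datetime import date, timedelta


def daterange(start_date, end_date):
    # Half-open range: yields start_date up to, but not including, end_date.
    for n in range((end_date - start_date).days):
        yield start_date + timedelta(n)


start_date = date(2009, 5, 30)
end_date = date(2009, 6, 9)

# Exclusive of end_date, like range(): the last value is the day before end_date.
exclusive_days = list(daterange(start_date, end_date))

# Inclusive of end_date: pass the following day as the stop value.
inclusive_days = list(daterange(start_date, end_date + timedelta(days=1)))

print(exclusive_days[-1], inclusive_days[-1])
```

Keeping the generator half-open and adjusting the stop value at the call site mirrors how range(a, b + 1) is normally written.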

This might be clearer:

from datetime import date, timedelta


start_date = date(2019, 1, 1)
end_date = date(2020, 1, 1)
delta = timedelta(days=1)
while start_date <= end_date:
    print(start_date.strftime("%Y-%m-%d"))
    start_date += delta

import datetime


def daterange(start, stop, step=datetime.timedelta(days=1), inclusive=False):
    # inclusive=False to behave like range by default
    if step.days > 0:
        while start < stop:
            yield start
            start = start + step
            # not +=! don't modify object passed in if it's mutable
            # since this function is not restricted to
            # only types from datetime module
    elif step.days < 0:
        while start > stop:
            yield start
            start = start + step
    if inclusive and start == stop:
        yield start


# ...


for date in daterange(start_date, end_date, inclusive=True):
    print(strftime("%Y-%m-%d", date.timetuple()))

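This function does more than you strictly need, by supporting a negative step and so on. Once you factor out the range logic, you no longer need the separate day_count, and most importantly the code becomes easier to read when you call the function from multiple places.

Use the dateutil library: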
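As a rough illustration of the extra behaviour described above, the sketch below repeats the same generator and calls it with an inclusive range, a negative step, and a weekly step. The variable names first, last, forward, backward and weekly are made up for this example.

```python
import datetime


def daterange(start, stop, step=datetime.timedelta(days=1), inclusive=False):
    # Same logic as the function above, repeated so this sketch runs on its own.
    if step.days > 0:
        while start < stop:
            yield start
            start = start + step
    elif step.days < 0:
        while start > stop:
            yield start
            start = start + step
    if inclusive and start == stop:
        yield start


first = datetime.date(2009, 5, 30)
last = datetime.date(2009, 6, 9)

# Forward, including the stop date.
forward = list(daterange(first, last, inclusive=True))

# Backward: start and stop are swapped and the step is negative.
backward = list(daterange(last, first, step=datetime.timedelta(days=-1)))

# Weekly steps, stop date excluded.
weekly = list(daterange(first, last, step=datetime.timedelta(days=7)))
```

Note that the branches test step.days, so a step shorter than one day (for example one hour) matches neither loop in this version.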

from datetime import date
from dateutil.rrule import rrule, DAILY


a = date(2009, 5, 30)
b = date(2009, 6, 9)


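for dt in rrule(DAILY, dtstart=a, until=b):
    print(dt.strftime("%Y-%m-%d"))

This Python library has many more advanced features, some of them very useful, such as relative deltas, and it is implemented as a single file (module) that is easy to include in a project.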

import datetime


def daterange(start, stop, step_days=1):
    current = start
    step = datetime.timedelta(step_days)
    if step_days > 0:
        while current < stop:
            yield current
            current += step
    elif step_days < 0:
        while current > stop:
            yield current
            current += step
    else:
        raise ValueError("daterange() step_days argument must not be zero")


if __name__ == "__main__":
    from pprint import pprint as pp
    lo = datetime.date(2008, 12, 27)
    hi = datetime.date(2009, 1, 5)
    pp(list(daterange(lo, hi)))
    pp(list(daterange(hi, lo, -1)))
    pp(list(daterange(lo, hi, 7)))
    pp(list(daterange(hi, lo, -7)))
    assert not list(daterange(lo, hi, -1))
    assert not list(daterange(hi, lo))
    assert not list(daterange(lo, hi, -7))
    assert not list(daterange(hi, lo, 7))


for i in range(16):
    print(datetime.date.today() + datetime.timedelta(days=i))

How about the following for a range incremented by days:

for d in map(lambda x: startDate + datetime.timedelta(days=x), range((stopDate - startDate).days)):
    pass  # Do stuff here

  • startDate and stopDate are datetime.date objects

For a generic version:

for d in map(lambda x: startTime + x * stepTime, range(int((stopTime - startTime).total_seconds() / stepTime.total_seconds()))):
    pass  # Do stuff here

  • startTime and stopTime are datetime.date or datetime.datetime objects (both should be the same type)
  • stepTime is a timedelta object

Note that .total_seconds() is only supported in Python 2.7 and later. If you are stuck with an earlier version, you can write your own function:

def total_seconds(td):
    return float(td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 10**6
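On an interpreter that does have timedelta.total_seconds(), the hand-written helper can be checked against the built-in method. This is only a quick sanity-check sketch; the sample value is arbitrary.

```python
from datetime import timedelta


def total_seconds(td):
    # Manual equivalent of timedelta.total_seconds() for older interpreters.
    return float(td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 10**6


# Arbitrary sample duration with day, second and microsecond components.
sample = timedelta(days=2, hours=3, minutes=4, seconds=5, microseconds=600000)

manual = total_seconds(sample)
builtin = sample.total_seconds()
print(manual, builtin)
```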

Why not try:

import datetime as dt


start_date = dt.datetime(2012, 12, 1)
end_date = dt.datetime(2012, 12, 5)

total_days = (end_date - start_date).days + 1  # inclusive 5 days

for day_number in range(total_days):
    current_date = (start_date + dt.timedelta(days=day_number)).date()
    print(current_date)

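Pandas is great for time series in general and has direct support for date ranges.

import pandas as pd
daterange = pd.date_range(start_date, end_date)

You can then loop over daterange to print each date:

for single_date in daterange:
    print(single_date.strftime("%Y-%m-%d"))

It also has lots of options to make life easier. For example, if you only want weekdays, just swap in bdate_range. See http://pandas.pydata.org/pandas-docs/stable/timeseries.html#generating-ranges-of-timestamps

The power of Pandas is really its DataFrames, which support vectorized operations (much like numpy) that make operations across large quantities of data fast and easy.

EDIT: You can also skip the for loop entirely and print it directly, which is simpler and more efficient:

print(daterange)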

This function has some extra features:

  • you can pass a string matching DATE_FORMAT for start or end, and it is converted to a date object
  • you can pass a date object for start or end
  • error checking in case the end is earlier than the start

    import datetime
    from datetime import timedelta


    DATE_FORMAT = '%Y/%m/%d'


    def daterange(start, end):
        def convert(date):
            try:
                date = datetime.datetime.strptime(date, DATE_FORMAT)
                return date.date()
            except TypeError:
                # Not a string, so assume it is already a date object
                return date

        def get_date(n):
            return (convert(start) + timedelta(days=n)).strftime(DATE_FORMAT)

        days = (convert(end) - convert(start)).days
        if days <= 0:
            raise ValueError('The start date must be before the end date.')
        for n in range(0, days):
            yield get_date(n)


    start = '2014/12/1'
    end = '2014/12/31'
    print(list(daterange(start, end)))

    start_ = datetime.date.today()
    end = '2015/12/1'
    print(list(daterange(start, end)))

Show the next n days, starting from today:

import datetime
for i in range(0, 100):
    print((datetime.date.today() + datetime.timedelta(i)).isoformat())

Output (first few lines):

2016-06-29
2016-06-30
2016-07-01
2016-07-02
2016-07-03
2016-07-04

NumPy's arange function can be applied to dates:

import numpy as np
from datetime import datetime, timedelta
d0 = datetime(2009, 1, 1)
d1 = datetime(2010, 1, 1)
dt = timedelta(days=1)
dates = np.arange(d0, d1, dt).astype(datetime)

The astype call converts the numpy.datetime64 values into an array of datetime.datetime objects.

Here is the code for a general date range function, similar to Ber's answer but more flexible:

import datetime
import functools


def count_timedelta(delta, step, seconds_in_interval):
    """Helper function for range_dt. Finds the number of intervals in the timedelta."""
    return int(delta.total_seconds() / (seconds_in_interval * step))


def range_dt(start, end, step=1, interval='day'):
    """Iterate over datetimes or dates, similar to builtin range."""
    intervals = functools.partial(count_timedelta, (end - start), step)

    if interval == 'week':
        for i in range(intervals(3600 * 24 * 7)):
            yield start + datetime.timedelta(weeks=i) * step

    elif interval == 'day':
        for i in range(intervals(3600 * 24)):
            yield start + datetime.timedelta(days=i) * step

    elif interval == 'hour':
        for i in range(intervals(3600)):
            yield start + datetime.timedelta(hours=i) * step

    elif interval == 'minute':
        for i in range(intervals(60)):
            yield start + datetime.timedelta(minutes=i) * step

    elif interval == 'second':
        for i in range(intervals(1)):
            yield start + datetime.timedelta(seconds=i) * step

    elif interval == 'millisecond':
        for i in range(intervals(1 / 1000)):
            yield start + datetime.timedelta(milliseconds=i) * step

    elif interval == 'microsecond':
        for i in range(intervals(1e-6)):
            yield start + datetime.timedelta(microseconds=i) * step

    else:
        raise AttributeError("Interval must be 'week', 'day', 'hour', 'minute', "
                             "'second', 'millisecond' or 'microsecond'.")

This is the most human-readable solution I can think of.

import datetime


def daterange(start, end, step=datetime.timedelta(1)):
    curr = start
    while curr < end:
        yield curr
        curr += step

I have a similar problem, but I need to iterate monthly instead of daily.

This is my solution:
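One caveat with a loop of this shape: with a zero step it never terminates, and with a negative step it keeps moving away from end. A possible guard is sketched below; the name safe_daterange and the choice of ValueError are assumptions for illustration, not part of the answer above.

```python
import datetime


def safe_daterange(start, end, step=datetime.timedelta(days=1)):
    # Hypothetical variant: reject steps that cannot make progress towards end.
    # Because this is a generator, the check runs when iteration starts.
    if step <= datetime.timedelta(0):
        raise ValueError("step must be a positive timedelta")
    curr = start
    while curr < end:
        yield curr
        curr += step


days = list(safe_daterange(datetime.date(2020, 1, 1), datetime.date(2020, 1, 4)))
print(days)
```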

import calendar
from datetime import datetime, timedelta


def days_in_month(dt):
    return calendar.monthrange(dt.year, dt.month)[1]


def monthly_range(dt_start, dt_end):
    forward = dt_end >= dt_start
    finish = False
    dt = dt_start

    while not finish:
        yield dt.date()
        if forward:
            days = days_in_month(dt)
            dt = dt + timedelta(days=days)
            finish = dt > dt_end
        else:
            _tmp_dt = dt.replace(day=1) - timedelta(days=1)
            dt = (_tmp_dt.replace(day=dt.day))
            finish = dt < dt_end

Example #1

date_start = datetime(2016, 6, 1)
date_end = datetime(2017, 1, 1)

for p in monthly_range(date_start, date_end):
    print(p)

Output

2016-06-01
2016-07-01
2016-08-01
2016-09-01
2016-10-01
2016-11-01
2016-12-01
2017-01-01

Example #2

date_start = datetime(2017, 1, 1)
date_end = datetime(2016, 6, 1)

for p in monthly_range(date_start, date_end):
    print(p)

Output

2017-01-01
2016-12-01
2016-11-01
2016-10-01
2016-09-01
2016-08-01
2016-07-01
2016-06-01
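The examples above start on the first of the month. If the start day is the 29th to 31st, adding the number of days in the current month can skip past a short month. One possible alternative, sketched here with hypothetical helpers add_months and monthly_dates, works on a month index and clamps the day to the end of the target month.

```python
import calendar
from datetime import date


def add_months(d, months):
    # Hypothetical helper: shift d by a number of months, clamping the day
    # to the last valid day of the target month (e.g. Jan 31 -> Feb 29).
    month_index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(month_index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def monthly_dates(start, end):
    # Yield start, start + 1 month, ... until the result would pass end.
    # Offsets are always computed from start, so clamping does not accumulate.
    step = 1 if end >= start else -1
    n = 0
    while True:
        current = add_months(start, n * step)
        if (step > 0 and current > end) or (step < 0 and current < end):
            break
        yield current
        n += 1


month_ends = list(monthly_dates(date(2016, 1, 31), date(2016, 4, 30)))
backwards = list(monthly_dates(date(2017, 1, 1), date(2016, 6, 1)))
print(month_ends)
```

This version takes datetime.date objects rather than datetime.datetime, which is a simplification made for the sketch.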

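You can generate a series of dates between two dates simply and reliably with the pandas library:

import pandas as pd


print(pd.date_range(start='1/1/2010', end='1/08/2018', freq='M'))

You can change how often dates are generated by setting freq to D, M, Q, or Y (daily, monthly, quarterly, yearly).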

> pip install DateTimeRange


import datetime

from datetimerange import DateTimeRange


def dateRange(start, end, step):
    rangeList = []
    time_range = DateTimeRange(start, end)
    for value in time_range.range(datetime.timedelta(days=step)):
        rangeList.append(value.strftime('%m/%d/%Y'))
    return rangeList


dateRange("2018-09-07", "2018-12-25", 7)


Out[92]:
['09/07/2018',
'09/14/2018',
'09/21/2018',
'09/28/2018',
'10/05/2018',
'10/12/2018',
'10/19/2018',
'10/26/2018',
'11/02/2018',
'11/09/2018',
'11/16/2018',
'11/23/2018',
'11/30/2018',
'12/07/2018',
'12/14/2018',
'12/21/2018']

A slightly different approach to reversible steps, storing the range arguments in a tuple:

from datetime import timedelta


def date_range(start, stop, step=1, inclusive=False):
    day_count = (stop - start).days
    if inclusive:
        day_count += 1

    if step > 0:
        range_args = (0, day_count, step)
    elif step < 0:
        range_args = (day_count - 1, -1, step)
    else:
        raise ValueError("date_range(): step arg must be non-zero")

    for i in range(*range_args):
        yield start + timedelta(days=i)
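A short usage sketch for the function above may help, since a negative step here walks the same start-to-stop span in reverse rather than expecting the arguments to be swapped. The function is repeated so the block runs on its own; the sample dates and variable names are illustrative.

```python
from datetime import date, timedelta


def date_range(start, stop, step=1, inclusive=False):
    # Same logic as above, repeated so this sketch is self-contained.
    day_count = (stop - start).days
    if inclusive:
        day_count += 1
    if step > 0:
        range_args = (0, day_count, step)
    elif step < 0:
        range_args = (day_count - 1, -1, step)
    else:
        raise ValueError("date_range(): step arg must be non-zero")
    for i in range(*range_args):
        yield start + timedelta(days=i)


first = date(2009, 5, 30)
last = date(2009, 6, 2)

ascending = list(date_range(first, last))            # stop date excluded
descending = list(date_range(first, last, step=-1))  # same days, reversed
closed = list(date_range(first, last, step=-1, inclusive=True))
```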

For completeness, Pandas also has a period_range function for dates that fall outside the Timestamp bounds:

import pandas as pd


pd.period_range(start='1/1/1626', end='1/08/1627', freq='D')

import datetime
from dateutil.rrule import DAILY, rrule


date = datetime.datetime(2019, 1, 10)
date1 = datetime.datetime(2019, 2, 2)

for i in rrule(DAILY, dtstart=date, until=date1):
    print(i.strftime('%Y%b%d'), sep='\n')

Output:

2019Jan10
2019Jan11
2019Jan12
2019Jan13
2019Jan14
2019Jan15
2019Jan16
2019Jan17
2019Jan18
2019Jan19
2019Jan20
2019Jan21
2019Jan22
2019Jan23
2019Jan24
2019Jan25
2019Jan26
2019Jan27
2019Jan28
2019Jan29
2019Jan30
2019Jan31
2019Feb01
2019Feb02

from datetime import date, timedelta
delta = timedelta(days=1)
start = date(2020, 1, 1)
end = date(2020, 9, 1)
loop_date = start
while loop_date <= end:
    print(loop_date)
    loop_date += delta

Using pendulum.period:

import pendulum


start = pendulum.from_format('2020-05-01', 'YYYY-MM-DD', formatter='alternative')
end = pendulum.from_format('2020-05-02', 'YYYY-MM-DD', formatter='alternative')


period = pendulum.period(start, end)


for dt in period:
    print(dt.to_date_string())

For those interested in a functional approach in Python:

from datetime import date, timedelta
from itertools import count, takewhile


for d in takewhile(lambda x: x <= date(2009, 6, 9), map(lambda x: date(2009, 5, 30) + timedelta(days=x), count())):
    print(d)

You can use Arrow:

This is an example from the docs, iterating over hours:
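The same functional idea can be split into named steps, which may be easier to read than a single expression. The sketch below is one way to do that; islice is added as an assumption to show taking a fixed number of days instead of stopping at a date.

```python
from datetime import date, timedelta
from itertools import count, islice, takewhile

start = date(2009, 5, 30)
end = date(2009, 6, 9)

# Infinite lazy stream of consecutive days beginning at start.
all_days = (start + timedelta(days=n) for n in count())

# takewhile stops the stream at the first date past end (inclusive range).
in_range = list(takewhile(lambda d: d <= end, all_days))

# islice takes a fixed number of days from a fresh stream.
first_three = list(islice((start + timedelta(days=n) for n in count()), 3))
```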

from datetime import datetime
from arrow import Arrow


>>> start = datetime(2013, 5, 5, 12, 30)
>>> end = datetime(2013, 5, 5, 17, 15)
>>> for r in Arrow.range('hour', start, end):
...     print(repr(r))
...
<Arrow [2013-05-05T12:30:00+00:00]>
<Arrow [2013-05-05T13:30:00+00:00]>
<Arrow [2013-05-05T14:30:00+00:00]>
<Arrow [2013-05-05T15:30:00+00:00]>
<Arrow [2013-05-05T16:30:00+00:00]>

To iterate over days, you can do this:

>>> start = Arrow(2013, 5, 5)
>>> end = Arrow(2013, 5, 5)
>>> for r in Arrow.range('day', start, end):
...     print(repr(r))

(I didn't check whether you can pass datetime.date objects, but Arrow objects are generally easier anyway.)

If you are going to use a dynamic timedelta, you can use one of the following:

1. With a while loop

from datetime import datetime, timedelta
from typing import Generator


def datetime_range(start: datetime, end: datetime, delta: timedelta) -> Generator[datetime, None, None]:
    while start <= end:
        yield start
        start += delta

2. With a for loop

from datetime import datetime, timedelta
from typing import Generator


def datetime_range(start: datetime, end: datetime, delta: timedelta) -> Generator[datetime, None, None]:
    delta_units = int((end - start) / delta)

    for _ in range(delta_units + 1):
        yield start
        start += delta

3. If you are using async/await

from typing import AsyncGenerator


async def datetime_range(start: datetime, end: datetime, delta: timedelta) -> AsyncGenerator[datetime, None]:
    delta_units = int((end - start) / delta)

    for _ in range(delta_units + 1):
        yield start
        start += delta

4. List comprehension

from typing import List


def datetime_range(start: datetime, end: datetime, delta: timedelta) -> List[datetime]:
    delta_units = int((end - start) / delta)
    return [start + (delta * index) for index in range(delta_units + 1)]

Solutions 1 and 2 can then be used like this:

start = datetime(2020, 10, 10, 10, 00)
end = datetime(2022, 10, 10, 18, 00)
delta = timedelta(minutes=30)


result = [time_part for time_part in datetime_range(start, end, delta)]
# or
for time_part in datetime_range(start, end, delta):
    print(time_part)

Solution 3 can be used as follows in an async context, because it returns an async generator object, which can only be consumed in an async context:

start = datetime(2020, 10, 10, 10, 00)
end = datetime(2022, 10, 10, 18, 00)
delta = timedelta(minutes=30)


result = [time_part async for time_part in datetime_range(start, end, delta)]

async for time_part in datetime_range(start, end, delta):
    print(time_part)

The benefit of these solutions is that they all use a dynamic timedelta. This is very useful when you do not know in advance which time delta you will get.
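Since "async for" only works inside a coroutine, the async variant needs an event loop to drive it. A minimal sketch, assuming a plain script with no loop already running: the wrapper name collect and the two-hour sample window are made up for this example.

```python
import asyncio
from datetime import datetime, timedelta
from typing import AsyncGenerator, List


async def datetime_range(start: datetime, end: datetime, delta: timedelta) -> AsyncGenerator[datetime, None]:
    # Same logic as solution 3 above, repeated so this sketch runs on its own.
    delta_units = int((end - start) / delta)
    for _ in range(delta_units + 1):
        yield start
        start += delta


async def collect() -> List[datetime]:
    # Hypothetical wrapper coroutine that consumes the async generator.
    start = datetime(2020, 10, 10, 10, 0)
    end = datetime(2020, 10, 10, 12, 0)
    return [t async for t in datetime_range(start, end, timedelta(minutes=30))]


# asyncio.run creates an event loop, runs the coroutine, and closes the loop.
result = asyncio.run(collect())
print(result)
```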