有效的日期范围重叠计算? ?

我有两个日期范围,其中每个范围由开始和结束日期(显然是 datetime.date实例)决定。这两个范围可以重叠,也可以不重叠。我需要重叠的天数。当然,我可以在两个范围内预先填充所有日期的两个集合,并执行一个集合的交集,但这可能是低效的... 是否有一个更好的方法,除了另一个解决方案使用一个长的 if-elif部分涵盖所有情况?

57173 次浏览
  • Determine the latest of the two start dates and the earliest of the two end dates.
  • Compute the timedelta by subtracting them.
  • If the delta is positive, that is the number of days of overlap.

Here is an example calculation:

>>> from datetime import datetime
>>> from collections import namedtuple
>>> Range = namedtuple('Range', ['start', 'end'])


>>> r1 = Range(start=datetime(2012, 1, 15), end=datetime(2012, 5, 10))
>>> r2 = Range(start=datetime(2012, 3, 20), end=datetime(2012, 9, 15))
>>> latest_start = max(r1.start, r2.start)
>>> earliest_end = min(r1.end, r2.end)
>>> delta = (earliest_end - latest_start).days + 1
>>> overlap = max(0, delta)
>>> overlap
52

Pseudocode:

 1 + max( -1, min( a.dateEnd, b.dateEnd) - max( a.dateStart, b.dateStart) )

Function calls are more expensive than arithmetic operations.

The fastest way of doing this involves 2 subtractions and 1 min():

min(r1.end - r2.start, r2.end - r1.start).days + 1

compared with the next best which needs 1 subtraction, 1 min() and a max():

(min(r1.end, r2.end) - max(r1.start, r2.start)).days + 1

Of course with both expressions you still need to check for a positive overlap.

def get_overlap(r1,r2):
latest_start=max(r1[0],r2[0])
earliest_end=min(r1[1],r2[1])
delta=(earliest_end-latest_start).days
if delta>0:
return delta+1
else:
return 0

I implemented a TimeRange class as you can see below.

The get_overlapped_range first negates all the non overlapped options by a simple condition, and then calculate the overlapped range by considering all the possible options.

To get the amount of days you'll need to take the TimeRange value that was returned from get_overlapped_range and divide the duration by 60*60*24.

class TimeRange(object):
def __init__(self, start, end):
self.start = start
self.end = end
self.duration = self.end - self.start


def is_overlapped(self, time_range):
if max(self.start, time_range.start) < min(self.end, time_range.end):
return True
else:
return False


def get_overlapped_range(self, time_range):
if not self.is_overlapped(time_range):
return


if time_range.start >= self.start:
if self.end >= time_range.end:
return TimeRange(time_range.start, time_range.end)
else:
return TimeRange(time_range.start, self.end)
elif time_range.start < self.start:
if time_range.end >= self.end:
return TimeRange(self.start, self.end)
else:
return TimeRange(self.start, time_range.end)


def __repr__(self):
return '{0} ------> {1}'.format(*[time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(d))
for d in [self.start, self.end]])

Ok my solution is a bit wonky because my df uses all series - but lets say you have the following columns, 2 of which are fixed which is your "Fiscal Year". PoP is "Period of performance" which is your variable data:

df['PoP_Start']
df['PoP_End']
df['FY19_Start'] = '10/1/2018'
df['FY19_End'] = '09/30/2019'

Assume all of the data is in datetime format ie -

df['FY19_Start'] = pd.to_datetime(df['FY19_Start'])
df['FY19_End'] = pd.to_datetime(df['FY19_End'])

Try the following equations to find the number of days overlap:

min1 = np.minimum(df['POP_End'], df['FY19_End'])
max2 = np.maximum(df['POP_Start'], df['FY19_Start'])


df['Overlap_2019'] = (min1 - max2) / np.timedelta64(1, 'D')
df['Overlap_2019'] = np.maximum(df['Overlap_2019']+1,0)

You can use the datetimerange package: https://pypi.org/project/DateTimeRange/

from datetimerange import DateTimeRange
time_range1 = DateTimeRange("2015-01-01T00:00:00+0900", "2015-01-04T00:20:00+0900")
time_range2 = DateTimeRange("2015-01-01T00:00:10+0900", "2015-01-04T00:20:00+0900")
tem3 = time_range1.intersection(time_range2)
if tem3.NOT_A_TIME_STR == 'NaT':  # No overlap
S_Time = 0
else: # Output the overlap seconds
S_Time = tem3.timedelta.total_seconds()

"2015-01-01T00:00:00+0900" inside the DateTimeRange() can also be datetime format, like Timestamp('2017-08-30 20:36:25').

Another solution would be sorting a source array by ascending first and then looping through and comparing the dates like so:

date_ranges = sorted(
date_ranges,
key=lambda item: item['start_date'],
)
for i in range(len(date_ranges)-1):
if date_ranges[i]['end_date'] > date_ranges[i+1]['start_date']:
raise Exception('Overlap'})

Building on the solution of @Raymond Hettinger, since python 3.6 you can now use NamedTuple from the typing module.

from datetime import datetime
from typing import NamedTuple


class Range(NamedTuple):
start: datetime
end: datetime
>>> r1 = Range(start=datetime(2012, 1, 15), end=datetime(2012, 5, 10))
>>> r2 = Range(start=datetime(2012, 3, 20), end=datetime(2012, 9, 15))
>>> latest_start = max(r1.start, r2.start)
>>> earliest_end = min(r1.end, r2.end)
>>> delta = (earliest_end - latest_start).days + 1
>>> overlap = max(0, delta)
>>> overlap
52

What I have used in one of my apps, is to create a list of the date differences and querying that aginst the date difference in the date range proposed for save.

If the list of date difference values (in days) of all old date ranges is:

dateDiffOld = [2920753, 2920746, 2920698, 2920387, 2920360, 2920296]

and that for the new date range is:

dateDiffNew = 2920360

Then using:

if dateDiffNew in dateDiffOld:
# do something

My quest for a possible solution with a similar requirement took me to various answers provided on SO, but somehow (for my use case) I have found this to be working (so far, with a multitude of records). Have not had the opportunity to use it elsewhere.

Note: The instant application I am describing is created using Django.

Note2: Members are kindly requested to advise on any possible pitfalls (I have not come across so far) of this method.