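Version number comparison in Python

I want to write a cmp-like function that compares two version numbers and returns -1, 0, or 1 based on how they compare:

  • Return -1 if version A is older than version B
  • Return 0 if versions A and B are equal
  • Return 1 if version A is newer than version B

Each subsection should be interpreted as a number, so 1.10 > 1.1.

The desired function outputs are: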

mycmp('1.0', '1') == 0
mycmp('1.0.0', '1') == 0
mycmp('1', '1.0.0.1') == -1
mycmp('12.10', '11.0.0.0.0') == 1
...

This is my implementation, which is open to improvement:

def mycmp(version1, version2):
    parts1 = [int(x) for x in version1.split('.')]
    parts2 = [int(x) for x in version2.split('.')]

    # fill up the shorter version with zeros ...
    lendiff = len(parts1) - len(parts2)
    if lendiff > 0:
        parts2.extend([0] * lendiff)
    elif lendiff < 0:
        parts1.extend([0] * (-lendiff))

    for i, p in enumerate(parts1):
        ret = cmp(p, parts2[i])
        if ret: return ret
    return 0

I'm using Python 2.4.5, by the way (that is what is installed at my workplace...).

Here is a small "test suite" you can use:

assert mycmp('1', '2') == -1
assert mycmp('2', '1') == 1
assert mycmp('1', '1') == 0
assert mycmp('1.0', '1') == 0
assert mycmp('1', '1.000') == 0
assert mycmp('12.01', '12.1') == 0
assert mycmp('13.0.1', '13.00.02') == -1
assert mycmp('1.1.1.1', '1.1.1.1') == 0
assert mycmp('1.1.1.2', '1.1.1.1') == 1
assert mycmp('1.1.3', '1.1.3.000') == 0
assert mycmp('3.1.1.0', '3.1.2.10') == -1
assert mycmp('1.1', '1.10') == -1

No need to iterate over the version tuples. The built-in comparison operator on lists and tuples already works exactly the way you want. You just need to zero-extend the version lists to the same length. With Python 2.6 or later you can use izip_longest to pad the sequences.

from itertools import izip_longest
def version_cmp(v1, v2):
    parts1, parts2 = [map(int, v.split('.')) for v in [v1, v2]]
    parts1, parts2 = zip(*izip_longest(parts1, parts2, fillvalue=0))
    return cmp(parts1, parts2)

With earlier Python versions, some map hackery is required.

def version_cmp(v1, v2):
    parts1, parts2 = [map(int, v.split('.')) for v in [v1, v2]]
    # Python 2's map() pads the shorter list with None, which `or 0` turns into 0
    parts1, parts2 = zip(*map(lambda p1,p2: (p1 or 0, p2 or 0), parts1, parts2))
    return cmp(parts1, parts2)
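
For readers on Python 3, where cmp and izip_longest no longer exist, the same padding idea might look like the sketch below. It assumes purely numeric, dot-separated versions; the name version_cmp3 is made up for illustration and is not part of the answer above.

```python
from itertools import zip_longest


def version_cmp3(v1, v2):
    """Return -1, 0 or 1; hypothetical Python 3 port of the padding idea."""
    parts1 = [int(x) for x in v1.split('.')]
    parts2 = [int(x) for x in v2.split('.')]
    # Pad the shorter list with zeros, then regroup into two equal-length tuples.
    padded1, padded2 = zip(*zip_longest(parts1, parts2, fillvalue=0))
    # Emulate Python 2's cmp() with boolean arithmetic.
    return (padded1 > padded2) - (padded1 < padded2)
```

Tuples of integers compare element by element, so no explicit loop is needed once both sides have the same length.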

This is a little more compact than your suggestion. Rather than filling the shorter version with zeros, I'm removing trailing zeros from the version lists after splitting.

def normalize_version(v):
    parts = [int(x) for x in v.split(".")]
    # `parts and` guards against versions such as "0", which strip down to an empty list
    while parts and parts[-1] == 0:
        parts.pop()
    return parts


def mycmp(v1, v2):
    return cmp(normalize_version(v1), normalize_version(v2))

Remove the uninteresting part of the string (trailing zeroes and dots), and then compare the lists of numbers.

import re


def mycmp(version1, version2):
    def normalize(v):
        return [int(x) for x in re.sub(r'(\.0+)*$','', v).split(".")]
    return cmp(normalize(version1), normalize(version2))

This is the same approach as Pär Wieslander's, but a bit more compact.

Here are some tests, thanks to "How to compare two strings in dot separated version format in Bash?":

assert mycmp("1", "1") == 0
assert mycmp("2.1", "2.2") < 0
assert mycmp("3.0.4.10", "3.0.4.2") > 0
assert mycmp("4.08", "4.08.01") < 0
assert mycmp("3.2.1.9.8144", "3.2") > 0
assert mycmp("3.2", "3.2.1.9.8144") < 0
assert mycmp("1.2", "2.1") < 0
assert mycmp("2.1", "1.2") > 0
assert mycmp("5.6.7", "5.6.7") == 0
assert mycmp("1.01.1", "1.1.1") == 0
assert mycmp("1.1.1", "1.01.1") == 0
assert mycmp("1", "1.0") == 0
assert mycmp("1.0", "1") == 0
assert mycmp("1.0", "1.0.1") < 0
assert mycmp("1.0.1", "1.0") > 0
assert mycmp("1.0.2.0", "1.0.2") == 0

Remove trailing .0 and .00 with a regex, split, and use the cmp function, which compares lists correctly:

import re

def mycmp(v1, v2):
    c1 = map(int, re.sub(r'(\.0+)+\Z', '', v1).split('.'))
    c2 = map(int, re.sub(r'(\.0+)+\Z', '', v2).split('.'))
    return cmp(c1, c2)

And, of course, you can convert it to a one-liner if you don't mind the long lines.

def compare_version(v1, v2):
    return cmp(*tuple(zip(*map(lambda x, y: (x or 0, y or 0),
               [int(x) for x in v1.split('.')], [int(y) for y in v2.split('.')]))))

It's a one-liner (split for legibility). Not sure about readable...

Is reuse considered elegance in this instance? :)

# pkg_resources is in setuptools
# See http://peak.telecommunity.com/DevCenter/PkgResources#parsing-utilities
def mycmp(a, b):
    from pkg_resources import parse_version as V
    return cmp(V(a),V(b))

The most difficult solution to read, but a one-liner nevertheless, and it uses iterators to be fast.

from itertools import imap

next((c for c in imap(lambda x,y:cmp(int(x or 0),int(y or 0)),
      v1.split('.'),v2.split('.')) if c), 0)

That is for Python 2.6 and 3.x, by the way; Python 2.5 and older need to catch StopIteration.
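
Note that itertools.imap stops at the shorter input, and that neither imap nor cmp exists in Python 3. A lazy variant for Python 3 could be sketched as below, assuming zip_longest with a '0' fill value to pad the shorter version; lazy_cmp is an illustrative name.

```python
from itertools import zip_longest


def lazy_cmp(v1, v2):
    """Compare component by component, stopping at the first difference."""
    pairs = zip_longest(v1.split('.'), v2.split('.'), fillvalue='0')
    # Generator of -1/0/1 per component; nothing is computed until next() asks.
    results = ((int(x) > int(y)) - (int(x) < int(y)) for x, y in pairs)
    # The first non-zero result wins; 0 is the default when all components match.
    return next((c for c in results if c), 0)
```

Because everything is a generator, components after the first difference are never converted to integers.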

My preferred solution:

Padding the string with extra zeroes and using just the first four fields is easy to understand, doesn't require any regex, and the lambda is more or less readable. I use two lines for readability; for me, elegance is short and simple.

def mycmp(version1,version2):
    tup = lambda x: [int(y) for y in (x+'.0.0.0.0').split('.')][:4]
    return cmp(tup(version1),tup(version2))
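
One caveat with this approach is that anything beyond the fourth component is ignored. A Python 3 sketch that makes the width an explicit parameter is shown below; the width argument and the name mycmp_padded are assumptions for illustration.

```python
def mycmp_padded(version1, version2, width=4):
    """Compare only the first `width` numeric components of each version."""
    def tup(v):
        # Append `width` zero components, then keep the first `width` fields.
        return [int(y) for y in (v + '.0' * width).split('.')][:width]
    a, b = tup(version1), tup(version2)
    return (a > b) - (a < b)


print(mycmp_padded('1.0.0.0.1', '1.0.0.0.2'))           # 0: the fifth field is cut off
print(mycmp_padded('1.0.0.0.1', '1.0.0.0.2', width=5))  # -1
```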

How about using Python's distutils.version.StrictVersion?

>>> from distutils.version import StrictVersion
>>> StrictVersion('10.4.10') > StrictVersion('10.4.9')
True

So for your cmp function:

>>> cmp = lambda x, y: StrictVersion(x).__cmp__(y)
>>> cmp("10.4.10", "10.4.11")
-1

If you want to compare more complex version numbers, distutils.version.LooseVersion will be more useful; however, be sure to only compare objects of the same type.

>>> from distutils.version import LooseVersion, StrictVersion
>>> LooseVersion('1.4c3') > LooseVersion('1.3')
True
>>> LooseVersion('1.4c3') > StrictVersion('1.3')  # different types
False

LooseVersion isn't the most intelligent tool, and can easily be tricked:

>>> LooseVersion('1.4') > LooseVersion('1.4-rc1')
False

To handle this kind of version string, you'll need to step outside the standard library and use setuptools' parsing utility, parse_version.

>>> from pkg_resources import parse_version
>>> parse_version('1.4') > parse_version('1.4-rc2')
True

So depending on your specific use case, you'll need to decide whether the built-in distutils tools are enough, or whether it's warranted to add setuptools as a dependency.

This is my solution (written in C, sorry). I hope you'll find it useful.

#include <ctype.h>   /* isdigit */
#include <stdlib.h>  /* strtol */

int compare_versions(const char *s1, const char *s2) {
    while(*s1 && *s2) {
        if(isdigit(*s1) && isdigit(*s2)) {
            /* compare as two decimal integers; the cast is needed because
               strtol takes a char ** end pointer */
            int s1_i = strtol(s1, (char **)&s1, 10);
            int s2_i = strtol(s2, (char **)&s2, 10);

            if(s1_i != s2_i) return s1_i - s2_i;
        } else {
            /* compare as two strings */
            while(*s1 && !isdigit(*s1) && *s2 == *s1) {
                s1++;
                s2++;
            }

            int s1_i = isdigit(*s1) ? 0 : *s1;
            int s2_i = isdigit(*s2) ? 0 : *s2;

            if(s1_i != s2_i) return s1_i - s2_i;
        }
    }

    return 0;
}

Lists are comparable in Python, so if someone converts the strings representing the numbers into integers, the basic Python comparison can be used with success.

I needed to extend this approach a bit because I use Python 3.x, where the cmp function no longer exists. I had to emulate cmp(a, b) with (a > b) - (a < b). Also, version numbers are not that clean at all and can contain all kinds of other alphanumeric characters. There are cases where the function can't determine the order, so it returns False (see the first example).

So I'm posting this even if the question is old and answered already, because it may save a few minutes in someone's life.

import re


def _preprocess(v, separator, ignorecase):
    if ignorecase: v = v.lower()
    return [int(x) if x.isdigit() else [int(y) if y.isdigit() else y for y in re.findall(r"\d+|[a-zA-Z]+", x)] for x in v.split(separator)]


def compare(a, b, separator = '.', ignorecase = True):
    a = _preprocess(a, separator, ignorecase)
    b = _preprocess(b, separator, ignorecase)
    try:
        return (a > b) - (a < b)
    except TypeError:
        # components of different types (int, str, list) cannot be ordered
        return False


print(compare('1.0', 'beta13'))
print(compare('1.1.2', '1.1.2'))
print(compare('1.2.2', '1.1.2'))
print(compare('1.1.beta1', '1.1.beta2'))

In case you don't want to pull in an external dependency, here is my attempt, written for Python 3.x.

rc and rel (and possibly one could add c) are treated as "release candidate" markers that divide the version number into two parts; if the marker is missing, the second part gets a high value (999). Other letters produce a split and are handled as sub-numbers via base-36 encoding.

import re
from itertools import chain
def compare_version(version1,version2):
    '''compares two version numbers
>>> compare_version('1', '2') < 0
True
>>> compare_version('2', '1') > 0
True
>>> compare_version('1', '1') == 0
True
>>> compare_version('1.0', '1') == 0
True
>>> compare_version('1', '1.000') == 0
True
>>> compare_version('12.01', '12.1') == 0
True
>>> compare_version('13.0.1', '13.00.02') <0
True
>>> compare_version('1.1.1.1', '1.1.1.1') == 0
True
>>> compare_version('1.1.1.2', '1.1.1.1') >0
True
>>> compare_version('1.1.3', '1.1.3.000') == 0
True
>>> compare_version('3.1.1.0', '3.1.2.10') <0
True
>>> compare_version('1.1', '1.10') <0
True
>>> compare_version('1.1.2','1.1.2') == 0
True
>>> compare_version('1.1.2','1.1.1') > 0
True
>>> compare_version('1.2','1.1.1') > 0
True
>>> compare_version('1.1.1-rc2','1.1.1-rc1') > 0
True
>>> compare_version('1.1.1a-rc2','1.1.1a-rc1') > 0
True
>>> compare_version('1.1.10-rc1','1.1.1a-rc2') > 0
True
>>> compare_version('1.1.1a-rc2','1.1.2-rc1') < 0
True
>>> compare_version('1.11','1.10.9') > 0
True
>>> compare_version('1.4','1.4-rc1') > 0
True
>>> compare_version('1.4c3','1.3') > 0
True
>>> compare_version('2.8.7rel.2','2.8.7rel.1') > 0
True
>>> compare_version('2.8.7.1rel.2','2.8.7rel.1') > 0
True


'''
    chn = lambda x:chain.from_iterable(x)
    def split_chrs(strings,chars):
        for ch in chars:
            strings = chn( [e.split(ch) for e in strings] )
        return strings
    split_digit_char=lambda x:[s for s in re.split(r'([a-zA-Z]+)',x) if len(s)>0]
    splt = lambda x:[split_digit_char(y) for y in split_chrs([x],'.-_')]
    def pad(c1,c2,f='0'):
        while len(c1) > len(c2): c2+=[f]
        while len(c2) > len(c1): c1+=[f]
    def base_code(ints,base):
        res=0
        for i in ints:
            res=base*res+i
        return res
    ABS = lambda lst: [abs(x) for x in lst]
    def cmp(v1,v2):
        c1 = splt(v1)
        c2 = splt(v2)
        pad(c1,c2,['0'])
        for i in range(len(c1)): pad(c1[i],c2[i])
        cc1 = [int(c,36) for c in chn(c1)]
        cc2 = [int(c,36) for c in chn(c2)]
        maxint = max(ABS(cc1+cc2))+1
        return base_code(cc1,maxint) - base_code(cc2,maxint)
    v_main_1, v_sub_1 = version1,'999'
    v_main_2, v_sub_2 = version2,'999'
    try:
        v_main_1, v_sub_1 = tuple(re.split('rel|rc',version1))
    except ValueError:
        # no rc/rel marker: keep the defaults set above
        pass
    try:
        v_main_2, v_sub_2 = tuple(re.split('rel|rc',version2))
    except ValueError:
        pass
    cmp_res=[cmp(v_main_1,v_main_2),cmp(v_sub_1,v_sub_2)]
    res = base_code(cmp_res,max(ABS(cmp_res))+1)
    return res




import random
from functools import cmp_to_key
# `versions` is assumed to be a list of version strings defined elsewhere
random.shuffle(versions)
versions.sort(key=cmp_to_key(compare_version))
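
To make the sorting snippet above concrete, here is a small self-contained sketch. The versions list and the simple numeric comparator numeric_cmp are made up for illustration and stand in for compare_version.

```python
from functools import cmp_to_key


def numeric_cmp(v1, v2):
    """Illustrative comparator for purely numeric dotted versions."""
    a = [int(x) for x in v1.split('.')]
    b = [int(x) for x in v2.split('.')]
    return (a > b) - (a < b)


# Hypothetical sample data.
versions = ['1.10', '1.2', '1.1.1', '2.0', '1.9']
versions.sort(key=cmp_to_key(numeric_cmp))
print(versions)  # ['1.1.1', '1.2', '1.9', '1.10', '2.0']
```

cmp_to_key wraps any function returning a negative, zero, or positive number so that it can be used where Python 3 expects a key function.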

from distutils.version import StrictVersion
def version_compare(v1, v2, op=None):
    _map = {
        '<': [-1],
        'lt': [-1],
        '<=': [-1, 0],
        'le': [-1, 0],
        '>': [1],
        'gt': [1],
        '>=': [1, 0],
        'ge': [1, 0],
        '==': [0],
        'eq': [0],
        '!=': [-1, 1],
        'ne': [-1, 1],
        '<>': [-1, 1]
    }
    v1 = StrictVersion(v1)
    v2 = StrictVersion(v2)
    result = cmp(v1, v2)
    if op:
        assert op in _map.keys()
        return result in _map[op]
    return result

This implements PHP's version_compare, except for the "=" operator, because it's ambiguous.
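
On Python 3, where cmp is gone (and distutils was removed in Python 3.12), the same operator-dispatch idea can be sketched with the operator module. This version assumes purely numeric dotted versions and does not pad or strip trailing zeros; version_compare3 and _OPS are illustrative names, not part of any library.

```python
import operator

# Map PHP-style operator names to comparison functions.
_OPS = {
    '<': operator.lt, 'lt': operator.lt,
    '<=': operator.le, 'le': operator.le,
    '>': operator.gt, 'gt': operator.gt,
    '>=': operator.ge, 'ge': operator.ge,
    '==': operator.eq, 'eq': operator.eq,
    '!=': operator.ne, 'ne': operator.ne, '<>': operator.ne,
}


def version_compare3(v1, v2, op=None):
    """Return -1/0/1 without op, or a bool when op names a comparison."""
    a = tuple(int(x) for x in v1.split('.'))
    b = tuple(int(x) for x in v2.split('.'))
    if op is None:
        return (a > b) - (a < b)
    if op not in _OPS:
        raise ValueError('unknown operator: %r' % (op,))
    return _OPS[op](a, b)
```

Dispatching to functions avoids the intermediate -1/0/1 lookup lists, at the cost of a slightly longer table.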

Another solution:

def mycmp(v1, v2):
    import itertools as it
    f = lambda v: list(it.dropwhile(lambda x: x == 0, map(int, v.split('.'))[::-1]))[::-1]
    return cmp(f(v1), f(v2))

It can also be used like this:

import itertools as it
f = lambda v: list(it.dropwhile(lambda x: x == 0, map(int, v.split('.'))[::-1]))[::-1]
f(v1) <  f(v2)
f(v1) == f(v2)
f(v1) >  f(v2)

I did this in order to be able to parse and compare Debian package version strings. Please note that it does not strictly validate characters.

This might be helpful as well:

#!/usr/bin/env python


# Read <https://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Version> for further information.


class CommonVersion(object):
    def __init__(self, version_string):
        self.version_string = version_string
        self.tags = []
        self.parse()

    def parse(self):
        parts = self.version_string.split('~')
        self.version_string = parts[0]
        if len(parts) > 1:
            self.tags = parts[1:]

    def __lt__(self, other):
        if self.version_string != other.version_string:
            return self.version_string < other.version_string
        # same base version: a version with extra '~' tags sorts first
        for index, tag in enumerate(self.tags):
            if index >= len(other.tags):
                return True
            if tag != other.tags[index]:
                return tag < other.tags[index]
        return False

    @classmethod
    def create(cls, version_string):
        # build an instance of whichever subclass create() was called on
        return cls(version_string)


class UpstreamVersion(CommonVersion):
    pass


class DebianMaintainerVersion(CommonVersion):
    pass


class CompoundDebianVersion(object):
    def __init__(self, epoch, upstream_version, debian_version):
        self.epoch = epoch
        self.upstream_version = UpstreamVersion.create(upstream_version)
        self.debian_version = DebianMaintainerVersion.create(debian_version)

    @staticmethod
    def create(version_string):
        version_string = version_string.strip()
        epoch = 0
        upstream_version = None
        debian_version = '0'

        epoch_check = version_string.split(':')
        # only treat the leading number as an epoch if a ':' is present
        if len(epoch_check) > 1 and epoch_check[0].isdigit():
            epoch = int(epoch_check[0])
            version_string = ':'.join(epoch_check[1:])
        debian_version_check = version_string.split('-')
        if len(debian_version_check) > 1:
            debian_version = debian_version_check[-1]
            version_string = '-'.join(debian_version_check[0:-1])

        upstream_version = version_string

        return CompoundDebianVersion(epoch, upstream_version, debian_version)

    def __repr__(self):
        return '{} {}'.format(self.__class__.__name__, vars(self))

    def __lt__(self, other):
        # compare epoch first, then upstream version, then Debian revision
        if self.epoch != other.epoch:
            return self.epoch < other.epoch
        if self.upstream_version < other.upstream_version:
            return True
        if other.upstream_version < self.upstream_version:
            return False
        return self.debian_version < other.debian_version




if __name__ == '__main__':
    def lt(a, b):
        assert(CompoundDebianVersion.create(a) < CompoundDebianVersion.create(b))

    # test epoch
    lt('1:44.5.6', '2:44.5.6')
    lt('1:44.5.6', '1:44.5.7')
    lt('1:44.5.6', '1:44.5.7')
    lt('1:44.5.6', '2:44.5.6')
    lt('  44.5.6', '1:44.5.6')

    # test upstream version (plus tags)
    lt('1.2.3~rc7',          '1.2.3')
    lt('1.2.3~rc1',          '1.2.3~rc2')
    lt('1.2.3~rc1~nightly1', '1.2.3~rc1')
    lt('1.2.3~rc1~nightly2', '1.2.3~rc1')
    lt('1.2.3~rc1~nightly1', '1.2.3~rc1~nightly2')
    lt('1.2.3~rc1~nightly1', '1.2.3~rc2~nightly1')

    # test debian maintainer version
    lt('44.5.6-lts1', '44.5.6-lts12')
    lt('44.5.6-lts1', '44.5.7-lts1')
    lt('44.5.6-lts1', '44.5.7-lts2')
    lt('44.5.6-lts1', '44.5.6-lts2')
    lt('44.5.6-lts1', '44.5.6-lts2')
    lt('44.5.6',      '44.5.6-lts1')

I'm using this one in my project:

cmp(v1.split("."), v2.split(".")) >= 0
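
Be aware that this compares the components as strings, so '10' sorts before '9'. A quick Python 3 check of the difference is sketched below; the helper name as_ints is illustrative.

```python
def as_ints(v):
    """Turn '1.10' into [1, 10] so components compare numerically."""
    return [int(x) for x in v.split('.')]


# String components compare character by character: '1' < '9', so '10' < '9'.
print('1.10'.split('.') > '1.9'.split('.'))   # False
# Integer components compare by value: 10 > 9.
print(as_ints('1.10') > as_ints('1.9'))       # True
```

If every component is a single digit the two approaches agree, which may be why the string form appears to work in practice.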

Years later, but this question is still at the top.

Here is my version sort function. It splits a version into numeric and non-numeric sections. Numbers are compared as int and the rest as str (as parts of list items).

import re

def sort_version_2(data):
    def key(n):
        a = re.split(r'(\d+)', n)
        a[1::2] = map(int, a[1::2])
        return a
    return sorted(data, key=key)

You can use the key function as a kind of custom Version type with comparison operators. If you really want to use cmp, you can do it as in this example: https://stackoverflow.com/a/22490617/9935708

def Version(s):
    s = re.sub(r'(\.0*)*$', '', s)  # to avoid ".0" at end
    a = re.split(r'(\d+)', s)
    a[1::2] = map(int, a[1::2])
    return a


def mycmp(a, b):
    a, b = Version(a), Version(b)
    return (a > b) - (a < b)  # DSM's answer


Test suite passes.