快速整数分解模块

我正在寻找一个 实施清晰的算法获得主要因素的 N在任何 Python,伪代码或任何其他可读性。有一些要求/限制:

  • N 在1到20位之间
  • 没有预先计算的查找表,但是可以使用制表
  • 不需要数学证明(例如,如果需要可以依赖哥德巴赫猜想)
  • 不需要精确,如果需要,允许是概率的/确定的

我需要一个快速的整数分解算法,不仅仅是为了它自己,也是为了其他许多算法的使用,比如计算欧拉 Phi (n)

我已经尝试了维基百科上的其他算法,但是要么我不能理解它们(ECM) ,要么我不能用这个算法创建一个有效的实现(Pollard-Brent)。

我对 Pollard-Brent 算法真的很感兴趣,所以如果有更多关于它的信息/实现就好了。

谢谢!

剪辑

在稍微调整了一下之后,我创建了一个相当快的素数/因数分解模块。它结合了优化的试算除法,波拉德-布伦特算法,米勒-拉宾素数测试和我在互联网上找到的最快的素数。GCD 是欧几里得常规的 GCD 实现(二进制欧几里得的 GCD 比常规的慢 很多)。

赏金

哦,欢乐,赏金是可以获得的! 但是我怎样才能赢得它呢?

  • 在我的模块中找到一个优化或错误。
  • 提供替代/更好的算法/实现。

最完整/最有建设性的答案获得奖金。

最后是模块本身:

import random


def primesbelow(N):
# http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
#""" Input N>=6, Returns a list of primes, 2 <= p < N """
correction = N % 6 > 1
N = {0:N, 1:N-1, 2:N+4, 3:N+3, 4:N+2, 5:N+1}[N%6]
sieve = [True] * (N // 3)
sieve[0] = False
for i in range(int(N ** .5) // 3 + 1):
if sieve[i]:
k = (3 * i + 1) | 1
sieve[k*k // 3::2*k] = [False] * ((N//6 - (k*k)//6 - 1)//k + 1)
sieve[(k*k + 4*k - 2*k*(i%2)) // 3::2*k] = [False] * ((N // 6 - (k*k + 4*k - 2*k*(i%2))//6 - 1) // k + 1)
return [2, 3] + [(3 * i + 1) | 1 for i in range(1, N//3 - correction) if sieve[i]]


smallprimeset = set(primesbelow(100000))
_smallprimeset = 100000
def isprime(n, precision=7):
# http://en.wikipedia.org/wiki/Miller-Rabin_primality_test#Algorithm_and_running_time
if n < 1:
raise ValueError("Out of bounds, first argument must be > 0")
elif n <= 3:
return n >= 2
elif n % 2 == 0:
return False
elif n < _smallprimeset:
return n in smallprimeset




d = n - 1
s = 0
while d % 2 == 0:
d //= 2
s += 1


for repeat in range(precision):
a = random.randrange(2, n - 2)
x = pow(a, d, n)
    

if x == 1 or x == n - 1: continue
    

for r in range(s - 1):
x = pow(x, 2, n)
if x == 1: return False
if x == n - 1: break
else: return False


return True


# https://comeoncodeon.wordpress.com/2010/09/18/pollard-rho-brent-integer-factorization/
def pollard_brent(n):
if n % 2 == 0: return 2
if n % 3 == 0: return 3


y, c, m = random.randint(1, n-1), random.randint(1, n-1), random.randint(1, n-1)
g, r, q = 1, 1, 1
while g == 1:
x = y
for i in range(r):
y = (pow(y, 2, n) + c) % n


k = 0
while k < r and g==1:
ys = y
for i in range(min(m, r-k)):
y = (pow(y, 2, n) + c) % n
q = q * abs(x-y) % n
g = gcd(q, n)
k += m
r *= 2
if g == n:
while True:
ys = (pow(ys, 2, n) + c) % n
g = gcd(abs(x - ys), n)
if g > 1:
break


return g


smallprimes = primesbelow(1000) # might seem low, but 1000*1000 = 1000000, so this will fully factor every composite < 1000000
def primefactors(n, sort=False):
factors = []


for checker in smallprimes:
while n % checker == 0:
factors.append(checker)
n //= checker
if checker > n: break


if n < 2: return factors


while n > 1:
if isprime(n):
factors.append(n)
break
factor = pollard_brent(n) # trial division did not fully factor, switch to pollard-brent
factors.extend(primefactors(factor)) # recurse to factor the not necessarily prime factor returned by pollard-brent
n //= factor


if sort: factors.sort()


return factors


def factorization(n):
factors = {}
for p1 in primefactors(n):
try:
factors[p1] += 1
except KeyError:
factors[p1] = 1
return factors


totients = {}
def totient(n):
if n == 0: return 1


try: return totients[n]
except KeyError: pass


tot = 1
for p, exp in factorization(n).items():
tot *= (p - 1)  *  p ** (exp - 1)


totients[n] = tot
return tot


def gcd(a, b):
if a == b: return a
while b > 0: a, b = b, a % b
return a


def lcm(a, b):
return abs((a // gcd(a, b)) * b)
70223 次浏览

You should probably do some prime detection which you could look here, Fast algorithm for finding prime numbers?

You should read that entire blog though, there is a few algorithms that he lists for testing primality.

Even on the current one, there are several spots to be noticed.

  1. Don't do checker*checker every loop, use s=ceil(sqrt(num)) and checher < s
  2. checher should plus 2 each time, ignore all even numbers except 2
  3. Use divmod instead of % and //

There is no need to calculate smallprimes using primesbelow, use smallprimeset for that.

smallprimes = (2,) + tuple(n for n in xrange(3,1000,2) if n in smallprimeset)

Divide your primefactors into two functions for handling smallprimes and other for pollard_brent, this can save a couple of iterations as all the powers of smallprimes will be divided from n.

def primefactors(n, sort=False):
factors = []


limit = int(n ** .5) + 1
for checker in smallprimes:
print smallprimes[-1]
if checker > limit: break
while n % checker == 0:
factors.append(checker)
n //= checker




if n < 2: return factors
else :
factors.extend(bigfactors(n,sort))
return factors


def bigfactors(n, sort = False):
factors = []
while n > 1:
if isprime(n):
factors.append(n)
break
factor = pollard_brent(n)
factors.extend(bigfactors(factor,sort)) # recurse to factor the not necessarily prime factor returned by pollard-brent
n //= factor


if sort: factors.sort()
return factors

By considering verified results of Pomerance, Selfridge and Wagstaff and Jaeschke, you can reduce the repetitions in isprime which uses Miller-Rabin primality test. From Wiki.

  • if n < 1,373,653, it is enough to test a = 2 and 3;
  • if n < 9,080,191, it is enough to test a = 31 and 73;
  • if n < 4,759,123,141, it is enough to test a = 2, 7, and 61;
  • if n < 2,152,302,898,747, it is enough to test a = 2, 3, 5, 7, and 11;
  • if n < 3,474,749,660,383, it is enough to test a = 2, 3, 5, 7, 11, and 13;
  • if n < 341,550,071,728,321, it is enough to test a = 2, 3, 5, 7, 11, 13, and 17.

Edit 1: Corrected return call of if-else to append bigfactors to factors in primefactors.

There's a python library with a collection of primality tests (including incorrect ones for what not to do). It's called pyprimes. Figured it's worth mentioning for posterity's purpose. I don't think it includes the algorithms you mentioned.

I just ran into a bug in this code when factoring the number 2**1427 * 31.

  File "buckets.py", line 48, in prettyprime
factors = primefactors.primefactors(n, sort=True)
File "/private/tmp/primefactors.py", line 83, in primefactors
limit = int(n ** .5) + 1
OverflowError: long int too large to convert to float

This code snippet:

limit = int(n ** .5) + 1
for checker in smallprimes:
if checker > limit: break
while n % checker == 0:
factors.append(checker)
n //= checker
limit = int(n ** .5) + 1
if checker > limit: break

should be rewritten into

for checker in smallprimes:
while n % checker == 0:
factors.append(checker)
n //= checker
if checker > n: break

which will likely perform faster on realistic inputs anyway. Square root is slow — basically the equivalent of many multiplications —, smallprimes only has a few dozen members, and this way we remove the computation of n ** .5 from the tight inner loop, which is certainly helpful when factoring numbers like 2**1427. There's simply no reason to compute sqrt(2**1427), sqrt(2**1426), sqrt(2**1425), etc. etc., when all we care about is "does the [square of the] checker exceed n".

The rewritten code doesn't throw exceptions when presented with big numbers, and is roughly twice as fast according to timeit (on sample inputs 2 and 2**718 * 31).

Also notice that isprime(2) returns the wrong result, but this is okay as long as we don't rely on it. IMHO you should rewrite the intro of that function as

if n <= 3:
return n >= 2
...

You could factorize up to a limit then use brent to get higher factors

from fractions import gcd
from random import randint


def brent(N):
if N%2==0: return 2
y,c,m = randint(1, N-1),randint(1, N-1),randint(1, N-1)
g,r,q = 1,1,1
while g==1:
x = y
for i in range(r):
y = ((y*y)%N+c)%N
k = 0
while (k<r and g==1):
ys = y
for i in range(min(m,r-k)):
y = ((y*y)%N+c)%N
q = q*(abs(x-y))%N
g = gcd(q,N)
k = k + m
r = r*2
if g==N:
while True:
ys = ((ys*ys)%N+c)%N
g = gcd(abs(x-ys),N)
if g>1:  break
return g


def factorize(n1):
if n1==0: return []
if n1==1: return [1]
n=n1
b=[]
p=0
mx=1000000
while n % 2 ==0 : b.append(2);n//=2
while n % 3 ==0 : b.append(3);n//=3
i=5
inc=2
while i <=mx:
while n % i ==0 : b.append(i); n//=i
i+=inc
inc=6-inc
while n>mx:
p1=n
while p1!=p:
p=p1
p1=brent(p)
b.append(p1);n//=p1
if n!=1:b.append(n)
return sorted(b)


from functools import reduce
#n= 2**1427 * 31 #
n= 67898771  * 492574361 * 10000223 *305175781* 722222227*880949 *908909
li=factorize(n)
print (li)
print (n - reduce(lambda x,y :x*y ,li))

If you don't want to reinvent the wheel, use the library sympy

pip install sympy

Use the function sympy.ntheory.factorint

Given a positive integer n, factorint(n) returns a dict containing the prime factors of n as keys and their respective multiplicities as values. For example:

Example:

>>> from sympy.ntheory import factorint
>>> factorint(10**20+1)
{73: 1, 5964848081: 1, 1676321: 1, 137: 1}

You can factor some very large numbers:

>>> factorint(10**100+1)
{401: 1, 5964848081: 1, 1676321: 1, 1601: 1, 1201: 1, 137: 1, 73: 1, 129694419029057750551385771184564274499075700947656757821537291527196801: 1}