这是我能想到的最好的算法。
def get_primes(n):
numbers = set(range(n, 1, -1))
primes = []
while numbers:
p = numbers.pop()
primes.append(p)
numbers.difference_update(set(range(p*2, n+1, p)))
return primes
>>> timeit.Timer(stmt='get_primes.get_primes(1000000)', setup='import get_primes').timeit(1)
1.1499958793645562
还能做得更快吗?
这段代码有一个缺陷:由于numbers是一个无序集,不能保证numbers.pop()将从集合中移除最低的数字。尽管如此,它还是适用于(至少对我来说)一些输入数字:
>>> sum(get_primes(2000000))
142913828922L
#That's the correct sum of all numbers below 2 million
>>> 529 in get_primes(1000)
False
>>> 529 in get_primes(530)
True
下面是Eratosthenes的一个numpy版本,具有良好的复杂度(低于排序长度为n的数组)和向量化。与@unutbu相比,用46微秒就可以找到100万以下的所有质数。
import numpy as np
def generate_primes(n):
is_prime = np.ones(n+1,dtype=bool)
is_prime[0:2] = False
for i in range(int(n**0.5)+1):
if is_prime[i]:
is_prime[i**2::i]=False
return np.where(is_prime)[0]
计时:
import time
for i in range(2,10):
timer =time.time()
generate_primes(10**i)
print('n = 10^',i,' time =', round(time.time()-timer,6))
>> n = 10^ 2 time = 5.6e-05
>> n = 10^ 3 time = 6.4e-05
>> n = 10^ 4 time = 0.000114
>> n = 10^ 5 time = 0.000593
>> n = 10^ 6 time = 0.00467
>> n = 10^ 7 time = 0.177758
>> n = 10^ 8 time = 1.701312
>> n = 10^ 9 time = 19.322478
你有一个更快的代码和最简单的代码生成质数。
但对于更大的数字,当n=10000, 10000000时,它不起作用,可能是。pop()方法失败了
考虑:N是质数吗?
case 1:
You got some factors of N,
for i in range(2, N):
If N is prime loop is performed for ~(N-2) times. else less number of times
case 2:
for i in range(2, int(math.sqrt(N)):
Loop is performed for almost ~(sqrt(N)-2) times if N is prime else will break somewhere
case 3:
Better We Divide N With Only number of primes<=sqrt(N)
Where loop is performed for only π(sqrt(N)) times
π(sqrt(N)) << sqrt(N) as N increases
from math import sqrt
from time import *
prime_list = [2]
n = int(input())
s = time()
for n0 in range(2,n+1):
for i0 in prime_list:
if n0%i0==0:
break
elif i0>=int(sqrt(n0)):
prime_list.append(n0)
break
e = time()
print(e-s)
#print(prime_list); print(f'pi({n})={len(prime_list)}')
print(f'{n}: {len(prime_list)}, time: {e-s}')
Output
100: 25, time: 0.00010275840759277344
1000: 168, time: 0.0008606910705566406
10000: 1229, time: 0.015588521957397461
100000: 9592, time: 0.023436546325683594
1000000: 78498, time: 4.1965954303741455
10000000: 664579, time: 109.24591708183289
100000000: 5761455, time: 2289.130858898163
小于1000似乎很慢,但小于10^6我认为更快。
然而,我无法理解时间的复杂性。
这里是最快的函数之一的两个更新版本(纯Python 3.6),
from itertools import compress
def rwh_primes1v1(n):
""" Returns a list of primes < n for n > 2 """
sieve = bytearray([True]) * (n//2)
for i in range(3,int(n**0.5)+1,2):
if sieve[i//2]:
sieve[i*i//2::i] = bytearray((n-i*i-1)//(2*i)+1)
return [2,*compress(range(3,n,2), sieve[1:])]
def rwh_primes1v2(n):
""" Returns a list of primes < n for n > 2 """
sieve = bytearray([True]) * (n//2+1)
for i in range(1,int(n**0.5)//2+1):
if sieve[i]:
sieve[2*i*(i+1)::2*i+1] = bytearray((n//2-2*i*(i+1))//(2*i+1)+1)
return [2,*compress(range(3,n,2), sieve[1:])]
对于最快的代码,numpy解决方案是最好的。不过,出于纯粹的学术原因,我发布了我的纯python版本,它比上面发布的食谱版本快不到50%。由于我将整个列表放在内存中,所以需要足够的空间来容纳所有内容,但它的可伸缩性似乎相当好。
def daniel_sieve_2(maxNumber):
"""
Given a number, returns all numbers less than or equal to
that number which are prime.
"""
allNumbers = range(3, maxNumber+1, 2)
for mIndex, number in enumerate(xrange(3, maxNumber+1, 2)):
if allNumbers[mIndex] == 0:
continue
# now set all multiples to 0
for index in xrange(mIndex+number, (maxNumber-3)/2+1, number):
allNumbers[index] = 0
return [2] + filter(lambda n: n!=0, allNumbers)
结果是:
>>>mine = timeit.Timer("daniel_sieve_2(1000000)",
... "from sieves import daniel_sieve_2")
>>>prev = timeit.Timer("get_primes_erat(1000000)",
... "from sieves import get_primes_erat")
>>>print "Mine: {0:0.4f} ms".format(min(mine.repeat(3, 1))*1000)
Mine: 428.9446 ms
>>>print "Previous Best {0:0.4f} ms".format(min(prev.repeat(3, 1))*1000)
Previous Best 621.3581 ms