如何将任意长度的列表拆分为大小相等的块?


请参阅如果数据结果将直接用于循环,并且不需要存储,则如何以块形式遍历列表。

对于字符串输入的同一问题,请参见每n个字符拆分字符串?。相同的技术通常适用,但也有一些变化。


当前回答

toolz库具有如下分区函数:

from toolz.itertoolz.core import partition

list(partition(2, [1, 2, 3, 4]))
[(1, 2), (3, 4)]

其他回答

让r是块大小,L是初始列表,您可以这样做。

chunkL = [ [i for i in L[r*k:r*(k+1)] ] for k in range(len(L)/r)] 

使用列表综合:

l = [1,2,3,4,5,6,7,8,9,10,11,12]
k = 5 #chunk size
print [tuple(l[x:y]) for (x, y) in [(x, x+k) for x in range(0, len(l), k)]]
>>> def f(x, n, acc=[]): return f(x[n:], n, acc+[(x[:n])]) if x else acc
>>> f("Hallo Welt", 3)
['Hal', 'lo ', 'Wel', 't']
>>> 

如果你在括号里-我拿起了一本关于Erlang的书:)

我很好奇不同方法的性能,这里是:

在Python 3.5.1上测试

import time
batch_size = 7
arr_len = 298937

#---------slice-------------

print("\r\nslice")
start = time.time()
arr = [i for i in range(0, arr_len)]
while True:
    if not arr:
        break

    tmp = arr[0:batch_size]
    arr = arr[batch_size:-1]
print(time.time() - start)

#-----------index-----------

print("\r\nindex")
arr = [i for i in range(0, arr_len)]
start = time.time()
for i in range(0, round(len(arr) / batch_size + 1)):
    tmp = arr[batch_size * i : batch_size * (i + 1)]
print(time.time() - start)

#----------batches 1------------

def batch(iterable, n=1):
    l = len(iterable)
    for ndx in range(0, l, n):
        yield iterable[ndx:min(ndx + n, l)]

print("\r\nbatches 1")
arr = [i for i in range(0, arr_len)]
start = time.time()
for x in batch(arr, batch_size):
    tmp = x
print(time.time() - start)

#----------batches 2------------

from itertools import islice, chain

def batch(iterable, size):
    sourceiter = iter(iterable)
    while True:
        batchiter = islice(sourceiter, size)
        yield chain([next(batchiter)], batchiter)


print("\r\nbatches 2")
arr = [i for i in range(0, arr_len)]
start = time.time()
for x in batch(arr, batch_size):
    tmp = x
print(time.time() - start)

#---------chunks-------------
def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]
print("\r\nchunks")
arr = [i for i in range(0, arr_len)]
start = time.time()
for x in chunks(arr, batch_size):
    tmp = x
print(time.time() - start)

#-----------grouper-----------

from itertools import zip_longest # for Python 3.x
#from six.moves import zip_longest # for both (uses the six compat library)

def grouper(iterable, n, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return zip_longest(*[iter(iterable)]*n, fillvalue=padvalue)

arr = [i for i in range(0, arr_len)]
print("\r\ngrouper")
start = time.time()
for x in grouper(arr, batch_size):
    tmp = x
print(time.time() - start)

结果:

slice
31.18285083770752

index
0.02184295654296875

batches 1
0.03503894805908203

batches 2
0.22681021690368652

chunks
0.019841909408569336

grouper
0.006506919860839844
def chunk(input, size):
    return map(None, *([iter(input)] * size))