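What is the best way to split a list into roughly equal parts? For example, if the list has 7 elements and is split into 2 parts, we want 3 elements in one part and 4 in the other.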

I'm looking for something like even_split(L, n) that breaks L into n parts.

def chunks(L, n):
    """ Yield successive n-sized chunks from L.
    """
    for i in range(0, len(L), n):
        yield L[i:i+n]

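The code above gives chunks of 3, rather than 3 chunks. I could simply transpose (iterate over this and take the first element of each column, call that part one, then take the second element and put it in part two, and so on), but that destroys the ordering of the items.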
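
To make the difference concrete, here is a minimal sketch comparing the two approaches described above on a 7-element list. The names `by_size` and `transposed` are only illustrative and do not come from the question.

```python
def chunks(L, n):
    """Yield successive n-sized chunks from L."""
    for i in range(0, len(L), n):
        yield L[i:i + n]

L = [1, 2, 3, 4, 5, 6, 7]

# Chunks *of size* 3: order is kept, but the last piece is short.
by_size = list(chunks(L, 3))

# "Transposing": part k takes every 3rd element starting at index k.
# The parts have near-equal lengths, but neighbouring items are separated.
transposed = [L[k::3] for k in range(3)]

print(by_size)
print(transposed)
```

Here `by_size` is [[1, 2, 3], [4, 5, 6], [7]] and `transposed` is [[1, 4, 7], [2, 5], [3, 6]], so neither is an order-preserving split into 3 roughly equal parts.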


Current answer

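Suppose you want to split the list [1, 2, 3, 4, 5, 6, 7, 8] into sublists of 3 elements,

for example [[1, 2, 3], [4, 5, 6], [7, 8]], where any leftover elements (fewer than 3) are grouped together in the last sublist.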

my_list = [1, 2, 3, 4, 5, 6, 7, 8]
my_list2 = [my_list[i:i+3] for i in range(0, len(my_list), 3)]
print(my_list2)

Output: [[1, 2, 3], [4, 5, 6], [7, 8]]

Each part has length 3 here, except possibly the last. Replace 3 with your own chunk size.

Other answers

This code works for me (Python 3 compatible):

def chunkify(tab, num):
    return [tab[i*num: i*num+num] for i in range(len(tab)//num+(1 if len(tab)%num else 0))]

Example (shown with a bytearray, but it works for lists as well):

>>> b = bytearray(b'\x01\x02\x03\x04\x05\x06\x07\x08')
>>> chunkify(b,3)
[bytearray(b'\x01\x02\x03'), bytearray(b'\x04\x05\x06'), bytearray(b'\x07\x08')]
>>> chunkify(b,4)
[bytearray(b'\x01\x02\x03\x04'), bytearray(b'\x05\x06\x07\x08')]

Another attempt at a simple, readable chunker.

def chunk(iterable, count): # returns a *generator* that divides `iterable` into `count` of contiguous chunks of similar size
    assert count >= 1
    return (iterable[int(_*len(iterable)/count+0.5):int((_+1)*len(iterable)/count+0.5)] for _ in range(count))

print("Chunk count:  ", len(list(         chunk(range(105),10))))
print("Chunks:       ",     list(         chunk(range(105),10)))
print("Chunks:       ",     list(map(list,chunk(range(105),10))))
print("Chunk lengths:",     list(map(len, chunk(range(105),10))))

print("Testing...")
for iterable_length in range(100):
    for chunk_count in range(1,100):
        chunks = list(chunk(range(iterable_length),chunk_count))
        assert chunk_count == len(chunks)
        assert iterable_length == sum(map(len,chunks))
        assert all(map(lambda _:abs(len(_)-iterable_length/chunk_count)<=1,chunks))
print("Okay")

Output:

Chunk count:   10
Chunks:        [range(0, 11), range(11, 21), range(21, 32), range(32, 42), range(42, 53), range(53, 63), range(63, 74), range(74, 84), range(84, 95), range(95, 105)]
Chunks:        [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15, 16, 17, 18, 19, 20], [21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31], [32, 33, 34, 35, 36, 37, 38, 39, 40, 41], [42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52], [53, 54, 55, 56, 57, 58, 59, 60, 61, 62], [63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73], [74, 75, 76, 77, 78, 79, 80, 81, 82, 83], [84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94], [95, 96, 97, 98, 99, 100, 101, 102, 103, 104]]
Chunk lengths: [11, 10, 11, 10, 11, 10, 11, 10, 11, 10]
Testing...
Okay
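
To see where the slice boundaries in `chunk` come from, the sketch below computes the cut points directly for a small case, using the same rounding expression. The helper name `boundaries` is hypothetical and is not part of the answer.

```python
def boundaries(length, count):
    """Return the count + 1 cut points produced by the rounding expression."""
    # Cut point i sits at i * length / count, rounded to the nearest integer.
    return [int(i * length / count + 0.5) for i in range(count + 1)]

cuts = boundaries(7, 3)
# Each chunk spans two consecutive cut points.
sizes = [end - start for start, end in zip(cuts, cuts[1:])]

print(cuts)
print(sizes)
```

For 7 items in 3 chunks the cut points are [0, 2, 5, 7], giving chunk sizes [2, 3, 2].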

Here is a single function that handles most of the different splitting cases:

def splitList(lst, into):
    '''Split a list into parts.

    :Parameters:
        into (str) = Split the list into parts defined by the following:
            '<n>parts' - Split the list into n parts.
                ex. 2 returns:  [[1, 2, 3, 5], [7, 8, 9]] from [1,2,3,5,7,8,9]
            '<n>parts+' - Split the list into n equal parts with any trailing remainder.
                ex. 2 returns:  [[1, 2, 3], [5, 7, 8], [9]] from [1,2,3,5,7,8,9]
            '<n>chunks' - Split into sublists of n size.
                ex. 2 returns: [[1,2], [3,5], [7,8], [9]] from [1,2,3,5,7,8,9]
            'contiguous' - The list will be split by contiguous numerical values.
                ex. 'contiguous' returns: [[1,2,3], [5], [7,8,9]] from [1,2,3,5,7,8,9]
            'range' - The values of 'contiguous' will be limited to the high and low end of each range.
                ex. 'range' returns: [[1,3], [5], [7,9]] from [1,2,3,5,7,8,9]
    :Return:
        (list)
    '''
    from string import digits, ascii_letters, punctuation
    mode = into.lower().lstrip(digits)
    digit = into.strip(ascii_letters+punctuation)
    n = int(digit) if digit else None

    if n:
        if mode=='parts':
            n = len(lst)*-1 // n*-1 #ceil
        elif mode=='parts+':
            n = len(lst) // n
        return [lst[i:i+n] for i in range(0, len(lst), n)]

    elif mode=='contiguous' or mode=='range':
        from itertools import groupby
        from operator import itemgetter

        try:
            contiguous = [list(map(itemgetter(1), g)) for k, g in groupby(enumerate(lst), lambda x: int(x[0])-int(x[1]))]
        except ValueError as error:
            print ('{} in splitList\n   # Error: {} #\n {}'.format(__file__, error, lst))
            return lst
        if mode=='range':
            return [[i[0], i[-1]] if len(i)>1 else (i) for i in contiguous]
        return contiguous

r = splitList([1, '2', 3, 5, '7', 8, 9], into='2parts')
print (r) #returns: [[1, '2', 3, 5], ['7', 8, 9]]
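
The 'contiguous' and 'range' modes rely on a grouping trick that is easier to follow in isolation. The following is a minimal sketch of that idea only, assuming a list of integers; `contiguous_runs` is a hypothetical helper name, not part of `splitList`.

```python
from itertools import groupby

def contiguous_runs(values):
    """Group consecutive integers into runs (illustrative helper)."""
    runs = []
    # Within a run of consecutive integers, index - value stays constant,
    # so groupby starts a new group exactly where a gap appears.
    for _, group in groupby(enumerate(values), key=lambda pair: pair[0] - pair[1]):
        runs.append([value for _, value in group])
    return runs

runs = contiguous_runs([1, 2, 3, 5, 7, 8, 9])
# Collapse each run to its low and high end, as the 'range' mode does.
ranges = [[r[0], r[-1]] if len(r) > 1 else r for r in runs]

print(runs)
print(ranges)
```

This prints [[1, 2, 3], [5], [7, 8, 9]] and then [[1, 3], [5], [7, 9]], matching the examples in the docstring.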

Changing the code to yield n chunks rather than chunks of n:

def chunks(l, n):
    """ Yield n successive chunks from l.
    """
    newn = len(l) // n  # base chunk size, rounded down
    for i in range(0, n-1):
        yield l[i*newn:i*newn+newn]
    # the last chunk takes whatever remains
    yield l[n*newn-newn:]

l = list(range(56))
three_chunks = chunks(l, 3)
print(next(three_chunks))
print(next(three_chunks))
print(next(three_chunks))

This gives:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
[18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]
[36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55]

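This assigns the extra elements to the final group, which is not perfect but well within your specification of "roughly N equal parts" :-) By that I mean 56 elements would be better split as (19, 19, 18), whereas this gives (18, 18, 20).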
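
For comparison, one common way to get the (19, 19, 18) distribution is to spread the remainder over the first few parts using `divmod`. This is a sketch of that alternative technique, not this answer's method; the name `even_split` is borrowed from the question.

```python
def even_split(seq, n):
    """Split seq into n contiguous parts whose lengths differ by at most 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    size, extra = divmod(len(seq), n)
    parts = []
    start = 0
    for i in range(n):
        # The first `extra` parts each take one additional element.
        end = start + size + (1 if i < extra else 0)
        parts.append(seq[start:end])
        start = end
    return parts

lengths = [len(part) for part in even_split(list(range(56)), 3)]
print(lengths)
```

With 56 elements and 3 parts, `divmod(56, 3)` is (18, 2), so the first two parts get 19 elements and the last gets 18.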

You can get more balanced output with the following code:

#!/usr/bin/env python3
def chunks(l, n):
    """ Yield n successive chunks from l.
    """
    # round the chunk size to the nearest integer instead of rounding down
    newn = int(1.0 * len(l) / n + 0.5)
    for i in range(0, n-1):
        yield l[i*newn:i*newn+newn]
    # the last chunk takes whatever remains
    yield l[n*newn-newn:]

l = list(range(56))
three_chunks = chunks(l, 3)
print(next(three_chunks))
print(next(three_chunks))
print(next(three_chunks))

Output:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
[19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37]
[38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55]
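
The rounding trick works well for 56 items in 3 parts, but it does not appear to balance every combination of sizes. The quick check below uses the same logic written for Python 3; the helper `lengths` is hypothetical and only reports chunk sizes.

```python
def chunks(l, n):
    """Same rounding logic as above, written for Python 3."""
    newn = int(1.0 * len(l) / n + 0.5)
    for i in range(0, n - 1):
        yield l[i * newn:i * newn + newn]
    yield l[n * newn - newn:]

def lengths(total, n):
    """Return the size of each chunk for a list of `total` items."""
    return [len(c) for c in chunks(list(range(total)), n)]

print(lengths(56, 3))
print(lengths(9, 6))
```

The first call prints [19, 19, 18], but the second prints [2, 2, 2, 2, 1, 0]: rounding the chunk size up leaves nothing for the last chunk, so inputs like this may need a different approach.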

Here is my solution:

def chunks(l, amount):
    if amount < 1:
        raise ValueError('amount must be positive integer')
    chunk_len = len(l) // amount
    leap_parts = len(l) % amount
    remainder = amount // 2  # make it symmetrical
    i = 0
    while i < len(l):
        remainder += leap_parts
        end_index = i + chunk_len
        if remainder >= amount:
            remainder -= amount
            end_index += 1
        yield l[i:end_index]
        i = end_index

This produces:

    >>> list(chunks([1, 2, 3, 4, 5, 6, 7], 3))
    [[1, 2], [3, 4, 5], [6, 7]]