我有一个Python脚本,它把一个整数列表作为输入,我需要一次处理四个整数。不幸的是,我无法控制输入,否则我将它作为一个四元素元组列表传入。目前,我以这种方式迭代它:

for i in range(0, len(ints), 4):
    # dummy op for example code
    foo += ints[i] * ints[i + 1] + ints[i + 2] * ints[i + 3]

不过,它看起来很像“C-think”,这让我怀疑有一种更python的方式来处理这种情况。该列表在迭代后被丢弃,因此不需要保留。也许这样会更好?

while ints:
    foo += ints[0] * ints[1] + ints[2] * ints[3]
    ints[0:4] = []

不过,感觉还是不太对。: - /

相关问题:在Python中如何将列表分割成大小均匀的块?


当前回答

要避免所有到列表的转换,请导入itertools和:

>>> for k, g in itertools.groupby(xrange(35), lambda x: x/10):
...     list(g)

生产:

... 
0 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
1 [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
2 [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
3 [30, 31, 32, 33, 34]
>>> 

我检查了groupby,它不转换为列表或使用len,所以我(认为)这将延迟每个值的解析,直到它实际使用。不幸的是,没有一个现成的答案(在这个时候)似乎提供了这种变化。

显然,如果你需要依次处理每一项,在g上嵌套一个for循环:

for k,g in itertools.groupby(xrange(35), lambda x: x/10):
    for i in g:
       # do what you need to do with individual items
    # now do what you need to do with the whole group

我对此特别感兴趣的是需要消耗一个生成器,以批量提交最多1000个更改到gmail API:

    messages = a_generator_which_would_not_be_smart_as_a_list
    for idx, batch in groupby(messages, lambda x: x/1000):
        batch_request = BatchHttpRequest()
        for message in batch:
            batch_request.add(self.service.users().messages().modify(userId='me', id=message['id'], body=msg_labels))
        http = httplib2.Http()
        self.credentials.authorize(http)
        batch_request.execute(http=http)

其他回答

我需要一个解决方案,也将工作与集和生成器。我写不出很短很漂亮的东西,但至少可读性很好。

def chunker(seq, size):
    res = []
    for el in seq:
        res.append(el)
        if len(res) == size:
            yield res
            res = []
    if res:
        yield res

列表:

>>> list(chunker([i for i in range(10)], 3))
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]

Set:

>>> list(chunker(set([i for i in range(10)]), 3))
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]

发电机:

>>> list(chunker((i for i in range(10)), 3))
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
def group_by(iterable, size):
    """Group an iterable into lists that don't exceed the size given.

    >>> group_by([1,2,3,4,5], 2)
    [[1, 2], [3, 4], [5]]

    """
    sublist = []

    for index, item in enumerate(iterable):
        if index > 0 and index % size == 0:
            yield sublist
            sublist = []

        sublist.append(item)

    if sublist:
        yield sublist

首先,我将它设计为将字符串拆分为子字符串以解析包含十六进制的字符串。 今天我把它变成复杂的,但仍然简单的生成器。

def chunker(iterable, size, reductor, condition):
    it = iter(iterable)
    def chunk_generator():
        return (next(it) for _ in range(size))
    chunk = reductor(chunk_generator())
    while condition(chunk):
        yield chunk
        chunk = reductor(chunk_generator())

参数:

明显的

Iterable是任何包含/生成/迭代输入数据的Iterable /迭代器/生成器, 当然,大小是你想要得到的块的大小,

更有趣的

reductor is a callable, which receives generator iterating over content of chunk. I'd expect it to return sequence or string, but I don't demand that. You can pass as this argument for example list, tuple, set, frozenset, or anything fancier. I'd pass this function, returning string (provided that iterable contains / generates / iterates over strings): def concatenate(iterable): return ''.join(iterable) Note that reductor can cause closing generator by raising exception. condition is a callable which receives anything what reductor returned. It decides to approve & yield it (by returning anything evaluating to True), or to decline it & finish generator's work (by returning anything other or raising exception). When number of elements in iterable is not divisible by size, when it gets exhausted, reductor will receive generator generating less elements than size. Let's call these elements lasts elements. I invited two functions to pass as this argument: lambda x:x - the lasts elements will be yielded. lambda x: len(x)==<size> - the lasts elements will be rejected. replace <size> using number equal to size

在我的特殊情况下,我需要填充项目,以重复最后一个元素,直到它达到大小,所以我改变了这个答案,以适应我的需要。

大小为4的输入输出示例:

Input = [1,2,3,4,5,6,7,8]
Output= [[1,2,3,4], [5,6,7,8]]

Input = [[1,2,3,4,5,6,7]]
Output= [[1,2,3,4], [5,6,7,7]]

Input = [1,2,3,4,5]
Output= [[1,2,3,4], [5,5,5,5]]
def chunker(seq, size):
    res = []
    for el in seq:
        res.append(el)
        if len(res) == size:
            yield res
            res = []
    if res:
        res = res + (size - len(res)) * [res[-1]] 
        yield res

下面是一个支持生成器的无导入chunker:

def chunks(seq, size):
    it = iter(seq)
    while True:
        ret = tuple(next(it) for _ in range(size))
        if len(ret) == size:
            yield ret
        else:
            raise StopIteration()

使用示例:

>>> def foo():
...     i = 0
...     while True:
...         i += 1
...         yield i
...
>>> c = chunks(foo(), 3)
>>> c.next()
(1, 2, 3)
>>> c.next()
(4, 5, 6)
>>> list(chunks('abcdefg', 2))
[('a', 'b'), ('c', 'd'), ('e', 'f')]