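Rolling or sliding window iterator?

I need a rolling window (aka sliding window) iterable over a sequence/iterator/generator. (Default Python iteration can be considered a special case in which the window length is 1.) I'm currently using the following code. How can I do it more elegantly and/or more efficiently?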

from itertools import islice

def rolling_window(seq, window_size):
    it = iter(seq)
    win = list(islice(it, window_size))  # First window
    if len(win) < window_size:
        return  # Fewer items than one full window
    yield win
    for e in it: # Subsequent windows
        win[:-1] = win[1:]
        win[-1] = e
        yield win

if __name__=="__main__":
    for w in rolling_window(range(6), 3):
        print(w)

"""Example output:   
   [0, 1, 2]
   [1, 2, 3]
   [2, 3, 4]
   [3, 4, 5]
"""

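For the specific case of window_size == 2 (i.e., iterating over adjacent, overlapping pairs in a sequence), see also How can I iterate over overlapping (current, next) pairs of values from a list?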
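
For that two-item case, a minimal sketch might look like the following. The helper name `pairs` is made up for illustration; on Python 3.10 and later, `itertools.pairwise` does the same job.

```python
from itertools import tee

def pairs(iterable):
    """Yield overlapping (current, next) pairs from any iterable."""
    a, b = tee(iterable)
    next(b, None)  # advance the second copy by one item
    return zip(a, b)

print(list(pairs(range(5))))  # [(0, 1), (1, 2), (2, 3), (3, 4)]
```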

Current answer

My two versions of a window implementation

from typing import Iterable, Sequence

def window(seq: Sequence, n: int, strid: int = 1, drop_last: bool = False):
    for i in range(0, len(seq), strid):
        res = seq[i:i + n]
        if drop_last and len(res) < n:
            break
        yield res


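def window2(seq: Iterable, n: int, strid: int = 1, drop_last: bool = False):
    it = iter(seq)
    result = []
    step = 0
    emitted = True  # True when the current contents of result were already yielded
    for ele in it:
        result.append(ele)
        result = result[-n:]  # keep only the newest n items
        emitted = False
        if len(result) == n:
            if step % strid == 0:
                yield result
                emitted = True
            step += 1
    # Yield the trailing window only if it was not yielded already
    if not drop_last and not emitted:
        yield result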
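
A short usage sketch of the slicing-based `window`, showing how `drop_last` affects the short trailing windows. The function is repeated so the snippet runs on its own, and the sample data is made up for illustration.

```python
from typing import Sequence

def window(seq: Sequence, n: int, strid: int = 1, drop_last: bool = False):
    # Same slicing approach as the first function above.
    for i in range(0, len(seq), strid):
        res = seq[i:i + n]
        if drop_last and len(res) < n:
            break
        yield res

data = [0, 1, 2, 3, 4]
print(list(window(data, 3)))                  # [[0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4], [4]]
print(list(window(data, 3, drop_last=True)))  # [[0, 1, 2], [1, 2, 3], [2, 3, 4]]
print(list(window(data, 2, strid=2)))         # [[0, 1], [2, 3], [4]]
```

Without `drop_last`, the slices near the end are shorter than `n`, which may or may not be what the caller wants.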

2021-04-12 04:40:57

Other answers

How about using the following:

mylist = [1, 2, 3, 4, 5, 6, 7]

def sliding_window(l, window_size=2):
    if window_size > len(l):
        raise ValueError("Window size must be smaller or equal to the number of elements in the list.")

    t = []
    for i in range(0, window_size):
        t.append(l[i:])

    return list(zip(*t))  # list() so the result prints as a list on Python 3

print(sliding_window(mylist, 3))

Output:

[(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 7)]

2016-04-06 12:36:16

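Update

Kelly pointed out that this duplicates an existing answer. I'm leaving it here as a counterexample, though, because it includes a pointless min.

So if you were planning to use min to avoid an IndexError, there is no need; range already takes care of that case for you.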
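
A quick check of that claim, as a sketch: the two helper names are made up for illustration, and the first one mirrors the comprehension from the old answer.

```python
l = [0, 1, 2, 3, 4]

def windows_with_min(l, n):
    # The comprehension from the old answer, including the redundant min.
    return [l[i: i + min(n, len(l) - i)] for i in range(len(l) - n + 1)]

def windows_plain(l, n):
    # range(len(l) - n + 1) never produces an i with i + n > len(l),
    # and it is empty when n > len(l), so no clamping is needed.
    return [l[i: i + n] for i in range(len(l) - n + 1)]

for n in (2, 3, 5, 10):
    assert windows_with_min(l, n) == windows_plain(l, n)

print(windows_plain(l, 4))   # [[0, 1, 2, 3], [1, 2, 3, 4]]
print(windows_plain(l, 10))  # []
```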

Old answer

Curiously, the following also handles the n > len(l) case by returning [], which is semantically correct.

>>> l = [0, 1, 2, 3, 4]
>>> n = 2
>>> [l[i: i + min(n, len(l)-i)] for i in range(len(l)-n+1)]
[[0, 1], [1, 2], [2, 3], [3, 4]]
>>> n = 3
>>> [l[i: i + min(n, len(l)-i)] for i in range(len(l)-n+1)]
[[0, 1, 2], [1, 2, 3], [2, 3, 4]]
>>> n = 4
>>> [l[i: i + min(n, len(l)-i)] for i in range(len(l)-n+1)]
[[0, 1, 2, 3], [1, 2, 3, 4]]
>>> n = 5
>>> [l[i: i + min(n, len(l)-i)] for i in range(len(l)-n+1)]
[[0, 1, 2, 3, 4]]
>>> n = 10 # n > len(l)
>>> [l[i: i + min(n, len(l)-i)] for i in range(len(l)-n+1)]
[]

2022-10-17 17:20:34

Here is a generalization that adds support for step and fillvalue parameters:

from collections import deque
from itertools import islice

def sliding_window(iterable, size=2, step=1, fillvalue=None):
    if size < 0 or step < 1:
        raise ValueError
    it = iter(iterable)
    q = deque(islice(it, size), maxlen=size)
    if not q:
        return  # empty iterable or size == 0
    q.extend(fillvalue for _ in range(size - len(q)))  # pad to size
    while True:
        yield iter(q)  # iter() to avoid accidental outside modifications
        try:
            q.append(next(it))
        except StopIteration: # Python 3.5 pep 479 support
            return
        q.extend(next(it, fillvalue) for _ in range(step - 1))

It yields chunks of size items at a time, rolling step positions per iteration and padding each chunk with fillvalue if necessary. Example for size=4, step=3, fillvalue='*':

 [a b c d]e f g h i j k l m n o p q r s t u v w x y z
  a b c[d e f g]h i j k l m n o p q r s t u v w x y z
  a b c d e f[g h i j]k l m n o p q r s t u v w x y z
  a b c d e f g h i[j k l m]n o p q r s t u v w x y z
  a b c d e f g h i j k l[m n o p]q r s t u v w x y z
  a b c d e f g h i j k l m n o[p q r s]t u v w x y z
  a b c d e f g h i j k l m n o p q r[s t u v]w x y z
  a b c d e f g h i j k l m n o p q r s t u[v w x y]z
  a b c d e f g h i j k l m n o p q r s t u v w x[y z * *]
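
The diagram can be reproduced with a short usage sketch. The generator is copied here so the snippet runs on its own. Since each window is an iterator over the live deque, it should be consumed before the next window is requested.

```python
from collections import deque
from itertools import islice
import string

def sliding_window(iterable, size=2, step=1, fillvalue=None):
    # Copy of the generator above so that this snippet is self-contained.
    if size < 0 or step < 1:
        raise ValueError
    it = iter(iterable)
    q = deque(islice(it, size), maxlen=size)
    if not q:
        return
    q.extend(fillvalue for _ in range(size - len(q)))
    while True:
        yield iter(q)
        try:
            q.append(next(it))
        except StopIteration:
            return
        q.extend(next(it, fillvalue) for _ in range(step - 1))

# str.join consumes each window immediately, before the deque changes.
chunks = ["".join(w) for w in sliding_window(string.ascii_lowercase,
                                             size=4, step=3, fillvalue="*")]
print(chunks)
# ['abcd', 'defg', 'ghij', 'jklm', 'mnop', 'pqrs', 'stuv', 'vwxy', 'yz**']
```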

For an example use case for the step parameter, see Processing a large .txt file in python efficiently.

2012-11-15 23:47:32

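Just a quick contribution.

Since the current Python docs don't have "window" in the itertools examples (i.e., at the bottom of http://docs.python.org/library/itertools.html), here's a snippet based on the code for grouper, which is one of the examples given: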

import itertools as it
def window(iterable, size):
    shiftedStarts = [it.islice(iterable, s, None) for s in range(size)]
    return zip(*shiftedStarts)  # built-in zip is lazy on Python 3, like izip was

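Basically, we create a series of sliced iterators, each with a starting point one spot further forward. Then we zip them together. Note that this function returns an iterator (it is not itself a generator function).

Much like the appending-element and advancing-iterator versions above, the performance (i.e., which is best) varies with list size and window size. I like this one because it is a two-liner (it could be a one-liner, but I prefer naming concepts).

It turns out that the above code is wrong. It works if the argument passed to iterable is a sequence, but not if it is an iterator. If it is an iterator, the same iterator is shared (but not tee'd) among the islice calls, and this breaks things badly.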
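
To make the failure concrete, here is a small sketch. `window_shared` is simply the flawed function under a made-up name, written for Python 3 (range and the built-in zip).

```python
from itertools import islice

def window_shared(iterable, size):
    # The flawed version: every islice wraps the SAME object.
    shifted_starts = [islice(iterable, s, None) for s in range(size)]
    return zip(*shifted_starts)

expected = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5)]

# A list hands a fresh iterator to each islice, so this works.
print(list(window_shared([0, 1, 2, 3, 4, 5], 3)) == expected)        # True

# A one-shot iterator is drained by all the islices at once,
# so items get skipped and the result no longer matches.
print(list(window_shared(iter([0, 1, 2, 3, 4, 5]), 3)) == expected)  # False
```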

Here is some fixed code:

import itertools as it
def window(iterable, size):
    itrs = it.tee(iterable, size)
    shiftedStarts = [it.islice(anItr, s, None) for s, anItr in enumerate(itrs)]
    return zip(*shiftedStarts)  # built-in zip is lazy on Python 3, like izip was

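Also, one more version for the books. Instead of copying an iterator and then advancing the copies many times, this version makes pairwise copies of each iterator as we move the starting position forward. Thus, iterator t provides both the "complete" iterator with its starting point at t and the basis for creating iterator t + 1: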

import itertools as it
def window4(iterable, size):
    complete_itr, incomplete_itr = it.tee(iterable, 2)
    iters = [complete_itr]
    for i in range(1, size):
        next(incomplete_itr, None)  # advance by one; the default avoids StopIteration on short input
        complete_itr, incomplete_itr = it.tee(incomplete_itr, 2)
        iters.append(complete_itr)
    return zip(*iters)

2012-06-28 17:28:24

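This seems tailor-made for collections.deque, since you essentially have a FIFO (add to one end, remove from the other). However, even if you use a list, you shouldn't be slicing twice; instead, you should probably just pop(0) from the list and append() the new item.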
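
The list-based alternative mentioned above could be sketched like this. `window_list` is a hypothetical name, the code assumes n >= 1, and it yields tuples so that each window is an independent snapshot.

```python
from itertools import islice

def window_list(seq, n=2):
    """Sliding window over any iterable, kept in a plain list (assumes n >= 1)."""
    it = iter(seq)
    win = list(islice(it, n))  # first window
    if len(win) < n:
        return                 # not enough items for one full window
    yield tuple(win)
    for e in it:
        win.pop(0)             # drop the oldest item (O(n) for a list)
        win.append(e)          # add the newest item
        yield tuple(win)

print(list(window_list(range(6), 3)))
# [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5)]
```

`list.pop(0)` has to shift every remaining item, which is why a deque is the more natural fit for this access pattern.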

Here is an optimized deque-based implementation:

from collections import deque

def window(seq, n=2):
    it = iter(seq)
    win = deque((next(it, None) for _ in range(n)), maxlen=n)
    yield win
    append = win.append
    for e in it:
        append(e)
        yield win

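In my tests it handily beats everything else posted here most of the time, although pillmuncher's tee version beats it for large iterables and small windows. On larger windows, the deque pulls ahead again in raw speed.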

Access to individual items in the deque may be faster or slower than with lists or tuples. (Items near the beginning are faster, or items near the end if you use a negative index.) I put a sum(w) in the body of my loop; this plays to the deque's strength (iterating from one item to the next is fast, so this loop ran a full 20% faster than the next fastest method, pillmuncher's). When I changed it to individually look up and add items in a window of ten, the tables turned and the tee method was 20% faster. I was able to recover some speed by using negative indexes for the last five terms in the addition, but tee was still a little faster. Overall I would estimate that either one is plenty fast for most uses, and if you need a little more performance, profile and pick the one that works best.
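
For anyone who wants to profile on their own data, a rough timing harness could look like the sketch below. `window_tee` is a stand-in tee-based implementation and not necessarily pillmuncher's exact code; `window_deque` is a simplified deque variant. The data size and window size are arbitrary, and the timings will vary by machine and Python version.

```python
from collections import deque
from itertools import islice, tee
from timeit import timeit

def window_deque(seq, n=2):
    it = iter(seq)
    win = deque(islice(it, n), maxlen=n)
    if len(win) < n:
        return
    yield win                # the same deque object is reused for every window
    for e in it:
        win.append(e)
        yield win

def window_tee(seq, n=2):
    iters = tee(seq, n)
    for i, itr in enumerate(iters):
        next(islice(itr, i, i), None)  # advance the i-th copy by i items
    return zip(*iters)

def total(window_func, data, n):
    # Sum every window, like the sum(w) loop body described above.
    return sum(sum(w) for w in window_func(data, n))

data = range(10_000)
assert total(window_deque, data, 10) == total(window_tee, data, 10)

for func in (window_deque, window_tee):
    seconds = timeit(lambda: total(func, data, 10), number=5)
    print(f"{func.__name__}: {seconds:.3f}s")
```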

2011-07-25 21:46:19
