How do I split (partition) a list based on a condition?

I have code like this:

good = [x for x in mylist if x in goodvals]
bad = [x for x in mylist if x not in goodvals]

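The goal is to split the contents of mylist into two other lists, depending on whether or not each item meets a condition.

How can I do this more elegantly? Can I avoid doing two separate iterations over mylist? Can I improve performance by doing so?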

Current answer

Clear and fast

This list comprehension is simple to read and fast. It is exactly what the OP asked for.

set_good_vals = set(good_vals)    # Speed boost.
good = [x for x in my_list if x in set_good_vals]
bad = [x for x in my_list if x not in set_good_vals]

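I would prefer a single list comprehension rather than two, but unlike many of the answers posted (some of which are quite ingenious), this one is readable and clear. It is also one of the fastest answers on the page.

The only answer that is (slightly) faster is: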

set_good_vals = set(good_vals)
good, bad = [], []
for item in my_list:
    _ = good.append(item) if item in set_good_vals else bad.append(item)

...and variations of it. (See my other answer.) But I find the first approach more elegant, and it is almost as fast.
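As a minimal sketch, the single-pass loop above can be wrapped in a small reusable function. The name split_by and its signature are hypothetical (they do not come from the answer); the sample data is made up for illustration.

```python
def split_by(predicate, items):
    """Return (matching, non_matching) lists, testing each item exactly once."""
    matching, non_matching = [], []
    for item in items:
        # Choose the target list first, then append to it.
        (matching if predicate(item) else non_matching).append(item)
    return matching, non_matching


set_good_vals = {1, 3, 7, 8, 9}  # a set makes each membership test O(1) on average
good, bad = split_by(lambda x: x in set_good_vals, [1, 2, 3, 4, 5, 6, 7])
print(good)  # [1, 3, 7]
print(bad)   # [2, 4, 5, 6]
```

Both output lists keep the relative order of the input, and the predicate runs once per item.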

2021-04-19 17:38:07

Other answers

bad = []
good = [x for x in mylist if x in goodvals or bad.append(x)]

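append returns None, which is falsy, so a "bad" item is appended to bad and then filtered out of good. That is why this works.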
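A short sketch of why the trick works, using made-up sample values: for a good item, "or" short-circuits and the append never runs; for a bad item, the append runs as a side effect and its None result makes the filter condition false.

```python
goodvals = {1, 3, 7}
mylist = [1, 2, 3, 4, 7, 8]

# list.append mutates the list in place and returns None, which is falsy.
assert [].append(0) is None

bad = []
# Good x: `x in goodvals` is True, so `or` never evaluates bad.append(x).
# Bad x: bad.append(x) runs and returns None, so x is left out of `good`.
good = [x for x in mylist if x in goodvals or bad.append(x)]

print(good)  # [1, 3, 7]
print(bad)   # [2, 4, 8]
```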

2019-09-17 14:18:40

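My take on it. I propose a lazy, single-pass partition function that preserves relative order in the output subsequences.

1. Requirements

I assume that the requirements are:

- maintain the elements' relative order (hence, no sets and dictionaries);
- evaluate the condition only once for every element (hence, no (i)filter or groupby);
- allow lazy consumption of either sequence (if we can afford to precompute them, then the naïve implementation is likely to be acceptable too).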
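To illustrate the second requirement, here is a sketch (Python 3) of a common lazy two-filter approach built on itertools.tee. It keeps order and is lazy, but it tests every element twice once both outputs are consumed. The function name partition_two_pass and the counting predicate are hypothetical, added only for this demonstration.

```python
from itertools import filterfalse, tee


def partition_two_pass(pred, iterable):
    """Lazy and order-preserving, but each element is tested twice overall."""
    t1, t2 = tee(iterable)
    return filter(pred, t1), filterfalse(pred, t2)


tested = []  # records every call to the predicate


def is_even(n):
    tested.append(n)
    return n % 2 == 0


evens, odds = partition_two_pass(is_even, range(5))
evens, odds = list(evens), list(odds)
print(evens, odds)   # [0, 2, 4] [1, 3]
print(len(tested))   # 10: two tests for each of the 5 elements
```

If the condition is expensive or has side effects, that doubling is exactly what a single-pass partition avoids.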

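2. The split library

My partition function (introduced below) and other similar functions have made it into a small library:

python-split

It can be installed in the usual way via PyPI: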

pip install --user split

To split a list based on a condition, use the partition function:

>>> from split import partition
>>> files = [ ('file1.jpg', 33, '.jpg'), ('file2.avi', 999, '.avi') ]
>>> image_types = ('.jpg','.jpeg','.gif','.bmp','.png')
>>> images, other = partition(lambda f: f[-1] in image_types, files)
>>> list(images)
[('file1.jpg', 33, '.jpg')]
>>> list(other)
[('file2.avi', 999, '.avi')]

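3. The partition function explained

Internally we need to build two subsequences at once, so consuming only one output sequence forces the other one to be computed too. We also need to keep state between user requests (to store elements that have been processed but not yet requested). To keep state, I use two double-ended queues (deques):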

from collections import deque

The SplitSeq class takes care of the housekeeping:

class SplitSeq:
    def __init__(self, condition, sequence):
        self.cond = condition
        self.goods = deque([])
        self.bads = deque([])
        self.seq = iter(sequence)

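The magic happens in its .getNext() method. It is almost like the .next() method of iterators, but it lets us specify which kind of element we want this time. Behind the scenes it does not discard the rejected elements; instead it puts them in one of the two queues: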

    def getNext(self, getGood=True):
        if getGood:
            these, those, cond = self.goods, self.bads, self.cond
        else:
            these, those, cond = self.bads, self.goods, lambda x: not self.cond(x)
        if these:
            return these.popleft()
        else:
            while True:  # exits when next() raises StopIteration
                n = next(self.seq)  # built-in next() works on Python 2.6+ and 3
                if cond(n):
                    return n
                else:
                    those.append(n)

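The end user is supposed to use the partition function. It takes a condition function and a sequence (just like map or filter) and returns two generators. The first generator builds the subsequence of elements for which the condition holds; the second one builds the complementary subsequence. Iterators and generators allow lazy splitting of even long or infinite sequences.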

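def partition(condition, sequence):
    cond = condition if condition else bool  # evaluate as bool if condition is None
    ss = SplitSeq(cond, sequence)
    def goods():
        while True:
            try:
                yield ss.getNext(getGood=True)
            except StopIteration:  # source exhausted; must not leak out of a generator (PEP 479)
                return
    def bads():
        while True:
            try:
                yield ss.getNext(getGood=False)
            except StopIteration:
                return
    return goods(), bads()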

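I chose to make the test function the first argument to facilitate partial application in the future (similar to how map and filter take the test function as the first argument).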
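The laziness described in this answer can be sketched compactly for Python 3. This is a restructured adaptation for illustration, not the library's actual code: it keeps the same idea (one shared source iterator plus two deques holding elements the other side has not yet asked for), but the internal helper gen and the queues dict are assumptions of this sketch.

```python
from collections import deque
from itertools import count, islice


def partition(condition, sequence):
    """Return two lazy generators: items passing `condition`, and the rest."""
    cond = condition if condition is not None else bool
    source = iter(sequence)
    queues = {True: deque(), False: deque()}  # items seen but not yet requested

    def gen(wanted):
        mine, other = queues[wanted], queues[not wanted]
        while True:
            if mine:  # serve items the sibling generator set aside for us
                yield mine.popleft()
                continue
            for item in source:  # each item is tested exactly once
                if bool(cond(item)) == wanted:
                    yield item
                    break
                other.append(item)  # keep it for the sibling generator
            else:
                return  # source exhausted and nothing queued

    return gen(True), gen(False)


# Works on an infinite sequence because nothing is computed until requested.
evens, odds = partition(lambda n: n % 2 == 0, count())
print(list(islice(evens, 3)))  # [0, 2, 4]
print(list(islice(odds, 3)))   # [1, 3, 5]
```

One caveat of any such design: if one side is consumed far ahead of the other, the skipped elements accumulate in memory, and asking for a kind of element that never occurs in an infinite source will not terminate.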

2011-10-24 19:42:33

First attempt (pre-OP-edit): use sets:

mylist = [1,2,3,4,5,6,7]
goodvals = [1,3,7,8,9]

myset = set(mylist)
goodset = set(goodvals)

print(list(myset.intersection(goodset)))  # [1, 3, 7] (set order is not guaranteed)
print(list(myset.difference(goodset)))    # [2, 4, 5, 6]

This is good for both readability (IMHO) and performance.

Second attempt (post-OP-edit):

Create your list of good extensions as a set:
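One thing to keep in mind with the set-based approach: sets drop duplicates and do not preserve the original order. A small sketch with made-up data shows the difference from the list comprehensions in the question.

```python
mylist = [5, 1, 5, 2, 1]
goodvals = [1, 5]
goodset = set(goodvals)

# Set operations: each value appears once; sorted() is used here only
# because set iteration order is arbitrary.
print(sorted(set(mylist) & goodset))  # [1, 5]
print(sorted(set(mylist) - goodset))  # [2]

# List comprehensions keep duplicates and the original order.
print([x for x in mylist if x in goodset])      # [5, 1, 5, 1]
print([x for x in mylist if x not in goodset])  # [2]
```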

IMAGE_TYPES = set(['.jpg','.jpeg','.gif','.bmp','.png'])

That will improve performance. Otherwise, what you have looks fine to me.
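A minimal sketch of how the set might be used, assuming each file is a (name, size, extension) tuple like the sample tuples shown elsewhere in this thread. The variable names images and others are illustrative.

```python
IMAGE_TYPES = {'.jpg', '.jpeg', '.gif', '.bmp', '.png'}

# Assumed shape: (name, size, extension).
files = [('file1.jpg', 33, '.jpg'), ('file2.avi', 999, '.avi')]

# Membership tests against a set are O(1) on average,
# instead of scanning a tuple or list of extensions each time.
images = [f for f in files if f[2] in IMAGE_TYPES]
others = [f for f in files if f[2] not in IMAGE_TYPES]

print(images)  # [('file1.jpg', 33, '.jpg')]
print(others)  # [('file2.avi', 999, '.avi')]
```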

2009-06-04 07:41:20

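I turned to numpy to solve this, to limit the number of lines and turn it into a simple function.

I was able to split a list into two on a condition by using np.where to separate out each part. This works for numbers, but I believe it could be extended to strings and lists.

Here it is...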

from numpy import where as wh, array as arr

# Split array `a` into (values above mid, values at or below mid).
midz = lambda a, mid: (a[wh(a > mid)], a[wh(a <= mid)])
p_ = arr([i for i in [75, 50, 403, 453, 0, 25, 428] if i])  # `if i` drops the 0
high, low = midz(p_, p_.mean())
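The same split can also be sketched with a boolean mask, which computes the comparison once and uses its negation for the complement. The function name split_at is hypothetical; boolean-mask indexing and the ~ operator on boolean arrays are standard NumPy features.

```python
import numpy as np


def split_at(values, threshold):
    """Return (values above threshold, values at or below threshold)."""
    values = np.asarray(values)
    mask = values > threshold  # boolean array, computed once
    return values[mask], values[~mask]


p = np.array([75, 50, 403, 453, 25, 428])
high, low = split_at(p, p.mean())  # the mean here is 239.0
print(high.tolist())  # [403, 453, 428]
print(low.tolist())   # [75, 50, 25]
```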

2021-05-24 20:41:22

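This question already has quite a few answers, but all seem inferior to my favorite approach to this problem. It iterates through and tests each item just once, and it uses the speed of a list comprehension to build one of the two output lists, so the comparatively slow append is needed for only one of the output lists: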

bad = []
good = [x for x in mylist if x in goodvals or bad.append(x)]

In my answer to a similar question, I explain how this approach works: Python's short-circuit evaluation of "or" skips the append for "good" items, and append returns a falsy value (None), which leaves the if condition false for "bad" items. I also show timeit results indicating that this approach outcompetes alternatives like those suggested here, especially when the majority of items go into the list built by the list comprehension (in this case, the good list).
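For readers who want to measure this themselves, here is a rough timing sketch. The data sizes, the seed, and the three function names are arbitrary choices for illustration, and no particular ranking is claimed: results depend on the Python version, the machine, and the proportion of good items.

```python
import random
import timeit

random.seed(0)
goodvals = set(range(0, 1000, 2))
mylist = [random.randrange(1000) for _ in range(10_000)]


def two_comprehensions():
    good = [x for x in mylist if x in goodvals]
    bad = [x for x in mylist if x not in goodvals]
    return good, bad


def comprehension_with_append():
    bad = []
    good = [x for x in mylist if x in goodvals or bad.append(x)]
    return good, bad


def plain_loop():
    good, bad = [], []
    for x in mylist:
        (good if x in goodvals else bad).append(x)
    return good, bad


# All three must agree before their timings are worth comparing.
assert two_comprehensions() == comprehension_with_append() == plain_loop()

for fn in (two_comprehensions, comprehension_with_append, plain_loop):
    print(fn.__name__, timeit.timeit(fn, number=100))
```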

2022-04-24 05:46:25
