我有这样的代码:
good = [x for x in mylist if x in goodvals]
bad = [x for x in mylist if x not in goodvals]
目标是根据mylist的内容是否满足条件,将它们拆分为另外两个列表。
我怎样才能做得更优雅呢?我能避免在mylist上做两个单独的迭代吗?我可以通过这样做来提高性能吗?
我有这样的代码:
good = [x for x in mylist if x in goodvals]
bad = [x for x in mylist if x not in goodvals]
目标是根据mylist的内容是否满足条件,将它们拆分为另外两个列表。
我怎样才能做得更优雅呢?我能避免在mylist上做两个单独的迭代吗?我可以通过这样做来提高性能吗?
当前回答
这个问题已经有很多答案了,但似乎都不如我最喜欢的解决这个问题的方法,这种方法只遍历和测试每个项目一次,并使用列表理解的速度来构建两个输出列表之一,因此它只需要使用相对较慢的附加来构建一个输出列表:
bad = []
good = [x for x in mylist if x in goodvals or bad.append(x)]
In my answer to a similar question, I explain how this approach works (a combination of Python's greedy evaluation of or refraining from executing the append for "good" items, and append returning a false-like value which leaves the if condition false for "bad" items), and I show timeit results indicating that this approach outcompetes alternatives like those suggested here, especially in cases where the majority of items will go into the list built by list-comprehension (in this case, the good list).
其他回答
我认为基于N个条件来划分一个可迭代对象是很方便的
from collections import OrderedDict
def partition(iterable,*conditions):
'''Returns a list with the elements that satisfy each of condition.
Conditions are assumed to be exclusive'''
d= OrderedDict((i,list())for i in range(len(conditions)))
for e in iterable:
for i,condition in enumerate(conditions):
if condition(e):
d[i].append(e)
break
return d.values()
例如:
ints,floats,other = partition([2, 3.14, 1, 1.69, [], None],
lambda x: isinstance(x, int),
lambda x: isinstance(x, float),
lambda x: True)
print " ints: {}\n floats:{}\n other:{}".format(ints,floats,other)
ints: [2, 1]
floats:[3.14, 1.69]
other:[[], None]
如果元素可以满足多个条件,则删除断点。
如果你不介意使用一个外部库,有两个我知道本机实现这个操作:
>>> files = [ ('file1.jpg', 33, '.jpg'), ('file2.avi', 999, '.avi')]
>>> IMAGE_TYPES = ('.jpg','.jpeg','.gif','.bmp','.png')
iteration_utilities.partition: >>> from iteration_utilities import partition >>> notimages, images = partition(files, lambda x: x[2].lower() in IMAGE_TYPES) >>> notimages [('file2.avi', 999, '.avi')] >>> images [('file1.jpg', 33, '.jpg')] more_itertools.partition >>> from more_itertools import partition >>> notimages, images = partition(lambda x: x[2].lower() in IMAGE_TYPES, files) >>> list(notimages) # returns a generator so you need to explicitly convert to list. [('file2.avi', 999, '.avi')] >>> list(images) [('file1.jpg', 33, '.jpg')]
还有另一个答案,简短但“邪恶”(用于理解列表的副作用)。
digits = list(range(10))
odd = [x.pop(i) for i, x in enumerate(digits) if x % 2]
>>> odd
[1, 3, 5, 7, 9]
>>> digits
[0, 2, 4, 6, 8]
如果你不想用两行代码来完成一个语义只需要一次的操作,你可以把上面的一些方法(甚至是你自己的方法)包装在一个函数中:
def part_with_predicate(l, pred):
return [i for i in l if pred(i)], [i for i in l if not pred(i)]
这不是一种惰性计算方法,它确实对列表进行了两次迭代,但是它允许您在一行代码中对列表进行分区。
所有提出的解决方案的问题是,它将扫描和应用过滤功能两次。我会做一个简单的小函数,像这样:
def split_into_two_lists(lst, f):
a = []
b = []
for elem in lst:
if f(elem):
a.append(elem)
else:
b.append(elem)
return a, b
这样你就不会重复处理任何东西,也不会重复代码。