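How to find all occurrences of an element in a list

index() gives the first occurrence of an item in a list. Is there a neat trick that returns all the indices of an element in a list?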

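Current answer

Create a generator

Generators are fast and have a small memory footprint. They also give you flexibility in how you use the result.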

def indices(iter, val):
    """Generator: yields all indices of val in iter
    Raises a ValueError if val does not occur in iter
    Passes on the AttributeError if iter does not have an index method (e.g. is a set)
    """
    i = -1
    NotFound = False
    while not NotFound:
        try:
            i = iter.index(val, i+1)
        except ValueError:
            NotFound = True
        else:
            yield i
    if i == -1:
        raise ValueError("No occurrences of {v} in {i}".format(v = val, i = iter))

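The code above can be used to build a list of the indices: list(indices(input, value)); to use them as dictionary keys: dict.fromkeys(indices(input, value)); to sum them: sum(indices(input, value)); in a for loop: for index_ in indices(input, value):; and so on, all without creating an interim list, tuple or similar.

In a for loop you get each index back as soon as you ask for it, without waiting for all the others to be computed first. That means that if you break out of the loop for some reason, you save the time that would have been spent finding indices you never needed.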
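As a rough illustration of that lazy behaviour, the sketch below re-declares a compact variant of the generator (it simply stops instead of raising the final ValueError) and consumes it in three ways. The sample list `data` and the result names are made up for the example.

```python
from itertools import islice

def indices(seq, val):
    """Yield every index of val in seq, lazily (compact variant, never raises)."""
    i = -1
    while True:
        try:
            i = seq.index(val, i + 1)
        except ValueError:
            return
        yield i

data = [3, 1, 3, 7, 3, 3]  # hypothetical sample input

all_hits = list(indices(data, 3))              # consume every match
first_two = list(islice(indices(data, 3), 2))  # stop early; later matches are never searched for
total = sum(indices(data, 3))                  # no intermediate list is built
```

With islice, the generator is only advanced twice, so the remaining part of the list is never scanned.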

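How it works

1. Call .index on the input iter to find the next occurrence of val.
2. Use the second parameter of .index to start searching just after the last occurrence found.
3. Yield the index.
4. Repeat until index raises a ValueError.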
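The steps can be traced by hand with plain list.index calls. This is only a sketch of the mechanism; the list `letters` and the variable names are invented for the illustration.

```python
letters = ["a", "b", "a", "c", "a"]  # hypothetical sample input

first = letters.index("a")               # search from the start
second = letters.index("a", first + 1)   # resume just past the previous hit
third = letters.index("a", second + 1)

try:
    letters.index("a", third + 1)        # nothing left to find
    exhausted = False
except ValueError:
    exhausted = True                     # list.index signals "not found" by raising
```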

Alternate versions

I tried four different versions for flow control: two EAFP (using try - except) and two LBYL (with a logical test in the while statement):

1. "WhileTrueBreak": while True: ... except ValueError: break. Surprisingly, this was usually a touch slower than option 2 and (IMV) less readable.
2. "WhileErrFalse": Using a bool variable err to identify when a ValueError is raised. This is generally the fastest and more readable than 1.
3. "RemainingSlice": Check whether val is in the remaining part of the input using slicing: while val in iter[i:]. Unsurprisingly, this does not scale well.
4. "LastOccurrence": Check first where the last occurrence is, keep going while i < last.

The overall performance differences between 1, 2 and 4 are negligible, so it comes down to personal style and preference. Given that .index uses ValueError to tell you it found nothing, rather than, say, returning None, an EAFP approach seems fitting to me.

Here are the four code variants and the timeit results (in milliseconds) for different input lengths and match densities:

@version("WhileTrueBreak", versions)
def indices2(iter, val):
    i = -1
    while True:
        try:
            i = iter.index(val, i+1)
        except ValueError:
            break
        else:
            yield i

@version("WhileErrFalse", versions)
def indices5(iter, val):
    i = -1
    err = False
    while not err:
        try:
            i = iter.index(val, i+1)
        except ValueError:
            err = True
        else:
            yield i

@version("RemainingSlice", versions)
def indices1(iter, val):
    i = 0
    while val in iter[i:]:
        i = iter.index(val, i)
        yield i
        i += 1

@version("LastOccurrence", versions)
def indices4(iter,val):
    i = 0
    last = len(iter) - tuple(reversed(iter)).index(val)
    while i < last:
        i = iter.index(val, i)
        yield i
        i += 1
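The `version` decorator and the `versions` object used above are not shown in the answer. One plausible shape, offered purely as an assumption, is a small registry decorator that stores each function under its label so a timing loop can iterate over them. The sample input and the repeat count below are arbitrary.

```python
import timeit

versions = {}  # assumed registry: label -> function

def version(label, registry):
    """Decorator factory (assumed shape): record func under label, return it unchanged."""
    def register(func):
        registry[label] = func
        return func
    return register

@version("WhileTrueBreak", versions)
def indices2(seq, val):
    i = -1
    while True:
        try:
            i = seq.index(val, i + 1)
        except ValueError:
            break
        else:
            yield i

sample = [0, 1, 0, 2] * 25  # made-up input of length 100

# Total seconds for 1000 full passes of each registered variant.
timings = {
    label: timeit.timeit(lambda f=func: list(f(sample, 0)), number=1000)
    for label, func in versions.items()
}
```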

Length: 100, Ocurrences: 4.0%
{'WhileTrueBreak': 0.0074799987487494946, 'WhileErrFalse': 0.006440002471208572, 'RemainingSlice': 0.01221001148223877, 'LastOccurrence': 0.00801000278443098}
Length: 1000, Ocurrences: 1.2%
{'WhileTrueBreak': 0.03101000329479575, 'WhileErrFalse': 0.0278000021353364, 'RemainingSlice': 0.08278000168502331, 'LastOccurrence': 0.03986000083386898}
Length: 10000, Ocurrences: 2.05%
{'WhileTrueBreak': 0.18062000162899494, 'WhileErrFalse': 0.1810499932616949, 'RemainingSlice': 2.9145700042136014, 'LastOccurrence': 0.2049500006251037}
Length: 100000, Ocurrences: 1.977%
{'WhileTrueBreak': 1.9361200043931603, 'WhileErrFalse': 1.7280600033700466, 'RemainingSlice': 254.4725100044161, 'LastOccurrence': 1.9101499929092824}
Length: 100000, Ocurrences: 9.873%
{'WhileTrueBreak': 2.832529996521771, 'WhileErrFalse': 2.9984100023284554, 'RemainingSlice': 1132.4922299943864, 'LastOccurrence': 2.6660699979402125}
Length: 100000, Ocurrences: 25.058%
{'WhileTrueBreak': 5.119729996658862, 'WhileErrFalse': 5.2082200068980455, 'RemainingSlice': 2443.0577100021765, 'LastOccurrence': 4.75954000139609}
Length: 100000, Ocurrences: 49.698%
{'WhileTrueBreak': 9.372120001353323, 'WhileErrFalse': 8.447749994229525, 'RemainingSlice': 5042.717969999649, 'LastOccurrence': 8.050809998530895}

2022-05-13 14:55:03

Other answers

If you need to find all positions of an element between certain indices, you can state the bounds in the condition:

[i for i,x in enumerate([1,2,3,2]) if x==2 and 2<= i <=3] # -> [3]
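For long lists, one possible variation is to slice first so that only the requested window is scanned; the `start` argument of enumerate keeps the reported indices relative to the original list. The names `lo` and `hi` are introduced here for illustration only.

```python
values = [1, 2, 3, 2]
lo, hi = 2, 3  # inclusive bounds, chosen for the example

# Only values[lo:hi + 1] is visited; start=lo restores the original positions.
hits = [i for i, x in enumerate(values[lo:hi + 1], start=lo) if x == 2]
```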

2017-12-01 18:53:48

You can create a defaultdict

from collections import defaultdict

lst1 = [1, 2, 2, 3, 4, 1, 2, 7]
d1 = defaultdict(int)      # missing keys default to 0
unq = set(lst1)
for each in unq:
    d1[each] = lst1.count(each)  # note: stores how often each value occurs, not its indices
print(d1)
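The snippet above records how many times each value occurs. If the positions themselves are wanted, a defaultdict(list) can collect them in a single pass. This is a sketch of that variation rather than part of the original answer; `positions` is a name chosen here.

```python
from collections import defaultdict

lst1 = [1, 2, 2, 3, 4, 1, 2, 7]

positions = defaultdict(list)  # missing keys start out as an empty list
for i, item in enumerate(lst1):
    positions[item].append(i)  # remember every index at which item appears
```

Looking up positions[2] then gives every index of 2 without rescanning the list.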

2017-12-07 16:31:23

Here is a timing comparison between np.where and a list comprehension. It seems that np.where is faster on average.

import time
import numpy as np

# np.where
start_times = []
end_times = []
for i in range(10000):
    start = time.time()
    start_times.append(start)
    temp_list = np.array([1,2,3,3,5])
    ixs = np.where(temp_list==3)[0].tolist()
    end = time.time()
    end_times.append(end)
print("Took on average {} seconds".format(
    np.mean(end_times)-np.mean(start_times)))

Took on average 3.81469726562e-06 seconds

# list_comprehension
start_times = []
end_times = []
for i in range(10000):
    start = time.time()
    start_times.append(start)
    temp_list = np.array([1,2,3,3,5])
    ixs = [i for i in range(len(temp_list)) if temp_list[i]==3]
    end = time.time()
    end_times.append(end)
print("Took on average {} seconds".format(
    np.mean(end_times)-np.mean(start_times)))

Took on average 4.05311584473e-06 seconds

2020-07-05 09:47:09

You can use a list comprehension with enumerate:

indices = [i for i, x in enumerate(my_list) if x == "whatever"]

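The iterator enumerate(my_list) yields pairs (index, item) for each item in the list. Using i, x as the loop variable target unpacks these pairs into the index i and the list item x. We filter down to all x that match our criterion and select the indices i of those elements.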
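To make the intermediate pairs visible, here is a small sketch with made-up data that materialises them before filtering.

```python
my_list = ["x", "whatever", "y", "whatever"]  # made-up data

pairs = list(enumerate(my_list))  # (index, item) tuples
indices = [i for i, x in pairs if x == "whatever"]
```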

2011-06-09 14:13:57

A solution using list.index:

def indices(lst, element):
    result = []
    offset = -1
    while True:
        try:
            offset = lst.index(element, offset+1)
        except ValueError:
            return result
        result.append(offset)

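For large lists it is a lot faster than the list comprehension with enumerate. It is also a lot slower than the numpy solution if you already have the array; otherwise the cost of converting outweighs the speed gain (tested on integer lists with 100, 1000 and 10000 elements).

NOTE: A word of caution based on Chris_Rands' comment: this solution is faster than the list comprehension if the results are sufficiently sparse, but if the list contains many instances of the element being searched for (more than about 15% of the list, in a test on a list of 1000 integers), the list comprehension is faster.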
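To see where the crossover falls on a particular machine, a comparison along the following lines could be used. The list size, densities, repeat count and helper names are arbitrary choices made for this sketch, and no particular timings are claimed.

```python
import random
import timeit

def indices_index(lst, element):
    """Repeated list.index calls, as in the answer above."""
    result, offset = [], -1
    while True:
        try:
            offset = lst.index(element, offset + 1)
        except ValueError:
            return result
        result.append(offset)

def indices_enum(lst, element):
    """List comprehension over enumerate."""
    return [i for i, x in enumerate(lst) if x == element]

def compare(density, size=1000, repeats=200):
    """Return (index_time, enumerate_time) in seconds for roughly `density` matches."""
    rng = random.Random(0)  # fixed seed so runs are repeatable
    lst = [1 if rng.random() < density else 0 for _ in range(size)]
    assert indices_index(lst, 1) == indices_enum(lst, 1)  # both must agree
    t_index = timeit.timeit(lambda: indices_index(lst, 1), number=repeats)
    t_enum = timeit.timeit(lambda: indices_enum(lst, 1), number=repeats)
    return t_index, t_enum

sparse = compare(0.01)  # few matches
dense = compare(0.5)    # many matches
```

Comparing the two tuples for a range of densities shows at what point, if any, the list comprehension overtakes the list.index loop for your data.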

2013-09-07 02:29:52
