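What is an efficient way to find the most common element in a Python list?

My list items may not be hashable, so I can't use a dictionary. Also, in case of ties, the item with the lowest index should be returned. Example: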

>>> most_common(['duck', 'duck', 'goose'])
'duck'
>>> most_common(['goose', 'duck', 'duck', 'goose'])
'goose'

Current answer

A simple one-liner:

def most_common(lst):
    return max(set(lst), key=lst.count)
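
Note that `set(lst)` requires hashable items, and which of several tied items `max` picks depends on the set's iteration order, so the one-liner does not guarantee the lowest-index rule from the question. For hashable items, a minimal sketch that keeps that rule could look like this. The name `most_common_first` is made up for illustration, and the sketch relies on `max` returning the first maximal element it encounters.

```python
from collections import Counter


def most_common_first(lst):
    """Return the most frequent item; ties go to the item that appears first."""
    counts = Counter(lst)  # one counting pass; items must be hashable
    # max() keeps the first maximal element it meets, so iterating over lst
    # itself (not over a set) lets the earliest item win a tie.
    return max(lst, key=counts.__getitem__)


print(most_common_first(['duck', 'duck', 'goose']))           # duck
print(most_common_first(['goose', 'duck', 'duck', 'goose']))  # goose
```

Like the one-liner, this raises `ValueError` on an empty list.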

Other answers

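If the items are not hashable, you can sort them and make a single pass over the result, counting items as you go (equal items will be adjacent). However, making them hashable and using a dictionary is probably faster.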

def most_common(lst):
    cur_length = 0
    max_length = 0
    cur_i = 0
    max_i = 0
    cur_item = None
    max_item = None
    for i, item in sorted(enumerate(lst), key=lambda x: x[1]):
        if cur_item is None or cur_item != item:
            if cur_length > max_length or (cur_length == max_length and cur_i < max_i):
                max_length = cur_length
                max_i = cur_i
                max_item = cur_item
            cur_length = 1
            cur_i = i
            cur_item = item
        else:
            cur_length += 1
    if cur_length > max_length or (cur_length == max_length and cur_i < max_i):
        return cur_item
    return max_item
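
The "make them hashable and use a dictionary" alternative mentioned above can be sketched as follows, assuming each item can be converted to a hashable stand-in. The function name `most_common_keyed` and the `to_hashable` parameter are hypothetical; `tuple` is only a reasonable default when the items are flat lists.

```python
def most_common_keyed(lst, to_hashable=tuple):
    """Count via a dict of hashable stand-ins; ties go to the lowest index."""
    keys = [to_hashable(item) for item in lst]  # e.g. list -> tuple
    counts = {}
    for k in keys:
        counts[k] = counts.get(k, 0) + 1
    # max() returns the first index with the top count,
    # which is the lowest index in a tie.
    best_index = max(range(len(lst)), key=lambda i: counts[keys[i]])
    return lst[best_index]


print(most_common_keyed([[1, 2], [3], [3], [1, 2]]))  # [1, 2]
print(most_common_keyed([[3], [1, 2], [3]]))          # [3]
```

Unlike the sort-based version, this does not need the items to be orderable, only convertible to something hashable. It raises `ValueError` on an empty list.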
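
ans = [1, 1, 0, 0, 1, 1]
# Map each count to an item with that count (items sharing a count overwrite each other)
all_ans = {ans.count(ans[i]): ans[i] for i in range(len(ans))}
print(all_ans)           # {4: 1, 2: 0}
max_key = max(all_ans.keys())
print(max_key)           # 4
print(all_ans[max_key])  # 1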
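
Because the dictionary above is keyed by count, items that share a count overwrite one another, so in a tie the item at the highest index survives rather than the lowest. A minimal sketch of the same counting idea that keeps the question's tie rule and needs neither hashing nor sorting is shown below; `most_common_by_count` is an illustrative name, and the approach is still quadratic because it calls `lst.count` once per element.

```python
def most_common_by_count(lst):
    """Most frequent item using only ==; ties go to the lowest index."""
    # lst.count compares with ==, so items need not be hashable or sortable.
    # max() returns the first index with the highest count,
    # which is the lowest index in a tie.
    best_index = max(range(len(lst)), key=lambda i: lst.count(lst[i]))
    return lst[best_index]


print(most_common_by_count([1, 1, 0, 0, 1, 1]))      # 1
print(most_common_by_count([[1], [2], [2], [1]]))    # [1]
```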

I needed to do this in a recent project. I admit I couldn't understand Alex's answer, so this is what I ended up with.

def mostPopular(l):
    mpEl=None
    mpIndex=0
    mpCount=0
    curEl=None
    curCount=0
    for i, el in sorted(enumerate(l), key=lambda x: (x[1], x[0]), reverse=True):
        curCount=curCount+1 if el==curEl else 1
        curEl=el
        if curCount>mpCount \
        or (curCount==mpCount and i<mpIndex):
            mpEl=curEl
            mpIndex=i
            mpCount=curCount
    return mpEl, mpCount, mpIndex

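I timed it against Alex's solution: for short lists it is about 10-15% faster, but once the list grows past 100 or so elements (tested up to 200,000), it is about 20% slower.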

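# This returns the list sorted by frequency (most frequent values first):

import numpy as np

def orderByFrequency(values):
    # Sorted distinct values; tolist() turns NumPy scalars back into plain Python values
    listUniqueValues = np.unique(values).tolist()
    listQty = []
    listOrderedByFrequency = []

    # Count how often each distinct value occurs
    for i in range(len(listUniqueValues)):
        listQty.append(values.count(listUniqueValues[i]))
    # Repeatedly take the value with the highest remaining count
    for i in range(len(listQty)):
        index_bigger = np.argmax(listQty)
        for j in range(listQty[index_bigger]):
            listOrderedByFrequency.append(listUniqueValues[index_bigger])
        listQty[index_bigger] = -1  # mark this value as already used
    return listOrderedByFrequency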

# And this returns a list with all of the most frequent values in a list:

def getMostFrequentValues(values):

    # Nothing to compare for empty or single-element input
    if len(values) <= 1:
        return values

    list_most_frequent = []
    list_ordered_by_frequency = orderByFrequency(values)
    
    list_most_frequent.append(list_ordered_by_frequency[0])
    frequency = list_ordered_by_frequency.count(list_ordered_by_frequency[0])
    
    index = 0
    while(index < len(list_ordered_by_frequency)):
        index = index + frequency
        
        if(index < len(list_ordered_by_frequency)):
            testValue = list_ordered_by_frequency[index]
            testValueFrequency = list_ordered_by_frequency.count(testValue)
            
            if (testValueFrequency == frequency):
                list_most_frequent.append(testValue)
            else:
                break    
    
    return list_most_frequent

#tests:
print(getMostFrequentValues([]))
print(getMostFrequentValues([1]))
print(getMostFrequentValues([1,1]))
print(getMostFrequentValues([2,1]))
print(getMostFrequentValues([2,2,1]))
print(getMostFrequentValues([1,2,1,2]))
print(getMostFrequentValues([1,2,1,2,2]))
print(getMostFrequentValues([3,2,3,5,6,3,2,2]))
print(getMostFrequentValues([1,2,2,60,50,3,3,50,3,4,50,4,4,60,60]))

Results:
[]
[1]
[1]
[1, 2]
[2]
[1, 2]
[2]
[2, 3]
[3, 4, 50, 60]
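
For hashable values, the same "all most frequent values" result can be sketched more compactly with `collections.Counter` from the standard library. This is an alternative to the NumPy-based code above, not the author's method, and the name `most_frequent_values` is made up for illustration. Sorting the result assumes the values are mutually comparable, which matches the sorted output shown above.

```python
from collections import Counter


def most_frequent_values(values):
    """Return every value that shares the highest count, in sorted order."""
    if not values:
        return []
    counts = Counter(values)        # value -> number of occurrences
    top = max(counts.values())      # the highest count
    return sorted(v for v, c in counts.items() if c == top)


print(most_frequent_values([1, 2, 1, 2]))              # [1, 2]
print(most_frequent_values([3, 2, 3, 5, 6, 3, 2, 2]))  # [2, 3]
```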