给定一个无序的值列表,比如
a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
我怎样才能得到出现在列表中的每个值的频率,就像这样?
# `a` has 4 instances of `1`, 4 of `2`, 2 of `3`, 1 of `4,` 2 of `5`
b = [4, 4, 2, 1, 2] # expected output
给定一个无序的值列表,比如
a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
我怎样才能得到出现在列表中的每个值的频率,就像这样?
# `a` has 4 instances of `1`, 4 of `2`, 2 of `3`, 1 of `4,` 2 of `5`
b = [4, 4, 2, 1, 2] # expected output
当前回答
另一种方法是使用较重但功能强大的库——NLTK。
import nltk
fdist = nltk.FreqDist(a)
fdist.values()
fdist.most_common()
其他回答
如果您不想使用任何库并保持简单和简短,可以尝试这种方法!
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
marked = []
b = [(a.count(i), marked.append(i))[0] for i in a if i not in marked]
print(b)
o/p
[4, 4, 2, 1, 2]
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
counts = dict.fromkeys(a, 0)
for el in a: counts[el] += 1
print(counts)
# {1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
通过遍历列表并计算它们,手动计算出现的数量,使用collections.defaultdict跟踪到目前为止看到的内容:
from collections import defaultdict
appearances = defaultdict(int)
for curr in a:
appearances[curr] += 1
def frequencyDistribution(data):
return {i: data.count(i) for i in data}
print frequencyDistribution([1,2,3,4])
...
{1: 1, 2: 1, 3: 1, 4: 1} # originalNumber: count
我将简单地以以下方式使用scipy.stats.itemfreq:
from scipy.stats import itemfreq
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
freq = itemfreq(a)
a = freq[:,0]
b = freq[:,1]
您可以在这里查看文档:http://docs.scipy.org/doc/scipy-0.16.0/reference/generated/scipy.stats.itemfreq.html