我需要写一个加权版的random。选择(列表中的每个元素有不同的被选择的概率)。这是我想到的:

def weightedChoice(choices):
    """Like random.choice, but each element can have a different chance of
    being selected.

    choices can be any iterable containing iterables with two items each.
    Technically, they can have more than two items, the rest will just be
    ignored.  The first item is the thing being chosen, the second item is
    its weight.  The weights can be any numeric values, what matters is the
    relative differences between them.
    """
    space = {}
    current = 0
    for choice, weight in choices:
        if weight > 0:
            space[current] = choice
            current += weight
    rand = random.uniform(0, current)
    for key in sorted(space.keys() + [current]):
        if rand < key:
            return choice
        choice = space[key]
    return None

这个函数对我来说太复杂了,而且很丑。我希望这里的每个人都能提供一些改进的建议或其他方法。对我来说,效率没有代码的整洁和可读性重要。


当前回答

下面是Python 3.6标准库中包含的版本:

import itertools as _itertools
import bisect as _bisect

class Random36(random.Random):
    "Show the code included in the Python 3.6 version of the Random class"

    def choices(self, population, weights=None, *, cum_weights=None, k=1):
        """Return a k sized list of population elements chosen with replacement.

        If the relative weights or cumulative weights are not specified,
        the selections are made with equal probability.

        """
        random = self.random
        if cum_weights is None:
            if weights is None:
                _int = int
                total = len(population)
                return [population[_int(random() * total)] for i in range(k)]
            cum_weights = list(_itertools.accumulate(weights))
        elif weights is not None:
            raise TypeError('Cannot specify both weights and cumulative weights')
        if len(cum_weights) != len(population):
            raise ValueError('The number of weights does not match the population')
        bisect = _bisect.bisect
        total = cum_weights[-1]
        return [population[bisect(cum_weights, random() * total)] for i in range(k)]

来源:https://hg.python.org/cpython/file/tip/Lib/random.py l340

其他回答

我不喜欢它们的语法。我只想具体说明这些项目是什么以及每项的权重是多少。我意识到我可以用随机。选项,但我很快就写了下面的类。

import random, string
from numpy import cumsum

class randomChoiceWithProportions:
    '''
    Accepts a dictionary of choices as keys and weights as values. Example if you want a unfair dice:


    choiceWeightDic = {"1":0.16666666666666666, "2": 0.16666666666666666, "3": 0.16666666666666666
    , "4": 0.16666666666666666, "5": .06666666666666666, "6": 0.26666666666666666}
    dice = randomChoiceWithProportions(choiceWeightDic)

    samples = []
    for i in range(100000):
        samples.append(dice.sample())

    # Should be close to .26666
    samples.count("6")/len(samples)

    # Should be close to .16666
    samples.count("1")/len(samples)
    '''
    def __init__(self, choiceWeightDic):
        self.choiceWeightDic = choiceWeightDic
        weightSum = sum(self.choiceWeightDic.values())
        assert weightSum == 1, 'Weights sum to ' + str(weightSum) + ', not 1.'
        self.valWeightDict = self._compute_valWeights()

    def _compute_valWeights(self):
        valWeights = list(cumsum(list(self.choiceWeightDic.values())))
        valWeightDict = dict(zip(list(self.choiceWeightDic.keys()), valWeights))
        return valWeightDict

    def sample(self):
        num = random.uniform(0,1)
        for key, val in self.valWeightDict.items():
            if val >= num:
                return key

在Udacity免费课程AI for Robotics中,Sebastien Thurn对此进行了演讲。基本上,他用mod运算符%做了一个权重索引的圆形数组,将变量beta设为0,随机选择一个索引, for循环遍历N,其中N是指标的数量,在for循环中,首先按公式增加beta:

Beta = Beta +来自{0…2 * Weight_max}

然后在for循环中嵌套一个while循环per:

while w[index] < beta:
    beta = beta - w[index]
    index = index + 1

select p[index]

然后到下一个索引,根据概率(或课程中介绍的情况下的归一化概率)重新采样。

在Udacity上找到第8课,机器人人工智能的第21期视频,他正在讲粒子滤波器。

我需要做这样的事情非常快速非常简单,从搜索的想法,我终于建立了这个模板。其思想是以json的形式从api接收加权值,这里是由dict模拟的。

然后将其转换为一个列表,其中每个值都与它的权重成比例地重复,只需使用random。选择从列表中选择一个值。

我尝试了10次、100次和1000次迭代。分布似乎很稳定。

def weighted_choice(weighted_dict):
    """Input example: dict(apples=60, oranges=30, pineapples=10)"""
    weight_list = []
    for key in weighted_dict.keys():
        weight_list += [key] * weighted_dict[key]
    return random.choice(weight_list)

为random.choice()提供一个预先加权的列表:

解决方案和测试:

import random

options = ['a', 'b', 'c', 'd']
weights = [1, 2, 5, 2]

weighted_options = [[opt]*wgt for opt, wgt in zip(options, weights)]
weighted_options = [opt for sublist in weighted_options for opt in sublist]
print(weighted_options)

# test

counts = {c: 0 for c in options}
for x in range(10000):
    counts[random.choice(weighted_options)] += 1

for opt, wgt in zip(options, weights):
    wgt_r = counts[opt] / 10000 * sum(weights)
    print(opt, counts[opt], wgt, wgt_r)

输出:

['a', 'b', 'b', 'c', 'c', 'c', 'c', 'c', 'd', 'd']
a 1025 1 1.025
b 1948 2 1.948
c 5019 5 5.019
d 2008 2 2.008

如果您的加权选项列表是相对静态的,并且您希望频繁采样,则可以执行一个O(N)预处理步骤,然后使用相关答案中的函数在O(1)中进行选择。

# run only when `choices` changes.
preprocessed_data = prep(weight for _,weight in choices)

# O(1) selection
value = choices[sample(preprocessed_data)][0]