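Given a function which produces a random integer in the range 1 to 5, write a function which produces a random integer in the range 1 to 7.
Current answer
Python: there is a simple two-line answer that uses a combination of spatial algebra and modulus. It is not intuitive. My explanation of it is confusing, but it is correct.
Know that 5*7 = 35 and 7/5 = 1 remainder 2. How do we guarantee that the sum of the remainders is always 0? 5*[7/5 = 1 remainder 2] --> 35/5 = 7 remainder 0
Imagine we have a ribbon wrapped around a pole with a circumference of 7. The ribbon needs to be 35 units long to wrap evenly. Randomly select 7 ribbon segments of len=[1...5]. The effective length, ignoring the wrap-arounds, is the same as this method of converting rand5() into rand7().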
import numpy as np
import pandas as pd
# display is a notebook function FYI
def rand5(): ## random uniform int [1...5]
    return np.random.randint(1,6)
n_trials = 1000
samples = [rand5() for _ in range(n_trials)]
display(pd.Series(samples).value_counts(normalize=True))
# 4 0.2042
# 5 0.2041
# 2 0.2010
# 1 0.1981
# 3 0.1926
# dtype: float64
def rand7(): # magic algebra
    x = sum(rand5() for _ in range(7))
    return x%7 + 1
samples = [rand7() for _ in range(n_trials)]
display(pd.Series(samples).value_counts(normalize=False))
# 6 1475
# 2 1475
# 3 1456
# 1 1423
# 7 1419
# 4 1393
# 5 1359
# dtype: int64
df = pd.DataFrame([
    pd.Series([rand7() for _ in range(n_trials)]).value_counts(normalize=True)
    for _ in range(1000)
])
df.describe()
# 1 2 3 4 5 6 7
# count 1000.000000 1000.000000 1000.000000 1000.000000 1000.000000 1000.000000 1000.000000
# mean 0.142885 0.142928 0.142523 0.142266 0.142704 0.143048 0.143646
# std 0.010807 0.011526 0.010966 0.011223 0.011052 0.010983 0.011153
# min 0.112000 0.108000 0.101000 0.110000 0.100000 0.109000 0.110000
# 25% 0.135000 0.135000 0.135000 0.135000 0.135000 0.135000 0.136000
# 50% 0.143000 0.142000 0.143000 0.142000 0.143000 0.142000 0.143000
# 75% 0.151000 0.151000 0.150000 0.150000 0.150000 0.150000 0.151000
# max 0.174000 0.181000 0.175000 0.178000 0.189000 0.176000 0.179000
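Sampling can only show that the output is close to uniform. As a cross-check, the exact distribution of the sum of seven rand5() draws modulo 7 can be computed by enumeration. The sketch below is an addition, not part of the original answer: `exact_rand7_distribution` is a hypothetical helper name, and it assumes rand5() is exactly uniform on 1..5.

```python
from fractions import Fraction
from itertools import product


def exact_rand7_distribution(draws=7):
    """Exact probability of each value 1..7 for sum(rand5() x draws) % 7 + 1."""
    counts = {value: 0 for value in range(1, 8)}
    # Enumerate every equally likely tuple of rand5() results (5**draws of them).
    for rolls in product(range(1, 6), repeat=draws):
        counts[sum(rolls) % 7 + 1] += 1
    total = 5 ** draws
    return {value: Fraction(count, total) for value, count in counts.items()}


if __name__ == "__main__":
    dist = exact_rand7_distribution()
    for value, prob in sorted(dist.items()):
        print(value, prob, float(prob))
    worst = max(abs(prob - Fraction(1, 7)) for prob in dist.values())
    print("largest deviation from 1/7:", float(worst))
```

Because 5**7 = 78125 is not a multiple of 7, the seven probabilities cannot all be exactly 1/7. The enumeration shows how small the remaining bias is, which is consistent with the sampled means above sitting near 0.1429.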
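Other answers
By using a rolling total, you can both:
maintain an equal distribution; and not have to sacrifice any element in the random sequence.
Both of these problems are an issue with the simplistic rand(5)+rand(5)...-type solutions. The following Python code shows how to implement it (most of it is there to prove the distribution).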
import random

x = []
for i in range(0, 7):
    x.append(0)
t = 0
tt = 0

for i in range(0, 700000):
    ########################################
    #####            qq.py             #####
    r = int(random.random() * 5)
    t = (t + r) % 7
    ########################################
    #####       qq_notsogood.py        #####
    #r = 20
    #while r > 6:
        #r =     int(random.random() * 5)
        #r = r + int(random.random() * 5)
    #t = r
    ########################################
    x[t] = x[t] + 1
    tt = tt + 1

# Print the distribution and the spread between the most and least frequent value.
high = x[0]
low = x[0]
for i in range(0, 7):
    print("%d: %7d %.5f" % (i, x[i], 100.0 * x[i] / tt))
    if x[i] < low:
        low = x[i]
    if x[i] > high:
        high = x[i]
diff = high - low
print("Variation = %d (%.5f%%)" % (diff, 100.0 * diff / tt))
This output shows the results:
pax$ python qq.py
0: 99908 14.27257
1: 100029 14.28986
2: 100327 14.33243
3: 100395 14.34214
4: 99104 14.15771
5: 99829 14.26129
6: 100408 14.34400
Variation = 1304 (0.18629%)
pax$ python qq.py
0: 99547 14.22100
1: 100229 14.31843
2: 100078 14.29686
3: 99451 14.20729
4: 100284 14.32629
5: 100038 14.29114
6: 100373 14.33900
Variation = 922 (0.13171%)
pax$ python qq.py
0: 100481 14.35443
1: 99188 14.16971
2: 100284 14.32629
3: 100222 14.31743
4: 99960 14.28000
5: 99426 14.20371
6: 100439 14.34843
Variation = 1293 (0.18471%)
A simplistic rand(5)+rand(5), ignoring those cases where it returns more than 6, has a typical variation of 18%, 100 times that of the method shown above:
pax$ python qq_notsogood.py
0: 31756 4.53657
1: 63304 9.04343
2: 95507 13.64386
3: 127825 18.26071
4: 158851 22.69300
5: 127567 18.22386
6: 95190 13.59857
Variation = 127095 (18.15643%)
pax$ python qq_notsogood.py
0: 31792 4.54171
1: 63637 9.09100
2: 95641 13.66300
3: 127627 18.23243
4: 158751 22.67871
5: 126782 18.11171
6: 95770 13.68143
Variation = 126959 (18.13700%)
pax$ python qq_notsogood.py
0: 31955 4.56500
1: 63485 9.06929
2: 94849 13.54986
3: 127737 18.24814
4: 159687 22.81243
5: 127391 18.19871
6: 94896 13.55657
Variation = 127732 (18.24743%)
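The 18% figure can also be derived exactly instead of measured. The following sketch is an addition for illustration (`naive_sum_distribution` is a hypothetical name). It assumes, as in the commented-out qq_notsogood.py variant, that each rand5() is uniform on 0..4 and that sums above 6 are redrawn.

```python
from fractions import Fraction
from itertools import product


def naive_sum_distribution():
    """Exact distribution of rand5() + rand5() (each 0..4), redrawing sums above 6."""
    counts = [0] * 7
    for a, b in product(range(5), repeat=2):
        if a + b <= 6:  # sums of 7 or 8 are rejected and redrawn
            counts[a + b] += 1
    accepted = sum(counts)  # 22 of the 25 equally likely pairs survive
    return [Fraction(c, accepted) for c in counts]


if __name__ == "__main__":
    dist = naive_sum_distribution()
    for value, prob in enumerate(dist):
        print("%d: %.5f%%" % (value, 100 * float(prob)))
    print("Variation = %.5f%%" % (100 * float(max(dist) - min(dist))))
```

The accepted pairs give weights 1, 2, 3, 4, 5, 4, 3 out of 22, so the gap between the most and least likely value is 4/22, about 18.18%, which matches the measured runs.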
And, on the advice of Nixuz, I have cleaned the script up so you can just extract and use the rand7... stuff:
import random

# rand5() returns 0 through 4 inclusive.
def rand5():
    return int(random.random() * 5)

# rand7() generator returns 0 through 6 inclusive (using rand5()).
def rand7():
    rand7ret = 0
    while True:
        rand7ret = (rand7ret + rand5()) % 7
        yield rand7ret

# Number of test runs.
count = 700000

# Work out distribution.
distrib = [0,0,0,0,0,0,0]
rgen = rand7()
for i in range(0, count):
    r = next(rgen)
    distrib[r] = distrib[r] + 1

# Print distributions and calculate variation.
high = distrib[0]
low = distrib[0]
for i in range(0, 7):
    print("%d: %7d %.5f" % (i, distrib[i], 100.0 * distrib[i] / count))
    if distrib[i] < low:
        low = distrib[i]
    if distrib[i] > high:
        high = distrib[i]
diff = high - low
print("Variation = %d (%.5f%%)" % (diff, 100.0 * diff / count))
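The test above measures only how often each value appears overall. A separate property that may matter, depending on the use, is how successive outputs relate to each other. The sketch below is an addition, not part of the answer: `rolling_rand7` mirrors the generator above and `step_counts` is a hypothetical helper that tallies the step from one output to the next.

```python
import random


def rand5():
    # Assumed uniform source returning 0 through 4, as in the script above.
    return int(random.random() * 5)


def rolling_rand7():
    # Same rolling-total idea as the rand7() generator above.
    total = 0
    while True:
        total = (total + rand5()) % 7
        yield total


def step_counts(n_samples=70000):
    """Count how often each step size (current - previous) % 7 occurs."""
    gen = rolling_rand7()
    previous = next(gen)
    steps = [0] * 7
    for _ in range(n_samples):
        current = next(gen)
        steps[(current - previous) % 7] += 1
        previous = current
    return steps


if __name__ == "__main__":
    print(step_counts())
```

Since each output is the previous one plus a value in 0..4, steps of 5 and 6 never occur. The overall frequencies can look even while consecutive values remain dependent on each other.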
#!/usr/bin/env ruby
class Integer
  def rand7
    rand(7)+1   # Kernel#rand(n) returns 0...n, so this yields 1..7
  end
end
def rand5
  rand(5)+1     # yields 1..5
end
x = rand5() # x => int between 1 and 5
y = x.rand7() # y => int between 1 and 7
...although this may be considered cheating.
I think you are all overthinking this. Doesn't this simple solution work?
int rand7(void)
{
    static int startpos = 0;
    startpos = (startpos+5) % (5*7);
    return (((startpos + rand5()-1)%7)+1);
}
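One way to answer that question is to enumerate what the function can return. The sketch below is a Python port written for illustration, assuming rand5() is uniform on 1..5. The names `reachable_outputs` and `long_run_counts` are hypothetical and do not appear in the answer.

```python
def reachable_outputs(startpos):
    """Values the C function can return for a given (already advanced) startpos."""
    return {(startpos + r - 1) % 7 + 1 for r in range(1, 6)}


def long_run_counts():
    """Tally outputs over one full cycle of startpos, for every rand5() result."""
    counts = {value: 0 for value in range(1, 8)}
    startpos = 0
    for _ in range(7):  # startpos takes 7 distinct values before repeating
        startpos = (startpos + 5) % (5 * 7)
        for r in range(1, 6):  # every possible rand5() result
            counts[(startpos + r - 1) % 7 + 1] += 1
    return counts


if __name__ == "__main__":
    print(sorted(reachable_outputs(5)))  # first call: startpos has just become 5
    print(long_run_counts())
```

Over a full cycle of seven calls each value is reachable equally often, but any single call can return only five of the seven values, so an individual draw is not uniform on 1..7.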
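I played around and wrote a "test environment" for this Rand(7) algorithm. For example, you can use it if you want to see what distribution your algorithm produces, or how many iterations it takes to generate all distinct random values (1 to 7 for Rand(7)).
My core algorithm is:
return (Rand5() + Rand5()) % 7 + 1;
It is as uniformly distributed as Adam Rosenfield's. (I included his in the code snippet.)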
private static int Rand7WithRand5()
{
    //PUT YOUR FAVOURITE ALGORITHM HERE//

    //1. Stackoverflow winner
    int i;
    do
    {
        i = 5 * (Rand5() - 1) + Rand5(); // i is now uniformly random between 1 and 25
    } while (i > 21);
    // i is now uniformly random between 1 and 21
    return i % 7 + 1;

    //My 2 cents
    //return (Rand5() + Rand5()) % 7 + 1;
}
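This "test environment" can take any Rand(n) algorithm and test and evaluate it (distribution and speed). Just put your code into the "Rand7WithRand5" method and run the snippet.
A few observations:
Adam Rosenfield's algorithm is no better distributed than mine. In any case, the distribution of both algorithms is horrible.
The native Rand7 (random.Next(1, 8)) finishes quickly, generating all members of the given interval in around 200+ iterations, while the Rand7WithRand5 algorithms take on the order of 10k (around 30-70k).
The real challenge is not to write a method that generates Rand(7) from Rand(5), but to generate values that are more or less uniformly distributed.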
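The two candidate algorithms in the snippet can also be compared by exact enumeration rather than by sampling. The following Python sketch is an addition for illustration (the function names are hypothetical), and it assumes Rand5() is exactly uniform on 1..5.

```python
from collections import Counter
from itertools import product


def two_call_sum_counts():
    """Outcome counts for (Rand5() + Rand5()) % 7 + 1 over all 25 equally likely pairs."""
    return Counter((a + b) % 7 + 1 for a, b in product(range(1, 6), repeat=2))


def rejection_counts():
    """Outcome counts for the do/while variant, over the 21 accepted pairs."""
    counts = Counter()
    for a, b in product(range(1, 6), repeat=2):
        i = 5 * (a - 1) + b  # uniform on 1..25
        if i <= 21:          # values above 21 are redrawn
            counts[i % 7 + 1] += 1
    return counts


if __name__ == "__main__":
    print(sorted(two_call_sum_counts().items()))
    print(sorted(rejection_counts().items()))
```

Under that assumption, the do/while variant produces each of 1..7 from exactly 3 of its 21 accepted outcomes, while the two-call sum produces the values 1..7 from 4, 3, 3, 3, 3, 4 and 5 of its 25 outcomes respectively. A sampling test that reports otherwise may be worth checking against these counts.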
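This answer is more of an experiment in obtaining the most entropy possible from the Rand5 function. It is therefore somewhat unclear and almost certainly much slower than other implementations.
Assuming a uniform distribution from 0-4 as input and a resulting uniform distribution from 0-6: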
public class SevenFromFive
{
    public SevenFromFive()
    {
        // this outputs a uniform distribution but for some reason including it
        // screws up the output distribution
        // open question Why?
        this.fifth = new ProbabilityCondensor(5, b => {});
        this.eigth = new ProbabilityCondensor(8, AddEntropy);
    }

    private static Random r = new Random();
    private static uint Rand5()
    {
        return (uint)r.Next(0,5);
    }

    private class ProbabilityCondensor
    {
        private readonly int samples;
        private int counter;
        private int store;
        private readonly Action<bool> output;

        public ProbabilityCondensor(int chanceOfTrueReciprocal,
            Action<bool> output)
        {
            this.output = output;
            this.samples = chanceOfTrueReciprocal - 1;
        }

        public void Add(bool bit)
        {
            this.counter++;
            if (bit)
                this.store++;
            if (counter == samples)
            {
                bool? e;
                if (store == 0)
                    e = false;
                else if (store == 1)
                    e = true;
                else
                    e = null;// discard for now
                counter = 0;
                store = 0;
                if (e.HasValue)
                    output(e.Value);
            }
        }
    }

    ulong buffer = 0;
    const ulong Mask = 7UL;
    int bitsAvail = 0;
    private readonly ProbabilityCondensor fifth;
    private readonly ProbabilityCondensor eigth;

    private void AddEntropy(bool bit)
    {
        buffer <<= 1;
        if (bit)
            buffer |= 1;
        bitsAvail++;
    }

    private void AddTwoBitsEntropy(uint u)
    {
        buffer <<= 2;
        buffer |= (u & 3UL);
        bitsAvail += 2;
    }

    public uint Rand7()
    {
        uint selection;
        do
        {
            while (bitsAvail < 3)
            {
                var x = Rand5();
                if (x < 4)
                {
                    // put the two low order bits straight in
                    AddTwoBitsEntropy(x);
                    fifth.Add(false);
                }
                else
                {
                    fifth.Add(true);
                }
            }
            // read 3 bits
            selection = (uint)((buffer & Mask));
            bitsAvail -= 3;
            buffer >>= 3;
            if (selection == 7)
                eigth.Add(true);
            else
                eigth.Add(false);
        }
        while (selection == 7);
        return selection;
    }
}
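The number of bits added to the buffer per call to Rand5 is currently 4/5 * 2, so 1.6. If the 1/5 probability value is included, that increases by 0.05 to 1.65, but see the comment in the code where I have had to disable this.
Bits consumed by a call to Rand7 = 3 + 1/8 * (3 + 1/8 * (3 + 1/8 * (... This is 3 + 3/8 + 3/64 + 3/512 ..., so about 3.42.
By extracting information from the sevens I reclaim 1/8 * 1/7 bits per call, so about 0.018.
This gives a net consumption of 3.4 bits per call, which means the ratio is 2.125 calls to Rand5 for every Rand7. The optimum should be 2.1.
I would imagine this approach is significantly slower than many of the others here, unless the call to Rand5 is extremely expensive (say, calling out to some external source of entropy).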
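The bit accounting above can be reproduced with a few lines of arithmetic. This is only a sketch of the figures quoted in the answer, with hypothetical variable names, and it takes the answer's own per-call estimates as given.

```python
# Bits pushed into the buffer per Rand5 call: two bits, 4 times out of 5.
bits_added_per_rand5 = (4 / 5) * 2

# Bits consumed per Rand7 call: 3 bits, plus a retry 1 time in 8,
# i.e. the geometric series 3 + 3/8 + 3/64 + ... = 3 / (1 - 1/8).
bits_consumed_per_rand7 = 3 / (1 - 1 / 8)

# Bits reclaimed per Rand7 call from the rejected sevens, as estimated in the answer.
bits_reclaimed_per_rand7 = (1 / 8) * (1 / 7)

net_bits_per_rand7 = bits_consumed_per_rand7 - bits_reclaimed_per_rand7
rand5_calls_per_rand7 = net_bits_per_rand7 / bits_added_per_rand5

if __name__ == "__main__":
    print("added per Rand5:    %.3f" % bits_added_per_rand5)
    print("consumed per Rand7: %.3f" % bits_consumed_per_rand7)
    print("reclaimed:          %.3f" % bits_reclaimed_per_rand7)
    print("net per Rand7:      %.3f" % net_bits_per_rand7)
    print("Rand5 per Rand7:    %.3f" % rand5_calls_per_rand7)
```

Without the intermediate rounding to 3.4 bits, the ratio comes out slightly above the 2.125 quoted in the text.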