Consider the four percentages below, represented as floating-point numbers:
13.626332%
47.989636%
9.596008%
28.788024%
-----------
100.000000%
I need to represent these percentages as whole numbers. If I simply use Math.round(), I end up with a total of 101%.
14 + 48 + 10 + 29 = 101
If I use parseInt(), I end up with a total of 97%.
13 + 47 + 9 + 28 = 97
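What is a good algorithm for representing any number of percentages as whole numbers while still keeping a total of 100%?
Edit: After reading some of the comments and answers, there are clearly many ways to go about solving this.
In my mind, to remain true to the numbers, the "right" result is the one that minimizes the overall error, defined by how much error rounding would introduce relative to the actual value: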
value       rounded   error   decision
----------------------------------------------------
13.626332   14        2.7%    round up (14)
47.989636   48        0.0%    round up (48)
 9.596008   10        4.2%    don't round up (9)
28.788024   29        0.7%    round up (29)
In case of a tie (3.33, 3.33, 3.33) an arbitrary decision can be made (e.g. 3, 4, 3).
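One way to read the criterion above, sketched in Python: floor every value, then hand the missing units to the values for which rounding up introduces the smallest relative error. The function name round_min_relative_error and the target parameter are illustrative assumptions, not part of the original question, and this greedy selection is only one interpretation of "minimizing the overall error". It assumes non-negative inputs that sum to roughly the target.

```python
import math

def round_min_relative_error(values, target=100):
    """Greedy sketch: floor everything, then round up the cheapest values."""
    floors = [math.floor(v) for v in values]
    missing = target - sum(floors)  # units still to hand out
    if not 0 <= missing <= len(values):
        raise ValueError("values do not sum closely enough to target")

    def cost(i):
        # Relative error introduced by using floor + 1 instead of the value.
        # Assumption: a zero value is never rounded up.
        v = values[i]
        return math.inf if v == 0 else (floors[i] + 1 - v) / v

    # sorted() is stable, so ties go to the earliest index.
    for i in sorted(range(len(values)), key=cost)[:missing]:
        floors[i] += 1
    return floors

print(round_min_relative_error([13.626332, 47.989636, 9.596008, 28.788024]))
# [14, 48, 9, 29]
print(round_min_relative_error([3.33, 3.33, 3.33], target=10))
# [4, 3, 3]
```

In the tie case the sketch simply favors the first entry, which is one of the arbitrary choices the question allows.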
I have implemented the method from Varun Vohra's answer here for both lists and dicts.
import math
import numbers
import operator
import itertools

def round_list_percentages(number_list):
    """
    Takes a list where all values are numbers that add up to 100,
    and rounds them off to integers while still retaining a sum of 100.
    A total value sum that rounds to 100.00 with two decimals is acceptable.
    This ensures that all input where the values are calculated with [fraction]/[total]
    and the sum of all fractions equal the total, should pass.
    """
    # Check input
    if not all(isinstance(i, numbers.Number) for i in number_list):
        raise ValueError('All values of the list must be a number')
    # Generate a key for each value
    key_generator = itertools.count()
    value_dict = {next(key_generator): value for value in number_list}
    # list() so that Python 3 returns a list rather than a dict view
    return list(round_dictionary_percentages(value_dict).values())

def round_dictionary_percentages(dictionary):
    """
    Takes a dictionary where all values are numbers that add up to 100,
    and rounds them off to integers while still retaining a sum of 100.
    A total value sum that rounds to 100.00 with two decimals is acceptable.
    This ensures that all input where the values are calculated with [fraction]/[total]
    and the sum of all fractions equal the total, should pass.
    """
    # Check input
    # Only allow numbers
    if not all(isinstance(i, numbers.Number) for i in dictionary.values()):
        raise ValueError('All values of the dictionary must be a number')
    # Make sure the sum is close enough to 100
    # Round value_sum to 2 decimals to avoid floating point representation errors
    value_sum = round(sum(dictionary.values()), 2)
    if not value_sum == 100:
        raise ValueError('The sum of the values must be 100')
    # Initial floored results
    # Does not add up to 100, so we need to add something
    result = {key: int(math.floor(value)) for key, value in dictionary.items()}
    # Remainders for each key
    result_remainders = {key: value % 1 for key, value in dictionary.items()}
    # Keys sorted by remainder (biggest first)
    sorted_keys = [key for key, value in sorted(result_remainders.items(), key=operator.itemgetter(1), reverse=True)]
    # Add the missing units, one per key, until the total reaches 100
    # One cycle is enough, since flooring removes a max value of < 1 per item,
    # i.e. this loop should always break before going through the whole list
    for key in sorted_keys:
        if sum(result.values()) == 100:
            break
        result[key] += 1
    # Return
    return result
Here's a simple Python implementation of @varun-vohra's answer:
import itertools
import math
import operator

def apportion_pcts(pcts, total):
    # Exact (fractional) share of the total for each percentage
    proportions = [total * (pct / 100) for pct in pcts]
    apportions = [math.floor(p) for p in proportions]
    # Units still missing after flooring
    remainder = total - sum(apportions)
    # (index, fractional part) pairs, largest fractional part first
    remainders = [(i, p - math.floor(p)) for (i, p) in enumerate(proportions)]
    remainders.sort(key=operator.itemgetter(1), reverse=True)
    for (i, _) in itertools.cycle(remainders):
        if remainder == 0:
            break
        else:
            apportions[i] += 1
            remainder -= 1
    return apportions
It relies on the math, itertools, and operator modules.
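If you really must round them, there are already very good suggestions here (largest remainder, least relative error, and so on).
There is also already one good reason not to round (you'll get at least one number that "looks better" but is "wrong"), and a way to deal with that (warn your readers), and that's what I do.
Let me add to the "wrong" number part.
Suppose you have three events/entities/… with some percentages that you approximate as: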
DAY 1
who | real | app
----|-------|------
A | 33.34 | 34
B | 33.33 | 33
C | 33.33 | 33
Later on, the values change slightly, to
DAY 2
who | real | app
----|-------|------
A | 33.35 | 33
B | 33.36 | 34
C | 33.29 | 33
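The first table has the already mentioned problem of having a "wrong" number: 33.34 is closer to 33 than to 34.
But now you have a bigger error. Comparing day 2 to day 1, the real percentage value for A increased by 0.01 percentage points, but the approximation shows a decrease of 1 percentage point.
That is a qualitative error, probably quite a bit worse than the initial quantitative error.
One could devise an approximation for the whole set, but you may have to publish data on day one, so you won't know about day two. So unless you really, really must approximate, you are probably better off not doing so.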
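The two tables can be reproduced with a short Python sketch. It assumes the "app" column comes from a largest remainder rounding, which the answer does not state explicitly; the helper name largest_remainder is made up for this illustration.

```python
import math

def largest_remainder(values, target=100):
    """Floor each value, then give the leftover units to the largest remainders."""
    floors = [math.floor(v) for v in values]
    # Assumption: the inputs sum to about `target`, so 0 <= missing <= len(values).
    missing = target - sum(floors)
    by_remainder = sorted(range(len(values)),
                          key=lambda i: values[i] - floors[i], reverse=True)
    for i in by_remainder[:missing]:
        floors[i] += 1
    return floors

day1 = {"A": 33.34, "B": 33.33, "C": 33.33}
day2 = {"A": 33.35, "B": 33.36, "C": 33.29}

app1 = dict(zip(day1, largest_remainder(list(day1.values()))))
app2 = dict(zip(day2, largest_remainder(list(day2.values()))))

print(app1)  # {'A': 34, 'B': 33, 'C': 33}
print(app2)  # {'A': 33, 'B': 34, 'C': 33}
# The real value of A went up while its published approximation went down.
print(day2["A"] > day1["A"], app2["A"] < app1["A"])  # True True
```

Each day is rounded sensibly on its own; the contradiction only shows up when the two published tables are compared.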
Since none of the answers here seem to solve it properly, here's my semi-obfuscated version using Underscore.js:
function foo(l, target) {
  var off = target - _.reduce(l, function(acc, x) { return acc + Math.round(x) }, 0);
  return _.chain(l).
    sortBy(function(x) { return Math.round(x) - x }).
    map(function(x, i) { return Math.round(x) + (off > i) - (i >= (l.length + off)) }).
    value();
}
foo([13.626332, 47.989636, 9.596008, 28.788024], 100) // => [48, 29, 14, 9]
foo([16.666, 16.666, 16.666, 16.666, 16.666, 16.666], 100) // => [17, 17, 17, 17, 16, 16]
foo([33.333, 33.333, 33.333], 100) // => [34, 33, 33]
foo([33.3, 33.3, 33.3, 0.1], 100) // => [34, 33, 33, 0]
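For readers who find the chained version hard to follow, here is a de-obfuscated sketch of the same idea in Python: round every value to the nearest integer, then fix the difference from the target by bumping the values that were rounded down the most, or trimming the ones that were rounded up the most. The name round_to_target is hypothetical, and unlike foo this sketch returns the results in the original input order. It assumes the rounded sum is off by no more than the number of values.

```python
import math

def round_to_target(values, target):
    # floor(v + 0.5) mirrors JavaScript's Math.round (halves round up).
    rounded = [math.floor(v + 0.5) for v in values]
    off = target - sum(rounded)  # > 0: total too small, < 0: total too large
    # Indices ordered from "rounded down the most" to "rounded up the most".
    order = sorted(range(len(values)), key=lambda i: rounded[i] - values[i])
    if off > 0:
        for i in order[:off]:   # add 1 to the values that lost the most
            rounded[i] += 1
    elif off < 0:
        for i in order[off:]:   # remove 1 from the values that gained the most
            rounded[i] -= 1
    return rounded

print(round_to_target([13.626332, 47.989636, 9.596008, 28.788024], 100))
# [14, 48, 9, 29]
print(round_to_target([33.3, 33.3, 33.3, 0.1], 100))
# [34, 33, 33, 0]
```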