How to make rounded percentages add up to 100%

Consider the four percentages below, represented as floating-point numbers:

    13.626332%
    47.989636%
     9.596008%
    28.788024%
   -----------
   100.000000%

I need to represent these percentages as whole numbers. If I simply use Math.round(), I end up with a total of 101%.

14 + 48 + 10 + 29 = 101

If I use parseInt(), I end up with a total of 97%.

13 + 47 + 9 + 28 = 97
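
The two totals above can be reproduced with a short Python sketch. It assumes that math.floor(v + 0.5) stands in for Math.round() and int() for parseInt(), which holds for the positive values used here; the variable names are only illustrative.

```python
import math

values = [13.626332, 47.989636, 9.596008, 28.788024]

# Math.round() rounds halves up; math.floor(v + 0.5) mimics that for positive values
rounded = [math.floor(v + 0.5) for v in values]
# parseInt() on a positive number drops the fractional part, like int()
truncated = [int(v) for v in values]

print(rounded, sum(rounded))      # [14, 48, 10, 29] 101
print(truncated, sum(truncated))  # [13, 47, 9, 28] 97
```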

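Is there a good algorithm for representing any number of percentages as whole numbers while still keeping a total of 100%?

Edit: After reading some of the comments and answers, it is clear there are many ways to go about solving this.

In my mind, to remain true to the numbers, the "right" result is the one that minimizes the overall error, defined by how much error rounding would introduce relative to the actual value: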

        value  rounded     error               decision
   ----------------------------------------------------
    13.626332       14      2.7%          round up (14)
    47.989636       48      0.0%          round up (48)
     9.596008       10      4.0%    don't round up  (9)
    28.788024       29      0.7%          round up (29)

In case of a tie (3.33, 3.33, 3.33) an arbitrary decision can be made (e.g. 3, 4, 3).

Current answer

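I once wrote an unround tool that finds the minimal perturbation to a set of numbers needed to match a goal. It was a different problem, but in theory a similar idea can be used here. In this case, we have a set of choices.

For the first element, we can round either up to 14 or down to 13. The cost (in a binary integer programming sense) of rounding up is less than that of rounding down, because rounding down moves the value a larger distance. Likewise, each of the other numbers can be rounded up or down, so there are 2^4 = 16 choices in total.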

  13.626332
  47.989636
   9.596008
+ 28.788024
-----------
 100.000000

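I would normally solve the general problem in MATLAB using bintprog, a binary integer programming tool, but there are only a few choices to test here, so simple loops are enough to try each of the 16 alternatives. For example, suppose we were to round this set as: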

 Original      Rounded   Absolute error
   13.626           13          0.62633
    47.99           48          0.01036
    9.596           10          0.40399
 + 28.788           29          0.21198
---------------------------------------
  100.000          100          1.25266

The total absolute error is 1.25266. It can be reduced slightly by the following alternative rounding:

 Original      Rounded   Absolute error
   13.626           14          0.37367
    47.99           48          0.01036
    9.596            9          0.59601
 + 28.788           29          0.21198
---------------------------------------
  100.000          100          1.19202

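In fact, this is the optimal solution in terms of absolute error. Of course, with 20 terms the search space would have 2^20 = 1048576 candidates, and for 30 or 40 terms it would be considerably larger. In that case you would need a tool that can search the space efficiently, perhaps using a branch and bound scheme.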

2012-11-21 00:01:46

Other answers

I have implemented the method from Varun Vohra's answer here for both lists and dicts.

import math
import numbers
import operator
import itertools


def round_list_percentages(number_list):
    """
    Takes a list where all values are numbers that add up to 100,
    and rounds them off to integers while still retaining a sum of 100.

    A total value sum that rounds to 100.00 with two decimals is acceptable.
    This ensures that all input where the values are calculated with [fraction]/[total]
    and the sum of all fractions equal the total, should pass.
    """
    # Check input
    if not all(isinstance(i, numbers.Number) for i in number_list):
        raise ValueError('All values of the list must be a number')

    # Generate a key for each value
    key_generator = itertools.count()
    value_dict = {next(key_generator): value for value in number_list}
    return list(round_dictionary_percentages(value_dict).values())


def round_dictionary_percentages(dictionary):
    """
    Takes a dictionary where all values are numbers that add up to 100,
    and rounds them off to integers while still retaining a sum of 100.

    A total value sum that rounds to 100.00 with two decimals is acceptable.
    This ensures that all input where the values are calculated with [fraction]/[total]
    and the sum of all fractions equal the total, should pass.
    """
    # Check input
    # Only allow numbers
    if not all(isinstance(i, numbers.Number) for i in dictionary.values()):
        raise ValueError('All values of the dictionary must be a number')
    # Make sure the sum is close enough to 100
    # Round value_sum to 2 decimals to avoid floating point representation errors
    value_sum = round(sum(dictionary.values()), 2)
    if not value_sum == 100:
        raise ValueError('The sum of the values must be 100')

    # Initial floored results
    # Does not add up to 100, so we need to add something
    result = {key: int(math.floor(value)) for key, value in dictionary.items()}

    # Remainders for each key
    result_remainders = {key: value % 1 for key, value in dictionary.items()}
    # Keys sorted by remainder (biggest first)
    sorted_keys = [key for key, value in sorted(result_remainders.items(), key=operator.itemgetter(1), reverse=True)]

    # Otherwise add missing values up to 100
    # One cycle is enough, since flooring removes a max value of < 1 per item,
    # i.e. this loop should always break before going through the whole list
    for key in sorted_keys:
        if sum(result.values()) == 100:
            break
        result[key] += 1

    # Return
    return result

2017-09-13 22:34:47

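Probably the "best" way to do this (quoted because "best" is a subjective term) is to keep a running (non-integral) tally of where you are, and round that value.

Then use that along with the history to work out which value should be used. For example, using the values you gave: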

Value      CumulValue  CumulRounded  PrevBaseline  Need
---------  ----------  ------------  ------------  ----
                                  0
13.626332   13.626332            14             0    14 ( 14 -  0)
47.989636   61.615968            62            14    48 ( 62 - 14)
 9.596008   71.211976            71            62     9 ( 71 - 62)
28.788024  100.000000           100            71    29 (100 - 71)
                                                    ---
                                                    100

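At each stage, you don't round the number itself. Instead, you round the accumulated value and work out the integer that reaches it from the previous baseline, where the baseline is the rounded cumulative value of the previous row.

This works because you are not losing information at each stage but using it more intelligently. The "correct" rounded values are in the final column, and you can see that they sum to 100.

The third value above shows the difference between this and blindly rounding each value. While 9.596008 would normally round up to 10, the accumulated 71.211976 correctly rounds down to 71, which means only 9 is needed on top of the previous baseline of 62.

This also works for "problematic" sequences such as three values of roughly 1/3, where one of them should be rounded up: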

Value      CumulValue  CumulRounded  PrevBaseline  Need
---------  ----------  ------------  ------------  ----
                                  0
33.333333   33.333333            33             0    33 ( 33 -  0)
33.333333   66.666666            67            33    34 ( 67 - 33)
33.333333   99.999999           100            67    33 (100 - 67)
                                                    ---
                                                    100
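
The running-tally procedure shown in the two tables can be sketched in Python as follows. The function name round_running_total is hypothetical, and math.floor(x + 0.5) is an assumption chosen to round halves up, since Python's built-in round() rounds halves to the nearest even integer.

```python
import math


def round_running_total(values):
    """Round the cumulative sum at each step and emit the difference
    from the previous rounded cumulative sum (the baseline)."""
    result = []
    cumulative = 0.0
    previous_baseline = 0
    for v in values:
        cumulative += v
        rounded_cumulative = math.floor(cumulative + 0.5)  # round half up
        result.append(rounded_cumulative - previous_baseline)
        previous_baseline = rounded_cumulative
    return result


print(round_running_total([13.626332, 47.989636, 9.596008, 28.788024]))  # [14, 48, 9, 29]
print(round_running_total([33.333333, 33.333333, 33.333333]))            # [33, 34, 33]
```

Note that the result depends on the order of the inputs, because each value is judged against the tally accumulated before it.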

2012-11-20 22:43:54

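I'm not sure what level of accuracy you need, but what I would do is simply add 1 to the first n numbers, where n is the ceiling of the sum of the fractional parts. In this case that is 3, so I would add 1 to the first 3 items and floor the rest. Of course this is not very accurate, since some numbers may be rounded up or down when they shouldn't be, but it works well enough and always results in 100%.

So [13.626332, 47.989636, 9.596008, 28.788024] would become [14, 48, 10, 28], because Math.ceil(.626332+.989636+.596008+.788024) == 3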

function evenRound( arr ) {
  // Sum of the fractional parts of all items
  var fractionSum = arr
    .map(function( a ){ return a % 1 })
    .reduce(function( a,b ){ return a + b }, 0);
  // Ceil of that sum; the small epsilon guards against floating-point noise
  var n = Math.ceil( fractionSum - 1e-9 );
  return arr.map(function( a, i ){
    var floored = Math.floor( a );
    return i < n ? floored + 1 : floored; // add 1 to the first n items, floor the rest
  });
}

var nums = evenRound( [ 13.626332, 47.989636, 9.596008, 28.788024 ] );
var total = nums.reduce(function( a,b ){ return a + b }); //=> 100

You can always tell users that the numbers are rounded and may not be perfectly accurate...

2012-11-20 23:38:46

Since none of the answers here seem to solve it properly, here is my semi-obfuscated version using underscorejs:

function foo(l, target) {
    var off = target - _.reduce(l, function(acc, x) { return acc + Math.round(x) }, 0);
    return _.chain(l).
            sortBy(function(x) { return Math.round(x) - x }).
            map(function(x, i) { return Math.round(x) + (off > i) - (i >= (l.length + off)) }).
            value();
}

foo([13.626332, 47.989636, 9.596008, 28.788024], 100) // => [48, 29, 14, 9]
foo([16.666, 16.666, 16.666, 16.666, 16.666, 16.666], 100) // => [17, 17, 17, 17, 16, 16]
foo([33.333, 33.333, 33.333], 100) // => [34, 33, 33]
foo([33.3, 33.3, 33.3, 0.1], 100) // => [34, 33, 33, 0]

2012-11-21 03:27:55
