计算32位整数中的设置位数

代表数字7的8位像这样:

00000111

设置了三个比特。

确定32位整数中设置位数的算法是什么?

当前回答

int countBits(int x)
{
    int n = 0;
    if (x) do n++;
           while(x=x&(x-1));
    return n;
}

或者:

int countBits(int x) { return (x)? 1+countBits(x&(x-1)): 0; }

在我最初的回答7年半之后，@PeterMortensen质疑这是否是有效的C语法。我发布了一个在线编译器的链接，显示它实际上是完全有效的语法(代码如下)。

#include <stdio.h>
int countBits(int x)
{
    int n = 0;
    if (x) do n++;           /* Totally Normal Valid code. */
           while(x=x&(x-1)); /* Nothing to see here.       */
    return n;
}   
 
int main(void) {
    printf("%d\n", countBits(25));
    return 0;
}

输出:

如果你想重新写清楚，它看起来是这样的:

if (x)
{
    do
    {
        n++;
    } while(x=x&(x-1));
}

但在我看来，这太过分了。

然而，我也意识到函数可以变得更短，但可能更神秘，写为:

int countBits(int x)
{
    int n = 0;
    while (x) x=(n++,x&(x-1));
    return n;
}

2015-02-12 04:19:05

其他回答

"最佳算法"是什么意思?短码还是长码?您的代码看起来非常优雅，并且具有恒定的执行时间。代码也很短。

但如果速度是主要因素，而不是代码大小，那么我认为以下方法可以更快:

       static final int[] BIT_COUNT = { 0, 1, 1, ... 256 values with a bitsize of a byte ... };
        static int bitCountOfByte( int value ){
            return BIT_COUNT[ value & 0xFF ];
        }

        static int bitCountOfInt( int value ){
            return bitCountOfByte( value ) 
                 + bitCountOfByte( value >> 8 ) 
                 + bitCountOfByte( value >> 16 ) 
                 + bitCountOfByte( value >> 24 );
        }

我认为这不会更快的64位值，但32位值可以更快。

2008-09-20 19:31:08

有些语言以一种可以使用有效硬件支持(如果可用的话)的方式可移植地公开操作，而有些语言则希望使用一些不错的库。

例如(从语言表中):

c++有std::bitset<>::count()或c++ 20 std::popcount(T x) Java有Java .lang. integer . bitcount()(也用于Long或BigInteger) c#有system . numbers . bitoperations . popcount () Python有int.bit_count()(从3.10开始)

不过，并不是所有的编译器/库都能在HW支持可用时使用它。(值得注意的是MSVC，即使有选项使std::popcount内联为x86 popcnt，它的std::bitset::count仍然总是使用查找表。这有望在未来的版本中改变。)

当可移植语言没有这种基本的位操作时，还要考虑编译器的内置函数。以GNU C为例:

int __builtin_popcount (unsigned int x);
int __builtin_popcountll (unsigned long long x);

In the worst case (no single-instruction HW support) the compiler will generate a call to a function (which in current GCC uses a shift/and bit-hack like this answer, at least for x86). In the best case the compiler will emit a cpu instruction to do the job. (Just like a * or / operator - GCC will use a hardware multiply or divide instruction if available, otherwise will call a libgcc helper function.) Or even better, if the operand is a compile-time constant after inlining, it can do constant-propagation to get a compile-time-constant popcount result.

GCC内置甚至可以跨多个平台工作。Popcount几乎已经成为x86架构的主流，所以现在开始使用内置是有意义的，这样你就可以重新编译，让它内联硬件指令时，你编译-mpopcnt或包括(例如https://godbolt.org/z/Ma5e5a)。其他架构已经有popcount很多年了，但在x86领域，仍然有一些古老的Core 2和类似的老式AMD cpu在使用。

在x86上，你可以告诉编译器它可以通过-mpopcnt(也可以通过-msse4.2暗示)假设支持popcnt指令。参见GCC x86选项。-march=nehalem -mtune=skylake(或-march=任何您希望您的代码假设和调优的CPU)可能是一个不错的选择。在较旧的CPU上运行生成的二进制文件将导致非法指令错误。

要为构建它们的机器优化二进制文件，请使用-march=native(与gcc、clang或ICC一起使用)。

MSVC为x86的popcnt指令提供了一个内在的特性，但与gcc不同的是，它实际上是硬件指令的一个内在特性，需要硬件支持。

使用std::bitset<>::count()代替内置的

理论上，任何知道如何有效地为目标CPU进行popcount的编译器都应该通过ISO c++ std::bitset<>来公开该功能。实际上，对于某些目标cpu，在某些情况下使用bit-hack AND/shift/ADD可能会更好。

For target architectures where hardware popcount is an optional extension (like x86), not all compilers have a std::bitset that takes advantage of it when available. For example, MSVC has no way to enable popcnt support at compile time, and it's std::bitset<>::count always uses a table lookup, even with /Ox /arch:AVX (which implies SSE4.2, which in turn implies the popcnt feature.) (Update: see below; that does get MSVC's C++20 std::popcount to use x86 popcnt, but still not its bitset<>::count. MSVC could fix that by updating their standard library headers to use std::popcount when available.)

但是，至少您得到了可以在任何地方工作的可移植的东西，并且使用带有正确目标选项的gcc/clang，您可以获得支持它的体系结构的硬件popcount。

#include <bitset>
#include <limits>
#include <type_traits>

template<typename T>
//static inline  // static if you want to compile with -mpopcnt in one compilation unit but not others
typename std::enable_if<std::is_integral<T>::value,  unsigned >::type 
popcount(T x)
{
    static_assert(std::numeric_limits<T>::radix == 2, "non-binary type");

    // sizeof(x)*CHAR_BIT
    constexpr int bitwidth = std::numeric_limits<T>::digits + std::numeric_limits<T>::is_signed;
    // std::bitset constructor was only unsigned long before C++11.  Beware if porting to C++03
    static_assert(bitwidth <= std::numeric_limits<unsigned long long>::digits, "arg too wide for std::bitset() constructor");

    typedef typename std::make_unsigned<T>::type UT;        // probably not needed, bitset width chops after sign-extension

    std::bitset<bitwidth> bs( static_cast<UT>(x) );
    return bs.count();
}

参见Godbolt编译器资源管理器上gcc、clang、icc和MSVC中的asm。

x86-64 gcc -O3 -std=gnu++11 -mpopcnt输出:

unsigned test_short(short a) { return popcount(a); }
    movzx   eax, di      # note zero-extension, not sign-extension
    popcnt  rax, rax
    ret

unsigned test_int(int a) { return popcount(a); }
    mov     eax, edi
    popcnt  rax, rax        # unnecessary 64-bit operand size
    ret

unsigned test_u64(unsigned long long a) { return popcount(a); }
    xor     eax, eax     # gcc avoids false dependencies for Intel CPUs
    popcnt  rax, rdi
    ret

PowerPC64 gcc -O3 -std=gnu++11发出(对于int arg版本):

    rldicl 3,3,0,32     # zero-extend from 32 to 64-bit
    popcntd 3,3         # popcount
    blr

这个源代码不是x86特定的，也不是gnu特定的，只是在gcc/clang/icc下编译得很好，至少在针对x86(包括x86-64)时是这样。

还要注意，对于没有单指令popcount的体系结构，gcc的回退是逐字节表查找。例如，这对ARM来说就不是什么好事。

c++ 20有std::popcount(T)

不幸的是，当前libstdc++头文件用特殊情况定义了它，if(x==0) return 0;在开始时，clang在编译x86时不会优化:

#include <bit>
int bar(unsigned x) {
    return std::popcount(x);
}

clang 11.0.1 -O3 -std=gnu++20 -march=nehalem （https://godbolt.org/z/arMe5a）

# clang 11
    bar(unsigned int):                                # @bar(unsigned int)
        popcnt  eax, edi
        cmove   eax, edi         # redundant: if popcnt result is 0, return the original 0 instead of the popcnt-generated 0...
        ret

但是GCC编译得很好:

# gcc 10
        xor     eax, eax         # break false dependency on Intel SnB-family before Ice Lake.
        popcnt  eax, edi
        ret

即使是MSVC也能很好地使用它，只要你使用-arch:AVX或更高版本(并使用-std:c++latest启用c++ 20)。https://godbolt.org/z/7K4Gef

int bar(unsigned int) PROC                                 ; bar, COMDAT
        popcnt  eax, ecx
        ret     0
int bar(unsigned int) ENDP                                 ; bar

2008-09-20 19:23:05

public class BinaryCounter {

private int N;

public BinaryCounter(int N) {
    this.N = N;
}

public static void main(String[] args) {

    BinaryCounter counter=new BinaryCounter(7);     
    System.out.println("Number of ones is "+ counter.count());

}

public int count(){
    if(N<=0) return 0;
    int counter=0;
    int K = 0;
    do{
        K = biggestPowerOfTwoSmallerThan(N);
        N = N-K;
        counter++;
    }while (N != 0);
    return counter;

}

private int biggestPowerOfTwoSmallerThan(int N) {
    if(N==1) return 1;
    for(int i=0;i<N;i++){
        if(Math.pow(2, i) > N){
            int power = i-1;
            return (int) Math.pow(2, power);
        }
    }
    return 0;
}
}

2015-09-25 17:31:37

你可以:

while(n){
    n = n & (n-1);
    count++;
}

这背后的逻辑是n-1位从n的最右边的集合位倒出来。

如果n=6，即110，那么5是101，位从n的最右边的集合位倒出来。

因此，如果我们&这两个，我们将在每次迭代中使最右边的位为0，并且总是到下一个最右边的集位。因此，计数设置位。当每一位都被设置时，最糟糕的时间复杂度将是O(log n)。

2017-08-23 17:31:11

对于232查找表和逐个遍历每个位之间的折中方法:

int bitcount(unsigned int num){
    int count = 0;
    static int nibblebits[] =
        {0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4};
    for(; num != 0; num >>= 4)
        count += nibblebits[num & 0x0f];
    return count;
}

从http://ctips.pbwiki.com/CountBits

2008-09-22 03:55:59

计算32位整数中的设置位数

推荐文章

最新文章

标签