
i*2 = i<<1
i*3 = (i<<1) + i
i*10 = (i<<3) + (i<<1)


Is multiplying by 10 with, say, (i<<3)+(i<<1) faster than using i*10 directly? Are there any inputs that cannot be multiplied or divided this way?


In the case of signed integers and right shift versus division, it can make a difference. For negative numbers, the shift rounds toward negative infinity, whereas division rounds toward zero. Of course the compiler will change the division to something cheaper, but it will usually change it to something with the same rounding behavior as division, because it either cannot prove that the variable won't be negative or it simply doesn't care. So if you can prove that a number won't be negative, or if you don't care which way it rounds, you can do that optimization yourself in a way that is more likely to make a difference.


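In addition to all the other good answers, let me point out another reason not to use shift when you mean divide or multiply. I have never seen anyone introduce a bug by forgetting the relative precedence of multiplication and addition. I have seen bugs introduced when maintenance programmers forgot that "multiplying" via a shift is logically a multiplication but syntactically does not have the same precedence as multiplication. x * 2 + z and x << 1 + z are very different!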

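If you're working with numbers, use arithmetic operators like + - * / %. If you're working with arrays of bits, use bit-twiddling operators like & ^ | >>. Don't mix them; an expression that contains both bit twiddling and arithmetic is a bug waiting to happen.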



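At my software company, math (add/sub/mul/div) should be used for all mathematics. Shift should be used when converting between data types. A byte is obtained with n>>8, not n/256.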







#include <stdio.h>
#include <chrono>

static const unsigned s_bc = 32;
static const unsigned long long s_p = 1ULL << s_bc;
static const unsigned long long s_hp = s_p / 2;

static unsigned long long s_f;
static unsigned long long s_fr;

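// Precompute the constants for dividing by d (d must be non-zero).
static void fastDivInitialize(const unsigned d)
{
    // s_f is floor(2^32 / d); s_fr is s_f times the remainder (2^32 mod d),
    // used as a correction term in fastDiv.
    s_f = s_p / d;
    s_fr = s_f * (s_p - (s_f * d));
}

// Divide n by the d last passed to fastDivInitialize, using multiplies and shifts.
static unsigned fastDiv(const unsigned n)
{
    return (s_f * n + ((s_fr * n + s_hp) >> s_bc)) >> s_bc;
}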

static bool fastDivCheck(const unsigned n, const unsigned d)
{
    // 32 to 64 cycles latency on modern cpus
    const unsigned expected = n / d;

    // At least 10 cycles latency on modern cpus
    const unsigned result = fastDiv(n);

    if (result != expected)
    {
        printf("Failed for: %u/%u != %u\n", n, d, expected);
        return false;
    }

    return true;
}

int main()
{
    unsigned result = 0;

    // Make sure to verify it works for your expected set of inputs
    const unsigned MAX_N = 65535;
    const unsigned MAX_D = 40000;

    // Assumes steady_clock counts in nanoseconds
    const double ONE_SECOND_COUNT = 1000000000.0;

    auto t0 = std::chrono::steady_clock::now();
    unsigned count = 0;
    for (unsigned d = 1; d <= MAX_D; ++d)
    {
        fastDivInitialize(d);
        for (unsigned n = 0; n <= MAX_N; ++n)
            count += !fastDivCheck(n, d);
    }
    auto t1 = std::chrono::steady_clock::now();
    printf("Errors: %u / %u (%.4fs)\n", count, MAX_D * (MAX_N + 1), (t1 - t0).count() / ONE_SECOND_COUNT);

    t0 = t1;
    for (unsigned d = 1; d <= MAX_D; ++d)
    {
        fastDivInitialize(d);
        for (unsigned n = 0; n <= MAX_N; ++n)
            result += fastDiv(n);
    }
    t1 = std::chrono::steady_clock::now();
    printf("Fast division time: %.4fs\n", (t1 - t0).count() / ONE_SECOND_COUNT);

    t0 = t1;
    count = 0;
    for (unsigned d = 1; d <= MAX_D; ++d)
    {
        for (unsigned n = 0; n <= MAX_N; ++n)
            result += n / d;
    }
    t1 = std::chrono::steady_clock::now();
    printf("Normal division time: %.4fs\n", (t1 - t0).count() / ONE_SECOND_COUNT);

    // Returning result keeps the compiler from discarding the timed loops
    return result;
}

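It depends on the processor and the compiler. Some compilers already optimize code this way, others don't. So you need to check every time your code needs to be optimized this way.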
