Integer#bitCount(int)方法原理分析

发布于 2019-03-18 Java Integer

起因是在leetcode上写到了一个计算Hamming Distance(汉明距离)的题目。

The Hamming distance between two integers is the number of positions at which the corresponding bits are different.
Given two integers x and y, calculate the Hamming distance.

开始我的解法为遍历，按位计算。

public int hammingDistance(int x, int y) {
    int count = 0, num = x ^ y;
    for (int i = 0; i < 32; i++) {
        if ((num >>> i & 1) == 1) // can be replaced => count += (num >>> i & 1);
            count++;
    }
    return count;
}

写完后觉得时间复杂度应该是可以优化的，就想起了Integer#bitCount(int)方法，该方法返回数字二进制中1的数量。JDK的实现都是比较高效的，所以习惯性看了眼源码。

/**
 * Returns the number of one-bits in the two's complement binary
 * representation of the specified {@code int} value.  This function is
 * sometimes referred to as the <i>population count</i>.
 *
 * @param i the value whose bits are to be counted
 * @return the number of one-bits in the two's complement binary
 *     representation of the specified {@code int} value.
 * @since 1.5
 */
public static int bitCount(int i) {
    // HD, Figure 5-2
    i = i - ((i >>> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);
    i = (i + (i >>> 4)) & 0x0f0f0f0f;
    i = i + (i >>> 8);
    i = i + (i >>> 16);
    return i & 0x3f;
}

这种充斥着位运算的算法第一眼看上去往往都是比较不容易理解的，需要仔细分析下原理。
算法思想为分而治之，先计算每2个bit位中有几个1，结果保存至这2个bit位上；再计算每4个bit位上有几个1，保存至这4个bit位上…
直到最后一步合并低16bit和高16bit上结果即可，即通过五次&(与)以及>>>(无符号右移)来实现。

Hexadecimal	Binary
0x55555555	‭0b01010101010101010101010101010101‬
0x33333333	‭0b00110011001100110011001100110011‬
0x0f0f0f0f	‭0b00001111000011110000111100001111‬
0x00ff00ff	0b00000000111111110000000011111111
0x0000ffff	‭0b00000000000000001111111111111111‬
0x3f	‭0b00111111‬

(i & 0x55555555) + ((i >>> 1) & 0x55555555)即为第一步，先计算每两个bit位中有几个1，结果保存至这两个bit位上。
i & 0x55555555计算了每两位中右1位中1的数量；((i >>> 1) & 0x55555555)将i无符号右移1位后做相同与运算，也就是计算了每两位中左1位中1的数量。

有此可得算法原型。

public static int bitCount(int i) {

    i = (i & 0x55555555) + ((i >>> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);
    i = (i & 0x0f0f0f0f) + ((i >>> 4) & 0x0f0f0f0f);
    i = (i & 0x00ff00ff) + ((i >>> 8) & 0x00ff00ff);
    i = (i & 0x0000ffff) + ((i >>> 16) & 0x0000ffff);
    return i;
}

稍加优化后即为Integer#bitCount(int)的实现，优化的部分为减少与运算，将消位操作放到最后。

还有种比较有趣的BitCount算法为HAKMEM/MIT HAKMEM算法，感兴趣可以了解下。

上一篇: Hexo 博客迁移下一篇: 记一次锁等待超时排查（Gap Lock、Lock wait timeout exceeded）

Osmanthus

空想具現化