Bitfun: Popcount, Sideways sum, Hamming weight

Bitcount, popcount, sideways sum, Hamming weight

I was going through some simple coding puzzles last night and became fascinated by this curious-looking function:

int CountBits(unsigned int x) {
  static const unsigned int mask[] = {
    0x55555555,
    0x33333333,
    0x0F0F0F0F,
    0x00FF00FF,
    0x0000FFFF
  };

  int i;
  int shift; /* number of positions to shift to the right */
  for (i = 0, shift = 1; i < 5; i++, shift *= 2)
    x = (x & mask[i]) + ((x >> shift) & mask[i]);
  return x;
}

This function is one of the many remarkably efficient algorithms for calculating the Hamming weight (also called the popcount or the sideways sum), i.e. the number of set bits in an integer.
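
Before taking it apart, it's easy to convince yourself that it really works by checking it against a naive bit-by-bit loop. Here's a minimal sketch (the CountBitsNaive helper and the test values are mine, and it assumes the CountBits function above is in scope):

#include <assert.h>
#include <stdio.h>

/* Naive reference: test each bit individually. */
int CountBitsNaive(unsigned int x) {
  int count = 0;
  for (; x != 0; x >>= 1)
    count += x & 1;
  return count;
}

int main(void) {
  unsigned int tests[] = { 0u, 1u, 0x8Du, 0xFFu, 0x55555555u, 0xFFFFFFFFu };
  for (int i = 0; i < 6; i++) {
    assert(CountBits(tests[i]) == CountBitsNaive(tests[i]));
    printf("0x%08x -> %d set bits\n", tests[i], CountBits(tests[i]));
  }
  return 0;
}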

“It simply works” doesn’t cut it for me, and it probably doesn’t for you either. So let’s brush up on some simple bit arithmetic and find out how the heck this function does its magic.

If you look at the masks you’ll notice a distinct pattern.

0x55555555 is 0b01010101010101010101010101010101
0x33333333 is 0b00110011001100110011001100110011
0x0F0F0F0F is 0b00001111000011110000111100001111
0x00FF00FF is 0b00000000111111110000000011111111
0x0000FFFF is 0b00000000000000001111111111111111

Or, with the groups spaced out, it's clearer:

0x55555555 is 0b01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01
0x33333333 is 0b0011 0011 0011 0011 0011 0011 0011 0011
0x0F0F0F0F is 0b00001111 00001111 00001111 00001111
0x00FF00FF is 0b0000000011111111 0000000011111111
0x0000FFFF is 0b00000000000000001111111111111111

The pattern is quite evident: we have an ever-widening fence of 1s and 0s. They alternate in ones, then twos, then fours, then eights, and finally sixteens. So, as you can see, these magic numbers aren't too magic after all.

The loop runs 5 times, once for each of these masks, and the shift doubles with each step: 1, 2, 4, 8, 16.
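
Unrolled, the loop pairs each mask with its shift explicitly. This is exactly the same computation as the loop above, just written out step by step:

x = (x & 0x55555555) + ((x >>  1) & 0x55555555); /* 2-bit counts: each pair holds its own popcount     */
x = (x & 0x33333333) + ((x >>  2) & 0x33333333); /* 4-bit counts: each nibble holds its popcount       */
x = (x & 0x0F0F0F0F) + ((x >>  4) & 0x0F0F0F0F); /* 8-bit counts: each byte holds its popcount         */
x = (x & 0x00FF00FF) + ((x >>  8) & 0x00FF00FF); /* 16-bit counts: each half-word holds its popcount   */
x = (x & 0x0000FFFF) + ((x >> 16) & 0x0000FFFF); /* 32-bit count: x is now the answer                  */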

Let’s break this line x = (x & mask[i]) + ((x >> shift) & mask[i]); into two separate parts to try and understand what is going on.

(x & mask[i]) uses the bitwise AND operation to keep one bit of each pair (the one under the 1s in the mask) and zero out the other.

For example, in the first iteration 0b10001101 would get masked against 0b01010101:

0b01 01 01 01... 0x55
0b10 00 11 01 - our integer
------------- AND
0b00 00 01 01

Simple enough. The next part, ((x >> shift) & mask[i]), does the same, but first shifts the input to the right (by 1 bit on this first iteration).

0b01 01 01 01... 0x55
0b01 00 01 10
------------- AND
0b01 00 01 00

Notice that our input is an unsigned integer in the example. Right-shifting of signed values may differ from one programming language to another and may even differ between compilers (more info).
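
Here's a tiny illustration of why that matters (assuming a 32-bit int; the exact output of the signed case is implementation-defined in C, though most compilers copy the sign bit in):

#include <limits.h>
#include <stdio.h>

int main(void) {
  unsigned int u = 0x80000000u;   /* highest bit set */
  int          s = INT_MIN;       /* highest bit set, interpreted as a sign bit */

  /* Unsigned (logical) shift: a zero is shifted in from the left. */
  printf("unsigned: 0x%08x\n", u >> 1);                   /* 0x40000000 */

  /* Signed shift of a negative value: implementation-defined in C;
     an arithmetic shift would give 0xC0000000 here. */
  printf("signed:   0x%08x\n", (unsigned int)(s >> 1));
  return 0;
}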

Both parts are then added.

0b00 00 01 01
0b01 00 01 00
------------- +
0b01 00 10 01

If you look closely at what happened, you’ll notice that the bits in each pair got added together: one bit of each pair fell through the mask, then the input was shifted right by one bit, the other bit of each pair fell through, and the two were added. Each 2-bit field of the result now holds the count of set bits in the corresponding pair of the original input (1, 0, 2 and 1 here). The whole concept of this algorithm boils down to repeating this “compression” on ever wider groups of bits.
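
Here is that single pairwise step isolated in code, using the same 0b10001101 example (a standalone sketch; the loop at the end just prints the 2-bit pair counts):

#include <stdio.h>

int main(void) {
  unsigned int x = 0x8D;  /* 0b10001101 */
  unsigned int pairs = (x & 0x55) + ((x >> 1) & 0x55);

  /* Each 2-bit field of 'pairs' now holds the number of set bits
     in the corresponding pair of the original value: 1, 0, 2, 1. */
  for (int i = 6; i >= 0; i -= 2)
    printf("pair %d: %u\n", i / 2, (pairs >> i) & 0x3u);
  return 0;
}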

Here’s the next mask applied to the resulting value (remember, the result of each step becomes the input for the next):

0b0011 0011... 0x33
0b0100 1001 (result from the previous step)
----------- AND
0b0000 0001

0b0011 0011... 0x33
0b0001 0010 (shift by 2 bits right)
----------- AND
0b0001 0010

0b0000 0001
0b0001 0010
----------- +
0b0001 0011 (result for next step)

** next iteration **

0b00001111... 0x0F
0b00010011 (result from the previous step)
---------- AND
0b00000011

0b00001111... 0x0F
0b00000001 (shift by 4 bits right)
---------- AND
0b00000001

0b00000011
0b00000001
---------- +
0b00000100 (result = 4 bits are set)

Notice how on each iteration adjacent groups of bits are overlapped and added together: first bit + second bit, then the first two bits + the second two bits, then the first four bits + the second four bits. It may be a little difficult to grasp at first, but once you work through a couple of these it starts to make more sense.
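
If you'd rather watch the compression happen than trace it by hand, printing x after every pass makes the intermediate counts visible. This is just the original loop with a printf added:

#include <stdio.h>

int main(void) {
  static const unsigned int mask[] = {
    0x55555555, 0x33333333, 0x0F0F0F0F, 0x00FF00FF, 0x0000FFFF
  };
  unsigned int x = 0x8D;  /* 0b10001101 from the example above */

  for (int i = 0, shift = 1; i < 5; i++, shift *= 2) {
    x = (x & mask[i]) + ((x >> shift) & mask[i]);
    printf("after pass %d (shift %2d): 0x%08x\n", i + 1, shift, x);
  }
  printf("popcount = %u\n", x);
  return 0;
}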

Imagine the algorithm sweeping the bits from left to right, where on each separate “floor” (iteration) the “holes” (set bits in the masks) through which the swept bits fall to a lower level get larger and larger.

1101 <-- swept bits
0101 <-- floor
0101 <-- the ones fell through

0110 --> swept right
0101 <-- floor
0100 <-- the ones fell through the "holes"

** next "floor" **

0101
0100 on top of each other, getting squashed to

1001 <-- the squashed bits from above
0011 <-- new floor, the '1's are holes, bits fall through
0001 <-- one fell through, "sweep" (bitshift) again

0010 --> swept right by 2 bits
0011 <-- floor hasn't changed yet
0010 <-- now this fell through

squished together
0001
0010
0011 = 3 bits were set

Quite fascinating and ingenious. Check out these other bit counting algorithms for even more mind-twisting code.
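
One classic alternative is Kernighan's trick, which clears the lowest set bit on every pass and therefore loops only as many times as there are set bits (a well-known variant, shown here for comparison rather than taken from the function above):

int CountBitsKernighan(unsigned int x) {
  int count;
  /* x & (x - 1) clears the lowest set bit of x,
     so the loop body runs once per set bit. */
  for (count = 0; x != 0; count++)
    x &= x - 1;
  return count;
}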