Bitfun: Popcount, Sideways sum, Hamming weight
I was going through some simple coding puzzles yesterday night and became fascinated by this seemingly interesting function:
int CountBits (unsigned int x ) { static unsigned int mask[] = { 0x55555555, 0x33333333, 0x0F0F0F0F, 0x00FF00FF, 0x0000FFFF }; int i ; int shift ; /* Number of positions to shift to right*/ for (i = 0, shift = 1; i < 5; i++, shift *= 2) x = (x & mask[i]) + ((x >> shift) & mask[i]); return x; }
This function is one of the most (out of many) efficient algorithms to calculate the Hamming weight (also called the popcount, the sideways sum), i.e. the number of set bits in an integer.
“It simply works” doesn’t cut it for me, and neither does it for you, probably. So let’s remember some simple bit arithmetic and find out how the heck does this function do its magic.
If you look at the masks you’ll notice a distinct pattern.
0x55555555 is 0b01010101010101010101010101010101 0x33333333 is 0b00110011001100110011001100110011 0x0F0F0F0F is 0b00001111000011110000111100001111 0x00FF00FF is 0b00000000111111110000000011111111 0x0000FFFF is 0b00000000000000001111111111111111 clearer 0x55555555 is 0b01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 0x33333333 is 0b0011 0011 0011 0011 0011 0011 0011 0011 0x0F0F0F0F is 0b00001111 00001111 00001111 00001111 0x00FF00FF is 0b0000000011111111 0000000011111111 0x0000FFFF is 0b00000000000000001111111111111111
The pattern is quite evident, we have an ever-widening fence comprising 1
s and 0
s. These alternate in ones, then twos, then fours, then eights and finally sixteens. So, as you can see these magic numbers aren’t too magic after all.
The loop is performed 5 times, once for each of these masks. The shift is multiplied by 2 with each step.
Let’s break this line x = (x & mask[i]) + ((x >> shift) & mask[i]);
into two separate ones to try and understand what is going.
(x & mask[i])
masks out unset bits in the first portion of each pair using the logical AND operation.
For example 0b10001101
in the first iteration would get masked against 0b01010101
0b01 01 01 01... 0x55 0b10 00 11 01 - our integer ------------- AND 0b01 00 01 01
Simple, the next statement ((x >> shift) & mask[i]);
does the same, but shifts the input bits to the right by 1 bit first.
0b01 01 01 01... 0x55 0b01 00 01 10 ------------- AND 0b01 00 01 00
Notice, that our input is an unsigned integer in the example. Implementations of signed right bitshifting may differ from one programming language to another and may even differ between compilers (more info).
Both parts are then added.
0b01 00 01 01 0b01 00 01 00 ------------- + 0b10 00 10 01
If you look hard enough and try to see what happened, you’ll notice that the bits in each pair got added together. First the second bit fell through, then a shift occurred by one bit, and the new second bit fell through and got added with the previous one. The whole concept of this algorithm runs down to this “compression” of sets of bits.
Here’s the next mask applied to the resulting value (remember the input is modified and stored as the input for the next step):
0b0011 0011... 0x33 0b1000 0100 (result from the previous step) ----------- AND 0b0000 0000 0b0011 0011... 0x33 0b0010 0010 (shift by 2 bits right) ----------- AND 0b0010 0010 0b0000 0000 0b0010 0010 ----------- + 0b0010 0010 (result for next step) ** next iteration 0b00001111... 0x0F 0b00100010 (result from the previous step) ---------- AND 0b00000010 0b00001111... 0x0F 0b00000010 (shift by 4 bits right) ---------- AND 0b00000010 0b00000010 0b00000010 ---------- + 0b00000100 (result = 4 bits are set)
Notice how each iteration sets are sort of overlapped and added together. First bit + second bit, then first two bits + second two bits, then first four bits + second four bits. It may be a little difficult to grasp at first, but once you do a couple of these it starts to make more sense.
Imagine the algorithm is almost sweeping the bits from left to right, and one each separate “floor” (iteration) the “holes” (active bits in the masks) through which the swept bits fall through to a lower level get larger and larger.
1101 <-- swept bits 0101 <-- floor 0101 <-- the ones fell through 0110 --> swept right 0101 <-- floor 0100 <-- the ones fell through the "holes" ** next "floor" ** 0101 0100 on top of each other, getting squashed to 1001 <-- 0011 <-- new floor, the '1's are holes, bits fall through 0001 <-- one fell through, "sweep" (bitshift) again 0010 --> swept right by 2 bits 0011 <-- floor hasn't changed yet 0010 <-- now this fell through squished together 0001 0010 0011 = 3 bits were set
Quite fascinating and ingenious. Check out these other bit counting algorithms for even more intriguing mind-twisting code.
Thank you for writing this article. It’s really good.
There is a typo in your first AND calculation.
The result is 0b00 00 01 01 not 0b01 00 01 01
Thanks!