Computers work with binary numbers. The binary system allows for some cool manipulations to be performed on the per-digit level; the problem is that relatively few people learn about these operations, since they seem at first glance to be quite complicated. This document will show that they're not, really.

For this quick run-through, I'll be assuming you know what the binary numbering system is, and how it works; furthermore, I'll assume a little familiarity with working in binary. Everyone who uses a computer, whether it be to connect to the Internet, for video games, or simply to edit a Word document, will have experienced the binary numbering system, as this is how all computers function internally. However, to make binary manipulations, you will need a slightly more advanced idea of how this system operates. I'll also be presenting any code examples using the syntax of C and its syntactical derivatives (C++, PHP, Java and the like). Don't worry if you don't know the C syntax for the operators; I'll be putting a small table at the end of the document.

AND: Putting on the mask

The first operation to look at is called AND. It's called that because that's exactly what it does: take two inputs, and only return any output if both input 1 AND input 2 are on.

in1in2in1 AND in2
000
010
100
111

That table is an example of a truth table; it plots out all the possible combinations of inputs, and what the operator will do with them. As you can see, the AND operator only returns 1 if both inputs are 1. But how does that help us in the real world, of numbers bigger than one bit?

Using AND to mask values

The major thing that can be done with AND is masking: only using the part of a number that you want to use. For example, let's say you have a 32-bit variable, and you're incrementing it in a loop. But you want to wrap the value after 255, back to 0. A simple case of using an if statement, you may think. But think again.

AND: Masking a 32-bit value

while(1)
{
    i++;
    // How we'd do this with an if clause
    // if(i > 255) i = 0;

    // How to do it with AND
    i = i & 255;
}

What's happening here? Let's have a look at a normal case first; say i is at a value of 77. The AND operation looks like this:

AND: Masking at i=72

00000000 00000000 00000000 01001101
00000000 00000000 00000000 11111111   [AND]
-----------------------------------
00000000 00000000 00000000 01001101

The AND "mask" is essentially passing through the low 8 bits of i into the result. In this case, 77 fits fine into 8 bits, so no change. What happens at the borderline case: when 255 becomes 256?

AND: Masking at i=256

00000000 00000000 00000001 00000000
00000000 00000000 00000000 11111111   [AND]
-----------------------------------
00000000 00000000 00000000 00000000

As stated above, the low 8 bits of i are being passed through. That's 0. The rest of the value, including the '256' bit, is essentially ignored by the mask, meaning the value automagically wraps from 255 to 0. Quite useful, you'll admit.

AND's role in IP addressing

Something else you can do with AND is to clear a certain portion of a value. For example, you want to check the network that an IP address lives on. An IP address is just another 32-bit number, and the subnet mask that accompanies it is exactly that: a mask, that's applied to the IP using the AND operator, to find the subnet for that IP address.

Let's take my setup at home. I have a few computers at home, and they all have addresses in a private IP range. One of those computers is 172.16.55.37, with a subnet mask of 255.255.255.224; how does the router know which network I'm coming from?

AND: What subnet are you on?

IP address: 10101100.00010000.00110111.00100101
Snet Mask:  11111111.11111111.11111111.11100000
AND:        -----------------------------------
Subnet:     10101100.00010000.00110111.00100000  [172.16.55.32]

So when the router sees a packet destined for 172.16.55.37, it applies this AND operation, works out that the network is in its route table, and forwards the packet. In other words, the Internet wouldn't work without the AND operator.

AND used to clear a bit

A final example for AND: When you want to make a Windows application, you generally want to display a window, and windows can have various styles associated with them. The most commonly used style is WS_OVERLAPPEDWINDOW, which is just a number given an easier-to-read name: it combines various styles which tell Windows to provide a caption, system menu, minimise and maximise buttons.

But what if you don't want a minimise button? You can build the style you need by taking the normal one, and cutting out the value for WS_MINIMIZEBOX. The style values were set with just this idea in mind: simple styles are binary power values, and complicated styles combine them together.

So let's give that a go: a window with no minimise button.

AND: Clearing a bit

OVERLAPPEDWINDOW:  00000000 11001111 00000000 00000000
MINIMIZEBOX:      (00000000 00000010 00000000 00000000)
Clear mask:        11111111 11111101 11111111 11111111
AND:               -----------------------------------
Overall style:     00000000 11001101 00000000 00000000

If the new value is passed to Windows, you'll get a shiny new window with no minimise button. To generate the clear mask, you can use the NOT operator, which we'll be coming to later.

OR: Setting bits

The second operator we'll look into is called OR. And as you can guess, it's called OR for a reason: it takes two inputs, and gives an output if either one or the other, or both, is set. Here's another of those truth tables.

in1in2in1 OR in2
000
011
101
111

Note how it doesn't matter whether in1 is on or off; if in2 is on, the result is on. And, of course, the inverse applies. So, how can that be used in the real world?

The major application of OR is in setting bits; filling a portion of a value with 1's. Let's take the Windows styles from above: you may want not just a normal window, but a window which can handle vertical scrolling of its contents. Luckily, it's simple to define such a window: just tack the WS_OVERLAPPEDWINDOW default style and WS_VSCROLL together, thus.

OR: Filling in a value

OVERLAPPEDWINDOW:  00000000 11001111 00000000 00000000
VSCROLL:           00000000 00100000 00000000 00000000
OR together:       -----------------------------------
New window style:  00000000 11101111 00000000 00000000

By using AND and OR in this manner, it's quite easy to build up exactly the style of window you're looking for.

XOR: OR with a twist

OR is the operation that allows you to set a bit if either or both of two inputs is set. But what if you only want to check either, and not both? There is an operator for that, and it's called the Exclusive OR, XOR for short. Another truth table for you:

in1in2in1 XOR in2
000
011
101
110

Now, it might seem a bit esoteric and theoretical, to have an operator that only sets output if either input is set, and not both. However, the XOR operation does come in useful.

XOR to clear a value

If you have a variable or a CPU register, and you want to clear it to 0, you don't care about what's inside; those contents will be obliterated anyway. Of course, you can simply move "0" into that variable, but in some cases that might be a problem. Instead, you can use XOR.

XOR: Clearing to zero

Random value: 11001001 01001110 11010010 11110001
Value again:  11001001 01001110 11010010 11110001
XOR:          -----------------------------------
Output:       00000000 00000000 00000000 00000000

For every one of the 32 bits in that example, either the top or the bottom line of the truth table matched, which means the output was 0 in both cases. The end result is, of course, that all the bits of the value turn out as 0 after the XOR operation. Now, why would one want to perform such an operation? Take a look at this example, in Intel assembly.

XOR: Assembly usage

mov eax, 0       ; Assembles to 5 bytes
xor eax, eax     ; Assembles to 1 byte

If you're at a premium for space, it's obvious which of the two you'd pick; instead of wasting 4 bytes of space, a bitwise operator can do the same job.

XOR to flip a bit

The other major application of XOR is to flip a bit, or range of bits, within a value. If you take a look at the truth table in two halves, the top half is essentially the bottom half upside-down; and that flipping is controlled by the value of in1. This comes in very useful for certain situations.

For example, let's say you've been tasked with producing a square wave: a constant series of pulses, 64 then 191, then 64, then 191. You could implement a complicated series of if statements, or you could use a simple XOR.

XOR: Making a 1kHz square wave

char output = 64;
while(1)
{
    output = output ^ 255;
    usleep(500);    // sleep for half the wave period
}

So, what's happening here? The output value starts at 64, and gets changed by the XOR every time the loop runs. How does that change manifest itself?

XOR: Sending the square wave high

output: 01000000
Mask:   11111111
XOR:    --------
Result: 10111111

So we started at 64, and the XOR has flipped all the bits, to 191. After 500 milliseconds, the loop is re-entered, and the XOR gets applied again:

XOR: Sending the same wave low

output: 10111111
Mask:   11111111
XOR:    --------
Result: 01000000

And as if by magic, the XOR returns the result we entered with the first time: 191 gets flipped to 64. Because the value is 64 for 500 microseconds, and 191 for another 500 microseconds, the overall wave period is 1 millisecond, and we've produced the 1kHz square wave, with just one XOR operation.

NOT: The unary one

If all you want to do is flip the whole of a value, as in the above example, there is an alternative. It's called NOT, and it's different to the rest of the bitwise operators we've covered so far. Instead of taking two inputs, it just takes one; the output is the opposite of the input.

xNOT x
01
10

NOT is one of those more esoteric operations; it doesn't see much use. One place where it can be used to good effect is to generate masks for use in the AND operation. Let's say you want to mask off all the bits of a value except the bottom two. Instead of hassling around with long strings of binary or hexadecimal digits, we can let bitwise operators do the grunt work for us.

NOT: Generating an AND mask

value = value & (~3);

Alright; short, succinct, but what's it doing? Let's take a look at the binary level.

NOT: How the generation happens

Three:  00000000 00000000 00000000 00000011
NOT 3:  11111111 11111111 11111111 11111100
Value:  11001001 00011110 10001101 00111011
AND:    -----------------------------------
Result: 11001001 00011110 10001101 00111000

In this way, you can easily generate the masks you need for your AND operations, without having to dig deep into hexadecimal strings and remembering any combination tables; NOT's doing the hard work for you.

Shifts: Do the shuffle

So far, the bitwise operators have been 1-bit affairs: take one bit from an input (or two), give one bit of output. There are operators, however, that take a whole string of binary digits, and do simple operations with that. The major example of those operators is the shift.

Shifting comes in two variants: left and right. As you can probably guess, a left shift takes a binary string of values and shifts it left, pushing it up the binary powers. Let's take a look with an example.

Left shift: Pushing a value up

Before: 00001111
Left 1: 00011110
Left 2: 00111100
Left 3: 01111000

The 1's get pushed up from the bottom of the value, towards the top; the gaps that are left are filled in by 0's. If you look carefully, you'll note that this is essentially equivalent to multiplying the value by powers of two: before shifting, we had 15. Shift left by 1 and we end up with 30; left by 2 gives us 60, and left by 3 gives 120.

However, what happens if we start with a 1 near the top, and then start shifting left? Let's take a look.

Left shift: The bit bucket

Before: 01100101
Left 1: 11001010
Left 2: 10010100
Left 3: 00101000

The value got shifted along sure enough, and 0's were inserted at the bottom, but where did the upper 1's go? The answer is, they vanished; basically, they fell off the end of the value as a result of the shifting operation. If you decide to use shifts, keep in mind that this will happen if you run out of space for the bits.

Also note how the most-significant bit changes during the successive shift operations. If you're dealing with signed values, the most significant bit is often used to denote the sign (1 meaning that this is a negative number); in this case, your number is going from positive to negative, then back to positive! That is one disadvantage of the shift operation that has to be kept in mind when you use it.

The right shift is very similar to the left shift, only it acts in the reverse direction: bits get moved to the right, and 0's are inserted from the left hand side.

Right shift: Pushing down

Before:  00111000
Right 1: 00011100
Right 2: 00001110
Right 3: 00000111

Just as left shift can be thought of as multiplication by a binary power, right shift can be used to divide by a binary power; in the case above, 56 becomes 28, then 14, then 7, simply by successive shifts to the right.

Just as with the left shift also, the right shift provides no safeguards if 1's are at the low end of the scale. In the example above, another right shift would yield a value of 3, since the lowest 1 simply drops off the end of the value.

Also just like the left shift, there's no safeguard on the highest bit; if it started out as 1 (which in a signed value denotes negative), a right shift would fill in 0's, making the value positive. For that reason, processors and programming languages often offer an "arithmetic" right shift operation, which fills with 0 if the top bit was 0, and fills with 1 if the top bit was 1; by using the arithmetic right shift, the sign of the value is preserved.

That's all well and good, throwing values around inside a binary string. But to what end can it be put? How can shifts be used in the real world?

Shifts to combine values

Let's say you have in your possession four 8-bit values, and you wish to build a 32-bit value by tacking them all together. The left shift makes this a very easy task to accomplish.

Left shift: Building a value

Values: 11001010, 00111010, 01001101, 00110011
Pushing all values into 32-bit variables:

var4:  00000000 00000000 00000000 00110011
var3:  00000000 00000000 01001101 00000000  <-- left shift 8
var2:  00000000 00111010 00000000 00000000  <-- left shift 16
var1:  11001010 00000000 00000000 00000000  <-- left shift 24
OR:    -----------------------------------
Final: 11001010 00111010 01001101 00110011

value = var4 | (var3 << 8) | (var2 << 16) | (var1 << 24);

Similarly, if you wanted to split that 32-bit value up into 8-bit chunks, you could simply perform a successive right-shift by 8, in conjunction with an AND operation to mask off the part of the result you're looking for.

Right shift: Splitting a value

Initial: 11001010 00111010 01001101 00110011
Mask:    00000000 00000000 00000000 11111111
AND:     -----------------------------------
var4:                               00110011

RShift:  00000000 11001010 00111010 01001101
Mask:    00000000 00000000 00000000 11111111
AND:     -----------------------------------
var3:                               01001101

RShift:  00000000 00000000 11001010 00111010
Mask:    00000000 00000000 00000000 11111111
AND:     -----------------------------------
var2:                               00111010

RShift:  00000000 00000000 00000000 11001010
Mask:    00000000 00000000 00000000 11111111
AND:     -----------------------------------
var1:                               11001010

var4 = value & 255;
var3 = (value >> 8) & 255;
var2 = (value >> 16) & 255;
var1 = (value >> 24) & 255;

A little bit long-winded, at least when written down in raw binary. Keep in mind, though, that the computer can do this at a stupidly high pace, especially when you use operators designed to work with bits.

Shifts used to multiply

Another use of shifts is to multiply values by multiplers that aren't direct binary powers. This isn't such a common operation any more, since multiplier units are quite quick nowadays, but if you ever come across a weird sequence of shift operations in some old code, you'll know what it's doing, and why it's there.

In the old VGA days, the most common graphic mode used on PCs had 320 pixels to a line. Pixels were written to the screen in a flat framebuffer: a portion of memory which translated directly to the screen, 320 bytes to a line. That meant finding out a location in memory, given an X and Y coordinate to draw to. The formula is pretty simple, but the multiplier units of the CPUs back then were quite slow, so every advantage was sought out.

Left shift: Multiplying by 320

Memory Location = (y * 320) + x
                = (y * 256) + (y * 64) + x
		= (y << 8) + (y << 6) + x

Since shifts were orders of magnitude faster than multiplies, this series of two shifts and two adds was done much more quickly than one multiply and one add. Of course, you probably won't see this so much any more, since CPUs actually got good at multiplying, but it's possible you'll come across this kind of technique in another place.

The End

So, that's all the bitwise operators you'll come across in your travels through programming. Some of them may come in very useful, and some may be used less often, but have no doubt: they'll all be used sometime.

A final note: This table shows how C-syntax languages denote the bitwise operators.

OperatorSymbolSyntax
ANDAmpersandx & y
ORVertical pipex | y
XORCaretx ^ y
NOTTilde~ x
Left shiftTwo less-than signsx << y
Right shiftTwo greater-than signsx >> y

Placed into the public domain by Imran Nazar, 2006.

Article dated: 22nd Sep 2006

Get the RSS feed