Computers work with binary numbers. The binary system allows for some cool manipulations to be performed on the per-digit level; the problem is that relatively few people learn about these operations, since they seem at first glance to be quite complicated. This document will show that they're not, really.
For this quick run-through, I'll be assuming you know what the binary numbering system is, and how it works; furthermore, I'll assume a little familiarity with working in binary. Everyone who uses a computer, whether it be to connect to the Internet, for video games, or simply to edit a Word document, will have experienced the binary numbering system, as this is how all computers function internally. However, to make binary manipulations, you will need a slightly more advanced idea of how this system operates. I'll also be presenting any code examples using the syntax of C and its syntactical derivatives (C++, PHP, Java and the like). Don't worry if you don't know the C syntax for the operators; I'll be putting a small table at the end of the document.
AND: Putting on the mask
The first operation to look at is called AND. It's called that because that's exactly what it does: take two inputs, and only return any output if both input 1 AND input 2 are on.
in1 | in2 | in1 AND in2 |
---|---|---|
0 | 0 | 0 |
0 | 1 | 0 |
1 | 0 | 0 |
1 | 1 | 1 |
That table is an example of a truth table; it plots out all the possible combinations of inputs, and what the operator will do with them. As you can see, the AND operator only returns 1 if both inputs are 1. But how does that help us in the real world, of numbers bigger than one bit?
Using AND to mask values
The major thing that can be done with AND is masking: only
using the part of a number that you want to use. For example, let's say
you have a 32-bit variable, and you're incrementing it in a loop. But you
want to wrap the value after 255, back to 0. A simple case of using an
if
statement, you may think. But think again.
AND: Masking a 32-bit value
while(1) { i++;// How we'd do this with an if clause // if(i > 255) i = 0;// How to do it with ANDi = i & 255; }
What's happening here? Let's have a look at a normal case first; say
i
is at a value of 77. The AND operation looks like this:
AND: Masking at i=72
00000000 00000000 00000000 01001101 00000000 00000000 00000000 11111111 [AND] ----------------------------------- 00000000 00000000 0000000001001101
The AND "mask" is essentially passing through the low 8 bits of
i
into the result. In this case, 77 fits fine into 8
bits, so no change. What happens at the borderline case: when 255
becomes 256?
AND: Masking at i=256
00000000 00000000 00000001 00000000 00000000 00000000 00000000 11111111 [AND] ----------------------------------- 00000000 00000000 0000000000000000
As stated above, the low 8 bits of i
are being passed
through. That's 0. The rest of the value, including the '256' bit, is
essentially ignored by the mask, meaning the value automagically wraps
from 255 to 0. Quite useful, you'll admit.
AND's role in IP addressing
Something else you can do with AND is to clear a certain portion of a value. For example, you want to check the network that an IP address lives on. An IP address is just another 32-bit number, and the subnet mask that accompanies it is exactly that: a mask, that's applied to the IP using the AND operator, to find the subnet for that IP address.
Let's take my setup at home. I have a few computers at home, and
they all have addresses in a private IP range. One of those computers is
172.16.55.37
, with a subnet mask of 255.255.255.224
;
how does the router know which network I'm coming from?
AND: What subnet are you on?
IP address: 10101100.00010000.00110111.00100101 Snet Mask: 11111111.11111111.11111111.11100000 AND: ----------------------------------- Subnet:10101100.00010000.00110111.00100000 [172.16.55.32]
So when the router sees a packet destined for 172.16.55.37, it applies this AND operation, works out that the network is in its route table, and forwards the packet. In other words, the Internet wouldn't work without the AND operator.
AND used to clear a bit
A final example for AND: When you want to make a Windows application,
you generally want to display a window, and windows can have various
styles associated with them. The most commonly used style is
WS_OVERLAPPEDWINDOW
, which is just a number given an
easier-to-read name: it combines various styles which tell Windows to
provide a caption, system menu, minimise and maximise buttons.
But what if you don't want a minimise button? You can build the
style you need by taking the normal one, and cutting out the value for
WS_MINIMIZEBOX
. The style values were set with just this
idea in mind: simple styles are binary power values, and complicated
styles combine them together.
So let's give that a go: a window with no minimise button.
AND: Clearing a bit
OVERLAPPEDWINDOW: 00000000 11001111 00000000 00000000 MINIMIZEBOX: (00000000 00000010 00000000 00000000) Clear mask: 11111111 11111101 11111111 11111111 AND: ----------------------------------- Overall style: 00000000 11001101 00000000 00000000
If the new value is passed to Windows, you'll get a shiny new window with no minimise button. To generate the clear mask, you can use the NOT operator, which we'll be coming to later.
OR: Setting bits
The second operator we'll look into is called OR. And as you can guess, it's called OR for a reason: it takes two inputs, and gives an output if either one or the other, or both, is set. Here's another of those truth tables.
in1 | in2 | in1 OR in2 |
---|---|---|
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 1 |
Note how it doesn't matter whether in1
is on or off; if
in2
is on, the result is on. And, of course, the inverse
applies. So, how can that be used in the real world?
The major application of OR is in setting bits; filling a portion of
a value with 1's. Let's take the Windows styles from above: you may want
not just a normal window, but a window which can handle vertical scrolling
of its contents. Luckily, it's simple to define such a window: just tack
the WS_OVERLAPPEDWINDOW
default style and
WS_VSCROLL
together, thus.
OR: Filling in a value
OVERLAPPEDWINDOW: 00000000 11001111 00000000 00000000 VSCROLL: 00000000 00100000 00000000 00000000 OR together: ----------------------------------- New window style: 00000000 11101111 00000000 00000000
By using AND and OR in this manner, it's quite easy to build up exactly the style of window you're looking for.
XOR: OR with a twist
OR is the operation that allows you to set a bit if either or both of two inputs is set. But what if you only want to check either, and not both? There is an operator for that, and it's called the Exclusive OR, XOR for short. Another truth table for you:
in1 | in2 | in1 XOR in2 |
---|---|---|
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 0 |
Now, it might seem a bit esoteric and theoretical, to have an operator that only sets output if either input is set, and not both. However, the XOR operation does come in useful.
XOR to clear a value
If you have a variable or a CPU register, and you want to clear it to 0, you don't care about what's inside; those contents will be obliterated anyway. Of course, you can simply move "0" into that variable, but in some cases that might be a problem. Instead, you can use XOR.
XOR: Clearing to zero
Random value: 11001001 01001110 11010010 11110001 Value again: 11001001 01001110 11010010 11110001 XOR: ----------------------------------- Output: 00000000 00000000 00000000 00000000
For every one of the 32 bits in that example, either the top or the bottom line of the truth table matched, which means the output was 0 in both cases. The end result is, of course, that all the bits of the value turn out as 0 after the XOR operation. Now, why would one want to perform such an operation? Take a look at this example, in Intel assembly.
XOR: Assembly usage
mov eax, 0 ; Assembles to 5 bytes xor eax, eax ; Assembles to 1 byte
If you're at a premium for space, it's obvious which of the two you'd pick; instead of wasting 4 bytes of space, a bitwise operator can do the same job.
XOR to flip a bit
The other major application of XOR is to flip a bit, or range of bits,
within a value. If you take a look at the truth table in two halves, the
top half is essentially the bottom half upside-down; and that flipping
is controlled by the value of in1
. This comes in very useful
for certain situations.
For example, let's say you've been tasked with producing a square wave:
a constant series of pulses, 64 then 191, then 64, then 191. You could
implement a complicated series of if
statements, or you
could use a simple XOR.
XOR: Making a 1kHz square wave
char output = 64; while(1) { output = output ^ 255; usleep(500);// sleep for half the wave period}
So, what's happening here? The output value starts at 64, and gets changed by the XOR every time the loop runs. How does that change manifest itself?
XOR: Sending the square wave high
output: 01000000 Mask: 11111111 XOR: -------- Result: 10111111
So we started at 64, and the XOR has flipped all the bits, to 191. After 500 milliseconds, the loop is re-entered, and the XOR gets applied again:
XOR: Sending the same wave low
output: 10111111 Mask: 11111111 XOR: -------- Result: 01000000
And as if by magic, the XOR returns the result we entered with the first time: 191 gets flipped to 64. Because the value is 64 for 500 microseconds, and 191 for another 500 microseconds, the overall wave period is 1 millisecond, and we've produced the 1kHz square wave, with just one XOR operation.
NOT: The unary one
If all you want to do is flip the whole of a value, as in the above example, there is an alternative. It's called NOT, and it's different to the rest of the bitwise operators we've covered so far. Instead of taking two inputs, it just takes one; the output is the opposite of the input.
x | NOT x |
---|---|
0 | 1 |
1 | 0 |
NOT is one of those more esoteric operations; it doesn't see much use. One place where it can be used to good effect is to generate masks for use in the AND operation. Let's say you want to mask off all the bits of a value except the bottom two. Instead of hassling around with long strings of binary or hexadecimal digits, we can let bitwise operators do the grunt work for us.
NOT: Generating an AND mask
value = value & (~3);
Alright; short, succinct, but what's it doing? Let's take a look at the binary level.
NOT: How the generation happens
Three: 00000000 00000000 00000000 00000011 NOT 3: 11111111 11111111 11111111 11111100 Value: 11001001 00011110 10001101 00111011 AND: ----------------------------------- Result:11001001 00011110 10001101 00111000
In this way, you can easily generate the masks you need for your AND operations, without having to dig deep into hexadecimal strings and remembering any combination tables; NOT's doing the hard work for you.
Shifts: Do the shuffle
So far, the bitwise operators have been 1-bit affairs: take one bit from an input (or two), give one bit of output. There are operators, however, that take a whole string of binary digits, and do simple operations with that. The major example of those operators is the shift.
Shifting comes in two variants: left and right. As you can probably guess, a left shift takes a binary string of values and shifts it left, pushing it up the binary powers. Let's take a look with an example.
Left shift: Pushing a value up
Before: 00001111Left 1: 00011110 Left 2: 00111100 Left 3: 01111000
The 1's get pushed up from the bottom of the value, towards the top; the gaps that are left are filled in by 0's. If you look carefully, you'll note that this is essentially equivalent to multiplying the value by powers of two: before shifting, we had 15. Shift left by 1 and we end up with 30; left by 2 gives us 60, and left by 3 gives 120.
However, what happens if we start with a 1 near the top, and then start shifting left? Let's take a look.
Left shift: The bit bucket
Before: 01100101Left 1:11001010 Left 2:10010100 Left 3:00101000
The value got shifted along sure enough, and 0's were inserted at the bottom, but where did the upper 1's go? The answer is, they vanished; basically, they fell off the end of the value as a result of the shifting operation. If you decide to use shifts, keep in mind that this will happen if you run out of space for the bits.
Also note how the most-significant bit changes during the successive shift operations. If you're dealing with signed values, the most significant bit is often used to denote the sign (1 meaning that this is a negative number); in this case, your number is going from positive to negative, then back to positive! That is one disadvantage of the shift operation that has to be kept in mind when you use it.
The right shift is very similar to the left shift, only it acts in the reverse direction: bits get moved to the right, and 0's are inserted from the left hand side.
Right shift: Pushing down
Before: 00111000 Right 1: 00011100 Right 2: 00001110 Right 3: 00000111
Just as left shift can be thought of as multiplication by a binary power, right shift can be used to divide by a binary power; in the case above, 56 becomes 28, then 14, then 7, simply by successive shifts to the right.
Just as with the left shift also, the right shift provides no safeguards if 1's are at the low end of the scale. In the example above, another right shift would yield a value of 3, since the lowest 1 simply drops off the end of the value.
Also just like the left shift, there's no safeguard on the highest bit; if it started out as 1 (which in a signed value denotes negative), a right shift would fill in 0's, making the value positive. For that reason, processors and programming languages often offer an "arithmetic" right shift operation, which fills with 0 if the top bit was 0, and fills with 1 if the top bit was 1; by using the arithmetic right shift, the sign of the value is preserved.
That's all well and good, throwing values around inside a binary string. But to what end can it be put? How can shifts be used in the real world?
Shifts to combine values
Let's say you have in your possession four 8-bit values, and you wish to build a 32-bit value by tacking them all together. The left shift makes this a very easy task to accomplish.
Left shift: Building a value
Values: 11001010, 00111010, 01001101, 00110011 Pushing all values into 32-bit variables: var4: 00000000 00000000 0000000000110011var3: 00000000 000000000100110100000000 <-- left shift 8 var2: 000000000011101000000000 00000000 <-- left shift 16 var1:1100101000000000 00000000 00000000 <-- left shift 24 OR: ----------------------------------- Final: 11001010 00111010 01001101 00110011 value = var4 | (var3 << 8) | (var2 << 16) | (var1 << 24);
Similarly, if you wanted to split that 32-bit value up into 8-bit chunks, you could simply perform a successive right-shift by 8, in conjunction with an AND operation to mask off the part of the result you're looking for.
Right shift: Splitting a value
Initial: 11001010 00111010 01001101 00110011 Mask: 00000000 00000000 00000000 11111111 AND: ----------------------------------- var4: 00110011 RShift: 00000000 11001010 00111010 01001101 Mask: 00000000 00000000 00000000 11111111 AND: ----------------------------------- var3: 01001101 RShift: 00000000 00000000 11001010 00111010 Mask: 00000000 00000000 00000000 11111111 AND: ----------------------------------- var2: 00111010 RShift: 00000000 00000000 00000000 11001010 Mask: 00000000 00000000 00000000 11111111 AND: ----------------------------------- var1: 11001010 var4 = value & 255; var3 = (value >> 8) & 255; var2 = (value >> 16) & 255; var1 = (value >> 24) & 255;
A little bit long-winded, at least when written down in raw binary. Keep in mind, though, that the computer can do this at a stupidly high pace, especially when you use operators designed to work with bits.
Shifts used to multiply
Another use of shifts is to multiply values by multiplers that aren't direct binary powers. This isn't such a common operation any more, since multiplier units are quite quick nowadays, but if you ever come across a weird sequence of shift operations in some old code, you'll know what it's doing, and why it's there.
In the old VGA days, the most common graphic mode used on PCs had 320 pixels to a line. Pixels were written to the screen in a flat framebuffer: a portion of memory which translated directly to the screen, 320 bytes to a line. That meant finding out a location in memory, given an X and Y coordinate to draw to. The formula is pretty simple, but the multiplier units of the CPUs back then were quite slow, so every advantage was sought out.
Left shift: Multiplying by 320
Memory Location = (y * 320) + x = (y * 256) + (y * 64) + x = (y << 8) + (y << 6) + x
Since shifts were orders of magnitude faster than multiplies, this series of two shifts and two adds was done much more quickly than one multiply and one add. Of course, you probably won't see this so much any more, since CPUs actually got good at multiplying, but it's possible you'll come across this kind of technique in another place.
The End
So, that's all the bitwise operators you'll come across in your travels through programming. Some of them may come in very useful, and some may be used less often, but have no doubt: they'll all be used sometime.
A final note: This table shows how C-syntax languages denote the bitwise operators.
Operator | Symbol | Syntax |
---|---|---|
AND | Ampersand | x & y |
OR | Vertical pipe | x | y |
XOR | Caret | x ^ y |
NOT | Tilde | ~ x |
Left shift | Two less-than signs | x << y |
Right shift | Two greater-than signs | x >> y |
Placed into the public domain by Imran Nazar, 2006.