practical applications of bitwise operations

  1. What have you used bitwise operations for?
  2. why are they so handy?
  3. can someone please recommend a VERY simple tutorial?
37642 次浏览

I use bitwise operators for security in my applications. I'll store the different levels inside of an Enum:

[Flags]
public enum SecurityLevel
{
User = 1, // 0001
SuperUser = 2, // 0010
QuestionAdmin = 4, // 0100
AnswerAdmin = 8 // 1000
}

And then assign a user their levels:

// Set User Permissions to 1010
//
//   0010
// | 1000
//   ----
//   1010
User.Permissions = SecurityLevel.SuperUser | SecurityLevel.AnswerAdmin;

And then check the permissions in the action being performed:

// Check if the user has the required permission group
//
//   1010
// & 1000
//   ----
//   1000
if( (User.Permissions & SecurityLevel.AnswerAdmin) == SecurityLevel.AnswerAdmin )
{
// Allowed
}
  1. They can be used for passing in many arguments to a function through one limited size variable.
  2. Advantages are low memory overhead, or low memory cost: Therefore increased performance.
  3. I can't write a tutorial on the spot, but they are out there I'm sure.

They can be used for a whole load of different applications, here is a questions I have previously posted here, which uses bitwise operations:

Bitwise AND, Bitwise Inclusive OR question, in Java

For other examples, have a look at (say) flagged enumerations.

In my example, I was using bitwise operations to change the range of a binary number from -128...127 to 0..255 (changing it's representation from signed to unsigned).

the MSN article here ->

http://msdn.microsoft.com/en-us/library/6a71f45d%28VS.71%29.aspx

is useful.

And, although this link:

http://weblogs.asp.net/alessandro/archive/2007/10/02/bitwise-operators-in-c-or-xor-and-amp-amp-not.aspx

is very technical, it is covering everything.

HTH

Anytime you have an option of 1 or more in combination of items then bitwise is usually an easy fix.

Some examples include security bits (waiting on Justin's sample..), scheduling days, etc.

I would have to say one of the most common uses is modifying bitfields to compress data. You mostly see this in programs attempting to be economical with packets.

Example of network compression using bitfields

One of the most frequent things I use them for in C# is producing hashcodes. There's some reasonably good hashing methods that use them. E.g. for a co-ordinate class with an X an Y that were both ints I might use:

public override int GetHashCode()
{
return x ^ ((y << 16) | y >> 16);
}

This quickly generates a number that is guaranteed to be equal when produced by an equal object (assuming that equality means both X and Y parameters are the same in both objects compared) while also not producing clashing patterns for low-valued objects (likely to be most common in most applications).

Another is combining flag enumerations. E.g. RegexOptions.Compiled | RegexOptions.CultureInvariant | RegexOptions.IgnoreCase

There are some low-level operations that are more commonly not necessary when you are coding against a framework like .NET (e.g. in C# I won't need to write code to convert UTF-8 to UTF-16, it's there for me in the framework), but of course somebody had to write that code.

There are a few bit-twiddling techniques, like rounding up to the nearest binary number (e.g. round up 1010 to 10000):

        unchecked
{
--x;
x |= (x >> 1);
x |= (x >> 2);
x |= (x >> 4);
x |= (x >> 8);
x |= (x >> 16);
return ++x;
}

Which are useful when you need them, but that tends not to be very common.

Finally, you can also use them to micro-optimise mathematics such as << 1 instead of * 2 but I include that only to say it's generally a bad idea as it hides the intent of the real code, saves almost nothing in performance, and can hide some subtle bugs.

Binary sort. There were issues where the implementation was using a division operator instead of a bitshift operator. This caused BS to fail after the collection got to sizes above 10,000,000

Although everyone seems to be hooked on the flags usecase, that isn't the only application of bitwise operators (although probably the most common). Also C# is a high enough level language that other techniques will probably be rarely used, but it's still worth knowing them. Here's what I can think of:


The << and >> operators can quickly multiply by a power of 2. Of course, the .NET JIT optimizer will probably do this for you (and any decent compiler of another language as well), but if you're really fretting over every microsecond, you just might write this to be sure.

Another common use for these operators is to stuff two 16-bit integers into one 32-bit integer. Like:

int Result = (shortIntA << 16 ) | shortIntB;

This is common for direct interfacing with Win32 functions, which sometimes use this trick for legacy reasons.

And, of course, these operators are useful when you want to confuse the inexperienced, like when providing an answer to a homework question. :)

In any real code though you'll be far better off by using multiplication instead, because it's got a much better readability and the JIT optimizes it to shl and shr instructions anyway so there is no performance penalty.


Quite a few curious tricks deal with the ^ operator (XOR). This is actually a very powerful operator, because of the following properties:

  • A^B == B^A
  • A^B^A == B
  • If you know A^B then it's impossible to tell what A and B are, but if you know one of them, you can calculate the other.
  • The operator doesn't suffer from any overflows like multiplication/division/addition/subtraction.

A couple of tricks I have seen using this operator:

Swapping two integer variables without an intermediary variable:

A = A^B // A is now XOR of A and B
B = A^B // B is now the original A
A = A^B // A is now the original B

Doubly-linked list with just one extra variable per item. This will have little use in C#, but it might come in handy for low level programming of embedded systems where every byte counts.

The idea is that you keep track of the pointer for the first item; the pointer for the last item; and for every item you keep track of pointer_to_previous ^ pointer_to_next. This way you can traverse the list from either end, yet the overhead is just half that of a traditional linked list. Here's the C++ code for traversing:

ItemStruct *CurrentItem = FirstItem, *PreviousItem=NULL;
while (  CurrentItem != NULL )
{
// Work with CurrentItem->Data


ItemStruct *NextItem = CurrentItem->XorPointers ^ PreviousItem;
PreviousItem = CurrentItem;
CurrentItem = NextItem;
}

To traverse from the end you just need to change the very first line from FirstItem to LastItem. That's another memory saving right there.

Another place where I use the ^ operator on a regular basis in C# is when I have to calculate a HashCode for my type which is a composite type. Like:

class Person
{
string FirstName;
string LastName;
int Age;


public int override GetHashCode()
{
return (FirstName == null ? 0 : FirstName.GetHashCode()) ^
(LastName == null ? 0 : LastName.GetHashCode()) ^
Age.GetHashCode();
}
}

If you ever need to communicate with hardware you'll need to use bit twiddling at some point.

Extracting the RGB values of a pixel value.

So many things

I don't know how practical, solving a sudoku you consider to be, but let's assume it is.

Imagine you want to write a sudoku solver or even just a simple program, that shows you the board and lets you solve the puzzle yourself, but ensures the moves are legal.

The board itself will most probably be represented by a two-dimensional array like:

uint [, ] theBoard = new uint[9, 9];

Value 0 means the cell is still empty and values from the range [1u, 9u] are the actual values in the board.

Now imagine you want to check if some move is legal. Obviously you can do it with a few loops, but bitmasks allow you to make things much faster. In a simple program that just ensures the rules are obeyed, it doesn't matter, but in a solver it could.

You can maintain arrays of bitmasks, that store information about the numbers that are already inserted in each row, each column a and each 3x3 box.

uint [] maskForNumbersSetInRow = new uint[9];


uint [] maskForNumbersSetInCol = new uint[9];


uint [, ] maskForNumbersSetInBox = new uint[3, 3];

The mapping from the number to the bitpattern, with one bit corresponding to that number set, is very simple

1 -> 00000000 00000000 00000000 00000001
2 -> 00000000 00000000 00000000 00000010
3 -> 00000000 00000000 00000000 00000100
...
9 -> 00000000 00000000 00000001 00000000

In C#, you can compute the bitpattern this way (value is an uint):

uint bitpattern = 1u << (int)(value - 1u);

In the line above 1u corresponding to the bitpattern 00000000 00000000 00000000 00000001 is shifted left by value - 1. If, for example value == 5, you get

00000000 00000000 00000000 00010000

At the beginning, the mask for each row, column and box is 0. Every time you put some number on the board, you update the mask, so the bit corresponding to the new value is set.

Let's assume you insert value 5 in row 3 (rows and columns are numbered from 0). Mask for row 3 is stored in maskForNumbersSetInRow[3]. Let's also assume that before the insert there were already numbers {1, 2, 4, 7, 9} in row 3. The bit pattern in the mask maskForNumbersSetInRow[3] looks like this:

00000000 00000000 00000001 01001011
bits above correspond to:9  7  4 21

The goal is to set the bit corresponding to the value 5 in this mask. You can do it using bitwise or operator (|). First you create a bit pattern corresponding to the value 5

uint bitpattern = 1u << 4; // 1u << (int)(value - 1u)

and then you use the operator | to set the bit in the mask maskForNumbersSetInRow[3]

maskForNumbersSetInRow[3] = maskForNumbersSetInRow[3] | bitpattern;

or using shorter form

maskForNumbersSetInRow[3] |= bitpattern;


00000000 00000000 00000001 01001011
|
00000000 00000000 00000000 00010000
=
00000000 00000000 00000001 01011011

Now your mask indicates that there are values {1, 2, 4, 5, 7, 9} in this row (row 3).

If you want to check, if some value is in the row, you can use operator & to check if corresponding bit is set in the mask. If the result of that operator applied to the mask and a bit pattern, corresponding to that value, is non-zero, the value is already in the row. If the result is 0 the value is not in the row.

For example, if you want to check if value 3 is in the row, you can do that this way:

uint bitpattern = 1u << 2; // 1u << (int)(value - 1u)
bool value3IsInRow = ((maskForNumbersSetInRow[3] & bitpattern) != 0);


00000000 00000000 00000001 01001011 // the mask
|
00000000 00000000 00000000 00000100 // bitpattern for the value 3
=
00000000 00000000 00000000 00000000 // the result is 0. value 3 is not in the row.

Below are methods for setting a new value in the board, maintaining appropriate bitmasks up to date and for checking if a move is legal.

public void insertNewValue(int row, int col, uint value)
{


if(!isMoveLegal(row, col, value))
throw ...


theBoard[row, col] = value;


uint bitpattern = 1u << (int)(value - 1u);


maskForNumbersSetInRow[row] |= bitpattern;


maskForNumbersSetInCol[col] |= bitpattern;


int boxRowNumber = row / 3;
int boxColNumber = col / 3;


maskForNumbersSetInBox[boxRowNumber, boxColNumber] |= bitpattern;


}

Having the masks, you can check if the move is legal like this:

public bool isMoveLegal(int row, int col, uint value)
{


uint bitpattern = 1u << (int)(value - 1u);


int boxRowNumber = row / 3;
int boxColNumber = col / 3;


uint combinedMask = maskForNumbersSetInRow[row] | maskForNumbersSetInCol[col]
| maskForNumbersSetInBox[boxRowNumber, boxColNumber];


return ((theBoard[row, col] == 0) && ((combinedMask & bitpattern) == 0u);
}

You'll use them for various reasons:

  • storing (and checking!) option flags in a memory efficient way
  • if doing computational programming, you may want to consider optimizing some of your operations by using bitwise operations instead of mathematical operators (beware of side-effects)
  • Gray Code!
  • creating enumerated values

I'm sure you can think of others.

That being said, sometimes you need to ask yourself: is the memory and performance boost worth the effort. After writing that sort of code, let it rest for a while and come back to it. If you struggle with it, rewrite with a more maintainable code.

On the other hand, sometimes it does make perfect sense to use bitwise operations (think cryptography).

Better yet, have it read by someone else, and document extensively.

Games!

Back in the days, I used it to represent a Reversi player's pieces. It's 8X8 so it took me a long type, and than, for example if you want to know where are all piece on board - you or both players pieces.
If you want all possible steps of a player, say to the right - you >> the player's pieces representation by one , and AND it with the opponent's pieces to check if there are now common 1's (that means there is an opponent piece to your right). Then you keep doing that. if you get back to your own pieces - no move. If you get to a clear bit - you can move there, and capture all the pieces on the way.
(This technique is broadly used, in many kinds of board games, including chess)

Dozens of bit twiddling examples here

The code is in C, but you can easily adapt it to C#