Steganography is a name given for a set of techniques where the targeted message is hidden inside other data, most often in the context of an image in some form. Common forms of steganography include deviations in the level of individual pixels, or changes to the low-level noise components of the image. In this article, I'll look at a technique that produces an image which is also a program, which can be run to produce the targeted message.
In order to keep the examples here simple, I've sought a form of program that has a simple instruction set. Many processors have a simple instruction set: the 6502 has around 150 unique codes in its architecture, which can be reduced further with careful selection of instructions. However, having written about Brainfuck before, it seems the logical choice: eight instructions, each with a unique code, and all other codes are ignored during execution.
The program to be encoded, then, is as follows:
Message before encoding
>>++++++++[<+++[<+++++>-]>-]<<[>+>+>+>+>+<<<<<-]++++++++[>->--> --->---->-----<<<<<-]<++++[>++++++++<-]>>>>>>++++.<<<.>+++++.<< <.>>+++++.++.-.---.>.<<+++++++++.<.>>--.<------.<.>>.+++++.<<.> +.>------.>.<<<.>>>-.<+.<-.>-.<++++.>>---.<<----.>.>++++.<<-. The monkey is in the dishwasher
Encoding program code into pixels
In order for the program to be converted to an image, the first step is to take the program as a stream of numbers. In Brainfuck, the eight operators can be treated as eight numbers to be inserted into a stream:
+ | 0x2B | - | 0x2D |
---|---|---|---|
[ | 0x5B | ] | 0x5D |
< | 0x3C | > | 0x3E |
. | 0x2E | , | 0x2C |
Hiding these values within image data would be difficult if the data were full-colour (24- or 32-bit), since the colour component would need to correspond exactly at a given point in the image. Instead of using full-colour images, it makes more sense to use palette-based images, where an 8-bit palette index refers to a table of colours defined beforehand.
A simple example of palette data would be a graded greyscale image of sixteen shades, like the one below:
If this image is produced in a standard fashion, a palette of sixteen entries and 240 blanks is saved alongside the image. By expanding the palette entries used to the maximum allotted number of 256, it's possible to use the extra entries for hiding information:
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | ||||||||||||||||
1 | ||||||||||||||||
2 | + | , | - | . | ||||||||||||
3 | < | > | ||||||||||||||
4 | ||||||||||||||||
5 | [ | ] | ||||||||||||||
6 | ||||||||||||||||
7 | ||||||||||||||||
8 | ||||||||||||||||
9 | ||||||||||||||||
A | ||||||||||||||||
B | ||||||||||||||||
C | ||||||||||||||||
D | ||||||||||||||||
E | ||||||||||||||||
F |
As can be seen above, each row of the palette is the same shade of grey, except where the Brainfuck operators map onto the palette. By changing arbitrary pixels in the image, it's possible to replace any given #2-grey pixel with, for example, a +
operator without changing the row of the palette used. Shifting pixel values along the palette row in this fashion allows us to change the value without changing the look of the image.
Before applying this concept to the example image in Figure 1, it's important to consider the type of image needed for this to work.
Uncompressed images: the BMP format
The major issue presented by image formats is that of compression: in order to make the image file size smaller, various methods of compression can be used to translate areas of similarity in the image to a much smaller amount of data than that represented by the raw image. Of course, when this is done the resultant compressed data bears no superficial relation to the raw image data; attempting to encode a Brainfuck program into the compressed data stream is beyond the scope of this article.
Instead, I'll be focussing on the raw image data itself; storage of the raw data can be achieved through an uncompressed data format such as TIFF. Examples of uncompressed 8-bit palette image formats include BMP and PCX; the former will be used here, due to its prevalence as a major image file format. When used as an uncompressed 8-bit format, the structure of a BMP file is as follows:
Field | Size (bytes) | Value |
---|---|---|
Signature | 2 | "BM" |
File size | 4 | Size of the full BMP |
Reserved | 4 | 0x00000000 |
Data offset | 4 | Offset of the pixel array |
BITMAPINFOHEADER | ||
Header size | 4 | 0x00000028 |
Image width | 4 | Width of the image in pixels |
Image height | 4 | Height of the image in pixels |
Colour planes | 2 | 0x0001 |
Color depth | 2 | 0x0008 |
Compression method | 4 | 0x00000000 |
Uncompressed image size | 4 | Width * height * bytes-per-pixel |
Horizontal DPM | 4 | Pixels per metre, horizontal |
Vertical DPM | 4 | Pixels per metre, vertical |
Colours used | 4 | Number of unique colours |
Important colours | 4 | Number of important colours |
Colour table (256 entries) | ||
Colour | 4 | RGBA value |
Pixel data, 1 byte per pixel |
The end result of this process is to produce a bitmap file that can be run as a Brainfuck program; to that end, we must be careful that the header data and colour table don't contain Brainfuck operators that may corrupt the initial state of the program. This is, fortunately, relatively easy: all that is needed is to pick an image width and height that don't contain an operator within the byte values, and the rest should automatically fall into place.
One other quirk of the BMP format is the order in which pixel data is stored. Most image formats treat the first row as the top row, in accordance with computer graphics principles; BMP lines up with mathematical graphing principles, and treats the bottom row as the first.
This means that the Brainfuck operators need to be encoded into the image starting at the bottom, running left-right, then moving up to the next row. By examining each operator in the program in turn, and finding the next appropriate pixel (of the same row in the palette) to replace, the following result is obtained.
Once the appropriate pixels have been replaced with versions shifted along the palette row, the palette can be amended to set the Brainfuck operators to the same colour as the rest of the row in which they reside. Doing this, and producing an uncompressed bitmap of the result, provides the following.
Running the bitmap image, shown on the right, through a Brainfuck interpreter results in the targeted message being revealed.
PHP Brainfuck interpreter
<?php list($c, $i) = split('!', file_get_contents($_SERVER['argv'][1])); $c = preg_replace('#[^\[\]\<\>,.+-]#', "", $c); $p = $v = 0; eval(strtr($c, array( "]" => "}", "[" => 'while($m[$p]){', "+" => '$m[$p]++;', "-" => '$m[$p]--;', ">" => '$p++;', "<" => '$p--;', "," => 'if(strlen($i)>$v)$m[$p]=ord($i[$v++]);', "." => 'echo chr($m[$p]);' ))); ?>
Sample result
$ php -f brainfuck.php http://imrannazar.com/content/img/bf-bmp-final.bmp The monkey is in the dishwasher$
Caveats and improvements
The process outlined above is simple enough that it can be automated: given a sixteen-step greyscale image, seek from the bottom left for a pixel in the desired palette row, replace the pixel and move forward. This, however, isn't ideal for steganographic purposes; the Brainfuck operators leave a unique signature in a bitmap of otherwise uniform numbers. There are two simple ways to increase the level of obfuscation:
- Colour: Instead of using a greyscale image, which offers easy colour matching, it's possible to use a coloured palette with any 8-bit image as the source. This involves replacing any existing instances of the operator characters with a close colour match, then further colour matching can be used to fill in the program, hiding the signal in an increased level of noise.
- Compression: Noise levels can be increased further by compressing the image once it has been encoded, hiding any relation between the bitmap data and any Brainfuck program contained within. This increases the complexity of both the encoding and decoding algorithms, since a compression step must be employed around the transmission of the image, but it allows the images to be transmitted in a format that avoids suspicion.
Imran Nazar <tf@imrannazar.com>, Nov 2011.