Many file formats contain data in a packed manner: a series of values which encode information, placed into the file one after another. As an example, TrueType font files encode information about the font represented in the file, such as the name of the font and how many characters are contained within the font face. As another example, Windows BMP files encode the dimensions and formatting of the image within the format.
The Need: Structures in C/C++
In most instances, the information in a file is encoded as a series of
structures, grouping related information such that a programming interface
can retrieve it easily. In C and C++, a dedicated type exists for
just this purpose: the struct.
BMP palette entry structure, in C
    /* BMP files store each palette entry as blue, green, red, reserved */
    typedef struct {
        unsigned char b;        /* Blue */
        unsigned char g;        /* Green */
        unsigned char r;        /* Red */
        unsigned char reserved;
    } RGBQUAD;
The above is the C representation of one palette entry in a Windows BMP.
Once this structure has been defined as a type with typedef,
using it is very simple:
Using a structure, in C
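    /* Requires <stdio.h>; file_handle is an open FILE* positioned at the palette */
    RGBQUAD palette[256];

    /* fread returns the number of complete entries it managed to read */
    if (fread(palette, sizeof(RGBQUAD), 256, file_handle) == 256) {
        printf("Colour #0 is %02X%02X%02X.\n",
               palette[0].r, palette[0].g, palette[0].b);
    }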
It can be seen above that the data members of a struct
are accessed in much the same way as the members of a class
in C++ or any other object-oriented programming language. Indeed, in C++ the
keywords class and struct are nearly equivalent: they differ only in default
access, which is public for struct and private for class.
The Problem: Structures in PHP
When using languages such as PHP, a problem arises: PHP does not support a
native struct type. Further, since PHP is a loosely-typed
language, it's not possible to read bytes from an encoded file format directly
into fixed-size typed fields and manipulate them. This can be alleviated by
using PHP's class keyword to build a class containing the structure members:
A PHP class with structure members
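    class RGBQUAD
    {
        // Byte size of the structure: four 8-bit members
        const SIZE = 4;

        // Structure members, in file order
        public $b;
        public $g;
        public $r;
        public $reserved;

        // Initialise members given packed data
        public function __construct($data)
        {
            if (is_string($data) && strlen($data) >= self::SIZE) {
                // unpack() numbers its results from 1; array_values()
                // rebases them to 0 so that list() assigns them in order
                list($this->b, $this->g, $this->r, $this->reserved) =
                    array_values(unpack('C4', $data));
            }
        }
    }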
This representation of the structure makes use of the unpack function
to take a string of binary data and load it into the class members.
This can be used in a similar fashion to the C representation:
Using a structure, in PHP
    $palette = array();
    for ($i = 0; $i < 256; $i++) {
        $palette[$i] = new RGBQUAD(fread($file_handle, RGBQUAD::SIZE));
    }

    printf("Colour #0 is %02X%02X%02X.\n",
           $palette[0]->r, $palette[0]->g, $palette[0]->b);
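The format string passed to unpack decides how many bytes each member consumes and in which byte order, so it is worth seeing in isolation. The following is a minimal sketch using made-up sample bytes rather than data from a real file; it shows that unpack numbers its results from 1, that fields can be named in the format string instead, and how the little-endian and big-endian 16-bit codes differ.

```php
<?php
// Four sample bytes laid out like a BMP palette entry:
// blue, green, red, reserved. The values are invented for illustration.
$data = "\x10\x20\x30\x00";

// 'C4' reads four unsigned 8-bit values. The keys start at 1, not 0.
$bytes = unpack('C4', $data);

// Naming each field in the format string avoids positional keys altogether.
$entry = unpack('Cb/Cg/Cr/Creserved', $data);

// Byte order matters for wider fields: 'v' is 16-bit little-endian
// (as used by BMP), 'n' is 16-bit big-endian (as used by TrueType).
$little = unpack('v', "\x01\x02")[1]; // 0x0201
$big    = unpack('n', "\x01\x02")[1]; // 0x0102

printf("%d %d\n", $little, $big); // prints "513 258"
```

Named fields cost a little more typing, but they remove the need to rebase or skip array keys before assigning the members.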
This approach has distinct advantages over the C struct type.
In particular, the PHP implementation is a class, and can contain methods
other than the simple constructor; furthermore, its members can hold
variable-length data built at load time, such as an array whose length is read
from the file, whereas a C struct filled directly by fread has a fixed layout.
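As a rough illustration of the first advantage, the sketch below attaches two extra methods to a palette-entry class. The names PaletteEntry, toHex and toBytes are hypothetical and chosen only for this example; the member order assumes the blue, green, red, reserved layout that BMP files use.

```php
<?php
// Hypothetical palette-entry class; none of these names come from a library.
class PaletteEntry
{
    const SIZE = 4;

    public $b;
    public $g;
    public $r;
    public $reserved;

    public function __construct($data)
    {
        // Assumed layout: blue, green, red, reserved, one byte each.
        $fields = unpack('Cb/Cg/Cr/Creserved', $data);
        $this->b = $fields['b'];
        $this->g = $fields['g'];
        $this->r = $fields['r'];
        $this->reserved = $fields['reserved'];
    }

    // Format the colour as an RRGGBB hex string.
    public function toHex()
    {
        return sprintf('%02X%02X%02X', $this->r, $this->g, $this->b);
    }

    // Re-encode the entry into its 4-byte packed form.
    public function toBytes()
    {
        return pack('C4', $this->b, $this->g, $this->r, $this->reserved);
    }
}

$entry = new PaletteEntry("\x33\x66\xCC\x00");
echo $entry->toHex(), "\n"; // prints "CC6633"
```

Keeping the decoding, formatting and re-encoding logic on the class means callers never need to know the byte layout.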
Advanced Usage: TrueType File Format
An example of complex usage of the structure pattern is in parsing of the TrueType file format. A TrueType file defines a vector or bitmap font face, and contains a series of data tables: a table of names, a table of Windows-specific font information, and tables of glyph definitions. Also contained in the file structure is a header defining the table "directory", which allows a parser to find these tables within the file.
A TrueType file begins with information about the version of the TrueType specification, followed by the table directory. This can be represented in PHP in a simple manner:
TrueType file header and table directory, in PHP
    // File header. This contains the number of tables in the TTF.
    class ttfHeader
    {
        const SIZE = 12;

        public $majorVersion;
        public $minorVersion;
        public $tableCount;
        public $searchRange;
        public $entrySelector;
        public $rangeShift;
        public $tableDirectory;

        public function __construct($file)
        {
            $this->tableDirectory = array();
            $header = fread($file, self::SIZE);

            // Six 16-bit big-endian values. unpack() numbers its results
            // from 1, so array_values() rebases them for list().
            list($this->majorVersion, $this->minorVersion,
                 $this->tableCount, $this->searchRange,
                 $this->entrySelector, $this->rangeShift) =
                array_values(unpack('n6', $header));

            for ($i = 0; $i < $this->tableCount; $i++) {
                $this->tableDirectory[$i] = new ttfTableDirectoryEntry(
                    fread($file, ttfTableDirectoryEntry::SIZE));
            }
        }
    }

    // Table directory entry. Describes the location and size of a table in the TTF.
    class ttfTableDirectoryEntry
    {
        const SIZE = 16;

        public $tag;
        public $checksum;
        public $offset;
        public $length;

        public function __construct($data)
        {
            // Four 32-bit big-endian values: tag, checksum, offset, length
            list($tag, $this->checksum, $this->offset, $this->length) =
                array_values(unpack('N4', $data));

            // Build a string tag from the numeric value
            $this->tag = chr(($tag >> 24) & 255) .
                         chr(($tag >> 16) & 255) .
                         chr(($tag >>  8) & 255) .
                         chr(($tag >>  0) & 255);
        }
    }
This example shows how it's possible for a structure to contain more
information than the sum of its members. In the case of ttfHeader,
the constructor can pull in the structure of an
entry in the table directory, and build its own array to represent the
directory.
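Once the directory has been read, it can be used to jump straight to a table. The sketch below is one possible way to do that and is not taken from the classes above: readTableDirectory is a hypothetical helper, and the "file" is a tiny synthetic buffer held in memory, not a valid TrueType font. It uses named unpack fields and reads the 4-character tag directly as a string.

```php
<?php
// Hypothetical helper: read the table directory into an array keyed by tag.
function readTableDirectory($file)
{
    // 12-byte header: two 16-bit version halves, the table count,
    // then three binary-search helper fields.
    $header = unpack(
        'nmajor/nminor/ntableCount/nsearchRange/nentrySelector/nrangeShift',
        fread($file, 12)
    );

    $tables = array();
    for ($i = 0; $i < $header['tableCount']; $i++) {
        // 16-byte entry: 4-character tag, then checksum, offset and length.
        $entry = unpack('a4tag/Nchecksum/Noffset/Nlength', fread($file, 16));
        $tables[$entry['tag']] = $entry;
    }
    return $tables;
}

// Synthetic data for illustration only: one table tagged "name"
// whose contents are the string "Hello". The checksum is left as 0.
$payload = 'Hello';
$offset  = 12 + 16; // header plus one directory entry
$bytes   = pack('n6', 1, 0, 1, 16, 0, 0)
         . 'name' . pack('N3', 0, $offset, strlen($payload))
         . $payload;

$file = fopen('php://memory', 'w+b');
fwrite($file, $bytes);
rewind($file);

// Look the table up by tag, seek to it, and read exactly its length.
$tables = readTableDirectory($file);
fseek($file, $tables['name']['offset']);
$contents = fread($file, $tables['name']['length']);
fclose($file);

echo $contents, "\n"; // prints "Hello"
```

Keying the directory by tag is a design choice for lookup convenience; the numerically indexed array built by ttfHeader preserves file order instead.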
All in all, the Structure pattern makes it possible for PHP to represent
structures in a similar fashion to C structs; it also allows
PHP to be more versatile, and represent more complex information related to
the data, which would normally have to be held outside the structure.
© Imran Nazar <tf@oopsilon.com>, 2008