Memory Usage of Constants in PHP


A common requirement in the development of web applications is that the interface is portable between languages: any phrases which appear within the interface must be in the viewer's preferred language. The consequence of this is that, instead of the phrases themselves, definitions which refer to the phrases must be used throughout the website.

In PHP, one of the more common ways to implement this is to generate multiple "phrase files", each of which contains all the phrases required for a given language. Each phrase is defined as a constant; an example of such a phrase file could run as follows.

phrases.tr.php: Turkish phrase file

define('_WELCOME',       'Hoşgeldiniz!');
define('_FREE_DELIVERY', 'Ücretsiz dağıtma');
define('_BRANDS',        'Markalar');
define('_COMPUTERS',     'Bilgisayarlar');

The object-oriented approach offers another way of holding these translation phrases: as class constants within a phrases class. This alternative would look as follows.

phrases.tr.php: Turkish phrase file

class Phrases
{
    const WELCOME       = 'Hoşgeldiniz!';
    const FREE_DELIVERY = 'Ücretsiz dağıtma';
    const BRANDS        = 'Markalar';
    const COMPUTERS     = 'Bilgisayarlar';
}
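As a rough illustration of how the two styles are consumed by the rest of the site, the sketch below defines one phrase in each form and reads it back, both directly and through a name held in a variable. The `$key` variable is an assumption made for the example; the phrase names are those from the listings above.

```php
<?php
// Global-constant style, as in the first phrase file.
define('_WELCOME', 'Hoşgeldiniz!');

// Class-constant style, as in the second phrase file.
class Phrases
{
    const WELCOME = 'Hoşgeldiniz!';
}

// Direct references, resolved by name.
echo _WELCOME, "\n";
echo Phrases::WELCOME, "\n";

// Lookup by a name chosen at run time ($key is hypothetical).
$key = 'WELCOME';
echo constant('_' . $key), "\n";
echo constant('Phrases::' . $key), "\n";
```

Either form can be read through constant() when the phrase name is only known at run time, so the choice between them does not restrict how templates refer to phrases.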

Since an application is likely to have a large number of these phrases, it makes sense to ask which of the two constructs is the more memory-efficient: in other words, which one takes less memory to hold. It might be expected that both methods of defining the phrase file would take up the same amount of memory; this is not the case.
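One way to check this empirically is to compare memory_get_usage() before and after defining a batch of constants in each style. The following is a minimal sketch, not a rigorous benchmark: the function names, the generated phrase names and the count of 1000 are all assumptions made for illustration, and the class is built with eval() only so that the number of constants can be varied.

```php
<?php
// Bytes gained by defining $count global constants with define().
// Names and values are generated purely for illustration.
function measureGlobalConstants(int $count): int
{
    $before = memory_get_usage();
    for ($i = 0; $i < $count; $i++) {
        define('_PHRASE_' . $i, 'value ' . $i);
    }
    return memory_get_usage() - $before;
}

// Bytes gained by declaring a class holding $count constants.
// The class source is built before measuring, so only the
// declaration itself is counted.
function measureClassConstants(int $count): int
{
    $body = '';
    for ($i = 0; $i < $count; $i++) {
        $body .= "    const PHRASE_$i = 'value $i';\n";
    }
    $source = "class MeasuredPhrases\n{\n$body}\n";

    $before = memory_get_usage();
    eval($source);
    return memory_get_usage() - $before;
}

// Each function can only be called once per request, since
// constants and classes cannot be redefined.
$globalBytes = measureGlobalConstants(1000);
$classBytes  = measureClassConstants(1000);

printf("global constants: %d bytes\n", $globalBytes);
printf("class constants:  %d bytes\n", $classBytes);
```

The figures reported include the name and value strings and will vary with the PHP version and build, so they should not be expected to match the per-constant numbers derived for PHP 5.2.10.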

(A note: the analysis presented below is based on the definitions held in PHP 5.2.10; the values produced may differ in future versions.)

Storage of a global constant

The first of the two methods above defines a series of constants in the global scope; internally to PHP, these were initially stored as an array of struct zend_constant. The layout of this data structure, and the memory used, are derived from the basic zval structure used by PHP to hold values.

Figure 1: Original storage scheme for global constants

This data structure is memory-efficient: it stores the name and value of the constant, along with the other data PHP requires for a value, such as the reference count. The problem with holding zend_constants in an array is that the time taken to find a particular constant grows linearly with the number of constants defined; since the PHP interpreter itself defines constants such as PHP_VERSION before any user code runs, user-defined constants are always at a speed disadvantage.

To resolve this, a HashTable data structure was introduced with PHP 3, so that a constant can be found in constant time on average, regardless of how many are defined. The HashTable uses Buckets to store its data entries, each of which has a name attached.
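The difference between the two lookup schemes can be sketched in userland PHP, whose arrays are themselves hash tables. This is only an analogy for the engine's internal C structures: the function names findLinear and findHashed, and the sample entries, are hypothetical.

```php
<?php
// Array-of-records scheme: each lookup scans the entries in order
// until the name matches, so cost grows with the number of entries.
function findLinear(array $constants, string $name)
{
    foreach ($constants as $entry) {
        if ($entry['name'] === $name) {
            return $entry['value'];
        }
    }
    return null;
}

// Hash scheme: the entry is located from the hash of its name,
// without scanning the other entries.
function findHashed(array $constants, string $name)
{
    return $constants[$name] ?? null;
}

// Interpreter-defined constants come first, user constants after.
$list = [
    ['name' => 'PHP_VERSION', 'value' => PHP_VERSION],
    ['name' => '_WELCOME',    'value' => 'Hoşgeldiniz!'],
];

// The same data, keyed by name.
$table = [];
foreach ($list as $entry) {
    $table[$entry['name']] = $entry['value'];
}

echo findLinear($list, '_WELCOME'), "\n";
echo findHashed($table, '_WELCOME'), "\n";
```

Both calls return the same value; what changes is how much work is done to reach it as the number of defined constants grows.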

Figure 1a: Hashtable storage scheme for global constants

The trade-off in using a hash structure like this is that more memory is needed to store the hashes and key lengths. In total, storage of a global constant takes 42 bytes before the strings for name and value are counted.

Storage of a class constant

When class constants were introduced with the new object model in PHP 5, a new line of thinking was employed: if a HashTable is used, and a hash of the constant name is stored, the name itself doesn't need to be stored alongside it. This allows a good chunk of space to be saved, and the structure of the storage to be simplified somewhat.

Figure 2: Storage scheme for class constants

As can be seen here, the name of the constant isn't stored at all once its hash has been calculated. This means that the zend_constant structure can be eliminated, leaving only the zval. As a result, a class constant needs 26 bytes to be stored, before the value is counted.
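To put the two figures side by side, the fixed overhead saved by moving a phrase file from global constants to class constants can be worked out directly. The sketch below uses the 42-byte and 26-byte figures quoted above for PHP 5.2.10; the constant names, the function name overheadSaving and the phrase count of 1000 are assumptions for the example.

```php
<?php
// Per-constant overheads quoted in the article for PHP 5.2.10,
// excluding the name and value strings themselves.
const GLOBAL_OVERHEAD_BYTES = 42;
const CLASS_OVERHEAD_BYTES  = 26;

// Fixed overhead saved by holding $phraseCount phrases as class
// constants instead of global constants.
function overheadSaving(int $phraseCount): int
{
    return $phraseCount * (GLOBAL_OVERHEAD_BYTES - CLASS_OVERHEAD_BYTES);
}

// 1000 phrases is a hypothetical file size: 1000 * (42 - 26) = 16000.
echo overheadSaving(1000), "\n"; // 16000
```

This counts only the fixed part of each entry. By the analysis above, a global constant additionally pays for its name string, so the real difference per phrase would be larger by the length of the name.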

Conclusions

Two conclusions can be drawn from this analysis:

Imran Nazar <tf@imrannazar.com>, June 2010.