PHP Variable Variables Usage

I call them ‘variable variables’, because it makes a funny sentence. What? Aren’t variables supposed to …vary, change, get altered? Indeed, they do. Yet, there are the ‘dynamic variables’ in PHP, which have been around as long as PHP itself (at least, PHP 3.+).

Recent tutorials around the web, like "Dynamic variables in PHP" or "Dynamic Variable Names in PHP", show that they are quite fashionable. In fact, I remember learning about them as a little self-discovered treasure: yeah! I can produce thousands of variables in one script! Yes, I was young.

Nowadays, they are not so hot anymore. Adding variables in a method context lacks accountability and brings together worlds that should be kept isolated. How many are they? What do they contain? Are they interfering with my own variables?

Since 42% of PHP Open Source projects use them, we decided to set off on a journey to see how they are used, and what would be a modern alternative to them. Here are six ways to use PHP variable variables and some alternative strategies.

To resuscitate register_globals

I’ll start with a direct hit of nostalgia: everyone who used PHP 4 or early PHP 5 remembers those hacks. Variable variables were the worst way to emulate the old behavior. Yes, that’s a great introduction to the workings of the dynamic variables feature.

Until its removal in PHP 5.4 (after being deprecated in PHP 5.3), register_globals was a feature that automagically turned the incoming variables from an HTTP request (POST, GET, etc.) into global variables. That way, it gave immediate access to any incoming variable. Convenient. Unsafe, but convenient.

Needless to say, many applications relied on this behavior, and there was no way to convince authors to replace a simple $x call with $HTTP_POST_VARS['x']. Not even when that last variable was shortened to $_POST and made superglobal.

So, variable variables saved the day, with the following solution.

 
<?php 
foreach($HTTP_POST_VARS as $k => $v) { 
  $$k = $v; 
} 
?>

You can easily read the code here: the name of the variable is in the key of the $HTTP_POST_VARS index, so $$k points to the variable whose name is stored in $k. $v is the actual value, so the loop above creates a variable named after the content of $k, with its corresponding value.

Injecting Unknown Variables

The problem with the original register_globals was that the PHP script would start with an unknown list of pre-created variables, on top of PHP’s own pre-defined variables. Anyone could add an extra field or two to the POST request, and create a new variable inside the script. This usually meant that the initial value of those variables could not be trusted anymore.

Overwriting variables

Then, variable variables introduced the problem of overwriting existing variables. The code shown above checks neither the content nor any pre-existing variable. This opens the door to replacing a safe variable with an external, unsafe value.

Besides, the value itself must be checked before usage.
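As an illustration, here is a minimal sketch of a more defensive pattern, assuming the script knows in advance which fields it expects. The helper pickAllowed() and the field names are hypothetical, not part of PHP: only listed keys are kept, they stay in an array, and the value is validated before usage.

```php
<?php
// Hypothetical helper: keep only the keys the script explicitly expects.
function pickAllowed(array $input, array $allowed): array
{
    // array_flip() turns the list of names into keys, so that
    // array_intersect_key() can drop everything else.
    return array_intersect_key($input, array_flip($allowed));
}

// Simulated request data; 'is_admin' is an injected, unexpected field.
$post = ['login' => 'alice', 'age' => '42', 'is_admin' => '1'];

$fields = pickAllowed($post, ['login', 'age']);

// The value itself is still validated before usage.
$age = filter_var($fields['age'] ?? null, FILTER_VALIDATE_INT);
```

Nothing lands in the local scope by accident here: the injected field is dropped, and no existing variable can be overwritten.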

Creating Arbitrary Variables

Nowadays, register_globals is an old souvenir of the not-so-good old times. It is gone, and no one misses it. Variable variables stayed, and they are still used for the same purposes, although in slightly different contexts. Hopefully, things are better these days.

One purpose is to create many variables on the spot. Not from incoming values, mind you, but from a safer source. For example, a configuration file, or a cache.

Those sources are both quite safe, as they are usually totally under the application’s control. It is not an open backdoor, like $_POST was. Although, as usual, caution is always a good thing : validating those values is still the best practice.

Then, configuration files are dynamic by nature: a new directive could be added at any moment, and is supposed to be easily accessible in the application environment. This is a classic situation for variable variables: an arbitrary list of directives, with arbitrary names and values. Only the list itself is not arbitrary.

In such situations, variable variables are used with this code :

<?php

function foo($param) {
   $is_empty = true;
   // $param is a serialized array of directives
   $param = unserialize($param);

   foreach ($param as $key => $val) {
      if (!empty($val)) {
          // Create a variable named after the directive
          $$key = strtolower($val);
          $is_empty = false;
          break;
      }
   }
}
?>

The directives are injected from an external source: here, it is serialized PHP data. Everything from that source is recreated in the PHP scope by the loop.

The origin of those variables may be any source that holds a dataset. Here are some examples:

  • JSON file
  • INI file
  • serialized PHP code
  • Database
  • var_export() array
  • YAML file
  • Any sub processes
  • Redis hash
  • WDDX deserialize

In case of configuration, that loop appears only once in the code. In case of cache, there should be a second loop, with writing purpose. Possibly, with another usage of variable variables.
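For comparison, here is a sketch that reads directives from one of those sources while keeping them in a single array, instead of spreading them into variables. The JSON string and the directive names are made up for the example, and JSON_THROW_ON_ERROR assumes PHP 7.3 or later.

```php
<?php
// Hypothetical configuration source; in practice this would be read from a file.
$json = '{"debug": "ON", "cache_dir": "/tmp/cache", "empty": ""}';

// Decode as an associative array; invalid JSON throws a JsonException.
$config = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

// Non-empty directives are normalized, and the result stays in one array.
$directives = [];
foreach ($config as $key => $val) {
    if (!empty($val)) {
        $directives[$key] = strtolower($val);
    }
}
```

The whole set of directives can then be counted, listed or written back to a cache with plain array functions.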

Missing Someone maybe?

One challenge appears when auditing those newly created variables. Are they all created? Are we missing any of them ? Basically, how do you run an audit and keep them under control?

There is the native PHP function get_defined_vars(), which lists the created variables. This is a useful tool for debugging, albeit a bit cumbersome.

Besides, when variables are created from an array-like structure (see example above), then this array is actually far easier to handle than a list of variables. The array acts as a pointer (or a reference), to a list of variables. Then, array functions apply to list them, count them, remove or alter them.

<?php

function foo(array $vars) {
    foreach ($vars as $name => $value) {
        $$name = $value;
    }

    // This list includes $name, $value and $vars too!
    print_r(get_defined_vars());
    print_r(array_keys($vars));
}
?>

In fact, the second PHP function to know is extract(): it produces variables from an array, without using variable variables. It also takes variable conflicts into account.

Variable Conflicts

When using variable variables in a loop as previously, we have skimmed over a hidden problem : variable overwriting. Since variables may change at any point of the context, overwriting one of them doesn’t yield any warning : this is normal behavior.

Yet, it is a frequent problem. Look here :

 <?php

foo(['name' => 1, 'vars' => 'bar']);

function foo(array $vars) { 
    foreach ($vars as $name => $value) { 
       $$name = $value;
     }

     // This list includes $name, $value and $vars too!
     print_r(get_defined_vars());

     /*Array ( 
       [vars] => bar 
       [value] => bar 
       [name] => vars 
      ) */

} 
?>

foo() is called with two values to create as variables, and both create havoc. First, 'name' is also the key variable of the foreach. $name is actually created first during the loop execution, but since it is reused on each iteration, it is then overwritten with later values. So, the argument’s variables conflict with the local ones.

Secondly, 'vars' also overwrites a local variable. Now, foreach() works on a copy of the source array, so the loop runs without a problem: all the variables are created as expected. But, right after the loop, $vars has been overwritten by one of the argument variables, and it is now a string, not an array anymore.

Variable variables establish a bridge between the variable namespace and literal values. Usually, both worlds are separated, and variables hold values. Here, values create variables, and this leads to those conflicts.

To avoid those conflicts, a check on variable existence is necessary. While it is possible to do it with an if (isset()) condition, extract() does a better job. Its second argument is a flag that specifies the behavior in case of conflict: there are several alternatives should one arise.

  • EXTR_OVERWRITE overwrites the previous variable; this is the default behavior
  • EXTR_IF_EXISTS overwrites only variables that already exist
  • EXTR_SKIP skips the new value, and keeps the previous one
  • EXTR_PREFIX_SAME adds a prefix, passed as third argument, to any conflicting variable
  • EXTR_PREFIX_ALL adds that prefix to all new variables

<?php

foo(['foo' => 1, 'vars' => 'bar']);

function foo(array $vars) {
    // EXTR_SKIP keeps the existing $vars, and skips the incoming 'vars'
    extract($vars, EXTR_SKIP);

    print_r(get_defined_vars());

    /* Array (
        [vars] => Array (
            [foo] => 1
            [vars] => bar
        )
        [foo] => 1 )
    */
}
?>
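The prefix flags deserve a quick illustration too. The sketch below uses a hypothetical function name and prefix to show EXTR_PREFIX_ALL: extract() joins the prefix and the original name with an underscore, and returns the number of variables it created.

```php
<?php
// Hypothetical demo function: every incoming name gets the 'cfg' prefix.
function demoPrefix(array $vars): array
{
    // Creates $cfg_name and $cfg_vars; returns the number of variables created.
    $count = extract($vars, EXTR_PREFIX_ALL, 'cfg');

    // $vars is untouched, since the incoming 'vars' became $cfg_vars.
    return [$count, $cfg_name, $cfg_vars, is_array($vars)];
}

$result = demoPrefix(['name' => 1, 'vars' => 'bar']);
```

With the prefix, the incoming names cannot collide with the local $vars, as long as no local variable starts with cfg_.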

Writing To A Dynamic Dataset

Besides reading variables, writing them is also possible. When the dataset has several layers, it is sometimes necessary to access an intermediate level, even when this level has an arbitrary name. Variable variables, then, are easier to read than references.

Look at this structure, which deals with a multi-dimensional array like $issues[$lang][$type][$line] = $issue;. The variables used as indexes show that those values are arbitrary. The nested loops group the issues by type.

 
<?php

// $issues[$lang][$type][$line] = $issue;
foreach ($issues as $lang => $languageIssues) {
    foreach ($languageIssues as $type => $languageIssue) {
        foreach ($languageIssue as $line => $issue) {
            // ${$type}[] creates the array on first use, while
            // array_push() requires an existing array
            ${$type}[] = 'Line ' . $line . '. ' . $issue;
        }
    }
}
?>

$type can only be a string (or a number), so it cannot hold an array. Here, the name of the type is used to access a variable with the same name, which is the final array. That way, any type of issue is created and filled as the $issues array is read.

After the mass creation, those variables will have to be collected.

Instead of variable variables, an array can be used here. $types[$type] acts just like a variable: it creates indexes on the fly and accepts value pushes. And it immediately collects all the types in a convenient array.
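Here is what the array version might look like, as a sketch with made-up sample data shaped like $issues[$lang][$type][$line] = $issue;.

```php
<?php
// Sample data, invented for the example.
$issues = [
    'php' => [
        'error'   => [12 => 'Undefined variable'],
        'warning' => [30 => 'Unused import'],
    ],
    'js' => [
        'error' => [7 => 'Missing semicolon'],
    ],
];

$types = [];
foreach ($issues as $lang => $languageIssues) {
    foreach ($languageIssues as $type => $languageIssue) {
        foreach ($languageIssue as $line => $issue) {
            // The index is created on the fly, no initialization needed.
            $types[$type][] = 'Line ' . $line . '. ' . $issue;
        }
    }
}
```

After the loops, array_keys($types) gives the list of types that were met, with no extra collection step.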

Also, magic methods in a class (__get() and __set()) are another solution to collect the values in this case. Note that such a class will probably use an array to store the magic properties.
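A minimal sketch of such a class could look like this. The class name Bag and its all() method are hypothetical, and the typed property assumes PHP 7.4 or later.

```php
<?php
// Hypothetical container: the dynamic "variables" live in a private array.
class Bag
{
    private array $values = [];

    public function __set(string $name, $value): void
    {
        $this->values[$name] = $value;
    }

    public function __get(string $name)
    {
        // Unknown names yield null instead of a warning.
        return $this->values[$name] ?? null;
    }

    public function __isset(string $name): bool
    {
        return isset($this->values[$name]);
    }

    // Everything that was stored, as a plain array.
    public function all(): array
    {
        return $this->values;
    }
}

$bag = new Bag();
$type = 'warning';
$bag->$type = ['Line 30. Unused import'];
```

The dynamic name is still there ($bag->$type), but the values are isolated from the local scope and can be listed at any time with all().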

To Store Temporary Variables En Masse

We have seen situations where the variables are created from a list of preconfigured variables. When those dynamically created variables need some extra processing, it leads to creating even more variables.

 
<?php 
foreach (current($data) as $_v => $_k) { 
   if ($_v != $index_key) { 
      if (!isset(${$_v . '_temp'})) ${$_v . '_temp'} = ""; 
      ${$_v . '_temp'} .= " {$_v} = CASE {$index_key} "; 
   } 
} 
?>

In the example above, the incoming variables in $data are processed to add more syntax, before building a complex SQL query. The ${$_v . '_temp'} expression creates a new variable, which is later retrieved and combined.

The suffix approach tries to avoid the variable conflicts we mentioned before. Hopefully, the incoming variables will never include both a and a_temp at the same time. Should that situation happen, some of the values will overwrite each other.

To avoid this problem, switching context when processing the values would be a better idea. Apply array_map or array_walk to the $data array, to keep the processing separate. That keeps the variable namespace pollution low.
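One possible sketch of that idea, assuming string column names: the hypothetical helper buildCaseFragments() does the processing in its own scope with array_map(), and returns the fragments in an array keyed by column.

```php
<?php
// Hypothetical helper: one SQL fragment per column, kept in an array.
function buildCaseFragments(array $row, string $indexKey): array
{
    // Every column except the index key gets a fragment.
    $columns = array_values(array_filter(
        array_keys($row),
        function ($column) use ($indexKey) {
            return $column !== $indexKey;
        }
    ));

    $fragments = array_map(
        function ($column) use ($indexKey) {
            return " {$column} = CASE {$indexKey} ";
        },
        $columns
    );

    // Column name => fragment, instead of ${$column . '_temp'}.
    return array_combine($columns, $fragments);
}

// Made-up data: only the first row is used to list the columns.
$data = [['id' => 1, 'title' => 'a', 'price' => 10]];
$fragments = buildCaseFragments(current($data), 'id');
```

Since the fragments are keyed by column name, a column called a and another called a_temp cannot step on each other.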

Exporting Configuration

One possible trick with variable variables is to use them as an alternative control structure, a switch for variables, close to the match expression introduced in PHP 8.0.

Look at this script: the condition is processed in the loop, and the condition chooses the name of the variable. The actual HTML code has been prepared before the loop, and might hold several operations or some large text. $linkImage and $linkText act as cached values.

Then, to display the actual value, $displayValue gets the configuration, and accesses the HTML code via the variable name.

<?php
// $linkImage and $linkText are set before the loop
$links = '';

foreach ($resources as $resource) {
    $displayValue = ($resource->type) ? 'linkImage' : 'linkText';
    $links .= '<a href="' . $resource->link() . '">' . $$displayValue . '</a>';
}
?>

This is a clever usage of variable variables. No need for them to be in arbitrary number, as we can see. Of course, this situation may be handled in different and less tactical ways.
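One of those other ways is a plain array lookup. The sketch below uses made-up snippets and stand-in resource objects (with a url property instead of the link() method) to show the same selection without any variable variable.

```php
<?php
// Hypothetical pre-rendered snippets, keyed by name instead of stored in variables.
$labels = [
    'linkImage' => '<img src="icon.png" alt="">',
    'linkText'  => 'Open',
];

// Stand-ins for the resource objects of the original script.
$resources = [
    (object) ['type' => true,  'url' => '/a'],
    (object) ['type' => false, 'url' => '/b'],
];

$links = '';
foreach ($resources as $resource) {
    $displayValue = $resource->type ? 'linkImage' : 'linkText';
    $links .= '<a href="' . $resource->url . '">' . $labels[$displayValue] . '</a>';
}
```

The set of possible values is explicit in $labels, and a typo in the name surfaces as an undefined index rather than as an unrelated variable.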

Variable Variables forever

Variable variables are still a noticeable feature of the PHP world. PHP coders, new and experienced alike, stumble upon them, and cherish them as a hidden treasure. They are useful in some tactical situations, like managing outgoing template placeholders or incoming directives.

Internally to the application, variable variables are easy to replace with a simple array or a magic class. Those provide a greater control over the values, and standard tools to manipulate them. In particular, they keep those values separated from the variable namespaces, and from the internal heaps and gears of PHP. It is more secure.

PHP audits the code to plan for memory allocation within methods and functions. Dynamically allocated variables are hard to forecast. This forces PHP to rely on a default mechanism to process them, with lower performance.

Thanks to Lars Moelleken for the inspiration behind this journey into variable variables usage. We used the Exakat corpus to measure variable variables usage and check individual details.