The Land Where PHP Uses eval()

Who uses eval() ?

It is 2018, and eval() shows up in more than 28% of PHP code bases. It is repeatedly reported as a security issue, a performance bottleneck and a memory hazard. Yet, we can’t get rid of it.

It seems reasonable to think that most of eval()’s capabilities are available as native PHP features. So, we took examples from 2000 PHP open source projects, and reviewed the situation. Here are real-life examples of eval() usage : for each of them, we’ll discuss the actual replacement.

JSON decode replacement

This first situation is a light implementation of json_decode(). The initial warning says ‘enable native json’ and that’s a good piece of advice. Yet, as of today, the application still uses dol_json_decode() in two situations, and some tests.

<?php

function dol_json_decode($json, $assoc=false)
{
    dol_syslog("For better performance, enable the native json in your PHP", LOG_WARNING);

    $comment = false;

    $out='';
    $strLength = strlen($json);    // Must stay strlen and not dol_strlen because we want technical length, not visible length
    for ($i=0; $i<$strLength; $i++)
    {
        if (! $comment)
        {
            if (($json[$i] == '{') || ($json[$i] == '[')) $out.= 'array(';
            else if (($json[$i] == '}') || ($json[$i] == ']')) $out.= ')';
            else if ($json[$i] == ':') $out.= ' => ';
            else $out.=$json[$i];
        }
        else $out.= $json[$i];
        if ($json[$i] == '"' && $json[($i-1)]!="\\") $comment = !$comment;
    }

    $out=_unval($out);

    // ... the generated code in $out is then handed to eval() (rest of the function not shown)
}

?>

If we go back to a time when json_decode() was not a native PHP function, the code written here is a light alternative. Maybe not complete, but useful in many situations.

To implement the decoding, the JSON is read, tokenized and turned into PHP code that builds an array. Then, that code is executed with eval(), and the result is finally a native PHP array.

Instead of creating PHP code for eval(), it would have been better to build the array directly in the main loop. The hard part is handling JSON that nests several levels of arrays and objects : this means recursion, and it was simpler to let eval() do it.
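
As a rough illustration of that direct approach, here is a minimal sketch of a recursive decoder that builds the PHP arrays itself, instead of generating code. The function names (sketch_json_decode() and its helpers) are made up for this example. It only covers a subset of JSON (no \uXXXX escapes, basic error reporting) and always returns associative arrays, so the native json_decode() remains the right choice whenever it is available.

<?php

// Hypothetical names : this is a sketch, not the application's code.
function sketch_json_decode($json)
{
    $pos = 0;
    $value = sketch_json_value($json, $pos);
    sketch_json_skip($json, $pos);
    if ($pos !== strlen($json)) {
        throw new InvalidArgumentException("Unexpected data at offset $pos");
    }
    return $value;
}

// Moves $pos past any whitespace.
function sketch_json_skip($json, &$pos)
{
    $pos += strspn($json, " \t\r\n", $pos);
}

// Reads one value at $pos : nesting is handled by plain recursion.
function sketch_json_value($json, &$pos)
{
    sketch_json_skip($json, $pos);
    $char = substr($json, $pos, 1);

    if ($char === '{' || $char === '[') {
        $close = ($char === '{') ? '}' : ']';
        $result = array();
        $pos++;
        sketch_json_skip($json, $pos);
        if (substr($json, $pos, 1) === $close) {
            $pos++;
            return $result;
        }
        while (true) {
            if ($char === '{') {
                $key = sketch_json_value($json, $pos);
                sketch_json_skip($json, $pos);
                if (!is_string($key) || substr($json, $pos, 1) !== ':') {
                    throw new InvalidArgumentException("Invalid object key at offset $pos");
                }
                $pos++;
                $result[$key] = sketch_json_value($json, $pos);
            } else {
                $result[] = sketch_json_value($json, $pos);
            }
            sketch_json_skip($json, $pos);
            $next = substr($json, $pos, 1);
            $pos++;
            if ($next === $close) {
                return $result;
            }
            if ($next !== ',') {
                throw new InvalidArgumentException("Expected ',' or '$close' at offset $pos");
            }
        }
    }

    // Strings : the escapes are decoded as data, never executed.
    if (preg_match('/"((?:[^"\\\\]|\\\\.)*)"/A', $json, $m, 0, $pos)) {
        $pos += strlen($m[0]);
        return stripcslashes($m[1]);
    }

    // Numbers and literals.
    if (preg_match('/-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?|true|false|null/A', $json, $m, 0, $pos)) {
        $pos += strlen($m[0]);
        $literals = array('true' => true, 'false' => false, 'null' => null);
        return array_key_exists($m[0], $literals) ? $literals[$m[0]] : $m[0] + 0;
    }

    throw new InvalidArgumentException("Unexpected character at offset $pos");
}

?>

The nesting that made eval() tempting is handled by sketch_json_value() calling itself for every element, and the incoming string is never treated as code.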

Multidimensional array

Here is a situation where the code builds a multidimensional array. Data are read from the database, and a big array of stats is built. And this requires eval().

<?php

function genSiteStatCache()
{
    $sqlQuery = "SELECT `Name` as `name`,
                        `Title` as `capt`,
                        `UserQuery` as `query`,
                        `UserLink` as `link`,
                        `IconName` as `icon`,
                        `AdminQuery` as `adm_query`,
                           `AdminLink` as `adm_link`
                        FROM `sys_stat_site`
                        ORDER BY `StatOrder` ASC, `ID` ASC";

    $rData = db_res($sqlQuery);

    $sLine = "return array( \n";
    while ($aVal = $rData->fetch()) {
        $sLine .= genSiteStatFile($aVal);
    }
    $sLine = rtrim($sLine, ",\n") . "\n);";

    $aResult = eval($sLine);

    $oCache = $GLOBALS['MySQL']->getDbCacheObject();

    return $oCache->setData($GLOBALS['MySQL']->genDbCacheKey('sys_stat_site'), $aResult);
}

function genSiteStatFile($aVal)
{
    $oMenu = new BxDolMenu();

    $sLink    = $oMenu->getCurrLink($aVal['link']);
    $sAdmLink = $oMenu->getCurrLink($aVal['adm_link']);
    $sLine    = "'{$aVal['name']}'=>array('capt'=>'{$aVal['capt']}', 'query'=>'" . addslashes($aVal['query']) . "', 'link'=>'$sLink', 'icon'=>'{$aVal['icon']}', 'adm_query'=>'" . addslashes($aVal['adm_query']) . "', 'adm_link'=>'$sAdmLink', ),\n";

    return $sLine;
}

?>

A list of rows is read from the database, with a simple SELECT query. Each row is passed to the function genSiteStatFile() to build a piece of PHP code. This function relies on a method call : nothing special is done beyond converting a string to a link. The result is a piece of code that represents an array. All those pieces are concatenated into one code string, which eval() turns into the final stats.

Given how simple the task is, the secondary function seems superfluous, and eval() may very well be skipped to build the stats directly in PHP.

<?php

$aResult[$aVal['name']] = array('capt' => $aVal['capt'], /* ... */);

?>

It would be interesting to check whether the secondary function helps manage memory. The code here is not recent, and when the instantiation is in a separate function, this may trigger the garbage collector more often, leading to a lighter process. It may be less impactful with recent PHP improvements.
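
For the record, the direct construction could look like the sketch below. The function buildSiteStats(), the resolveLink() helper (a stand-in for $oMenu->getCurrLink(), whose real behaviour is not shown in the original code) and the sample row are all assumptions made for the illustration.

<?php

// Hypothetical stand-in for $oMenu->getCurrLink().
function resolveLink($link)
{
    return '/' . ltrim($link, '/');
}

// Builds the stats array directly, one entry per database row, without eval().
function buildSiteStats($rows)
{
    $aResult = array();
    foreach ($rows as $aVal) {
        $aResult[$aVal['name']] = array(
            'capt'      => $aVal['capt'],
            'query'     => $aVal['query'],
            'link'      => resolveLink($aVal['link']),
            'icon'      => $aVal['icon'],
            'adm_query' => $aVal['adm_query'],
            'adm_link'  => resolveLink($aVal['adm_link']),
        );
    }
    return $aResult;
}

// Made-up row, in the shape returned by the SELECT query above.
$stats = buildSiteStats(array(
    array(
        'name'      => 'usr',
        'capt'      => 'Members',
        'query'     => "SELECT COUNT(*) FROM `Profiles` WHERE `Status` = 'Active'",
        'link'      => 'browse.php',
        'icon'      => 'user',
        'adm_query' => "SELECT COUNT(*) FROM `Profiles` WHERE `Status` = 'Approval'",
        'adm_link'  => 'administration/profiles.php',
    ),
));

?>

Since the values are never pasted into PHP code, the addslashes() calls are not needed anymore : quotes inside the queries are just data.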

Creating missing classes

eval() is also used to create whole classes, just like this :

<?php

    if ( ! isset($active_record) OR $active_record == TRUE)
    {
        require_once(BASEPATH.'database/DB_active_rec.php');

        if ( ! class_exists('CI_DB'))
        {
            eval('class CI_DB extends CI_DB_active_record { }');
        }
    }
    else
    {
        if ( ! class_exists('CI_DB'))
        {
            eval('class CI_DB extends CI_DB_driver { }');
        }
    }
?>

In fact, PHP accepts several definitions of the same class in the code, and only declares the one that is actually executed : this must be interesting to see in the Zend engine. This way, there is no need to create the classes with eval() : hard-coding them inside the condition is sufficient.

<?php
        if ( ! class_exists('CI_DB'))
        {
            class CI_DB extends CI_DB_driver { };
        }
?>

Rewriting classes on the fly

This one-line eval() is quite impressive in terms of features : the eval is applied to the result of a double-regex call to preg_replace(), applied to the imploded lines of a piece of code. The code is so well crafted that not a single verification is needed after several PHP native function calls.

<?php
            // Useful for the eval()
            $override_file = file($override_path);

           //The actual eval()
            eval(preg_replace(array('#^\s*<\?(?:php)?#', '#class\s+'.$classname.'\s+extends\s+([a-z0-9_]+)(\s+implements\s+([a-z0-9_]+))?#i'), array(' ', 'class '.$classname.'OverrideOriginal_remove'.$uniq), implode('', $override_file)));
            
?>

As the regex shows, it actually renames the class on the fly. The original name is extended with OverrideOriginal_remove and a unique ID that was built a little earlier. Just below, another eval() with the same syntax loads the same class, but sets the name with ‘Override_remove’ and the same unique ID. Later, with Reflection, the class is stripped of its methods and properties. The resulting code is then stored in a file, for later inclusion (not shown here).

Eval() is used here to include two classes with the same name. The initial problem for that piece of code is to load both the classes with the same name and be able to compare them.

Instead of applying preg_replace() to the class definition, it would be better to rename the namespace. That way, only namespace A\B\C is renamed to namespace A\B\C\Original. Then, both versions may be loaded from two different namespaces, inspected with Reflection, and compared until the final writing.

Reordering arguments

What to do when you have the right variables, but not in the right format? Write a function that reorganizes the arguments so they fit the correct API. This is what this function does :

<?php

function array_csort() {
    $args = func_get_args();
    $marray = array_shift($args);
    $i = 0;

    $msortline = "return(array_multisort(";
    foreach ($args as $arg) {
        $i++;
        if (is_string($arg)) {
            foreach ($marray as $row) {
                $sortarr[$i][] = $row[$arg];
            }
        } else {
            $sortarr[$i] = $arg;
        }
        $msortline .= "\$sortarr[".$i."],";
    }
    $msortline .= "\$marray));";

    eval($msortline);
    return $marray;
}
            
?>

The objective is to use array_multisort() to sort a multidimensional array. Yet, all the rows are in one array, aka the first argument. So, columns are extracted, sorted in an arbitrary order, and then, applied to the initial array. This is a close cousin to the ORDER BY clause in SQL.

Here, the problem is that the initial data has to be extracted column by column from $marray, and then, used at the right place in the call to array_multisort(). As you can see, the loop builds a series of $sortarr entries, which are only referenced in $msortline. Since eval() executes the code in the current context, $sortarr is available there, and gets sorted.

Nowadays, we can use the ellipsis operator ... or the old-fashioned call_user_func_array() : both allow us to prepare the arguments in an array, with total freedom of organization, then submit them to the function.
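
Here is a minimal sketch of the ellipsis version. The name array_csort_spread() is made up, and the function assumes the same calling convention as array_csort() : the rows first, then column names and sort flags. The rows are added to the argument list by reference, so that array_multisort() reorders them along with the extracted columns.

<?php

// Same contract as array_csort() : first the rows, then column names and sort flags.
function array_csort_spread(array $marray, ...$args)
{
    $sortArgs = array();
    foreach ($args as $arg) {
        // A string names a column to extract; anything else is passed as a sort flag.
        $sortArgs[] = is_string($arg) ? array_column($marray, $arg) : $arg;
    }

    // The rows go last, by reference, so they follow the order of the columns.
    $sortArgs[] = &$marray;

    array_multisort(...$sortArgs);

    return $marray;
}

$people = array(
    array('name' => 'b', 'age' => 30),
    array('name' => 'a', 'age' => 25),
    array('name' => 'c', 'age' => 25),
);

// Youngest first, then names in reverse alphabetical order.
$sorted = array_csort_spread($people, 'age', SORT_ASC, 'name', SORT_DESC);

?>

The argument list is a plain array until the very last moment, so there is no code string to assemble, and nothing to escape.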

Code compatibility

Imagine that your code wants to take advantage of a new PHP feature, from a new PHP version. A classic migration problem : how to use a feature that is not yet available.

The real challenge appears when the upcoming feature doesn’t compile on your current version. For example, imagine a world where PHP has no clone operator. This is the case here :

<?php
    // in a class, 
    static function copy($object) {
        if (version_compare(phpversion(), '5') >= 0) {
            eval('$copy = clone $object;');
            return $copy;
        }
        return $object;
    }            
?>

The clone call is now in a string, which will only be evaluated if the version supports this operator. Otherwise, the string is ignored, and the object is returned as is : PHP 4 copies objects on assignment, so no operator is needed there. For this class to compile on the older version, clone must not appear in the code itself, which is the case here.

This strategy allows old code to run new syntax, preparing for migration. Any speed gain related to a native clone operator is probably offset by the eval code and the static method call. Yet, this allows for cross version compatibility. The most important thing here is to remember to remove this piece of code once the older version has been totally abandoned.

This usage of eval may be the cleverest we have reviewed so far. We found situations where clone and instanceof have been protected that way.

Note also that this may be valid for backward compatibility. PHP 7 restricted the global keyword to simple variables, and this is a patch for compatibility.

<?php
    // global ${$var}
    eval('global $' . $var . ';');
?>

Escaping the sequence

This piece of code makes a clever usage of PHP’s escape sequences. This one is the hexadecimal format for characters : \xhh, where hh is a pair of hexadecimal digits.

<?php

        $dtime = dechex($this->unix2DosTime($time));
        $hexdtime = '\x' . $dtime[6] . $dtime[7]
        . '\x' . $dtime[4] . $dtime[5]
        . '\x' . $dtime[2] . $dtime[3]
        . '\x' . $dtime[0] . $dtime[1];
        eval('$hexdtime = "' . $hexdtime . '";');

?>

$dtime collects the time in DOS format, then turns it into a hexadecimal string. Then, the hexadecimal time is crafted by reversing the order of the hexadecimal pairs from the $dtime string. Note that $dtime is still a string, yet it is used with the array syntax.

\xhh may be replaced by a chr(hexdec()) call. chr() produces a character from its ASCII code. This code needs to be a decimal integer, but the digits extracted from $dtime are hexadecimal. So, we need both dechex() and hexdec() to finish the calculation correctly.

<?php

        $dtime = dechex($this->unix2DosTime($time));
        $hexdtime = chr(hexdec($dtime[6] . $dtime[7])).
                    chr(hexdec($dtime[4] . $dtime[5])).
                    chr(hexdec($dtime[2] . $dtime[3])).
                    chr(hexdec($dtime[0] . $dtime[1]));

?>

Although it is longer to write, chr is 4 times faster than eval.
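
Another possible replacement, not used in the original code, is pack() with the 'V' format (unsigned 32-bit integer, little-endian) : it emits the four bytes in the same reversed order, in one call. A minimal sketch, with a made-up value standing in for the result of unix2DosTime() :

<?php

// Made-up DOS timestamp, standing in for $this->unix2DosTime($time).
$dosTime = 0x4D8B6A3C;

// Byte by byte, as in the chr(hexdec()) version above.
$dtime = dechex($dosTime);
$viaChr = chr(hexdec($dtime[6] . $dtime[7]))
        . chr(hexdec($dtime[4] . $dtime[5]))
        . chr(hexdec($dtime[2] . $dtime[3]))
        . chr(hexdec($dtime[0] . $dtime[1]));

// 'V' packs an unsigned 32-bit integer in little-endian byte order.
$viaPack = pack('V', $dosTime);

var_dump($viaChr === $viaPack); // bool(true)

?>

pack() also works directly on the integer, so it does not depend on dechex() returning exactly eight digits : dechex() does not pad its result with zeros.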

Dynamic variabling

One recurring usage of eval is the emulation of the variable variable. If $$var is a well-known situation, there are some tricky issues to solve. For example :

<?php
   eval('return isset($this->' . $property . ');');
   // identical to 
   isset($this->{$property});
?>

This happens when the code knows which variable to use, but doesn’t know its value yet. Here is one :

<?php
    function foo($str2="<span class='folder'>\$name</span>") {
        $name = otherSource();
        eval("\$nstr = \"$str2\";");
     }

?>

$str2 string contains a variable, and it should be replaced with another value, from another source. For this illustration, the code is simplified, but that kind of on-the-fly replacement is often done after a long list of expressions.

Note also that eval() only executes PHP code : there is no need for the opening tag.

Back to the initial expression. eval() is used here for replacement : the value is in the $name variable, and the incoming argument is a kind of template. The secure way to do this is to use str_replace(), preg_replace() or preg_replace_callback(). The incoming template is used as a piece of data, and not executed as a piece of code.
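
As an illustration, here is a minimal sketch with preg_replace_callback(). The function name renderTemplate() and the sample value are made up; placeholders without a matching value are left untouched, which is one possible policy among others.

<?php

// Replaces the $identifiers found in $template with the entries of $values.
// Unknown placeholders are left as they are; nothing is ever executed.
function renderTemplate($template, array $values)
{
    return preg_replace_callback(
        '/\$([a-zA-Z_][a-zA-Z0-9_]*)/',
        function (array $match) use ($values) {
            return array_key_exists($match[1], $values) ? $values[$match[1]] : $match[0];
        },
        $template
    );
}

$str2 = "<span class='folder'>\$name</span>";
echo renderTemplate($str2, array('name' => 'Documents'));
// <span class='folder'>Documents</span>

?>

The values are inserted as they are : escaping them for HTML is still the job of the caller, just as it was with eval().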

Here is another example of such templating, with a database storage. The address format is stored in the database, as a piece of code : you can see the variable names, which will be taken from executing context.

<?php
    function format($address) {
        // Extracting variables en masse
        extract($address);

        // example of database row in the adequate table : 
        //INSERT INTO address_format VALUES (1, '$firstname $lastname$cr$streets$cr$city, $postcode$cr$statecomma$country','$city / $country');
        // $fmt holds that format string (loading it from the database is not shown)

        eval("\$address_out = \"$fmt\";");
        return $address_out;
    }
?>

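eval could be replaced here with a single strtr() call. str_replace() works too, as long as no variable name is the prefix of another one, like $state and $statecomma :

<?php
    function format($address) {

        // Add the $ prefix to each key, to fit the actual format. 
        // Note the single quotes : '$' is a plain character here, not a variable
        $replacements = array();
        foreach ($address as $name => $value) {
            $replacements['$' . $name] = $value;
        }

        // examples of database row in the adequate table : 
        //INSERT INTO address_format VALUES (1, '$firstname $lastname$cr$streets$cr$city, $postcode$cr$statecomma$country','$city / $country');
        // $fmt holds that format string (loading it from the database is not shown)

        // strtr() tries the longest keys first, and never rescans its own output
        $address_out = strtr($fmt, $replacements);
        return $address_out;
    }
?>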

Dynamic new call

A classic situation : building the name of a class for instantiation.

<?php
  eval('$newback = new ' . $backend_name . '($param);');
  // other way to instantiate a dynamic class
  eval("\$proxy = new nusoap_proxy_$r('');");
?>

new accepts variables for instantiation, so in the first example, eval is not needed. As for the second, new doesn’t accept expressions, so it needs a workaround to create a dynamic name. Here, the name is first built in a variable :

<?php
  $className = "nusoap_proxy_$r";
  $proxy = new $className('');
?>
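
Since PHP 8.0, new also accepts an arbitrary expression between parentheses, so the intermediate variable becomes optional. In either form, the class name still comes from a variable, so checking it against a list of expected values is a reasonable precaution. A minimal sketch, where the class and the whitelist are made up for the illustration :

<?php

// Made-up class, standing in for the generated nusoap_proxy_* classes.
class nusoap_proxy_demo
{
    public $endpoint;

    public function __construct($endpoint)
    {
        $this->endpoint = $endpoint;
    }
}

$r = 'demo';

// Hypothetical whitelist of the suffixes that may be instantiated.
$allowed = array('demo');
if (!in_array($r, $allowed, true)) {
    throw new InvalidArgumentException("Unknown proxy : $r");
}

// PHP 8.0+ : the class name is an expression between parentheses.
$proxy = new ('nusoap_proxy_' . $r)('');

?>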

Evaluating math or logical expression

eval may be used to execute a subset of PHP functionalities. In particular, math expressions may be written by the user, and executed by PHP.

<?php
    $calc = preg_replace('/#([0-9]+)/', '$values[\1]', $option);
    // FIXME: kill eval()
    eval('$computed = ' . $calc . ';');
    $value = $computed;
?>

$option contains a math expression, with its specific format. The values are collected from another source, and this eval executes it.

You’ll be pleased to see that the eval is already targeted for elimination. But it has not been removed yet, for a good reason : the alternative requires a lot of work.

Coding math in PHP is easy : many operators are available, along with parentheses and precedence, and so are constants and functions. Thus, it is tempting to harness this power by running the expression as a string in eval(), along with the values.

The sane replacement is to use a math-parser component. For example, math-parser, or php-math-parser. For logical expressions, there is the expression language from Symfony.

Yet, replacing what looks like a small string, by a full-blown PHP component, spread over several classes is quite scary. The alternative is to filter the string thoroughly : accepting numbers, operators, parentheses and its nesting, and a few functions like log(), e() etc. There is feature creep written all over this possibility, and it will probably end up as a full-blown component, spread over several classes.
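
To give an idea of the minimum amount of work, here is a sketch of a hand-written evaluator. It is not one of the components named above : the function names are made up, and it only supports numbers, the #n references of the example, the four basic operators, unary minus and parentheses. No functions, no constants, no logical operators.

<?php

// Hypothetical names : a sketch of a tiny recursive-descent evaluator.
function evalMath($expression, array $values = array())
{
    $pattern = '~\d+(?:\.\d+)?|#\d+|[-+*/()]~';

    // Anything that is neither a known token nor whitespace is rejected.
    if (trim(preg_replace($pattern, '', $expression)) !== '') {
        throw new InvalidArgumentException('Unsupported characters in expression');
    }
    preg_match_all($pattern, $expression, $matches);
    $tokens = $matches[0];

    $pos = 0;
    $result = parseSum($tokens, $pos, $values);
    if ($pos < count($tokens)) {
        throw new InvalidArgumentException('Unexpected token ' . $tokens[$pos]);
    }
    return $result;
}

// sum := product (('+' | '-') product)*
function parseSum(array $tokens, &$pos, array $values)
{
    $left = parseProduct($tokens, $pos, $values);
    while ($pos < count($tokens) && ($tokens[$pos] === '+' || $tokens[$pos] === '-')) {
        $operator = $tokens[$pos++];
        $right = parseProduct($tokens, $pos, $values);
        $left = ($operator === '+') ? $left + $right : $left - $right;
    }
    return $left;
}

// product := factor (('*' | '/') factor)*
function parseProduct(array $tokens, &$pos, array $values)
{
    $left = parseFactor($tokens, $pos, $values);
    while ($pos < count($tokens) && ($tokens[$pos] === '*' || $tokens[$pos] === '/')) {
        $operator = $tokens[$pos++];
        $right = parseFactor($tokens, $pos, $values);
        $left = ($operator === '*') ? $left * $right : $left / $right;
    }
    return $left;
}

// factor := number | '#' index | '-' factor | '(' sum ')'
function parseFactor(array $tokens, &$pos, array $values)
{
    if ($pos >= count($tokens)) {
        throw new InvalidArgumentException('Unexpected end of expression');
    }
    $token = $tokens[$pos++];
    if ($token === '-') {
        return -parseFactor($tokens, $pos, $values);
    }
    if ($token === '(') {
        $inner = parseSum($tokens, $pos, $values);
        if ($pos >= count($tokens) || $tokens[$pos++] !== ')') {
            throw new InvalidArgumentException('Missing closing parenthesis');
        }
        return $inner;
    }
    if ($token[0] === '#') {
        $index = (int) substr($token, 1);
        if (!array_key_exists($index, $values)) {
            throw new InvalidArgumentException("Unknown value $token");
        }
        return $values[$index];
    }
    if (is_numeric($token)) {
        return $token + 0;
    }
    throw new InvalidArgumentException("Unexpected token $token");
}

?>

With these definitions, evalMath('#1 * (2 + 3) - 4 / 2', array(1 => 10)) computes 10 * 5 - 2, while a string like phpinfo() is rejected before anything is evaluated. Adding log() or constants means a new token type and a new branch in parseFactor() each time : this is where the feature creep starts.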

Evaluating a PHP expression

Quite obvious, right? This situation arises when a framework allows PHP code to be executed as part of its own process. For example, you may include PHP expressions inside a PDF, ODT or XML template, or run customisation for an HTML tag. Since those are left to the developer, there is a need for PHP code to be written in one part of the source, and executed somewhere else.

This is how methods like this one are written :

<?php
    public function evaluateExpression($_expression_,$_data_=array())
    {
        if(is_string($_expression_))
        {
            extract($_data_);
            return eval('return '.$_expression_.';');
        }
        else
        {
            $_data_[]=$this;
            return call_user_func_array($_expression_, $_data_);
        }
    }
?>

evaluateExpression is nothing more than a glorified eval call : arguments are coming in one array, and are later extract()-ed, so they end up in the current context. Then, the PHP code, written in $_expression_ is executed. Note the prefixed return that is necessary to get the value from the eval, unless it is assigned to a local variable. Separation of context is difficult here.

The whole method looks like a function call : the body of the function is $_expression_, and the arguments are in $_data_. Actually, this is exactly how create_function works :

<?php

// Signature : create_function(string $args, string $code): string
$anonymousFunction = create_function($args, $_expression_);

?>

First, note that arguments and code are swapped. But they are both the same arguments as for evaluateExpression().

$args is a list of argument names : it may be built from $_data_ with array_keys(): $args = '$' . implode(', $', array_keys($_data_));, which will produce something like $a, $b, $c.

The result of create_function() is a string that holds the name of an anonymous function. It may be used later just like any function, with the values of the $_data_ variable : $result = $anonymousFunction(...array_values($_data_));.

As the manual mentions, create_function() is a bad idea : ‘This function internally performs an eval() and as such has the same security issues as eval(). Additionally it has bad performance and memory usage characteristics.’

Besides, create_function() is deprecated since PHP 7.2 and removed in PHP 8.0, and it is replaced by closures. The syntax is very close to create_function(), and a lot more elegant :

<?php

$closure = function ($a, $b, $c) { /* code from expression */ };

?>

Now, the result is a Closure object, which acts as a function (and more). In particular, in the context that we are studying, the closure may be written in one part of the framework, like in a custom module, handed to the framework, and transmitted until it is executed. This approach has the benefit of PHP opcodes, as the code is compiled before execution.
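
A minimal sketch of that hand-over, under the assumption that the framework only accepts callables. The function evaluateCallable() is a made-up, simplified counterpart of evaluateExpression(), and the closure and its data are invented for the illustration.

<?php

// Hypothetical, simplified counterpart of evaluateExpression() : callables only.
function evaluateCallable(callable $expression, array $data = array())
{
    // Arguments are passed positionally, in the order of $data.
    return $expression(...array_values($data));
}

// Written in a custom module, and compiled with the rest of the file...
$fullName = function ($first, $last) {
    return trim("$first $last");
};

// ... then executed later by the framework, with its own data.
echo evaluateCallable($fullName, array('first' => 'Ada', 'last' => 'Lovelace'));
// Ada Lovelace

?>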

There is a last case that closures can’t cover : create_function() used to allow function creation with code submitted by the user. Closures require hard-coded code, so this use case is forbidden. For security reasons, it is actually a major upgrade. And that leaves very few situations where eval() is useful to execute arbitrary PHP code, except maybe 3v4l.org.

Final weird stuff

All the above are real examples, extracted from real code. Those make sense, even when there are alternatives. Here are a few other situations, which are plain weird.

<?php

// Are those trying to lint PHP code? 
eval('if(0){' . $code . '}');
@eval("return true; $f ;");

// There must be better ways to throw an exception... 
eval('$e = new Exception($this->message, $this->code);throw($e);');

// function cleanAndSanitizeScr.... 
eval("\x66\x75\x6e\x63\x74\x69\x6f\x6e\x20\x63\x6c\x65\x61\x6e\x41\x6e\x64\x53\x61\x6e\x69\x74\x69\x7a\x65\x53\x63\x72" .  //... MORE code like that

// Now, that's a new one... 
eval($PHP_CODE);

// Who needs eval to echo ? $TP_tb_row may contain code...
eval("echo $TP_tb_row;");

// Yes, ketchup, but hold the onion
eval($ooo000($ooo00o($o00o)));

// Eval to build a closure ... 
eval('$fn = function($document) { return ' . UtilArrayQuery::buildCondition($criteria) . '; };');

?>

Few reasons for code to use eval()

I started this journey with the firm belief that eval() shows a lack of knowledge of PHP. All the situations we have detailed together, with the exception of three, have a corresponding PHP alternative that is better and more elegant.

  • Early json_decode() : recursion
  • Multidimensional array : direct PHP code
  • Creating missing classes : conditional class declarations
  • Rewriting classes on the fly : use namespaces
  • Reordering arguments : use … or call_user_func_array()
  • Dynamic variabling : use PHP dynamic syntax
  • Escaping the sequence : use chr() and hexdec()

The valid reasons :

  • Code compatibility workaround : valid for transition, and so rare
  • Evaluating math or logical expression : the alternative is more than a few lines of code, and yet…
  • New dynamic instantiation : PHP could improve this syntax

My last question would be this : can we lint PHP code from PHP code ? Currently, lint must be run as an external program. And there is no way to compile PHP code from within PHP itself, without relying on eval() and its try/catch, with the threat that any injection will take over the whole server. Tokenizing a string is possible, with token_get_all(), though, by default, it performs no syntax validation. Yet, before running anything through eval(), there is no way to check whether it will run.
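
One partial answer may be the TOKEN_PARSE flag of token_get_all() : since PHP 7.0, it runs the parser on the tokens, and throws a ParseError on invalid syntax, without executing anything. Here is a minimal sketch, where lintPhp() is a made-up name. It only checks the syntax : code that parses may still fail at compile time or at run time.

<?php

// Returns true when $code parses, false otherwise. Nothing is executed.
// $code is expected to start with its opening PHP tag.
function lintPhp($code)
{
    try {
        token_get_all($code, TOKEN_PARSE);
        return true;
    } catch (ParseError $e) {
        return false;
    }
}

var_dump(lintPhp('<?php echo strlen("abc");'));
var_dump(lintPhp('<?php echo strlen("abc";'));

?>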

 
