Getting ready for PHP 8.3 with static analysis

With PHP 8.3 coming up next week, it is time to review our code bases with static analysis, and update everything that can be before the new version hits the production server.

With a mixed bag of new features, changed behaviors and backward incompatible changes, PHP 8.3 is a prime subject for analysis. So, let’s don the mantle of the static analysis auditor, and build rules to review code and assess the situation.

In this post, we review different items on the migration list. After a short description of each item, we analyze how to detect it in the code. That way, we can detect future bugs caused by incompatible syntax; we can also suggest modernizations, which have to wait until the new version is installed.

Let’s review the features.

Typed class constants

Typed class constants are a new feature: class constants may now get a type, like parameters or properties.

The syntax is new and backward incompatible: once the code uses it, there is no more compatibility with PHP 8.2 and older.

Typed constants require that all the definitions of a constant in the family tree of a class are of the same type. Even if this was not a feature in PHP 8.2 and older, it could easily be achieved with a bit of discipline.

We should be able to find candidates for typing: namely, any class constant which is overridden at least once, and whose values all share a consistent type.

In the code below, B is always a string, while C may be an integer or null. A is a bit of an edge case.

<?php

class a {
  const A = 1;
  const B = '2';
  const C = 3;
}

class b extends a {
    // no A redefined
  const B = 'abc';
  const C = null;
}

?>

Indeed, constants with a single definition are merely the easy case. They can be typed after their value, as long as we can find that value. This may prove harder than it sounds.

<?php

class a {
    // untypable
  const A = PHP_OS == 'osx' ? null : 1;

    // typable
  const B = PHP_OS == 'osx' ? 'yes' : 'no';
}

?>

This feature leads to a suggestion to modernize the code.
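The candidate search described above can be approximated at runtime with the Reflection API. This is only a sketch, assuming the classes are loaded and PHP 8.0 or later: a real static analysis tool works on the source code without running it. The helper name constantTypes() is hypothetical.

```php
<?php

class a {
  const A = 1;
  const B = '2';
  const C = 3;
}

class b extends a {
  const B = 'abc';
  const C = null;
}

// Hypothetical helper: lists the value types seen for each constant,
// counting only the classes that actually define the constant.
function constantTypes(string ...$classes): array {
    $types = [];
    foreach ($classes as $class) {
        foreach ((new ReflectionClass($class))->getReflectionConstants() as $constant) {
            if ($constant->getDeclaringClass()->getName() !== $class) {
                continue; // inherited, not redefined here
            }
            $types[$constant->getName()][] = get_debug_type($constant->getValue());
        }
    }

    // Keep each type once per constant
    return array_map(fn (array $t) => array_values(array_unique($t)), $types);
}

$report = constantTypes('a', 'b');
print_r($report);
```

A constant with a single type in the report, such as B here, is a candidate for typing; C shows up with both int and null, and needs a decision from the developer.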

Variable Class Constant

Variable class constants are again a new syntax, and they also introduce a backward incompatibility with PHP 8.2.

On the other hand, unlike typed constants, there was a workaround in PHP 8.2: the constant() function.

<?php

class A { const B = 'C'; }

$name = 'B';
echo constant("\A::$name"); // displays 'C'
echo constant(A::class.'::'.$name); // displays 'C'

echo constant($constantName); // displays ... something

?>

Some of the calls above can be interpreted statically, and turned into a suggestion to upgrade the code to the new syntax. Other calls, like the last one, or other workarounds such as the Reflection API, might be too difficult to resolve: those should be omitted from analysis.

This feature leads to a suggestion to modernize the code.
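For comparison, here is a short sketch of the workaround next to the PHP 8.3 syntax. The class Suit and its constant are hypothetical, chosen for illustration only. The last statement is a parse error in PHP 8.2 and older, so this file only runs on PHP 8.3 and later.

```php
<?php

// Hypothetical class, for illustration only
class Suit {
    const HEARTS = 'H';
}

$name = 'HEARTS';

// PHP 8.2 and older: build the constant name as a string
$old = constant(Suit::class . '::' . $name);

// PHP 8.3: fetch the class constant with a dynamic name
$new = Suit::{$name};

echo $old, ' ', $new, "\n"; // H H
```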

Override attribute

Override is an attribute class. It marks a method that is meant to override a method of a parent class or of an implemented interface: PHP 8.3 stops with a fatal error when no such method exists. The main incompatibility with PHP 8.2 is that there should be no class with the name Override in the global namespace. That part is easy, and set aside.

Attributes are already backward compatible. They can be used in the current code as well as in older code bases. PHP acts on this one in 8.3, and ignores it in previous versions.

The check itself, that a method marked with the attribute really exists in a parent, can be handled easily by static analysis tools, and reported in any version of PHP.

<?php

class A { 
    function foo() {}
}

class B extends A { 
    #[Override]
    function goo() {}
    // no goo() method in A: PHP 8.3 reports an error!
    // may be a typo for foo()? 
}
?>

This feature leads to a rule that can be used in older PHP versions.

__clone() magic method can change readonly properties

This is a new feature, and it is not backward compatible. Any code that needed it in older PHP versions had only one choice: skip the readonly option on the property.

Suggesting that properties become readonly is a difficult task. When the assignment is done outside the constructor, the writing of a property depends on the execution. This makes the usage difficult to understand for static analysis tools and developers alike.

When the property is assigned directly in the constructor, and then only altered in the __clone() method, it is possible to suggest using readonly with this new feature.

<?php

class A { 
    public int $p;
    public int $q;
    public int $r;
    
    function __construct(int $p, int $r) {
      $this->p = $p;
      $this->r = $r;
    }
    
    function __clone() {
      $this->p = $this->p + 1;
    }
    
    function foo($q, $r) {
      $this->q = $q;
      $this->r = $r;
    }
    

}
?>

In the example above, $p is a good candidate for an upgrade to readonly: it is assigned only in the constructor and in the __clone() method. $q is assigned in another method, and $r is assigned several times: they shall be omitted.

This feature leads to some suggestions of modernization.
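Once the suggestion is applied, the result could look like the sketch below. The class name Counter is hypothetical, and the code requires PHP 8.3: on PHP 8.2 and older, the assignment in __clone() throws an Error.

```php
<?php

// Hypothetical class: $p is only written in the constructor and in __clone()
final class Counter {
    public function __construct(public readonly int $p) {}

    public function __clone() {
        // PHP 8.3 allows reinitializing a readonly property inside __clone()
        $this->p = $this->p + 1;
    }
}

$first  = new Counter(1);
$second = clone $first;

echo $first->p, ' ', $second->p, "\n"; // 1 2
```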

Static variables init

PHP 8.3 introduces the ability to initialize static variables with other variables and function calls. This makes sense, since static variables are variables, not constants. There is no need to restrict them to using static constant expressions.

The feature is backward incompatible with PHP 8.2. Yet, it is easy to think of a workaround. Since function calls and variables are not allowed, the initialization has to be postponed to a later expression. That expression probably uses the default literal value to detect the initialization phase and run.

All that can be spotted in the code, and reported for modernization.

<?php

function foo($a) {
  static $s = 'default', $t = null;
  
  if ($s === 'default') {
     $s = strtolower($a).'bca';
  }

    // alternative: a special if() for null values
  $t ??= strtolower($a).'abc';

  // PHP 8.3 version 
  //static $s = strtolower($a).'bca';
  //static $t = strtolower($a).'abc';
}

?>

This feature leads to suggestions of modernization.
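As a runnable illustration of the modernized form, here is a minimal sketch for PHP 8.3 and later. The function name cachedKey() is hypothetical. The initializer runs once, on the first call, and later calls reuse the stored value.

```php
<?php

// Hypothetical function: the static variable is initialized from
// a function call and an argument, which PHP 8.2 rejects.
function cachedKey(string $a): string {
    static $s = strtolower($a) . 'bca';

    return $s;
}

$firstCall  = cachedKey('XYZ');   // initializes $s
$secondCall = cachedKey('OTHER'); // $s is already set, the argument is ignored

echo $firstCall, ' ', $secondCall, "\n"; // xyzbca xyzbca
```

Note that the value depends on the argument of the first call only, which is exactly what the workaround with the default literal value was doing.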

Tighter usage of ++ on strings

The arcane and much discussed feature to increment strings in PHP is getting a bit of dusting in PHP 8.3.

<?php

function foo(string $s = 'a') {
    ++$s;
    print $s; // b, by default
}

?>

This behavior change is difficult to spot. Types and default values help quite a bit, as here we can see that $s is both a string and incremented.

Yet, it is difficult to understand if this is a feature or if the string actually contains an integer or a string which ends in ASCII characters, or another special format. So, it is reasonable to detect incremented strings, and leave the rest to the developer to figure out.

This behavior change allows for a wide-net rule, and requires human help to finish the job.
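When the developer confirms that the increment is intended to work on a string, PHP 8.3 offers the str_increment() and str_decrement() functions, which make that intent explicit. The sketch below compares them with the operator; it requires PHP 8.3, and the values are chosen for illustration only.

```php
<?php

$s = 'a';
++$s; // the historical operator, applied to a string

$explicit = str_increment('a');  // same result, explicit intent
$carried  = str_increment('Az'); // the increment carries over to the left
$back     = str_decrement('b');

echo $s, ' ', $explicit, ' ', $carried, ' ', $back, "\n"; // b b Ba a
```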

json_validate()

json_validate() is a new function. Until now, invalid JSON was tested by catching the related exception, or checking for error message availability. The validation was hidden in the process: without an error, it is validated; with an error, there is no dataset.

json_validate() is probably useless to any code that processes the decoded data, since it would be double work: validate first, then decode.

json_validate() is useful when the validation and the usage of the dataset are separated in the code. For example, the controller validates the JSON, and an entity decodes it and assigns the data further.

This means that we could see json_decode() being used, but the resulting data being ignored. That would emulate json_validate() with json_decode(), at the price of decoding the same text twice.

<?php

try {
    $json = json_decode($jsonText, flags: JSON_THROW_ON_ERROR);
} catch (JsonException $e) {
    print "There was an error\n";
}

// alternative

if (json_validate($jsonText)) {
    $json = json_decode($jsonText);
} else {
    print "There was an error\n";
}

?>

Another option would be to use a dedicated package, such as seld/jsonlint.

This feature may prove difficult to detect, as there are several ways to build a workaround, and the new feature is very similar to the old one.
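One shape of the workaround that a rule could look for is a json_decode() call whose result is discarded, followed by a check of json_last_error(). The sketch below wraps that pattern in a hypothetical helper, isValidJson(), which falls back to it when json_validate() is not available. It is an illustration of the pattern, not a complete polyfill.

```php
<?php

// Hypothetical helper, for illustration only
function isValidJson(string $jsonText): bool {
    if (function_exists('json_validate')) {
        return json_validate($jsonText); // PHP 8.3 and later
    }

    // Older versions: decode, ignore the result, and read the error status
    json_decode($jsonText);

    return json_last_error() === JSON_ERROR_NONE;
}

var_dump(isValidJson('{"a": 1}')); // bool(true)
var_dump(isValidJson('{"a": 1'));  // bool(false)
```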

Conclusion

We have reviewed several ways to convert a migration challenge into a static analysis pattern. In fact, each of them used a different technique to create a pattern:

  • typed class constant: rely on previous code discipline
  • variable class constant: rely on current syntax
  • override attribute: easy to extend to previous PHP versions
  • readonly properties and __clone(): rely on where the property is assigned
  • static variable init: spot possible workaround
  • string incrementation: cast a wide net, and let human finish
  • json_validate(): rely on current workaround

Features, whether new, changed or removed, never exist in a vacuum. They usually have a workaround, however cumbersome, that enables them in older versions. Once a feature is integrated in the PHP engine, such workarounds and disciplines tend to become obsolete, and shall be removed.

When analyzing source code, it is critical to identify those impacts so as to report them: as a bug to be removed, or as a suggestion for later.

Happy PHP Auditing!