PHP 7 migration audit

PHP 7 is probably ‘the easiest upgrade yet’. After checking that the old PHP 5.x code lints with PHP 7, the next challenge is to read the ‘backward incompatible changes’ and see whether they apply to the legacy code.

The list is not long, and a quick text search will lead us to many places in the code. For example, usort() has a new sorting behavior for equal values, so targeting usort() and uksort() is a good idea.

On the other hand, there is also room for a lot of false positives. For example, the foreach() changes differ between by-value and by-reference loops, and affect either the loop body or the code right after it. This is not trivial to check with a text search.

Note: I won’t detail what has changed, but focus on what can be done to pinpoint interesting issues in the code and actually find them.

Indirect variable

Some complex variable accesses have a new interpretation, for example $$foo['bar']['baz'] or $foo->$bar['baz']. Multi-dimensional arrays are OK, and so is property/method chaining. But mixed syntax with variable parts, such as variable variables or variable properties, needs some curly braces. Targeting $$ and ${$ is a good start.
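To make the two readings concrete, here is a small sketch that spells out both interpretations of $$foo['bar']['baz'] with explicit braces. The variable names and values are invented for the illustration; the point is only that the braced forms mean the same thing in PHP 5 and PHP 7.

```php
<?php
// Hypothetical data, for illustration only.
$foo    = ['bar' => ['baz' => 'target']];
$target = 'reached through the PHP 5 reading';
$name   = 'foo';

// PHP 5 read $$foo['bar']['baz'] as: resolve the array element first,
// then use the resulting string as a variable name.
$php5Reading = ${$foo['bar']['baz']};

// PHP 7 reads it strictly left to right: resolve the variable variable
// first, then index into the result.
$php7Reading = ${$name}['bar']['baz'];
```

Once every hit for $$ has been rewritten in one of these two forms, the code no longer depends on the PHP version.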

Global requires simple variables

The global keyword now requires simple variables, like $x. Anything more complex must be reviewed and removed, for example: global ${$a->b}, $$p, $$q[3], $$o->b;
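A text search for ‘global’ can be refined with a small script. The sketch below is one possible approach, not a complete tool: the function name findComplexGlobals() is made up for this article, and since it works on raw text, it will also report matches inside comments and strings.

```php
<?php
/**
 * Returns the global statements found in $source that list anything
 * other than plain variables such as $x.
 * Hypothetical helper, regex-based, so false positives are possible.
 */
function findComplexGlobals(string $source): array
{
    // Capture each "global ...;" statement, skipping names like $global.
    preg_match_all('/(?<![$\w])global\s+([^;]+);/i', $source, $matches, PREG_SET_ORDER);

    $suspects = [];
    foreach ($matches as $match) {
        // A simple list looks like "$a, $b, $c" and nothing else.
        if (!preg_match('/^\s*\$\w+(\s*,\s*\$\w+)*\s*$/', $match[1])) {
            $suspects[] = trim($match[0]);
        }
    }
    return $suspects;
}
```

Called on the content of a file (for example with file_get_contents()), it returns only the statements worth a manual review.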

Parentheses around variables or function calls

Parentheses may be used to hide some errors, just like the @ operator. That won’t be the case in PHP 7, and it is wrong in PHP 5 anyway. One solution is to rely on the PHP error log: crank it up to strict standards and search for ‘Strict Standards:’.

Alternatively, search for parentheses around call arguments and check whether they are useless:

<?php
f( (array_pop($d))) ;
echo ($d) ;
?>

Array elements or object properties that are automatically created by reference

Spot all array elements that are created by reference assignment (written =& or = &). The order in which they are created may change. Note that &= is the bitwise AND assignment, not a reference, although a text search for & and = will return both. As in the previous section, the logs can help: ‘PHP Notice: Undefined index:’ reports that an element is used before it has a value. Otherwise, search for reference assignments to an array element, such as $a['b'] = & $a['a']; or $a['b'] =& $a['a'];

list() will no longer assign variables in reverse order

Spot array appends within a list() call, such as list($a[], $a[]) = $something;
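Here is a short illustration of why appends are the pattern to look for, followed by one possible rewrite that does not depend on the assignment order. The variable names are invented for the example.

```php
<?php
$a = [];
list($a[], $a[], $a[]) = [1, 2, 3];
// PHP 7 assigns from left to right, so $a is [1, 2, 3].
// PHP 5 assigned from right to left, which produced [3, 2, 1].

// Order-independent rewrite: unpack into named variables first,
// then build the array explicitly.
list($first, $second, $third) = [1, 2, 3];
$b = [$first, $second, $third];
```

The second form gives the same result on both versions, at the cost of a few temporary variables.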

Empty list() assignments are no longer allowed

list() now requires at least one valid argument. It can’t be called without arguments, with only empty slots, or with only empty versions of itself. A list with just one element is legit. Search for list(), and refine with double commas, ‘(,’ or ‘,)’. Then review.
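The refinement can be automated to some extent. The following sketch looks for list() calls that contain nothing but commas; findEmptyLists() is a hypothetical helper written for this article, and as a plain regex it cannot tell code from comments or strings.

```php
<?php
/**
 * Returns every list() call in $source that has no variable at all,
 * such as list(), list(,) or list(, ,).
 * Hypothetical helper based on a regex, not on a parser.
 */
function findEmptyLists(string $source): array
{
    // The lookbehind skips $list(), ->list() and ::list().
    preg_match_all('/(?<![$\w>:])list\s*\(\s*(?:,\s*)*\)/i', $source, $matches);
    return $matches[0];
}
```

A call like list(, $b), which merely skips a slot, is not reported, since it is still valid in PHP 7.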

list() no longer supports unpacking strings

There is no easy way to find a string in the right operand of an assignment to list(), unless it is a literal. As in the previous section, search for list() and review the other side.
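When the review does find a string on the right side, str_split() is the usual replacement. A minimal illustration, with made-up variable names:

```php
<?php
$string = 'ab';

// PHP 7 no longer unpacks the string: both variables receive null.
list($a, $b) = $string;

// str_split() turns the string into an array of characters,
// which list() can unpack on every PHP version.
list($c, $d) = str_split($string);
```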

list() is now always guaranteed to work with ArrayAccess

Just like the previous, there is no easy way to refine beyond searching for list() itself and reviewing the right operand.

Iteration with foreach() no longer has any effect on the internal array pointer

This means that code placed after a foreach loop that breaks at some point, and then continues processing the array from there, is impacted. Search for foreach loops, and check whether they are followed by calls to current(), next() or prev() on the source array.

<?php
$a = [1,2,3,4] ;
foreach($a as $b) {
  if ($b == 2) { break ; }
}
$b = current($a) ; // 3 in PHP 5.6, 1 in PHP 7
?>

Search for foreach(), and review the code after.

When iterating arrays by-value, foreach will now always operate on a copy of the array

Any by-value (no reference) foreach that uses current(), prev() or next() on the source array inside the loop is impacted. Search for foreach() with a by-value (no &) value variable, then check the loop body.

<?php
$a = [1,2,3,4] ;
$c = 0 ;
// count half the array
foreach($a as $b) {
  next($a) ;
  $c++ ;
}
// in PHP 7, next() no longer affects the loop : $c is 4
?>

When iterating arrays by-reference, modifications to the array will continue to influence the iteration

Spot modifications of the source array inside the by-reference loop. Any modification to the source array has to be reviewed. This is especially true if something is appended to the source array, or if it is merged with another array. Without a condition, this is an infinite loop.

<?php
$a = [1];
foreach($a as &$b) { 
  $a[] = 3 ;
} // PHP 7 infinite loop
?>

Iteration of plain (non-Traversable) objects by-value or by-reference will behave like by-reference iteration of arrays.

This didn’t work before, so PHP 5 code either has a workaround for it or avoids doing it. No need to search.

It is no longer possible to define two function parameters with the same name.

Indeed, this happens. It is easy to spot the method definition, but harder to find the duplicate argument, since we need to find the names first. Spotting the arguments is difficult in itself because of constant scalar expressions in default values, as shown below. That requires a real parser to understand the various parts of the definition.

<?php
function f($a, $a = (2 + 3) * 4) {}
?>
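Short of a real parser, PHP’s own tokenizer already goes a long way. The sketch below uses token_get_all() to collect the variables found between the parentheses of each function signature; findDuplicateParameters() is a hypothetical helper, and it assumes that every function keyword is followed by a parameter list, so a ‘use function’ import would confuse it.

```php
<?php
/**
 * Returns the parameter names that appear more than once in a
 * function, method or closure signature found in $source.
 * Hypothetical helper; relies on the tokenizer extension.
 */
function findDuplicateParameters(string $source): array
{
    $tokens = token_get_all($source);
    $count = count($tokens);
    $duplicates = [];

    for ($i = 0; $i < $count; $i++) {
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_FUNCTION) {
            continue;
        }
        // Move to the opening parenthesis of the parameter list.
        while ($i < $count && $tokens[$i] !== '(') {
            $i++;
        }
        $depth = 0;
        $seen = [];
        for (; $i < $count; $i++) {
            $token = $tokens[$i];
            if ($token === '(') {
                $depth++;          // also counts parentheses in default values
            } elseif ($token === ')') {
                if (--$depth === 0) {
                    break;         // end of the parameter list
                }
            } elseif (is_array($token) && $token[0] === T_VARIABLE) {
                if (isset($seen[$token[1]])) {
                    $duplicates[] = $token[1];
                }
                $seen[$token[1]] = true;
            }
        }
    }
    return $duplicates;
}
```

Because the tokenizer only splits the text into tokens, it accepts the duplicated definition that PHP 7 itself refuses to compile.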

The func_get_arg() and func_get_args() functions will no longer return the original value

Spot functions that use func_get_arg() and func_get_args(). Then, check whether the arguments have been modified before the call. A func_get_arg() call as the first statement of the function is probably OK.
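The two situations look like this. The function names are invented for the example.

```php
<?php
// Needs review: the argument is modified before func_get_arg() is called.
function incrementAndInspect($x)
{
    $x++;
    return func_get_arg(0); // PHP 7 returns the modified value
}

// Probably OK: the value is captured before any modification.
function inspectFirst($x)
{
    $original = func_get_arg(0);
    $x++;
    return $original;
}
```

In PHP 7, incrementAndInspect(1) returns 2, where PHP 5 returned 1; inspectFirst(1) returns 1 on both.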

Exception backtraces no longer display the original value

Same as above. Search for calls to the non-static method ‘getTraceAsString’, made with -> and without arguments.

Invalid octal literals

Any number starting with 0 has to be checked, such as $x = 0783; Most of the time, octals are only used with mkdir and chmod, so you may spot them there, with classic values such as 0777, 0666 or 0755. Otherwise, a regex like [^"0-9']0[0-9]+ should return a short list of candidates to review.
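The search can be narrowed to literals that are actually invalid, meaning that they start with 0 and contain an 8 or a 9. This is a sketch with a hypothetical helper name, findInvalidOctals(); being a plain regex, it will also match such numbers inside strings and comments.

```php
<?php
/**
 * Returns the integer literals of $source that start with 0 and
 * contain the digit 8 or 9, which makes them invalid octals.
 * Hypothetical helper based on a regex, not on a parser.
 */
function findInvalidOctals(string $source): array
{
    // The lookbehind avoids matching the middle of 1080, 0.08 or $a08.
    preg_match_all('/(?<![\w.$])0[0-7]*[89][0-9]*\b/', $source, $matches);
    return $matches[0];
}
```

Valid octals such as 0755 are left alone, so the result list only contains values that PHP 7 rejects.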

Bitwise shifts by negative numbers

Shifts are done with the << and >> operators. Search for them, then review the right operand.

Large bitwise shifts

Same as above.
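During the review, it helps to know what PHP 7 does in both cases. A small illustration follows; the shift amounts are stored in variables on purpose, since that is how they usually appear in real code.

```php
<?php
// Negative shift: PHP 7 throws an ArithmeticError.
$bits = -1;
try {
    $result = 1 << $bits;
    $outcome = 'no error';
} catch (ArithmeticError $e) {
    $outcome = 'ArithmeticError';
}

// Shift by the integer width or more: the result no longer depends
// on the CPU. It is 0, or -1 when right-shifting a negative number.
$width = PHP_INT_SIZE * 8;
$left  = 1 << $width;
$right = -1 >> $width;
```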

Strings that contain hexadecimal numbers are no longer considered to be numeric

Hexadecimals look like 0x[0-9a-fA-F]+. They may be standalone, or they may be inside a string. When in a string, make sure they are at the beginning of the string, right after the opening delimiter (', ", <<<HEREDOC or <<<'NOWDOC'): anything deeper in a string is not interpreted by PHP.
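Once such a string is found, filter_var() is one way to keep the old behavior explicitly. A short illustration, with an arbitrary value:

```php
<?php
$hex = '0xffff';

// PHP 7 no longer treats this string as a number.
$isNumeric = is_numeric($hex);

// filter_var() with FILTER_FLAG_ALLOW_HEX both validates the string
// and converts it to an integer; it returns false on invalid input.
$value = filter_var($hex, FILTER_VALIDATE_INT, FILTER_FLAG_ALLOW_HEX);
```

Here $isNumeric is false and $value is 65535.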

Codepoint Escape Syntax

Lint the code, and it will report any problems with invalid ‘\u{’ in strings.

Removed support for static calls to non-static methods

Statically calling a non-static method is now forbidden. Searching for :: is probably going to yield a lot of results, and checking them manually is made difficult by class hierarchies and use statements. This is better left to the PHP logs, or to the exakat engine.

The yield language construct no longer requires parentheses

The precedence of yield has changed. It may now be used without parentheses within an expression, though it may need some parentheses to stay consistent with the syntax PHP 5 allowed. Search for yield and review anything that has more than one variable before the semi-colon (yield $y;).

$HTTP_RAW_POST_DATA is no longer available

Easy textual search.

Removed support for assigning the result of new by reference

$x = &new someClass(); or $x =& new someClass(); an & and a new in the same assignment is the target. PHP 5 emits a ‘Deprecated’ error and PHP 7 a ‘Parse error’, so linting is the solution here.

Removed support for /e

85 % of preg_* calls use a hard-coded regex or an in-place concatenation. Search for preg_replace, and see if the regex uses e as an option. Of the remaining 15 %, most use variables that are defined close by, so review them.

Finally, this will generate a PHP Warning : PHP Warning: preg_replace(): The /e modifier is no longer supported, that you can check in the logs.
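The usual replacement for /e is preg_replace_callback(). Here is a minimal before/after illustration; the regex and the input are invented for the example.

```php
<?php
$input = 'hello world';

// PHP 5 style, no longer supported in PHP 7:
//   preg_replace('/\b(\w)/e', 'strtoupper("$1")', $input);

// Same replacement, with the evaluated code moved into a callback.
$output = preg_replace_callback(
    '/\b(\w)/',
    function (array $match) {
        return strtoupper($match[1]);
    },
    $input
);
```

$output is ‘Hello World’, and the replacement code is now visible to linters and code search.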

Removed string category support in setlocale()

Check that setlocale() no longer receives a string (starting with ' or ") as first argument. Searching for setlocale(" or setlocale(' should yield some results.
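The same search can be scripted. The sketch below returns the category names that are passed as strings, so that they can be replaced with the matching constants (LC_ALL instead of 'LC_ALL'); findStringLocaleCategories() is a hypothetical helper, and it only sees literal strings, not variables.

```php
<?php
/**
 * Returns the category names passed as quoted strings in the first
 * argument of setlocale() calls found in $source.
 * Hypothetical helper based on a regex, not on a parser.
 */
function findStringLocaleCategories(string $source): array
{
    preg_match_all(
        '/(?<![$\w>:])setlocale\s*\(\s*([\'"])(\w+)\1/i',
        $source,
        $matches
    );
    return $matches[2];
}
```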

usort does not return the same result

When using a custom function for sorting, the order of the elements that are equal may have changed. See the example at https://bugs.php.net/bug.php?id=69158&edit=3.

Search for ‘usort’ and ‘uksort’, and then review the associated callback function. If the callback never returns 0, that is, if it never reports a tie, then it is safe. Otherwise, the order of those values may change.
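One way to make a callback safe is to add a tie-breaker, so that it only returns 0 for truly identical entries. A minimal sketch, with invented data:

```php
<?php
// Hypothetical data: two rows share the same score.
$rows = [
    ['name' => 'b', 'score' => 10],
    ['name' => 'a', 'score' => 10],
    ['name' => 'c', 'score' => 5],
];

usort($rows, function (array $x, array $y) {
    if ($x['score'] !== $y['score']) {
        return $x['score'] < $y['score'] ? -1 : 1;
    }
    // Tie on the score: fall back on the name, so the result does
    // not depend on how the sort handles equal elements.
    return strcmp($x['name'], $y['name']);
});

$names = array_column($rows, 'name');
```

$names is ['c', 'a', 'b'] on every PHP version, because the callback never leaves two different rows tied.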

Wrap up

Textual search is a good first step. It provides quick results and also brings attention to parts of the code that may be in need of spring cleaning. The large number of false positives will slow the process, though.

Often, textual search is blind to the context of the code, and misses the actual semantic value of PHP tokens. The exakat engine version 0.3.8 (coming up) covers more than 50% of the previous issues, and, of course, new or removed functions, classes, keywords and interfaces.

Done To do Not 
Indirect variable
The global keyword
Parentheses around variables or function calls
Array elements or object properties that are automatically created by reference
Empty list() assignments are no longer allowed
list() will no longer assign variables in reverse order
list() no longer supports unpacking strings
list() is now always guaranteed to work with ArrayAccess
Iteration with foreach() no longer has any effect on the internal array pointer
When iterating arrays by-value, foreach will now always operate on a copy of the array
Iteration of plain (non-Traversable) objects by-value or by-reference will behave like by-reference iteration of arrays.
It is no longer possible to define two function parameters with the same name.
The func_get_arg() and func_get_args() functions will no longer return the original value
Exception backtraces no longer display the original value
Invalid octal literals
Bitwise shifts by negative numbers
Left bitwise shifts by a number of bits beyond
Strings that contain hexadecimal numbers
Codepoint Escape Syntax
The yield language construct no longer requires parentheses
$HTTP_RAW_POST_DATA is no longer available
Removed support for assigning the result of new by reference
Removed support for /e
Removed string category support in setlocale()
usort does not return the same result
New Functions
Removed Functions
New Classes
Removed Classes
New Constants
Totals: 30 rules, of which 18 done, 10 to do, 2 not.