PHP 7 static analysis tools

With PHP 7 officially live, it is time to review code and get it ready for migration, to take advantage of the new features and the reduced server load. This means reviewing all the code, which may be too much to do manually.

This is where static analysis is an important help. It is a type of software that reads code without executing it, and searches for patterns that lead to issues. And you may rejoice: we found no fewer than three open source static analysis tools for PHP 7.

PHP 7 introduces several features that are beneficial to static analysis. The main one is the internal AST, or Abstract Syntax Tree. This is an organized representation of PHP code, halfway between tokens and execution. At that point, PHP tokens are fully organized as a tree, which only needs to be indexed and traversed to find bug patterns.
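
To make the "tokens" stage concrete, here is a minimal sketch that lists the tokens of a one-line script with PHP's built-in tokenizer. It does not use ext/ast, and the function name list_tokens() is hypothetical: it only shows the flat sequence that an AST later organizes as a tree.

```php
<?php
// Return the non-whitespace tokens of a PHP snippet, as "NAME text" strings.
// Single-character tokens such as "=" or ";" are returned as-is.
function list_tokens(string $code): array
{
    $tokens = [];
    foreach (token_get_all($code) as $token) {
        if (is_array($token)) {
            if ($token[0] === T_WHITESPACE) {
                continue; // whitespace carries no meaning for this listing
            }
            $tokens[] = token_name($token[0]) . ' ' . trim($token[1]);
        } else {
            $tokens[] = $token;
        }
    }
    return $tokens;
}

print_r(list_tokens('<?php $a = 1 + 2;'));
```

The result is a flat list: nothing in it says that the addition belongs to the right-hand side of the assignment. That relationship is exactly what the tree adds.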

The three open source static analyzers for PHP are Phan, Exakat and Tuli. Let's install them, and run them on a medium-sized project of 1000 PHP scripts.

Phan

Phan is the experimental PHP static analyzer by Rasmus Lerdorf. It was started in July 2015 to begin collecting contributions, and it has grown tremendously since. It relies on PHP 7’s AST extension to decipher PHP code. Then, in two passes, Phan scans the code, builds several hashes of features, and spots interesting pieces that should be reviewed.

Installation

Phan requires PHP 7 and ext/ast, by Nikita Popov. ext/ast is not included in the standard distribution, nor on PECL, but it is available on GitHub. Phan also uses Composer for extra components.

git clone https://github.com/etsy/phan.git
cd phan
composer install

Running

Phan works on standalone files or on a provided list of files. It works on PHP 5 and 7 code. Phan supports phpdoc comments, so ‘@return string’ may be used to type hint a return value in PHP 5 code, and Phan reports potential violations. It spots undefined classes, wrong parameters, or code affected by the Uniform Variable Syntax.

php phan file.php
php phan -f filelist.txt

# Other help
php phan -h
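
As an illustration of the phpdoc support mentioned above, here is a minimal sketch of PHP 5 style code where the ‘@return string’ annotation and the actual return values disagree. The function get_label() is hypothetical, and whether Phan reports this exact case is an assumption: it only shows the kind of mismatch such a check targets.

```php
<?php
/**
 * Build a display label for an item.
 *
 * @param int $id
 * @return string
 */
function get_label($id)
{
    if ($id === 0) {
        return 0; // an int, although the phpdoc promises a string
    }
    return 'item-' . $id;
}

var_dump(get_label(0));
var_dump(get_label(3));
```

PHP itself runs this code without complaint, since phpdoc comments are never enforced at run time. Only a tool that reads the annotation can notice the first return.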

Results

Phan results are plain text, with file, line and error all on one line. Phan reported 3080 issues: undefined classes, wrong types for arguments, etc. A large share of them were undefined classes.

Tasks/CleanDb.php:124 UndefError call to method on undeclared class Exception
Tasks/Doctor.php:54 VarError Variable $stats is not defined
Tasks/Initproject.php:45 UndefError call to undeclared method \Tasks\initproject->scheck_project_dir()
Tasks/Jobqueue.php:72 TypeError arg#2(mode) is bool but \stream_set_blocking() takes int
Tasks/Jobqueue.php:77 TypeError arg#2(mode) is bool but \stream_set_blocking() takes int
Tasks/Jobqueue.php:105 TypeError arg#2(mode) is bool but \stream_set_blocking() takes int
Tasks/Load.php:151 VarError Variable $x is not defined
Tasks/Load.php:270 VarError Variable $T is not defined
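
With several thousand lines of output, a quick tally per issue type helps decide where to start. The following is a minimal sketch, assuming every line follows the "file:line Type message" layout shown above. The function count_issue_types() is hypothetical and is not part of Phan.

```php
<?php
// Count issues per type, assuming the "file:line Type message" layout.
function count_issue_types(array $lines): array
{
    $counts = [];
    foreach ($lines as $line) {
        if (preg_match('/^(\S+):(\d+) (\w+) (.+)$/', trim($line), $m) !== 1) {
            continue; // skip lines that do not follow the assumed layout
        }
        $type = $m[3];
        $counts[$type] = ($counts[$type] ?? 0) + 1;
    }
    arsort($counts); // most frequent types first
    return $counts;
}

$report = [
    'Tasks/CleanDb.php:124 UndefError call to method on undeclared class Exception',
    'Tasks/Doctor.php:54 VarError Variable $stats is not defined',
    'Tasks/Jobqueue.php:72 TypeError arg#2(mode) is bool but \stream_set_blocking() takes int',
    'Tasks/Load.php:151 VarError Variable $x is not defined',
];
print_r(count_issue_types($report));
```

In practice, the lines would come from a saved report, for example with file('phan.txt'), where phan.txt is a hypothetical file holding the redirected output.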

Phan runs fast: analyzing the 1000 files took 4 seconds. It ran 16 analyses, which are available in the src/Phan/Analyze folder of the source.

Finally

Phan is at an early stage. The GitHub project is the right place to start for providing feedback or contributing. Documentation is scarce: the unit test folder is a great resource for finding information.

Exakat

Exakat is the engine made to enforce clearPHP’s coding reference. It aims at auditing PHP code and providing a complete report with explanations, file and line of code. Exakat was started in 2014. It is based on a home-made AST, compatible with both PHP 5 and PHP 7. It runs in two phases: the first loads the tokens into the database, then builds the AST and interesting hashes. The second runs the actual analyses.

Installation

The code itself is available on GitHub, or as a phar on exakat.io. Exakat requires a Neo4j graph database with the Gremlin plugin. Exakat runs on a standard PHP 5 or 7.

  • Download the phar from http://178.62.231.40/download-exakat/
  • Install Neo4j and Gremlin by following the manual: https://github.com/exakat/exakat/blob/master/docs/manual.md
  • Check the installation: php exakat.phar doctor

Running

Exakat scans a whole project: it only requires the root folder, and finds all the PHP scripts in it.

php exakat.phar init -p myproject -R https://github.com/exakat/exakat/
php exakat.phar project -p myproject

Exakat ran for about 22 minutes over the 1000 files used previously. A total of 502 analyses were run on the code. The first results were available after 5 minutes, though, and were completed while the engine was still running.

The final report is available as text, XML or HTML.

Results

Exakat reported 4956 issues, with 64 analyses reporting at least one finding. Analyses are grouped in several recipes: Security, Compatibility PHP 7, Performance, Dead code, New Features, etc.

scripts/merge_sqlite.php:46 Double Assignation
scripts/merge_sqlite.php:46 Static Methods Called From Object
scripts/merge_sqlite.php:40 Echo With Concat
scripts/merge_sqlite.php:33 Echo With Concat
scripts/merge_sqlite.php:24 Uses default values
scripts/merge_sqlite.php:24 No Hardcoded Path
scripts/merge_sqlite.php:30 Static Methods Called From Object
scripts/merge_sqlite.php:32 Queries in loops
scripts/merge_sqlite.php:48 Static Methods Called From Object
scripts/merge_sqlite.php:51 Static Methods Called From Object
scripts/merge_sqlite.php:57 Queries in loops
scripts/merge_sqlite.php:61 Static Methods Called From Object
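
To give an idea of what two of these findings refer to, here is a minimal sketch with a hypothetical Report class. It reflects our reading of the analysis names "Static Methods Called From Object" and "Echo With Concat", not Exakat's own documentation, so treat the comments as assumptions.

```php
<?php
class Report
{
    public static function title(): string
    {
        return 'Merge report';
    }
}

$report = new Report();

// Static method called through an object: legal PHP, but it hides
// the fact that the method does not depend on the instance.
$viaObject = $report->title();

// The same method, called statically.
$viaClass = Report::title();

// Echo with concatenation: the whole string is built first, then printed.
echo 'Title: ' . $viaObject . "\n";

// Echo with commas: each argument is printed in turn, no temporary string.
echo 'Title: ', $viaClass, "\n";
```

Both calls and both echo statements produce the same output. These findings are about style and small performance gains rather than bugs.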

Finally

Exakat is updated weekly, and keeps adding more analyses and speed improvements for large-scale projects. Documentation is lagging, and feedback is welcome on the GitHub project.

Tuli

Tuli is the brainchild of Anthony Ferrara. It is a prototype that offers a solid platform to start writing static analyses. It was started in late 2015.

Installation

Clone Tuli from GitHub and run Composer, or install it directly with Composer. The AST is built using Nikita Popov’s PHP-Parser. This one is made for PHP 5, and is replaced by ext/ast in PHP 7. There is no other special dependency.

git clone https://github.com/ircmaxell/Tuli
cd Tuli
composer install

Running

Tuli is quite fast, taking 33 seconds for the 1000 files, though several files had to be removed from the original list for it to run to the end (and the recursion limit from xdebug had to be lifted).

Tuli ran 3 analyses, which are detailed in the code. The folder that holds them is also the place to write new rules.

The analyzer reported 266 issues.

Type mismatch on str_replace() argument 2, found unknown expecting string|array Analyzer/Docs.php:172
Type mismatch on str_replace() argument 2, found unknown expecting string|array Analyzer/Functions/IsExtFunction.php:36
Type mismatch on str_replace() argument 2, found unknown expecting string|array Analyzer/Interfaces/IsExtInterface.php:35
Type mismatch on strtolower() argument 0, found unknown expecting string Analyzer/Analyzer.php:1989
Type mismatch on strtolower() argument 0, found unknown expecting string Analyzer/Analyzer.php:2002
Type mismatch on strtolower() argument 0, found null|bool|int|float|string|object|callable expecting string Analyzer/Analyzer.php:2006
Type mismatch on strtolower() argument 0, found unknown expecting string Analyzer/Classes/toStringPss.php:35
Type mismatch on substr() argument 0, found null|bool|int|float|string|object|array|callable expecting string Analyzer/Analyzer.php:127
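
Most of these reports say "found unknown": the analyzer cannot tell what type reaches the function. Here is a minimal sketch of such a situation, with a hypothetical normalize_key() function and an explicit check added. Whether Tuli stops reporting once the check is in place has not been verified, so treat this as an illustration of the issue rather than of the tool's behavior.

```php
<?php
// $value may come from decoded data or user input,
// so nothing in the code states its type.
function normalize_key($value): string
{
    if (!is_string($value)) {
        return ''; // refuse anything strtolower() is not meant to receive
    }
    return strtolower($value);
}

var_dump(normalize_key('Analyzer'));
var_dump(normalize_key(null));
```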

Finally

Tuli is definitely a working prototype. It already provides some interesting feedback and is open to contributions. It relies on a control flow graph for PHP, which is also a side project from Anthony (https://github.com/ircmaxell/php-cfg).

Conclusion

PHP 7 lays the ground for more static analysis, especially with return and scalar type hints. Still, all three PHP 7 static analyzers happily work on PHP 5 code. Indeed, they all provide feedback to prepare code for PHP 7. That is a good reason to start using them today, and to move to PHP 7 as soon as possible.
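
As a reminder of what return and scalar type hints look like, here is a minimal sketch with a hypothetical total_lines() function. The types are written in the signature itself, so both PHP and an analyzer can read them without relying on comments.

```php
<?php
// Scalar parameter types and a return type, both new in PHP 7.
function total_lines(int $files, int $linesPerFile): int
{
    return $files * $linesPerFile;
}

echo total_lines(1000, 120), "\n";

try {
    // A non-numeric string cannot be converted to int, so PHP 7 throws.
    total_lines('many', 120);
} catch (TypeError $e) {
    echo 'TypeError caught', "\n";
}
```

PHP only raises the error when the faulty call is executed. A static analyzer can point at the same call without running the code.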

If we’ve missed another static analyzer, please tell us about it, and we’ll add it. We also know there are some commercial analyzers, which we may review later.