PHP 7 Static analysis tools
With PHP 7 officially live, it is time to review code and get it ready for migration, taking advantage of the new features and the reduced server load. This means reviewing all the code, which may be too much to do manually.
This is where static analysis is an important help. It is a type of software that reads code without executing it, and searches for patterns that lead to issues. And you may rejoice: we found no fewer than three open source PHP 7 static analysis tools.
PHP 7 introduces several features that are beneficial to static analysis. The main one is the internal AST, or Abstract Syntax Tree. This is an organized representation of PHP code, sitting between tokens and execution. At that point, PHP tokens are fully organized as a tree, which only needs to be indexed and traversed to find bug patterns.
There are three open source static analyzers for PHP: phan, exakat and tuli. Let's install them, and run them on a medium-sized project of 1000 PHP scripts.
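To make the "between tokens and execution" idea more concrete, here is a minimal sketch of the stage before the AST: the flat token stream, as returned by PHP's built-in tokenizer. The helper name tokenNames() is hypothetical and only serves this illustration. The AST itself is not shown here; it would come from an AST extension or library, and arranges these same tokens into nested nodes.

```php
<?php
// Minimal sketch: list the token stream of a tiny snippet.
// tokenNames() is a hypothetical helper, not part of any of the tools.
function tokenNames(string $code): array
{
    $names = [];
    foreach (token_get_all($code) as $token) {
        // A token is either an array [id, text, line] or a single-character string.
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }
    return $names;
}

// Single quotes keep $a and $b as literal text instead of interpolating them.
print_r(tokenNames('<?php $a = strtolower($b);'));
```

The result is a flat list: nothing in it says that $b is an argument of strtolower(), or that the call is the right-hand side of the assignment. That structure is what the tree adds, and what an analyzer traverses.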
Phan is the experimental PHP static analyzer by Rasmus Lerdorf. It was started in July 2015 to begin collecting contributions, and has grown tremendously since. It relies on PHP 7’s AST extension to decipher PHP code. Then, using two passes, phan scans the code, builds several hashes of features, and spots interesting pieces that should be reviewed.
Phan requires PHP 7 and ext/ast by Nikita Popov. Ext/ast is not included in the standard distribution, nor on PECL, but it is available on GitHub. Phan also uses composer for extra components.
git clone https://github.com/etsy/phan.git
cd phan
composer install
Phan works on standalone files or on a provided list of files. It works on PHP 5 and 7 code. Phan supports phpdoc comments, so ‘@return string’ may be used to type hint return values in PHP 5, and phan reports potential violations. It spots undefined classes, wrong parameters, and issues related to Uniform Variable Syntax.
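As an illustration of the phpdoc support, here is a small PHP 5 compatible function whose docblock promises a string while one code path returns a boolean. The function name userLabel() is made up for this example, and the exact wording of the resulting report is not reproduced here; this is only the kind of mismatch a phpdoc-aware analyzer can look for.

```php
<?php
/**
 * Build a display label for a user.
 * userLabel() is a hypothetical function, used only for this example.
 *
 * @param int $id
 * @return string
 */
function userLabel($id)
{
    if ($id <= 0) {
        // Contradicts the declared "@return string".
        return false;
    }
    return 'user-' . $id;
}

var_dump(userLabel(3));
var_dump(userLabel(0));
```

PHP itself runs this code without complaint, since the annotation lives in a comment. Only a tool that reads the docblock can notice the contradiction.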
php phan file.php
php phan -f filelist.txt
// Other help
php phan -h
Phan results are in text, with file, line and error, all on one line. Phan reported 3080 issues: undefined classes, wrong types for arguments, etc. A large number of them were undefined classes.
Tasks/CleanDb.php:124 UndefError call to method on undeclared class Exception
Tasks/Doctor.php:54 VarError Variable $stats is not defined
Tasks/Initproject.php:45 UndefError call to undeclared method \Tasks\initproject->scheck_project_dir()
Tasks/Jobqueue.php:72 TypeError arg#2(mode) is bool but \stream_set_blocking() takes int
Tasks/Jobqueue.php:77 TypeError arg#2(mode) is bool but \stream_set_blocking() takes int
Tasks/Jobqueue.php:105 TypeError arg#2(mode) is bool but \stream_set_blocking() takes int
Tasks/Load.php:151 VarError Variable $x is not defined
Tasks/Load.php:270 VarError Variable $T is not defined
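With several thousand lines of this kind, a quick tally per category helps decide where to start. The sketch below assumes each report line has the shape "file:line Category message", as in the sample above; tallyByCategory() is a hypothetical helper, not something shipped with phan.

```php
<?php
// Hypothetical helper: count report lines per error category.
// Assumes the "file:line Category message" shape seen in the sample.
function tallyByCategory(array $lines): array
{
    $counts = [];
    foreach ($lines as $line) {
        if (preg_match('/^\S+:\d+\s+(\w+)\s/', $line, $m)) {
            $counts[$m[1]] = ($counts[$m[1]] ?? 0) + 1;
        }
    }
    arsort($counts); // most frequent category first
    return $counts;
}

// Single-quoted strings keep $stats and $x as literal text.
$sample = [
    'Tasks/CleanDb.php:124 UndefError call to method on undeclared class Exception',
    'Tasks/Doctor.php:54 VarError Variable $stats is not defined',
    'Tasks/Jobqueue.php:72 TypeError arg#2(mode) is bool but \stream_set_blocking() takes int',
    'Tasks/Load.php:151 VarError Variable $x is not defined',
];
print_r(tallyByCategory($sample));
```

Lines that do not match the expected shape are silently skipped, which keeps the helper tolerant of blank lines or headers in a saved report.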
Phan runs fast: analyzing 1000 files took 4 seconds. It ran 16 analyses, which are available in the src/Phan/Analyze folder of the source.
Phan is at an early stage. The GitHub project is a good place to start for providing feedback or contributing. Documentation is scarce: the unit test folder is a great resource for finding information.
Exakat is the engine made to enforce clearPHP’s coding reference. It aims at auditing PHP code and providing a complete report with explanations, file and line of code. Exakat was started in 2014. It is based on a home-made AST, compatible with PHP 5 and PHP 7. It runs in two phases: the first loads the tokens into the database, and builds the AST and interesting hashes. The second runs the actual analysis.
- Download the phar from http://18.104.22.168/download-exakat/
- Install neo4j and gremlin from the manual : https://github.com/exakat/exakat/blob/master/docs/manual.md
- Check the install: php exakat.phar doctor
Exakat scans a whole project: it only requires the root folder, and finds all PHP scripts in it.
php exakat.phar init -p myproject -R https://github.com/exakat/exakat/
php exakat.phar project -p myproject
Exakat ran for about 22 minutes over the 1000 files used previously. A total of 502 analyses were run on the code. The first results were available after 5 minutes, though, and the rest were added while the engine was still running.
The final report is available as text, XML or HTML.
Exakat reported 4956 issues, with 64 analyses reporting at least one finding. Analyses are grouped in several recipes: Security, Compatibility PHP 7, Performance, Dead code, New Features, etc.
scripts/merge_sqlite.php:46 Double Assignation
scripts/merge_sqlite.php:46 Static Methods Called From Object
scripts/merge_sqlite.php:40 Echo With Concat
scripts/merge_sqlite.php:33 Echo With Concat
scripts/merge_sqlite.php:24 Uses default values
scripts/merge_sqlite.php:24 No Hardcoded Path
scripts/merge_sqlite.php:30 Static Methods Called From Object
scripts/merge_sqlite.php:32 Queries in loops
scripts/merge_sqlite.php:48 Static Methods Called From Object
scripts/merge_sqlite.php:51 Static Methods Called From Object
scripts/merge_sqlite.php:57 Queries in loops
scripts/merge_sqlite.php:61 Static Methods Called From Object
Exakat is updated weekly, and keeps adding more analyses and speed improvements for large-scale projects. Documentation is lagging, and feedback is welcome on the GitHub project.
Tuli is the brainchild of Anthony Ferrara. It is a prototype that offers a solid platform to start writing static analyses. It was started in late 2015.
Clone Tuli from GitHub (and run composer), or install it directly with composer. The AST is built using Nikita Popov’s PHP-Parser. This one was made for PHP 5; on PHP 7, ext/ast takes over that role. No other special dependency is needed.
git clone https://github.com/ircmaxell/Tuli
cd Tuli
composer install
Tuli is quite fast, taking 33 seconds for the 1000 files, though several files had to be removed from the original list for it to run to the end (and the recursion limit from xdebug had to be lifted).
Tuli ran 3 analyses, which are detailed in the code. The folder that holds them is also where new rules should be written.
The analyzer reported 266 issues.
Type mismatch on str_replace() argument 2, found unknown expecting string|array Analyzer/Docs.php:172
Type mismatch on str_replace() argument 2, found unknown expecting string|array Analyzer/Functions/IsExtFunction.php:36
Type mismatch on str_replace() argument 2, found unknown expecting string|array Analyzer/Interfaces/IsExtInterface.php:35
Type mismatch on strtolower() argument 0, found unknown expecting string Analyzer/Analyzer.php:1989
Type mismatch on strtolower() argument 0, found unknown expecting string Analyzer/Analyzer.php:2002
Type mismatch on strtolower() argument 0, found null|bool|int|float|string|object|callable expecting string Analyzer/Analyzer.php:2006
Type mismatch on strtolower() argument 0, found unknown expecting string Analyzer/Classes/toStringPss.php:35
Type mismatch on substr() argument 0, found null|bool|int|float|string|object|array|callable expecting string Analyzer/Analyzer.php:127
Tuli is definitely a working prototype. It already provides some interesting feedback and is open to contributions. It uses a control flow graph for PHP, which is also a side project from Anthony (https://github.com/ircmaxell/php-cfg).
PHP 7 lays the groundwork for more static analysis, especially with return and scalar type hints. Still, all three PHP 7 static analyzers happily work on PHP 5 code. Indeed, they all provide feedback to prepare code for PHP 7. That is a good reason to start using them today, and to move to PHP 7 as soon as possible.
If we’ve missed another static analyzer, please tell us about it, and we’ll add it. We also know there are some commercial analyzers, which we may review later.
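For readers who have not yet used them, here is a minimal sketch of what return and scalar type hints look like in PHP 7. The function ratio() is invented for this example. The declared types are part of the code rather than of a comment, so an analyzer can rely on them, and PHP itself rejects a clearly wrong argument at run time.

```php
<?php
// Hypothetical example of PHP 7 scalar and return type hints.
function ratio(int $done, int $total): float
{
    return $total === 0 ? 0.0 : $done / $total;
}

var_dump(ratio(1, 4));

try {
    // A non-numeric string cannot be converted to int.
    ratio('many', 10);
} catch (TypeError $e) {
    echo 'Rejected: ', $e->getMessage(), PHP_EOL;
}
```

Compared with a phpdoc annotation, the same information is now enforced by the engine, which leaves less guessing for static analysis tools.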