As PHP evolves
With the upcoming PHP 7.3, the question of the next migration is back on the table. We’ll hear a strange mix of pleas to move to the new versions for features and security, and constant warnings that some old versions will soon be unmaintained, or worse. Indeed, PHP versions do have an impact on code bases.
Open source projects are particular witnesses of that evolution. They usually have to support a wide range of versions and, at the same time, are actively encouraged to use the newest features. It is a delicate balance between compatibility and progress.
To observe the way open source projects handle the evolution of PHP, we audited 1977 open source PHP projects by linting them with PHP versions from 5.5 to 7.3. This taught us about the impact of PHP versions on code over a long period of coding. Let’s review the results.
First, we’ll introduce PHP lint. PHP lint is one of the older static analysis tools: it reads a text file, and checks its spelling and some of its syntax, without running it.
The check is extremely fast, as PHP also uses it when executing code: it is the first phase of executing a script. Basically, linting makes sure that the file may be parsed into tokens, and that quite a lot of sanity rules apply. Those are light checks, as more validation of the code will happen at execution time. Code may lint and yet, sometimes, fail to execute.
PHP lint is a command line call. Open a terminal, and point PHP at a file:
php -l test.php
No syntax errors detected in test.php
Linting every day is just as important as flossing. It’s actually vital to only push code that compiles to the repository: it does save a lot of ridicule. At the very least, files must compile in one of the PHP versions that your application must support. More is better, though.
Setting up hooks to check PHP code at commit time has a positive impact on your application: no more surprising uncompilable PHP code, which is always sneaky enough to reach production. No more one-minute fixes that end up in a catastrophe.
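As an illustration of such a hook, here is a minimal sketch in Python. It assumes that both `git` and a `php` binary are on the PATH. The function names (`staged_php_files`, `lint`, `main`) and the `run` parameter are hypothetical, introduced only for this example. Note that it lints the file as it sits in the working tree, which may differ from the staged content.

```python
import subprocess
import sys


def staged_php_files(run=subprocess.run):
    """Return the staged (added, copied or modified) files ending in .php."""
    result = run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [name for name in result.stdout.splitlines() if name.endswith(".php")]


def lint(path, php="php", run=subprocess.run):
    """Run `php -l` on one file and return (ok, output)."""
    result = run([php, "-l", path], capture_output=True, text=True)
    # php -l exits with a non-zero status when the file does not parse.
    return result.returncode == 0, (result.stdout + result.stderr).strip()


def main(run=subprocess.run):
    """Return 1 (abort the commit) if any staged PHP file fails to lint."""
    failures = []
    for path in staged_php_files(run=run):
        ok, output = lint(path, run=run)
        if not ok:
            failures.append(output)
    for message in failures:
        print(message, file=sys.stderr)
    return 1 if failures else 0
```

To use it as a hook, the script would be saved as an executable `.git/hooks/pre-commit` file with a Python shebang line, ending with `raise SystemExit(main())`. The `run` parameter is only there so the logic can be exercised without git or PHP installed.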
53% of OSS projects lint from 5.5 to 7.3
We checked 1977 OSS PHP projects, and found that 53% compile consistently across PHP versions from 5.5 to 7.3. The maximum error reporting level was used, so every notice and warning was reported. And yet, most of the code never yielded any error.
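For readers who want to build a similar profile for their own project, here is a rough sketch. It is not the tooling used for this survey. The binary names in `PHP_BINARIES` are hypothetical and depend on how several PHP versions are installed on your machine, and the `-d error_reporting=-1` setting is one assumed way of asking for the maximum reporting level.

```python
import subprocess
from pathlib import Path

# Hypothetical binary names: adapt them to your own installation.
PHP_BINARIES = {
    "5.5": "php5.5", "5.6": "php5.6", "7.0": "php7.0",
    "7.1": "php7.1", "7.2": "php7.2", "7.3": "php7.3",
}


def lint_file(php, path, run=subprocess.run):
    """Return True when `php -l` accepts the file."""
    result = run(
        [php, "-d", "error_reporting=-1", "-d", "display_errors=1", "-l", str(path)],
        capture_output=True, text=True,
    )
    # Only the exit status is checked here: warnings that are merely
    # displayed do not count as a lint failure in this sketch.
    return result.returncode == 0


def lint_profile(root, binaries=PHP_BINARIES, run=subprocess.run):
    """Return {version: number of .php files under root that fail to lint}."""
    files = sorted(Path(root).rglob("*.php"))
    return {
        version: sum(not lint_file(php, path, run) for path in files)
        for version, php in binaries.items()
    }
```

Calling `lint_profile("path/to/project")` yields one count of failing files per PHP version, which is the raw material for the profiles discussed in the rest of this article.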
3.5% never lint
Never-linting code is the opposite of always-linting code. Some code repositories actually hold files that never compile, in any PHP version. Since they are part of a functional application, the only hope is that they are never executed. This also implies that they are never tested. Sadly, this happens. We found that 3.5% of projects hold such files.
Most of them have 1 unlintable file, but they may have more: up to 21. Although 21 is the record, there are also applications with 7 or even 13 files that never compile. That’s very weird.
As usual, there are some valid reasons for allowing files that are not compilable in a repository. They are rare, but they do happen, obviously.
The most common reason is PHP code generation: when the original code has to generate some PHP code, it does so with the only tool that PHP has to produce code: templates. A template is a text file with placeholders that are filled with values at execution time. Most templates are not executed per se, but will be processed before being executed by PHP. They end up looking like PHP files, but they are not really PHP files.
Young and old projects
Besides the always-linting and never-linting projects, there is another breed of applications: the young and the old ones. Let me show you:
The one on the right is a young application. The one on the left is an old one.
All applications are started with the current PHP version or, sometimes, the upcoming one. For example, if you had to start a PHP application from scratch today, betting on PHP 7.3 would be a good idea. Had we started 2 years ago, we would have bet on 7.0. By the time a prototype is actually running, your code will be fit for the best PHP version ever.
When starting a new application, little consideration is given to backward compatibility. No one wants to track the new PHP features that are used in their code, nor set aside a great new feature just for the sake of backward compatibility. Indeed, this is the main reason to start from scratch. So, a new project starts with a new version of PHP, and keeps forward compatibility.
So, young projects are usually 100% compatible with one version and all the following versions. They are usually less and less lintable with older versions, as more and more incompatibilities add up.
On the other hand, old projects have the opposite trajectory. They may have started with the best PHP version of their day but, at some point in history, they stopped upgrading the code. The source is now vulnerable to backward incompatibilities, and keeps accumulating more of them with each version. So, an old project starts with perfect compatibility, which then degrades as the PHP version goes up.
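The definitions above can be turned into a small classifier. The sketch below is a simplification and not the method used for the survey: it assumes each project is reduced to one boolean per PHP version (does the whole project lint cleanly?), and that the versions tested are the six listed in `VERSIONS`. The names `VERSIONS` and `classify` are hypothetical.

```python
from itertools import groupby

# Assumed list of tested versions, from oldest to newest.
VERSIONS = ["5.5", "5.6", "7.0", "7.1", "7.2", "7.3"]


def classify(lints_cleanly):
    """Classify a project from a list of booleans, one per entry in VERSIONS.

    Returns (category, version): the version of birth for a young project,
    the version of death for an old one, None otherwise.
    """
    if len(lints_cleanly) != len(VERSIONS):
        raise ValueError("expected one result per PHP version")
    # Collapse consecutive identical results: [F, F, T, T, T, T] -> [F, T]
    runs = [value for value, _ in groupby(lints_cleanly)]
    if runs == [True]:
        return ("always", None)
    if runs == [False]:
        return ("never", None)
    if runs == [False, True]:
        # Young: fails on older versions, then lints on every later one.
        return ("young", VERSIONS[lints_cleanly.index(True)])
    if runs == [True, False]:
        # Old: lints up to some version, then fails on every later one.
        return ("old", VERSIONS[lints_cleanly.index(False) - 1])
    return ("other", None)
```

For example, a project that fails on 5.5 and 5.6 and lints on everything from 7.0 onwards comes out as `("young", "7.0")`, while one that lints on 5.5 and 5.6 only comes out as `("old", "5.6")`.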
19% young code, 19% old code
With those definitions in mind, we could detect that 19% of PHP projects are young code, and about as many are old code. In fact, old projects are slightly less numerous than young ones.
The ratio between young and old seems quite balanced. It is natural for older projects to be abandoned, and for new projects to emerge. The balanced ratio shows a healthy maturity for the platform.
Version of birth, version of death
Obviously, we cannot apply the same reasoning to ‘always-linting’ code, which represents the large majority of PHP projects. On the other hand, based on the definitions of old and young, the most interesting piece of information is the version where a project starts or ends. Here is the breakdown in terms of numbers.
PHP 7.0 obviously attracted a lot of attention, and spawned a whole new generation of projects. PHP 7.1 and 5.6 are both at the same level. These numbers represent the appeal of a new version to developers: the more interesting the features, the more new projects they yield.
PHP 7.0, with its larger than usual amount of backward incompatibilities, acts as a dam: projects started in 7.1 may easily support PHP 7.0, while those starting with PHP 7.0 definitely don’t want to support one major version before.
The other way around, a lot of applications died with the arrival of PHP 7.0, for the same reasons as above: the update of the code base was too much for a project that was running out of energy. Yet, some also stopped at PHP 7.0 itself, as PHP 7.1 has some backward incompatible features of its own.
The final 36 applications that end with PHP 7.2 are probably still in the process of getting an upgrade. That number may very well go down within the next months, as code is upgraded and made ready to run on 7.2, then 7.3.
The middle-aged application
There is one last remarkable pattern in PHP linting evolution. For now, we shall call it the middle-aged application. This is an application that is both old and young at the same time. It looks like this:
The whole graph is a kind of valley. It looks both like a young application that started with a lot of incompatibilities, and like an old application that is fighting to stay alive on a specific version (or two).
Those are quite rare, as they represent company work. Such code is not open source, so it is probably underrepresented in our survey. Yet, some projects are open sourced as part of a company strategy, so some of them ended up as open source projects.
Such a project starts as young code, with the latest and brightest PHP version of its time. Then, it goes to production based on this version and, at the same time, it gets stuck on that version. Production supports revenue. Since production depends on the code, no stakeholder dares to go for an upgrade: too much uncertainty.
At that point, the code is trapped in a self-reinforcing loop: the more delay it accumulates, the riskier the upgrade gets, and the harder it is to convince everyone to actually make it happen. The code is kept running on its initial version, and is never upgraded to the newer versions.
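The valley shape described above can be checked mechanically. The following sketch assumes a profile given as a list of lint error counts, ordered from the oldest to the newest PHP version. The function name `is_valley` and the exact rule (errors go down to a flat floor, then go up again, with both ends above the floor) are assumptions made for this example.

```python
def is_valley(errors):
    """Return True when the error counts fall to a flat floor, then rise again."""
    if not errors:
        return False
    lowest = min(errors)
    first = errors.index(lowest)                          # first version at the floor
    last = len(errors) - 1 - errors[::-1].index(lowest)   # last version at the floor
    # A valley needs worse results on both sides of the floor.
    if first == 0 or last == len(errors) - 1:
        return False
    left, floor, right = errors[:first + 1], errors[first:last + 1], errors[last:]
    descending = all(a >= b for a, b in zip(left, left[1:]))
    flat = all(value == lowest for value in floor)
    ascending = all(a <= b for a, b in zip(right, right[1:]))
    return descending and flat and ascending
```

With this rule, a profile such as `[12, 5, 0, 0, 3, 9]` is a valley, while a young profile like `[9, 4, 0, 0, 0, 0]` or an old one like `[0, 0, 1, 4, 8, 9]` is not.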
If you’re not currently linting your code, just do it. It is easy to set up, and to enforce. It will tell you when your code is becoming vulnerable to an upgrade, a few months in advance. That gives more time to prepare, either by fixing or by circumventing the problem. Anticipation is always cheaper.
Your own linting profile tells a lot about your application and its environment. Be it special attention to backward compatibility, a knack for using only the most advanced PHP features, or hidden reasons to be stranded on a specific PHP version: next time you get a legacy piece of code, check that profile and be ready.
When migration is important for your code, add static analysis on top of linting. Tools like Exakat are able to review PHP code and report errors that are not detected by PHP lint. This helps you target your migration effort to the right part of the code, and be more than ready.