Method fossilisation

Fossilisation happens when things get harder to update. It is a process more than a final state. In particular, it happens to methods whose signature is shared across multiple classes. With interfaces and inheritance, changing any element of a method’s signature means refactoring the code in several other locations. The more locations, the harder the change: this is method fossilisation at work.

Single Methods

A single method may be updated quite freely. It is single when it is not overridden by a child class, and does not itself override a method of an ancestor class. As such, updating its signature has only a local impact.

Obviously, such an update may impact the callers of this method. In this article, we will only consider the impact on the method definitions in the rest of the code, not on the callers.

Indeed, changing the signature may have no impact on the callers. For example, when adding a typehint to an argument, the method itself changes but, apart from calls that pass an illegal value, the callers won’t be affected.

<?php

class x {
    // target : function method(int $a) {
    function method($a) {
        // simple method, isn't it?
        return $a + 1;
    }
}

?>
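As a minimal sketch of that claim, here is the same class once the target signature is adopted. The calls at the bottom are hypothetical callers, added for illustration, and the default (non-strict) typing mode is assumed.

```php
<?php

class x {
    // the signature after adopting the typehint
    function method(int $a) {
        return $a + 1;
    }
}

$object = new x();

// A legal call behaves exactly as before.
$legal = $object->method(41);

// An illegal call is now rejected with a TypeError.
try {
    $object->method('not a number');
    $illegal = 'accepted';
} catch (TypeError $e) {
    $illegal = 'rejected';
}
```

Only the call that was already passing a meaningless value notices the change.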

Inherited Methods

An inherited method is a method with one or more parents and/or children. In the example below, there are two inherited methods: one in class x, and the other in class y. Since their signatures have to be compatible with each other, we may consider them as one and the same signature. The actual block of executed code is not important here.

<?php

class x {
    function method($a) {
        // simple method, isn't it?
        return $a + 1;
    }
}

class y extends x {
    function method($b) {
        // overridden method, with distinct code
        return $b + 2;
    }
}

?>

Argument names are free

Changing the name of one of the methods will split them, and we’ll be back to our initial case of single methods. That point is quite obvious.

Then, we can see that PHP doesn’t enforce the same names in both signatures: arguments may be freely renamed from one version to the next, without impact. At least, without impact for PHP: it may have a significant impact on whoever is reading the code. Imagine the code above with method($a, $b) turning into method($b, $a) in the next class to see the potential damage.

Named arguments, introduced in PHP 8.0, change the picture: a call that passes an argument by name depends on the name used in the signature, so renaming an argument in a child class may break such calls. Even so, argument names are still not part of PHP’s compatibility checks between signatures.
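The following sketch, which assumes PHP 8.0 or later, reuses classes x and y from above to show how a renamed argument behaves with a named call. The variables $positional and $named are only there for illustration.

```php
<?php
// Requires PHP 8.0 or later (named arguments).

class x {
    function method($a) {
        return $a + 1;
    }
}

class y extends x {
    // same position, different name: still a compatible signature
    function method($b) {
        return $b + 2;
    }
}

// Positional calls do not care about argument names.
$positional = (new y())->method(1);

// A named call written against x::method() fails on y::method().
try {
    (new y())->method(a: 1);
    $named = 'accepted';
} catch (Error $e) {
    $named = 'rejected';
}
```

The signatures stay compatible for PHP, yet the two methods are no longer interchangeable for callers that rely on names.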

The number of arguments is important

On the other hand, the number of compulsory arguments is important. Compulsory arguments are the ones without a default value. A child method may add arguments only when they are optional, so that the number of compulsory arguments does not grow. The simple rule is that y::method() must be callable just like x::method(). This is the substitution principle: the methods are interchangeable.

The actual number of arguments may change between versions of a method, as long as the extra arguments have a default value. When this happens, the number of compulsory arguments stays the same, although the two methods may be called with different numbers of arguments. When a call passes more arguments than a method declares, PHP simply ignores the extra ones: this is different, but compatible, and very much PHP style.

<?php

class x {
    // one compulsory argument
    function method($a) {
        return $a + 1;
    }
}

class y extends x {
    // one compulsory argument, one optional argument
    function method($b, $increment = 2) {
        return $b + $increment;
    }
}

?>

Note that having different numbers of arguments between inherited methods is both compatible and quite surprising. Since those classes usually live in different files, and are rarely on the same screen, unlike in the example above, the difference stays invisible until someone used to class y tries to use the second argument on class x, or someone used to class x cleans up the code and removes the second argument. What is legitimate PHP syntax may still be a burden for a shared code base, or for code evolution.

Finally, the argument constraints don’t apply to constructors: their signature may change freely from one class to its child. The exception is a constructor declared in an interface or as abstract, whose signature is enforced.
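To see the surprise in action, here is a small sketch built on the two classes above. The bump() function is a hypothetical caller, which only knows that it receives an x.

```php
<?php

class x {
    function method($a) {
        return $a + 1;
    }
}

class y extends x {
    function method($b, $increment = 2) {
        return $b + $increment;
    }
}

// Hypothetical caller: it accepts any x, including a y.
function bump(x $object) {
    return $object->method(1, 5);
}

$fromX = bump(new x()); // the second argument is silently ignored
$fromY = bump(new y()); // the second argument is used as the increment
```

The same call yields two different results, and PHP reports nothing in either case.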

<?php

class foo {
    function __construct() {}
}

class bar extends foo {
    function __construct($a, $b) {}
}

?>

Method fossilisation

After reviewing the simple situations for inherited methods, we now reach the special case of fossilisation. This happens when the same method is overridden multiple times.

[Figure: Method fossilisation in real code: up to 13 copies]

For example, in this project, two methods have up to 13 versions, and several others have 11 versions. Commonly, constructors and other magic methods follow the same pattern in class hierarchies, although it is not required.

Such a high level of inheritance emerges in historic code, where several layers of code were added over the course of several years; or when a component strategy has been adopted, where one template has been chosen, as an interface or an abstract class, and multiple components were created to fit the same mold.

As the code evolves, you may end up with a diagram like this one.

 

[Figure: The tree of fossilisation: multiple levels of fossilisation]

Each square is a class, and all the classes share one common method: only the class is shown. If some adaptation is needed in C2, it means refactoring the method in C2. This, in turn, leads to refactoring the method in C1, since C1 is the parent class. C1 has two other children, namely C3 and C4, which will also require some update. Then, we move to the grandparent, C5, where the same process repeats itself. In the end, changing the method signature in C2 means changing code in 6 classes across the application.
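A rough way to count the copies of a method is PHP’s Reflection API. The sketch below is an illustration, not how Exakat does it: the hierarchy is a simplified, hypothetical version of the diagram, the Leaf class is added to show a class that inherits without redefining, and classesDefining() is a made-up helper.

```php
<?php

// Hypothetical hierarchy: C5 is the grandparent, C1 its child,
// and C2, C3, C4 are the children of C1.
class C5 { function method($a) {} }
class C1 extends C5 { function method($a) {} }
class C2 extends C1 { function method($a) {} }
class C3 extends C1 { function method($a) {} }
class C4 extends C1 { function method($a) {} }
class Leaf extends C4 {} // inherits method() without redefining it

// List the classes, among $classes, that hold their own copy of $method.
function classesDefining(array $classes, string $method): array {
    $found = [];
    foreach ($classes as $class) {
        $reflection = new ReflectionClass($class);
        if (!$reflection->hasMethod($method)) {
            continue;
        }
        // getDeclaringClass() tells where the method body actually lives
        $declaring = $reflection->getMethod($method)->getDeclaringClass()->getName();
        if ($declaring === $reflection->getName()) {
            $found[] = $class;
        }
    }
    return $found;
}

$copies = classesDefining(['C1', 'C2', 'C3', 'C4', 'C5', 'Leaf'], 'method');
```

Every class in $copies holds a signature that has to move together with the others.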

Application to Typehints

This has a direct impact on the choice of typehints in method signatures. PHP checks typehints for compatibility across a family of classes. This means that the choice of a typehint matters from the moment it is set.

Since PHP 7.4, return typehints are covariant, which means that a child method may return a more specific type than its parent; argument typehints are contravariant, which means that a child method may accept a more generic type than its parent.

Adopting typehints

In particular, this means that it is easy to overlook the problem of fossilized methods when adopting typehints. As you can see below, sprinkling a scalar typehint on some of the methods is possible and legitimate.

<?php

class foo {
    function method(int $a)  {}
}

class bar extends foo {
    function method($a) : int {}
}

?>

Later, this approach runs into a difficult situation, where related methods have adopted different typehints. At that point, fossilized methods require coordination across the code to harmonize them.

Scalar cul-de-sac

Covariance and contravariance are very useful to escape method fossilisation hell. Yet, they have a dead end: scalar typehints. Scalar typehints are usually the first types adopted. By mere observation of the current situation, or by propagation of already established types, scalar typehints tend to spread through the code faster than class typehints: they do not require much setup, and are easy to understand.

They are also harder to refactor.

Being a scalar value means that variance (co- or contra-) can’t be used to turn a simple float value into a more (or less) meaningful object.

<?php

class foo {
    function method(float $a) : y {}
}

class bar extends foo {
    function method(float $a) : x {}
}

// For context only
class y {}
class x extends y {}

?>

Combined with fossilized methods, scalar types create a nice roadblock: the change must be coordinated across multiple classes, so that each of them jumps from a scalar to a class typehint at the same time. This is usually difficult.

Interface typehints

Finally, fossilized methods are the most flexible with interface-style typehints (interfaces or abstract classes). There is some margin to evolve both argument and return types independently, while keeping one consistent way to call all those methods.

<?php

class foo {
    function method(x $a) : y {}
}

class bar extends foo {
    function method(y $a) : x {}
}

// For context only
class y {}
class x extends y {}

?>
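Here is a runnable variant of the same idea with an actual interface, as a sketch that assumes PHP 7.4 or later. All the names (Shape, Square, ShapeCopier, SquareCopier) are hypothetical and only serve the illustration.

```php
<?php
// Requires PHP 7.4 or later (covariant returns, contravariant arguments).

interface Shape {
    public function area(): float;
}

class Square implements Shape {
    private $side;

    public function __construct(float $side) {
        $this->side = $side;
    }

    public function area(): float {
        return $this->side * $this->side;
    }
}

class ShapeCopier {
    // accepts the specific class, returns the generic interface
    public function copy(Square $shape): Shape {
        return clone $shape;
    }
}

class SquareCopier extends ShapeCopier {
    // argument widened to the interface, return narrowed to the class
    public function copy(Shape $shape): Square {
        // build a square with the same area as the original shape
        return new Square(sqrt($shape->area()));
    }
}

$copier = new SquareCopier();
$result = $copier->copy(new Square(3.0));
```

Both methods may still be called the same way with a Square, while the child is free to accept any Shape and to promise a Square in return.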

Avoiding method fossilisation

Given all the ground we have covered from the beginning, you may have spotted several ways to escape method fossilisation already.

  • Monitor fossilisation : Exakat’s static analyzer spots all the fossilized methods. In the Ambassador report, check ‘Audit logs > Fossilized methods’.
  • Reduce and remove fossilisation : at low levels, some inheritance is unavoidable, and even desirable. Exakat has an analysis to set the level of acceptable fossilisation, so you can be warned in time.
  • Limit the usage of scalar typehints for large families of classes : make sure that returning a boolean or an integer will still be reasonable in the foreseeable future. Otherwise, skip scalar types and move to interface-style types, for more flexibility.
  • Don’t forget your users : changing a typehint or a method signature is a compatibility break. Even when it seems to be without impact, mention it in the docs, and keep it for a major version.
  • Skip inheritance : no inheritance, no fossilisation.

Exakat is a tool for analyzing, reporting and assessing PHP source code efficiently and systematically. Exakat processes PHP 5.2 to 7.4 and 8.0 code, and reports on security, performance, code quality, migration, modernisation…