Exakat 1.2.4 review

Exakat 1.2.4 features a special report, ‘Confusing variables’, that helps readability by reducing the number of look-alikes among variables. Several bugs were also hunted down and fixed. At the same time, we have some more recommendations about memory usage. It is a bright cold day in April, and the versions are striking 1.2.4: the Exakat 1.2.4 review.

Out of memory error

Exakat uses very little memory by default, and it stops when it reaches the PHP or Gremlin memory limit. This prevents very long processing times, in particular when you are just starting to use it. To run on larger codebases, Exakat has to be configured with two directives.

PHP

By default, PHP CLI is limited to 128M, which is OK for the smallest sources. It is advised to set memory_limit to -1, aka unlimited. If you still want a limit, just in case, give PHP CLI 2G of RAM, which should be enough for any codebase.
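
To see which limit your PHP CLI actually runs with, a quick check along these lines may help. It is only a sketch: the helper name toBytes() and the comparison against 2G are illustrative assumptions, not part of Exakat, and it assumes a 64-bit PHP.

```php
<?php

// Hypothetical helper (not part of Exakat): converts a php.ini shorthand
// value such as "128M", "2G" or "-1" to a number of bytes.
function toBytes(string $value): int {
    $value = trim($value);
    if ($value === '-1') {
        return -1; // -1 means no limit
    }
    $number = (int) $value; // the cast keeps the leading digits only
    switch (strtoupper(substr($value, -1))) {
        case 'G': return $number * 1024 * 1024 * 1024;
        case 'M': return $number * 1024 * 1024;
        case 'K': return $number * 1024;
        default:  return $number; // plain bytes
    }
}

$limit = toBytes(ini_get('memory_limit'));
if ($limit === -1) {
    echo "memory_limit is unlimited\n";
} elseif ($limit < toBytes('2G')) {
    echo "memory_limit is below 2G: larger codebases may run out of memory\n";
} else {
    echo "memory_limit is at least 2G\n";
}
```

The limit can also be set for a single run with PHP's -d option, for example php -d memory_limit=2G exakat.phar doctor.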

JAVA_OPTIONS

Exakat makes use of Gremlin Server. Gremlin runs on Java, and its memory limit can be configured with the $JAVA_OPTIONS environment variable.

Set the variable before starting Exakat, for example: export JAVA_OPTIONS="-Xms32m -Xmx1512m". Xmx is the option that controls the maximum amount of memory: here, it is set to 1512 MB, or about 1.5 GB.

By default, Java runs with 512M of RAM, which is good for smaller projects, up to 200 files (your mileage may vary). Exakat has been tested with projects of 15 million tokens, and it needs 6.5 GB of RAM for that. A total of 8 GB of RAM is enough to audit the largest open source PHP codebases.
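
To double-check what the current shell will hand over to Java, the variable can be read back from PHP. The following is a minimal sketch: maxHeapMegabytes() is a hypothetical helper, not an Exakat function, and it only understands -Xmx values that carry a k, m or g suffix.

```php
<?php

// Hypothetical helper (not part of Exakat): extracts the -Xmx value,
// in megabytes, from a JAVA_OPTIONS string.
// Returns null when no -Xmx option with a k/m/g suffix is found.
function maxHeapMegabytes(string $javaOptions): ?int {
    if (!preg_match('/-Xmx(\d+)([kmg])\b/i', $javaOptions, $m)) {
        return null;
    }
    $amount = (int) $m[1];
    switch (strtolower($m[2])) {
        case 'g': return $amount * 1024;
        case 'k': return intdiv($amount, 1024);
        default:  return $amount; // already in megabytes
    }
}

// getenv() returns false when the variable is not set
$options = getenv('JAVA_OPTIONS') ?: '';
$heap = maxHeapMegabytes($options);
echo $heap === null
    ? "JAVA_OPTIONS sets no -Xmx value\n"
    : "Maximum Java heap: {$heap} MB\n";
```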

exakat doctor

Before running Exakat, check that everything is OK with php exakat.phar doctor.

Exakat 1.2.4 now reports an explicit message when it runs out of memory.

Reporting Closely Named Variables

Variable names are totally free in PHP: as long as you start with a $ and use letters, digits, underscores and a wealth of other unicode characters, you can use pretty much whatever you want to name them. Yes, there are conventions to name variables, such as snake_case, CamelCase and UPPERCASE for globals.

Yet, it is infuriating to hunt for a bug and realize that $_attributevalue is not the same as $attributevalue1, $attributevalue2 or $attributevalues, and that the one you needed was $_attributevalue all along.

Confusing variables

The Ambassador report (the default report of Exakat) has a ‘Confusing variables’ section in the ‘Inventories’. It collects variables that look alike and may be confused with one another.

Variable names that differ from each other by small amounts are hard to tell apart without paying close (sic) attention:

  • One _: $table_name and $tablename
  • One digit: $action, $action1 and $action2
  • Case only: $value and $Value
  • One letter: $sub and $stub
  • Inversion: $fieldtaxonomy and $taxonomyfield
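
The five patterns above can be approximated with a few string functions. This is a rough sketch, not Exakat's actual analysis: similarity() is a hypothetical function, names are passed without the leading $, and the order of the checks decides which label wins when several patterns match.

```php
<?php

// Hypothetical classifier (not Exakat's implementation): names the kind of
// similarity between two variable names, or returns null when none of the
// five patterns applies.
function similarity(string $a, string $b): ?string {
    if ($a === $b) {
        return null; // same variable, nothing to confuse
    }
    if (strtolower($a) === strtolower($b)) {
        return 'case';
    }
    if (str_replace('_', '', $a) === str_replace('_', '', $b)) {
        return 'underscore';
    }
    // Strip trailing digits: $action, $action1 and $action2 share a stem
    if (rtrim($a, '0..9') === rtrim($b, '0..9')) {
        return 'digit';
    }
    // One letter added, removed or changed
    if (levenshtein($a, $b) === 1) {
        return 'one letter';
    }
    // Inversion: one name is the other, split in two parts and swapped
    if (strlen($a) === strlen($b) && strpos($a . $a, $b) !== false) {
        return 'inversion';
    }
    return null;
}

$pairs = [
    ['table_name', 'tablename'],
    ['action', 'action1'],
    ['value', 'Value'],
    ['sub', 'stub'],
    ['fieldtaxonomy', 'taxonomyfield'],
];
foreach ($pairs as [$a, $b]) {
    echo '$', $a, ' / $', $b, ': ', similarity($a, $b) ?? 'no match', "\n";
}
```

Running such a check on every pair of names in a codebase grows quadratically with the number of variables, so a real inventory would likely group names by a normalized form first.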

More conventions?

As results from this analysis come in, new patterns emerge. For example, many variables have a plural form to hold a list of items, such as $rolenames and $rolename. On the other hand, $idslist and $idlist tend to be hard to tell apart.

Also, a lot of variables have an initial _, which may be a leftover from older times, when variables were dubbed protected merely by prefixing them with an _ and respecting that convention.

Moreover, some variables may differ only slightly, but still look very different to the human eye: $_page, $lpage, $pager and $pages are good examples. They are built with a lot of similarities, but their usage in natural languages (such as English, French…) makes them easy to tell apart.

Finally, those variables may be used in very different files and contexts. Is it important to make $tablename consistent with $nametable in two distinct files, three folders away? No, until the day someone has to review code in both files and suddenly mistakes one for the other.

The local Global

One last issue that bites at inappropriate moments: the local global. These are variables that are usually global, but are used as plain local variables in a specific situation. Basically, the variable is global everywhere, but not here.

<code class="language-none"><?php

// $x is global: it holds the database connection
global $x;
$x = new mysqli($host, $user, $pass);

function foo() {
    // This is a local variable,
    // while $x is usually a connection.
    $x = 1;

    //....
}

?></code>

With the scoping of variables, the code is completely valid. Yet, there is an obvious cognitive collision waiting to happen: why is $x an integer here, while it is usually the database connection?
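
The scoping at play can be checked with a small script. It is a simplified sketch: a string stands in for the database connection, and the function names localX() and globalX() are made up for the illustration.

```php
<?php

// Mirrors the snippet above. At the top level this statement has no effect,
// but it keeps $x global if the file is included from inside a function.
global $x;
$x = 'connection'; // stands in for the database connection

function localX(): int {
    $x = 1;        // local variable: the global $x is left untouched
    return $x;
}

function globalX(): string {
    global $x;     // imports the global $x into the function scope
    return $x;
}

localX();
echo $x, "\n";         // the global value, unchanged by localX()
echo globalX(), "\n";  // the same global value, read from inside a function
```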

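Just remember that while $_GET and $_POST are superglobals and don’t need the global keyword, other predefined variables are not: $argv and $argc, for example, need the global keyword to be available inside a function. $http_response_header is different again, as PHP creates it in the local scope of the code that made the HTTP request.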

Where to go from here?

As with any inventory, the information lies in the list. Read the list, and decide for yourself which variable names are too error-prone. Chances are that most of them are not so bad, but quite a few require a fix. Then, fire up your IDE, and rename them.

After all, $customsymbolsserialized and $serializedcustomsymbols may stay in the code and still be viable.

Happy PHP code reviews

All 340 analyzers are presented in the docs, including the juicy Could Use Short Assignation: upgrade your code with this sleek assignment, which combines an operator, an assignment and a memory optimization. It’s a classic PHP bug, rating at 45%.

You can check all of the exakat reports at the gallery: exakat gallery.

Download Exakat on exakat.io, install it with Docker, upgrade it with ‘exakat.phar upgrade -u’ and like us on GitHub.