Declare all the types (with rector)

Hi friends,

since the first introduction of type declarations for arrays and objects back then in PHP 5.3, via type declarations for simple types in PHP 7.0 up to PHP 8.0, which will allow union types, all PHP projects have to choose whether they want to take the duck typing approach (walks like a duck, quacks like a duck, it’s a duck) or the type declaration approach, which strictly defines if a method allows ducks or not.

TYPO3 always chose the latter and since that decision for or against type declaration isn’t up to the discussion, the way and pace are. Less metaphorically speaking: TYPO3 still has dozens of classes, methods and properties without a type declaration and the question is how to change this.

Until now, type declarations have been added to new code and code that has been “hardened” or refactored. When I hardened classes and methods in Extbase in TYPO3 10, I did it patch by patch and had to write a lot of documentation as all those changes were potentially breaking. I did a lot of research last year and looked for a way to ease those migrations and there actually is one called “rector”. Rector is a tool, which performs changes on your code base, based on certain rules. There are hundreds of so called rectors out there to automatically change your code base and one set of those rules is just about type declarations. https://github.com/rectorphp/rector/blob/master/docs/rector_rules_overview.md#typedeclaration

I used rector to perform automated updates on the TYPO3 code base, to check out it’s potential. Rector is still a bit unstable but in general it performs great and it can literally save us days of manual work. There are two patch sets that show the result of using rector on the TYPO3 code base. Both have been done in minutes actually.

What I would like to have as discuss here, is the usage of rector to automatically add type declarations in one single patch to the whole code base.

The pros are:

  • When it’s done, it’s done for good.
  • It’s potentially breaking, but just once.
  • It saves a lot of time doing all that work manually
  • We have a more stable code base at once.

The cons are:

  • It’s potentially breaking, can and will result in regressions
  • It will result in more backporting issues as patches will not be applicable without merge conflicts

I would like to hear your opinion on this topic. What do you think about the pros and cons I listed here? What are pros and cons I didn’t list here? Will this do more harm that it does good or is that a big opportunity to lift up the code base in one go? Let me know in the comments.

Edit 1:

I would like to add that extension authors can use rector as well, migrating their own code base in minutes, too. When using XClasses for instance, an author usually wants to override one or several methods while keeping the method signature as is. Using rector, those methods can easily be automatically adjusted to fit the method signature of the original class.

Edit 2:

Covariance and contravariance have improved a lot since PHP 7.2 and even more since PHP 7.4. That means, that not all type declaration changes are by default breaking. A more specific type declaration for method params is not an issue as seen here Online PHP editor | output for Y7Q4W

In contrast to that, having a less specific return type (or none) will eventually raise an error as seen here Online PHP editor | output for 3jirO

Edit 3:

Here are two patches covering Claus’ approach:

Organizational

Topic Initiator: Alexander Schnitzler
Topic Mentor: tbd

In general I am all for stricter types, but as an extension developer and maintainer I must admit that it worries me slightly, mostly due to the expected large scope of potential breaking when passing parameters that before would be duck-typed, to methods that then require strict types. It also worries me that perhaps the same code base might no longer be able to support multiple TYPO3 versions.

I’d need to take a more thorough look at “how bad” it would actually be before I can fully form an opinion.

We might also want to consider if it makes sense to start with private and protected methods, and leaving the public API untouched in the first iteration?

That is a brilliant idea. Would require us to fork some rectors and make them configurable but that’s totally doable.

Such a mass change really needs a deprecation path (like the introduction of namespaces). Both code in the core and in extensions will break if strict typing is introduced.
Is there a way to introduce deprecation notices first and have strict typing in the next release? It seems that a rector can be made to turn PhpDoc type “hints” into deprecation notices. And in a later release remove them and put the type declarations in place.

Cons

  • Takes much longer to have strict typing

Pros

  • deprecations can be used to fix usages inside the core without regressions
  • core sticks to deprecation strategy for breaking changes

Just to understand how that would look like. You mean something like this…

/**
 * Convert from one charset to another charset.
 *
 * @param string $inputString Input string
 * @param string $fromCharset From charset (the current charset of the string)
 * @param string $toCharset To charset (the output charset wanted)
 * @return string Converted string
 */
public function conv($inputString, $fromCharset, $toCharset)
{

…would result in that…

/**
 * Convert from one charset to another charset.
 *
 * @param string $inputString Input string
 * @param string $fromCharset From charset (the current charset of the string)
 * @param string $toCharset To charset (the output charset wanted)
 * @return string Converted string
 */
public function conv($inputString, $fromCharset, $toCharset)
{
    if (!is_string($inputString)) {
        trigger_error('$inputString is supposed to be of type "string", "int given"', E_USER_DEPRECATED);
    }

   // the same if statement for the rest of the params

, right?

I did a little more research and there are some results I would like to share with you all.

I modified a rector to only introduce type declarations for private and protected methods. This decreases the chance to break public api massively and there are no deprecations needed in this case. The demo patch is this one https://review.typo3.org/c/Packages/TYPO3.CMS/+/64430. The patch clearly shows that introducing typo declarations will result in bugs since we do not follow type hints internally yet.

So the question is, how to make sure that those issues do not occur and it seems to be an impossible task but it is not. Gladly we already introduced phpstan in our code base and have plenty of checks enabled. One of those checks is about checking method arguments and type safety which is currently disabled in TYPO3 because once enabled it throws 1300 errors. That’s very much but if we get those tackled in the upcoming months, we can be very sure that the patch from before will not result in failing tests and it will be 99% regression free. Because static code analysis is very accurate and reliable.

So, my idea would be to improve phpstan coverage and then introduce type declaration changes in internal code without deprecation warnings and then think about deprecation warnings in our public api.

I like this idea too, but I suggest optimizing these steps for review. Yes, I still like to have manual reviews of those changes done by the tools.
So an idea would be to start on a per-extension basis and there with the “leaf”-extensions.
The last one would be ext:core then.
This way we have reviews in a doable size, reduce the area of potential regressions and learn a lot along the way of migrating ext by ext.
What do you think?

Yeah! A way better solution to fix all calls too.

I like the idea, imho starting also with the newest should have the least impact in the extension world.

This topic was automatically closed after 14 days. New replies are no longer allowed.