Preloading support in 7.4

'ello, peoples.

TYPO3 v11 requires PHP 7.4 at least. 7.4 introduced Opcache Preloading. We should provide support for it.

What preloading is: see the “Preloading” page in the PHP manual.

Then the question is how to support it. After a tiny bit of research and discussion in Slack, I’ve determined the following:

  • There is an off-the-shelf Composer package we could pull in and use to generate the preload file: Ayesh/Composer-Preload on GitHub (“Preload your sweet sweet code to opcache with a composer command, making your code faster to run.”). It is configured via the composer.json file.
    Advantages: It’s already made and we can just use it. Other people use it (per Packagist), too, so it’s likely to stay maintained.
    Disadvantages: It uses opcache_compile_file(), which is sometimes good, sometimes bad. It would probably only work in composer mode. (Which I think is fine.) Each package (or extension) is responsible for determining what should and shouldn’t get preloaded, which is not always ideal.

  • Symfony built their own mechanism, which also allows Bundles to declare their own preload bits as well as YAML-based configuration (because Symfony). In their case, it’s done via class_exists() calls in a PHP file referenced by their standard preload builder. We could write something similar.
    Advantages: class_exists() is more robust, if the autoloader is available at the time preloading happens. Symfony sets it up that way, and we could do the same. It would also allow including files that contain just functions with include_once(). Symfony 5.1 also added support for a preload tag in their container configuration, although I don’t know much about it. That may offer some fine-grained control if we want to just steal (yay OSS) whatever their code for that looks like.
    Disadvantages: I don’t know that Symfony’s approach is packagable, so we’d be writing our own implementation based on theirs. Not the worst thing in the world, but it would require a command line tool of some kind in core, even if it’s just a one-off script.

  • An unclear question is who is in the best position to determine what should be preloaded: The extension author or the site owner. Both have good arguments in their favor. From experience, I don’t think “allow both to do anything” is a good idea because then you need some kind of alter event/hook to let the site build process modify what extensions do, or what other composer libraries do, and that gets to a really messy and bug-ridden place very quickly. The most flexible we probably want to get is allowing packages to opt-in files, and allowing a site-specific config to opt-in files, but no way to opt-out files that the package opted in. Making it purely additive keeps it simple.

  • Any files that are specified in a Composer ‘files’ autoload list should be preloaded. They’ll be loaded on every request anyway, so we may as well preload them and get it over with. There’s probably some way to introspect the composer.lock file and get a list of those, but I don’t know of an existing bit of code for it. I may just have not found one yet. (If there really isn’t one, we should write it as a standalone component to share.) Note: There might be a caveat on this one with some files, like the environment setup that some hosts use (e.g., Platform.sh). I haven’t experimented with this fully.

  • As a reminder, this would all be opt-in. I’d recommend it for most sites unless they’re on crappy cheap hosting that shares its FPM process with other sites. (Preloading is a bad idea there.) The level of benefit will vary widely, but in most cases it should be at least something.
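
On the ‘files’ autoload point: composer.lock records each installed package’s autoload section, so collecting those entries is mostly a matter of walking the "packages" array. A minimal sketch, assuming the default vendor/ layout (package files under vendor/<package name>/). The function name collectAutoloadFiles() is made up for illustration, and this ignores the root package’s own ‘files’ entries and any custom installer paths:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: list every file that installed packages declare
 * under autoload.files, based on the decoded contents of composer.lock.
 *
 * @param array  $lock      Result of json_decode($json, true) on composer.lock
 * @param string $vendorDir Path to the vendor directory (assumed default layout)
 * @return string[]
 */
function collectAutoloadFiles(array $lock, string $vendorDir): array
{
    $files = [];
    foreach ($lock['packages'] ?? [] as $package) {
        foreach ($package['autoload']['files'] ?? [] as $relativePath) {
            // Entries are relative to the package's install directory.
            $files[] = rtrim($vendorDir, '/') . '/' . $package['name'] . '/' . ltrim($relativePath, '/');
        }
    }
    return array_values(array_unique($files));
}

// Example with inline data instead of a real lock file.
$lock = [
    'packages' => [
        ['name' => 'acme/helpers', 'autoload' => ['files' => ['src/functions.php']]],
        ['name' => 'acme/no-files', 'autoload' => ['psr-4' => ['Acme\\' => 'src/']]],
    ],
];
print_r(collectAutoloadFiles($lock, 'vendor'));
```

Composer also writes vendor/composer/autoload_files.php when any package declares ‘files’, which may turn out to be the simpler source to read from.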

There are likely other considerations. This isn’t a mandatory feature, but for a codebase this large it seems a good win to get something in place, even if it’s intended to evolve as we get a better sense of what the “best” set of classes to load is.

Discuss.

I’ve been looking into this question since the feature was announced for 7.4, and since it was discussed (and thrown out again) as a Composer feature rather than a Composer plugin. From my understanding, the problem is indeed knowing what you need, so that you don’t fill up memory with unneeded classes and code, which might end up slowing down your website rather than improving its performance.

Back then, someone proposed to go down the path of usage statistics with opcache_get_status() and generate a preload.php based on the loaded files. This way, the chunk of your application/framework that “simply is there” but you don’t actually need 99% of the time sits on the disk and does not clutter your memory. This would go with your idea of leaving the choice to the site owner.
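
The statistics-driven idea can be sketched roughly as follows. opcache_get_status() returns a 'scripts' array with a 'full_path' and a 'hits' count per cached script. The function name buildPreloadScript(), the hit threshold, and the choice to emit opcache_compile_file() calls are assumptions for illustration, not a tested recipe:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: turn opcache statistics into preload script source.
 *
 * @param array $status  Return value of opcache_get_status(true)
 * @param int   $minHits Only keep scripts with at least this many hits (arbitrary cutoff)
 */
function buildPreloadScript(array $status, int $minHits): string
{
    $paths = [];
    foreach ($status['scripts'] ?? [] as $script) {
        if (($script['hits'] ?? 0) >= $minHits) {
            $paths[] = $script['full_path'];
        }
    }
    sort($paths);

    $lines = ["<?php"];
    foreach ($paths as $path) {
        // var_export() produces a safely quoted PHP string literal.
        $lines[] = 'opcache_compile_file(' . var_export($path, true) . ');';
    }
    return implode("\n", $lines) . "\n";
}

// Stand-in data; on a live server this would be opcache_get_status(true).
$status = [
    'scripts' => [
        '/app/src/Hot.php'  => ['full_path' => '/app/src/Hot.php', 'hits' => 950],
        '/app/src/Cold.php' => ['full_path' => '/app/src/Cold.php', 'hits' => 2],
    ],
];
echo buildPreloadScript($status, 100);
```

The status call has to run inside the web SAPI (FPM), since the CLI keeps its own separate opcache and would not see the site’s traffic.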

The question is: how and when do we get the opcache_get_status() statistics so we can build the preload.php :-/

In a CI/CD environment for a high-traffic website, you could schedule a task that runs one hour after the last deployment, creates a new preload.php based on the new statistics, deploys the file and reloads the PHP-FPM service. But how could you achieve this in a “simple” Composer-based construct?

A question/problem that’s still unsolved for the whole PHP ecosystem is how to properly restart your FPM service without generating any downtime, but that’s another can of worms.

My thinking here is to provide the framework and mechanism to do preload building, include the core bits that are obvious, and provide a clear way for site owners to do their own analysis and further preload stuff if it makes sense for their use case. Making it strictly additive makes it much simpler, so whatever happens out of the box should not try to be perfect, just a good start.

Which I suppose means we’ll need to do some kind of custom work above and beyond the composer plugin, even if we start with that as a component of it.

I see two possibilities on how to achieve an optimal minimum set:

  • build a tool or script that lets the community gather information about the opcache cached core scripts (should be fairly simple) and ask as many people as possible to share this information and base a fixed preload on this
  • try our best to guess what files are actually loaded

This would allow us to create a fixed set that we can provide. On top of that, we can build a typo3/preloader or an extension/plugin for helhum/typo3-console (as an example). Preloading on a per-system basis could then be configured via YAML, PHP, or any similar format, allowing website integrators (and by extension administrators) to define what they want to preload in a versionable manner that can be used to build a list of files at build time.

I already got an initial list by sticking a print statement in the composer autoloader, then loading the home page and admin dashboard. Not perfect, but it does get a list of a few hundred classes that are most likely used nearly-always. I figure that’s an acceptable v0.1 list.
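
For anyone wanting to repeat that experiment without editing the generated autoloader, one alternative (a different technique from the print statement described above) is to prepend a logging autoloader of your own. A rough sketch; the function name and the log path are invented for illustration:

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: record every class name that reaches the autoload
 * stack. The callback loads nothing itself, so the next autoloader
 * (Composer's) still does the actual loading.
 */
function registerClassLogger(string $logFile): void
{
    spl_autoload_register(
        static function (string $class) use ($logFile): void {
            file_put_contents($logFile, $class . "\n", FILE_APPEND | LOCK_EX);
        },
        true, // throw if registration fails
        true  // prepend, so this runs before Composer's loader
    );
}

// Call early in the entry script, then browse the pages you care about.
$logFile = sys_get_temp_dir() . '/preload-candidates.log';
registerClassLogger($logFile);

// Any class lookup now leaves a trace, including lookups that fail,
// so the log needs deduplicating and filtering afterwards.
class_exists('Some\\ClassHere');
```

Because the log is appended to on every request, it should only be switched on for a short sampling window.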

At the moment, I’m leaning toward including a file in core that reads vaguely like (pseudo-code-ish):

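<?php
// Preload script; point the opcache.preload ini setting at this file.
require_once __DIR__ . '/autoload.php';

// Going through the autoloader compiles each class along with its dependencies.
class_exists(Some\ClassHere::class);
class_exists(Other\ClassHere::class);

// Optional site-specific additions.
if (file_exists(__DIR__ . '/custom_preloader.php')) {
  require_once __DIR__ . '/custom_preloader.php';
}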

We can prebuild the initial list ourselves, and then tell individual site owners “yo, stick either class_exists() or opcache_compile_file() for stuff you want in custom_preloader.php if you are so inclined”.

Then if someone wants preloading, they add the opcache.preload ini setting to load that file. It picks up all of the core files we whitelist, plus whatever a site owner wants to add in their own file. kthxbye.

What that does not allow is for individual extensions to add their own files to the list. Whether that’s good or bad, I’m not sure. If we want to add that, it would probably be an extra block between the hard coded list and site-specific include.
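
If we did add that extra block, the ordering could look something like the sketch below. Everything here is hypothetical: the function name, the idea that each extension hands over a plain list of class names, and the example names themselves. The only point is the shape: core first, extensions next, site last, and nothing can remove what an earlier list opted in.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: combine preload lists in a fixed order
 * (core, then extensions, then site). Purely additive: later lists can
 * add entries but cannot remove anything an earlier list opted in.
 *
 * @param string[]   $coreClasses
 * @param string[][] $extensionLists Keyed by extension key (illustrative)
 * @param string[]   $siteClasses
 * @return string[]
 */
function mergePreloadLists(array $coreClasses, array $extensionLists, array $siteClasses): array
{
    $merged = $coreClasses;
    foreach ($extensionLists as $list) {
        $merged = array_merge($merged, $list);
    }
    $merged = array_merge($merged, $siteClasses);
    // Drop duplicates while keeping first-seen order.
    return array_values(array_unique($merged));
}

$classes = mergePreloadLists(
    ['Some\\ClassHere', 'Other\\ClassHere'],
    ['my_ext' => ['Vendor\\MyExt\\Service', 'Some\\ClassHere']],
    ['Site\\Custom\\Thing']
);

foreach ($classes as $class) {
    // In a real preload script the autoloader would resolve these;
    // the names here are placeholders, so class_exists() just returns false.
    class_exists($class);
}
print_r($classes);
```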

For completeness’s sake, I found another pre-existing library: https://github.com/DarkGhostHunter/Preloader

I don’t think that’s viable either, though. It works by analyzing the opcache and generating an optimized file out of it. Which is great, but requires a writeable file system which many cloud hosts do not have, for good reason. (It’s a security issue.) There might be ways to make that work, but it seems much more complicated.

For now, I’ve posted a static-list patch for consideration and iteration: https://review.typo3.org/c/Packages/TYPO3.CMS/+/69369