Unified Settings API RFC

crell · June 9, 2022, 3:14pm

Following my earlier posts on modernizing the TYPO3_CONF_VARS system using classed objects, I put together a prototype that was posted here. I got some pushback from leadership, though, mainly Mathias and Benni, that using PHP classes didn’t allow enough horizontal modification (one extension messing about with another extension’s configuration) and wouldn’t have a large enough impact. So that was put on hold. (I still think that approach has a huge amount of potential, just perhaps not here.)

Instead, they asked me to look into the TypoScript Constants and ExtConf systems, both of which use the same syntax at present, including a proprietary and inscrutable data definition language built into comment strings. Despite using the same syntax these are, at present, not the same system, and work entirely differently from each other. That is, I would argue, not a feature.

The goal has been to develop a better data definition schema for both systems and, potentially, unify them in a better way. I have a standalone prototype of that now available, and before I try to integrate it into core to see what happens we want to get some broader feedback, as that will impact the direction going forward.

Goals

Have a more robust schema definition for configuration values.
This schema definition needs to support robust validation, including runtime validation (i.e., “this is an int”, “this int is between 5 and 10”, and “this int is a valid pid in the system.”)
All saved values must be valid.
All values must be defined, and MUST have a default value at all times.
Defining a new value via a configuration file (YAML) must be straightforward.
Defining or modifying a property via PHP must be straightforward.
Accessing values from PHP code (in extensions) must be simple, robust, and DI-friendly. No globals, anywhere.
Values must be available from TypoScript, just as constants are now.
The existing constants.typoscript and ext_conf_template.txt go away, replaced by the new schema either in a single unified file or separate files. (See below.)
The more robust validation and format information allows us to build a much more robust automatic UI editor (similar to the existing Constants Editor or Ext Configuration editor, but much more feature-rich), so that extensions can simply provide values and hand over user-interaction to a common tool in core.
As a nice-to-have bonus, having more robust information of this sort available would enable fancier things like a code-assistance feature for accessing constants in TypoScript edit dialogs in the backend. (This is NOT in scope at present, but an example of the sort of cool stuff we want to enable.)

The basic design

The “Settings API”, as I’ve been calling it, uses the dotted-name format of TypoScript constants. However, there is no intrinsic hierarchy. Settings Properties are free-standing key/value pairs. Each definition consists of one or more validators, plus some widget definition to control what the form representation of it is. (The widget definition hasn’t been defined yet, and the name “widget” was chosen rather arbitrarily. It can be changed.) Most of those can be folded into a type definition that MAY correspond to a PHP type, but doesn’t have to.

There is both a PHP class to represent that definition and an approximately parallel YAML definition to achieve the same goal. Most extensions would define their settings exclusively in YAML and move on with life, but the PHP option is there, mainly for modifying other extensions’ definitions. This is roughly parallel with how the Dependency Injection Container works.

A typical YAML definition looks like this (most examples based on the felogin extension):

styles.content.loginform.showPermaLogin:
 type:
   class: Crell\SettingsPrototype\SchemaType\BoolType
 default: false
 form:
   label: Display Remember Login Option
   description: If set, the section in the template to display the option to remember the login (with a cookie) is visible.

(Class names shown are from the prototype, but naturally would be renamed when they move into core.)

This defines a Setting named styles.content.loginform.showPermaLogin. It has all the “type” information of the BoolType class, which serves as a wrapper for validators and form widget defaults. It has a default value of false. (A default is always required.) It has a form label, description, and optionally separate help text and an icon (not shown here). As a BoolType it defaults to a checkbox widget, but that could be overridden by specifying a widget key that specifies something else. That definition would map over into the appropriate form API calls. (Again, details of the widget part are not yet nailed down so don’t pay too much attention to those at the moment.) There is also an optional validators key, which can then specify additional validators that are not included in the type. However, I expect that to be a not-common case.

For the moment, the validation is done using a simple custom set of objects. That may get replaced with the Symfony Validator component. I haven’t decided yet if that would save time or be a clunky fit in this case. A bit more research and experimentation is needed yet.

The reason for the type/class split is that the type class can take additional configuration. For example:

styles.content.loginform.recursive:
 type:
   class: Crell\SettingsPrototype\SchemaType\IntType
   allowedValues: [0, 1, 2, 3, 4, 255]
 default: 0
 form:
   label: Recursive
   description: If set, also subfolder at configured recursive levels of the User Storage Page will be used

This Setting is an IntType, but also includes specific allowed values. The allowedValues property maps 1:1 to the constructor of IntType, and implies an additional “the value is one of these legal ones” validator. That could also be defined as a stand-alone validator if desired. That is, the following is equivalent, although I expect this style to be rare.

styles.content.loginform.recursive:
 type:
   class: Crell\SettingsPrototype\SchemaType\IntType
 default: 0
 form:
   label: Recursive
   description: If set, also subfolder at configured recursive levels of the User Storage Page will be used
 widget:
   class: Crell\SettingsPrototype\Widgets\SelectField
   values: {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 255: 255}
 validators:
   -
     class: Crell\SettingsPrototype\Validator\TypeValidator
     type: int
   -
     class: Crell\SettingsPrototype\Validator\AllowedValues
     values: [0, 1, 2, 3, 4, 255]

IntType has the basic logic included to generate both the widget and validators parts for you, but this is fundamentally what’s going on under the hood.
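As a rough illustration of that expansion, here is a minimal sketch of a type object producing its own validators. The Validator interface, the validators() method, and the isValidFor() helper are assumptions made up for this example; the prototype's actual classes may look quite different.

```php
<?php
declare(strict_types=1);

// Hypothetical validator contract, invented for this sketch.
interface Validator
{
    public function isValid(mixed $value): bool;
}

final class TypeValidator implements Validator
{
    public function __construct(private readonly string $type) {}

    public function isValid(mixed $value): bool
    {
        // get_debug_type() reports "int", "string", "bool", etc. for scalars.
        return get_debug_type($value) === $this->type;
    }
}

final class AllowedValues implements Validator
{
    /** @param list<mixed> $values */
    public function __construct(private readonly array $values) {}

    public function isValid(mixed $value): bool
    {
        // Strict comparison, so the string '4' does not match the int 4.
        return in_array($value, $this->values, true);
    }
}

final class IntType
{
    /** @param list<int> $allowedValues */
    public function __construct(private readonly array $allowedValues = []) {}

    /** @return list<Validator> */
    public function validators(): array
    {
        // The type check is always implied; the allowed-values check
        // is only added when a list was configured.
        $validators = [new TypeValidator('int')];
        if ($this->allowedValues !== []) {
            $validators[] = new AllowedValues($this->allowedValues);
        }
        return $validators;
    }
}

function isValidFor(IntType $type, mixed $value): bool
{
    foreach ($type->validators() as $validator) {
        if (!$validator->isValid($value)) {
            return false;
        }
    }
    return true;
}

$recursive = new IntType(allowedValues: [0, 1, 2, 3, 4, 255]);
var_dump(isValidFor($recursive, 4));   // bool(true)
var_dump(isValidFor($recursive, 5));   // bool(false): an int, but not an allowed one
var_dump(isValidFor($recursive, '4')); // bool(false): not an int
```

The point of the sketch is only that the short YAML form and the long YAML form can end up as the same list of validator objects.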

Extended types

The above examples use primitive types: bool and int. The design does not limit you to that, however. For instance:

EmailType implies a string that also has email address validation, and can use an email field in HTML for better accessibility and validation.
ColorType implies a color field, which can validate a string value as an RGB triplet while offering a color picker widget in the UI.
PageId implies an integer that corresponds to an existing page in the database.
MultiListType implies a “you have 10 options, pick one or more of them” type field, which stores as a sequence but shows a fancy pullbox style UI.
PathType implies a string, but validates that it’s a valid path on disk relative to some location.

And so on. These are all user-defined types, meaning extensions can trivially define their own robust settings types if they wish.

The same is true of validation and widgets. This gives extensions a huge amount of flexibility in terms of how they want to be configured, what to expose, and the type of data to expose.
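To give a feel for what an extension-defined type could involve, here is a minimal sketch of an email type. The SchemaType and Validator interfaces and the defaultWidget() method are hypothetical names chosen for this example, not the prototype's API; only filter_var() is a real PHP function.

```php
<?php
declare(strict_types=1);

// Hypothetical contracts, invented for this sketch.
interface Validator
{
    public function isValid(mixed $value): bool;
}

interface SchemaType
{
    /** @return list<Validator> */
    public function validators(): array;

    // Name of the form widget used when the definition does not override it.
    public function defaultWidget(): string;
}

final class EmailValidator implements Validator
{
    public function isValid(mixed $value): bool
    {
        // filter_var() returns the filtered string on success, false on failure.
        return is_string($value)
            && filter_var($value, FILTER_VALIDATE_EMAIL) !== false;
    }
}

final class EmailType implements SchemaType
{
    public function validators(): array
    {
        return [new EmailValidator()];
    }

    public function defaultWidget(): string
    {
        return 'email';
    }
}

$type = new EmailType();
foreach (['editor@example.com', 'not-an-email', 42] as $candidate) {
    $valid = true;
    foreach ($type->validators() as $validator) {
        $valid = $valid && $validator->isValid($candidate);
    }
    echo var_export($candidate, true), ' => ', $valid ? 'valid' : 'invalid', PHP_EOL;
}
```

A type in this style bundles the validation rules and the widget default in one place, which is what lets the YAML definition stay short.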

Definition Passes

As mentioned, there is a PHP API for defining Settings values as well. It is modeled on Symfony’s DI Component’s Compiler Passes, as it is a pretty good model and one TYPO3 devs are going to be familiar with.

A pass class looks like this:

class MySchemaData
{
   public function __invoke(SettingsSchema $schema): void
   {
       $schema->newDefinition('foo.bar.baz', new IntType(), 1);
       $schema->newDefinition('beep.boop', new StringType(), 'not set');
      
       $def = $schema->getDefinition('scheduler.maxLifetime');
       $def->default = '2048';
   }
}

(I have not yet decided how that gets registered with and exposed to the rest of the system, so some parts here are subject to change.) Of particular note, the newDefinition() method requires a name, type, and default value, so it is impossible to define a setting without them. The newDefinition() method also returns the newly created SettingDefinition object if you want to attach other widgets or validators or whatnot. More relevantly, getDefinition() lets you access and modify-in-place any other definition. That is most commonly done by site-specific extensions or site packages to override the defaults of other extensions, as shown here.
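A minimal sketch of how such a schema object could behave, assuming definitions are plain mutable objects held in an array. The class bodies, the exception choices, and the numeric values are illustrative assumptions, not the prototype's implementation.

```php
<?php
declare(strict_types=1);

// Placeholder type object; a real type would carry validators and widget defaults.
final class IntType {}

final class SettingDefinition
{
    public function __construct(
        public readonly string $name,
        public readonly object $type,
        public mixed $default,
    ) {}
}

final class SettingsSchema
{
    /** @var array<string, SettingDefinition> */
    private array $definitions = [];

    // Name, type, and default are all required parameters, so a
    // definition without them cannot be created.
    public function newDefinition(string $name, object $type, mixed $default): SettingDefinition
    {
        if (isset($this->definitions[$name])) {
            throw new LogicException("Setting '$name' is already defined.");
        }
        return $this->definitions[$name] = new SettingDefinition($name, $type, $default);
    }

    public function getDefinition(string $name): SettingDefinition
    {
        return $this->definitions[$name]
            ?? throw new OutOfBoundsException("Setting '$name' is not defined.");
    }
}

// Two passes run in order: the owning extension defines the setting,
// then a site package overrides its default in place.
$passes = [
    static function (SettingsSchema $schema): void {
        $schema->newDefinition('scheduler.maxLifetime', new IntType(), 1440);
    },
    static function (SettingsSchema $schema): void {
        $schema->getDefinition('scheduler.maxLifetime')->default = 2048;
    },
];

$schema = new SettingsSchema();
foreach ($passes as $pass) {
    $pass($schema);
}

echo $schema->getDefinition('scheduler.maxLifetime')->default, PHP_EOL; // 2048
```

Because objects are passed by handle in PHP, the second pass modifies the same definition the first pass created, which is what makes modify-in-place work without a separate "save" step.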

The resulting schema definition is cached somewhere, so YAML parsing time is not a performance concern. (The specific somewhere is still TBD, but it will have to be.)

Setting values

Ideally, setting values would be a very rare task outside the auto-generated UI. However, for illustrative purposes it’s two methods:

$settings->set(5, 'styles.content.loginform.recursive', 4);
$settings->setMultiple(5, [
   'styles.content.loginform.recursive' => 4,
   'styles.content.loginform.showPermaLogin' => true,
]);

(set() is just a convenience wrapper around setMultiple(). One is a special case of many.)

The first argument is the page ID in which context to set the properties. I haven’t implemented it yet, but we would also need a delete[Multiple]() method to remove an override on a particular page to allow its value to inherit.
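A minimal in-memory sketch of that relationship, assuming set() takes the page ID, the setting name, and the value in that order. The class name, the callable-based validators, and the delete() method are assumptions for illustration only.

```php
<?php
declare(strict_types=1);

final class InMemorySettings
{
    /** @var array<int, array<string, mixed>> overrides keyed by page ID, then setting name */
    private array $overrides = [];

    /** @param array<string, callable> $validators one check per defined setting */
    public function __construct(private readonly array $validators) {}

    // One is a special case of many.
    public function set(int $pageId, string $name, mixed $value): void
    {
        $this->setMultiple($pageId, [$name => $value]);
    }

    /** @param array<string, mixed> $values */
    public function setMultiple(int $pageId, array $values): void
    {
        // Validate everything first so an invalid value never gets saved,
        // not even alongside valid ones.
        foreach ($values as $name => $value) {
            $check = $this->validators[$name]
                ?? throw new OutOfBoundsException("Setting '$name' is not defined.");
            if (!$check($value)) {
                throw new InvalidArgumentException("Invalid value for '$name'.");
            }
        }
        foreach ($values as $name => $value) {
            $this->overrides[$pageId][$name] = $value;
        }
    }

    // Removing an override lets the page inherit the value again.
    public function delete(int $pageId, string $name): void
    {
        unset($this->overrides[$pageId][$name]);
    }

    /** @return array<string, mixed> */
    public function overridesFor(int $pageId): array
    {
        return $this->overrides[$pageId] ?? [];
    }
}

$settings = new InMemorySettings([
    'styles.content.loginform.recursive' =>
        static fn (mixed $v): bool => in_array($v, [0, 1, 2, 3, 4, 255], true),
    'styles.content.loginform.showPermaLogin' => 'is_bool',
]);

$settings->set(5, 'styles.content.loginform.recursive', 4);

try {
    // 7 is not an allowed value, so nothing from this call is stored.
    $settings->setMultiple(5, [
        'styles.content.loginform.showPermaLogin' => true,
        'styles.content.loginform.recursive' => 7,
    ]);
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL;
}

var_dump($settings->overridesFor(5));
```

Validating the whole batch before writing anything is one way to honor the "all saved values must be valid" goal; a real implementation might prefer to collect and report every violation instead of stopping at the first.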

Since the idea is to supplant TypoScript constants, that per-page setting capability is necessary. To supplant extconf, we would also need an explicit global-level setting method. (See further discussion below.)

Reading values

There are two possible places one could read a Setting: TypoScript and PHP code.

For TypoScript, the plan is to inject defined Settings into the TypoScript context as though they were constants. I’m not sure yet exactly what that implementation would be, as Christian Kuhn is in the process of rewriting the TypoScript parser. The intent, though, is that TypoScript can read and use Settings values the same way it reads and uses Constants today.

For PHP code, this is where it gets interesting. The current plan includes two possible access points.

First, there’s a Settings service, which can be requested like any other service. (Preferably via DI, but technically GeneralUtility::makeInstance() would also work.) It has a get() method, which works like you’d expect. Of note, however, if a value is not defined the code will throw an exception. You also can rely on the value that comes back having been validated, so it is definitely of the expected type and within the right range and all that other stuff. That makes it safe to use without any additional null checks or type casting, modulo telling your IDE about the type.


/** @var int $recursive */
$recursive = $settings->get('styles.content.loginform.recursive');

// If the code gets this far, you are absolutely guaranteed that $recursive
// is defined, it is an integer, and that integer is one of 0, 1, 2, 3, 4, or 255.

/** @var int $recursive */
$recursive = $settings->get('styles.content.loginform.recursive', 5);

// In this case, $recursive is whatever its value would be in the context of the
// page with ID 5.  If not specified, the current page is used.

However, I would discourage this approach in most cases, quite frankly. The second option is, generally, superior:

class SomeService
{
   public function __construct(
       #[Setting(name: 'styles.content.loginform.recursive')]
       private readonly int $recursive,
   ) {}

   public function doStuff(): string
   {
       $whatever = $this->recursive;
       // Placeholder return so the declared string return type is satisfied.
       return (string) $whatever;
   }
}

By using an attribute to control injection, a class that is instantiated via DI can get the raw Setting injected into it directly via the constructor. That means it can be properly typed, promoted, etc. The language itself now also guarantees that it is defined, properly typed, and otherwise correct. Additionally, testing the SomeService class becomes much easier as there is no need to mock anything; simply pass in an integer for the constructor argument and run your tests. This is very similar to the injection mechanism that was planned for using classes as the schema; in fact, Symfony’s DI component as of 5.3 includes a compiler pass hook for exactly this use case for attributes. The prototype shows it in action. (See AutoinjectTest.php)
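To show the mechanics in isolation, here is a minimal sketch that resolves attribute-marked constructor parameters using plain reflection. This is not how Symfony's DI component wires it and not the prototype's code; the instantiateWithSettings() helper and the flat array of values are assumptions made for this example.

```php
<?php
declare(strict_types=1);

// Allowed on parameters and properties, since promoted constructor
// properties are both at once.
#[Attribute(Attribute::TARGET_PARAMETER | Attribute::TARGET_PROPERTY)]
final class Setting
{
    public function __construct(public readonly string $name) {}
}

final class SomeService
{
    public function __construct(
        #[Setting(name: 'styles.content.loginform.recursive')]
        public readonly int $recursive,
    ) {}
}

/**
 * Hypothetical helper: builds an object, filling every constructor
 * parameter that carries a #[Setting] attribute from $values.
 *
 * @param class-string $class
 * @param array<string, mixed> $values already-validated setting values by name
 */
function instantiateWithSettings(string $class, array $values): object
{
    $args = [];
    $constructor = (new ReflectionClass($class))->getConstructor();
    foreach ($constructor?->getParameters() ?? [] as $parameter) {
        foreach ($parameter->getAttributes(Setting::class) as $attribute) {
            $name = $attribute->newInstance()->name;
            $args[$parameter->getName()] = $values[$name]
                ?? throw new OutOfBoundsException("Setting '$name' is not defined.");
        }
    }
    // String keys are unpacked as named arguments.
    return new $class(...$args);
}

$service = instantiateWithSettings(SomeService::class, [
    'styles.content.loginform.recursive' => 2,
]);
var_dump($service->recursive); // int(2)

// In a unit test no container or mock is needed at all:
$underTest = new SomeService(3);
var_dump($underTest->recursive); // int(3)
```

In a real container this lookup would presumably happen once at compile time rather than through reflection on every instantiation.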

In either case, the reading process would climb the page tree, just like Constants do now. Absent any page with a specified setting, it would read a globally set value, and absent that would read from the provided default. All of that is insulated from the caller.
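The lookup order described above can be sketched as a single function. The rootline representation (a list of page IDs from the current page up to the root) and the array-based storage are simplifying assumptions for this example.

```php
<?php
declare(strict_types=1);

/**
 * @param list<int> $rootline page IDs, current page first, root last
 * @param array<int, array<string, mixed>> $pageOverrides per-page values
 * @param array<string, mixed> $globals globally set values
 * @param array<string, mixed> $defaults schema defaults
 */
function resolveSetting(
    string $name,
    array $rootline,
    array $pageOverrides,
    array $globals,
    array $defaults,
): mixed {
    // 1. Climb the page tree; the closest override wins.
    foreach ($rootline as $pageId) {
        if (array_key_exists($name, $pageOverrides[$pageId] ?? [])) {
            return $pageOverrides[$pageId][$name];
        }
    }
    // 2. Fall back to the globally set value.
    if (array_key_exists($name, $globals)) {
        return $globals[$name];
    }
    // 3. Fall back to the schema default, which must always exist.
    if (array_key_exists($name, $defaults)) {
        return $defaults[$name];
    }
    throw new OutOfBoundsException("Setting '$name' is not defined.");
}

$name = 'styles.content.loginform.recursive';
$defaults = [$name => 0];
$overrides = [2 => [$name => 3]]; // page 2 overrides the value

// Page 5 sits below page 2, so it inherits the override.
var_dump(resolveSetting($name, [5, 2, 1], $overrides, [], $defaults)); // int(3)

// Page 7 is in a different branch and there is no global value: default.
var_dump(resolveSetting($name, [7, 1], $overrides, [], $defaults)); // int(0)

// Same branch, but with a global value set.
var_dump(resolveSetting($name, [7, 1], $overrides, [$name => 255], $defaults)); // int(255)
```

The caller only ever sees the final value, which matches the idea that the inheritance chain is insulated from consuming code.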

Open questions

Here we get to the debatable part. Aside from feedback on the overall concept, we are looking for feedback on a few specific questions.

Categorization

At the moment, the prototype has no category or tag support. We would need such a setup for a real core implementation, though. It’s clear from discussions with others that categories need their own metadata, like icons, help text, etc, so they need to be defined as their own objects. That’s fine, but should those have an intrinsic hierarchy? What other functionality should they have? Do we also want free-tagging of some kind?

Each category would also need some kind of ordering control. That could be “priority” numbers a la Symfony (higher number comes first), “weight” numbers a la Drupal (lower number comes first), or before/after flags for topological sorting.
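For comparison, here is a minimal sketch of all three ordering styles applied to made-up category names. The data shapes and the sortByDependencies() helper are assumptions for illustration; only "after" constraints are shown for the topological variant.

```php
<?php
declare(strict_types=1);

// The same numbers, read two different ways.
$categories = ['login' => 10, 'general' => 50, 'advanced' => 0];

// "Priority": higher number comes first.
$byPriority = $categories;
arsort($byPriority);
echo implode(', ', array_keys($byPriority)), PHP_EOL; // general, login, advanced

// "Weight": lower number comes first.
$byWeight = $categories;
asort($byWeight);
echo implode(', ', array_keys($byWeight)), PHP_EOL; // advanced, login, general

/**
 * Topological ordering from "comes after" constraints (depth-first).
 *
 * @param array<string, list<string>> $after category => categories it must follow
 * @return list<string>
 */
function sortByDependencies(array $after): array
{
    $sorted = [];
    $visiting = [];
    $visit = function (string $name) use (&$visit, &$sorted, &$visiting, $after): void {
        if (in_array($name, $sorted, true)) {
            return; // already placed
        }
        if (isset($visiting[$name])) {
            throw new LogicException("Circular ordering involving '$name'.");
        }
        $visiting[$name] = true;
        // Everything this category must follow is placed first.
        foreach ($after[$name] ?? [] as $dependency) {
            $visit($dependency);
        }
        unset($visiting[$name]);
        $sorted[] = $name;
    };
    foreach (array_keys($after) as $name) {
        $visit($name);
    }
    return $sorted;
}

$ordered = sortByDependencies([
    'security' => ['general'],
    'general' => [],
    'advanced' => ['security'],
]);
echo implode(', ', $ordered), PHP_EOL; // general, security, advanced
```

The numeric styles are trivial to implement but force extensions to guess at each other's numbers, while the before/after style needs cycle detection but lets an extension state its intent relative to a known category.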

I personally have no strong preference, other than we should be deliberate in what we decide to do rather than “Larry, just do whatever.”

Unified or Parallel systems

This is the big question, and will influence a number of further design decisions. There are two broad approaches we can take; I have my own strong preference, but I will try to present both as fairly as possible.

As noted in my first writeup, TYPO3 has on the order of 6 or so distinct configuration systems, all of which overlap but none of which truly duplicate each other. This is… a mess, and serves no one.

We have the opportunity here to use this new Settings approach to unify two, eventually possibly three, configuration-related systems into one more robust system. Specifically, Constants and extconf. They already use virtually the same syntax, so unifying them is a natural upgrade step. The caveat is that while they can be used for similar things, they are not used for exactly the same things today. Keeping them separate introduces some complexity, both code and user. Merging them introduces some other complexity, both code and user. Which complexity is the better tradeoff is at present an open question.

Combined system

A combined system is, largely, what is described above, particularly in the setting/getting sections. There is a single “pool” of values, defined by a single Schema. Each extension would have one Settings.yaml schema definition file (or similar). That pool is used for injecting/replacing into TypoScript, and for the Settings service, and for the attribute-derived injection.

With that, we can build a single editor experience, with common tooling. Settings can be set at a global level or per-page. We can also introduce a settings level for per-site, something that does not currently exist.

For upgrade purposes, we would likely write a hard cut-over upgrade hook that takes any existing extconf data (and possibly also parses constants.txt) and populates the global level settings from that. Thereafter, extconf/constants.txt are ignored in v12, only Settings is used. In v11, only extconf and constants exist. That means an extension can support both v11 and v12 by just having both files defined; in code, the easiest way to support both versions would be like this:

class SomeService
{
   public function __construct(
       #[Setting(name: 'styles.content.loginform.recursive')]
       private readonly ?int $recursive = null,
   ) {}

   private function getRecursive(): int
   {
       return $this->recursive ?? GeneralUtility::makeInstance(ExtensionConfiguration::class)->get('styles.content.loginform.recursive');
   }

   public function doStuff(): string
   {
       $whatever = $this->getRecursive();
       // Placeholder return so the declared string return type is satisfied.
       return (string) $whatever;
   }
}

That way, if the injected value isn’t set (because it was never injected), you easily fall back to the existing v11 code path.

Where it gets especially interesting is storage. Per-page overrides would be stored per-page in the database, like Constants are now. (Maybe even the same field, not sure yet.) I’m not sure where per-site would live. Global could be stored in a dedicated DB table for just that.

Or, global Settings could be stored in their own YAML file on disk. That then opens up all the dev/stage/prod, deployability, “edit but only if the file system is writeable”, and so forth logic that we discussed back in the second and third writeups. I’m not sure currently if it would have per-site on-disk overrides, but that’s a conversation that can be had.

That would retain the ability to edit top-level extension configuration from the UI, then be able to deploy it cleanly through Git. It would also allow for hand-editing of those files (they’d most likely be basic key/value YAML files) for power admins who find that easier.

We could also go even further. While it is not in the scope for now, there’s nothing I can see that would prevent us from migrating TYPO3_CONF_VARS over to that same system later, for all the same benefits. That would get us the same deployability benefits, dev/stage/prod, etc., for both core and extensions. And also knock out another configuration system, merging three down into one.

The main downside of a unified approach, however, is that it introduces the question of what should or should not be overridable at a given level. Anything that is currently a TypoScript Constant probably makes sense to be overridable at the page level… but that may not be universally true. There’s probably some Constant somewhere that is a Constant for TypoScript access, not because overriding it makes any sense in practice. (Probably many constants somewhere.) Also, most things that are currently in extconf likely don’t make sense to override per-page… but some might. Most parts of TYPO3_CONF_VARS, if that is ever integrated, should really not be overridable per-page… but there may be some values where it would be OK.

That means any given Setting property would also need to have some way to indicate at what scope levels it can be overridden. Is that just a per-page yes/no toggle? What about per-site? Do we need to have both DB-based globals and disk-based globals, as they’d deploy differently? So we now have potentially four levels that could be allowed or not? How do we most easily indicate that in the schema? What should the default be?

These are all answerable questions that could be resolved, but they are important questions we would need to resolve, and have at least some idea of what we want to do early on, so that we don’t design ourselves into a corner. (Eg, if we know that there will eventually be 3 or more editable “levels”, then a boolean “edit per page, yes or no” flag would be a bad design that hamstrings us later.)
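As one possible shape for that, here is a minimal sketch that models the override levels as an enum rather than a boolean. The level names, the ScopedDefinition class, and which setting is overridable where are all illustrative assumptions, not decisions.

```php
<?php
declare(strict_types=1);

// Adding a fourth level later means adding a case here,
// not reinterpreting a yes/no flag.
enum Scope: string
{
    case Page = 'page';
    case Site = 'site';
    case System = 'system';
}

final class ScopedDefinition
{
    /** @param list<Scope> $overridableAt levels where an override may be stored */
    public function __construct(
        public readonly string $name,
        public readonly mixed $default,
        public readonly array $overridableAt = [Scope::System],
    ) {}

    public function allowsOverrideAt(Scope $scope): bool
    {
        return in_array($scope, $this->overridableAt, true);
    }
}

// Hypothetical setting names and scope choices, for illustration only.
$recursive = new ScopedDefinition(
    'styles.content.loginform.recursive',
    0,
    [Scope::Page, Scope::Site, Scope::System],
);
$automaticInstallation = new ScopedDefinition('extensionmanager.automaticInstallation', true);

var_dump($recursive->allowsOverrideAt(Scope::Page));             // bool(true)
var_dump($automaticInstallation->allowsOverrideAt(Scope::Page)); // bool(false)

// A value read from YAML maps back to the enum with tryFrom().
var_dump(Scope::tryFrom('site') === Scope::Site); // bool(true)
var_dump(Scope::tryFrom('galaxy'));               // NULL
```

What the default list should be (system only, as assumed here, or everything) is exactly the kind of question that still needs an answer.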

Another concern is if we’re injecting values via DI that could be page-sensitive, that creates potentially a host of issues around code reentry. If we trigger an in-process subrequest, there’s already a lot of complexity that has to happen to reset and clean global data. This could add to that complexity. This is essentially the problem of “request data in services,” which I know Symfony struggled with for a long time and went through several attempts to find a solution for; I am not sure off-hand what their end conclusion was, to be honest.

Discrete systems

The alternative is to maintain the current separation between TypoScript-targeted values and PHP-targeted values. In this case, we’d have a single schema parser, but two separate pipelines. That means two separate YAML files (TypoScript.yaml and Settings.yaml, or something), two separate sets of “compiler passes” (that use the same syntax, but would have to be identified separately somehow), two separate caches, and two separate get/set services. The per-page one (replacing constants.txt) would likely not have an attribute-injected option for retrieval, though the global one (replacing ext_conf_template.txt) still could.

The upside of keeping the systems discrete is that the immediate impact is a lot less. At least in concept (I have not tried yet), the existing editor UIs could be populated as-is from the new schema, just ignoring all the new bits until the editors get upgraded. It’s likely we could also ignore any new storage logic and use the existing storage, although new get/set APIs are probably necessary, or at least wise.

We could then evolve the storage mechanisms, editors, and so on of both parts on their own time in their own directions. Or not, if we want to just leave them as-is. If we wanted to move the extconf-replacing global config to a YAML storage mechanism as discussed, we could do that without impacting the per-page one at all.

The upside here is simplicity. Neither of the two main challenges of the unified system is relevant here. The “what level is it edited at” question goes away because different files automatically mean different systems, and since the per-request/page settings are not injected through DI, we don’t run into the request-data-in-DI problem.

The big downside is duplication, which brings its own complexity. At best, we end up with the same number of configuration systems as we have today. While the YAML syntax is the same, they’re not actually the same thing. The keys look the same, but they’re not; they’re in separate data pools. There are two sets of services, two sets of compiler passes, two of everything except some core utility code. That’s more concepts for extension developers to keep track of and more code for core developers to maintain.

If we decide in the future to migrate TYPO3_CONF_VARS to the same syntax to eliminate another global, we’d end up with three mostly-but-not-really-identical systems, all using the same syntax but working in importantly different ways. That makes the risk of them evolving contrary to each other higher, and thus the odds of that partial-unification becoming a problem rather than a benefit higher.

We’d also need to decide if the Category list of each pool is unified (via a separate Categories.yaml file or something) or discrete. Either is equally implementable, I expect, but they’d be very different implementations, so we would have to decide which we want.

Others

There are probably other open questions that I’ve not run into yet. If you can see any, please ask about them in the comments below. If you have DX input as well (Developer Experience), please share that, too, as it’s important to get the DX right the first time to avoid BC issues trying to clean it up later.

Conclusion

It’s probably obvious from my descriptions above that I favor the unified system. I think the benefit potential is much higher, and if done properly (and in the right time-frame) offers a cleaner transition process. I would call it the “medium risk, high reward” option, while the separate systems approach is the “low risk, low reward” approach. (Though, as noted, it is not no-risk.)

It’s possible that we may temporarily go through a separate-systems phase as a step-wise implementation of the unified system. If that can be made to work, that’s fine, but that still greatly impacts the design so we need to know what the end goal is before we commit anything to core. (For instance, do a just-schema-swap change to either the Constants or ExtConf, get that fully ported over, and then expand it to absorb the other one.)

Neither approach is without its drawbacks and risks of blowing up in our face. But we do need to decide which approach we want to pursue, and which end-game we want to target, before work on core directly can really happen in earnest.

Discuss.

masterofd · June 10, 2022, 9:05am

I would be in favor of having extConf behave the same as the rest of the configuration regarding the PID thing, possibly with a zero value for global (which could be the default, and therefore be straightforward for any extension, as anybody can define the basic configuration for the whole website), and being able to override extConf on a per-page basis. (I’ve encountered the use case in the past, where I wanted to have indexed_search configured in different ways across multiple websites in the same system, but that is not possible at the moment without some clunky workarounds.)

To make it short: let them override what they want.
To make it long: You are absolutely right, this might be a problem if someone decides to override something crucial at runtime which they shouldn’t. But if the question of how and whether configurations should be overridable cannot be settled, I’d suggest that, for the sake of improvement, everything could be overridable at first and made immutable in v13 or v14, once we understand how the community uses the new configuration system. What really needs to be overridable at runtime (although it seems pretty clear that most constants should be, at least), what absolutely should not be (database credentials?), and what maybe shouldn’t be overridden (thinking about the indexed_search configuration, where most stuff really shouldn’t be overridden but other stuff really should be) might be hard to grasp from an extension/core developer’s point of view, since they are most likely not the people implementing the solution for real-life problems.

I’m on your side, regarding unifying the configuration and I’m curious to see how this turns out.
Thanks for the good work!

jonaseberle · June 10, 2022, 9:39am

If the setting is in extConf, it is purposely not meant to be per-page

masterofd · June 10, 2022, 9:52am

As I mentioned, this is limited to maybe one or two options in indexed_search, but it’s still needed ^^’

Would you consider that basically, all extConf should be immutable then? Do others agree with this?

jpmschuler · June 10, 2022, 12:54pm

For a general response: I can imagine the amount of work that goes into this (which could or should be a dealbreaker) but yes this sounds awesome (and that although I despise YAML a lot)!

Would you consider that basically, all extConf should be immutable then?

I understood it as “the interfaces and logic should be the same”, while the immutability/overridability part is stated as not changing: each setting would still be one of

pid (formerly TypoScript constants)
site (formerly Site configuration)
system (formerly TYPO3_CONF_VARS)

To make it short: let them override what they want.

I’d rather not “let them override what they want”, because in complex systems it may currently be different users who have access to those 3 levels, e.g. local admins of the customers are allowed to change TypoScript constants, while the deployment creates read-only versions of site config and TYPO3_CONF_VARS.

I wanted to have indexed_search configured in different ways

That seems like a specific extension problem, as the current TYPO3 architecture allows a solution for your problem, which just isn’t implemented there (might it be for good or bad reasons, but alas that is not the topic here).

benni · June 10, 2022, 3:06pm

Hey @crell ,

thanks for the extensive summary!

As Extension Settings (ext_conf_template.txt) serve Site Admins which are global and installation-wide, and Constants are used for the purpose of evaluating Frontend / TypoScript for Integrators, I see two separate areas to be working.

Thus I personally opt for

using the same syntax for the definition (YAML or PHP possible)
using a different “entry point” (= files) where the definitions are located: One is for the frontend rendering stack, whereas the Extension Configuration should be installation-wide, and they both address two completely different target groups also for the editing UI.

In addition, I see a huge benefit of using the same syntax to finally introduce a GUI for site settings (= third option), which would reside in the Sites module.

All three places have a different storage place currently (sys_template.config, sites/config.yaml and TYPO3_CONF_VARS[EXTENSIONS] via the existing API ExtensionConfiguration.php), so I think that’s a viable solution to keep. It is clear to everybody where a setting is persisted, which one is writing to disk / deployable etc.

IF we go for a unified API definition file, I would use an option such as “scope” which contains one of the values “page”, “site” or “system”.

For the categorization system (which is independent of the name of the setting such as “plugin.tx_felogin…”), I opt for limiting this to three or four levels max, as I don’t see a huge benefit of building a GUI with super-duper-nesting (making the UI complicated just because we allow the Settings Categorization system to be fully flexible).

My 2 cents so far…

crell · June 10, 2022, 3:28pm

@jpmschuler Heh, I hate YAML myself, as well. But it’s better than JSON, and I am in the minority still liking XML. (Though it probably wouldn’t be optimal in this case.)

@masterofd I don’t think allowing everything to be per-page overridable is wise. I don’t have a complete sense of everything that people store in extconf or constants (since both are by nature open-ended), but for example just looking at core:

ext:backend’s extconf includes a login background image, login logo and alt text, etc. I cannot imagine those having any reason to be overridden per page, though I suppose there’s no harm in it.
ext:extensionmanager’s extconf includes an automaticInstallation flag. That really should not be per-page. It also has an offlineMode flag, which… probably shouldn’t be, but thinking about it I might see a use for taking one “section” of a site offline temporarily while heavy editing is happening? Not sure, but that’s the kind of “I dunno” that makes me confident we need both mechanisms.

Also, I didn’t mention it, but as @jpmschuler notes different people may have access to different admin forms, so may not be able to set a value globally, or may not be able to override it per-page, even though a super admin could.

jpmschuler · June 10, 2022, 3:44pm

@jpmschuler Heh, I hate YAML myself, as well.

And all the problems with it are irrelevant as the proposed magic around it takes care of stuff like too many spaces by accident. That’s what I wanted to underline.

Not to say what crazy stuff could be achieved with YAML includes on top of that.

crell · June 15, 2022, 1:37pm

If we go with 3 separate systems, that means 3 near-duplicates of the pipeline. We also then need three different names for everything to avoid confusion, and some way to clearly indicate to developers which is which. That’s a lot of Naming Things and code duplication to have to maintain.

That also raises the question of storage. Presumably we want to get extconf (or whatever its replacement gets called) out of TYPO3_CONF_VARS. So where do we put it, that is not going to conflict with the others? Especially if we want to avoid colliding with a Settings-ified CONF_VARS in the future, it makes sense for both of those to live on disk to be deployable, but then we need separate directories for them. (The per-page bits should be in the database; I don’t think there’s much dispute on that point.)