Deprecating direct sys_log access

crell · June 21, 2021, 5:10pm

Surfacing this here for planning purposes.

Over on Forge, we’ve been working on figuring out how to finish the transition to using PSR-3 for all logging. cf: Epic #94356: Embrace PSR-3 - TYPO3 Core - TYPO3 Forge

That runs into some interesting challenges, as a number of systems read from sys_log directly for various tasks, violating encapsulation. That means we cannot simply stop writing to sys_log, because the logging framework can be configured to write everything or nothing to the log table. Net result: For this to work, not only do we need to remove all writes to the sys_log table from anywhere but the logging system, we also need to refactor anything else that relied on sys_log to do something else. That’s a not-small task.

My current thinking on this front is captured below, and will be updated based on feedback. Naturally, this is complicated by how late in the v11 cycle we are right now so some changes may simply not be possible, or we/I won’t have time to do so. TBD. Feedback and/or context where I don’t know things is welcome and requested.

`RemoteServer::getCommentsForRecord()`

This appears to be getting metadata about objects for the grid display. However… if I trace back the function calls, it’s only called once, from a method that’s only called once, from a method that’s only called once, from RemoteServer::getWorkspaceInfos(), which my IDE says is never called. I’m not sure what’s going on here, or if this code path even needs to still exist.

If it does, then I suspect the same data should be derivable from another source, although I’m not sure what that source is yet.

`SysLogErrorsDataProvider::getNumberOfErrors()`

This is getting a summary of how many errors were logged in a given timeframe, for dashboard purposes. Assuming sys_log remains as the target for DatabaseWriter, that’s probably a valid use case to keep. However, it should ideally be refactored to use the same data model as belog does, I think. But this isn’t a blocker, aside from the caveat that the data there is not reliable unless you are logging all errors to the DB (which is certainly an option, but not the only way to configure it).

`NumberOfFailedLoginsDataProvider::getNumber()`

Tracking the number of failed logins specifically seems like a task that should be its own small subsystem. That could handle tracking of failed logins specifically, pruning of old data, etc. Oddly this routine doesn’t appear to track the user that generated the failed login, which… limits its usefulness as now it cannot be used for flood control. This is also a dashboard-only widget, so I’m not sure how critical it is.

`AbstractExceptionHandler::writeLogEntries()`

This is writing exceptions to the log. The new PSR-3 code is already there, so I have a simple patch to just kill the old code path: https://review.typo3.org/c/Packages/TYPO3.CMS/+/69524

`DataHandling::printLogErrorMessages()`

This is a ginormous class, and I don’t quite follow what all it does. (From the comments, it appears the answer is “everything”, which is a separate issue to address.) It… looks like it generates up to 256 Flash messages off of log entries. It’s called from 5 places: 4 in sysext/backend/Classes/Controller and 1 in sysext/recordlist/Classes/Controller. I don’t quite understand the use case, or why you’d want to dump that much data into flash messages, so I’m not quite sure what to do here. I think it needs to be replaced outright with something that displays data properly and we can determine from that what it should actually be doing, but that’s going to be a larger lift.

`BackendUserAuthentication::writelog()`

This is the biggie. It’s a write location, not a read location, and it’s called from a zillion places. Fortunately, I think this is probably one of the easier places to fix as it’s already centralized; it should, in theory, be a “simple” matter of replacing the existing code there with the equivalent PSR-3 write.

Curiously, its sister-class, FrontendUserAuthentication, inherits the parent’s noop implementation of writelog(). I do not understand that at all, but it suggests to me that the frontend/backend user logic is unnecessarily intertwined. That’s a topic for another time, though.

belog’s `LogEntryRepository`

This is part of belog’s display code, to show sys_log records in the UI. It may need to get modified as more data is transitioned to context, rather than dedicated fields (or possibly the context can be denormalized on write? TBD), but if anything it’s the most legitimate sys_log read case in the whole system. Basically, stet for now.

`SystemInformationController::appendMessage()`

This is, I think, the code that shows the “hey, you’ve got bad stuff in your logs!” message icon in the admin. While not ideal, I think like belog itself it’s reasonably fine as-is. It’s telling you there is something in the DB log that you could view through belog’s viewer, which is… accurate. Stet, although I think it would benefit from unification with belog’s data model, for simplicity.

`PasswordReset::log()`

Another case that should be straightforward to just swap to PSR-3 and call it a day.

`FailedLoginAttemptNotification::createPreparedQuery()`

As with NumberOfFailedLoginsDataProvider, I think this would benefit from using a proper flood control system instead. The sys_log table is an unreliable data source, and will become moreso in time.

Conclusion

I think, then, there’s a few tasks coming out of this. Please correct my misconceptions as appropriate:

BackendUserAuthentication, AbstractExceptionHandler, and PasswordReset are write locations, and those can be directly converted to PSR-3. That should be the easy part.
SysLogErrorsDataProvider, belog, SystemInformationController are probably safe to leave as is for now. They are all read contexts on the sys_log table, which is a legit use case, with the caveat that certain log configurations would result in the data being unreliable. But that is essentially by design, so stet.
Add a proper flood control subsystem, and modify FailedLoginAttemptNotification and NumberOfFailedLoginsDataProvider to use that instead of sys_log.
I have no idea what to do with DataHandling. It seems wrong to me on several levels, but adjusting that is a not-small lift. Please advise.
RemoteServer I could also use feedback on, as I don’t know what it’s trying to do, or if it should be doing it at all.

I will start on the tasks for point 1 as those are straightforward, but input on the rest is welcome. Thanks.

benni · June 22, 2021, 7:04pm

Hey @crell

thanks for the extensive summary. Here is a quick brain dump from my side.

BackendUserAuthentication::writelog() and DataHandler::log* / DataHandler::printLogErrorMessages() are the key places to revolve around.

The most important part is that we have a comprehensive list of log information to group / search and filter for. This is why the DB table sys_log has existed for a long time. If we turn this into PSR-3 writers (with context information etc, and map the context properties back to specific DB table fields with such a writer) we do have a chance. However, we have one important part to consider… or maybe one chance to do this properly without breaking BC compat.

I think we need to use the PSR-3 interface, BUT we need to have one or multiple dedicated loggers which cannot be configured to be turned off completely. This way we use PSR-3 but site owners are not able to remove information because the system depends on it (especially DataHandler::printLogErrorMessages()). Reading from the DB table should be encapsulated in one specific class to hide the implementation of the DB table for all usages where sys_log is read from.

So: Either we go with a Symfony-style “Channel” approach, where we configure one channel which cannot be turned off, and use that one directly and wire it via DI (https://symfony.com/doc/current/logging/channels_handlers.html) and we’re good to go for “writing purposes”, or we make everything configurable, but then we’d need a good solution for reading log files (in a CMS, that seems to be a bit absurd).

The main issue is that we need to get the right configuration for the right logger (and not the LogManager::get(__CLASS__) approach that is currently all over the place. We could either go with one logger, or with multiple loggers (see the TYPO3\CMS\Core\SysLog\Type for the types we might need), which is then hard-configured in LogManager for the time being.

Once we solved this, exchanging every code piece to PSR-3 should not be a problem (and yes, starting with BackendUserAuthentication::writelog() might be a good thing).

What do you think?

crell · June 22, 2021, 8:35pm

Hard coding that all logs always get handled by DatabaseWriter, period, is certainly an option. (I’m not sure off hand how we’d wire that in, but I’m sure it’s doable.) The caveat, and the reason Drupal, for instance, lets you turn that off, is for performance. Logging to syslog or a remote log aggregator is faster than a blocking database write, and if you have a case that’s generating lots of logs (eg, lots of user logins, not just errors), then that could be a measurable performance hit.

If we’re OK with eating the cost of DatabaseWriter being unavoidable, then that does keep the various things that read from the sys_log table reliable. It doesn’t change that some of them should probably not be reading from that table at all, however.

The channel question is tricky, primarily because of the desire to have per-namespace arbitrary configuration. That pretty well mandates having some custom routing mechanism akin to LogManager, and to make it worse, if it’s done inside a PSR-3 object directly then the only way to determine the channel to use is to get a backtrace and track up the stack to see where it was called from… but generating a backtrace is one of the slowest things you can do in PHP.

If channels are arbitrary strings, then it becomes fairly simple to have a single LoggerInterface implementation that takes a string constructor and then sticks that into the context array of whatever it’s called on, before passing that along. Then any channel is just another entry in the Container, and you get the channel(s) you want injected via DI like anything else. But that doesn’t work with per-class overrides.

The per-class configuration is also the main, probably only, blocker to us just using Monolog. It already has a similar writer/processor split, and I suspect TYPO3’s was modeled on it in the first place.

For making the existing sys_log read locations more abstracted, are you thinking of reusing the existing code in ext-belog, or writing something new?

benni · June 22, 2021, 9:07pm

Hard coding that all logs always get handled by DatabaseWriter, period, is certainly an option. (I’m not sure off hand how we’d wire that in, but I’m sure it’s doable.) The caveat, and the reason Drupal, for instance, lets you turn that off, is for performance.

I’m aware of that, but such a specific logger is something that is not a “system-level” logger, but actual logging application information (because the application reads from it). The fact that we use that logic already right now in the places you mentioned above are already in place, and are used to mainly log “actions” (most of them should be migrated to Events / as in Event Sourcing in an “Audit Log” but that’s a topic for another day), so the calls are OK IMHO and necessary currently. The place in the ErrorHandler / ExceptionHandler is actually the exception here.

If channels are arbitrary strings, then it becomes fairly simple to have a single LoggerInterface implementation that takes a string constructor and then sticks that into the context array of whatever it’s called on, before passing that along. Then any channel is just another entry in the Container, and you get the channel(s) you want injected via DI like anything else. But that doesn’t work with per-class overrides.

This is how I imagined to have it built. A string as channel (similar to Symfony), wired via DI.

The per-class configuration is also the main, probably only, blocker to us just using Monolog.

I think we should keep this for the time being but migrate slowly towards sensible defaults for proper channels and channel names. An “auth” (or security) logger, similar to *nix systems seems much more usable and would make extension authors more aware to think of how to log things in the future - thus: providing a set of useful channels by default, and allow extension authors to use them by DI as well. And the “sys_log” logger (e.g. a “actionLogger” or “eventLogger”) would be one of them.

For making the existing sys_log read locations more abstracted, are you thinking of reusing the existing code in ext-belog, or writing something new?

Something new.

Deprecating direct sys_log access

RemoteServer::getCommentsForRecord()

SysLogErrorsDataProvider::getNumberOfErrors()

NumberOfFailedLoginsDataProvider::getNumber()

AbstractExceptionHandler::writeLogEntries()

DataHandling::printLogErrorMessages()

BackendUserAuthentication::writelog()

belog’s LogEntryRepository

SystemInformationController::appendMessage()

PasswordReset::log()

FailedLoginAttemptNotification::createPreparedQuery()