#1792536: Remove the install backend and stop catching exceptions in the default database cache backend exposed a number of problems when executing multiple concurrent test runs like e.g. the testbot does.

a) Make sure that run each operation in the right order and environment. It looks like setUp() first clears caches and then execute operations that result in those caches being populated again in the parent environment (file_prepare_direct() calls result in module list cache/state set queries)

b) Try to avoid persistent cache changes if possible. For example, we drop the variables cache and rebuild it in tearDown(), possibly 8 times at the same time. Check if we can for example just restore global $conf (or once variables are gone, just restoring the container with the right factory and cached config objects might be enough)

CommentFileSizeAuthor
#1 remove-unecessary-resets.patch817 bytesBerdir
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Berdir’s picture

Status: Active » Needs review
FileSize
817 bytes

Hm. What if we wouldn't need all this resets at all...... :)

We reset drupal_static() stuff in TestBase::tearDown() (twice!), we reset the module list and global $conf to the previous values.

Those reset functions all reset a ton of persistent caches (including the complete field *data* cache) that a) takes time and b) we need to rebuild on the next request.

Seem to be able to run tests just fine with this locally, let's see what the testbot thinks.

chx’s picture

Assigned: Unassigned » sun

I would be inclined to just RTBC this but sun has quite an experience in these problems -- what do you think? It passes the current tests, can you think of anything that'd be broken?

sun’s picture

Assigned: sun » Unassigned
Issue tags: +Testing system

Thanks for letting me know of this issue.

Those removals seem to be OK, since

1) drupal_static caches are cleared in TestBase already, and
2) Cache and other service instances are bound to the DIC today, so just reverting the DIC to the original is sufficient to get back to the original scope.

What this patch does not really address are the apparent environment setup problems in TestBase::prepareEnvironment() that are visible in the call stack traces of the original issue — essentially, it seems like we're hitting the wrong database (prefix) under (un)certain circumstances.

Btw, looking at the remaining code in WebTestBase, I just noticed that DUTB::tearDown() doesn't exist. WebTestBase::tearDown() explicitly calls $this->kernel->shutdown().

— And oh, WOW, Kernel::shutdown() + Bundle::shutdown() + Kernel::terminate() + Kernel\TerminableInterface are essentially the WhateverService::__destruct() + drupal_register_shutdown_function() equivalents that we've been struggling hard with.

Berdir’s picture

Yes, I have only looked at tearDown() so far, yet need to look into setUp(). Not sure if we clear caches before calling the file prepare functions.

I didn't know about the shutdown part, but we are using a terminate event that is terminating actually initialized services, I don't think such a concept already exists in Symfony.

sun’s picture

we are using a terminate event that is terminating actually initialized services, I don't think such a concept already exists in Symfony

Well:

interface KernelInterface extends HttpKernelInterface, \Serializable
{
    /**
     * Shutdowns the kernel.
     *
     * This method is mainly useful when doing functional testing.
     *
     * @api
     */
    public function shutdown();

Contrary to terminate(), shutdown() performs all actions that are necessary to properly destruct a service container.

Berdir’s picture

I can't really see what is causing the deadlocks that go through file_prepare_directory() but I suspect the situation should improve considerably when the cache deletes are gone from tearDown(). Because it should hopefully (almost?) remove the need for persistent cache rebuilds in the parent sites for the test runners.

Let's get the install backend issue in and then we can try by removing the cache set try/cache again and see if there are still deadlocks...

sun’s picture

Title: Improve concurrent testing by avoiding unecessary persistent rebuilds and making sure that the right database is accessed » Improve concurrent testing by avoiding unecessary persistent rebuilds
Status: Needs review » Reviewed & tested by the community

Let's revisit the setUp()/prepareEnvironment() code later.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Hm. Looks a little scary, but I'll try anything to help make testbot more reliable. :)

Committed and pushed to 8.x. Thanks!

Berdir’s picture

Ok, I was able to confirm now that running tests (with a concurrency of 10 ;)) leaves the caches of the parent site stable, the created timestamps in the cache, cache_bootstrap and cache_config tables are not changing.

Which is awesome and means that we do not need to look into setUp() I think.

sun’s picture

Yeah - sleeping over this commit, I basically concluded the same yesterday morning, while staring out of the window and slurping the first cup of coffee.

Glad to hear that actual tests actually confirm that. :)

Automatically closed -- issue fixed for 2 weeks with no activity.