Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
A client project is using Drupal\FunctionalJavascriptTests\WebDriverTestBase to test a React Application which has a Javascript backend. The test frequently fails on the CI service and it appears to be random.
Debugging this has shown that this is because ChromeDriver is emit emitting a stale element reference: element is not attached to the page document
error.
It turns out that the retry process in \Drupal\FunctionalJavascriptTests\WebDriverCurlService gets in the way of wait functions in \Drupal\FunctionalJavascriptTests\JSWebAssert(). Here's why:
- If the waitFor* functions we call a function like
$page->getText()
- The
getText()
method end up making 2 curl calls - the first to find the element and the second to get the text from the element. - The first call is successful and returns an element locator to use for the second request
- In the meantime JS has replaced this element with different one that still matches the old selector
- Then the second call fires and retries 10 times - there's no way the element can come back and then triggers a
CurlExec
exception - Since there's an exception the wait won't try again because we're process the exception and bailing from the test.
Proposed resolution
- Disable retries in WebDriverCurlService when waiting
- Catch CurlExec exceptions when waiting
Remaining tasks
User interface changes
API changes
Data model changes
Release notes snippet
Comment | File | Size | Author |
---|---|---|---|
#26 | 3211164-2-26.patch | 8.51 KB | alexpott |
#26 | 18-25-interdiff.txt | 1.17 KB | alexpott |
#18 | 3211164-2-18.patch | 8.49 KB | alexpott |
#18 | 3211164-2-18.test-only.patch | 3.02 KB | alexpott |
#17 | 3211164-2-17.patch | 5.46 KB | alexpott |
Comments
Comment #2
alexpottComment #3
alexpottI need this for 8.9.x so posting a patch here to share...
Comment #4
alexpottAdding the logging from D9 to the D8 patch.
Comment #5
alexpottBackporting more of the improvements for D9.
Comment #6
longwaveLooks like a sensible fix, hopefully it will solve the random stale element errors we have seen in core as well.
Comment #10
catchCommitted 0f2313a and pushed to 9.2.x, cherry-picked to 9.1.x. Also committed the 8.9.x patch since makes sense to improve test stability there too. Thanks!
Comment #11
alexpottI think we should revert this for two reasons. One, it has broken PHP 7.0 on Drupal 8 and two...
After much sleuthing finally worked out what is going on. It turns out that the retry process in
\Drupal\FunctionalJavascriptTests\WebDriverCurlService
gets in the way of wait functions in\Drupal\FunctionalJavascriptTests\JSWebAssert()
. Here's why:$page->getText()
The solution is to catch the exception in the wait functions and retry - without doing our retries in WebDriverCurlService as these are pointless because we're ready to wait already.
This change results in a test suite for a site I'm working on going from an almost certain random fail to no random fails on repeated runs of the JS tests.
Bumping this to critical because I'm pretty sure that this affects core too - as per the linked issues.
Comment #12
alexpottIt broke PHP 7.0 because
Thanks @mglaman for pointing this out.
Comment #16
tim.plunkettPer the last line of #11
Comment #17
alexpottThis patch attached fixes random fails on my client project and makes a suite of JS tests that previously were very unreliable pass even on slow infrastructure.
Comment #18
alexpottHere's test coverage!
Comment #19
alexpottUpdating issue summary with the actual bug and solution.
Comment #21
alexpottHmmm.... so this will not fix #3203712: [random test failure] EntityDisplayTest::testExtraFields() or #3099427: [random test failure] FieldLayoutTest::testEntityView() unfortunately. Those fails are not part of the wait system. They are similar though... they are both occurring as part of \Behat\Mink\Element\NodeElement::dragTo() ... this runs a series of webdriver commands and eventually the element is no longer there for some reason. We need to check the destination element we're using on those issues and try to work out why it might be replaced.
The reason the client project was failing so much is because we're testing a React application that makes a lot of updates to the DOM including replacing the entire tree. This change will make testing such applications with Drupal's test framework and as the test shows this change makes our waits much more robust.
Comment #22
alexpottComment #23
longwaveThe test proves that this does solve the problem even if it's not one that seems to affect core, it might still make our tests a bit more reliable.
Downgrading from critical as this doesn't affect the core tests that we thought it did.
Comment #25
catchNit: are are
Shouldn't this be FALSE?
Think it'd be a bit clearer to assign this to a $max_retries variable outside the while.
Comment #26
alexpottThanks for the review @catch. The disable retries is to allow us to return to the wait in the \Behat\Mink\Element\Element::waitFor() earlier and not wait twice. So it is an optimisation that is hard to test.
Fixed everything else. If I'd done #25.3 I would have saved a few hours of my life wondering wtf was going on :)
As #25 is nits and not really changing logic setting back to RTBC.
Comment #27
longwaveRTBC again for #26.
Comment #31
catchCommitted/pushed to 9.3.x and cherry-picked to 9.2.x, thanks!
Also cherry-picked to 9.1.x on the basis that if there turns out to be an unexpected core tests it's easy to revert on 9.1.x for DrupalCI, but if it fixes contrib tests then it's not very useful in 9.1.x except for a patch release. Hopefully this is the right way 'round in reality.