This issue is for addressing the Active Trail tests. It looks like there's a bunch of fails there, but based on other sections it's probably one or two things that are breaking all of them.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Niklas Fiekas’s picture

Status: Active » Postponed

This is failing because $localized_options are not passed to l(), so depending on the language stuff, I believe.

aspilicious’s picture

Status: Postponed » Active

Active it is, not fixed yet

effulgentsia’s picture

To reproduce directly (outside of a simpletest environment):
- Manually enable 'menu_test' module (e.g., by setting status=1 in {system} table, or with Drush en).
- Manually set 'menu_test_record_active_trail' to TRUE (serialized as 'b:1;') in {variable} table (or use Drush vset).
- Go to admin/config/development/performance and clear all caches (or use Drush cc).
- Go to admin/config/system/site-information and add menu-test/custom-403-page as your custom 403 page.
- Then logout, and while logged out go to admin/config.

I get a WSOD on this instead of the custom 403 page. There's a couple things I notice while stepping through:

- LegacyRequestSubscriber::onKernelRequestLegacy() runs multiple times, cause something (Symfony? Drupal?) handles an AccessDeniedHttpException by triggering a subrequest. We could avoid calling hook_init() multiple times by adding a if (drupal_get_bootstrap_phase() < DRUPAL_BOOTSTRAP_FULL) { check in that function or perhaps in some other way, but that alone doesn't prevent the other problems.
- AccessSubscriber::onKernelRequestAccessCheck() runs multiple times, and I think each time it gets back the admin/config item, so then it keeps throwing the exception. This seems wrong.
- While I'd expect the above to result in an infinite loop, something causes the flow to stop after about 3 iterations.

I don't think I'll be able to troubleshoot this any further for a few days, so wanted to post this info in case someone else wants to give it a go.

Crell’s picture

Yes, error handling does now issue subreqests. That's by design.

Actually, the easy way to ensure that the legacy bootstrap only runs once is to check the request type out of the event. If it's a master request, we continue. If it's a subrequest, we just break out of the legacy listener and do nothing, since it's already been done.

aspilicious’s picture

FileSize
49.72 KB

You probably get a wsod because there is an underlying fatal error.
active trail.png

We have to figure out why localized_options isn't populated inside menu_get_active_breadcrumb().
That should fix all the other errors and warnings. (I think/hope)

No time to debug this now.

aspilicious’s picture

FileSize
38.64 KB

Found 2 more minutes to put a debug statement.
I printed active trail and all the admin/* paths don't have localized options in it.

active trail.png

Thats strange because localized options is always being set in "menu_set_active_trail".
Unless it is set before without localized stuff and cached somewhere.

cosmicdreams’s picture

I am highly suspicious of the array_pop that may be removing the localized options if it were the last item in the array.

aspilicious’s picture

The array pop removes entire items not something inside an item. In this array there are 2 "items". Array pop removes one of them.

Crell’s picture

I can confirm that the "custom 403 page returns an HTTP 200" is caused by this localized_options problem. localized_options is not there, which errors twice, and then fatals inside l() on $options += ... That means Apache returns a PHP fatal error message with an HTTP 200 (guh?), and then the next couple of tests fail because there's no actual page and the response code is wrong.

That does not resolve the "expected 3 items, got 2" problem further down.

sun believes this is related to #1018614: Inaccessible menu items in navigation links cause "Notice: Undefined index: localized_options". He says he'll investigate further on Saturday.

Crell’s picture

I just committed the following code block to the ExceptionController:

      // The active trail is being statically cached from the parent request to
      // the subrequest, like any other static.  Unfortunately that means the
      // data in it is incorrect and does not get regenerated correctly for
      // the subrequest.  In this instance, that even causes a fatal error in
      // some circumstances because menu_get_active_trail() ends up having
      // a missing localized_options value.  To work around that, reset that
      // value and force it to be regenerated on demand.
      // @todo It is likely that there are other such statics that need to be
      //   reset that are not triggering test failures right now.  If found,
      //   add them here.
      // @todo Refactor the breadcrumb system so that it does not rely on static
      // variables in the first place, which will eliminate the need for this
      // hack.
      drupal_static_reset('menu_set_active_trail');

It only took sun and I about 3 hours to figure out it needed that one line. *sigh*

That brings us down to 4 test fails according to my computer.

cosmicdreams’s picture

Created the follow up issue #1574476: Refactor the breadcrumb system

Crell’s picture

Status: Active » Fixed

Turns out the problem with 404 pages was exactly the same, but it was failing a totally different way. :-) I applied the same fix there and that resolved all of the active trail errors. I'm testing a new patch now through testbot. Fingers crossed!

sun’s picture

Status: Fixed » Active

Hrm. Please bear with me, I'm seriously pissed off right now, because drupal.org decided to nuke my entire POST request, after writing a follow-up on this issue for two hours...

I can't invest another two hours to re-explain and re-write the entire train of thought in an as friendly and elaborate manner as I did previously. So, hopefully enough warnings, on to the summarized version of what I wanted to post:

  1. Sleeping over this, I'm surprised that we want to introduce an entirely new notion of sub-requests without properly accounting for the environment requirements they introduce.
  2. I'm aware that the final goal is DIC, which has a built-in notion of scope, so cached values/dependencies can change for sub-requests. However, the fundamental notion of statically caching the results of a complex computation will remain. Only the container that holds the computed result will change.
  3. The first @todo really needs to be replaced and expanded into: "Reset all statics that are different per request." This not only applies to menu.inc, it applies to pretty much almost all statics throughout Drupal.
  4. I am very confident that not introducing a proper temporary workaround for this will mean that all developers who are working on completely different stuff for core will have to spend 3+ hours each to debug mysterious test failures, only to figure out that the new sub-requests introduced with the kernel are not properly resetting statically cached information.
  5. We absolutely need a temporary workaround for this to prevent total breakage after the kernel patch landed.
  6. There are two types of statics throughout Drupal: Persistent (e.g., module info, theme registry, etc.) and request-specific (pretty much everything else). A sub-request has to reset all request-specific statics.
  7. The majority of statics are request-specific. The persistent should be a small minority, AFAICS.
  8. The most simple workaround I'm able to think (without properly introducing the DIC and whatnot everywhere, which is a huge undertaking of its own) is to "split" all drupal_static()s into two groups: persistent and request-specific:
    -function &drupal_static($name, $default_value = NULL, $reset = FALSE) {
    +function &drupal_static($name, $default_value = NULL, $persistent = FALSE, $reset = FALSE) {
    
    -function drupal_static_reset($name = NULL) {
    +function drupal_static_reset($name = NULL, $include_persistent = FALSE) {
    

    This means that almost all drupal_static() calls throughout core can be left alone. Only the persistent need to be changed.

  9. Given that, whenever the kernel initiates a new sub-request, it first and foremost invokes drupal_static_reset(), so all request-specific statics are cleared out.
  10. The few persistent statics need to be updated to pass the additional TRUE flag.
  11. All existing drupal_static_reset()s throughout core most probably expect to reset everything, so all of them need to be updated to pass NULL, TRUE.
Crell’s picture

Well, the question there becomes which stopgap we want until drupal_static() becomes unnecessary. What it does now should become encapsulated into object properties as an implementation detail. Right now, there's only two places we fire a subrequest: custom 403 pages and custom 404 pages. I don't know that it's worth adjusting the static system if we want to eliminate the need for it. I'm fine with switching those calls to drupal_static_reset() and blowing away everything, but I don't think adding functionality to the static system is appropriate for the kernel patch.

sun’s picture

My point is that all of the DIC conversions won't happen "tomorrow", and at the same time, we will introduce more sub-requests for the blocks/layout changes (-- we have to; or we can forget about blocks/layout in D8). Thus, those currently two 404/403 special cases will consequently turn into more.

Since the full DIC conversion foresees to mostly get rid of drupal_static(), a temporary workaround as outlined in #13 sounds acceptable to me.

As you already mentioned during our debugging of the test failures in IRC, we cannot invoke a drupal_static_reset() without the separation currently, because that resets too much (also the persistent) and thus blows up (and would also heavily slow down) the processing of sub-requests. (When you tried that, you got a fatal error saying that "theme() cannot be called before all modules are loaded", which is expected behavior, since the reset caused the system to forget that it loaded all modules already [which in turn is intentional and required for at least drupal_flush_all_caches() to work correctly and be able to re-initialize and (re)build the entire system environment from scratch].)

sun’s picture

Status: Active » Needs review
FileSize
3.3 KB

Something along the lines of attached patch should do it.

sun’s picture

And apparently, the kernel would also have to restore the previous request-specific statics after completing the sub-request and getting back to the parent request; i.e. in pseudo-code:

// Backup current request-specific statics.
$current_statics = drupal_static();

// Reset and execute sub-request.
drupal_static_reset();
$response = $this->request(...);

// Continue current request with original statics.
$statics = &drupal_static();
$statics = $current_statics;

My patch in #1214168: Bogus drupal_static()s allows to pass no arguments to drupal_static() in order to retrieve all currently stored values.

cosmicdreams’s picture

Since the full DIC conversion foresees to mostly get rid of drupal_static()

I'm all for implementing the DIC, I just don't have the architectural knowledge to know where and how to implement it all. Is there a list of all the tasks we need to do in order to implement the DIC?

If so, I'm eager and ready to get started with them.

Crell’s picture

Rob Loach is leading the DIC process righ tnow. The basic idea, though, is to isolate subsystems into one or a few classes, and make sure that *all* of their data and dependent objects, including the database connection, is passed into it. Then it becomes very easy to have it spawned via the DIC, and then the DIC's scope system handles whatever the static caches turn into. (Mostly they turn into protected properties on objects, and simply creating a new object replaces the concept of a static reset.)

In this case, I don't know that we need a full static reset per se. The affected items here are menu things, which I want to reduce or eliminate in the very next major patch, the new router. So I don't now how much effort we want to go to tweaking the static system when we're trying to remove the necessity for it, at least here, anyway.

cosmicdreams’s picture

Crell’s picture

Status: Needs review » Fixed

This all got resolved somehow. :-)

sun’s picture

What means "somehow"? I don't think this is really resolved.

For one, the DIC won't be able to replace all statics. Statics are used not only in core subsystems.

Second, the current Symfony DIC is extremely slow. My own and other people's benchmarks for other changes revealed it's so slow that I'm highly concerned about any attempt in seeing and using it as a general replacement for drupal_static(). This deserves its own core issue, in which we need to do proper benchmarks to officially confirm its bad performance, and consequently have to think about replacing it with a simpler and faster implementation.

Third, this bug was merely the symptom, which revealed that we're not properly accounting for resetting statics in sub-requests yet, as outlined in #13. As mentioned in there, similar unexpected behaviors and test failures can happen pretty much everywhere in the system currently, as soon as a sub-request is involved.

RobLoach’s picture

Second, the current Symfony DIC is extremely slow. My own and other people's benchmarks for other changes revealed it's so slow that I'm highly concerned about any attempt in seeing and using it as a general replacement for drupal_static(). This deserves its own core issue, in which we need to do proper benchmarks to officially confirm its bad performance, and consequently have to think about replacing it with a simpler and faster implementation.

Could this possibly be from what you were attempting to do with it? Once a service is instantiated, it's an isset() check whenever you ->get() an object. Not sure how an isset() could be slow. Either way, we still don't want to be STUPID.

There are places where it still could be improved upon and used properly. Our current use of it is in a state of flux, and will make more sense once we have the ContainerAwareEventDispatcher.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.