Problem/Motivation

Drupal derives a lot of information out of the HTTP request. There is a lot of information there, too, and from that information we extract all sorts of information: What node is currently being viewed, what page callback to use, who the current user is, what the language of the site should be, and dozens of other important facts. However, there is no consistent way to do so. Every system and every module must invent their own way of doing so, usually with some global function or global variable. In order to build a more robust REST server within Drupal, we need to unify that contextual information in to a coherent and integrated system.

All of that information, collectively "context", is information about how the code should behave that comes from the current request cycle. However, that is only part of the story. In order to support REST services, Varnish, backbone.js, and other advanced page building techniques in a robust fashion we need to be able to render any portion of the page in isolation. That means being able to control, and alter, what context that portion of the page "sees", regardless of whether we're rendering all of the page at once or parts of the page in separate requests. That means being able to layer the rendering of the page, and change, or override, that context at any point.

By providing an integrated, extensible system for contextual information, we can build a more coherent, robust, flexible response system on top of it.

Proposed resolution

The solution is to build an extensible, flexible carrier for that context information. That carrier can be responsible for retrieving HTTP information, information that depends on the HTTP request, information from a command line argument (if the code is running via Drush), etc.

The "context object" acts as a central gateway to such information coming from the client (who may or may not be a human being at all), as well as information we derive from it, such as the current node, current user, and so on.

To handle the need to override context for a given portion of a page or request, we implement a "context stack". When Drupal starts up, there is a single context object. That context object holds no information itself, but acts as a "mapper" to handlers that know about the current context. Context is identified by a colon-delimited key.

So for example, the value of $_GET['page'] is a piece of context information. It is accessed via $context->getValue('http:query:page'). The $context object in turn will map that request to handler object that knows how to derive that information, and return a string, say, "5", which the $context object will cache and then return to the caller.

In another example (as seen in this patch), $context->getValue('path:system') represents the Drupal internal path of the reuqest. That in turn can call $context->getValue('path:raw') to get the raw path of the request, combined with potentially calls to Drupal's database to do path alias lookups. That in turn, as of Drupal 7, could depend on $context->getValue('http:query:q'), $context->getValue('http:request_uri'), or even $context->getValue('http:php_self').

Because that is all cleanly separated, but maintained through the context system, it is far easier to track, grok, and override than trying to keep track of the current mix of global functions and global variables we use:

request_uri()
current_path()
request_path()
$_GET['q'] (which is not, in most cases, actually $_GET['q'])
$GLOBALS['base_url']

We also maintain a "stack" of context objects, to allow us to change the context for a given part of a request or page.

It is also possible to push a new context object onto the stack. First a new context object is created, which references the previous one. Then individual keys can be overridden, either with a new handler or with a new exact value. The object is then "locked", and goes into read-only mode. That ensures that code in one part of the system can only modify the context for code it calls, not for code in some far distant unexpected part of Drupal. When the function completes, the context object it created is automatically removed from the stack.

If a context object has no value for a given key, it will pass the request down the stack to the next context object. That way we can maintain only a single copy of most context information in the "root" context object, and override only those values we care about with minimal fuss and no wasted memory.

Being able to quickly and easily create new wrapping and/or fake context objects is critical for 2 reasons:

1) It allows us to use context for the Panels-like super-blocks design in phase 4 of the Web Services Initiative, where a given block (nee Panels Pane) can be used multiple times on different nodes, or users, or whatever. That allows us to render blocks individually and reliably, which means easy ESI support, easy partial page caching within Drupal, easy in-browser layout (a la FaceBook's BigPipe), etc.
2) By refactoring code to get all of its information from a fake-able context object, we make it possible to test more of Drupal's code with a true unit test, DrupalUnitTestCase, rather than having to recreate an entire Drupal instance with DrupalWebTestCase. That is conservatively 1000x times faster to test, as well as encouraging cleaner, better-factored code in general.

Current tasks

You can find the most up-to-date code in http://drupal.org/sandbox/Crell/1260830 in the wscci branch. Patches posted here are rolled from the wscci branch against core's 8.x branch. Most low-level implementation detail discussion happens in that sandbox.

User interface changes

N/A; this is an API change only.

API changes

With the context object, you would never, ever, ever access one of the super-globals ($_GET, $_POST, etc.) directly. Most current global variables and utility methods would also end up in the context object, as would many random utility functions. So for instance:

$GLOBALS['base_url'] is http://example.com/drupal
base_path() (aka $GLOBALS['base_path']) is /drupal/
request_uri() is /drupal/documentation?page=1
request_path() is documentation
current_path() is foo/26419

In the new API, these look like the following:

$context->getValue('http:base_url');
$context->getValue('http:base_path');
$context->getValue('http:uri');
$context->getValue('path:raw');
$context->getValue('http:system');

Also, because we expose nearly the entire HttpFoundation request object through the context object it is now totally simple to access (and override if necessary) any of the other HTTP information available, such as the HTTP Method, the acceptable content types, the raw unparsed HTTP request body, etc. The API is exactly the same for all of that.

Once this patch is in, we can continue expanding its reach to fold more "random global stuff" into a consistent API. High on the list to convert are:

arg() (#1279944: Eliminate arg() in favor of context)
menu_get_object() (#1260874: Convert menu_get_object('node') to context)
global $user (#1260864: Convert global $user to a context key)
The current language (#1260918: Convert language globals to contexts)

Notes on implementation

Some background explanation on various parts of the design:

Why a stack?

The stack is important for keeping current procedural code working. For OO code, a context object should be passed into an object to be used. For procedural code, however, it will be possible to call drupal_get_context() from (almost) any point and get back the current context object from the top of the stack only. That allows a given function to care about the context, but also allows it to respond to mocked context objects cleanly.

That includes, for example, hooks. It's not reasonable to expect that we can pass a $context object to every single function and every single hook in Drupal. In our current architecture that would be infeasible at best and impossible at worst. Instead, if we need to change the context we push a new context object onto the stack. That affects any code that runs after it in that scope. Once the code that added the new context object returns, however, that context object is popped off the stack and the entire system reverts to the previous context state, just as we want. At some point in the future it may be possible to refactor Drupal to the point that it is not necessary, but that is very unlikely to happen within Drupal 8's timeframe.

What's setValue() vs. setHandler()?

Some context information may be rather expensive to figure out. While information from the HTTP request is fairly simple, the "current node" requires several database queries and the "current user" requires an open session. If we don't need that information there's no reason to waste time generating it.

setValue() sets an explicit value for a context key. That is most useful when overriding a value.

setHandler() sets the object, called a handler, that will be responsible for figuring out the value of a context key (checking other context keys, loading a node, doing database lookups, etc.) when requested, and not before. That delayed derivation helps performance by not doing work we don't need.

What's this unnamed function business?

Anonymous functions are a new feature of PHP 5.3. They are similar to anonymous functions in Javascript, with the main difference being that in PHP variables are only inherited from the scope where the anonymous function is defined if they're specified explicitly. That means you won't get extra random variables showing up in your functions.

The context system expects most handlers to be registered using anonymous functions. This is a pattern borrowed from Symfony, Silex, and other modern frameworks. The advantage is that the handler object not only isn't instantiated until we use it, the code isn't even loaded and parsed. The only memory usage is the body of the anonymous function itself, which is really really small.

However, we also get the flexibilty of doing more than just returning an object of a class with no configuration or control. See the way that HttpFoundation\Request is used in the patch, for instance. Rather than trying to twist around getting a request object instantiated properly inside the HandlerHttp class, we simply pass it in as a constructor parameter. However, because of the anonymous function neither object is created until it's actually used. It doesn't get much more efficient than that.

The unit tests in context.test also show another benefit. We're able to have a mock context handler object that we can reuse for most tests. We simply set what its customized behavior should be inside the anonymous function, then return it. That makes the unit tests easier to read and understand because all of the logic is right there together.

That is, similarly, an advantage of an anonymous function over a named function. A named function we'd have to worry about having loaded, it could live who-knows-where in the file, and that function cannot be configured. or controlled, either. However, if we wanted to split up the Http handler, for instance, it would be extremely simple to do:

$request = Request::createFromGlobals(); 
$context->setHandler('http', function() use ($request) {
  // Woo, dependency injection! :-)
  return new DrupalContextHandlerHandlerHttp($request);
});
$context->setHandler('http:query', function() use ($request) {
  // This is the same $request object as above.  The exact same.
  return new DrupalContextHandlerHandlerHttp($request);
});
$context->setHandler('http:headers', function() use ($request) {
  // Yep, still the same $request object.
  return new DrupalContextHandlerHandlerHttp($request);
});

Documentation

Some still in-progress documentation is available in the handbooks. It will expanded as time allows, and once more code makes it into core.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

dmitrig01’s picture

.

Lars Toomre’s picture

subscribe

Gábor Hojtsy’s picture

.

yched’s picture

sub

MacRonin’s picture

Subscribe +1

mparker17’s picture

Subscribe

amateescu’s picture

.

jherencia’s picture

sub

bejames’s picture

.

AndrzejG’s picture

subscribe

sdboyer’s picture

bippityboppityboo

colan’s picture

Subscribing.

catch’s picture

So for example, the value of $_GET['q'] is a piece of context information. It is accessed via $context['http:get:q'] (using PHP's ArrayAccess interface). The $context object in turn will map that request to handler object that knows how to derive that information, and return a string, say, "node/5"

In Drupal 7, $_GET['q'] does not always come from the http request at all, sometimes it's empty when Drupal is first requested, and injected later. You can see this in drupal_environment_initialize(), where it's set from request_path(), which uses $_GET['q'] if it's set, but $_SERVER['REQUEST_URI'] and whatever else if not. I realise it's just an example, but D7 handles this differently to D6 (where it actually was derived from $_GET['q'] consistently in the first place iirc), so a simpler example might be better - we won't be able to use an http library for that without refactoring the whole process again.

, which the $context object will cache and then return to the caller.

This is more for #1178762: [meta] What common context information do we have? but caching actual data in the context object itself gets very tricky. While the context object is read only, it will have objects in it that can be modified via CRUD functions (an obvious example would be a node view page triggering a node_save() via rules or something - say when it gets to 1,000,000 page views or whatever).

Keeping a cached copy of anything that's in Drupal itself in the context object means either no-one can ever dynamically update anything that might be used for context (so no more rules module), or the stuff in the context object would have to be allowed to go stale during the course of the request (might be OK), or you'd need to trigger a cache clear of the context information (gets extremely messy very fast). If we can keep all these issues isolated to the actual context providers it may be a bit more up front work but could save a lot of issues down the line.

sdboyer’s picture

Keeping a cached copy of anything that's in Drupal itself in the context object means either no-one can ever dynamically update anything that might be used for context (so no more rules module), or the stuff in the context object would have to be allowed to go stale during the course of the request (might be OK), or you'd need to trigger a cache clear of the context information (gets extremely messy very fast).

Quite agreed. I've dabbled in trying to cache blocks of ctools context information before, and had very similar problems. The only safe approach, which I think is what's generally been discussed, is that your cached context is limited to being a tree of class names - providers - but none of the data those providers might actually use. So all you really save is the lazy-load-friendly skeleton.

Crell’s picture

$context['http:get:q'] should always be the literal q from the request. The "Drupal internal path", which we bastardize $_GET['q'] for, should be some other context key. Modifying any of the incoming superglobals directly is a Doing It Wrong(tm).

But let's please keep such discussion to other threads. This is an announcement thread only. :-)

catch’s picture

@Crell that's a different discussion, we bastardize $_GET['q'] twice during bootstrap. Once in drupal_environment_initialize() and once in drupal_bootstrap_path(), I think you're referring to the latter case. Really happy having this discussion somewhere else though, but I didn't want to belabour it on the http library issue either 'cos it also doesn't belong there.

dipen chaudhary’s picture

.

kylebrowning’s picture

.

nicl’s picture

subscribe

fgm’s picture

subscribe

nlambert’s picture

subscribe

mr.moses’s picture

.

Crell’s picture

Status: Active » Needs review
FileSize
242.15 KB

Here's an interim snapshot patch, for now dubbed unstable 1. :-)

This patch includes:

- The Symfony2 ClassLoader library for PSR-0 autoloading support.
- The Symfony2 HttpFoundation library for more robust Http handling.
- The context system in its current form.
- Lots of unit tests.

This patch is being posted here to keep subscribers up to date on the status of the code. Please do *not* review this patch here, however. All work is being done in the WSCCI sandbox. In order to keep this thread manageable I will unpublish any detailed reviews put here. Sorry, but we do not want to split the conversation.

Setting to needs-review so that testbot notices it...

Crell’s picture

OK, and a new patch, dubbed beta1. :-)

I consider this more or less "ready" from an architectural point of view. There will almost certainly be some revision and cleanup as time goes on, but I think it's mostly solid.

This patch includes the core context system, lots of unit tests, and a conversion of selected parts of the path handling system to the context API. Specifically, the utility functions relating to munging and messing with the incoming path have been replaced with their context equivalent. See the issue summary for specific examples.

Note: There are still some documentation todos in the code, and some of the unit tests need to be cleaned up. This is not quite in a commitable state, but I think it's fairly close.

Odds are the testbot will reject this. There's some menu bugs that the dev process ferretted out that are still being worked on in other issues. As soon as those get fixed I will reroll the patch. Still, please review!

If you have questions about the API, please see the issue summary (I tried to make it as detailed as possible) and the linked handbook documentation. If there are any gaps I will try to fill it in, in one place or the other.

If there are more detailed issues to discuss, I may ask that we split those back out into issues in the new wscci component to keep this thread manageable.

Huge thanks to everyone who has worked on this code in the WSCCI sandbox so far! Let's try to get this home ASAP.

Crell’s picture

Issue summary: View changes

Update and expand the summary, especially for recent API changes.

Status: Needs review » Needs work

The last submitted patch, 1233232-context-api-beta1.patch, failed testing.

aspilicious’s picture

Status: Needs work » Needs review
FileSize
101.95 KB

Lets try this one. will explain after I got the results.

Status: Needs review » Needs work

The last submitted patch, 1233232-context-with-user-fix-26.patch, failed testing.

aspilicious’s picture

1) the http handler test is failing because the test bots return someting else than 'POST' for http:method

from irc:

<beejeebus> / Masquerade as Apache for running tests.
<beejeebus>   simpletest_script_init("Apache");
<beejeebus> aspilicious: ^^
<beejeebus> aspilicious: that's from core/scripts/run-tests.sh
<beejeebus> so the code in the failing test tries to not run if it is called from the command line
<beejeebus> but our command line test runner pretends, that, its, apache
<beejeebus> and hilarity ensues
<beejeebus>     if (!isset($_SERVER['SERVER_SOFTWARE']) && (php_sapi_name() == 'cli' || (is_numeric($_SERVER['argc']) && $_SERVER['argc']
<beejeebus> i'm not sure what all the extra checks are
<beejeebus> if (php_sapi_name() == 'cli') {
<beejeebus> should do

2) Second one is caused by cached pages

    // Verify logging of a cached page.
    $this->drupalGet($path);
    $this->assertIdentical($this->drupalGetHeader('X-Drupal-Cache'), 'HIT', t('Testing a cached page.'));
    $log = db_query('SELECT * FROM {accesslog}')->fetchAll(PDO::FETCH_ASSOC);
    $this->assertTrue(is_array($log) && count($log) == 2, t('Page request was logged.'));
    $this->assertEqual(array_intersect_key($log[1], $expected), $expected);
    $node_counter = statistics_get($this->node->nid);
    $this->assertIdentical($node_counter['totalcount'], '2');

I believe this is caused by

      // Restore the metadata cached with the page.
      $_GET['q'] = $cache->data['path'];

Line 2314 (after applying patch) in bootstrap.inc.
We need to get rid of the $_GET['q'] call.

Maybe I'm wrong as we need to set $_GET['q'] or path:system/path:raw in this case...
At this moment the cached pages skip a part bootstrap phase.
This means the context handlers are not initialised. (path:system and path:raw, will not contain the cached values)
We need to trigger the context system somehow.

I could be wrong again, but I'm positive this will help fixing the bug :)

3) // still looking

Crell’s picture

I pulled the test code there from the drupal_is_cli() function, or whatever it's called. It was working on my system when I ran tests through Drush, but it looks like Drush and run-tests.sh do things differently. Lovely.

Honestly I'm fine just removing that test. It's a poor test anyway since it's testing particulars of the testing framework, which is an unreliable thing to be testing in the first place.

aspilicious’s picture

For 2: this is caused by this

http://drupal.org/node/692044#comment-2981412

webchick’s picture

Assigned: Crell » Dries

Dries will probably want a look at this. Yoinking from Crell. :)

Crell’s picture

Status: Needs work » Needs review

New patch. This removes the broken unit test from #28, does a little other cleanup, and updates for the new namespace usage standards. The rest of the breaking tests, it looks like are due to the statistics module. How that of all things could be breaking, we're not entirely sure. Anyone who wants to jump in here and help please do, because I'm confused.

This could still use human reviews!

Crell’s picture

As I was saying...

Status: Needs review » Needs work

The last submitted patch, 1233232-context-api-beta2.patch, failed testing.

pounard’s picture

@Crell

We are actually doing a code sprint in France with fgm and DjebbZ trying to make the test pass. I spotted one problem which actually the #1346490: User path is broken issue.

I think I may have a temporary (working solution), please see the comment there: http://drupal.org/node/1346490#comment-5294944

pounard’s picture

Status: Needs work » Needs review
FileSize
103.66 KB

Status: Needs review » Needs work

The last submitted patch, 1233232-context-api-beta3-temporary-menu-workarround.patch, failed testing.

pounard’s picture

Status: Needs work » Needs review
FileSize
104.71 KB

Thanks to fgm who debugged hard enough to find the real source of statistics module errors, we managed to fix it. Patch attached, hopping this one is the right one!

The solution is not ideal, but if tests pass, it will allow WSCCI to eventually be included to core, which then will allow us to fix those problems in atomic patches.

Status: Needs review » Needs work
cweagans’s picture

Can you guys post interdiffs with further patches please? It's hard to see what's changing.

Crell’s picture

A branch off of wscci in the wscci sandbox would be useful, too, so we can see the change history and then do a proper merge when you're done. Thanks!

pounard’s picture

Status: Needs work » Needs review

Will do.

pounard’s picture

Found the cause of some the remainings Statistic module related errors, opened an issue of its own: #1355118: Move path.inc functions into OOP autoloaded potential code to avoid WSOD when calling HandlerPathSystem::getValue() [statistics] needs commenting and review please.

Dries’s picture

Because that is all cleanly separated, but maintained through the context system, it is far easier to track, grok, and override than trying to keep track of the current mix of global functions and global variables.

My initial reaction to this patch is that I'm not sure that this statement is really true -- it may take some getting used to, but at least for now it is not easier for me. It seems like it is just different. In fact, now there is a mapping layer people need to decipher to figure out what function a 'context variable' maps onto; the way the mapping works, doesn't make that easy. So, I don't buy that this is "easier to track, grok, and override" just yet.

However, based on our conversations that was never really the point. I think it is much more important to understand how this will make web services APIs easier (compared to maintaining the status quo). I'll have to go back to some of the earlier architectural discussions to refresh my mind on why this is important to get RESTful web services in Drupal 8. If you can link us to a concise description of that, that would be helpful for me and all other reviewers.

webchick’s picture

I decided to try a "Learning by Discovery" exercise tonight with this patch.

I find this value in a piece of code:

drupal_get_context()->getValue('path:system');

echoing that gets me "node". I'm curious how that got on the page.

If I want to see all of the context info available, you can var_dump(drupal_get_context()); which will get you:

object(Drupal\Context\Context)#8 (8) {
  ["queryString":protected]=>
  NULL
  ["handlerClosures":protected]=>
  array(3) {
    ["http"]=>
    object(Closure)#7 (0) {
    }
    ["path:raw"]=>
    object(Closure)#4 (0) {
    }
    ["path:system"]=>
    object(Closure)#5 (0) {
    }
  }
  ["handlers":protected]=>
  array(8) {
    ["path:system"]=>
    object(Drupal\Context\Handler\HandlerPathSystem)#12 (1) {
      ["params":protected]=>
      array(0) {
      }
    }
    ["path:raw"]=>
    object(Drupal\Context\Handler\HandlerPathRaw)#11 (1) {
      ["params":protected]=>
      array(0) {
      }
    }
    ["http:query:q"]=>
    NULL
    ["http:query"]=>
    NULL
    ["http"]=>
    object(Drupal\Context\Handler\HandlerHttp)#61 (2) {
      ["request":protected]=>
      object(Symfony\Component\HttpFoundation\Request)#13 (18) {
        ["attributes"]=>
        object(Symfony\Component\HttpFoundation\ParameterBag)#56 (1) {
          ["parameters":protected]=>
          array(0) {
          }
        }
        ["request"]=>
        object(Symfony\Component\HttpFoundation\ParameterBag)#54 (1) {
          ["parameters":protected]=>
          array(0) {
          }
        }
        ["query"]=>
        object(Symfony\Component\HttpFoundation\ParameterBag)#55 (1) {
          ["parameters":protected]=>
          array(0) {
          }
        }
        ["server"]=>
        object(Symfony\Component\HttpFoundation\ServerBag)#59 (1) {
          ["parameters":protected]=>
          array(33) {
            ["HTTP_HOST"]=>
            string(9) "localhost"
            ["HTTP_CONNECTION"]=>
            string(10) "keep-alive"
            ["HTTP_CACHE_CONTROL"]=>
            string(9) "max-age=0"
            ["HTTP_USER_AGENT"]=>
            string(117) "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2"
            ["HTTP_ACCEPT"]=>
            string(63) "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
            ["HTTP_ACCEPT_ENCODING"]=>
            string(17) "gzip,deflate,sdch"
            ["HTTP_ACCEPT_LANGUAGE"]=>
            string(14) "en-US,en;q=0.8"
            ["HTTP_ACCEPT_CHARSET"]=>
            string(30) "ISO-8859-1,utf-8;q=0.7,*;q=0.3"
            ["HTTP_COOKIE"]=>
            string(26) "Drupal.toolbar.collapsed=0"
            ["HTTP_IF_NONE_MATCH"]=>
            string(12) ""1322560916""
            ["HTTP_IF_MODIFIED_SINCE"]=>
            string(31) "Tue, 29 Nov 2011 10:01:56 +0000"
            ["PATH"]=>
            string(29) "/usr/bin:/bin:/usr/sbin:/sbin"
            ["SERVER_SIGNATURE"]=>
            string(0) ""
            ["SERVER_SOFTWARE"]=>
            string(66) "Apache/2.2.21 (Unix) mod_ssl/2.2.21 OpenSSL/0.9.8r DAV/2 PHP/5.3.6"
            ["SERVER_NAME"]=>
            string(9) "localhost"
            ["SERVER_ADDR"]=>
            string(9) "127.0.0.1"
            ["SERVER_PORT"]=>
            string(2) "80"
            ["REMOTE_ADDR"]=>
            string(9) "127.0.0.1"
            ["DOCUMENT_ROOT"]=>
            string(19) "/Users/abyron/Sites"
            ["SERVER_ADMIN"]=>
            string(15) "you@example.com"
            ["SCRIPT_FILENAME"]=>
            string(33) "/Users/abyron/Sites/8.x/index.php"
            ["REMOTE_PORT"]=>
            string(5) "53338"
            ["GATEWAY_INTERFACE"]=>
            string(7) "CGI/1.1"
            ["SERVER_PROTOCOL"]=>
            string(8) "HTTP/1.1"
            ["REQUEST_METHOD"]=>
            string(3) "GET"
            ["QUERY_STRING"]=>
            string(0) ""
            ["REQUEST_URI"]=>
            string(5) "/8.x/"
            ["SCRIPT_NAME"]=>
            string(14) "/8.x/index.php"
            ["PHP_SELF"]=>
            string(14) "/8.x/index.php"
            ["REQUEST_TIME"]=>
            int(1322560984)
            ["argv"]=>
            array(0) {
            }
            ["argc"]=>
            int(0)
            ["HTTP_REFERER"]=>
            string(0) ""
          }
        }
        ["files"]=>
        object(Symfony\Component\HttpFoundation\FileBag)#58 (1) {
          ["parameters":protected]=>
          array(0) {
          }
        }
        ["cookies"]=>
        object(Symfony\Component\HttpFoundation\ParameterBag)#57 (1) {
          ["parameters":protected]=>
          array(1) {
            ["Drupal_toolbar_collapsed"]=>
            string(1) "0"
          }
        }
        ["headers"]=>
        object(Symfony\Component\HttpFoundation\HeaderBag)#60 (2) {
          ["headers":protected]=>
          array(12) {
            ["host"]=>
            array(1) {
              [0]=>
              string(9) "localhost"
            }
            ["connection"]=>
            array(1) {
              [0]=>
              string(10) "keep-alive"
            }
            ["cache-control"]=>
            array(1) {
              [0]=>
              string(9) "max-age=0"
            }
            ["user-agent"]=>
            array(1) {
              [0]=>
              string(117) "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2"
            }
            ["accept"]=>
            array(1) {
              [0]=>
              string(63) "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
            }
            ["accept-encoding"]=>
            array(1) {
              [0]=>
              string(17) "gzip,deflate,sdch"
            }
            ["accept-language"]=>
            array(1) {
              [0]=>
              string(14) "en-US,en;q=0.8"
            }
            ["accept-charset"]=>
            array(1) {
              [0]=>
              string(30) "ISO-8859-1,utf-8;q=0.7,*;q=0.3"
            }
            ["cookie"]=>
            array(1) {
              [0]=>
              string(26) "Drupal.toolbar.collapsed=0"
            }
            ["if-none-match"]=>
            array(1) {
              [0]=>
              string(12) ""1322560916""
            }
            ["if-modified-since"]=>
            array(1) {
              [0]=>
              string(31) "Tue, 29 Nov 2011 10:01:56 +0000"
            }
            ["referer"]=>
            array(1) {
              [0]=>
              string(0) ""
            }
          }
          ["cacheControl":protected]=>
          array(1) {
            ["max-age"]=>
            string(1) "0"
          }
        }
        ["content":protected]=>
        NULL
        ["languages":protected]=>
        NULL
        ["charsets":protected]=>
        NULL
        ["acceptableContentTypes":protected]=>
        NULL
        ["pathInfo":protected]=>
        NULL
        ["requestUri":protected]=>
        string(5) "/8.x/"
        ["baseUrl":protected]=>
        NULL
        ["basePath":protected]=>
        NULL
        ["method":protected]=>
        NULL
        ["format":protected]=>
        NULL
        ["session":protected]=>
        NULL
      }
      ["params":protected]=>
      NULL
    }
    ["http:request_uri"]=>
    NULL
    ["http:script_name"]=>
    NULL
    ["http:php_self"]=>
    NULL
  }
  ["contextValues":protected]=>
  array(6) {
    ["http:query:q"]=>
    string(0) ""
    ["http:request_uri"]=>
    string(5) "/8.x/"
    ["http:script_name"]=>
    string(14) "/8.x/index.php"
    ["http:php_self"]=>
    string(14) "/8.x/index.php"
    ["path:raw"]=>
    string(0) ""
    ["path:system"]=>
    string(4) "node"
  }
  ["usedKeys":protected]=>
  array(6) {
    ["http:query:q"]=>
    string(12) "http:query:q"
    ["http:request_uri"]=>
    string(16) "http:request_uri"
    ["http:script_name"]=>
    string(16) "http:script_name"
    ["http:php_self"]=>
    string(13) "http:php_self"
    ["path:raw"]=>
    string(8) "path:raw"
    ["path:system"]=>
    string(11) "path:system"
  }
  ["locked":protected]=>
  bool(true)
  ["stack":protected]=>
  object(Drupal\Context\Stack)#9 (1) {
    ["stack":protected]=>
    array(1) {
      ["000000001203fd58000000005c089356"]=>
      *RECURSION*
    }
  }
  ["parent":protected]=>
  NULL
}

Grepping that for "path:system" to try and find out where that value comes from gets me:

["handlerClosures":protected]=>
  array(3) {
    ...
    ["path:system"]=>
    object(Closure)#5 (0) {
    }
  }
  ["handlers":protected]=>
  array(8) {
    ["path:system"]=>
    object(Drupal\Context\Handler\HandlerPathSystem)#12 (1) {
      ["params":protected]=>
      array(0) {
      }
    }
   ...
  ["contextValues":protected]=>
  array(6) {
...
    ["path:system"]=>
    string(4) "node"
...
  ["usedKeys":protected]=>
  array(6) {
...
    ["path:system"]=>
    string(11) "path:system"
...

object(Drupal\Context\Handler\HandlerPathSystem)#12 (1) { gives me some idea what to start grepping for to find where the parent function is defined, but since there's nothing there that points out the core/includes directory I need to know that the Drupal namespace is stored there.

After popping open core/includes/Drupal/Context/Handler/HandlerPathSystem.php and grepping for "path:system" I find nothing. However, pretty much the only thing in there is a getValue() function:

  /**
   * Implements HandlerInterface::getValue().
   */
  public function getValue(array $args = array(), ContextInterface $context = null) {
    $system_path = '';

    $path = $context->getValue('path:raw');
    if (!empty($path)) {
      $system_path = drupal_get_normal_path($path);
    }
    else {
      $system_path = drupal_get_normal_path(variable_get('site_frontpage', 'node'));
    }

    return $system_path;
  }

...and there's where I find the drupal_get_normal_path() call, which gives me the "node" value. Whew!

I will say though that it's nice we get rid of both current_path() and $_GET['q'] (two ways to refer to the same thing) in favour of one value. But $_GET['q'], although ugly, is something universally understood by PHP programmers. This custom mapping layer is not, and takes some getting used to.

bryanhirsch’s picture

subscribe

webchick’s picture

Priority: Normal » Major

Increasing to major to get some eyeballs on this.

David_Rothstein’s picture

How about just keeping current_path() in the codebase, as a wrapper around drupal_get_context()->getValue('path:system')?

It's easier to read, type and remember, and the API docs for that function can explain exactly what it returns, which is all that 99.9% of developers will care about.

That would leave the infinite grepping exercise that @webchick got drawn into only for the 0.1% who are very very curious about how the deep internals work :)

Crell’s picture

We want to get to the point that drupal_get_context() Is removable, or at least minimally used. I don't expect that to happen within Drupal 8's lifecycle, but that's a guideline to move toward. Retaining current_path(), arg(), request_uri(), and so forth doesn't help people learn the new system, nor does it help them understand that where that data is coming from is not reliably always the global request in the first place. It actually makes it more difficult for them to think in terms of injected request information, since it's so easy to ignore. So I'd actually think that's counter-productive.

I will respond to Dries and webchick in a bit, but I'm on 8 hours of phone calls today. :-)

Crell’s picture

Priority: Major » Normal

@Dries: The following excerpts from the summary talk about how this relates to web services / WSCCI:

All of that information, collectively "context", is information about how the code should behave that comes from the current request cycle. However, that is only part of the story. In order to support REST services, Varnish, backbone.js, and other advanced page building techniques in a robust fashion we need to be able to render any portion of the page in isolation. That means being able to control, and alter, what context that portion of the page "sees", regardless of whether we're rendering all of the page at once or parts of the page in separate requests. That means being able to layer the rendering of the page, and change, or override, that context at any point.

1) It allows us to use context for the Panels-like super-blocks design in phase 4 of the Web Services Initiative, where a given block (nee Panels Pane) can be used multiple times on different nodes, or users, or whatever. That allows us to render blocks individually and reliably, which means easy ESI support, easy partial page caching within Drupal, easy in-browser layout (a la FaceBook's BigPipe), etc.

So it's not just that we need a consistent API (which is by and large a good thing on its own). It's that we need a consistent API that we can easily "fake out", and to isolate a portion of a page (a block) from the rest of the page. This Context API is intended to provide the channel by which we do that. As a nice side effect, it means a given piece of code can be written to ignore various globals and just pay attention to context passed into it. That means we can unit test it without worrying about the unit test breaking the testing environment itself, since it's all running in the same request with the same globals and request information. (Hacking that information and trying to restore it after the test is a grotesque hack that is very brittle and error prone.)

@webchick: True, it's not immediately obvious where a given piece of context came from when you ask for it. To an extent, that's intrinsic in making it pluggable. Just as you can't tell which cache implementation is going to be used just from looking at a cache_set() call and have to go look up the configuration to see where it will get routed to, the context call itself doesn't tell you exactly where it comes from. It is designed to be something that can be changed.

There is discussion in another thread (#1323210-9: Discoverability and documentation and following comments) about ways to better document the "usual" case. Currently the best proposal is a new @ doxygen tag for "the context key this is probably associated with is X". Better ideas welcome.

I agree that a var_dump of drupal_get_context() is not the most useful thing. Generally I find var_dump() fairly unuseful period, frankly. :-)

catch’s picture

I'm not sure cache_get() is a good comparison.

When you call cache_get() (or now cache($bin)->get($cid), you know it is going to fetch a cache item corresponding to $id, stored in a bin called $bin. It is possible for a cache back-end to do something tricky (like point all bins to the same place and prefix them), but unless you are setting up the particular cache back-end that does this or tracking down a bug with it, you really don't care about that.

While it is not very good, there is also a mechanism to discover which bins are available on the site, and we know that cids are defined entirely by calling code.

Currently with the drupal_get_context()->getValue('path:system'); syntax it is more equivalent to having every possible level of a namespaced cid swappable, so cache($bin)->get('foo:bar:baz'); you might have either a handler or a value set at foo, bar or baz and any of those could be responding to the request. I count six possible places that information could be pulled from within the context system itself - that doesn't include swapping anything out, just the ability to put handlers and values anywhere.

This is what makes it hard to figure out where things come from, not that a particular handler might be swappable IMO. If I knew there was only ever going to be a single 'path' handler (which might be mocked or plugged, but only one place to start looking), then that cuts down the possible places I have to look by half.

David_Rothstein’s picture

Retaining current_path(), arg(), request_uri(), and so forth doesn't help people learn the new system, nor does it help them understand that where that data is coming from is not reliably always the global request in the first place.

We still have node_load() and user_load() in core, even though they are basically just dumb wrappers around entity_load(). I don't think that's prevented developers from learning about entities once they're ready to; rather, it just makes it so they aren't forced to be confronted with the "entity" concept when they're trying to do something simpler.

To me this seems like a similar situation. It's a really nice-looking patch from what I've seen so far, but I don't see why the innards have to be exposed in so many places. If I'm just trying to write code (or read someone else's code) that uses l() to link to the current path, it doesn't make sense to me that I have to be aware of the swappable context system underneath it all.

effulgentsia’s picture

$_GET['q'], although ugly, is something universally understood by PHP programmers.

Is it? If I'm a PHP programmer new to Drupal, and I'm looking at the URL of http://drupal.org/index.php?q=node/38878, then indeed I know exactly what $_GET['q'] means and why it's 'node/38878'. However, if I'm looking at the URL of http://drupal.org/project/views, I have absolutely no idea why $_GET['q'] is 'node/38878'. I'd have to first understand about how Drupal implements clean URLs (and chooses to map them to a 'q' parameter), then understand about Drupal's mapping of 'q' 'aliases' to 'paths', and then understand that once Drupal does all this, it modifies a PHP superglobal. On a multilingual site, I might also need to know something about how language codes within the URL I see in my address bar relate to what Drupal defines as a 'path'.

Personally, I think current_path() would make a lot more sense to a new Drupal developer than $_GET['q'] would (though perhaps asking non-Drupal PHP developers would be better than me guessing). So, unless we hear interesting feedback about this from non-Drupal PHP developers, at a minimum, I think we should replace all current code that accesses $_GET['q'] to instead call current_path().

It's existing Drupal developers, not general PHP programmers, who will be most affected by having to unlearn using $_GET['q'] for accessing information that must be dynamic, and therefore, is properly a function or method. Of course, we care about this demographic, so let's get feedback from them.

As far as removing current_path() and forcing all client code to call drupal_get_context()->getValue('path:system'), can we please remove that from this patch, and take that up in a separate issue? It's clearly controversial, and I'd rather see us use this issue to review and refine the meat of the patch.

Crell’s picture

Let's also keep in mind that as of Drupal 7, $_GET['q'] is not, actually, the 'q' GET parameter. That's only true if clean URLs are off. There's now a series of possible places that we could extract the active path from, and in the end we assign that TO $_GET['q'] (something that you should never, ever do in PHP), for, as near as I can tell, backward compatibility reasons. (I didn't realize that either for a long time, even though catch tried to point it out to me. I think that change went past a lot of people.) So really, using $_GET['q'] at all is inaccurate, and would be misleading for a PHP developer used to $_GET being, well, the GET parameters.

I can re-add current_path() as a BC function for now, but it should be marked @deprecated and documented as such. I still want to see it removed before D8 ships, though.

David: Bear in mind also that drupal_get_context() is only for legacy procedural code. Most of the code that we build on top of this will be object oriented and take a context object injected into it. That means you'll have a $context property to begin with. So even if we do have the old functions for BC reasons, the actual context API itself will be exposed all over the place, by design.

catch: Would you be OK with a @suggested-key documentation approach? So far that's the best anyone has suggested. I'm still open to better ideas, though.

I'll be posting a new patch shortly. Stand by.

catch’s picture

I'd be more interested in making the API a bit less verbose than keeping what will be potentially dozens of wrapper functions lying around.

For example just context() instead of drupal_get_context() is less than 50% the typing.

We are hoping to kill some of the wrapping levels of the entity API as well, node_load() is mainly still there because we only had about 1/8th of an API in D7 so had to keep it 'internal' and because it went in not very long before code freeze.

@Crell I have seen some mention of only allowing handlers at the root level (i.e. at 'foo:', not 'foo:bar'), which would allow for using __call() as well, which potentially means something like

 - drupal_get_context()->getValue('path:system');
+ context()->path('system');

However I can't locate where this was suggested or what the answer was. I'd like to understand why it's necessary to allow a handler at every level, and what is actually gained from this compared to (potentially) simplifying the API for end users.

Crell’s picture

I'm totally fine with replacing drupal_get_context() with context() if no one else has any objections.

I don't recall anyone mentioning root-only handlers explicitly. It may have been chx, as I think I vaguely recall him mentioning that syntax. However, there's a few problems with that.

1) The magic methods on objects are less and less used in PHP of late from what I've seen. For one, they're terrible for debugging. (Certainly no better than closure{}, arguably worse.) They're also bad for performance, as __call() is about twice as slow as a normal method and usually ends up stacked with some other indirection (cufa() or similar), which makes it even slower. With PHP 5.4 supporting closure-object-binding, I would not be surprised to see __call()'s usage die out entirely in the next few years. I'm reluctant to go there when the trend seems to be to move away from there. (Note: This is not based on a scientific analysis, just the sense I get from reading php-internals and Planet PHP.)

2) We still need to support setting an explicit value at any level. Setting a value with $context->setValue('foo:bar:baz') but then reading it with $context->foo('bar:baz') is asymmetrical, which I'd argue hurts learnability. You couldn't just grep for foo:bar:baz and see where it is set-or-used.

3) We already have path:raw and path:system registered as different handlers, because they are, really, different things. Merging those into a single handler would require hard coding a switch statement inside the one object for no other benefit than working around an API limitation. That doesn't strike me as a good trade off. (Currently we do have a switch statement inside the HttpHandler, but that was because we only want a single request object. Now that we're using closures for handler registration and an injected request object we could fairly easily break that handler up into multiple if we wanted to.)

Crell’s picture

Priority: Normal » Major
FileSize
104.94 KB

So as discovered by pounard in #1355118: Move path.inc functions into OOP autoloaded potential code to avoid WSOD when calling HandlerPathSystem::getValue() [statistics], the main reason tests were failing is because of an arg() call in statistics_exit(). The real fix for that is to refactor the path system to be an object that we can lazy load rather than hard code an include for, but that will come later. So after talking with beejeebus (comment #5), we're just going to hack around it for now.

This patch also includes a ported version of #1337076: Clean up the many active / preferred things in menu API, as that's also needed to make tests pass. We can sort out later whether that should go in first or if it should just be included in the merge of this issue.

Also, setting back to major per webchick, since I unset it by accident crossposting earlier. :-)

Testbot, are you happy now?

catch’s picture

For one, they're terrible for debugging.

Compared to explode() of a colon delimited string and calling functions based on that?

We already have path:raw and path:system registered as different handlers, because they are, really, different things. Merging those into a single handler would require hard coding a switch statement inside the one object for no other benefit than working around an API limitation. That doesn't strike me as a good trade off.

I'm not sure that's such a bad trade-off, considering that people dislike the current syntax so much they want to keep request_path() and current_path().

edit: also with path:system and path:raw the 'path' is only there for namespacing. This could be pathSystem() and pathRaw() if we wanted, no switch required, so $context()->pathSystem().

effulgentsia’s picture

I'm totally fine with replacing drupal_get_context() with context()

Can we go one step further? We have cache($bin)->get()/set(), and the CMI sandbox has config($name)->get()/set(). Would it be possible, therefore, to have context()->get()/set() instead of getValue()/setValue()?

Also, looks like CMI is currently using a '/' delimiter for the key passed to get(). Can we have context and config use the same key delimiter? I prefer ':' since that's what we use for tokens, but mentioning it here in case there's a reason it must be '/' in CMI.

pounard’s picture

Simple accessors like get() and set() can potentially conflict with other accessors on the same object. drupal_get_context() has one great advantage over context(): it's explicit, and drupal namespaced, and it contains a verb which tells you what the function does. EDIT: cache and context has two really different API's and trying to harmonize further than in coding conventions them seems foolish in my opinion.

@catch Less code doesn't mean smaller function names, it's an horrible mistake: less explicit names leaves misunderstandings all over the code, for people that doesn't want to read the full core, and confuse incoming developers.

yched’s picture

I vote for ->get() rather than ->getValue() as well.
On the "set" side I do understand the setValue() / setHandler() dichotomy, but it doesn't exist on the "get" side - you don't have a getHandler() in the external-facing API, right ?
From a consumer point of view, I always "get" a value, and the fact that it comes from an actual stored litteral value or gets computed by a handler is not my business. So "getValue()" is unnecessarily verbose and tells me things about the internals I don't want to know or care about. I can't "get" anything else than a "value", so please don't make me specify I want a "value".

gdd’s picture

CMI is not actually using '/' as a delimiter anywhere that I am aware, we are using '.' but mostly that is convenience since we are often referring to file names.

pounard’s picture

@yched The API contains other getters. I think we really need to keep things explicit.

EDIT: Moreover, the object has a setValue() method, we need to keep those in sync. You can setHandler() or setValue(), which is not the same thing. For the end user, you want a value, so getValue() seems explicit enough to keep all consistent. Other accessors exists, get() alone makes no sense at all.

yched’s picture

@pounard - "The API contains other getters" :

That would be getStack(), getParent(), getHandlerAt(). AFAIK those all seem pretty internal ? If I got things right, one of the design guidelines of the Context API was something like : the concept of stacked contexts, and whether a value comes from a handler or a litteral (either natively or possibly overriding a handler lower in the stack), are hidden from the general consumer API, which is just about 'getting the value for a given key'.

If that is correct, then naming the method getValue() because there are other, internal-facing getters, while
- this "get" is the main (only ?) method in the consumer API,
- other subsystems (cache, CMI...) simply name their own method get(),
is a DX mistake IMO, the method should get promoted to a shorthand, both for syntactic and conceptual ease.

bojanz’s picture

Oh yes, definitely context() over drupal_get_context();
And I like get().
So instead of drupal_get_context()->getValue('path:system') we have context()->get('path:system'), which is much nicer (this is something we'll be typing a lot until the subsystems get refactored, so the "nice" factor is important). Though I guess I could live with getValue...

pounard’s picture

@yched Internal or not, they are part of the interface. API consumers also are people setting their handlers, even if internal it's part of the public API for some good reasons. Writing less doesn't mean being more efficient. Let's call all our variables $a, $b, $c ... $z it's so much easier to write.

yched’s picture

@pounard : "API consumers also are people setting their handlers" : true, but much less than people reading from it. So shaping the method names to promote one "side" of the API and make it more intuitive or friendly is a valid DX decision, than we can chose to make, or not.
Given that other subsystems are based on a get() method, I personnally think it makes sense.

+ I tried to outline in my 2 previous posts that this is not just about code brevity. It's about "don't force me to know you have internal stuff like handlers, stacks, etc". So the "sure let's rename all our variables $a, $b, $c to keep terse code while we're at it" argument is rhetoric overdramatization.

pounard’s picture

@bojanz
If I do a search in my IDE or using grep, in order to find all context() function call, I will have a lot of garbage all over the results. If the function name remains drupal_get_context() I won't have this problem.

Because drupal_get_context() is a temporary procedural wrapper that is not meant to be widely used, I seriously advise you to keep it long, unique, and explicit, because when the refactor time comes, you will be able to find it easily.

I definitely give the same piece of advice regarding the cache() function. Using a single yet widely used word as a function or method name is definitely a bad idea, for a lot of reasons: code comprehension, beginners learning curve, and automated code introspection/search.

Short names are bad.

pounard’s picture

@yched Yes, it is overdramatisation, but it definitely is revelant. In all the messages I read the only pros of using short function names are "code is nicer, it's easier to write". This is nonsense. cache, context and get are widely used words, and using shorter names drown their usage inside a huge mass of junk code as soon as you start using IDE search or refactor functions. For people that don't use IDE's, I assume you all use grep or equivalents.

EDIT: This idea of having such short name is as bad as the id() method over the EntityInterface introduced by the D8MI initiative. It seems that everyone yays because code is "short, nice" but it's comprehensible only for people who wrote it; It's unbearable to work with as soon as you are a consumer that tries to debug the code because you will never be able to do proper search in this codebase.

Re-EDIT: Provide a good DX also implies that you leverage development tools. If you don't think of that, your pseudo DX worth nothing.

aspilicious’s picture

I understand the concern about drupal_get_context() while working on the patch I found the long name anoying.

jherencia’s picture

Maybe I'm wrong but if somehow other frameworks in the future want to use Drupal CMI or Context API as we are currently doing with Symphony, context() and cache() could not fit well.

webchick’s picture

Don't PHP namespaces take care of that?

bojanz’s picture

If I do a search in my IDE or using grep, in order to find all context() function call, I will have a lot of garbage all over the results. If the function name remains drupal_get_context() I won't have this problem.

Because drupal_get_context() is a temporary procedural wrapper that is not meant to be widely used, I seriously advise you to keep it long, unique, and explicit, because when the refactor time comes, you will be able to find it easily.

So we keep it ugly because someone might need 10min more to roll a future patch that removes it?
This is not something I consider a valid argument.

And I fail to see how drupal_get_context() is more friendly to beginners than context().
Once they get past our itsafunction()->ohwait() way of doing things, it is irrelevant, and comes down to the amount of typing.

Crell’s picture

RE #72: PHP namespaces take care of that iff the code in question is namespaced. At this time no function in Drupal is namespaced, and I'm not really sure that they ever will be.

Symfony, Zend, and almost every other component framework we could ever imagine using is mostly or entirely OO. I don't know off hand if anyone is namespacing their functions... Neither context() nor cache() is likely to bump into something else unless we're integrating with another "full stack" system, or more likely legacy code for a specific site that was also written to assume that it could "own" the global namespace. (Drupal has always made that assumption, as has almost any procedural framework or application.)

Re #71: I reiterate that the *real* Context API is pure OO. drupal_get_context() is a Drupal-specific convenience wrapper to support Drupal's large library of procedural code. Symfony2, for instance, as I understand it uses the $request object in much the same way that we want to use the $context object; it passes it along from one renderable object thingie to the next, always. If we could do that in Drupal, then there would be no need for drupal_get_context() at all, nor even a formal stack object. We cannot, however, hence drupal_get_context().

I don't think that a strict getX/setX model is necessary for all objects; it strikes me as a very JavaBean way of thinking, which I have argued before is fundamentally flawed. (Sorry, Java fans, it's true.) Certainly jQuery has done just fine without get/set prefixes on everything. I think we originally had the method named value(), and I'd be OK with going back to that if consensus moves that direction. I am not picky here.

Re #58: Assuming we want that sort of namespacing at all, designing an API that we have to hack around right from the start means we've designed a broken API. Using a string key gives us flexibility, nestability, and it's obvious that you're doing an indirect lookup. If we make everything a pseudo-method, then instead of going "wtf is this var_dump so big?" they'll be going "wtf? Where is this method and why can't I find it?" That's hiding complexity that they're still going to run into as soon as they try to investigate what's going on under the hood.

Whether or not we want namespacing, well, I like the flexibility it gives us. I'd probably ask EclipseGc to weigh in here as our resident "current uses of a somewhat similar system of context" expert. :-)

I'm not sure that's such a bad trade-off, considering that people dislike the current syntax so much they want to keep request_path() and current_path().

One person asked for that, and one other said we should punt on that question. That's hardly justification to say that "people dislike the syntax so much".

jherencia’s picture

@webchick

Yep, but currently the function is not part of any namespace and almost every function in bootstrap follows drupal_ pattern.

I think, and correct me if I'm wrong, that it is better to follow the same conventions for all the functions, so using namespaces for new functions and keeping the old ones untouched make harder the learning curve.

@Crell

Thanks for the explanation, I already knew that so I was just trying to point that currently no wrapper function is namespaced and in case drupal_get_context is renamed to context it will probably make more difficult to use Drupal with other frameworks.

Crell’s picture

A reason we shouldn't just do ->pathSystem(): By design, we want to be able to override just a single value in a structure like http, say to override just http:query:page or http:post:some_form_value. How would that make any sense if we don't have nested keys?

sun’s picture

As others have outlined, the primary interest of a consumer is to get a value, without having to care for API internals. At the same time, values should be visible for debugging, and some setter methods potentially want to override particular values. So we have a nested hashmap compound of key/value pairs being set up by sub-context specific handlers. Did we consider ArrayAccess?

$context['http']['post']['site_name']

Wouldn't that be the most natural form?

Obviously needs a separate $context = context(); first, but that's considered temporary anyway, so pointless to debate.

Crell’s picture

sun: Yes, we originally were using ArrayAccess. It was dropped because it was, actually, a fairly poor fit for the logic we were trying to implement.

Also, ArrayAccess cannot nest like that until later versions of PHP 5.3, and would require that we have an actual object or array at every one of those levels.

See #1283542: Don't use ArrayAccess and the linked g.d.o thread for the gory details of how we ended up with the current syntax. I held out for ArrayAccess myself for a long time, but in the end ArrayAccess forced us to support operations that just didn't make sense in this context (no pun intended).

EclipseGc’s picture

OK, so I guess I should weigh in at this point... a few things:

1.) I REALLY like getValue() because I can do a grep for getValue() and see everywhere that it is used and understand very quickly exactly how I should leverage this context thing. Change it to just get() and I'm suddenly digging through configuration and cache, and God only knows what else that may choose to use simply get()... or I'm building a significantly more complicated grep...

2.) drupal_get_context() is probably just fine as it is, because at some point the basic code within it will get moved to an appropriate spot in the bootstrap process, and just pass context along everywhere. At that moment my point in number 1 becomes even more important because really, we have no indicator that we're dealing with a context object other than the fact it got passed to our function, so I can't just grep for drupal_get_context() or context() or whatever anymore. I'm not opposed to this, I'm just pointing it out.

As far as my experience with 'namespacing' most of that is in relation to the plugin system which is to say, when two different providers both need cache plugins, the plugins they define of that type may be vastly different depending on their needs, so... just play it safe and scope your name, it'll probably save you a ton of headaches in the future, and I submit to you that that logic is sound in most any circumstance.

I've actually been experimenting with the context system quite a bit, and overall I think it's pretty straight forward, and fairly simple to read through most of the classes that actually affect the lion-share of developers. Building a custom handler may get a bit more hairy, I don't know, I've not had to do that yet, but in so far as actually using the context system, my biggest complaint is that the rest of drupal is not yet ready to utilize it... So I am excited for us to all start agreeing on these points and get an initial patch in so that the other big blockers to really utilizing this system can start getting fixed.

Eclipse

EDIT* I say there's no indication that you're dealing with a context object, and that's not exactly true, the function signature will probably be something like function my_function(Context $context) or something similar, so we WILL know, it just won't be as easy to grep for as drupal_get_context().

catch’s picture

This is not remotely a proper review, and I specifically ignored the actual context classes themselves instead looking at usage and core changes around it, but to get some things down:

+++ b/core/includes/bootstrap.incundefined
@@ -1120,9 +1122,10 @@ function bootstrap_invoke_all($hook) {
+  $args = func_get_args();
   foreach (module_list(FALSE, TRUE) as $module) {
     drupal_load('module', $module);
-    module_invoke($module, $hook);
+    call_user_func_array('module_invoke', array_merge(array($module), $args));

There's no indication of why this is needed.

+++ b/core/includes/bootstrap.incundefined
@@ -1416,7 +1419,7 @@ function drupal_serve_page_from_cache(stdClass $cache) {
 function bootstrap_hooks() {
-  return array('boot', 'exit', 'watchdog', 'language_init');
+  return array('boot', 'exit', 'watchdog', 'language_init', 'context_init');

Adding an extra bootstrap hook makes me wince, given that context providers only need to register stuff here, not actually do anything. Additionally some requests (for example an request to generate an image derivative) are very unlikely, if at all, to require module-provided context, so this really goes against one of the end goals of lighter bootstraps.

Also once we have any core modules at all implementing this hook, it's very likely to affect page caching performance (considering the difference between db_query() and db_select() managed to hose page caching performance in 7.x - see #1064212: Page caching performance has regressed by 30-40%). So marking 'needs benchmarks', although that's not a great tag, I'd rather see some profiling with a couple of likely core modules implementing the hook. I'm also wondering just how many context handlers we'll end up with being registered.

+++ b/core/includes/bootstrap.incundefined
@@ -2411,6 +2414,66 @@ function _drupal_bootstrap_variables() {
+  require_once DRUPAL_ROOT . '/core/includes/common.inc';
+
+  drupal_bootstrap_context();

This should be a new bootstrap phase, not buried inside _drupal_bootstrap_variables() which is already overloaded.

+++ b/core/includes/menu.incundefined
@@ -490,7 +491,7 @@ function menu_execute_active_handler($path = NULL, $deliver = TRUE) {
-  $read_only_path = !empty($path) ? $path : $_GET['q'];
+  $read_only_path = !empty($path) ? $path : drupal_get_context()->getValue('path:system');

The more I read through the patch reading this, the more the verboseness gets wearing. Incidently I don't think the namespacing is remotely an issue here, if another framework wants to use this, they'd simply use the classes and not the factory. And I don't see how get() is less greppable than getValue(), that's only the case until some other class has a method called getValue() then you're back to square one. Likening it to $a, $b, $c is just silly, no-one is arguing for c()->g().

pounard’s picture

And I don't see how get() is less greppable than getValue(), that's only the case until some other class has a method called getValue() then you're back to square one. Likening it to $a, $b, $c is just silly, no-one is arguing for c()->g().

What's the difference between 1:

$object = $c->get('menu');

And 2:

$object = $c->get('menu');

Answer:

1 starts with:

$c = cache('cache');

And 2 starts with:

$c = context();

This is not easy to grep, search, or understand when reading the single line, that's why explicit getSoemthing() is IMHO really important in that regard.

back to square one

It's "back to square one" / 2 because we already splitted two use cases, which makes half the results while grepping: this is best than all.

@Crell And for what it worth, quoting jQuery is not relevant, the language is definitely different, and jQuery is not easy to debug (at all) and really hard to read when it comes to the internals. jQuery uses polymorphic accessors, getters and setters are the same method depending on the provided arguments at runtime: by writing OOP code we are relying on the compiler, and not on runtime magic: we are NOT doing this the way of jQuery, hence the get and set prefixes, which becomes definitely mandatory in our context.

Please lose the jQuery arguments, it worthes nothing in that particular discussion.

And please lose the rant against Java it's a language not a convention: we're not using it as an example, we're using PHP frameworks such as Symfony and Zend as examples.

catch’s picture

If people do $c = context() or $c = cache() that's their own fault. Change your example to $context = context(); and $cache = cache() and it's no harder to grep for at all, which is again why arguing for single-word method names is not the same as arguing for single letter variable names.

pounard’s picture

People will do, you cannot expect everyone to respect assiduously the rules, even harder in contrib. Moreover people will probably call their variables $cache, $backend, $cache_backend, $context, $cache_object, $object, $current_context, $current, $parent, etc... depending on their own algorithmic context.

pounard’s picture

I'd definitely advise a bit of reading from Joel on Software. Don't misread the article, it's not about hungarian notation, it's about making things so explicit that someone that don't know the API's can tell the algorithm is wrong.

The shorter and the more implicit are our method and object names, the more our algorithm are prone to errors.

It's not something I or Java invented, it's about 30 years of software development and experiences of others.

webchick’s picture

Sorry, but I'm with catch, yched, and others. Consistency and leigibility++. Unless you change getValue to getContextValue (and please, no), there's absolutely no guarantee that this remains greppable either. From the short-hand $context()->get(), it is quite clear what I'm "getting" so the extra specificity is just extra verboseness. That's what people are railing against. The point about $c[ontext]->get() is fair enough, but CMI and the cache system also have this. Once again, consistency should win out over any other factor here. We're trying to make Drupal easier for new developers to grok.

So I'd recommend simply changing this in the patch and then dropping this line of discussion for now, because it's completely distracting people from reviewing the actual context system itself, which is the entire point of this issue. If, as D8 matures and more CMI / cache system patches get committed, we decide that it's actually pretty confusing to have ->get() here, renaming that core-wide is going to be fairly easy, as long as we have decent automated test coverage. And if we don't, that's a separate issue. ;)

pounard’s picture

getContextValue() over getValue() would be a redundancy with ContextInterface interface name.

Go for it, and call every get() method in core as you can, the majority will win I guess. But if reading the Drupal code I just see get($something) everywhere my guess is the code is going to be unreadable.

// Totally stupid code, but pretty much what definitely can happen in core.
$cache = cache('cache'); // This one always make we laught really hard.
$context = context(); // This one still is weird.
$node_ctrl = entity_get_controller($node);
$conf = config('core');
$other = other(); // Let's assume this one is doing really hard work.

$do_it = $conf->get('do_it');
if ($do_it) {
  $depth = $conf->get('menu_depth');
  $key = 'menu:' . $depth;
  $cached = $cache->get($key);
  if (FALSE === $cached) {
    $value = $context->get($key);
    $built_value = $other->get($value);
    $cache->set($key, $built_value);
  }
  $node = $node_ctrl->get($some_id);
  // We did 6 get() in less than 10 lines.
  // My brain already has forfeit reading this.
}

Seems like a get() and c[a-z]+() code contest.

If the whole core code is going to be this, I can ensure you the result will be somehow really weird in the end.

Still this is dumb because if I want to combine interfaces, I can't (I don't see a direct use case now, but still seems stupid for so widely used objects, why a cache backend couldn't provide its configuration somehow?).

EDIT: Notice this one is not bad either:

if ($conf->get('do_it'))
  if (FALSE === ($cached = $cache->get('menu:' . $conf->get('menu_depth'))))
    $cache->set($key, $other->get($context->get('menu:' . $conf->get('menu_depth'))));

EDIT: @webchick Ho, and as you asked, I won't continue about this topic as it is not the issue's goal. Let's continue with the patch as-is and modify it later on a per-issue basis (meaning: keeping the getValue() and drupal_get_context() as-is until the new "consistency" issue is solved). Let's not make on-the-fly modifications as the politicians usually do in the summer while people are trying to focus on the holidays.

sun’s picture

I'm not sure what to say, but both code examples in #86 actually look very appealing to me. It's very easy to understand what is gotten from where, and as a consumer of these APIs, I'd definitely love the simplicity and consistency. It looks like we seem to be in a unique situation, in which we can actually make that design decision for a much simpler DX.

What concerns me way more are the actual keys/identifiers that people will have to use and understand, and figuring out what keys/identifiers are used or available. With regard to that, these code snippets are probably way too pseudo and won't match reality at all... rather:

if ($conf->get('system.cache.page.fast')) {
  $cache = cache('menu');
  $cid = 'menu:tree:all:' . implode(',', array_keys($context->get('user:current:roles'))) . ':' . $context->get('language:ui:langcode') . ':' . $conf->get('menu.tree.' . $menu_name . '.depth');
  if ($cached = $cache->get($cid)) {
    $result = $cached->data;
  }
  else {
    $result = $other
      ->get($context->get('menu:' . $menu_name), $conf->get('system.menu.' . $menu_name))
      ->render();
    $cache->set($cid, $result);
  }
}
Crell’s picture

It's been my experience that Drupal people are actually surprisingly good at using consistent variable names. Especially if we expect most context to be injected into objects and we have a variable name in the method signature of the interface called $context, that will take care of it for the vast majority case. Also, getValue() Is also used on the handler interface so just grepping for getValue() would still give some false positives.

If the consensus is to move to context()->get() rather than drupal_get_context()->getValue() (which it seems like it is), I'll adjust that in the next patch.

catch: The change in bootstrap_invoke_all() is because hook_context_init() requires a parameter, and no other bootstrap hooks do. Currently the code assumes that no bootstrap hook does, so takes a shortcut that we cannot with hook_context_init().

I'm honestly not a fan of hook_context_init() either, for precisely the performance concerns that catch mentions. I was originally hoping that it would just be temporary until CMI was ready. However, CMI is still not ready and we are. Also, closures (which work really well here) cannot be serialized so have to be defined at runtime.

So the problem is:
1) We need a way for modules to inject new context handlers with at little overhead as possible.
2) We do not have a disk-store available to us, and even once we do that won't work as well as closures for some (many?) use cases.
3) Code generation (which is what Symfony2 uses for its dependency injection container) is not an option for us because we cannot edit code on disk from the GUI securely. (And requiring a CLI script every time you want to change a setting, including adding or updating a module, is simply not practical.)

So that leaves hook_context_init() as the only option I can see. If someone else has a better idea, I would love to hear it because I don't like hook_context_init() either.

The only way I can see to improve the potential performance hit from it would be:

1) Move all bootstrap hooks into their own $module.bootstrap.inc file. This is arguably a good idea anyway, and I would support that regardless of the context system.
2) Pass $context to hook_boot() and let hook_boot() to double duty. I've not looked at the code for that yet to see how feasible that is.
3) The above are not mutually exclusive.

I was wary of adding more bootstrap "levels", on the grounds that long term I want to eliminate the idea of bootstrap levels entirely through better lazy loading. (Start up config and context always, and everything else loads as-needed.) If we want to add more boostrap levels for code tidiness, though, then we should add one for the autoload hookup, too. catch, if you would prefer I can break up the bootstrap levels more but I am hoping that is only a temporary measure.

And in other news, zOMG the patch passes testbot! :-)

David_Rothstein’s picture

There are pros and cons of drupal_get_context()->getValue() vs. context()->get(), but I agree we shouldn't worry much about that now. It's not a fundamental feature of the patch, and it's easy to change later.

I'm still more interested in the fact that lots of code throughout Drupal is being required to call that directly. Because either option basically looks like this to me:

-  current_path();
+  very_general_concept()...get...('random.string.key');

It's true that lots of other areas of Drupal already do that kind of thing (e.g., the variables/config and cache systems). However, variables and caching are simple enough concepts that I can give a new Drupal developer a quick explanation of what those are and how you use them, as well as what kinds of things you might expect to find in them.

"Context" seems to be a fundamentally more complicated concept to explain or understand; hence, my initial inclination is to not expose it in the code so much.

@Crell's explanation in #54 of why we can't just hide it under procedural wrappers has some good points:

Most of the code that we build on top of this will be object oriented and take a context object injected into it. That means you'll have a $context property to begin with. So even if we do have the old functions for BC reasons, the actual context API itself will be exposed all over the place, by design.

OK, so that makes sense; if a context object is passed to me I should be using it. And we don't want to keep current_path() and similar functions around if there are situations where calling them will return data from the wrong context. However:

  1. We could easily have current_path() and friends take an optional $context object as a parameter (and default to using drupal_get_context() only when one isn't provided), right? I'm not sure that's a good idea, but I'm throwing it out there anyway.
  2. More fundamentally, if the difference between an injected $context and the one returned by drupal_get_context() actually matters, that implies we are supporting more than one context object in the same page request. I'm curious why that's necessary for this system to support (compared to the alternative of having a single global context).

    I understand why we need to be able to modify the context via $context->addLayer() at various parts of the page request and have those modifications gracefully go away when they're finished (and so far I really like that part). But I'm not sure why that implies multiple totally separate context objects.

    So I guess what I'm really asking is in code like this:

      $newContext = $context->addLayer();
    

    Why does the addLayer() method have to return a completely new context object, rather than just modifying the existing one?

    A related issue (although not 100% the same as what I'm asking) seems to be this: #1341924: Make the WSCCI stack a singleton

David_Rothstein’s picture

Because drupal_get_context() is a temporary procedural wrapper that is not meant to be widely used, I seriously advise you to keep it long, unique, and explicit, because when the refactor time comes, you will be able to find it easily.

That's from #68, but I've seen similar sentiments in other comments too.

I don't entirely understand these statements. Give this point from the summary at the top of this issue:

It's not reasonable to expect that we can pass a $context object to every single function and every single hook in Drupal. In our current architecture that would be infeasible at best and impossible at worst.

...I'm pretty sure that means that drupal_get_context() will not be temporary at all, and will continue to be necessary to call (either directly or indirectly) from giant swaths of the Drupal core and contrib codebases. If that's not the case, could someone explain why?

pounard’s picture

@sun, #87
In #86 code samples, the same method name is used on 4 different objects which actually do very different things (each one of them has a specific business). It makes the code unclear; It doesn't make the errors being obvious at first read. Having the same method name everywhere, in this case, is not being consistent: on the contrary it hides the business differences where they do exist and especially where the developer should see them at the first sight in order to easily spot errors. For exemple, the cache()->get() method definitely is poorly named, because the get() triggers a fetch over a remote server, and this fact is not highlighted by the actual method name. This is bad because this leads to wrong cache usage where developers lose from their sight that the cache system actually does remote server requests when each one of the backend methods are called.

@#90
It means that the procedural accessor is only here for the existing core codebase; While it will probably exist for a long time being because the core is huge and won't evolve as fast as the new APIs. But it has been created to be deprecated from the begining because the real use case of this API is WSCCI components. Those components will gradually replace all the code that really need the context API on the the middle or long term, such as MVC/REST router. Other places such as new plugin system, CMI, D8MI, etc... are also OOP based components that won't need the procedural wrapper either if they are written accordingly, which is one effort to maintain over all iniatives.

pounard’s picture

@#89

Why does the addLayer() method have to return a completely new context object, rather than just modifying the existing one?

The new "layer" (aKa child context object) scope is only the function tree where it has been spawned. Once the runtime gets out of this functions, the parent is restored for the caller. It's a local override. That's why we don't modify the existing one. If we'd modify the existing one, we'd lose this encapsulation and local scoped modifications would become global.

Crell’s picture

pounard: Please let it go on the method names. You're now arguing naming philosophy in areas I don't even agree with you. We're going to switch to the shorter names and we can revisit that in a few months when more Drupal systems are using such patterns.

David: pounard is basically correct on the layering logic. An individual object is not costly, and a lock-and-stack design is much easier to implement than trying to make changes to an object, record the previous versions, then at some trigger put the old versions back and hope nothing got lost in the process. The thread you note is really discussing the mechanism by which we track the "currently active/top of stack" context object; The code as implemented avoids having an actual singleton anywhere in the Context system itself (which is a good architectural approach in general for many reasons I won't digress into here) and has just the one external static in drupal_get_context() itself. The trade off is more object references internally than we'd like. That thread was proposing eliminating some of those references by merging the static singleton back into the context objects themselves, which I don't believe is a good idea architecturally as it makes the code more brittle.

As for current_path($context), I don't really see a value to that. It would be wrapping a single line, so there's no benefit of code simplification, and it actually obfuscates what's actually going on. "path:system" is no more obscure, I think, and actually makes it clearer what it is. It's the system-path, always, not the raw, unaliased path from the request itself. So in that sense it's clearer. That said, if there's a better name for that key than "path:system", I'm fine with changing that, either now or later when we come up with something better.

"Context" seems to be a fundamentally more complicated concept to explain or understand; hence, my initial inclination is to not expose it in the code so much.

"The context of the page is the HTTP request itself, plus information that depends on it. It defines the variables that may change with every incoming request."

I think that expresses the basic concept well enough, especially since many other systems already have a formal request object. We're just taking that concept one step further.

Anonymous’s picture

spent some time discussing this patch with EclipseGc in IRC today, and looking through the code. i have some major concerns about the conceptual complexity of the patch at this stage. i'm coming at this posing the question "why can't we land context with something that is conceptually just like Zend::Registry with lazy loading?". (http://framework.zend.com/manual/en/zend.registry.using.html)

1. i'm not convinced about locking the context object(s)?. i suspect that much of the mocking would go away if code throughout the request could add values. locking a context object, then allowing for values to be 'mocked' doesn't actually achieve 'the value for this key won't change after this point', but it does encourage overloading the meaning of a context key (more on that below). i'm not 100% opposed to locking, but perhaps we can look at locking at a finer scope, or still allowing 'inserts' but not 'updates'?

2. i'm concerned about overloading the meaning of a context key, and the magic around that, particularly if it's expected to be a best practice. part of it seems to be an answer to 'how do i get my stuff in? this *$#*$%& context object is locked'. if we have code whose purpose is to prepare some data in $context for other code to consume, why can't that code just add something?

if the mocking code is temporarily overloading the meaning of a key, i think we're really doing it wrong. magic scope popping implies a contract between the code doing the mocking and the code being called - "thou shalt only call code i expect you to, and not call any code that expects the key i just changed to be something else". EclipseGc pointed out the problems that are caused by something like user_page():

function user_page() {
  global $user; 
  if ($user->uid) {
    menu_set_active_item('user/' . $user->uid);
    return menu_execute_active_handler(NULL, FALSE);
  }
  else {
    return drupal_get_form('user_login');
  } 
} 

i'm not in any way defending that code (ewww), but it shows clearly some code that will probably make mocked context data available in code that the mocker never intended it to be. this will be a general problem for any code that uses mocking and overloads a key.

and i'm not having a go at "a lock-and-stack design is much easier to implement than trying to make changes to an object, record the previous versions, then at some trigger put the old versions back and hope nothing got lost in the process" as an implementation. i'm concerned about making overridden context keys a best practice.

threewestwinds’s picture

As EclipseGc explained it to me in IRC, the configuration for a block (for example) should tell it where to find the information it wants - eg, render_node_as_block() should be told "find your node here in the context".

Given that, what purpose does mocking serve? Any time a previous value would need to be "covered up" seems like a failure of the above mentioned pattern.

threewestwinds’s picture

Fixing tags.

pounard’s picture

Please let it go on the method names. You're now arguing naming philosophy in areas I don't even agree with you. We're going to switch to the shorter names and we can revisit that in a few months when more Drupal systems are using such patterns.

This will never switch back if name changes right now, I'm starting to getting used to this. For what it worth people want to change names absolutely where we did the exact opposite some weeks ago during a long IRC meeting with maybe 10 people here (there's not really more people arguing this here). When it happened in the past, all the arguments that have been said here, in the past 30 comments already have been raised even more detailed. You were the one to agree with me at this time, for a internal API naming consistency reasons and because just "get" alone doesn't really mean anything.

David_Rothstein’s picture

David: pounard is basically correct on the layering logic. An individual object is not costly, and a lock-and-stack design is much easier to implement than trying to make changes to an object, record the previous versions, then at some trigger put the old versions back and hope nothing got lost in the process.

I don't believe the layering logic would need to change, though. It could still put objects on a stack and remove them. The patch already seems to have all the right machinery; it's just a question of how it's used. Basically, if you took the existing patch and renamed "class Context" to "class ContextLayer" and then renamed "class Stack" to "class Context" (and moved some of the current Context class's methods up to it), it looks to me like you'd be most of the way there.

From a public-facing perspective, instead of this:

$new_context = $context->addLayer();
$tracker = $new_context
  ->setValue('a', 1)
  ->setValue('b', 2)
  ->setValue('c', 3)
  ->lock();

It might become this:

$tracker = $context->addLayer()
  ->setValue('a', 1)
  ->setValue('b', 2)
  ->setValue('c', 3)
  // This is just a renamed version of lock().
  ->publishLayer();

Perhaps this is better explored in a separate issue. However, where it's relevant here is that if we aren't going to restrict things to a single context object, then the patch has implications not just for current_path() and other simple one-line wrappers we've talked about here, but also any API function that anyone writes anywhere in Drupal which (directly or indirectly) calls drupal_get_context(); any such function would either need to take $context as a parameter or else not be safe to call from within an object-oriented context, right?

On the one hand, perhaps this restriction is not too different from current best practices (which e.g. say not to use "global $user" in your functions unless you take $account as an optional parameter to override it), but on the other hand, it seems a lot bigger, and to me, less ideal.

"The context of the page is the HTTP request itself, plus information that depends on it. It defines the variables that may change with every incoming request."

That looks good. Where I think it may get complicated for people, however, is the "information that depends on it" part.

I do agree there's some value in having everything that comes from the context easy to spot in the code. But so far I'd prefer a system that didn't require everyone to access the context object so directly. For example, "current" in the function name... in addition to current_path() (maybe better renamed to current_system_path() per above discussion), we could have things like current_user() and current_node() in core, but more important, a contrib module can do mymodule_current_whatever() which instead of being a one-line wrapper is maybe a twenty-line API function that gets something from the context and then processes it for a while before returning.

David_Rothstein’s picture

As EclipseGc explained it to me in IRC, the configuration for a block (for example) should tell it where to find the information it wants - eg, render_node_as_block() should be told "find your node here in the context".

Given that, what purpose does mocking serve? Any time a previous value would need to be "covered up" seems like a failure of the above mentioned pattern.

My understanding is that a node which appears in the context will by default only be there if you're on a path like node/[nid], and will be derived from that path.

So I think where mocking would be useful is if you had a block that used information from the current node when rendering itself, and if you wanted to say "what would this block look like on the node/123 page?" even though you yourself are not currently on the node/123 page...

pounard’s picture

I do agree there's some value in having everything that comes from the context easy to spot in the code. But so far I'd prefer a system that didn't require everyone to access the context object so directly. For example, "current" in the function name... in addition to current_path() (maybe better renamed to current_system_path() per above discussion), we could have things like current_user() and current_node() in core, but more important, a contrib module can do mymodule_current_whatever() which instead of being a one-line wrapper is maybe a twenty-line API function that gets something from the context and then processes it for a while before returning.

I definitely understand that. Where I dislike the actual Context API code (and that's a bit paradoxal because I'm helping to keep it the way it is now) is that it's not tied to the framework business at all.

The procedural wrappers you propose are definitely what's missing for people to be able to naturally write code without having to read a 100 pages book to know everything that's gonna be in that context.

But -and that's the paradox- is I defend the all-purpose generic API. There is in my mind a solution to this problem: Context class could remain generic by design (so really simple too), but Drupal core could have a specialization of this class, instanciated at boostrap (and returned correctly type hinted by drupal_get_context() which is a specific use case) that would provide some shortcut methods: CoreContext->getPath(), CoreContext->getUser(), ... for example.

By doing this we leave the procedural code to a minimal and keep the full API encapsulated in its own namespace, but we provide explicit shortcuts to the most used pieces of data, it doesn't sound that bad in my mind, but I hear from here Crell having a heart attack.

The whole goal of this API is to provide a compatibility layer for procedural code (what drupal_get_context() actually does, we don't need more since it opens the door to the full API usage just using this one and only accessor). I don't see why people wouldn't be able to access directly the current context, on the contrary masking it behind weird PHP accessors sounds like a lot of code indirection.

I guess it's definitely not perfect in that regard, documentation is the first great effort to make in my opinion, but having some helper accessors in a specialized implementation doesn't sound bad in my mind.

David_Rothstein’s picture

There is in my mind a solution to this problem: Context class could remain generic by design (so really simple too), but Drupal core could have a specialization of this class, instanciated at boostrap (and returned correctly type hinted by drupal_get_context() which is a specific use case) that would provide some shortcut methods: CoreContext->getPath(), CoreContext->getUser(), ... for example.

This wouldn't be my first choice personally, but it could be a reasonable compromise. The good thing about it is that it preserves the ability to click on a function name (e.g. "getPath" in this case), in either an IDE or on api.drupal.org, and immediately go to the documentation saying what that particular piece of code does.

pounard’s picture

Yes I guess there is no perfect solution, but these are good questions that worth to continue thinking about as soon as the first piece of patch get commited, if it gets commited.

chx’s picture

Just for posterity, re #56 #1349176-3: Why HandlerHttp?

Crell’s picture

So, let me take a step back and give a bit more perspective and history.

The lock-and-stack logic came around out of a desire to use it to revamp the page building workflow (WSCCI part 4, which we recently decided should be called a "shot"). See this post for the kinds of things we want to be able to do.

All of those require isolating the data used by a given block (what WSCCI calls a block is much closer to a Panels content pane than core blocks today), so that block rendering becomes effectively asynchronous.

Blocks of course could be nested inside blocks (Regions and Blocks in core, or Panels/Mini-Panels in contrib). Also, a given block could be used many times with different information, such as a two-up design with the node from the URL and some other, maybe-related node in the other side. (Today I'd do that with Panels Everywhere, but I did it with just hook_theme hacking back in D6, too.) So we need to use configuration to change the information that is fed into a block, as well as to render a block out of its expected "context" (for ESI, etc.), and do so in a way that a block in the left sidebar cannot impact a block in the right sidebar (because that would make blocks dependent on each other, and therefore not asynchronous). And, of course, true unit testing is not possible if we have code that pulls its dependencies from elsewhere rather than having dependencies passed to it.

Hence the Context API, which combines all of the stuff-that-varies-per-request that we need to deal with into a single channel. By controlling that channel, we can make blocks asynchronous, make blocks easily "layered", and make them unit testable all in one go. (OK, there's more to unit testability than just that, but eliminating current_X() calls is a big part of it.)

So overriding the context key is how we say "Yes I know this url is for block/$block_id?foo=bar, but this block should still run as though it were on node/123 and use node 123 as the node it should be displaying." That's what's important about it.

While something like Panels/Ctools/Page Manager today handles that with an explicit key at every step for every single "context object [value]", I don't believe that scales well once we get into folding even more request-type data into it and nesting further. We all know how much the Panels/Page Manger UI annoys us, because it's dealing with an inherently complex problem space. Imagine if we had a block 4 layers deep in regions (which I don't think is unreasonable if you consider the entire page a region, and every region a region, etc.; that is just one "mini-panel" today) and that block cared about the current user.

Do you want to have to go back and configure a "user context" for every single layer to get there before you can add that block? Or would you rather have the "current user" context always available and lazy-initialize the first time it's used? The latter is, I think, a much better experience both for developers and for site builders. That's why the Context API is built the way it is, with keys always-available but lazy-initialized on demand.

"why can't we land context with something that is conceptually just like Zend::Registry with lazy loading?". (http://framework.zend.com/manual/en/zend.registry.using.html)

Actually the Context API is much more like a stackable/lockable version of Pimple. That was partially coincidence, partially by design. See http://www.slideshare.net/fabpot/dependency-injection-in-php-5354 for a more detailed description of it.

However, where it's relevant here is that if we aren't going to restrict things to a single context object, then the patch has implications not just for current_path() and other simple one-line wrappers we've talked about here, but also any API function that anyone writes anywhere in Drupal which (directly or indirectly) calls drupal_get_context(); any such function would either need to take $context as a parameter or else not be safe to call from within an object-oriented context, right?

Actually, the existence of drupal_get_context() is how we make it safe to call a function from within an object-oriented context. It creates an out-of-band channel that we can use to update the context for such functions. If we didn't need to make those functions safe to call, we could eliminate both drupal_get_context() and the stack object, and just demand that we pass the context object in constructors or method calls.

While current_X() { context()->get('X'); } may seem like a DX convenience at first, I don't think it is in practice. Firstly, it means that you cannot call that code until that function is loaded. That means we have a dependency on always pre-loading whatever file has a few dozen current_X() functions in it. That means we once again have a "full bootstrap" dependency for any code using that function, which is an overhead problem as well as a unit testing problem.

Secondly, it introduces a lot of additional function calls that offer little benefit, on a scale that is probably enough that we would notice it in benchmarking.

Thirdly, since we don't have a complete list of all context keys that may exist (contrib modules can add them, so such a list is impossible), there will always be some context keys that do not have a wrapping function. Will we make one for every possible part of the HTTP request? I doubt it. That means developers will always have to know how to use $context->get() anyway, because not everything will have a wrapping function. So that only increases the number of syntaxes that developers need to deal with.

Fourthly (yes, I'm using fourthly!), as noted above we are trying to shift from "pull from a big random pile of functions" architecture to a "well-organized objects that talk to each other in known ways" architecture. That's an integral part of the way WSCCI has been and is being designed. That gives us much more flexibility, modularity, testability, and (through lazy loading) performance. Reintroducing a big random pile of functions runs counter to that goal.

So I think it's better to bite the bullet and build a learnable system that we document and teach, rather than trying to paper over it with something that, by nature, will be incomplete and counter to the goals at hand.

David_Rothstein’s picture

Actually, the existence of drupal_get_context() is how we make it safe to call a function from within an object-oriented context. It creates an out-of-band channel that we can use to update the context for such functions. If we didn't need to make those functions safe to call, we could eliminate both drupal_get_context() and the stack object, and just demand that we pass the context object in constructors or method calls.

OK, I'm a little confused because this sounds different than what I read earlier.

I think the issue I'm raising might be easier to illustrate with some sample code. Does the code below make sense, and if so, is there a solution that avoids the problem? I ran this code and verified it behaves like the code comments say it does:

define('DRUPAL_ROOT', getcwd());
require_once DRUPAL_ROOT . '/core/includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

$context = drupal_get_context();
$new_context = $context->addLayer();
$new_context->setValue('path:system', 'test');
$new_context->lock();

$printer = new Printer($new_context);
// This prints 'test', like it should.
$printer->printSystemPathGood();
// But this prints something like 'node'.
$printer->printSystemPathBad();

class Printer {
  private $context;
  public function __construct($context) {
    $this->context = $context;
  }
  public function printSystemPathGood() {
    print "System path (good): " . $this->context->getValue('path:system') . ". ";
  }
  public function printSystemPathBad() {
    // This is a dumb example because it's easy to fix. But if we were calling
    // some other Drupal API function here that results in drupal_get_context()
    // being called indirectly, that's where a realistic problem could occur.
    print "System path (bad): " . drupal_get_context()->getValue('path:system') . ". ";
  }
}
catch’s picture

I'm honestly not a fan of hook_context_init() either, for precisely the performance concerns that catch mentions. I was originally hoping that it would just be temporary until CMI was ready. However, CMI is still not ready and we are. Also, closures (which work really well here) cannot be serialized so have to be defined at runtime.

So the problem is:
1) We need a way for modules to inject new context handlers with at little overhead as possible.
2) We do not have a disk-store available to us, and even once we do that won't work as well as closures for some (many?) use cases.
3) Code generation (which is what Symfony2 uses for its dependency injection container) is not an option for us because we cannot edit code on disk from the GUI securely. (And requiring a CLI script every time you want to change a setting, including adding or updating a module, is simply not practical.)

So on this, here's a possible way to avoid the extra bootstrap hook, discussed this a little with beejeebus in irc:

The problem is that there is no way to identify that a handler or value might be available for any context key without it being explicitly registered beforehand. Currently that's being solved by calling out to every module that might want to register one (outside of mocking) as early as possible in the bootstrap process. The performance issue is that page cache requests (and any other lightweight request that doesn't require context, which we don't have many of now, but image derivative generation is a possible target), then have to load all those modules and run that code even when they will only be using context from http.

What we could do instead is have the context object call out to hook_context_init() the first time that a context key is requested for which there is no handler. A cached page request would never hit that code path, most others will (but then the registration mechanism is the same). Once the hook has been called once, that's kept track of to ensure it doesn't get called again (if a context key is requested that simply doesn't exist for example).

I can see two replies to this, so I'll try to answer them now:

- it introduces a dependency of the context object on the module system. However #100 suggests a way to do that without it introducing a strict dependency.

- module_invoke_all() is very fragile since it depends on a full bootstrap, and it's possible that code before then is accessing context keys etc. This is really a general problem that is well documented by issues like #592008: Don't save theme registry before modules are included and #534594: [meta] system info cache writing and registry rebuilding is broken without a full bootstrap so while that needs to be dealt with, it shouldn't prevent considering that here - needs fixing anyway.

Anonymous’s picture

So overriding the context key is how we say "Yes I know this url is for block/$block_id?foo=bar, but this block should still run as though it were on node/123 and use node 123 as the node it should be displaying." That's what's important about it.

this implies some code that runs ahead of the block, and understands that it should prepare some data for the block. my first reaction to this is - why can't we settle on a more direct way for that code to prepare the node for the block?

if i want to reuse block $machine_name, and i know that block renders $entity_type, why not have the code that sets up the block's context do:

context()->set('block:' . $machine_name . ':' . $entity_type, $the_entity_i_wanted_rendered);

i think this is extremely obvious code, and any code somewhere else in the system that gets surprised by the value of that key is doing it wrong.

so, for the ESI/ajax example, we're half way there with no need at all for layers, mocking, locking etc. to render node/123 from $block_id, we do:

context()->set('block:' . $block_id . ':node', node_load(123));

not 100% sure what "run as though it were on 'node/123'" means, but i'm taking it to mean "change some context key that the block will reference that is not a key clearly scoped to the block, like 'path:system'". i don't think the magic stack popping based on __destruct and scope code complexity is worth it here, because we could just do:

context()->set('path:system', 'node/123');
// code for the block runs.
// for ESI/ajax, where this is the whole page, we're done.

for "normal" blocks, we have to put it back:

$system_path = context()->get('path:system');
context()->set('path:system', 'node/123');
// code for the block runs.
// for normal block, we put it back
context()->set('path:system', $system_path);

this requires zero mocking/locking/stacking. if you know what you're doing, and you're sure that no code the block calls in to actually needs the 'real' system path (or logged in user, or ...), then you can still change keys that are 'global'.

context stacking/mocking/locking or whatever will do absolutely nothing to stop code that relies on 'path:system' being the 'real' system path getting confused, and i think we should discourage this for normal situations.

this is drupal. people will write code in that block you're reusing that invokes hooks in a bunch of other modules, all of which will see the faked out global context key, leaving the possibility of awesomely subtle bugs. (think anything at all that caches something the first time it runs, which is a common pattern. but that thing is effected by the value of the global key we modified three miles away. then we call the reusable block again with a new overridden value, and hilarity ensues.)

moshe weitzman’s picture

I'd like to focus back on Dries question from #44 which was How this will make web services APIs easier (compared to maintaining the status quo).. Crell's reply from #50 includes

In order to support REST services, Varnish, backbone.js, and other advanced page building techniques in a robust fashion we need to be able to render any portion of the page in isolation. That means being able to control, and alter, what context that portion of the page "sees", regardless of whether we're rendering all of the page at once or parts of the page in separate requests. That means being able to layer the rendering of the page, and change, or override, that context at any point.

.

I personally don't see this patch as a prerequisite for Web Services in core. Consider the top Services that folks use. They want to retrieve a given entity (with Fields, and possibily related entities like the node's author). They want to login and post new entities. These are services we can and do deliver today with Services module. The calls often cost us a full bootstrap but this patch doesn't come close to fixing that.

My problem with this patch is that it adds a huge subsystem (especially if you consider the Symfony code already added) for a benefit that is far in the future and still uncertain. It seems to be that we are going about this a bit backwards. We should be refactoring Drupal for dependency injection and when thats ready, we commit this patch. We should be doing the Drupal work on the rendering pipeline (WSCCI phase 4) and when thats ready, we commit this patch. Adding this code now is complexity without the benefit.

I get that we sometimes add APIs and then later decide how and where to use them. But we usually do that with small APIs like UUID. This is a behemoth and it will touch Drupal enough to be annoying but not enough to actually deliver value. The fact is that we don't know how we are going to do dependency injection nor how we are going to do flexible render pipeline. If we had architecture defined and early patches for those, I would feel better about proceeding with this.

Note that I am not commenting on the quality of this patch. I am just saying that now might not be best time to commit it.

catch’s picture

More issues occurred to me with hook_context_init():

* The class registry is not reliable until full bootstrap. So if a module registers a handler before full bootstrap, and it's requested for example before hook_boot(), then there is a good chance you can end up with an unrecoverably borked Drupal install. See #534594: [meta] system info cache writing and registry rebuilding is broken without a full bootstrap.

* If I register a context handler that ends up calling a function in another module, that other module might not available before full bootstrap either, so pretty sure we could get a nice fatal error that way too (did not try either of these yet).

* Crell mentioned adding module.boot.inc or similar in irc as a way to minimize the performance impact of the new bootstrap hook, but I see the same problem happening in terms of for example user.module not being loaded while user.boot.inc registers a handler that calls user_load().

Crell’s picture

Both of the issues in #109 can be resolved by using PSR-0 autoloaded classes instead of the core registry autoloader or functions. We only need to get as far as module_list() to have all namespaced classes available to the autoloader, which we would need anyway to get to hook_*.

Calling functions from within a handler is, I'd argue, bad practice. That already bit us in this very issue, as discussed in #1355118: Move path.inc functions into OOP autoloaded potential code to avoid WSOD when calling HandlerPathSystem::getValue() [statistics], for which the conclusion was "hack it for now and refactor it to be OO later, so that this problem simply goes away."

The "hook_context_init() on miss" concept in #106 is interesting. I don't think pounard's suggestion is the right way to do that in a non-hacky way, though. Probably by allowing for a final handler that fires if nothing else is possible, which would then invoke hook_context_init(). We can toy with that.

Moshe: There is certainly a chicken-and-egg question about any new API. The reason we want to commit things in stages is to avoid diverging too far from core. For one, that makes the eventual review for merging more difficult. For another, it makes "chasing HEAD" progressively more difficult. For a third, we don't have a testbot on any issue in WSCCI except this thread. That means the longer we're sitting off without merging, the less we're able to leverage the test system. (Because really, I don't remember the last time I even attempted to run a full test suite on my laptop instead of via testbot, nor do I want to eat my system up for a day to do so.)

Also related, for those who didn't catch it earlier, in the WSCCI status meeting this morning we decided to try and implement a super-stripped down implementation to see what happens. I predict it will show why the stacking/locking logic is needed, but I acknowledge that the opposite is not impossible.

Stay tuned.

catch’s picture

We only need to get as far as module_list() to have all namespaced classes available to the autoloader, which we would need anyway to get to hook_*.

Currently 'as far as module_list()' is full bootstrap - since that is when it starts returning all modules instead of just bootstrap modules.

Due to issues such as #1061924: system_list() memory usage getting the list of all modules is currently very expensive.

So the problem is the same if using autoloaded classes provided by modules (since the system does not know the location of those modules until full bootstrap currently) as it is with the registry, in terms of availability at different bootstrap stages.

pounard’s picture

If fetching the module list is a problem, fetching plugins definitions, discovery and other nice stuff WSCCI/CMI are doing will be a bit of hell in your memory.

Crell’s picture

Oh full bootstrap, how I hate you...

We're drifting a little off topic here, but some thoughts:

1) "full" bootstrap can become cheaper if less code is in functions and more code is in classes, because they classes don't need to load when the .module file does. That's one of the side benefits of moving things to OO plugins, as WSCCI is also trying to do.

2) If we can move more stuff to config files, those get copied into a common directory that can be read super-early. (At least that's what CMI is trying to do.) So if we can move more of our lookup data from modules to config files, that could help as well.

3) This relates to #1290658: Move all module-provided classes to PHP namespaces (PSR-0 or similar), and autoload them without the registry [policy, no patch] as well. What if we registered namespaces=>paths via info files? One default that we can always record, and then let a module say "and ALSO root \Foo\Bar\Baz in this directory". That's in the info file so it can also be read pre-full-bootstrap. (And if info files turn into XML config files, fine, whatever, that's just a syntax question.)

pounard’s picture

My guess is that removing the system table completely and replacing the bits we really need in both settings.php and some configuration files would be a win.

What we really need *at minima* (i.e. enabled module list with their respective path and their declared namespaces) could be a forced, simple, array into the settings.php file (yes, that means no UX I know... but on production sites, über fast and way more secure); Of course still used as a cached config entry when in development mode (or later produciton mode if you cannot access the FS easily) which means UX. Concerning what we'd need later (anything like some other .info file information) could go into some cached configuration or not configuration objects (and be accessed only when needed).

The rest (i.e. information about non enabled modules) can just be removed from database completely, because the only place where we'd want to see it is the module UX, and in that particular place, we can afford to lookup into the FS (on a production site, nobody would go to the modules admin page, sounds like a dumb thing to do, which parse the FS anyway even if data exists in system table).

catch’s picture

OK I think we are talking past each other now...

- getting the module list is particularly expensive because of a regression between Drupal 6 and Drupal 7, see the linked issue.

- that can be made much cheaper (by reviewing and committing the patch in the linked issue)

- however it is not going to be cheap enough that you can get away with doing it in the critical path of page caching without a noticeable performance regression, because you can't get away with hardly anything in the critical path of page caching without a noticeable performance regression.

The "hook_context_init() on miss" concept in #106 is interesting. I don't think pounard's suggestion is the right way to do that in a non-hacky way, though. Probably by allowing for a final handler that fires if nothing else is possible, which would then invoke hook_context_init(). We can toy with that.

This would be great if it works, and we can work on the fragility of the module system in other issues without introducing a performance regression here then.

What if we registered namespaces=>paths via info files?

Putting crap into info files that we need on every request is one of the problems we introduced in Drupal 7 I've been trying to reverse, not keen.

If fetching the module list is a problem, fetching plugins definitions, discovery and other nice stuff WSCCI/CMI are doing will be a bit of hell in your memory.

Yes I expect that to be complete hell if they are going to be storing and retrieving large registries, that is a considerably larger problem than the amount of code loaded in Drupal and has not had a great deal of attention.

Anonymous’s picture

re #110 and a super-stripped down implementation - is there a branch that's happening in for any of that want to play along at home?

Crell’s picture

There's no code yet. Once I start writing it will be in a new branch in the existing WSCCI sandbox.

Putting crap into info files that we need on every request is one of the problems we introduced in Drupal 7 I've been trying to reverse, not keen.

I think the problem is not that it's defined in the info files, but that we leave it there in serialized form rather than turning into a lookup-optimized data structure. That's OT for this thread, though.

andreiashu’s picture

sub

Anonymous’s picture

spammer be gone!

Dries’s picture

Crell wrote: Also related, for those who didn't catch it earlier, in the WSCCI status meeting this morning we decided to try and implement a super-stripped down implementation to see what happens. I predict it will show why the stacking/locking logic is needed, but I acknowledge that the opposite is not impossible.

I think that trying to reduce the scope of this patch to the bare minimum could help. The general sentiment is that in its current state, this patch is "too much" and "too uncertain" and therefore in conflict with the current burning desire to reduce core complexity. So thanks for taking the time and effort to try and reduce this patch. That should be really helpful.

However, I wonder if a completely different approach would be better.

In #108, Moshe suggested that maybe we start with phase 4, the rendering pipeline, and work backwards towards this patch to enable incrementally more fine-grained web services. I wonder if that is a viable approach in your and other people's mind. If possible, I believe such an approach could help us win the hearts and minds of more core contributors.

I don't feel like you answered Moshe's question in #110. You gave "operational reasons" (tracking core, merge-ability, lack of testbot) instead of "architectural reasons" for why this patch is needed and why it needs to go in first.

I understand this is a tough question to ask, as it may require you to rethink your strategy from the ground up. Regardless the answer, it is worth having this conversation.

pounard’s picture

Pretty much agree with starting from Phase 4 since it's supposed to heavily use the HttpFoundation/HttpKernel from Symfony, and this will add new contraints regarding how to store and pass context data. It may involve using DIC container too, which is another factor to add.

Crell’s picture

Dries: Yes, the reasons for pushing Context API now on its own are largely operational, not architectural, because operational issues will hold us back. If we came back with one patch in 6 months that changes 90% of the code in Drupal in one shot, well, *I* certainly wouldn't want to commit that, to say nothing of the difficulty of chasing HEAD. Operational considerations are relevant.

That said, it's interesting that you suggest starting from phase 4... Please stay tuned.

catch’s picture

I'm marking #1323210: Discoverability and documentation as a duplicate of this one. There is quite a lot of discussion in there, that really should've been in here in the first place, but better late than never to merge the issues.

DjebbZ’s picture

Basically the discussion in #1323210: Discoverability and documentation is about discoverability of the API, especially the context keys (like http:uri, http:query, etc.), and how to document them. Nothing had been settled yet. Anyway we're waiting for some feedback from Crell (WIP : reducing and/or refactoring WSCCI, AFAIK). As he would say, please stay tuned :)

chx’s picture

to support ... Varnish .. and other advanced page building techniques ... need to be able to render any portion of the page in isolation.

that's simply not true. http://drupal.org/project/redis_ssi Drupal 7 already supports this just fine! Your problem is cache invalidation on content creation and there is nothing on this page that helps with that. #636454: Cache tag support helps with that.

neclimdul’s picture

Glancing through the code it seems to handle nodes. Does it support rendering blocks through SSI? I also don't see it handling things like hook_page_alter(). hook_page_alter() is fairly untenable in terms of maintainability and consistent rendering but blocks and the generic "everything is content" model being heralded for d8 are some of the more difficult questions being addressed by this. That's not a challenge to fix it in D7 just what we're trying to architect a solution for in d8.

chx’s picture

Not wanting to derail the discussion but: a) doesn't support blocks yet but it could with not a lot of effort b) yes it does support hook_page_alter or anything else, it works on a diff level.

pounard’s picture

@#127

that's simply not true.

I think this is true, Drupal stable core has not been designed to do this: in order to render blocks in the esi_api module, I had to do a shitload of hacks (and it's not working for all blocks) because it's hard to recreate a page context without knowing in advance what's revelant for the block build: that's where I think having this Context API (maybe not the one proposed today) is really important.

Plus your module is doing ASYNC HTTP calls to regen the page! Seems like trying to use a bazooka... I hope we won't have to fire an ASYNC HTTP call for each bit of data we updated: if I use VBO and update 1000 nodes at once, aren't you afraid this will kill your server by doing 1000 HTTP queries simultaneously?

PS: That's sad you started a redis_* module without asking me about the redis module first: we could join forces instead of duplicating effort.

catch’s picture

Just in case anyone following this thread missed it: http://groups.drupal.org/node/198538

catch’s picture

I (very briefly) spoke to merlinofchaos in irc about the context API and wscci in general. I'm paraphrasing the discussion here and adding some embellishments.

This particular objection didn't come up on this thread, but it has come up a bit in http://groups.drupal.org/node/198538 and it's something I've been thinking about but not really found time to write properly.

Apologies if I mis-quote at all.

First a code example stolen from a pastebin:

// Assumes a content type that wants a node and a user for whatever its
// nefarious purposes are.

// CTools content types:

ContentType::render($node, $user) {
  // render here
}

// Current proposal
// Note that the keyword content appears in the context is up to the $context
// not the content.
ContentType::render($context) {
  $node = $context->get('node'); // note magic keyword
  $user = $context->get('user'); // note magic keyword
  // render here
}

Now the problem with the 'ctools' version:

- further down the rendering pipeline, you might need some other information that's not in the arguments that were passed to the method ($node and $user in this example) - like a particular field formatter cares about the current organic group or something. If there's no way to get this context, then we're back to globals and it breaks dependency injection.

However the problem with the 'current proposal' version:

- it is starting to look like a very big $options array.

- Rather than calling a function with different a different value in the default case of simply wanting a different node rendered (as you can with node_view() or similar now), you need to mock the $context object.

- To see what a block cares about, you have to look at the actual code and see which keys it's requesting rather than function/method arguments which will have phpdoc.

- For caching, you'd likely need to serialize the entire context object then hash it, because you can never tell which particular bits of context a block cares about vs. ones that were requested higher up the stack. Some things in the context object are going to have zero effect on the rendered output of a block, but could vary a lot between requests causing a very low cache hit rate - for example some blocks might not care about the current user at all, but that will always be in the context object during a full page request due to access checks for the path or similar.

If we have both, so:

ContentType::render($context, $node, $account)

Then that would reduce the frequency that $context needs to be mocked (especially if down the call stack most work is done via passing explicit arguments). It also means that if there's a UI for "show this block, with this specific entity here", then we have explicit information about what that block expects than can be fed back up to the UI.

It's very late here and this isn't as thorough as I'd like, but was reminded of this by comments like fago's at http://groups.drupal.org/node/198538#comment-657963 (not sure if that's what he was actually getting at though).

EclipseGc’s picture

OK, that's great but I THINK the actual proposal is a bit more like this:


ContentType::render(Context $context, Configuration $config) {
  $node = $context->get($config->get('my node context key'));
  $user = $context->get($config->get('my user context key'));
  // render here
}

Which I think means we could actually only cache the relevant things to the block based upon the configuration information for this specific block.

Caching aside, this is almost exactly how page_manage/panels approaches passing information to our "content_type" plugins. The concern here revolves around the fact that Crell has done his best to not discuss a UI for this system, but he acknowledges that we need one.

catch’s picture

Well that still has some issues:

- you still can't look at the method arguments to see what the callback cares about.

- there is no hard, traceable link except following code execution between the callback and the configuration (it's calling out to configuration keys now that could be arbitrary), so I don't really see how this helps a UI (or someone browsing PHPdoc) know what's expected.

- everyone implementing a block needs to provide default configuration in a file which they then need to fetch again manually in their callback.

- still plenty of mocking going on for things that are not strictly tied to the request context in the first place (for example if I want a block that simply renders the same 'footer' entity on every single page of my site, then that context is not coming from the request in any sense at all).

- how does the rendering system know which configuration was requested by the block, when that's done inline in the render callback?

If we keep the idea that each block has some related default configuration saying what it needs, then the real change here would be that the calling code would need to request this (to know which arguments to pass in) rather than it happen in-line. That's very similar to how the menu system works now with page arguments and menu callbacks already.

yched’s picture

If I'm not mistaken :
Content type are Ctools plugins, right ? I mean, "implementations of the 'Content type' plugin type".
Similarly, the current scenario for core "blocksTNG" relies on the various blocks being Implementations of the 'block' Plugin type.
(Capitalizing the notions that are part of the current state of the WSCCI Plugins concepts as per http://groups.drupal.org/node/193103)

In which case, the "signature" of the block - which elements it needs to be fed (a $node and a $user in the current example) would be exposed in the description of the corresponding Plugin Implementation (whether an info hook à la D7 or in the CMI files as currently envisioned). It's not "either it's in the PHP signature of a given method or you need to dig in the method code".

Besides, if the render() method is part of an interface, it's signature cannot vary from one block to the next anyway ?

neclimdul’s picture

Eclipse and merlin are both ctools dev's and content types are honestly very similar to "blocks that take context" so its a pretty mental jump for ctools devs to think of blocksTNG as content types. I think that's why merlin used it in his example pasted here and why eclipse continued the use in his clarification.

Yeah, you can't vary the signature so we're looking at something similar to the $variables argument sent to theme functions/preprocessing. Which actually segues to the problem I couldn't address with merlin and catch brought up too is that without explicit context we don't know and basically can't document anything about the function. The implicit global "state of the system" context is great but is there another explicit "things the block needs" context we've been just cramming into the same location? And if so where does one end and the other begin?

yched’s picture

What I mean is that the answer to "what are the things the block needs" (number and nature of the "parameters" and names of the context keys it will look for - i.e what is needed to build a UI) lies in the definition of the block as a Plugin Implementation of the 'block' plugin type.

If the callback is a method implementing a generic interface for blocks - render() -, then by definition its signature is "useless" regarding what this specific block makes or needs.

neclimdul’s picture

Right, falling back to the theme system example, the plugin definition information would be similar to the arguments hook_theme provides. This makes sense as plugin definitions are mostly a better info hook system.

Dries’s picture

Assigned: Dries » Unassigned

Based on recent discussion in the WSCCI group, I think we might be going a different direction on this, so I'll hold off reviewing until given the word. Plus, in two weeks we're having a face-to-face code sprint to talk about this is detail. So I'm un-assigning for now. This is a very important decision for the future of Drupal, and one that I want to be part of so please re-assign this to me at the right time.

bojanz’s picture

Status: Needs review » Closed (won't fix)

The current WSCCI issue is #1463656: Add a Drupal kernel; leverage HttpFoundation and HttpKernel. This one represents a dead end.
Marking as "won't fix". If someone thinks "duplicate" is closer to the truth, feel free to change it.

bojanz’s picture

Issue summary: View changes

Clean up wording around the stack object and its reasoning.