Context values: Keys or objects? [#1177246]

Splitting off from #1100188-14: Cloning of context objects and following...

Whatever the internals of the context object, it will return something. There are two things it could return: A primitive lookup key or an object. For example, $context['node'] could return:

1) A fully loaded node object.
2) A small array containing nid and language (or similar), and then the caller is responsible for calling node_load() on that.

Pros for option 1:
- It's really simple on the caller. You get back an object (if what you're requesting is an object in the first place) and you can use it immediately, even treating $context['node'] as a variable itself.
- Unit testing and mocking of context is really easy. You simply load the node object you want to take out, assign it to the right context key, and you're done. Piece of cake.

Cons for option 1:
- Because objects pass by handle in PHP, what you get back is a reference to the object in the EntityController class's static cache. Change that object and you're now changing global state outside of your scope, which we want to prevent.
- We would still need to reduce the context information to primitives anyway in order to use it as a cache key, which we want to do for many reasons, including Varnish caching and smart cache invalidation. #636454: Cache tag support

Pros for option 2:
- We don't carry around or cache large objects, only small primitives.
- It's really easy to turn a context object into a cache config key.

Cons for option 2:
- It's more work for the caller. They need to pull back $context['nid'] and $context['language'] and call node_load() themselves.
- It's harder to unit test and mock. We'd have to mock node_load() too, or else move that to an injected service object that gets called separately. (That may be a good idea anyway, granted.) (Note that "fill in the database before running the test" like we do now is not unit testing, it's slow-arse integration testing that we want to minimize.)

I'm not sure which trade-off is good here. Discuss. :-)

Comment	File	Size	Author
#46	butler_context_values_1177246_46.patch	302 bytes	Stalski
#43	butler_context_values_1177246_43.patch	9.86 KB	Stalski
#40	butler_1177246_40.patch	11.45 KB	Stalski
#38	butler_1177246_38.patch	11.11 KB	Stalski
#34	1177246-context-value.patch	3.65 KB	xtfer
#32	1177246.patch	5.39 KB	xtfer

Comments

Comment #1

pounard

French

commented 3 June 2011 at 10:23

Keys would definitely be better for cache keys and static cache purposes, as you said, I definitely stick on that side.

But I'd propose a 3rd option:

You don't store objects
Caller can get $context['nid'] but it returns a "context helper", with two functions: getRawValue() and getNode()
If context is an object, the context carrying $context['nid'] has a specific implementation with an helper function loadEntityNode() that does the node_load() itself and returns it

Pros:

No business logic for cache key computing, just get all current context keys, sort them, and build a hash
User doesn't have to type 3 lignes of code to fetch its object, and a lot of nice helpers can be implemented at the specific contextes level.

Cons:

It creates a new layer in the API, maybe that's a good thing, maybe not, depends on how many of these context objects will be spawn during runtime probably, it's somehow code indirection
User have to test for context object class at runtime before calling them

Example:

interface ContextInterface {
  public function getRawValue();
  public function getKey();
  public function __construct($key, $value);
}

class ContextNode implements ContextInterface {
  // ... Methods from ContextInterface
  public function getNode() {
    return node_load($this->_value);
  }
}

class ContextStack {
  /**
   * @return ContextInterface or NULL
   */
  public static function getContext($key);
  // ... Stack handling
  public static function hash($revelant = NULL) {
    // Some kind of static cache here would be good.
    $array = array();
    foreach (self::$contextStack as $objectHashcode => $object) {
      $value = $object->getRawValue();
      if (!isset($revelant) || in_array($revelant, $value)) {
        $array[$object->getKey()] = $value;
      }
    }
    ksort($array);
    return md5(serialize($array));
  }
}

// Somewhere when instanciated:
ContextStack::stack('node', new ContextNode('node', 251));

// Somewhere in user code:
if (($context = ContextStack::getContext('node')) instanceof ContextNode) {
  $node = $context->getNode();
}

// Other place, compute hash (for example a block relying on user and page only):
$hash = ContextStack::getContext(array('user', 'get:q'))

Comment #2

catch

he/him

English

commented 3 June 2011 at 15:17

Couple of things here.

There are likely to be some context objects that you need to know the identifier but don't always need the full object. A classic example would be language - while we currently have global $language, 99% of the time in core, the only thing that gets accessed is the language code - $language->language for use in caching keys.

So there are three parts to this:

1. Determining the current context for a particular key - in some cases (maybe all?) this may itself be done when that's demanded (i.e. find me the current organic group for this node does not need to be front loaded).

2. Finding the current organic group based on a node should give you a node ID - at least entityFieldQuery + entity_load() enforce that kind of pattern and it's a good one. The node ID (or langcode or user ID) may be enough to generate a cache key just by itself - if that's all that's needed, then it'd be great to just do that. However it's possible cache key generation happens via methods on the context object, not by requesting context keys and figuring it out.

3. Lastly, there is requesting the actual object that the context handler is referring to.

If we decide #2 is only valuable for caching (and possibly very rare edge cases), then as long as the context system has helpers to generate cache keys that doesn't need to be available to code that requests the context object, so returning objects sounds good (although I'd still like to store only keys internally.

Because objects pass by handle in PHP, what you get back is a reference to the object in the EntityController class's static cache. Change that object and you're now changing global state outside of your scope, which we want to prevent.

On this, what happens if a function that requests context needs to trigger a node_save() or any other crud operation? The more things request context, the more likely this is going to become. Custom code, triggers, actions all do things like this all the time. In that case, you very much do not want later functions in the same page to be operating with a stale node object.

Currently changing objects without then _save()ing them is unsupported even though it's not enforced. This is part of #154859: Document that cached entity objects are not necessarily cloned any more, I'd want to see very, very detailed performance numbers on clone before going down that route, especially memory. Either way I'm not sure that very old discussion, which is nothing new, should affect decisions made here.

Comment #3

catch

he/him

English

commented 3 June 2011 at 15:30

I originally posted that comment on the duplicate issue that Crell opened, hadn't seen pounard's when I wrote it.

Being able to get both the key and the value may be worth the extra method for callers. However I'm not sure why we need a special getNode() method, isn't something like this enough?

class NodeContext Implements ContextInterface {
  protected $nid;

  function __construct($nid) {
     $this->nid = $nid;
  }

  function ___get($name) {
     if ($name == 'id') {
       return $this->nid;
     }
     else {
      return node_load($this->nid);
     }
  }
}

$node_context = $context->get('node');

if ($node_context) {
  $nid = $node_context->id;
  $node = $node_context->value;
}

The magic __get() is supposed to be expensive, but I doubt we are going to be requesting context keys and values hundreds or thousands of times per request.

Comment #4

catch

he/him

English

commented 3 June 2011 at 15:41

Just to clarify on clone:

1. the context class does something like $context['node'] = (clone) node_load($nid); and static caches this itself.

2. Some function reacting to a rule or trigger does $node->foo = 'bar'; node_save($node);

3. node_save($node); resets the static cache, but it will not reset anything for $context['node']; so now it's stale.

4. More unlikely, say another function does the same as #2, it's going to save a $node with stale data.

That seems like as likely a scenario as people doing $node = node_load($nid); $node->foo = 'bar'; then not saving it.

If nodes aren't a great example, presumably $_SESSION is going to be available as context at least potentially, and functions need to set or unset stuff in $_SESSION regularly.

But if we don't cache any objects in the context class, then this whole argument devolves to the various APIs that provide contexts, and that seems good to me.

Comment #5

pounard

French

commented 3 June 2011 at 15:55

@#3 Your code is pretty much what I explained upper, do not store obects, just set a proxy method to the real object factory.

The __get() method is a bit expensive, but it's not if you don't call it a thousand time. And by the way, PHP 5.3 did serious optimizations with function calling and stuff, so that tends to be not that true anymore (PHP 5.2 is officially unsupported now). What I mean here is that even if Drupal officially supports PHP 5.3, a lot of hosting will probably switch to PHP 5.3 so the performances should be targetted for this environment.

@#2 +1 for language, who needs the full metadata anyway? 99% of time you only need the language code.

If we decide #2 is only valuable for caching (and possibly very rare edge cases), then as long as the context system has helpers to generate cache keys that doesn't need to be available to code that requests the context object, so returning objects sounds good (although I'd still like to store only keys internally.

That's pretty much I'd propose the middle solution, that pretty much looks like your code. Plus, even if you don't have the helpers for spawning the node, considering that you are actually fetching the 'nid' key means that you KNOW what you are fetching and you know how to load it.

Comment #6

Crell commented 3 June 2011 at 23:51

My concern with having an id and value method/property of every context key is two fold:

1) There's no guarantee that the context value you're accessing is an object with obvious ID. You could be looking for "current node", but could also be looking for "value of the http accepted compression format header", which is always going to be a string. (Obviously you would never want to a cache key based on that, but you could want to access it.) Or, less crazy example, $context['language'] itself is just a string, not an object. I don't know that ->id and ->value makes sense as a universal pattern.

2) Some objects may need multiple IDs to load. Catch likes to use the nid-and-language example on a multi-lingual site. What does ->id return then?

3) How would you setup a mock value when overriding? Currently, the syntax is something like:

$c2 = $c1->mock();
$c2['node'] = $some_node_object;
$t = $c2->lock();

dpm($c2['node']); // Gives you a node object.

You can also set an alternate handler rather than a literal value.

So I'm not sure how that would work if you're tracking both lookup key and value. Which one do you specify when overriding?

Comment #7

pounard

French

commented 4 June 2011 at 10:06

#6 for 1) and 2) The getter method seems transparent enough to allow something natural, if I take you code sample and try to adapt it with catch's suggestion:

class ContextValueRaw implements ContextValueInterface {
  protected $_value;
  public function __get($name) {
    if ('value' === $name) {
      return $this->_value;
    }
  }
}

class ContextValueNode extends ContextValueRaw {
  public function __construct($node_or_nid) {
    $this->_value = is_object($node_or_nid) ? $node_or_nid->nid : $node_or_nid;
  }
  function ___get($name) {
    if ('value' === $name) {
      return $this->nid;
    }
    else if ('node' === $name) {
      return node_load($this->nid);
    }
  }
}

$c2 = $c1->mock();

// For this I would prefer not to use ArrayObject, but magic __get and __set, because
// we could do: $c2->language = "fr" // And create default ContextValueRaw magically
// behind.
$c2['language'] = new ContextValueRaw('fr');

$c2['node'] = new ContextValueNode($some_node_object);
$t = $c2->lock();

$nid = $c2['node']->value; // Nid
$node = $c2['node']->node; // Node, if not the right instance, returns null.

// Could work the same for HTTP headers?
$langcode = $c2['language']->value; // String

Still for 1) The syntax, even with the value accessor seems straighforward enough, just need to be documented.

For 3) The mockup function would just be a deep clone of current context array object. I don't need to write a sample here because it's quite basic, but in this case clone keyword is probably best than mock() function.

$c1['language'] = new ContextValueRaw('fr');
$c2 = clone $c1;
$c2['node'] = new ContextValueNode($some_node_object);
$t = $c2->lock();
$nid = $c2['node']->value;
$node = $c2['node']->node;
$langcode = $c2['language']->value;
// Or for fun we could also do this implementing __toString() method, but would
// work only for pure string function, for the sake of consistency value property is
// probably better:
$langcode = (string) $c2['language'];

For 1), about the context propagation (language at node load time), you have three solutions:

Let the user handle it (that's probably what we don't want).
Create a context back reference in the context value itself (that should be replace at __clone time, and this same value object getter use it for loading.
Even better, let the value object context back reference (with the same pitfall at clone time), make the node factory (the entity controller) aware of context and make context being an optional parameter of the NodeController::load($conditions, $context = NULL) method.

Sample:

class ContextValueRaw implements ContextValueInterface {
  protected $_value;

  protected $_context;

  public function getContext() {
    return $this->_context;
  }

  public function setContext(ContextInterface $context) {
    $this->_context = $context;
  }

  public function __get($name) {
    if ('value' === $name) {
      return $this->_value;
    }
  }
}

// Using the magic ArrayObject methods here could call implecitely the
// ContextValueRaw::setContext($this) method:
$c2['node'] = new ContextValueNode($some_node_object);

class ContextValueNode extends ContextValueRaw {
  public function __construct($node_or_nid) {
    $this->_value = is_object($node_or_nid) ? $node_or_nid->nid : $node_or_nid;
  }
  function ___get($name) {
    if ('value' === $name) {
      return $this->nid;
    }
    else if ('node' === $name) {
      return NodeController::getInstance()->load($this->_value, $this->_context);
    }
  }
}

// Would just propagate the context to the controller itself, that would test for
// revelant values to use for loading, such as language.
$node = $c2['node']->node;

I really don't like back ref, that's what made Views 2 almost impossible to debug, but in this case, if the code is clean and the design remains simple and is put on some diagrams for easy comprenhension, I don't see why we shouldn't use them.
This method ensure context propagation in the lower level API's and that's good!

To avoid backref, we could imagine something else:

class Context extends ArrayObject
{
  protected static $_current;
  public static function getCurrent() {
    return isset(self::$_current) ? self::$_current : new ContextNullObject();
  }
  protected static function _setCurrent(Context $context) {
    self::$_current = $context;
  }

  // ..

  // And in lock method:
  public function lock() {
    // ... do whatever is needed, then:
    self::_setCurrent($this);
  }

  //  ..
}

Then the NodeController could itself have:

class NodeController {
  public function load($conditions) {
      // Still be null if no language, but valid with no errors.
      $context = Context::getCurrent();
      $language = $context['language']->value;
    }
  }
}

Comment #8

Crell commented 4 June 2011 at 17:47

It's going to take me a bit to fully digest #7, but I think there may be something in there. I was hoping to avoid having both a context handler object AND a context value object, though. That is, as noted above, extra complexity (more places for things to go wrong, more bugs, etc.).

One side note, the reason for using ArrayAccess is so that we can do $context['http:post:var']; we determined in Copenhagen that the namespacing potential there is very powerful, especially when you can attach a single handler to http:post. But you obviously can't do $context->http:post.

Given that we would still have a context handler (eg, if you don't specify $c['node'] directly, a handler fires that will examine $context['http:get:q'] for a node/nid pattern, grab that nid, and load that, and then potentially cache it), where would you split the functionality between the handler and the value wrapper? That seems unclear to me.

Comment #9

pounard

French

commented 5 June 2011 at 00:26

Basically:

$value = $context['http:get:q'];

Is exactly the same as:

$value = $context->http->get->q;

It's just a convention.

I'd bet that the right choice is the one that allows the more flexibility technically, but otherwise it's exactly the same. You can even implement both on the same object!

Value objects only allows to embed business method inside the context itself and provide a proxy pattern toward the business factories, but nothing mandatory here, it's not the more important part about it. I think the more important part is the fact that the context could propagate to the business stuff such as entity controllers themselves.

Comment #10

catch

he/him

English

commented 6 June 2011 at 05:52

Some objects may need multiple IDs to load. Catch likes to use the nid-and-language example on a multi-lingual site. What does ->id return then?

Loading a node is not dependent on language - this is the case whether using entity-level translation (core translation module) or field translation.

So you would have $context['node']; and $context['language'] - two completely separate keys, not one combined.

Language is something that is determined either by querying a list of nodes restricted to a particular language or languages (Drupal 6 style), or rendering fields based on a langcode, so http://api.drupal.org/api/drupal/modules--node--node.module/function/nod... (Drupal 7 core style, which now co-exists with D6 methods).

In the former case the query will be affected based on language context. In the second example, node_view() requires language context (or optionally requires it depending on whether locale and other modules are enabled).

That doesn't mean there's a no valid use case for $id being an array, but node + language isn't it.

Comment #11

pounard

French

commented 6 June 2011 at 13:00

Yes I see, my point was more about "passing the context to the factory" in a generic way (not only for language itself, it could be for anything else, maybe some context restrictions or query rewrite).

Comment #12

Crell commented 6 June 2011 at 19:40

@catch: Ah, OK, I misunderstood you then since language and node are frequently mentioned together.

Comment #13

jherencia commented 9 June 2011 at 09:34

Subscribing...

Comment #14

pounard

French

commented 9 June 2011 at 10:01

Just saying, but maybe at the early beginning it should remain simple, and should not provide any of business functions except __get() and __set() (or ArrayObject implementation, doesn't really matter). Let the consumer do whatever it has to do with the raw scalar values inside the context.

It would allow having a partial working context system really quickly, and business matter can be added later what it worth. Business will be useful only at the two "context usage endpoints":

When creating it (but we don't need any kind of helper here, the business layer would be able to do it itself so whatever);
At final usage time, where the consumer need the "nid" value. But the middleware layer doesn't really need to know what's a node or how to use it.

Comment #15

dmitrig01 commented 11 July 2011 at 00:47

Hrm. I'm not really sure I really see either of the cons of option 1 as that significant:

- Because objects pass by handle in PHP, what you get back is a reference to the object in the EntityController class's static cache. Change that object and you're now changing global state outside of your scope, which we want to prevent.

Just return a clone of EntityController's static cache. Problem solved.

- We would still need to reduce the context information to primitives anyway in order to use it as a cache key, which we want to do for many reasons, including Varnish caching and smart cache invalidation.

The caching layer, I assume, will be invisible to people writing plugins, at least for the most part. Thus, it should be fairly simple to keep track of the id of the object requested internally, and afterwards, when accumulating keys to cache, ask each context handler thingy for the id by which they want to be referenced. Then, they can take this id to retrieve the full object.

I do, however, think that option 2 does have one advantage: when only the node id is needed, it's a waste to load the whole node. However, there are two possible ways I see around that. The first is to have it be possible to get $context['node'] and $context['node:id']. The second is something like this (this would probably be implemented on the entity level, and mitigates the first con mentioned of option 1):

class LazyNode {
  protected $nodeLoaded = FALSE;
  function __construct($nid) {
    $this->nid = $node;
  }
  function __get($key) {
    if (!$this->nodeLoaded) {
      $node = node_load($this->nid);
      // Do this for performance so __get doesn't keep getting hit.
      foreach (get_object_vars($node) as $node_key => $value) {
        $this->{$node_key} = $value;
      }
      $this->nodeLoaded = TRUE;
      if (isset($this->{$key})) {
        return $this->{$key};
      }
    }
    // Some sort of error -- key not found on node.
  }
}

Comment #16

pounard

French

commented 11 July 2011 at 07:55

Just return a clone of EntityController's static cache. Problem solved.

Whatever is the context, a business object such as an entity shouldn't be contextually modified. Clones are bad.

Comment #17

eaton commented 14 July 2011 at 03:06

Whatever is the context, a business object such as an entity shouldn't be contextually modified. Clones are bad.

That applies to almost anything that's "contextual" though. The user object shouldn't be modified directly, but it's still used for contextual decisions. If something appears in context, it's for use in decision-making, not for modification/editing. I don't think that issue should affect the choice of key vs. object...

Comment #18

dmitrig01 commented 29 July 2011 at 06:17

In a perfect world, clones may be bad, but in that perfect world, people wouldn't modify the objects returned by context. Unfortunately, we live in an imperfect world, which means that we need to solve these problems. In our imperfect world, putting a ->lock()/->unlock() method on an Entity object would significantly slow down all of Drupal, since all properties of all entities would need to be accessed with __get.

Comment #19

catch

he/him

English

commented 29 July 2011 at 08:56

You're not even attempting to justify why we should do that just stating 'we must' and making assumptions. If the context system stores keys then it's not that system's problem to deal with the clone or not clone issue. And no one has answered what should happen if a function that takes context triggers an entity save.

Comment #20

fago

German

Vienna

commented 29 July 2011 at 10:43

> Just return a clone of EntityController's static cache. Problem solved.

Whatever is the context, a business object such as an entity shouldn't be contextually modified. Clones are bad.

That's how we handle it everywhere in Drupal since d7, so why should we handle it different here or change it again?

Comment #21

Crell commented 30 July 2011 at 05:05

catch: I'm unclear which post you're responding to?

At the moment I am leaning toward saying keys (well, primitive values) only, as it simplifies a great deal in our implementation. However, the downside of that approach is that a context consumer (most likely a plugin) would then need to use that key to load the object it needs. That means we are immediately breaking encapsulation, since we cannot simply inject a fake node into a consumer. We have to go create the node first, then pass in the fake node's key for it to node_load(). That eliminates half the benefit of forcing all input through this channel.

The natural solution to that would be a service locator, as a few others have suggested elsewhere. That's a very nice solution with a long history of existing implementations, but is also another moving part I was hoping to avoid for the time being.


class MyPlugin {
  protected $context;
  protected $service;

  public function __construct(Context $context, Service Locator $service) {
    $this->context = $context;
    $this->service = $service;
  }

  public function doStuff() {
    // With full service locator.  This is extremely clean architecturally but rather verbose.
    $node = $this->service->getService('node')->load($this->context['node_id']);

    // Keys only but no service locator.  Simpler, but we now have that 
    // hard coded node_load() that breaks encapsulation.
    $node = node_load($this->context['node_id']);

    // Passing an object in context.  This is the simplest for the API consumer, but introduces all sorts of
    // annoying implementation details regarding caching, Varnish IDs, unexpected side-effects, etc.
    $node = $this->context['node'];
    

     // Do stuff with $node;
  }
}

Now, on a conceptual level I rather like the service locator approach. It's clean, flexible, powerful, extensible, all that good stuff. I would actually rather like it if we end up there eventually, as that would allow us to move the vast majority of Drupal's useful code over to true unit tests. However, it's also yet another moving part that we have to think about, implement, test, and get working throughout Drupal. It also results in a more verbose usage; whether that's too verbose for people to get comfortable with I'm not sure.

Just cloning the node returned from whatever load mechanism may help with the side effect issue, but it does nothing to solve the caching and Varnish ID issues. So that's not a complete solution either way.

So I think that's where we stand currently. If I'm missing a piece, please speak up. :-)

Comment #22

dmitrig01 commented 30 July 2011 at 05:41

What would the downside of the approach I outlined in #15 be? When it's mocked, a different node object, filled out with fake properties, can be provided, which greatly simplifies it. It also provides a way for having either keys or actual node ids work.

If we wanted to get really complex, we could introduce some sort of tracker on the offsetGet method that would track whether the node that was gotten ended up filled out, etc., but that may be overcomplicating it.

Comment #23

catch

he/him

English

commented 30 July 2011 at 06:36

@Crell I was referring to Dmitri's post immediately above mine insisting we should always do something wrong (clone everything) because sometimes people get things wrong. I should try to get out of the habit of responding to issue from my phone but I am getting fed up with that discussion both here and elsewhere.

$node = $this->service->getService('node')->load($this->context['node_id']);

This is closer to what I'm after, but I agree it's overly verbose, and I'm not sure it needs to be so verbose to get the same effect.

If we have a service locator, then what you've written doesn't look that far of what entity_load() does already - currently entity_get_info() has all kinds of dependencies on the module system but that doesn't necessarily need to be the case eventually (or for other context-providing stuff) - the config initiative and general clean-up should allow the entity info itself to be mocked or injected including which class is used to load it. In that case everything would have a dependency on the configuration system (and possibly plugin system if and when that exists), but those could be used to handle control otherwise.

Another option would be the context system allowing dependency injection directly:

So to get the id:

$nid = $context->nodeId;

To get the full node object:

// Ignore the method name.
$node = $context->getObject('nodeId');

When you need dependency injection, use a setter:

$context->setService('nodeId', 'my_node_load');
$context->getObject('nodeId');

getObject uses whichever service is set.

Internally it looks something like:

$context->nodeId = 1;
$context->services = array('nodeId' = 'node_load');

function getObject($id) {
  // Hopefully not like this but you get the idea.
  $function = $this->services['nodeId'];
  $function($this->nodeId);
}

If we could figure that out cleanly, then the user of the $context object isn't doing much more than accessing $context->node - the complexity is handled internally, and we provide methods to mess with it when people need to.

Comment #24

Crell commented 3 August 2011 at 00:19

In today's WSCCI meeting in IRC, we kicked this question around for a good 90 minutes. :-) What we basically concluded was:

1) We can't perfectly lock down all context values, especially where objects are concerned, like entities.
2) We can design the system to discourage people from mucking around in context, however.
3) We're going to assume that the Entity system is going to get improved in such a way that all loaded entities become classed objects. (If it doesn't, then the Entity system has bigger problems than we're dealing with here anyway.)

We therefore came up with the following plan, which we should try to implement and see what happens. :-) Basically it is along the same general lines as pounard in #7 and dmitrig01 in #15.

Proposal:
There are two things that can be returned from a context handler, or assigned to a context key explicitly:

Primitives (strings and numbers)
Classed objects that implement ContextValueInterface.

That's it. Nothing else.

ContextValueInterface includes a method that will return a load key for that object. It is up to the object to return something meaningful. The load key is the value by which we can load that object later, such as nid, view machine name, etc.

Because it's an interface, it can be implemented by any class or base class.

Example:

interface ContextValueInterface {
  public function contextKey();
}

class Entity implements ContextValueInterface {
  public function contextKey() {
    return $this->id();
  }
}

class Node extends Entity {
  public function id() {
    return $this->nid;
  }
}

This offers the following benefits:

Most context values are primitives, and so don't need any fancy wrapping.
We can assign an object directly to a context key (or return it from a handler), and the consuming code (plugin?) can just access it directly.
We can reliably take a list of context keys and turn them into an HTTP-friendly string. Those can then be used for cache keys, Varnish ESI keys, and similar hotness.
Assuming the Entity system gets improved correctly, it can implement ContextValueInterface in the base class so that all entities are inherently context-friendly.

Note that this still allows for the possibility of a caller to modify a context value object, and for a caller to call node_save() (or equivalent) on an object tracked by context. At present, the answer to the second is "don't do that, stupid!". For the former, we want to try and block that but don't know how yet. We will revisit that question later.

We also considered forcing all primitives to be wrapped in an object for consistency. However, that may cause other problems in terms of both API /DX and performance. For the time being we decided to table that. That does mean that a consumer does need to know if the key is it requesting is an object or a primitive.

Thus, when a consumer accesses content:

$context['http:get:page'] // This returns a string.
$context['node'] // This returns an object of class Entity, Node, or whatever, that implements ContextValueInterface

When turning a context object into a set of keys that can be used to regenerate it (which we need for caching, ESI, etc.), we can do something like:

foreach ($keys_we_want as $key) {
  $val = $context[$key];
  if ($val instanceof ContextValueInterface) {
    $list[$key] = $val->contextKey();
  }
  else {
    $list[$key] = $val;
  }
}
// $list is now an array of primitives that we can easily turn into cache tags, ESI tags, or whatever.

Now we need someone to volunteer to try and implement this thing. :-) Any takers?

Comment #25

sun

German

Karlsruhe

commented 3 August 2011 at 18:34

Issue tags:

+WSCCI

Comment #26

xtfer commented 14 August 2011 at 06:40

Its a bit OT, but in the process of building a registry module for D6 (http://drupal.org/project/registry), I've been paying close attention to developments in this thread. I expect the Alpha 3 release to have some semblance with the proposal above (the current Alpha 2 is a bit different), at least as far as the kinds of properties to be set and how to access them, and I mention it in case anyone is interested.

Comment #27

xtfer commented 15 August 2011 at 13:08

Is there any reason why getValue and getKey cant be implemented in the same class? For example...

class ContextHandlerExample extends ContextHandlerAbstract {
  public $id;
  
  public function contextKey(){
    return $this->id;
  }
  
  public function getValue(array $id = array()) {
    return example_load_function($this->id);
  }
}

This keeps the key and value implementation consistent. Otherwise we end up with contextKey implemented on the objects themselves, while the handlers requiring an extra class definition.

Comment #28

Crell commented 15 August 2011 at 17:05

xtfer: Two reasons.

1) Performance. That's an extra function layer we have to go through every time we want to access the value.

2) DX. Having some context keys return a useful value and others return an object from which you need to extract a useful value, and that not being cleanly delineated in the context key itself, makes the code uglier and harder to work with.

Comment #29

xtfer commented 15 August 2011 at 22:01

Okay, I understand 1:performance, however the performance overhead would potentially be quite small, and you would also be able start addressing concerns about modifying context values, loading objects not properly classed (like the current entity system), or situations where more information was needed about a string context (context is a string, but that string is part of a larger object/array)? Also, you could potentially use ContextHandlerExample::contextKey() without instantiating ContextHandlerExample, which would be quicker.

In the case of 2:DX, aren't you going to run into that anyway? contextKey implemented on Node or ContextHandlerNode still has to return node_load($id) either way (probably), and under the proposal in #24, the key and value accessors will be split across the handler and the object, which seems more complicated to me.

Additionally, it would be useful if Butler ever gets a workable 7.x release, as its unlikely that 7.x core will be reworked to accomodate the class implementation (my comments stem from an attempt to write the contextKey implementation into the Butler 7.x branch and having no reliable way to load objects like nodes for testing).

Comment #30

Crell commented 16 August 2011 at 02:07

In practice I'm not sure we'll be able to do a full backport, frankly. It's looking less viable these days. :-( Either way, the needs of D8 should be the primary focus, not backportability. Make it work first before we make it backport. :-)

I don't know why you'd want to call ContextHandlerExample::contextKey() statically. In fact if it's a non-static method you can't. (At least not without generating an E_STRICT, because really, don't do that.)

If it's Node itself that is returned, then you can do simply $context['node']->title() to get the node object's title (for instance). Rather than having to do $context['node']->getValue()->title(), whereas you wouldn't need the getValue() if it's a string. That makes it inconsistent.

Comment #31

xtfer commented 16 August 2011 at 11:13

I think Im behind the times on static methods... I was under the impression they were faster but having done more research this is probably not the case, so ignore that.

Comment #32

xtfer commented 20 August 2011 at 05:14

Status:

Active

» Needs review

Status	File	Size
new	1177246.patch	5.39 KB

Here's a patch which basically implements the functionality described in #24...

1) interface ContextValueInterface
2) A new method, contextKeys(), on DrupalContextInterface, which returns an array of keys for caching etc.

Additionally, there is an extra sub-module Butler Examples, to hold development related code not part of Butler itself. This contains an object implementing ContextValueInterface, for testing.

This is also available in a sandbox repository at http://drupal.org/sandbox/xtfer/1254810.

Comment #33

Crell commented 21 August 2011 at 00:19

Status:

Needs review

» Needs work

Woah, thanks, xtfer!

+++ b/butler.module
@@ -325,7 +339,27 @@ class DrupalContext implements DrupalContextInterface {
+   * Implements DrupalContextInterface::contextKeys().
+   *
+   * @todo Is it worth statically caching this if we are locked?
+   */
+  function contextKeys() {

I think the intent of this method isn't quite right. Rarely would we need to know the value of every context key in a given context object. What we may need to know is the value of all context keys used in a given scope. For that, we'd probably need to track if a given key was accessed, and at what stack level. I think it would make sense that if we access key foo, then it is considered "used" by that context object and all of its ancestors. Then we can retrieve that data.

I wasn't intending to go quite that far in this patch, but if you want to include that logic I'm OK with it.

It's probably not good to cache that in a property since a context object can still be read from, and usedKeys() (which this method should really be named) should be idempotent.

+++ b/butler_examples.info
@@ -0,0 +1,4 @@
+name = Butler examples
+description = Examples and utility code for Butler
+core = 7.x
+files[] = butler_examples.module

I don't believe we need an examples module. We're hopefully moving this code into a core sandbox soon. Instead, this code should go in the existing test suite to make sure it all works properly.

+++ b/butler_examples.module
@@ -0,0 +1,48 @@
+class ButlerExample extends ContextHandlerAbstract {
+      ¶
+  public function getValue(array $id = array()) {
+    return butler_examples_sample($id);
+  }
+}
+
+/**
+ * Test implementation for ContextValueInterface
+ */
+class ButlerExample implements ContextValueInterface {
+  public $id;
+    ¶
+  public function contextKey(){
+    return $this->id;
+  }
+}

Uh, this should cause a parse error. :-)

There's a fair bit of stray whitespace in the patch, too.

Comment #34

xtfer commented 21 August 2011 at 09:43

Status	File	Size
new	1177246-context-value.patch	3.65 KB

Whoops! That's sloppy. I've cleaned up the patch.

What we may need to know is the value of all context keys used in a given scope. For that, we'd probably need to track if a given key was accessed, and at what stack level. I think it would make sense that if we access key foo, then it is considered "used" by that context object and all of its ancestors. Then we can retrieve that data.

Given that keys are normally set when they are accessed via $context->offsetGet($offset), for any given request only those keys required should be set, which is roughly what the implementation in the patch at #32 assumes.

Variants on this...

a) Log the $offset in offsetGet then iterate over them in usedKeys(), or
b) Set the usedKey record in offsetGet directly and merely return the extant values in usedKeys, or
c) Assume that anything set has been accessed, and just iterate over that (i.e. #32).

I assume that (c) would be faster during the construction of the $context object, and (b) quicker when returning usedKeys. (a) is only faster if, for some reason, there are context values which aren't required for the request (but that begs the question what they are doing in context). (a) also has a downside that the key must be unset if the value is removed.

The exception I can see here is for context values set directly (i.e. via $context->offsetSet($offset)), do we assume that this has now been "accessed", or do we just ignore it till someone uses it – the problem here being instances where the object is used then set. Do we accomodate that (c), or filter them out till referenced (a or b).

Anyway, this patch implements (a) for comparison. Its missing the "scope" concept, it merely returns the current contexts keys (not its parents).

Comment #35

xtfer commented 21 August 2011 at 09:45

If thats the approach you were thinking of, let me know and I'll add tests.

Comment #36

Crell commented 21 August 2011 at 20:37

Yes, that's more what we're looking for, I think, approach A. We cannot assume that anything with a value has been used, since it may be set explicitly.

Although... Now that I think about it, I've had this discussion with Sam and David Strauss before. We cannot rely on "this key has been used" to be the deciding factor on whether or not it's relevant. A given block/plugin/scope/whatever may only use a given key under some circumstances (eg, check the nid only if the user is not in the admin role). So we don't just need "give me what's been used", we need "give me these values that I know are relevant, just trust me." Or maybe both. Hm.

Let's go ahead with this for now and add explicit fetching in a later patch. We still need tests, and there's still some whitespace in there. :-)

Comment #37

xtfer commented 21 August 2011 at 21:06

Yes, I'd been considering that as well, though that shouldn't be too hard. Array of values? The caller could do its own filtering in any case. Tests forthcoming,,,

Comment #38

Stalski commented 22 August 2011 at 00:05

Status	File	Size
new	butler_1177246_38.patch	11.11 KB

I also gave it a try, I did not see these new posts before.
I think the patches could be merged, since I did not do anything for the cached keys yet.

In this patch are included:
- A example object that is fetched from context through handler
- Overrides of the example object
- Problem with context requests not knowing that the contextKey was updated.
This is done in a page callback on butler-objects/1 and one simpletest function is added.

Comment #39

attiks commented 23 August 2011 at 14:16

sub

Comment #40

Stalski commented 24 August 2011 at 13:41

Status	File	Size
new	butler_1177246_40.patch	11.45 KB

The patch in #38 shows we might have a problem with requesting offset keys within context (keys).

I've added a simpletest function that exposes this problem in level 2 where a "id" property is requested while the id returned from level 1. So currently, if we don't explicitly set the offset key for a context key (E.g. example:foo), than we will get the wrong ID.

The question here is: Do we need to do load the properties of an object that is put into context?

Larry told me to create an issue for it, but this is a better place imo. The patch included is improved for testing purposes.
Thx for feedback.

Comment #41

Crell commented 25 August 2011 at 01:34

I think the patch in #34 is the right way forward here. It's more complete, but just needs tests. Hopefully either xtfer or Stalski can write those in the next day or so, as then we can get this committed before Friday's code sprint. :-)

Stalski did find and point out to me a possible issue with the way overrides interact with nesting. That's what he's talking about above. That is irrelevant to the question of the value we return itself, so I do believe that should be a separate issue. Stalski, please open a new ticket for it.

Also, xtfer, you should re-pull your sandbox. I committed a few patches late last week, including fixing the "oh yeah, we don't actually stack properly" bug. :-)

Comment #42

Stalski commented 25 August 2011 at 18:18

I'll give it a try tomorrow morning. I do have to say I don't see much difference except that the patch in 40 includes tests.
So I'll take another look and try to understand what you meant. Probably the patch I tried is a too big a step.

Comment #43

Stalski commented 26 August 2011 at 01:13

Status	File	Size
new	butler_context_values_1177246_43.patch	9.86 KB

I had the whitespaces as well, those should be fixed. Also added a closing ) in an if clause.

I added a implementation of the contextValueInterface as well as a handler that can take a parameter to mimic the loading of a example instance (E.g. load from url)

Most basic tests are included. I hope you can work with that tomorrow, have fun at the sprint ;)