Add a storage (API) for persistent non-configuration state [#1175054]

Comment	File	Size	Author
#151	1175054-151-state.patch	24.36 KB	Anonymous (not verified)
#148	1175054_148.patch	30.28 KB	chx
#148	interdiff.txt	1.7 KB	chx
#147	1175054_147.patch	30.28 KB	arlinsandbulte
#145	1175054_145.patch	30.28 KB	chx
#145	interdiff.txt	8.86 KB	chx
#140	1175054_138.patch	28.26 KB	chx
#140	interdiff.txt	1.47 KB	chx
#137	1175054_137.patch	28.27 KB	chx
#137	interdiff.txt	821 bytes	chx
#129	config.state_.129.patch	28.54 KB	sun
#129	interdiff.txt	17.88 KB	sun
#126	1175054_126.patch	25.78 KB	chx
#119	1175054_119.patch	25.78 KB	chx
#98	1175054_98.patch	26.52 KB	chx
#94	1175054_94.patch	26.69 KB	chx
#93	1175054_93.patch	26.68 KB	chx
#89	1175054_89.patch	25.19 KB	chx
#87	1175054_87.patch	25.19 KB	chx
#86	1175054_86.patch	21.29 KB	chx
#84	1175054_84.patch	21.24 KB	chx
#81	1175054_81.patch	21.12 KB	chx
#75	1175054_74.patch	20.43 KB	chx
#75	state-do-not-test.patch	6.78 KB	chx
#68	drupal-1175054-68.patch	8 KB	tim.plunkett
#46	drupal8.state_.46.patch	7.88 KB	sun

Comment #1

boombatower commented 1 June 2011 at 07:18

Subscribe.

Log in or register to post comments

Comment #2

fabianx commented 7 June 2011 at 17:38

Subscribe.

Log in or register to post comments

Comment #3

lars toomre commented 7 June 2011 at 18:41

On one of my active websites, there are approximately 2,000 entries in my variables table with a composition of approximately 5%/85%/10% from core/contrib/custom modules. From this live example, I would guesstimate that more than 90% are key/value or key/array types with a great deal tracking information between page requests or cron jobs, or storing localized configuration values.

Hence, I think it is critical for the contrib module world that this new API exist before variable_get() and variable_set() go away. Thus in the upgrade to D8, developers can decide on whether each variable_get should get replaced by the new config API call or by this new key/value API.

+1 for this initiative.

Log in or register to post comments

Comment #4

boombatower commented 7 June 2011 at 19:30

A nice addition would also be not storing values that are default. Just saving many of the settings pages generates lots of variables that are unnecessary.

Log in or register to post comments

Comment #5

catch

he/him

English

commented 8 June 2011 at 01:07

This would specifically be for things that never have a settings page, I don't know what the plan is for the configuration initiative and settings pages.

Log in or register to post comments

Comment #6

Crell commented 22 June 2011 at 16:47

Subscribe.

Log in or register to post comments

Comment #7

yched commented 13 July 2011 at 09:36

subscribe

Log in or register to post comments

Comment #8

gdd

he/him

English

Portland, OR

commented 28 September 2011 at 08:56

Over at #1288142: Modify variable_set() / variable_get() to use the config system we have talked about having a separate config object that we use for stuff needed in early bootstrap, which would include a lot of the things being discussed in this issue. When you create a new config object, you pass it a class to use, so in this case we would use a SettingsConfig class or something, which accesses a separate storage area (db? settings.php? both?.) Not completely sure how this would look, but it would be nice if we had a common API for accessing everything. One thing we could do for this is subdivide things out such that things we need on all page loads are collected in one place, while things that are uncommon are elsewhere. We haven't talked much about caching for the config system yet so I don't have a grasp on those issues.

The issue above is focused on converting variable_get()/set() to the new config system, which would allow dumping the variables table but keeping backwards compatibility while we move stuff around.

Log in or register to post comments

Comment #9

pounard

French

commented 29 September 2011 at 07:48

sub

Log in or register to post comments

Comment #10

fabsor commented 5 October 2011 at 16:14

Subscribing

Log in or register to post comments

Comment #11

Mike Wacker commented 10 October 2011 at 05:52

+1 to Lars Toomre in #3. I agree its critical we find this replacement before we remove the old variable APIs. The model for the old variable APIs was to have persistent storage for anything that doesn't fit nicely into a DB table, whereas the model for the config APIs is to have persistent storage for only configuration.

If you want to replace the old variable APIs with the config APIs, then you need to provide a way to handle the cases where the old variable model doesn't map to the config model.

Another approach is to use the new config APIs only for that which is truly configuration, and leave around the old variable APIs for everything else. However, this approach could hinder adoption of the config API for contrib modules; not touching your variable_* calls when you upgrade the module to D8 won't break anything, at least until you try to do a configuration swap.

Log in or register to post comments

Comment #12

gdd

he/him

English

Portland, OR

commented 10 October 2011 at 06:10

As an update, fabsor spent some time this weekend trying to port variable_get()/set() to the config system and his results are at http://drupal.org/node/1288142#comment-5091964. Summary: Learned a lot, probably not worthwhile to finish as it would be faster just to convert all the subsystems.

Log in or register to post comments

Comment #13

catch

he/him

English

commented 22 November 2011 at 13:05

Title:

Build a replacement for non-configuration things we store in variables

» Add a persistent key/value store for non-configuration stuff

Per http://groups.drupal.org/node/191283 and http://groups.drupal.org/node/190729 it looks like CMI will be writing through to the persistent store as well as calling lots of hooks every time configuration is saved.

This makes having an API for non-configuration data (derived but not just a cache entry) very necessary if we're going to avoid some of the same performance nightmares we've had from variable_set(). I'll try to write some pseudo-code for this and/or update the issue summary, keep thinking about this one but not really getting to it either.

Log in or register to post comments

Comment #14

pounard

French

commented 23 November 2011 at 00:06

This makes having an API for non-configuration data (derived but not just a cache entry) very necessary if we're going to avoid some of the same performance nightmares we've had from variable_set().

@catch I have some ideas about that, I'd be glad to talk to you in order to evualate their level of stupidity before exposing it to the core issue.

Log in or register to post comments

Comment #15

Anonymous (not verified) commented 23 November 2011 at 00:17

eeeenteresting. don't know how i'd missed this issue until now.

Log in or register to post comments

Comment #16

pounard

French

commented 23 November 2011 at 16:28

My thoughts about this was to create a storage interface for key/value store. This interface would have a default core implementation based upon database.

Using this kind of interface, and such as cache backends, a bin name, we could easily make them swappable by configuration and be able to use other backends (redis, mongodb, another database, etc...) quite easily on a per bin name basis (exactly like cache).

By introducing the concept of store, decoupled of anything else, we could use it in various systems, cache would become by definition a key/value store specialization instead of living on its own.

It would also allow system administrator to group them on various storage, it's be higly flexible, with a default acceptable behavior (database).

For example, highlightening common features such as async writes and pipelining at the interface level, we could easily use it when available on the backend (I'm thinking a lot of Redis when writing that) but simulate it in a efficient way on code level when the backend doesn't support it.

For example, having a MySQLInd SQL backend would allow to developer a really efficient statistic storage using ASYNC queries while the default MySQLi based PDO driver would transparently not do it async, the end developer (aka the API user) wouldn't mind and just pass the "async" option when it knows it's safe to use it in his algorithm.

Extending this concept, we could create other types of storage, document oriented, key/value, pub/sub engine, statistics, queue is one of them too, etc.. using a complete interface for each one of them and sharing common code among all these storage engines, we would have a nice API in there.

Keeping this highly decoupled from business can create performance problems, but orienting the interfaces towards storage type interface capabilities maximasation, we can make it quite efficient.

In order for it to be fully configurable for system administrators, having the modules declaring the bin/type couple they'll use would make the model a lot more predictable.

Log in or register to post comments

Comment #17

neclimdul

He/Him

commented 24 November 2011 at 15:33

I think we're getting a bit out in the weeds talking about all the possible back-ends. I think important shift of this is "non-configuration" information since that's fundamentally different from what the CMI initiative is currently designing.

Honestly, it almost sounds like a better back-end for variable_(get|set) is what we're looking for with configuration data being migrated out into the separate CMI configuration.

The only thing that worries me about the OP is that cache mapping, default cache back-ends, and path white-lists are configuration IMHO. At least for the cache stuff though is baked into the WSCCI plugin design and using CMI so its sort of a moot point. Depending on what the path white-lists are for they also are probably baked in as well. Feel free to contact me about that out of this issue since its only peripherally related to the discussion.

Log in or register to post comments

Comment #18

pounard

French

commented 21 February 2012 at 18:16

Isn't that a duplicate of #1202336: Add a key/value store API ? For curious people I started that http://drupal.org/sandbox/pounard/1360694

Log in or register to post comments

Comment #19

catch

he/him

English

commented 22 February 2012 at 00:48

It's not really a duplicate. Whether we have a generic api or not, we need somewhere to put derived information - neither content nor config. The other issue is about sharing the same api between subsystems, this would be a new subsystem.

Log in or register to post comments

Comment #20

fgm

French

Paris, France

commented 22 February 2012 at 13:44

Just a wording note: in a discussion on this topic some time ago, yched characterized such data as "state".

We might want to use the term to have a consistent naming so anyone seeing it understands that we are not talking about config or data.

Log in or register to post comments

Comment #21

andypost

he/him

Russian

commented 24 May 2012 at 12:04

Suppose State is more generic but my association with this a bit different. State is a part of workflow also @catch mentioned about state in #1591412-11: Convert tracker settings to configuration system
So we could use tracker for first test

Log in or register to post comments

Comment #22

pounard

French

commented 24 May 2012 at 12:53

I forgot to tell in this issue that I changed my mind about having a generic key/value store, I don't think it make any sense when overlooking over the various Drupal business matters, see http://drupal.org/node/1202336#comment-5714508 (I know all of this is rather old).

Log in or register to post comments

Comment #23

chx commented 27 May 2012 at 07:24

pounard, let's make this clear, are you against this issue or not?

Log in or register to post comments

Comment #24

pounard

French

commented 27 May 2012 at 14:56

I am against, it's pretty clear now I think.

Log in or register to post comments

Comment #25

catch

he/him

English

commented 27 May 2012 at 15:35

@pounard: are you sure you're on the right issue. There needs to be somewhere to put state information in Drupal 8, that should not be config, nor the existing variable system, so no you're not clear at all.

Log in or register to post comments

Comment #26

pounard

French

commented 27 May 2012 at 21:57

You are right, may be this one is indeed a bit more specialized, case in which is seems more legit. If we identify it as a reall business need with a specialized and not a gigantic über generic API for everything, then I'm pro for it. But if it leaks and starts to look like the other issue, I may change my mind.

Log in or register to post comments

Comment #27

dww

we/he/they

commented 30 May 2012 at 16:32

Title:

Add a persistent key/value store for non-configuration stuff

» Add a persistent key/value store for non-configuration state

I posted some thoughts on how such a system would be useful for the update manager over at #1202336-31: Add a key/value store API in case that helps here. It's not entirely clear to me why these are two separate issues, but I guess the theory is that's for a general API potentially shared across multiple subsystems while this is for building a specific system. But the system needs an API to be useful. Whatever. ;) If people think this is the right issue, great.

My interest is that I would love to see a persistent + non-volatile storage mechanism in core that's *not* the CMI system which could be used in D8 to replace the hacked custom cache system used to record local state and to locally cache expensive remote data by D7 (and D6) update manager.

In thinking about this, I think it's safe to call everything update manager wants to store "state", even the local cache of the remote data, which is stuff like the current state of available releases for a given project. That's state, too. So I think I'm down with adopting "state" as the terminology for "non-configuration stuff", unless important counter-examples surface.

Cheers,
-Derek

Log in or register to post comments

Comment #28

sun

German

Karlsruhe

commented 30 May 2012 at 17:06

The summary mentions a cache wrapper around the stored things. I'm opposed to that, because the majority of data that will be stored is not going to need a cache. Thus, if a particular implementation needs a cache, then it can use the Cache API.
The most simple implementation for this already exists; the Database subsystem. It boils down to:
```
CREATE TABLE `state` (
  extension VARCHAR(255) NOT NULL,
  name VARCHAR(255) NOT NULL,
  value TEXT DEFAULT '',
  serialized INT(1) DEFAULT 0,
  created INT DEFAULT TIMESTAMP(),
)
```
Notes:
- 'extension' is important, because we need to be able to clean up. (this is non-volatile, after all)
- 'created' is debatable. (could as well be a separate record)
There's a mongo driver for DBTNG already, so why not?

Log in or register to post comments

Comment #29

catch

he/him

English

commented 31 May 2012 at 04:16

The summary only mentions putting caching into implementations, not baking in a cache layer into the API itself.

The current patch is using the variables table, I'd be much happier with the default being a 'state' table and I agree we should stick with that name unless something amazingly better comes up.

I'm not sure why there needs to be a specific extension column, as opposed to explicit cleanup on module uninstall? Adding an extension column and then clearing by extension, means implicitly restricting implementations to storage layers that are not straight key/value stores.

Created is useful for manually debugging stuff, but should not be in the API I think, needs more discussion though.

The mongodb driver for dbtng isn't a good thing, putting it in mongo should be done via a MongoDB-specific implementation of the storage API - the driver only exists for stuff where this is not possible.

@msonnabaum, would you mind copying your patch from http://drupal.org/node/1202336#comment-6056456 over to here?

Log in or register to post comments

Comment #30

sun

German

Karlsruhe

commented 31 May 2012 at 06:19

We already know that a manual clean-up does not really work out in the real world, and I stopped to believe that this would magically change long ago already. :)

That said, we can simply drop the 'extension' column idea and replace it with the config system's namespacing pattern; i.e., using a dot/period to declare and delimit the namespace/item owner from the item's name.

(Alas, we could have introduced exactly that pattern also for variables years ago in order to save us from many headaches caused by modules that do not properly uninstall their stuff. ;) I'm actively using a dot/period in variable names in a contrib project and of course it works just fine; thus, uninstall is as simple and safe to do as db_delete('variables')->condition('name', $module . '.%', 'LIKE')->execute();)

However, speaking of, if you want true k/v store compatibility, then the 'created' and 'serialized' columns probably don't fly as well...?

Log in or register to post comments

Comment #31

catch

he/him

English

commented 31 May 2012 at 06:28

We already know that a manual clean-up does not really work out in the real world, and I stopped to believe that this would magically change long ago already. :)

That doesn't bother me too much, I'd rather risk cruft than bake something into the API which might not be appropriate, especially since regardless of what happens with this, we can be pretty sure that we won't repeat variable_init() again.

A prefix clear is actually harder to implement than an extension namespace. With the explicit namespace, there's a finite list you can keep in a separate array somewhere, with prefixes there's infinite combinations - that problem was extremely hard to solve in memcache (not that there'll be a pure memcache backend for this but unless there's a really good reason I think it makes sense to restrict it to things that memcache can do easily).

Created - as long as we don't add something to the API to select items based on created, then it doesn't really matter if it's there or not.

Serialized - this doesn't matter at all, it's only an internal implementation to avoid serializing scalars no?

Log in or register to post comments

Comment #32

sun

German

Karlsruhe

commented 31 May 2012 at 06:49

I'm not really familiar with the namespace problem you pointed out - can you clarify some more why that is a problem?

Because, essentially, we're going to do the identical thing for the config system, and the "namespace/owner name" prefix in the config object names allows us to ensure to keep the system clean without relying on anyone to write explicit code for doing so (see #1505090: Clean up all (non-default) configuration (files) when a module is uninstalled).

Since there seems to be some desire to be able to replace or enhance the config system's active store with a different backend (e.g., mongo, redis, etc), it sounds to me as if a potential problem with namespaces/owner-prefixes would also affect the config system?

Lastly, I'm a bit skeptical whether memcache should be the measure -- I'd consider that to be an invalid backend for a persistent/non-volatile state cache in the first place.

Log in or register to post comments

Comment #33

pounard

French

commented 31 May 2012 at 07:06

Memcache is a key/value store, quite pure because it doesn't have fancy features except get/set/delete sort of, it's the best example we can use for a generic key/value store. If we write an API that cannot be plugged onto memcache then it's not a generic key/value API. I think the prefix problem is not important to us right now at this phase, as I can imagine, we are not going to do mass keys deletion using an arbitrary prefix filter, I hope that modules using it will be a bit more smarter than this.

If you start to need additional columns such as extension or created then this is not a key/value storage. One of the drawbacks of a key/value store is that you can fill it up easily without knowing how to clear it in a smart way since you cannot apply any filter on keys easily (except in advanced backends such as Redis). The fact you added the extension column (which is per definition not compatible with a generic key/value store) proves that we need more than a key/value store. We already started to add management logic in the data shape itself (thus defining a schema) in the form of the potential filters we will need and that even before we had any real business matter in mind, which defies the purpose of using a key/value store at the first place.

Still think a generic key/value store is shooting ourselves in the foot at this point.

Log in or register to post comments

Comment #34

catch

he/him

English

commented 31 May 2012 at 07:27

A strict key/value store can only look up values by the key that was used to add it.

That means you can't look up a list of values based on other criteria (such as a namespace) or a prefix (or any other substring) of a key. Memcache is just an easy example because I'm most familiar with it, there's no capability to list keys in the system - which is why the "who's online" block doesn't work for example. I can't off the top of my head think of a store with this limitation that's also disk backed so it might not matter.

(Note that it is possible to invalidate based on key substrings (very hard to get right though, it was broken for a long time in the memcache contrib project and it's very ugly now) and other fixed criteria like tags or namespaces (less difficult), since you can store that metadata separately then compare items on retrieval with that stored metadata. Both of these add some overhead compared to not having to do so though. However that's only really suitable for caching and things that live only in memory, since you're not actually deleting anything when you do that, just ensuring you won't return it if it's requested.)

Config has a hard requirement to allow locating all items with a specific prefix, so yeah it's only going to allow for backends like redis or mongodb (or SQL) that have that capability, that doesn't rule out a storage engine that includes a caching layer that can use APC or memcahe, it would just have to ignore the cache layer when retrieving lists of keys).

In the case of a state subsystem, ideally we want to keep it at the absolute lowest common denominator in terms of requirements, then if it's unavoidable we can add additional capabilities later (or add a new subsystem on top of it with those capabilities). There was a similar discussion in #1548286: API for handling unique queue items - i.e. that baking 'uniqueness' into the queue API would make implementations with some storage engines impossible.

Log in or register to post comments

Comment #35

sun

German

Karlsruhe

commented 31 May 2012 at 08:15

Well, if you can solve it in memcache with tags or potentially even a separate record that holds a mapping, then I think we're fine?

E.g.,:

$name = 'update.release_info';
$value = '...';

list($owner) = explode($name, '.', 2);

// Store the item.
$store->set($name, $value);

// Record the mapping. (if tags are no option)
$map = $store->get($owner);
$map[$name] = $name;
$store->set($owner, $map);

In general, I feel relatively strongly about this, since we really have sufficient historical experience with dumping grounds - which are always special in a modular system like Drupal is. Variables is not the only one, also remember {users}.data, and many other data bins that developers either forget and fail to clean up on uninstall, or which might even be close to impossible to clean up properly (e.g., {users}.data). The system should be smart enough to figure that out for you, and make sure on its own that an uninstalled module leaves no stray traces behind.

Log in or register to post comments

Comment #36

catch

he/him

English

commented 31 May 2012 at 10:51

{users}.data, like variables, is something that's loaded unconditionally all the time, so the bloat/performance concerns are much more of an issue than they are with system state.

The 'magic' cleanup of configuration doesn't currently work at the moment, since related data isn't cleaned up with it per #1609760: hook_image_style_*() is not invoked for image styles upon Image module installation, we have similar problems with the field API too. So we neither do dumping grounds nor automated cleanup very well in core.
On the other hand, the only real objection I have to this is whether there's a disk-backed key/value store that might be useful for this, that doesn't support some variation of prefixes/namespaces, and I don't know whether that's the case or not. I can only really think of about 4-5 optional core modules that need this API though, and they all have at most 1-2 use cases each for it - it's not going to be nearly as much used as the variables system was since by definition we're splitting that up.

Log in or register to post comments

Comment #37

pounard

French

commented 31 May 2012 at 18:35

~~Just a note, #1202336: Add a key/value store API is a duplicate of this issue and should be closed definitively, only the motive changes, but the code/feature proposal is exactly the same.~~

EDIT: And last one striked.

Log in or register to post comments

Comment #38

sun

German

Karlsruhe

commented 31 May 2012 at 13:44

@pounard: No, it's not. And perhaps that explains some of your confusion. This issue here is about introducing a single (one) storage (in its simplest form a db table), which wouldn't necessarily need a generic key/value store API, just a sane way to interact with it (again, in its simplest form the already existing database layer). But because the data to store for this issue are simple key/value pairs, we don't necessary need relational database features, and want to allow people to use a different backend for the stuff being stored. Thus, we're likely going to look into making the backend pluggable. The result of that will most probably be a couple of simple storage controllers that look almost identically to the storage controllers that we're fleshing out for the config system already. And at that point, the generic idea of #1202336: Add a key/value store API will become relevant again. But anyway, as already mentioned over there, let's do it pragmatically and work on this here in an isolated fashion first.

Log in or register to post comments

Comment #39

pounard

French

commented 31 May 2012 at 18:34

This is the exact same issue, those have no differences. Making a difference between them is saying that we would want 2 different key/value store interfaces, this makes no sense at all. If you expose methods to access a simple database based one in here, even without being pluggable, you achieve the same goal that the other issue, just not backendable. There is no confusion on my side, thinking both are different is being skizophernic somehow. Your definition of being pragmatic is making small steps, but this an error. You should conceive the full API prior to make any patch in here, because the simplest expression you will end up with in here won't fit nothing you'd have liked to have in the first place and you'll need to rewrite everything. This is a time and energy loss.

EDIT: Sorry for the noise. Confusing issue, poorly named.

Log in or register to post comments

Comment #40

sun

German

Karlsruhe

commented 31 May 2012 at 14:24

@pounard: Please keep that discussion in #1202336: Add a key/value store API. This issue here has a clear scope and focus, and will have to happen either way. Thanks.

Log in or register to post comments

Comment #41

pounard

French

commented 31 May 2012 at 18:32

Status:

Needs review

» Active

@sun I think you are doing some kind of ugly mistake here, and if that matters, I was participating to both issues almost since the beginning, and I'm telling you, historically, knowing both of them, they are the same.

EDIT:

will have to happen either way

You want it to happen, huge difference. Until now, there is still no concrete use case in either issue. Please note that we don't even have declined a proper interface and still don't know if we'll end up with one that fit more than one (two if we are lucky) use cases. I'm here to help, I proposed code a long time ago explaining challenges, wins and looses dues to such abstraction, even if minimal. I'm kinda sure I know what I am saying here.

If you proceed with this API, whether I or you like or not, you will eventually get rid of it later because it won't fit any business matter out there, it will be bloated and unperformant, moreover you still don't have even one concrete use case for now.

~~EDIT: Various typo fixes (I'm not a native english speaker sorry).~~

EDIT: I apologize for that. May I rephrase: @sun the wording in the issue description is not what the issue is about, at least for half of it is wrong, please update it.

Log in or register to post comments

Comment #42

catch

he/him

English

commented 31 May 2012 at 15:22

Until now, there is still no concrete use case in either issue.

Apart from the ones listed in the first post of this issue that have been there for almost exactly a year at this point.

Log in or register to post comments

Comment #43

pounard

French

commented 31 May 2012 at 15:36

The installer: needs a different workflow and API since it's working on a non fully installed site, this use case will never be appliable here, except if we manage to do a session or a file backend
The path alias whitelist: I don't know this part much, maybe this one is right
The default caching backend (and memcache.inc uses variables too reluctantly): those one requires garbage collection, namespacing, and selection by prefix abilities, this is not the simplest use case, so it doesn't really fit here
The css and js aggregation systems: is rotten, and if it needs to track down its file list in a table, it'd better be its own with some filtering capabilities (file version, file type, etc...)

Log in or register to post comments

Comment #44

dww

we/he/they

commented 31 May 2012 at 16:13

One thing I agree with pounard about is the confusion over why we have two separate issues for this discussion. I don't see a clear distinction between "define an API" and "build a system". If we're going to build a system, it should have an API. And in practice, we're just forking the discussion having it in two places. But I don't care -- if catch and sun want two issues and it's clear to them that this is about a specific system with its own API, and #1202336: Add a key/value store API is about someday merging this system with others, fine. If that's what y'all have in mind, let's rename #1202336 to something like "merge persistent k/v store APIs once we have some" or something.

However, this is a bit annoying:

you still don't have even one concrete use case for now.

Aside from what catch has said, what about #27? That sure seems like a valid use-case for this. And if we don't do anything horrendously stupid, whatever we build to solve that could be used to solve some other "state" storage needs such as those listed in the summary (menu rebuild, cron semaphore, etc), not to mention probably dozens or hundreds of use cases in contrib.

Log in or register to post comments

Comment #45

pounard

French

commented 31 May 2012 at 16:41

menu rebuild, cron semaphore, etc

I was about to write a more detailed post about why a key/value store is dumb for storing those kind of states, because what we need here is not a storage, but a specialized notification system, more precisely a pub/sub based notification system that would be able to propagate message to every PHP frontend to avoid desync such as we can have when use the APC cache backend. Those examples are perfect for demonstrating that the issue title itself is wrong, its title is false and confusing, this partly because it's not in accordance with the real life problem explained we experienced all along.

EDIT: @dww I guess the update manager issue may be a fit for a key/value storage, but are you sure in the end you wouldn't need a more detailled schema? Handling expiration, complex querying, aggregation queries etc?

Log in or register to post comments

Comment #46

sun

German

Karlsruhe

commented 31 May 2012 at 17:35

Status:

Active

» Needs review

Status	File	Size
new	drupal8.state_.46.patch	7.88 KB

Meh.

- Doesn't convert the 'node_cron_last' variable of Node module.

- Contains a @todo that points to the next best target ('install_time').

- #1497310-35: Convert statistics settings to configuration system points to another use-case.

- As @dww pointed out, all the data in {cache_update} belongs in there, too.

- Generally just serves as an example only to kick-start actual progress here. Without any doubt, we have plenty of use-cases within core alone.

Log in or register to post comments

Comment #47

pounard

French

commented 31 May 2012 at 18:07

@sun, this stable table makes a lot of sense, indeed, I probably have to apologize for my various misunderstandings. This issue must be renamed, we are not talking about key/value store, but about saving mutant state information.

Log in or register to post comments

Comment #48

pounard

French

commented 31 May 2012 at 18:30

I'd like to expose some use cases and detail more precisely the need around this issue.

The issue is title key/value interface, while the exposed use cases may apparently fit with it, I'm not sure the terms are right. I re-read the full post and there is indeed a lot of details I'm really wrong about, and I'd like to apoligize for that.

Most important use cases

First of all, the most important uses cases where this kind of API is critical are a basic report of the old variable_get() and variable_set() API, but which would avoid taxing the configuration system.

menu rebuild needed and cron semaphore

The menu rebuild needed is a volatile message, that the first thread must take it, act accordingly, them remove it. This is much closer from a volatile event system than a key value store. The cron semaphore acts the same way, but without the need of persisting between two requests: the message is read during the same runtime, but at shutdown time, in that particular case, we need the same feature, but non persistent.

Both those use case (and I guess there are a lot others in core) do not need a state table but an event queue instead. This would even be better by design because we could define the event processor at set time, or using some hooks: using a stricter API we'd make it predictable and easier to debug. We would also be able to unit test it and ensure it works correctly once for all, and remove some deprecated tests from the core, regarding cron and menu rebuild.

I think this specific use case needs a simple notification system (event queue) instead of a key/value store, and does not fit in this issue, but is a really interesting use case which happens often enough for us to open a new issue and design such system.

After greping and searching all over core, I identified all those use case would could benefit from this:

menu_rebuild_needed variable
node_access_needs_rebuild variable
cron_semaphore variable

Althought the cron_semaphore could possibly be a variable at runtime instead, it wouldn't hurt anyone. The most important thing about cron is that its uniqueness can be ensured thanks to the locking framewrk.

cron_last like variables

These are the real system state variables the issue is about all along, this may be solved by a minimal and internal key/value like store, but I wouldn't call it a key/value store, and call it a state table instead. If you want to stop hearing me grunting, you should probably rename this issue to Add a low level system state table or such, and re-focus the full issue entirely decoupling it from the key/value word, which has never been used in the right context in here.

This kind of data needs persistence, for most it, consistency ensured too. Those reprensent the real system state at a single moment in time: they are what the issue is about. Search core code I could identify those real life use cases:

maintenance_mode variable
update_last_check variable
cron_last variable
update_last_check variable
tracker_index_nid variable
statistic_day_timestamp variable

Those are the only one use case we should higlight in this issue, all others IMHO should be removed from the original post description.

Update module storage

I don't know what use case did you see @dww for the update module. While reading the code I identified the last update timestamp and categorized it upper. I'm thinking that if you need a more advanced storage somewhere, then it probably does not belong to this issue.

Cache variable_set() calls

I have to introspect those a bit more.

Path alias whitelist

Path alias whitelist is even a more edge case than every other. It doesn't need to store a single system state, but a whole blob of data, and needs rebuild each time the path changes, I think this one fits more as a cache entry.

CSS and JS aggregation system

The current CSS and JS aggregation system, at least the part that stores files mapping, should probably be revamped completely. An ideal system would probably keep, aside of the file names hash a key, a version number tied closely to timestamp. This would make it able to serve different versions of the same aggregation lot, if outdated browsers attempt to display outdated pages from cache. Additionally, coupling this with a smart timestamp usage this would also allow us to invalidate older files when we are sure the various clients cache lifetime passed, thus allowing to garbage collect the hash table during the site lifetime.

In theory, we shouldn't use such state table for storing this information. Following the actual Drupal 7 broken code we can, but it's not a good example for the issue and should be removed from the original post.

The installer and updater

The installer will work on a unfinished site, thus such using core API is dangerous. It fits with the need expressed by the state table, but is something to do carefully IMHO.

Another use case: clearing desynchronized cache on multiple frontends

It is highly possible that, in bigger environments, you may want to use APC to store some caches such as the cache_bootstrap table for raw performances. While accomplishing that may improve performances, it will also create cache state desync between your frontends: when clearing the caches only the frontend it was triggered upon will erase the local APC cache, all others will remain with the old cache.

This use case is more complex, it needs a real event system to achieve a complete and clean cache clearing. This event system must be able to push messages only once per subscriber, to all its subscribers, considering that the subscribers themselves are the frontends (which in Drupal is something agnostic from all known business, because thanks to the database we can consider the collection of frontend as one big frontend able to run concurent requests over the same storage backend).

There are multiple solutions for a problem like this, the first one would be to identify each frontend, and register it as a system event subscriber in a system event channel and build over that concept a simple async pub/sub API.

Going back to the original issue

As I said, the issue is misleading for multiple reasons:

its description, because most use cases identifier there does not actually fit with the real issue;
its title is wrong and misleading because people can't stop linking it to #1202336: Add a key/value store API (and yes, I have been one of them);

Once again, I have to say it, I was myself misled all along, until I started to re-read (once again) the full thread, and once @dww posted his comment speaking about the cron semaphore (which is here a good use case, but is not quoted anywhere in this thread).

@sun You are indeed right saying it's not the same issue, where the issue description and wrong (and then myself too) is that nobody ever changed the wording, which is important, because we definitely are not speaking about a key/value store here, even if somewhat close in the implementation.

Log in or register to post comments

Comment #49

dww

we/he/they

commented 31 May 2012 at 19:28

Status:

Active

» Needs review

@pounard: Since when is "key/value store" a totally special-case meaning with all your baggage and grunting? State consists of keys and values. We need to store them. WTF? ;)

Anyway, yes -- update_last_check is state and could be handled here. However, I'm mainly talking about the cache of the release history feeds. I've already explained this and linked to it, so I'm not going to do it again. No, we don't need any special/fancy additional storage. We just need a way to say "here's a blob of data about available releases for project X" and "give me the blob of data about available releases for project X or tell me you don't know". This doesn't need fancy expiration stuff, either. In fact, the whole point is that we do *not* want this data to fancily expire without update manager being involved in the decision (namely, we only clear old data once we have new data to replace it with).

Re #46: Do we care that {state} is totally unplugable here? I personally don't see a need to swap out a DB-backed {state} table with something else, but it seems other folks are interested in that. Do we want some abstraction here to allow different implementations, or do we just force this to be a DB-centric state system?

Log in or register to post comments

Comment #50

pounard

French

commented 31 May 2012 at 20:00

@pounard: Since when is "key/value store" a totally special-case meaning with all your baggage and grunting? State consists of keys and values. We need to store them. WTF? ;)

Yes, I know, but in this particular case, the same person opened two issues entitled key/value something, and I have to admit, I might be a bit off the edge here, but when I read key/value, I think memcache and redis quite instantly.

I think that it might not be useful for it to be pluggable, it's persistent data we are not supposed to hit every page, but a caching layer above might be interesting in the end, if we measure the opposite (this table being tickled at each request would be a terrible thing).

Log in or register to post comments

Comment #51

catch

he/him

English

commented 1 June 2012 at 01:10

With the current stuff destined for this, it will get hit every request. css/js aggregates and the poor mans cron check are at least two. Cache backends that need to store state would be a third. While we may end up storing css/js aggregates elsewhere we don't yet, and we can't release 8.x with anything left using the variables system.

Log in or register to post comments

Comment #52

pounard

French

commented 1 June 2012 at 06:53

Doesn't it risk a performance regression? This probably will need to be cached. We should have a lot of read but a few writes.

Log in or register to post comments

Comment #53

sun

German

Karlsruhe

commented 1 June 2012 at 08:17

My current recommendation would be to move on with this patch as is. That would allow us to spawn many novice tasks for the simple conversion of affected variables.

OTOH, when stuff like css/js aggregates gets converted, then the direct db access to state may cause a small performance regression for sites that use(d) a specialized cache backend for the variables cache previously. Since the direct db queries can only be swapped out with routing them to a different db engine (e.g., mongo), there'd be no simple/comparable storage performance optimization method like before.

OTOH, besides css/js aggregate info and menu/router masks, I'm not really sure how much else state data there actually is on regular runtime/requests. Which in turn makes me question whether we actually need something more fancy in the first place.

If we really do, then I probably would not want to try to "wrap" the state in a cache layer, but instead, let there be a single config for state that denotes which storage controller to use. Whereas core ships with a DatabaseStorage only for now, which piggybacks into the database layer (and thus effectively does the identical thing as in this patch, just with a different interface). If someone would want to swap out the storage controller (other than mongodb, which exists as db driver already), he'd be able to, but would also have to make sure on his own to transfer the existing state data from the state database table into the alternative storage.

Lastly, I'd still want the automatic cleanup on module uninstallation, which is not contained in this patch yet, but should be trivial to do. As discussed before and up to #35, alternative storages should be able to find ways to track the item owners. (or if they choose so, they could also skip the cleanup entirely, as it only pertains to uninstallation, which is probably a rare case on highly optimized sites in the first place)

Log in or register to post comments

Comment #54

chx commented 1 June 2012 at 10:10

Did someone say mongodb :) ? Rebuilding the path whitelist is costly so I would not really want that to expire if at all possible ie. it falls under this patch.

Log in or register to post comments

Comment #55

sun

German

Karlsruhe

commented 1 June 2012 at 18:16

Title:

Add a persistent key/value store for non-configuration state

» Add a storage (API) for persistent non-configuration state

Better title.

I'd like to get this issue off the table ASAP. So in light of #53:

- memcache is not persistent, so irrelevant.

- mongodb has a DBTNG driver, so a plain/pure database-based approach would work fine for it.

- redis_ssi has no DBTNG driver, but I'm also not sure whether that is suitable for a persistent storage?

Anything else?

Log in or register to post comments

Comment #56

berdir

German

Switzerland

commented 1 June 2012 at 18:27

It makes absolutely no sense to me to consider a mongodb dbtng driver a valid solution. That's a hackish experiment (IMHO) that manually parses sql queries and creates matching "mongodb queries". This is slow and error-prone. Also, I don't think you could use it only for the state table, that's just not how dbtng drivers work.

Also, I don't consider db_query() an API.

So, why not something like this:

- Add a simple interface, maybe just get() and set(), can still be extended later.
- Either use a factory function like e.g. queue() so that you can do state()->get() or use drupal_container() for that.
- Add a hardcoded DBTNG implementation
- Optionally make it pluggable once we have plugins. Thanks to actually having an API, this should be quite easy, certainly easier than replacing tons of db_query()'s all over the place.

Log in or register to post comments

Comment #57

sun

German

Karlsruhe

commented 1 June 2012 at 19:07

@Berdir: Yeah, that was kinda the essence from the beginning ;) Wanna create a config-state-1175054-berdir branch based on the config-state-1175054-sun branch and make some progress? :) (Hint: Copying Drupal\Core\Config\.*Storage.* into Drupal\Core\State\ should get you to 80% already ;))

Log in or register to post comments

Comment #58

catch

he/him

English

commented 2 June 2012 at 02:48

Redis works just fine for persistent storage. There's already a patch with an interface in the key/value issue that was postponed on this one. Not sure why we can't start from there?

Log in or register to post comments

Comment #59

chx commented 2 June 2012 at 05:54

Yeah the mongodb_dbtng driver is better considered nonexisting. And I agree with #58.

Log in or register to post comments

Comment #60

pounard

French

commented 2 June 2012 at 13:21

@sun Thanks for retitling, it's much better.
Berdir's #56 sounds like a fine solution.

Redis would fit perfectly with the use case, but we should focus on the interface and database implementation first. As I see it, but I might be wrong, I think than states won't represent a massive amount of data, it's supposedly a lot less than config itself, it'd may be good to chain the final storage with a cache proxy in front of it.

It think, trying to count today's use cases (less than 10 spotted I'd say), that they are a few and this is not gonna be a scalability problem, database is fine for this as a start, and chaining with a cached proxy in front will probably save us some SQL/backend requests, reducing it down to one for all on a normal run (data will change often, but is not supposed to change more than (arbitrary number) 1 request out of 10 maybe).

We probably have to use locks for state set (maybe as an optional parameter to the set method?). Aside of that, as I said upper, I think Berdir's plan is the right one for starting this. It separates those variables from the config, making it untainted by state.

EDIT: The caching stuff would be efficient and not memory intensive only if we drill down the use case for is as a state variable storage (tiny pieces of data, timestamps, arbitrary counts and small strings), and not a data storage, in that case, this would run just fine. I think that while focusing a simple interface (get and set primarily, and maybe a expiration time could be good too) some constraints should be determined considering usage, for example limiting the data size to store, to ensure people won't misuse it.

Log in or register to post comments

Comment #61

dww

we/he/they

commented 2 June 2012 at 16:00

Note: if this turns back into a "load the whole {state} in 1 query to save N requests" system (like {variables} is/was), it becomes useless for update manager (at least for the cache of the remote state that I've been talking about). That *is* a lot of data, and it definitely does *not* need to be loaded on every page. So, if it looks like we're going that way, y'all can remove that as a possible use case for this system (and we'll either keep our custom hacked "cache" system or generalize it in some other direction).

That said, definitely +1 to everything in #56. Something like that is what I assumed we were talking about here.

Cheers,
-Derek

Log in or register to post comments

Comment #62

sun

German

Karlsruhe

commented 2 June 2012 at 16:16

@dww: Next to resolving the variable conversion problem, I also want this new state storage to be compatible and suitable for Update module's release data. So yes (or actually no), we don't want a built-in cache layer for this.

That said, I looked at the latest patch in #1202336: Add a key/value store API once more, and it indeed looks like we can copypaste the interface and database storage implementation from there. I'm not sure whether we want to copy/retain the code to support multiple/different storage bins, but if we do, then I'd suggest to rename all of that to $bin instead of $collection. Otherwise, the database storage implementation should just simply hard-code '{state}'.

Log in or register to post comments

Comment #63

pounard

French

commented 2 June 2012 at 17:05

If we do this we're going back to implementing #1202336: Add a key/value store API but without the backendable storage, this makes it being the same again.

I think that update module, if it needs to store persistent data, should probably manage its own table to do this, especially if it needs to store huge data blob.

Log in or register to post comments

Comment #64

msonnabaum commented 3 June 2012 at 03:34

It sounds like you guys have just come up with a really great use case for #1202336: Add a key/value store API.

How about instead of copying that interface/dbtngcode, we just finish that up and then this class can extend it without actually overriding anything?

Log in or register to post comments

Comment #65

yched commented 20 June 2012 at 17:38

As for the amount / frequency of hits on this "state" store, looking the current D7 state-like cases is not enough.
CMI in D8 is much less "queryable" than D7 config-in-dedicated-db-tables: CMI can only do "give me all config items starting with some.prefix.*", whereas the D7 dedicated db tables allowed the full level of queryability given by the db schema and indexes you chose for subsystem tables. Field API, for instance, heavily relied on SQL queries for list of fields and instances definitions.

This means that, depending on the final structure that gets used for Field info in CMI files (that's not settled yet), Field API might need to pre-compile some data like "list of fields in a given entity bundle", "list of bundles a given field appears in",... for fast access.
This kind of data is derived from config, but is not config per-se (storing it in separate CMI entries would mean duplicating data and possibly allowing desynchronization). And, unless this hits massive community pushback, I was very much planning to store this kind of (possibly large, and queried basically on every page) stuff in the "state" store - at least to get high-performance storage pluggability for free.

This reasoning is not specific to Field API, of course. CMI being less queryable means other subsystem will need to store pre-compiled bits of config somewhere else too.

Log in or register to post comments

Comment #66

chx commented 20 June 2012 at 18:18

Funny, I always thought field API was using the lump of data collected in _field_info_collate_fields and only used field_read_fields to read inactive and/or deleted fields.

Log in or register to post comments

Comment #67

gdd

he/him

English

Portland, OR

commented 21 June 2012 at 04:38

@yched: This is essentially the same conclusion I have come to as a solution for #1479466: Convert IP address blocking to config system - the canonical source for the IP address list is in the active store/files, but on import it gets pulled into a dedicated table for queryability. When new values are added, they get saved into the active store and the new table, since saving into the blob is a much less expensive operation than reading from it. I think this is a perfectly valid approach to this problem, whether it gets stored in the generic k/v store or a separate table of some sort.

Log in or register to post comments

Comment #68

tim.plunkett

he/him

English

Philadelphia

commented 2 July 2012 at 23:56

Status	File	Size
new	drupal-1175054-68.patch	8 KB

This issue seems to have moved on since the last patch, but I rerolled it anyway just in case.

Log in or register to post comments

Comment #69

gdd

he/him

English

Portland, OR

commented 25 August 2012 at 10:12

Is there a sense of where we're at here? Do we want to proceed with the original patch, or push through #1202336: Add a key/value store API and use that? I personally think the latter makes more sense, but I'm mostly interested in seeing a solution come to bear.

Log in or register to post comments

Comment #70

pounard

French

commented 27 August 2012 at 07:32

I still don't think that a generic key/value store API is the right way to deal with this issue. The issue itself still is too generic; IMO those "states" will be prone to be queried very often, even during normal runtime, and should be cached and retrieved the same way that D7 variable system works if we do not want to drain the overall site performances by doing an horribly huge number of SQL/KeyValue/Whatever remote server queries.

A generic Key/Value store could be use as a backend for this, if it is behind a cache proxy, which must be behind a specific "state API" in front for the end user. In all cases, we need more than one key as index for the state values, we need the original module which set it (so we can do cleanup) and we might need a static registration for those keys (to avoid system bloat and outdated keys to remain after updates); This seems higly non compatible with a simple key/value store to me (even if doable with some of the existing key/value stores, such as Redis, but using non-generic advanced features of those).

Log in or register to post comments

Comment #71

chx commented 5 September 2012 at 02:47

Right now my train of thought is that the moment this API materializes (it's not too far off, depends on how much I can sit with such a hurting back) I will move variable_initalize, _set and _delete into this API and as a temporary shiv have it populate the global $conf as now so variable_get stays as it is. Then as we migrate everything belonging to CMI into CMI and K-V variables into the K-V API and only the state things remain in variable_get like magic. We get pluggability for variable_get for practically free; pluggability happens via drupal_container()->register() in settings.php.

Log in or register to post comments

Comment #72

tim.plunkett

he/him

English

Philadelphia

commented 5 September 2012 at 03:37

What's the catch? ;)

Sounds like a really cool idea.

Log in or register to post comments

Comment #73

chx commented 5 September 2012 at 08:36

To clarify further: we keep variable_get. That's the state API. Once K-V lands, we can begin to sort out current variable_gets: are you K-V (user mail), are you state (menu rebuild), are you config (most) and keep state in variable_get.

Log in or register to post comments

Comment #74

catch

he/him

English

commented 5 September 2012 at 09:00

Let's not keep variable_get(). I think we need a full API change here to force contrib and custom module developers to choose what to use.

Log in or register to post comments

Comment #75

chx commented 5 September 2012 at 09:02

Status	File	Size
new	state-do-not-test.patch	6.78 KB
new	1175054_74.patch	20.43 KB

To clarify even further, here's the actual patch, it includes the K-V patch too, I included the state patch separately.

Log in or register to post comments

Comment #76

5 September 2012 at 09:03

Status:

Needs review

» Needs work

The last submitted patch, 1175054_74.patch, failed testing.

Log in or register to post comments

Comment #77

catch

he/him

English

commented 5 September 2012 at 09:20

To clarify:

- we can leave variable_*() as they are for now.
- add a new procedural convenience wrapper like state() (I'm not sure if the state API itself should use multiple collections, we might want to figure that out as we go).
- this patch could go in with one example conversion of something simple.
- open a meta-issue to convert every variable_*() call to either CMI or this API with sub-issues, and eventually remove variable_*() - this would be a critical task so we don't release without it.

Log in or register to post comments

Comment #78

chx commented 5 September 2012 at 10:22

Note that the K-V issue is now closed thanks to pounard and we just continue from #75.

Log in or register to post comments

Comment #79

catch

he/him

English

commented 5 September 2012 at 10:27

I don't mind handling both here, the state stuff looks like it's going to be tiny on top of the work already done.

Log in or register to post comments

Comment #80

chx commented 5 September 2012 at 10:29

Edit: I did apologize for my harsh tone in the K-V issue, so I just rather delete this comment.

Log in or register to post comments

Comment #81

chx commented 5 September 2012 at 11:16

Status:

Needs work

» Needs review

Status	File	Size
new	1175054_81.patch	21.12 KB

See, a helpful review from someone who knows PHP very well would've been "you do not need abstract methods in AbstractStorage -- PHP won't fatal if an abstract class does not implement all methods of an interface, that only happens in regular class -- but it will fatal up until version X.Y.Z if you try to inherit an abstract class" cos the latter doesn't happen on my PHP version. In lieu of that help, I needed to debug on scratchtestbot and alas, hosed it :( Hope it won't be too hard to rebuild the poor bot.

Anyways, here's the whole thing, complete with a menu_rebuild_needed search-replace.

Log in or register to post comments

Comment #82

chx commented 5 September 2012 at 11:19

That abstract problem is https://bugs.php.net/bug.php?id=43200 and was fixed in 5.4.0 -- I am running 5.4 , the bot runs 5.3 so there.

Log in or register to post comments

Comment #83

5 September 2012 at 11:38

Status:

Needs review

» Needs work

The last submitted patch, 1175054_81.patch, failed testing.

Log in or register to post comments

Comment #84

chx commented 5 September 2012 at 11:57

Status:

Needs work

» Needs review

Status	File	Size
new	1175054_84.patch	21.24 KB

Easy to fix.

Log in or register to post comments

Comment #85

gdd

he/him

English

Portland, OR

commented 5 September 2012 at 14:02

re #77: some time ago, cosmicdreams made this google doc with every variable in core

https://docs.google.com/spreadsheet/ccc?key=0ArZqEiDWc3yddGN6RVlzSER0NjF...

My plan has been to move stuff into the 'Converted' sheet as it gets converted but I haven't really kept it up. This should probably be moved into an issue like #1696224: Remove system_settings_form(), with a list and a link to an issue addressing that variable as they get created. sun set that up and it has worked really well for us. I can try and do that today.

EDIT: #1775842: [meta] Convert all variables to state and/or config systems

Log in or register to post comments

Comment #86

chx commented 5 September 2012 at 18:03

Status	File	Size
new	1175054_86.patch	21.29 KB

Added a state() wrapper, converted more stuff, filed #1775488: drupal_cron_cleanup is not converted to lock for cron_semaphore.

Log in or register to post comments

Comment #87

chx commented 5 September 2012 at 18:10

Status	File	Size
new	1175054_87.patch	25.19 KB

The "converted more" is actually here, forgot to run add -u before rolling #86. It's drupal_css_cache_files, drupal_js_cache_files and path_alias_whitelist.

Log in or register to post comments

Comment #88

5 September 2012 at 18:12

Status:

Needs review

» Needs work

The last submitted patch, 1175054_87.patch, failed testing.

Log in or register to post comments

Comment #89

chx commented 5 September 2012 at 18:32

Status:

Needs work

» Needs review

Status	File	Size
new	1175054_89.patch	25.19 KB

Options widgets 261 passes, 0 fails, and 0 exceptions

Log in or register to post comments

Comment #90

5 September 2012 at 18:56

Status:	Needs review	» Needs work
Issue tags:	-Performance, -Configuration system

The last submitted patch, 1175054_89.patch, failed testing.

Log in or register to post comments

Comment #91

chx commented 5 September 2012 at 19:01

Status:

Needs work

» Needs review

#89: 1175054_89.patch queued for re-testing.

Log in or register to post comments

Comment #92

5 September 2012 at 19:23

Status:	Needs review	» Needs work
Issue tags:		+Performance, +Configuration system

The last submitted patch, 1175054_89.patch, failed testing.

Log in or register to post comments

Comment #93

chx commented 5 September 2012 at 23:09

Status:

Needs work

» Needs review

Status	File	Size
new	1175054_93.patch	26.68 KB

Log in or register to post comments

Comment #94

chx commented 5 September 2012 at 23:19

Status	File	Size
new	1175054_94.patch	26.69 KB

Sigh. The change is variable_get => state()->get , variable_set() => state()->set but variable_del() => state()->delete. This trip me every single time.

Log in or register to post comments

Comment #95

gdd

he/him

English

Portland, OR

commented 5 September 2012 at 23:58

Status:

Needs review

» Needs work

+++ b/core/includes/path.incundefined
@@ -72,7 +72,7 @@ function drupal_lookup_path($action, $path = '', $langcode = NULL) {
+    $cache['whitelist'] = state()->get('path_alias_whitelist');

Is this state? It strikes me as config.

Utimately I don't have a ton of comment on this. The architecture seems sensible, and as long as catch and msonnabaum are in favor then that works for me.

Log in or register to post comments

Comment #96

gdd

he/him

English

Portland, OR

commented 5 September 2012 at 23:59

Status:

Needs work

» Needs review

Log in or register to post comments

Comment #97

chx commented 6 September 2012 at 00:03

The path_alias_whitelist is rebuilt, not defined. Exactly what state is: built things that can be rebuilt just so, so expensive.

Log in or register to post comments

Comment #98

chx commented 6 September 2012 at 03:48

Status	File	Size
new	1175054_98.patch	26.52 KB

Log in or register to post comments

Comment #99

msonnabaum commented 6 September 2012 at 04:06

Status:

Needs review

» Reviewed & tested by the community

This is exactly how I imagined k/v being used; I totally approve of the approach. My only issue is that KeyValue should be in Components, not Core, but it's a grey area because of dbtng, so we can fight that in another issue.

chx addressed my only other small concerns in #98, which looks good to me. Assuming testbot comes back ok, RTBC from me.

Log in or register to post comments

Comment #100

webchick

she/they

English

Vancouver 🇨🇦

commented 6 September 2012 at 04:35

Assigned:

Unassigned

» catch

I think it's best to get catch's sign-off on this.

Log in or register to post comments

Comment #101

chx commented 6 September 2012 at 04:36

Edit: nevermind, crosspost.

Log in or register to post comments

Comment #102

pounard

French

commented 6 September 2012 at 06:32

I think the K/V store should be renamed as state store. I do not agree with making this K/V an all-purpose K/V store because its features are too light.

EDIT @msonnabaum This is not an issue about a generic K/V store so I think it does not fit into components. Plus, as I said upper, it should change of name.

Putting a such abstracted K/V store API for only three methods (get, set and getAll) seems to much abstraction for me. Passing it in a patch under the name "state API" is just cheating and ignoring comments such as this https://drupal.org/node/1202336#comment-6437424

Log in or register to post comments

Comment #103

chx commented 6 September 2012 at 06:54

To rehash the arguments from the other issue: when someone needs a better K/V store API they can write one. So far, no such thing ever materialized in contrib. Since we have namespacing, someone creating another, more rich K/V API shouldn't be a problem.

Log in or register to post comments

Comment #104

pounard

French

commented 6 September 2012 at 07:51

If people widely start to use this kind of storage for any kind of data, Drupal will become a huge data blobs trashcan. By diversifying that much the different storage engines, the whole will become unmaintainable and ofuscated. Data consistency is important, and a K/V store should be used only for stuff such as session or cache, but not business data. "States" are mostly business data, it's already making Drupal a huge blob trashcan. It's not diversification anymore, it's a huge mess we are doing by encouraging such practices, and I totally disagree, for many reasons, mostly because it's impossible to properly query and analyse data in blobs, and also because various Drupal parts start to be riped-off in various storage engines and it's impossible, in that context, to do serious data analisys and maintainance.

EDIT: Having a proper K/V store engine could good for performance reasons, and for additional and non essential systems, but once again, for this to be usefull, we need to take into account stuff like asynchronous queries, pipelining, and other nice stuff that most engines will bring to us. Without this, a K/V store is useless and will bring confusion, mess, and data problem.

Log in or register to post comments

Comment #105

webchick

she/they

English

Vancouver 🇨🇦

commented 6 September 2012 at 07:52

I don't know anything about this patch, but as far as I can see, the concept of this K/V store is not any different than the concept of the variables table we already have now. Meaning that people already have the opportunity to bloat it with a bunch of stuff that doesn't belong there (and there are several contrib modules who have, and have gotten bug reports and patches filed against them). I honestly don't understand the pushback to this patch, given it just switches the location and some of the mechanics of this thing that already exists. It's not adding any new horror that wasn't already there in D7, from what I can tell.

Log in or register to post comments

Comment #106

pounard

French

commented 6 September 2012 at 07:56

It's formalizing the horror. While the variable table was call "variables" people knew they were doing wrong things. By naming it "KeyValueStoreInterface" (very generic name) and pushing it as "public open and usable generic part of the API" we are saying to people "Yes, now you have the right to do it wrong, please do it wrong!".

Log in or register to post comments

Comment #107

pounard

French

commented 6 September 2012 at 09:24

Anyway, I'm sorry for this issue torpedo. I really think that a useful K/V API should be more consistent to be generally useful. I'm not against having it, but I think the state issue is not the right place for designing this. The state API sounds good, really, I'd like to see it going forward as much as anyone, chx, sun, etc...; but as a core specialized business API until we figure out all needs that will probably come later (such as the ability to proceed to per module cleaning, to get states per conditions, etc...) which could come later. Now I'm just leaving it, if you want to push the K/V in here just do it then.

Log in or register to post comments

Comment #108

chx commented 6 September 2012 at 13:04

To the contrary. While right now the only way to store any sort of non-entity data if you are not creating your own table is variable_get/variable_set we are going to advocate using CMI instead whenever possible. We do not advocate the K-V store any way or form. Core will use it a few times, yes but only for state and I am sure we will write a lot more documentation on the state() function to make it clear it's for derived data. In fact, your former argument against this API (it does not contain enough features) contradicts your latest problem (Drupal will become a blob trashcan) -- if you can't do anything with the data why store it this way? And, if you don't want to do anything else with the data but store and retrieve then why not store this way? Some data fits this API. Most does not.

Log in or register to post comments

Comment #109

pounard

French

commented 6 September 2012 at 13:35

Making it generic is advocating the API. If core is not going to use it for anything than state, then call the namespace state API. My former argument was all about giving real features if we commit in giving this to users. It doesn't mean I'm pro giving one, it just means that if we do, we need to do it properly. Right now, if this patch passes, we give something that we would not advocate, but that we make extremely generic and as a core component to ensure that everyone sees it as a generic all purpose tool.

With the state API being as-is I fear that a lot of people may end up using it widely, which will make it harder to cache efficiently like we were doing with variables.

For example, if we end up with menu rebuild stuff, update info, path alias whitelist, and drupal css/js cache files, we have two choices:
1. Load everything like variables (cached) because 3 out of the 4 examples will be consumed on every page request: then we would stupidely load the update stuff and bloat memory with it;
2. Load one by one each entry: then we would have (in this sample) 4 extra SQL queries instead of just only 1 per page.

A simple K/V store is not what we want for states IMHO, because if we use it as it exists in your patch, we are going to bloat a bit more the SQL server, and Drupal already is so intensive for the database, I wouldn't like to introduce this kind of code, which appears to me as huge performance regression.

If we want to solve that and use it using a cache proxy in front, this probably will work very fine with your implementation, but as soon as contrib will use it as a "state API" and not the variable system anymore, I'm afraid that abusive data bloating will happen, and we would end up loading extra large useless blobs of data into memory using getAll().

That's why I'm fighting again "state API", the concept is too vague, and we are merely replacing a wonky system (variables) -but for which people were aware of the dangers- by a too generic "state API" that most people will probably over-use because it's not the variable system anymore, but a K/V store, that may end up either bloating the memory, either bloating the database.

I trust people for using the K/V store or the state API as soon as it appears because it's fashionable, fun, modern and whow, OOP; At this very moment, I'm already worried about the later results we will see in core and contrib because of this.

Config system is already a big performance regression, and we're going to replace the only efficient remaining thing of the legacy API by a non encapsulated slower one.

Log in or register to post comments

Comment #110

sun

German

Karlsruhe

commented 6 September 2012 at 17:09

Status:	Reviewed & tested by the community	» Needs work
Issue tags:		+API addition

@pounard,
Your concerns are well understood and I don't think there's a need to repeat them any further. At this point, we need to make compromises in terms of "architectural API excellence" and perfectionism vs. getting things done.

By introducing a generic KeyValueStore component, we allow ourselves to advance on that in the future. And obviously, we also enable contributed and custom modules to introduce new/custom k/v storage bins for other use-cases than state information, without having to re-invent the entire wheel of pluggable storage engines.

In fact, we are no longer alone with this idea. There's a new Doctrine KeyValueStore component. Based on its code for storages, I can only guess that would be more in line with your thinking of a true and fully-fledged KeyValueStore component. However, the project is relatively young alpha material and also contains other ORM/data-binding stuff that's strictly speaking irrelevant for the main KeyValueStore functionality. I've already created an issue to explore a possible collaboration in the future.

The future might be that we'll continue to advance on our implementation and re-use it for other components, or it might be that we'll ditch our own in favor of collaborating with Doctrine. And to address your overall concerns, it might also be that we'll find out that doing this was a bad idea and we might not want to pursue it further. However, there is no black and white or wrong and right as long as you didn't try something new. In terms of the creative process, the proposed path forward here is not really different to many others that happened in Drupal's past already; Innovation requires to make brave steps at times.

On the patch:

- There are a full range of minor issues with regard to bogus phpDoc, comments, coding style, etc, but I'm happy to adjust those in a follow-up.

- The installer should use the MemoryStorage until the DatabaseStorage is available. We can adjust that in a follow-up though.

- Furthermore, the entire issue of introducing a State API that allows us to properly clean up any stored data from an extension when it is uninstalled (~#30 and following) seems to have been left out. I'm not really happy with porting the old + randomly named variables as-is into the state storage. I'd really like to discuss the actual/particular State API in some more detail, but I'm happy if we do that after an initial commit.

- Likewise, are we missing an upgrade path for the converted variables? Perhaps it would be a good idea to take those conversions out of the patch for now. They've been beneficial to test and confirm that it works, but ultimately it appears to be a good idea to discuss the actual State API some more and separately from the KeyValueStore.

- The only thing that really bugs me is the name of the database table. Why is the table name 'keyvalue' and not 'state'? (Also, the schema is added twice in system.install, probably caused by a merge.)
---- Closely related to that, drupal_container() registers the state service with a string argument of 'state', which 1) should be an array and 2) also refers to the table name 'state', not 'keyvalue'. The patch most likely only passes due to the string/array hiccup, which causes the DatabaseStorage to default to 'keyvalue' as table name (btw, can we remove that default value, plz?).

To clarify once more, all of the above points can probably be discussed in follow-ups; only the last would possibly cause some funky table rename havoc for core developers.

Log in or register to post comments

Comment #111

lars toomre commented 6 September 2012 at 17:20

I am confused about how this storage area is different from the TempStore area being worked on in #1642062: Add TempStore for persistent, limited-term storage of non-cache data as part of the VDC initiative. Both this and that effort seem to want to find a way of temporarily storing non-configuration (state) data.

Can someone please explain the difference? Or should the two efforts be combined? Thanks.

Log in or register to post comments

Comment #112

chx commented 6 September 2012 at 17:29

Status:

Needs work

» Needs review

The reason we pass a string is because it is the name of the collection, not the name of the database table to use. keyvalue is just the generic table name and we store the 'state' collection into it. I haven't really seen a need to have a separate table from the generic keyvalue table for this. Yes we could remove the default -- I simply kept it, it was already there.

I will look into using the MemoryStorage -- but I believe no problems occured because the installer didn't use variable_set before there was a DB. So I might just remove the state stuff from the installer container. I will reroll and look for various minor issues.

Edit: I will talk to the VDC people, yes, it seems there's a duplication. Very sad.

Log in or register to post comments

Comment #113

msonnabaum commented 6 September 2012 at 17:51

There is overlap with TempStore, but it needs more query-ability than you'd get from k/v. I think this is fixable if parts of TempStore are decoupled from each other, but in it's current state, it's not a good choice.

Log in or register to post comments

Comment #114

tim.plunkett

he/him

English

Philadelphia

commented 6 September 2012 at 18:27

Status:

Needs review

» Needs work

I'm working to strip down TempStore to its essentials, and it might lead to be rewritten on top of this.

Marking back to "needs work" for

The only thing that really bugs me is the name of the database table. Why is the table name 'keyvalue' and not 'state'? (Also, the schema is added twice in system.install, probably caused by a merge.)

Otherwise RTBC.

Log in or register to post comments

Comment #115

cosmicdreams commented 6 September 2012 at 19:19

really? Can't we create a follow up issue for the naming?

I would prefer we not postpone the functionality of this issue for coming to consensus on the name.

Log in or register to post comments

Comment #116

chx commented 6 September 2012 at 19:22

I would like to thank sun for the very positive tone of his review, I really appreciate it especially in the light of some clashes between the two of us in the recent past. Thanks.

Log in or register to post comments

Comment #117

chx commented 6 September 2012 at 19:26

As for the table name: msonnabaum presented his API with a 'collection' feature. Do we want a table per collection and if so would the caller need to set up or just one table for all collections and a collection column? I have absolutely no opinion in this matter but realize please that if you are naming it state then you need to drop the collection column and make the tablename the collectionname which puts restrictions on the collection name.

Log in or register to post comments

Comment #118

tim.plunkett

he/him

English

Philadelphia

commented 6 September 2012 at 19:30

Do we want a table per collection and if so would the caller need to set up or just one table for all collections and a collection column?

This made me go "Aha!" aloud. The table name should definitely stay as 'keyvalue', this just needs a reroll for the duplicated schema, and then needs a couple follow-up issues.

Log in or register to post comments

Comment #119

chx commented 6 September 2012 at 19:35

Status:

Needs work

» Needs review

Status	File	Size
new	1175054_119.patch	25.78 KB

Here it is without the duplicated schema.

Log in or register to post comments

Comment #120

tim.plunkett

he/him

English

Philadelphia

commented 6 September 2012 at 19:38

Status:

Needs review

» Reviewed & tested by the community

Assuming this comes back green, it's good to go.

Needs follow-ups for the stuff listed in #110.

Log in or register to post comments

Comment #121

sun

German

Karlsruhe

commented 6 September 2012 at 19:49

Yeah, I already had some brief discussions about the slightly undetermined question of what a "collection" is with @msonnabaum in the past. IIRC, his answer to that was along the lines of "For the SQL backend, ignore it." or similar, which frankly didn't really help me to understand what it is ;-)

Perhaps we want to move that question to a more specific State API follow-up issue, but generally speaking, my train of thought was that if we'd use a 'state' table for the state service, then we could potentially leverage the 'collection' field to record and track the owner of the stored data?

E.g., the stored data in {state} (SQL) would look like:

collection | key              | value
-------------------------------------------
system     | css_cache_files  | .....

And of course, we'd also need to make sure to pass the collection name accordingly from elsewhere; e.g.:

state()->set('system', 'css_cache_files', $map);
# or:
state('system')->set('css_cache_files', $map);

I think this would make sense. But I'm not sure whether the idea is "architecturally compatible."

Log in or register to post comments

Comment #122

tim.plunkett

he/him

English

Philadelphia

commented 6 September 2012 at 19:55

I don't think having an owner level really helps. What would be the benefits? Either way that could easily be a follow-up.

Log in or register to post comments

Comment #123

msonnabaum commented 6 September 2012 at 19:56

The optional table param is there so by default the api can be used without having to manage the create/drop, but it's there if you really want it.

Giving the SQL backend too much thought is missing the point. It's an implementation detail.

Log in or register to post comments

Comment #124

sun

German

Karlsruhe

commented 10 September 2012 at 03:58

While we're waiting...

I don't think having an owner level really helps. What would be the benefits?

The idea is to prevent us from making mistakes twice. The new State API introduces a dumping-ground for arbitrary data. Unlike the old Variable API, it is even the officially endorsed subsystem for dumping arbitrary data. Thus, it will be actively used for that.

By now, we know that asking contrib/custom developers to clean up after themselves simply does not work out. Yes, some modules have been patched in the past. But reality is that most of the modules you're installing and uninstalling leave a dozen of variables and state data artifacts behind (each). This is just painful, both for module authors, developers, and site operators, especially when you find out later in debugging performance.

For the module Schema API, as well as for the new Configuration system, we resolved that problem by making the system smarter, by design. If you uninstall a module, all of its schema tables are automatically dropped, and all of its configuration is automatically removed. Leaving no traces behind.

I'd like the new State API to follow that paradigm. Just be smart. Doesn't necessarily need to be related to $collection (although that appears to make sense), it could also be achieved through owner/namespace prefixes in stored keys (à la config object names).

But anyway, let's leave that to the dedicated follow-up issue on State API. The conversions of this patch can be easily adjusted.

Log in or register to post comments

Comment #125

olli commented 11 September 2012 at 04:44

+++ b/core/lib/Drupal/Core/KeyValueStore/DatabaseStorage.php
@@ -0,0 +1,89 @@
+  public function set($key, $value) {
+    db_merge($this->table)
+      ->key(array('name' => $key))
+      ->fields(array(
+        'collection' => $this->collection,
+        'value' => serialize($value),
+      ))
+      ->execute();
+  }

Should 'collection' be in key() instead?

Log in or register to post comments

Comment #126

chx commented 11 September 2012 at 22:48

Status	File	Size
new	1175054_126.patch	25.78 KB

Totally should. Nice catch. Thanks.

Log in or register to post comments

Comment #127

webchick

she/they

English

Vancouver 🇨🇦

commented 12 September 2012 at 04:35

Does that indicate missing test coverage?

Log in or register to post comments

Comment #128

chx commented 12 September 2012 at 10:34

Status:	Reviewed & tested by the community	» Needs work
Issue tags:		+Needs tests

Yes.

Log in or register to post comments

Comment #129

sun

German

Karlsruhe

commented 15 September 2012 at 11:48

Assigned:	catch	» Unassigned
Status:	Needs work	» Reviewed & tested by the community
Issue tags:	-Needs tests

Status	File	Size
new	interdiff.txt	17.88 KB
new	config.state_.129.patch	28.54 KB

We really need this ASAP, so I went ahead, added tests and generally improved the code:

Fixed missing tests for collection/name uniqueness.
Fixed phpDocs and coding style.
Relocated 'keyvalue' schema definition in system_schema() to retain alphabetical order.
Fixed KeyValueStore base test class cannot be re-used by contrib/custom implementations.
Fixed variable names and missing comments in StorageTestBase.

All of this is very minor, so moving back to RTBC.

Please credit @msonnabaum when committing this.

Log in or register to post comments

Comment #130

15 September 2012 at 12:11

Status:	Reviewed & tested by the community	» Needs work
Issue tags:	-Performance, -API addition, -Configuration system

The last submitted patch, config.state_.129.patch, failed testing.

Log in or register to post comments

Comment #131

yched commented 15 September 2012 at 12:14

Status:

Needs work

» Needs review

#129: config.state_.129.patch queued for re-testing.

Log in or register to post comments

Comment #132

15 September 2012 at 12:38

Status:

Needs review

» Needs work

The last submitted patch, config.state_.129.patch, failed testing.

Log in or register to post comments

Comment #133

sun

German

Karlsruhe

commented 15 September 2012 at 13:20

Status:	Needs work	» Needs review
Issue tags:		+Performance, +API addition, +Configuration system

#129: config.state_.129.patch queued for re-testing.

Log in or register to post comments

Comment #134

sun

German

Karlsruhe

commented 15 September 2012 at 15:09

Status:

Needs review

» Reviewed & tested by the community

Log in or register to post comments

Comment #135

pounard

French

commented 15 September 2012 at 15:38

If I read this patch well, it will add a minimum of 4 extra SQL queries per page, considering that the patch is not complete because it doesn't extend the state usage to every piece of API where it could apply.

Log in or register to post comments

Comment #136

Anonymous (not verified) commented 17 September 2012 at 00:21

this doesn't look right:

+  public function getMultiple(array $keys) {
+    $results = array();
+    // Foreach only needs to traverse $keys once, the PHP implementation
+    // array_intersect_key(array_flip(array_values($keys)), $this->data);
+    // would be slower.
+    foreach ($keys as $key) {
+      if (isset($this->data[$key])) {
+        $results[$key] = $this->data[$key];
+      }
+    }
+    return $results;
+  }

isn't keys just an array of $keys, with no values? so we can just do:

return array_intersect_key($this->data, array_flip($keys));

Log in or register to post comments

Comment #137

chx commented 17 September 2012 at 00:43

Status	File	Size
new	interdiff.txt	821 bytes
new	1175054_137.patch	28.27 KB

You are right, even if $keys is not just list, it'll still work.

Log in or register to post comments

Comment #138

Anonymous (not verified) commented 17 September 2012 at 00:50

Status:

Reviewed & tested by the community

» Needs work

+      $result = db_query('SELECT name, value FROM {' . db_escape_table($this->table) . '} WHERE name IN (:keys) AND collection = :collection', array(':keys' => $keys, ':collection' => $this->collection))->fetchAllAssoc('name');
+      $values = array();
+      foreach ($keys as $key) {
+        if (isset($result[$key]) && ($value = unserialize($result[$key]->value))) {
+          $values[$key] = $value;
+        }
+      }
+      return $values;

also doesn't look right.

i assume we're allowing FALSE and NULL as valid values? that second check in here if (isset($result[$key]) && ($value = unserialize($result[$key]->value))) { will not allow them to be returned.

Log in or register to post comments

Comment #139

Anonymous (not verified) commented 17 September 2012 at 00:54

re. #135, i think this patch is ok to land as is if we open up a follow up issue to add caching.

Log in or register to post comments

Comment #140

chx commented 17 September 2012 at 01:01

Status:

Needs work

» Reviewed & tested by the community

Status	File	Size
new	interdiff.txt	1.47 KB
new	1175054_138.patch	28.26 KB

Some NULL handling tuning although I full not expect to store scalars here.

Log in or register to post comments

Comment #141

Anonymous (not verified) commented 17 September 2012 at 01:02

thanks chx, i think this is RTBC now.

Log in or register to post comments

Comment #142

sun

German

Karlsruhe

commented 17 September 2012 at 10:08

Priority:

Major

» Critical

Yes, RTBC. I'm bumping this to critical, since time is seriously running out. We have to move forward, fast.

Log in or register to post comments

Comment #143

catch

he/him

English

commented 17 September 2012 at 10:18

Opened #1786490: Add caching to the state system.

Log in or register to post comments

Comment #144

webchick

she/they

English

Vancouver 🇨🇦

commented 18 September 2012 at 03:24

Status:

Reviewed & tested by the community

» Needs work

I can't tell from #143 why catch didn't just commit this, but in the meantime I looked at it and found a bunch of things. Oops. :\

+++ b/core/includes/bootstrap.inc
@@ -2505,11 +2505,23 @@ function drupal_container(Container $new_container = NULL, $rebuild = FALSE) {
 /**
+ * Returns the state storage service.
+ *
+ * @return Drupal\Core\KeyValueStore\KeyValueStoreInterface
+ */
+function state() {

Since state() and config() both take the place of variable_get()/set()/del() in various places, I think this needs a lot more explanation as to when module developers use state() vs. when they use config(). The docs for config() aren't great (http://api.drupal.org/api/drupal/core!includes!config.inc/function/config/8) but they're at least more than one line that uses the name of the thing to describe what it does. :)

+++ b/core/includes/common.inc
@@ -3176,7 +3176,7 @@ function drupal_pre_render_styles($elements) {
+  $map = state()->get('drupal_css_cache_files') ?: array();

@@ -4719,7 +4719,7 @@ function drupal_add_tabledrag($table_id, $action, $relationship, $group, $subgro
+  $map = state()->get('drupal_js_cache_files') ?: array();

Cool trick with ?: :) Didn't know about that.

One concern about this is there's no way to tell the difference between a typo and a legit FALSE/NULL value, unlike variable_get() in the past. I over-saw :P people talking on IRC about a Config::DOES_NOT_EXIST type of return value, and we might want to take a look at something like that in a follow-up for this.

+++ b/core/includes/menu.inc
@@ -451,7 +451,7 @@ function menu_get_item($path = NULL, $router_item = NULL) {
+    if (state()->get('menu_rebuild_needed') || !variable_get('menu_masks', array())) {

Looking at menu_masks (well, menu_get_ancestors()), I'm confused why this is not also a state variable. It seems to be calculated at runtime, as opposed to static configuration. Much like path_alias_whitelist which is converted to state.

+++ b/core/includes/update.inc
@@ -130,6 +130,36 @@ function update_prepare_d8_bootstrap() {
+      if (!db_table_exists('keyvalue')) {
...
+        db_create_table('keyvalue', $specs);

+++ b/core/modules/system/system.install
@@ -866,6 +866,34 @@ function system_schema() {
+  $schema['keyvalue'] = array(

In general, we donotsmooshwordstogether. Unless this is deliberately done to avoid looking like a bridge table, I'd recommend "key_value".

+++ b/core/includes/update.inc
@@ -130,6 +130,36 @@ function update_prepare_d8_bootstrap() {
+          'description' => 'Generic key-value storage table.',

+++ b/core/modules/system/system.install
@@ -866,6 +866,34 @@ function system_schema() {
+    'description' => 'Generic key-value storage table.',

Can we add any more info here? What it's used for, etc?

+++ b/core/includes/update.inc
@@ -130,6 +130,36 @@ function update_prepare_d8_bootstrap() {
+            'name' => array(
+              'description' => 'The key.',
...
+            'collection' => array(
+              'description' => 'The collection of the variable.',
...
+            'value' => array(
+              'description' => 'The value.',

+++ b/core/modules/system/system.install
@@ -866,6 +866,34 @@ function system_schema() {
+      'name' => array(
+        'description' => 'The key.',
...
+      'collection' => array(
+        'description' => 'The collection of the variable.',
...
+      'value' => array(
+        'description' => 'The value.',

"For example" what? I can read the column name myself just fine. :) The description should add a bit more info.

I'm particularly struggling with what "collection" refers to.

+++ b/core/lib/Drupal/Core/KeyValueStore/AbstractStorage.php
@@ -0,0 +1,55 @@
+  /**
+   * @var string
+   */
+  protected $collection;

PHPDoc.

+++ b/core/lib/Drupal/Core/KeyValueStore/DatabaseStorage.php
@@ -0,0 +1,89 @@
+          $values[$key] =  unserialize($result[$key]->value);

(nitpick) Extra space before "unserialize"

+++ b/core/lib/Drupal/Core/KeyValueStore/DatabaseStorage.php
@@ -0,0 +1,89 @@
+      // If the database is never going to be available, key/value requests should
+      // return FALSE in order to allow exception handling to occur.
+      return array();

Hm. The comment does not say what the code appears to do...

+++ b/core/lib/Drupal/Core/KeyValueStore/KeyValueStoreInterface.php
@@ -0,0 +1,99 @@
+   * Constructs a new key/value collection.
+   *
+   * @param string $collection
+   *   The collection for which the object is created.

Little help here? What is a collection? I would've guessed it was an array of key/value pairs from looking at the name, but I see it wants a $string instead so I am confused.

+++ b/core/lib/Drupal/Core/KeyValueStore/KeyValueStoreInterface.php
@@ -0,0 +1,99 @@
+   * @param array $options
+   *   An associative array of options for the key/value storage collection.

What are the possible options? If it varies based on implementation (which seems to be the case), we should document them in those places (e.g. 'table' in DatabaseStorage)

+++ b/core/lib/Drupal/Core/KeyValueStore/KeyValueStoreInterface.php
@@ -0,0 +1,99 @@
+   * @todo What's returned for non-existing keys?
+   */
+  public function getMultiple(array $keys);

What indeed. :D

+++ b/core/lib/Drupal/Core/KeyValueStore/MemoryStorage.php
@@ -0,0 +1,87 @@
+  /**
+   * @var array
+   */
+  protected $data = array();

PHPDoc.

+++ b/core/lib/Drupal/Core/KeyValueStore/MemoryStorage.php
@@ -0,0 +1,87 @@
+    $this->collection = $collection;
...
+    return $this->collection;

Is there a need to sanitize this at all? Back in DatabaseStorage you used various db_escape_table()/PDO escape things.

+++ b/core/modules/system/lib/Drupal/system/Tests/KeyValueStore/DatabaseStorageTest.php
@@ -0,0 +1,37 @@
+class DatabaseStorageTest extends StorageTestBase {

+++ b/core/modules/system/lib/Drupal/system/Tests/KeyValueStore/MemoryStorageTest.php
@@ -0,0 +1,25 @@
+class MemoryStorageTest extends StorageTestBase {

There are no assertions in either of these? Shouldn't we at least assertTrue() something to indicate they're working?

+++ b/core/modules/system/lib/Drupal/system/Tests/KeyValueStore/StorageTestBase.php
@@ -0,0 +1,104 @@
+  public function testCRUD() {

Great work. This looks really comprehensive!

9 days to next Drupal core point release.

Log in or register to post comments

Comment #145

chx commented 18 September 2012 at 04:21

Status:

Needs work

» Needs review

Status	File	Size
new	interdiff.txt	8.86 KB
new	1175054_145.patch	30.28 KB

> Is there a need to sanitize this at all?

No.

Doxygen'd the hell out of the rest. Thanks goes to webchick and msonnabaum for help with that.

Log in or register to post comments

Comment #146

webchick

she/they

English

Vancouver 🇨🇦

commented 18 September 2012 at 04:25

Status:

Needs review

» Reviewed & tested by the community

AWESOME! Thanks so much, this addresses my concerns.

Found one small thing in the interdiff:

+++ b/core/lib/Drupal/Core/KeyValueStore/DatabaseStorage.php
@@ -17,31 +17,38 @@
+      // @todo: Perhaps ff the database is never going to be available,

(nitpick) Perhaps "if" the database...

Can be cleaned up on commit.

I'm going to leave this for 24 hours, since I would prefer catch to commit it, since he already looked at it a bunch. But if he's not available I'm happy to do so as well since I've now been eyeballs deep in it the last hour. :)

Log in or register to post comments

Comment #147

arlinsandbulte commented 18 September 2012 at 04:39

Status	File	Size
new	1175054_147.patch	30.28 KB

Just corrected webchick's nitpick in #146.

Log in or register to post comments

Comment #148

chx commented 18 September 2012 at 04:55

Status	File	Size
new	interdiff.txt	1.7 KB
new	1175054_148.patch	30.28 KB

Grr!

Log in or register to post comments

Comment #149

catch

he/him

English

commented 18 September 2012 at 14:27

I'm not sure about the path_alias_whitelist being converted here. That's a cache, and it's only so expensive (really, really, 30s+ expensive) to build because we haven't fixed #1209226: Avoid slow query for path alias whitelists yet. Can we take that out and tackle it separately? Putting it in state now would be bad example.

Everything else that's converted here is classic state use cases though.

Log in or register to post comments

Comment #150

effulgentsia commented 18 September 2012 at 22:13

Status:

Reviewed & tested by the community

» Needs review

Everything else that's converted here is classic state use cases though.

What about menu_masks and drupal_(js|css)_cache_files? Aren't those also just caches that are expensive to rebuild?

In #97, in reference to path_alias_whitelist, chx says:

Exactly what state is: built things that can be rebuilt just so, so expensive.

I don't think I agree with this. I think we should limit state() to only what cannot be rebuilt at all (e.g., timestamp of last cron run, whether a menu rebuild is needed, $form_state of multistep forms), and allow site developers / operations people to make decisions about which cache backends to use for expensive cache bins.

"needs review" cause I'm not clear whether we should only rip out path_alias_whitelist per #149, or also rip out menu_masks and drupal_(js|css)_cache_files.

Log in or register to post comments

Comment #151

Anonymous (not verified) commented 19 September 2012 at 00:04

Status	File	Size
new	1175054-151-state.patch	24.36 KB

in the interests of moving this along, here's a reroll that removes menu_masks, path_alias_whitelist, and drupal_(js|css)_cache_files.

given that we've already got a follow up issue listing other stuff that needs to be converted to state(), i think this should be ok. we can work out all the rest in follow ups.

Log in or register to post comments

Comment #152

effulgentsia commented 19 September 2012 at 00:41

Status:

Needs review

» Reviewed & tested by the community

Great. Back to RTBC then.

Minor and can be addressed in a follow-up:

+++ b/core/includes/bootstrap.inc
@@ -2504,11 +2504,29 @@ function drupal_container(Container $new_container = NULL, $rebuild = FALSE) {
+ * Use this to store machine-generated data, local to a specific environment
+ * that does not need deploying and does not need human editing; for example,
+ * the last time cron was run. Data which needs to be edited by humans and

last cron run isn't a use case converted to state() in this patch.

Log in or register to post comments

Comment #153

chx commented 19 September 2012 at 06:40

Yes, I rushed a little with the conversions. The {system} -> CMI issue will show the true power of the K-V system.

Log in or register to post comments

Comment #154

catch

he/him

English

commented 20 September 2012 at 09:22

Title:	Add a storage (API) for persistent non-configuration state	» Change notice for: Add a storage (API) for persistent non-configuration state
Status:	Reviewed & tested by the community	» Active

OK we can handle that comment being out of date either in a quick follow-up here or just make the conversion patch major so it gets done.

Committed/pushed to 8.x, thanks all!

Should add a change notice for the new API. Somewhere we're going to also need a page explaining the difference between state() and config() and what should go where.

Log in or register to post comments

Comment #155

pounard

French

commented 20 September 2012 at 09:37

I'm definitely sad to see this change. I consider this both as a potential performance and design regression.

Log in or register to post comments

Comment #156

gdd

he/him

English

Portland, OR

commented 20 September 2012 at 16:32

I started a doc page a couple days ago about the state api at http://drupal.org/node/1787278. It has a couple lines about what should be state vs config. There is also a configuration api page at http://drupal.org/node/1667894 which should probably contain the same information.

Log in or register to post comments

Comment #157

podarok

🇺🇦 he/him/his

Ukrainian

Ukraine

commented 20 September 2012 at 16:40

Status:

Active

» Needs review

#156
Created initial change notice from Your docs with links to
http://drupal.org/node/1790518

Log in or register to post comments

Comment #158

andypost

he/him

Russian

commented 20 September 2012 at 16:47

Status:

Needs review

» Needs work

I think better to explain a real life examples

+++ b/core/includes/menu.incundefined
@@ -451,7 +451,7 @@ function menu_get_item($path = NULL, $router_item = NULL) {
-    if (variable_get('menu_rebuild_needed', FALSE) || !variable_get('menu_masks', array())) {
+    if (state()->get('menu_rebuild_needed') || !variable_get('menu_masks', array())) {

state()->get

+++ b/core/includes/menu.incundefined
@@ -2666,7 +2666,7 @@ function menu_router_rebuild() {
-    variable_del('menu_rebuild_needed');
+    state()->delete('menu_rebuild_needed');

state()->delete

+++ b/core/modules/search/search.admin.incundefined
@@ -175,7 +175,7 @@ function search_admin_settings_submit($form, &$form_state) {
-    variable_set('menu_rebuild_needed', TRUE);
+    state()->set('menu_rebuild_needed', TRUE);

state()->set()

Log in or register to post comments

Comment #159

Anonymous (not verified) commented 20 September 2012 at 23:39

follow up patch to improve the handling of booleans: #1790882: Allow to store value FALSE in the key value store.

Log in or register to post comments

Comment #160

alexpott

he/they

English

🇪🇺🌍

commented 21 September 2012 at 01:12

follow up patch to convert cron_last... #1790920: Move cron_last, node_cron_last and common_test_cron to state system

Log in or register to post comments

Comment #161

webchick

she/they

English

Vancouver 🇨🇦

commented 23 September 2012 at 21:32

We are now over thresholds on critical tasks, so this and other issues with outstanding change notices are blocking features in D8. Please fix.

Log in or register to post comments

Comment #162

podarok

🇺🇦 he/him/his

Ukrainian

Ukraine

commented 24 September 2012 at 17:49

Status:

Needs work

» Needs review

#158 fixed

Log in or register to post comments

Comment #163

chx commented 24 September 2012 at 18:35

Status:

Needs review

» Fixed

Looks good to me.

Log in or register to post comments

Comment #164

Tor Arne Thune commented 24 September 2012 at 18:42

Title:

Change notice for: Add a storage (API) for persistent non-configuration state

» Add a storage (API) for persistent non-configuration state

Log in or register to post comments

Comment #165

fubhy commented 2 October 2012 at 09:18

I just opened #1800838: Possibly tinker the CacheBackendInterface around the KeyValueStoreInterface

Log in or register to post comments

Comment #165.0

fubhy commented 2 October 2012 at 09:18

Issue summary:

View changes

Updated issue summary.

Log in or register to post comments

Comment #166

fubhy commented 2 October 2012 at 15:24

Priority:	Critical	» Normal
Status:	Fixed	» Active

I am not sure if we want to keep using this issue for minor improvements to the current implementation that got commited. If not, feel free to close it again.

+++ b/core/lib/Drupal/Core/KeyValueStore/KeyValueStoreInterface.phpundefined
@@ -0,0 +1,99 @@
+   * @return array
+   *   An associative array of items successfully returned, indexed by key.
+   *
+   * @todo What's returned for non-existing keys?
+   */
+  public function getMultiple(array $keys);
+

I would probably expect this to behave similar to the current implementation for cache()->getMultiple(). In that, we pass the $keys array ($cids) by reference and remove the elements from that array in case there is a value for that. Hence, nonexistent values would be those that are left in the $keys array.

Also... Is there any use-case for entirely flushing a 'collection'? If so, we don't support that yet.

Log in or register to post comments

Comment #167

pounard

French

commented 3 October 2012 at 07:15

Actually, there is a need of flushing a collection: when you uninstall modules they need to be able to do that.

Log in or register to post comments

Comment #168

berdir

German

Switzerland

commented 3 October 2012 at 07:20

Status:

Active

» Fixed

#1547008: Replace Update's cache system with the (expirable) key value store needs a flush()/deleteAll() and implements it. As suggested by chx in #1642062: Add TempStore for persistent, limited-term storage of non-cache data, that is a new feature and those can be added in new issues, we shouldn't keep this one open IMHO.

Log in or register to post comments

Comment #169

17 October 2012 at 07:21

Automatically closed -- issue fixed for 2 weeks with no activity.

Log in or register to post comments

Comment #169.0

(not verified) commented 17 October 2012 at 07:21

Issue summary:

View changes

Updated issue summary.

Log in or register to post comments

Add a storage (API) for persistent non-configuration state

Follow-ups

Comments

Most important use cases

menu rebuild needed and cron semaphore

cron_last like variables

Update module storage

Cache variable_set() calls

Path alias whitelist

CSS and JS aggregation system

The installer and updater

Another use case: clearing desynchronized cache on multiple frontends

Going back to the original issue