There are a few things in Drupal that resemble a key/value store, I've been thinking it'd be nice to add a centralized API to unify some of this.

The overall plan is to introduce a concept of pluggable storage that is separate from the specific subsystem that is being swapped out. In practice it is likely that both will need to be, but it might still allow code to be shared for either different subsystem backends or different storage backends or both of these. And we'll get consistency even if neither of those happen.

Things that might be able to build on a shared API:

cache API
lock API
Sessions?
Form state storage (currently mis-uses cache system)
Update module cache (currently re-implements cache system)
Possibly even entities could build off this for basic CRUD.
#1175054: Add a storage (API) for persistent non-configuration state
Configuration store?

On top of this there are projects like http://drupal.org/project/memcache - this actually provides integration with two different libraries (the memcached and memcache PECL extensions have different feature sets and APIs), and currently does so via it's own dmemcache.inc. Potentially we could get to the point where the cache.inc/lock.inc included with the memcache project are generic and you could swap in a non-memcache library (APC? something else?) for just the storage - or swap the store and extend a couple of methods at least.

Haven't worked up any code yet, this is just notes for now and there are definitely things to work through before it can be viable.

Things that would be needed, not necessarily all in the base interface(s)/classe(s) though:

Concept of tables/bins/collections - pass this to the constructor like cache.inc does now.
Ability to create tables/bins/collections #1167144: Make cache backends responsible for their own storage
Ability to require tables/bins/collections to be disk-backed.
We already have per-field storage and per-cache-bin storage so that could do with being central or at least possible to implement.

Zend and possibly some other frameworks have been separating their read and write interfaces into separate classes (KeyValueReader + KeyValueWriter) which I assume is for code weight reduction on read requests.

At least these methods make sense for most stuff.

>get()
->getMultiple()
->set($key, $value, ($expires?))
->setMultiple()
->delete()
->deleteMultiple()
->garbageCollection() (needed for SQL and MongoDB at least if we add expires)

Maybe ->add() (insert as opposed to merge)?

Files: 
CommentFileSizeAuthor
#80 1202336_81.patch14.22 KBchx
FAILED: [[SimpleTest]]: [MySQL] 41,115 pass(es), 1 fail(s), and 0 exception(s).
[ View ]
#79 1202336_80.patch14.12 KBchx
FAILED: [[SimpleTest]]: [MySQL] 41,114 pass(es), 1 fail(s), and 0 exception(s).
[ View ]
#78 1202336_78.patch13.63 KBchx
FAILED: [[SimpleTest]]: [MySQL] 41,118 pass(es), 1 fail(s), and 0 exception(s).
[ View ]
#76 1202336_76.patch13.64 KBchx
FAILED: [[SimpleTest]]: [MySQL] 41,117 pass(es), 1 fail(s), and 0 exception(s).
[ View ]
#61 platform.keyvalue.60.patch17.56 KBsun
PASSED: [[SimpleTest]]: [MySQL] 39,800 pass(es).
[ View ]
#58 1202336-key-value-store-58.patch13.06 KBmsonnabaum
PASSED: [[SimpleTest]]: [MySQL] 37,260 pass(es).
[ View ]
#40 1202336-key-value-store-40.patch13.08 KBmsonnabaum
PASSED: [[SimpleTest]]: [MySQL] 36,706 pass(es).
[ View ]
#38 1202336-key-value-store-38.patch12.5 KBmsonnabaum
PASSED: [[SimpleTest]]: [MySQL] 36,700 pass(es).
[ View ]
#33 1202336-key-value-store-33.patch11.2 KBmsonnabaum
PASSED: [[SimpleTest]]: [MySQL] 36,694 pass(es).
[ View ]
#22 1202336-key-value-store-22.patch28.69 KBmsonnabaum
PASSED: [[SimpleTest]]: [MySQL] 36,683 pass(es).
[ View ]
#11 key-value-store.patch5.13 KBmsonnabaum
PASSED: [[SimpleTest]]: [MySQL] 34,586 pass(es).
[ View ]

Comments

sub

Good plan, we have to start to think about:
1) Maximization of backend capabilities => What can we do?
2) Trying to stick in there some advanced functions (Redis has a really simple yet powerful transaction management system, that allowed me to create really fast safe mutexes using it, this is the kind of features we definitely need).
3) The interface of such API may end up with a really loaded set of atomic methods, yet necessary, such as transaction handling, async handing, etc...

I'd start right now, what we definitely now at the interface level is:
* Basic CRUD, get/set/setExpires/delete with all the *multiple equivalents
* Transaction start/stop/commit/rollback
* Async start/stop (specialization of transaction)
* Pipelining start/stop (specialization of transaction)
* Generic interface methods for sending commands while in a transaction.

If we chain, we basically send a lot of command, some backends could then do implicit pipelining in that case. Async and pipelining imply we need a ResultSet interface that will return results from chained methods.

We also have to think about a generic error handling. For example, errors on chaining stuff should be exception based, because without exceptions we cannot really chain (testing result at each step).

There is another problem behind transaction support, if we chain stuff, some errors are meant to remain silent (cannot chain, or get a key returns no result) and this result would be fetched later using the ResultSet; Some other errors are meant to be silent because the backend doesn't support it but the result remains guaranted (a backend that cannot do Async will do it in a sync. fashion, result is slower but is the same in the end). Some other errors are blocking, for example, trying to set an expire time on a backend that doesn't support it (may be not the right example, but it could be).

This means that the interface must be able to be fine tuned enought so we can have a really predictible behavior, yet implementable for all backends we target, and non restrictive in term of features. That also mean we need to create default implementations for some emulated behaviors (Async for example).

A lot of details of that kind. Looking to Predis library gives a complete and really nice to read (and use) example of such API, except it's Redis only, but it's abstract enough for some concepts to be re-used generically.

Other stuff we need to determine in standard to use in the overall API. For example, Async behavior can be written in multiple ways:

<?php
$store
= new KeyValueStore();
try {
 
$store->startAsyncTransaction()
        ->
set($k1, $v1)
        ->
setMultiple($setOfValues)
        ->
setExpires($k1, $timeout1)
        ->
setExpires($k2, $timeout2, $value2)
        ->
del($key3)
        ->
commit();
} catch (
StorageException $e) {
 
// Do something.
}
try {
 
$store->startPipelinedTransaction()
        ->
set($k1, $v1)
        ->
setMultiple($setOfValues)
        ->
setExpires($k1, $timeout1)
        ->
setExpires($k2, $timeout2, $value2)
        ->
del($key3)
        ->
commit();
} catch (
StorageException $e) {
 
// Do something.
}
$result = $store->getLastResultSet();
// Do something with my result.
?>

This one need the backend to implement everything internally in every tiny method we pass to it.

Or

<?php
$store
= new KeyValueStore();
try {
 
$store->doAsyncTransaction(function($store) {
   
$store->set($k1, $v1)
          ->
setMultiple($setOfValues)
          ->
setExpires($k1, $timeout1)
          ->
setExpires($k2, $timeout2, $value2)
          ->
del($key3);
  });
} catch (
StorageException $e) {
 
// Do some error handling.
}
try {
 
$store->doPipelinedTransaction(function($store) {
   
$store->set($k1, $v1)
          ->
setMultiple($setOfValues)
          ->
setExpires($k1, $timeout1)
          ->
setExpires($k2, $timeout2, $value2)
          ->
del($key3);
  }, function(
ResultSetInterface $result) {
   
// Do something with my ResultSet.
 
});
} catch (
StorageException $e) {
 
// Do some error handling.
}
?>

This one is really nice, because it allows the internal store to change the $store object and give a specialized one to the callback: it can then provide a specific implementation for async and a specific implementation for normal behavior.

----

Once we decided how to handle our very own transactions, we need to stick with it for all storage types, and all interfaces where we'd use them, ensure consistency all over the API.

Each one of these methods has its own lot of pros and cons, because each one of them implies different level of technical complexity for the backend developer. It also happens that some of them might allow to share code (and it's the path to definitely go into) where some other cannot.

Predis also solve the pipelining/chaining problem by creating a queue of commands to run and storing them into a temporary array. It then flushes the queue (sending all the commands at the same time) once the execute() method is reached or pragmatically called.

This seems really efficient, and in order to do this, the get() set() and other methods are only wrappers that creates internally ICommand instance(s).

Those instances are used internally by a common method for sending them (each command has its own result parsing method).

If we follow this architecture, this mean that every specific method we have can be shared, but the real ICommand implementation behind would be the only thing that varies depending on the backend. Then the backend just would have to implement the specific ICommand instances and that's pretty much it.

The async/pipeline code would be only the shared code that queue those commands, and flush them once the callback returns: the magic would then happen at flush time (the only backend specific stuff with commands to implement).

Doing that, the pipeline executor is a simple command queue, that references the real backend and just send a series of explicit call() or sendCommand() method calls.

This model seems higly flexible and allow to maximize code sharing, at least for backends that doesn't support either async or pipelining.

Starting some work on a sandbox, as a D7 module (yet requireing PHP 5.3) as a proof of concept. See http://drupal.org/sandbox/pounard/1360694

Issue tags:+Configuration system

We're going to work on this as part of - or rather follow-up to #1270608: File-based configuration API

So one thing I have been thinking about, and which I talked to sun about last weekend, is that the config system's classes should really be reworked so that they start with a generic key/value store interface and build the config system on top of that. Right now this is sort of happening but not quite. For instance, the SignedFileStorage class combines the file access and signing, which would ideally be decoupled. Also the file storage and sql storage do not descend from a common class, which it seems like they should. This will, I think, bring the config system a lot more flexibility, as well as provide a common base for others to work on. Much of the simple API that catch outlined already exists, the classes just need to be reworked a bit.

Thoughts?

So I think we could start with a minimal common interface with the methods in the original post.

The main problem with that though is that ->set() is very likely to differ for things like caching or locking - since those often require a ttl or similar, and you definitely don't want a ttl for config. Also we'll hopefully add tags to the cache API soon, which means another parameter to the set method that nothing else would use.

Another thing to bear in mind is the PHP standards group object cache proposal - this is proposing a common interface for cache APIs across projects to share (not implementations at all, just the interface for now). That also has the same problem with set() in that it can lock down the feature set (whereas most other methods it doesn't matter so much). If you look for comments from me and Crell in that thread you'll find that bit of the discussion quick enough. If that proposal ends up getting accepted and is compatible with the features we need/want to support in the cache API (my main issues with it as it stands are set() and exists(), the rest is barely different from now), then it would make sense to adopt it, it's not clear how it'll go yet.

In terms of a common Drupal key/value API, it feels to me like this would be an interface for drivers (i.e memcache, redis, database etc.). Then, it's OK for it to support features like ttl that something like config won't ever use. The public configuration API then would not need to inherit from this (cache or lock might do though), so no-one needs to know about extra features that aren't used even if they're there.

So I think it'd be a good step to have the config storage classes share a common interface with each other, then in this issue we can try to figure out how to reconcile that with some of the other key/value stuff in core to see how much could be shared overall. The ideal situation would be that it's possible to interchangeably use the same key/value back ends for different core subsystems - so the Memcache and Redis projects don't need to know about lock/cache/config at all. If that's not doable, then it feels like it ought to be easy to get to a point where at least the RedisCache and RedisConfig classes both extend a RedisKeyValue and add a couple of methods or similar.

I think what heyrocker is saying in #7 makes sense. To clarify, you're suggesting something along the lines of:

<?php
interface KVStore {
  function
put();
  function
get();
}
interface
KVItem {
  function
key();
  function
value();
}
interface
ConfigurationObject extends KVItem {}
interface
ConfigurationStore extends KVStore {}
?>

Am I following?

I definitely agree that SignedFile should be a separate object that wraps FileStorage via composition/the constructor. In fact... possibly stupid idea (and probably the wrong thread), should that be implemented as a stream wrapper and stream filter?

@catch Lock is so specific that I guess we would never be able to use a classical and/or generic key/value store for it: we would drown into complexity doing that.

When I first thrown ideas into this issue, I didn't thought about the fact that key/value stores are really different themselves so it's really unseasy to do a nice common API, it would be either too generic (no added value at all, I mean what's the point of doing this interface only if we have get() and set()) either we would exclude a lot of existing backend if we make it too complex.

In that regard, our actual interfaces are not that bad actually, because they are simple and do fit with the business they are meant to.

We could indeed reconcile all the interfaces we need to ensure that drivers would be easy to implement, but we cannot tie together stuff such as cache and lock: they both have really different API and constraints, and one backend interface for all would probably force developers to make a choice: either optimize for lock, either optimize for key/value.

So IMHO, we can actually propose a simple key/value interface, but simple as in really simple: we can start from cache need and make a superset of it (adding set() without expire, for example, and add some other stuff such as AYSNC and PIPELINE generic support) but I'm afraid we cannot really go further.

Status:Active» Needs review
StatusFileSize
new5.13 KB
PASSED: [[SimpleTest]]: [MySQL] 34,586 pass(es).
[ View ]

Here's a very basic patch to get things started here.

I think a basic k/v api in core is still very useful even if it doesn't support the more advanced features. Simply having that option instead of MySQL allows modules to use it, and then we can offload that to something like redis. Just having variables in a proper key/value store so we dont have to load them all into memory on each request is a win IMO.

I think that interfaces should always be at a higer business level: a key-value store interface is too specialized to be a low level interface, but not specialized enough to be performant: I think this is a leaky abstraction, that is why I actually did not completed my database implementation.

Another option might be to start with the Symfony ParameterBag component

http://api.symfony.com/2.0/Symfony/Component/HttpFoundation/ParameterBag...

Have not looked at this in detail, but if we're getting the Symfony stuff into core we may as well use it.

They don't have getMultiple / setMultiple or deleteMultiple, although we can extend that class and just add those...

And we save our stuff in the database... The symfony paramterBag class doesn't save anything in a database.

As I understood it, ParameterBag is an enhanced associative array, with caller-provided defaults for missing entrie s (a bit like our current variable_get($name, $default)). Pure runtime api, with no persistent storage. Possibly interesting as an API / interface for the 'key/value' side, but brings nothing on the 'store' side.

I'm still thinking having a generic key/value store interface is not right, it's not specialized enough for more specialized APIs such as lock, cache, configuration or entity properties storage to be performant, and too abstract to be really useful. If we go this way, we'll end up rewriting a full key/value store interface thus specific driver implementations (Redis, Mongo, etc...) will be too hard or complex to write.

This is right in the middle between the real storage backend (Redis etc...) -which are performant by definition- and the specialized API (cache etc) -which must be specialized and atomic enough to be performant-: if we choose that this key/value interface must be simple to make drivers easy to implement, it won't work for specialized business matter and we'll end up with really complex specialized business layer implementation; On the contrary, if we choose to make it complete and fully featured, we won't be able to plug it to any drivers easily, except the one we choose as example, and we will exclude straightforward implementation, by making them complex we will make them slow and loose the performance and maintainability.

I'm not pro for this. I'd prefer to have one driver interface per subsytem (one for cache, one for lock, one for configuration, one for entity properties, etc...) to rely on, it would be much more efficient for writing specific and straightforward -thus performance wise- drivers. They would be eaiser to maintain, understand, test, and also easier to evolve when the real backend evolve.

A generic key/value storage interface in Drupal is a leaky abstraction and it will distance us from many of our goals: performance, simplicity, avoiding cyclomatic complexity, it will make the full Drupal core a lot hard to apprehend for newbies by adding a complex and too generic layer, that will never fit perfectly any of our subsystem business layers.

EDIT: Please, focus on the actual needs of Drupal (configuration, properties and cache) by writing clean and business oriented interfaces for those -we do need to clean those first- instead of creating new virtual ones (the key value store) by trying to anticipate the potential specific layers needs, we actually really don't know those yet.

I have to agree with pounard. This is putting the cart before the horse. A simple convention for method names on "key based things" is sufficient, and then let each "system that stores stuff" do what it needs to do.

It's possible that after we implement a new locking system, an improved cache system, evolve the config system, etc. we'll see enough similarity that we could make a common interface for them. Possible, but highly unlikely IMO, and not something we can reasonably predict now.

Plus, bear in mind that such a universal interface would create a hard dependency for ALL of those systems. We want to remove hard dependencies betwee subsystems. I don't see the value-add here that a loose naming convention wouldn't achieve just as well.

Well we need a simple but pluggable key/value store for #1175054: Add a storage (API) for persistent non-configuration state. If it /only/ gets used for those use cases then it's worth having - we don't need to call it a generic system that will work for every use cases, just for storing 'state' which is neither content nor configuration.

A "system state" subsystem makes a ton of sense, I agree. It probably could be written as a component. I just don't think that starting off by treating it as a special case of an uber-stored-array-on-steroids-meta-system is a good approach.

Agree with Crell here.

"State sub-system" really interesting, so what to do in this issue #1591412-11: Convert tracker settings to configuration system
The variable

variable_del('tracker_index_nid');

is highly attached to {node} table

StatusFileSize
new28.69 KB
PASSED: [[SimpleTest]]: [MySQL] 36,683 pass(es).
[ View ]

First, I'd like to point out that I don't care if this gets used for cache, or even lock. Cache especially is unique enough to warrant it's own backends where needed, although it's quite possible that those classes could at least extend these. Although I do think this could work for config, statistics, and probably others, I think the real use case here is contrib.

We're all eager to get rid of variable_set() as it's used now for config, but that's not the only thing it's used for. Many contrib modules abuse the variables table by storing random data in it, but most of what's wrong about that approach has to do with the fact that we pre-load it all on every request. Having a simple key-value API for storing persistent data that isn't config is very useful. Modules should be able to store their data in a simple way without having to design a schema and mess with schema API. This also allows us to swap out the SQL backends for some of this contrib storage (I'm calling them "collections") in favor of faster/more flexible k/v stores like Redis.

So I'd prefer that we focus on potential use cases so we can see how it could be used.

Here's where I'm at with the current K/V store patch:

I moved it to Component, because this code is now totally independent of Drupal, with the exception of the DbtngStorage class. It could probably benefit from using the DIC, but I didn't try because I'm not familiar enough with it yet.

I included the phpunit tests that I used while writing it (should all pass). I understand those aren't getting commited, but figured it'd be worth showing the usage. Also, the state of PSR-0'd simpletests for core components is very confusing so I didn't bother trying to convert them. I also included a RedisStorage class as another example that I know won't get committed.

The DbtngStorage class currently uses the variables table unless a tablename is passed in $options. Could be a different table than variables, don't really care.

The only other thing I added aside from the component is the keyvalue() wrapper. It uses the variables system similar to cache for storage classes, but that would obviously change later. Here's the basic usage:

<?php
$kv
= keyvalue('foo');
$kv->set("bar", "baz");
print
$kv->get("bar"); // "baz"
?>

I've included three file-based storage classes, serialized php, json, and yaml. I could see config using this by changing the helper to something like this:

<?php
function config($name, KeyValueStoreInterface $kv = NULL) {
 
$kv = $kv ?: keyvalue('config.' . $name);
  return new
DrupalConfig($kv);
}
?>

CMI would need much more work than that to actually use it in practice, but that's probably necessary anythow to properly decouple the storage from the rest of config. The file backends could be implemented in a number of ways, so if CMI would need a different approach, that's worth exploring.

I'm still thinking this is a really bad idea. The interface you propose right now is even less able than the cache backend. A too generic API is no good for anything, because you can't make critical choices at the application level, such as the choice of either optimize reading either or optimize writing, or the choice of going synchronous or asynchronous, or the choice of telling the API it can hit slaves instead of the master, etc... All those things we already don't really care about while we definitely should in existing core code.

And even reading your post, I still don't think any real use case right now. I don't want to put a new feature just because it's fashionable and contrib *may* use it (without even having a real use case).

In Drupal, we have a storage backend, it's called the database, and module developers should always care about their data schema, even if as simple as storing key/value pairs. We also have entities and fields, we also can use cache backends as persistent storage (all we need is not exposing the bin name to the hook_flush_caches()), CMI for persistent and non mutable relatively simple data, and many more features, this one seems one too many IMHO.

I know that the key/value store sounds appealing, really, I was on that side at the beginning, but if people can't do what they need with the solutions upper, then they probably need to start thinking in term of more complex storage, and start designing a schema, etc... and at this point they'll probably be need to design their own storage interfaces anyway.

I really do think than we should first focus on business matters first.

module developers should always care about their data schema, even if as simple as storing key/value pairs

This strikes me as odd. If I know I want a key/value store, why would I want or need to maintain my own table with key and value columns?

As far as using entities/fields and cache backends, that doesn't seem like it's even in the same ballpark as what we're talking about. Those are both a lot more complicated than creating a table, and we're looking for something *less* complicated. We're replacing a schema-less dumping ground that's brain-dead simple to use. The replacement should be equally brain-dead simple.

Let's focus on a specific use case, as discussed in #18 and #19: System state. This is not configuration. It's not content. It's not session because it's not per-user. We cannot regenerate it on the fly at any time, so we can't use the cache. It's blobs of persistent k/v data. I'm thinking, say, the object cache from ctools, which we're likely going to need in core between SCOTCH and Views in Core. Or the menu masks from the current menu system. (That's going away I hope, but that type of data.)

Let's focus on solving that case.

Maybe better inherit from DrupalCacheArray with different way to store data
I don't think that $expire make sense in context of persistent storage
OTOH

the object cache from ctools

really great example, we could use CACHE_PERMANENT as default

I suppose the main purpose of this is a persistent runtime state/data

Status:Needs review» Needs work

"the state of PSR-0'd simpletests for core components is very confusing"

Yet, the rule is that no patches go in without simpletests.

What is the point in having a separate KeyValueStore class when all it does is wrapping a storage object with the exact same methods? Why not merge the storage and the KeyValueStore class?

Why is loadFile public...?

db_query('SELECT name, value FROM {' . $this->table . '} WHERE name IN (:keys)'

This, I think misses db_escape_table().

As a module developer when do i use config and when k/v?

Oups wrong issue once again, I'm sorry, this is becomming annoyng we have two issues trying to put in place the exact same solution for different needs. Anyway, if we need to focus on use cases, let's name the code following those use case first and then derivate to more generic on a per-need basis.

I see a system state table is more specialized than just a key/value store, at some point depending on my business matter I will need to add a few additional columns and be able to filter on it, and based upon that fact, a document oriented storage would probably be a best fit because with some storage backends such as mongo I'd be able to do that. Which make me think back of having a SQL table when we need it since aggregate operations are meant to do that.

Getting back to the original examples, what we cannot generalize clearly is:
- lock (lock represent a mutex, not a storage space)
- cache (it needs wildcard clearing abilities and tag support, and our implementation is working now)
- entity simple crud (we'll need more than one key to store our objects, filter them, and do all kind of operations over it like sorting etc...)
- form state storage is something that but misuse the cache, attaching forms to the session would be a better idea since we have the notion of state attached on a per session basis
- CMI needs its own specialized storage, for which implementations will be radically different from a simple key/value store or cache, and they already have their own interfaces to go with it
- some other stuff such as the system table cannot fit in there, if we want to keep this as it exists (meaning we can query over the existing modules and sorting them) we need a real SQL table with specialized fields and indexes. A more common solution for module list would be to set it in configuration instead
- update cache is probably wrong if it has its own layer
- the object cache of ctools is cache: we can use the cache system. Behind this, it also stores objects, I have to introspect this, but I have a hunch that this is nothing the entity system, cache system and CMI cannot solve altogether, depending on the objects it stores

Does anyone has a single real use case where we could benefit from this?

I'm relatively confident that we can prove @pounard to be wrong, but I also believe he has a point in that we should work on one-off solutions first, and only attempt to merge/unify them later on. Otherwise, we're risking to introduce too much abstraction and/or premature optimizations.

Therefore, I'd suggest to move forward with #1175054: Add a storage (API) for persistent non-configuration state first, and do the required config system clean-ups outlined in #1560060: [meta] Configuration system roadmap and architecture clean-up in parallel. The results of both efforts are most likely going to show two subsystems that both use storage controllers for simple key/value store operations, which can be easily unified afterwards.

Since I'm responsible for it, I should clarify the situation with the update cache. Originally, update manager just used the regular cache system. Then we realized that was totally broken over at #220592: Core cache API breaks update.module: fetches data way too often, kills site performance, etc due to core's cache API being potentially swapped out with volatile storage or otherwise configured to clear itself ridiculously frequently. Update manager needs non-volatile caching. Hence, it bypasses the core cache system so that we know we're always using non-volatile DB storage for this data and can clear the cache at the right moment. This is all fairly well documented and explained at:

http://api.drupal.org/api/drupal/modules!update!update.module/group/upda...

Everyone should know this before they go around calling this "wrong". ;) It's wrong insofar as core provided neither a guaranteed non-volatile cache API nor a generic solution (like the one proposed here) that I could rely on, so I had to write my own (and a generic non-volatile cache API was out of scope at the time -- we were already way past feature freeze at that point, and I needed to be able to backport to D6, too).

So, in D8, I'd be happy to see update manager's "cache" improved. Looks like someone's already trying to do that over at #1547008: Replace Update's cache system with the (expirable) key value store. Perhaps this is a better solution. I haven't yet looked at either of the patches, just skimmed the issues and comments.

Hope this helps shed some light,
-Derek

Status:Needs work» Postponed

Agreed with sun and pounard. Let's postpone the generic approach until we have some use-case-specific implementations in place; then we can see if we can merge them, and if there would be any actual value in doing so. Let's continue in #1175054: Add a storage (API) for persistent non-configuration state, which should serve dww's use case from #31 as well, I'd hope.

Status:Postponed» Needs review
StatusFileSize
new11.2 KB
PASSED: [[SimpleTest]]: [MySQL] 36,694 pass(es).
[ View ]

Here's a very minimal version, with the needs of #1175054: Add a storage (API) for persistent non-configuration state in mind.

This patch only contains the interface, memory, and dbtng backends (and includes chx's feedback about db_escape_table). Since we're talking about essentially duplicating this interface in the other thread, it seems like we have our use case and we should just get this in.

Sure, we could just do a one off for that issue and then look at making a more generic solution later, except that it would be identical. This is exactly why I made the interface as simple as it is. The storage api class (or whatever we're calling it) can just extend the k/v storage and it wouldn't have to override anything. The abstraction costs us nothing.

@chx - The main KeyValueStore class is probably unnecessary now. There was a reason I separated it from storage initially, but I honestly dont remember why.

Status:Needs review» Needs work

A) Can we please not have "Dbtng" in a class name? ;) Must the bizarre Star Trek references creep all the way into our code? Let's just call it "DbStorage" or perhaps "DatabaseStorage".

B) There's an extra newline and * in the doc block for the StorageInterface interface definition.

Otherwise, this seems pretty sane to my just-got-back-from-the-club-and-still-kinda-drunk eyes. ;)

I continue to be confused why these are two separate issues, but whatever. Yes, this seems like the basic abstraction we're aiming for.

Thanks,
-Derek

The database implementation uses a single table for all namespaces, you should create one table per namespace: the indexes would be smaller, it'd be easier and faster to drop a namespace, and each namespace would be isolated from the others. This also mean you could get rid of the prefixing logic.

This also means that, in order to work, namespaces need to be declared somehow (at module install time maybe).

EDIT: Another solution would be create a collection index in the table, instead of working on prefixes, this would have a lot of sense since we could proceed to cleanup more easily.

I think that, whatever is the approach this patch takes, that namespaces should be declared by modules, and dropped when modules declaring one namespace is uninstalled, just like cache or any common database table.

Consistency checks should be done when manipulating keys (in get/set etc) operations, and exceptions should be raised when the user tries to access a non declared namespace, for consistency, else any typo error in code will remain silent and it'll raise drastically key bloating and chances to end up with orphan keys.

I think this should care about locking also most of backends has counter functionality (a-la memcache)
This could help with #103866: Add a general counter API

@pounard - The issue of one table vs many is exactly the issue I've been wanting to hash out. I can see the benefits of table per collection and all in one table, but I kind of like the idea of a collection column. I hadn't thought of that.

@dww - I started to write out why I called it dbtng and then realized I dont care. Someone else decide what it should be called and then tell me :)

Status:Needs work» Needs review
StatusFileSize
new12.5 KB
PASSED: [[SimpleTest]]: [MySQL] 36,700 pass(es).
[ View ]

New patch incorporates dww and pounard's feedback:

- Db class is now SqlStorage
- Creates a new table called keyvalue, which uses a collection column in addition to name/value. A table name can still be passed in $options to use a different table however.

this is looking nice. i would love a ->getAll() method.

that's really all we'd need to shift config storage away from per-file blobs to the default db backend, where the collection name == the config name.

StatusFileSize
new13.08 KB
PASSED: [[SimpleTest]]: [MySQL] 36,706 pass(es).
[ View ]

Seems like a reasonable addition to the interface. Here's a version with getAll().

getAll() potentially limits us in terms of storage implementations, but I don't know if that's a real concern or not.

Also is getAll() really OK for the config API? The current storage for config only implement getNamesWithPrefix() which doesn't have to load an unserialize all the values at once. With default views at the moment, loading all of them can take 20M of memory.

Agree with catch, for both the backend concern and the fact that getAll() won't scale as soon as the collections will grow.

I had the same initial reaction to getAll(), but I can't really think of a backend where we wouldnt be able to do it. It seems like this is part of what differentiates k/v from cache, because we can make this assumption.

getAll(), where 'collection' maps one to one to a config object/file. so, that loads all the data for a *file*, not a *site*.

i'd anticipate being able to much more easily front-end load 'used on every page' config values across the site if we store the data this way. also, it will be easier to make updates to a few config values across different file(s) not require rebuilding the world in standard drupal fashion.

note that storing config this way (each value stored separately) is vapourware - it's just an idea i'd like to try to implement in core before D8 lands.

failing that, i'll write a contrib version, as a successor to Variables That Suck Less.

Agree with catch, for both the backend concern and the fact that getAll() won't scale as soon as the collections will grow.

That is 100% premature optimization without a use case. Of course it won't scale. It's O(n), that's the point.

Another note, the KeyValueStore class doesn't serve any purpose in this code, since it's only a proxy to the storage interface, this could be simplified and the the keyvalue() function could return a storage interface directly.

The getMultiple() function is confusing because it may return an arbitrary number of results. Redis for example returns non existing keys as a null value (could be a null object to differenciate it from null values): we'd better return an iterable custom resultset, with the keys that couldn't be fetched in place but with a specific value/properties to tell it was a fail.

Using an iterable cursor like object for results could allow code that consumes a lot of memory to use those for, for example, sequential object loading (by extending them) and thus give more control to the developer. We could also potentially return the database resultset directly, for example, and avoid some useless array iterations.

getAll(), where 'collection' maps one to one to a config object/file. so, that loads all the data for a *file*, not a *site*.

OK gotcha, misunderstood at what level we'd getting all of something. We'd just have to make it clear that you shouldn't be using this for vast collections. Seems like CMI would still need the 'get all stuff with a prefix' method though in that case?

yes, we do want to be able to get a subset of the keys in a file based on prefix. hmm.

'get me all the keys with this prefix' probably doesn't belong in the key-value API. maybe this is not going to fit like i hoped it would.

Well we could have the config interface extend this one, then config storage classes just have to implement an extra method no? That at least would save some code duplication in projects like redis or mongodb.

Config system would be our first exception of the generic key/value store.

Another matter: we should add the expiration time capabilities for keys, if I read the issue description, we will need it for: session, cache, form state storage.

I think those 3 exemples should not use a generic key/value store anyway: session is too critical and too low level, cache already has a fully featured interface and multiple implementations that exists and has specific feature needs such as tags, and form state storage is something that probably should be tied to the session.

re #49 yeah, that sounds about right.

@pounard - I thought about expiration, but I dont think it's something we should put in this interface. Expiration can mean too many things. In cache, I may not actually want to expire a key, but rather check it's timestamp on a get, and potentially still serve the old version while the new is regenerating.

Hum maybe yes, that's one new good reason for cache not being a candidate for this generic key/value API. Anyway, whatever the name is, "expire" or "TTL" this is a nice feature to have, but this does not really matter right now.

If we're not going to use this for cache, session or lock, and I doubt we will, then I don't see the use in adding expiration.

Form state - I think xjm is working on porting the ctools object cache to a core patch at the moment, and that's likely to tackle form state fairly early on (and iirc it is indeed tied to session), so that will likely not end up in this either at least first pass.

Time to live might be useful for some business specific features, but as you said, if we're not going to add any of locking, cache, session, form storage into this, I don't really see where we are going to use it. May be there's one candidate: the update module for its specific update information. EDIT: But once again, I don't see why they wouldn't use their own table, the only cost is having to describe the schema in the .install file. In the end, is db_select() that awful to use? Wanting a generic key/value API is like doing the same than db_select() but with less features.

  1. I agree that getAll() should be left out from this here. The stored values are not necessarily arrays.
  2. The usage/interpretation of "collection" here doesn't really make sense to me. The interface does not state how collection is being meant here, and whether that is supposed to be some kind of a child thingy within a storage that's only supported by NoSQL stores, or whether it actually means separate storage bins (tables in SQL speak).
  3. Putting "collection" as a column into the database storage seems wrong to me. Given the primary use-case of storing persistent state data, then I'd consider 'state' to be the $collection ($bin). If something else also wants to use this for large amounts of other state data, which for some reason shouldn't be stored in 'state' (e.g., form state data in its current form), then I'd say that code should instantiate a new k/v storage and pass 'form_state' (or 'state_form') instead of 'state' as $bin ($collection). The use-case would be to intentionally use a different and entirely separate storage.
  4. As already mentioned in #1175054: Add a storage (API) for persistent non-configuration state, I strongly believe that all keys/names being stored within a generic (state) bin need to be prefixed with an owner name, and we need a facility to find all items that have a certain prefix (or namespace, if you will), so the system is able to determine and properly clean up all the data that has been stored by a certain module upon uninstallation. If that is the intended meaning of "collection", then it would make more sense to me.

    However, both the database and the memory implementation do not seem to use "collection" in this way. And I think that doing so would mean to move the $collection parameter as first argument into the get() and set() methods, so you'd do e.g.:

    storage('state')->get('update', 'release_data');
    // i.e.:
    storage($bin)->get($collection, $name);

    Can we do that?
  5. In general, the code needs some love; the database implementation tries to filter by a "key", although the column is "name". And the Memory/Array implementation does not take the collection into account at all. It's thus a bit hard to wrap one's head around the intended architecture/design.
  6. SqlStorage should be renamed to DatabaseStorage, because it works with and depends on the Database component, so there isn't any more clear term than DatabaseStorage.

The usage/interpretation of "collection" here doesn't really make sense to me. The interface does not state how collection is being meant here, and whether that is supposed to be some kind of a child thingy within a storage that's only supported by NoSQL stores, or whether it actually means separate storage bins (tables in SQL speak).

Strictly bin in the cache bin sense I guess, just an arbitrary namespace (maybe not even tied to the owner module in the end). For the "state" storage of the other issue, the "collection" would be "state".

Putting "collection" as a column into the database storage seems wrong to me. Given the primary use-case of storing persistent state data, then I'd consider 'state' to be the $collection ($bin). If something else also wants to use this for large amounts of other state data, which for some reason shouldn't be stored in 'state' (e.g., form state data in its current form), then I'd say that code should instantiate a new k/v storage and pass 'form_state' (or 'state_form') instead of 'state' as $bin ($collection). The use-case would be to intentionally use a different and entirely separate storage.

I'm not sure to understand what you are trying to say here, but the collection column I recommended before has the only goal to normalize indexes in the database and avoid prefixed LIKE operations. It is probably right if we have one only table (will be slower for writes, but will proceed only to strict string comparison when reading), but it won't have any sense anymore if we change the implementation to be one table per collection, which I recommend to avoid ending up with one huge bloated table.

As already mentioned in #1175054: Add a storage (API) for persistent non-configuration state, I strongly believe that all keys/names being stored within a generic (state) bin need to be prefixed with an owner name, and we need a facility to find all items that have a certain prefix (or namespace, if you will), so the system is able to determine and properly clean up all the data that has been stored by a certain module upon uninstallation. If that is the intended meaning of "collection", then it would make more sense to me.

This is not the intended meaning of collection, I don't think so at least. Considering the use case of the "state" table, this is not the case at all since the "bin" ("collection") is "state" and the key names are arbitrary and belong to the module that set it. But if you want to add the owner namespace it's not a generic storage anymore because you just add filterable properties onto your keys. This could be achieved with additional columns, indexes, or tags. In all of these cases, we're not speaking about a key/value store anymore but about a specialized formalized and normalized SQL table like storage instead, which would fit better with a relational SQL table or an advanced document oriented storage such as MongoDB which is able to handle a schema and indexed fields.

EDIT: Pure typo fixes.

StatusFileSize
new13.06 KB
PASSED: [[SimpleTest]]: [MySQL] 37,260 pass(es).
[ View ]

I agree that getAll() should be left out from this here. The stored values are not necessarily arrays.

That's not what it does. It just gets all k/v pairs for a collection.

The usage/interpretation of "collection" here doesn't really make sense to me. The interface does not state how collection is being meant here, and whether that is supposed to be some kind of a child thingy within a storage that's only supported by NoSQL stores, or whether it actually means separate storage bins (tables in SQL speak).

I think you're looking for too much meaning here. It's vague by design, the meaning shouldn't imply an implementation detail like tables.

Putting "collection" as a column into the database storage seems wrong to me. Given the primary use-case of storing persistent state data, then I'd consider 'state' to be the $collection ($bin). If something else also wants to use this for large amounts of other state data, which for some reason shouldn't be stored in 'state' (e.g., form state data in its current form), then I'd say that code should instantiate a new k/v storage and pass 'form_state' (or 'state_form') instead of 'state' as $bin ($collection). The use-case would be to intentionally use a different and entirely separate storage.

Each collection can use a different storage backend by design. If you're using the default SQL backend, you still have the option of passing a different table name in the constructor if you need another table.

As already mentioned in #1175054: Add a storage (API) for persistent non-configuration state, I strongly believe that all keys/names being stored within a generic (state) bin need to be prefixed with an owner name, and we need a facility to find all items that have a certain prefix (or namespace, if you will), so the system is able to determine and properly clean up all the data that has been stored by a certain module upon uninstallation. If that is the intended meaning of "collection", then it would make more sense to me.

That's what $collection is for. The getAll() method would allow you to get all the k/v pairs to clean them up. If the module isn't responsible for cleaning this up on their own (like it is for variables in hook_uninstall) then we'd need an info hook type thing to declare collections.

However, both the database and the memory implementation do not seem to use "collection" in this way. And I think that doing so would mean to move the $collection parameter as first argument into the get() and set() methods, so you'd do e.g.:

storage('state')->get('update', 'release_data');
// i.e.:
storage($bin)->get($collection, $name);

Can we do that?

I don't see the problem here. You instantiate your kv object with the collection, then you can set k/v pairs on it. Both the sql and memory implementations work this way. It looks like you're adding a level above collection, which is too much IMO.

In general, the code needs some love; the database implementation tries to filter by a "key", although the column is "name". And the Memory/Array implementation does not take the collection into account at all. It's thus a bit hard to wrap one's head around the intended architecture/design.

Sql bug fixed, thanks for catching that.

The memory implementation doesn't need to do anything with the collection because it's essentially just an array, which tells you what you need to know about the design. All k/v objects are just arrays that persist. How they persist is an implementation detail. The k/v procedural wrapper caches the object per collection, but the k/v storage objects don't have any knowledge of any other collections but the one they were instantiated with. That wrapper should probably just be a static method on the main k/v object that currently doesn't do much at all. I think that's the original way I designed it.

Status:Needs review» Needs work

In this version the class KeyValueStore is absolutely unnecessary, every single method just a one line wrapper to pass the method call to $this->storage, change $kv_objects[$collection] = new KeyValueStore(new $class($collection, $options)); to $kv_objects[$collection] = new $class($collection, $options); and remove the class and no functionality is changed.

Status:Needs work» Needs review
Issue tags:+Framework Initiative

Current state of affairs.

@msonnabaum, I've seen that the KeyValueStore class is not really necessary anymore -- any particular architectural reason for why we have the KeyValueStore wrapper class?


Sorta super-offtopic for this issue, but OTOH, quite essential:

To prevent people from blaming someone innocent. It was me who added the MIT LICENSE.txt in order to reflect the original author's intention.

This code originates from @msonnabaum's github repo and was created with the intention of creating an abstract key/value store API that can be re-used and shared across projects, since no such component seems to exist in any of the PHP frameworks we've checked.

The fundamental idea of pulling it as a component directly into Drupal core has the sole purpose of making the Drupal community a recognized player in that framework field in the long run. To combat the Not Invented Here syndrome, if you will. Our great community can do more than consume; we are certainly able to produce, too.

If this will cause any kind of havoc, we'll withdraw further development on this code on this d.o issue entirely and make Drupal core pull in the library/component from github instead. Details on how this could work can and should be figured out in a separate issue: #1713080: Allow code in Drupal\Component to be used by non-GPL projects

Note: The above is my own stance, which is not coordinated with @msonnabaum, whose opinion may differ.

StatusFileSize
new17.56 KB
PASSED: [[SimpleTest]]: [MySQL] 39,800 pass(es).
[ View ]

What is the point in having a separate KeyValueStore class when all it does is wrapping a storage object with the exact same methods? Why not merge the storage and the KeyValueStore class?

Do we want to have a CAS operation? Followup?

Hm. That's interesting. I wouldn't necessarily say it's missing something, but it seems to have a completely different goal in mind; the Storage interface clarifies that the "focus is on mapping the data onto PHP objects." That's a pretty extreme goal and difference (which could become useful, if we'd want to re-use these storages for entities one day... but alas, that's a long shot...).

OTOH, the very same phpDoc also states that code, which doesn't want or need that, should interface with the storages directly.

I also do not really like that the Storage interface does not define a consistent constructor for all storages, which means that we'll have troubles to interface against them. The component's readme already shows vast differences for how to instantiate a particular storage. Although I guess we could work around that in some way.

So... hm, I'm not sure. Given that the project is relatively new, it might be worth to contact them for possible collaboration and see what they think. It's good that we have our own code readily at hand, so they can directly compare and see what we're up to. In the end, it really does look very similar though (although with a strong bias/focus on mapping the k/v records to PHP objects, which we're not interested at all in), and even the component name and directory + file names are almost identical ;), so it would at least make sense to double-check that we're not inventing the same wheel in parallel :)

#66 , Rob Loach withdrawn his suggestion and I have withdrawn my (rather stern) criticism of it. Let's get back to our solution. I had two valid questions in #62, may I get an answer to that?

Thanks for the insightful comment there. I am glad we are still interested in keeping this simple (and obviously I am not against working with Doctrine).

I talked to both msonnabaum and sun about this and they both agree that its more important to get this into core so we can use it and unblock issues like #1175054: Add a storage (API) for persistent non-configuration state than it is to continue chasing the github/MIT issue with it. Sun wants to merge in some changes he has and then he's going to post a new patch so we can get going on that.

Assigned:Unassigned» chx

I plan to work on this, the descendant issue about state and then the system table issue on the plane home and post results on Aug 31. I fully expect those results to be ready, already this looks nice, just needs tests. If you have code to add please do before Aug 31 because I will be offline (obviously).

Just wanted to confirm #70. Although I want a DrupalComponents project that this would ideally go in, I think getting this functionality into core is more important so I don't want to block it in any way.

Assigned:chx» sun
Status:Needs review» Needs work

We have fully working code for this (including tests) already, so I don't think it makes sense to restart anything from scratch.

I'll try to port/migrate the code into the Drupal\Component namespace later today.

I didn't intend to restart from scratch but the issue doesn't have tests; that's all I wanted to add. (sigh)

Assigned:sun» chx

OK taking this back as no patch materialized. I will get the tests and everything from https://github.com/msonnabaum/KeyValueStore and post a patch today.

Status:Needs work» Needs review
StatusFileSize
new13.64 KB
FAILED: [[SimpleTest]]: [MySQL] 41,117 pass(es), 1 fail(s), and 0 exception(s).
[ View ]

msonnabaum suggested memory and sql backends for now. I left his interface completely intact (aside from making $options in the constructor optional), finished both backends, somewhat rewritten but mostly just rearranged the tests and here it is.

Edit: sorry for the slow work; some health problems slow me down.

Status:Needs review» Needs work

The last submitted patch, 1202336_76.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new13.63 KB
FAILED: [[SimpleTest]]: [MySQL] 41,118 pass(es), 1 fail(s), and 0 exception(s).
[ View ]

A fatal?? I did see a fail because I removed the last minute the return value from delete thinking that exceptions will handle that but that was not a fatal well, let's see this.

StatusFileSize
new14.12 KB
FAILED: [[SimpleTest]]: [MySQL] 41,114 pass(es), 1 fail(s), and 0 exception(s).
[ View ]

OK, so catch pointed out that the DB test could be unit test too by creating its own schema. This allowed moving the actual test (and most of setup) into a common base. Very nice.

StatusFileSize
new14.22 KB
FAILED: [[SimpleTest]]: [MySQL] 41,115 pass(es), 1 fail(s), and 0 exception(s).
[ View ]

Well, if we create our table, then we should drop it as well: added tearDown.

Status:Needs review» Needs work

I don't like this key/value store interface because even the cache backend interface has more features. I don't see the point at all, I'd rather use a cache backend as storage instead of using this in the end, at least it brings me a primitive wildcard support (via the prefix). The only feature we add is the getAll() but we don't have any of the good stuff we would really need as API users.

IMO a key/value store API for our usage would be at least to have some more features, such as async, pipelining, transaction, prefix and wildcard support; Maybe some other high-level features such as incr, decr, expiration, etc... All of those features are easy to emulate on the Drupal layer for backends that don't support it natively, and can be added at least in interfaces even if there is no benefit for the SQL backend, benefits would come with advanced backends such as Mongo or Redis or custom fine tuned SQL backend.

Because of this I really think that this is a regression to spent time on this patch. I really hope we could all focus on both high level and performance related features before actually trying to bring this API in.

I'm opposed to a generic key/value store API -I still think it's too generic- but if this is going to happen, I'd like that we all think of the useful feature we might need later for both advanced core and contrib modules and spend more time on conception and maturation than on a quick implementation rush based on the code of a single individual that did not take any of ideas of the initial post summary and first posts.

Status:Needs work» Needs review

This will be the underpinning of the state API. Does not need to be capable of more than that; and we already see there it is capable of that. More capable APIs can come along as necessary but during the many years of Drupal, there never was actually a need for a higher-capable K-V API. Also trying to unify the APIs / features of Berkeley DB, Tokyo Cabinet, Redis and Memcached -- the leading KV stores -- absolutely frightens me.

To clarify further, even if we never use this API for anything but the state API, we are still good. We didnt lose anything, there's no performance regression or anything.

Status:Needs review» Needs work

Then you are writing code with a wrong name, in the wrong namespace, and in a wrong issue.

EDIT: Just as a side note: [...] absolutely frightens me that's why I always proposed only features that can be safely emulated with no real performance penalty at API level.

Status:Needs work» Needs review

Sorry, I was wrong above slightly: besides the state API plain K-V usage is likely, for example user mail texts which are currently variable. So no, this doesn't need to be merged into the state API , you alone have hammered this since before there was a patch, now that there is you still do not review it but harp it is not enough. It is. We can do more when the need arises.

Note that among the thousands of Drupal projects, noone written a K-V API like the one you suggest -- perhaps because noone needed one?

Status:Needs review» Closed (won't fix)

But you know, I am sick of fighting you, sun and everyone else who just wants to fight. I am merging this into the state issue.

Status:Closed (won't fix)» Closed (duplicate)

This isn't won't fix if the patch is moved elsewhere, but we also can't justify adding this without a use-case, and we know we're not going to use it behind the cache or session backends, which wasn't necessarily the case over a year ago.

So I'm OK with marking it duplicate of #1175054: Add a storage (API) for persistent non-configuration state.

For posterity: it is absolutely not OK to close this down before we figured out why that fatal happens. It's just the usual issue queue bullying that forced me to do it.

This is not about fighting, but when the issue was opened, the goal was to be able to switch a lot more of subsystems onto such Key-value API. I was enthusiast toward this approach at the time, for many reasons. Now I'm not anymore, but, there could be real life use case, and if we do reveal some, I'd be glad to parcipate in implementing such API. The only condition IMO is that it must be complete, and reflect the original issue summary.

The state API is itself a good idea too even if I don't always see the real need (we have a database today, we can use it for such state API without abstracting that much, keep it minimal and close to what core really need). For this state API, a simple K/V store as proposed here would fit, so please, do it, but do it into the proper issue, and with the proper name.

IMO, the only real use case for a simple K/V API is as a cache backend, or stuff like sessions, or simple statistics (that don't need to be aggregated or subject to complex query and display), but that's all about it. You cannot developer business stuff related to such a complex schema as ours properly with a K/V store; Not with fields, objects, relation between objects, need to query them using properties, etc... Even configuration objects such as Views are complex and may need more than just a name to be queried, we need other indexed fields in most cases (Views are grouped by status, tags, can be ordered by name, needs an upper cache even with the existence of this API, etc...).

I do understand that a lot of people might want to use a simple K/V store API like this one, but in most cases it will be at the expense of optimized business APIs or just because people are lazy to code a nice SQL schema. We have a SQL database, let's just embrace it, and use it correctly, instead of storing everything in BLOB's which makes direct data manipulation (SQL custom queries, maintainance in a SQL client, schema discovery, schema design and drawings, integrity, other constraints, data analysis, aggregation queries, etc...) totally impossible and unsutunable.

If Drupal is meant to go away from a SQL database, let's just do it properly some day, I would be definitely curious and happy to hear proper global design solutions, with pros and cons etc..

Today with this we are just giving an incomplete tool to people, that will serve the only purpose of writing incomplete, unmaintainable code accompagnied with a totally unconsistent data schema with no integrety constraints at all or any mean to properly query or analyse the data using anything else than the keys. Using a K/V store means handling indexes in the software side of things in case we need it, instead of relying on existing tools (SQL is perfect for that, but Mongo would be too, and some other solutions probably exist), which means that this tiny piece of handy API comes with the burden of making your life a lot harder when it comes to add specific business features around data filtering, querying, ordering, cleaning, analysis etc...

My conclusion is that we should NOT give this tool as an open API to contrib developers, because it will bring a lot of ofuscated data with no way around correct analysis and maintainance. Let's just develop the specific state API, even if it is indeed a K/V store, but let's develop it in the state issue, and let's do fine tuning to it throught its own encapsulated API. Then, in the end, if it ends up still simple as that, then we may derivate it to use a more generic abstraction.

I might have, again, reacted the wrong way. All I knew this API is needed for state and that we had a mysteriious fatal (turned out to be https://bugs.php.net/bug.php?id=43200 ) and all I wanted to do in this issue is figuring out that fatal and I didn't take too well to be against this API. Those wider concerns were basically just frustrations in an already frustrating situation I didn't even want to hear about. Our common history, alas, didn't help this at all. Sorry for the harsh tone.

Sorry for this too. I think I never really exposed why I hate this generic issue, because people focus on abstraction before focusing on the need. I did now. I just don't want to see Drupal turn into a huge data garbage bin (see where the word 'bin' can fit).

I don't want to fight with you again, and state API must go on; I'd rather see it going on at the right place.