Re-starting this, after #1492188: Update module creates duplicate queue items drifted back to being a workaround fix for update.module.

Right now, update.module does its own uniqueness checks based on cache entries in cache_update. This should be a feature provided by the queue system itself, either by providing an actual unique queue or by providing wrapper functions to deal with this.

API suggestion by @beejebus:

<?php
function drupal_unique_queue_item_create(QueueInterface $queue, $data, $key) {
  $status = db_merge('unique_queue_item')
    ->key(array('key' => $key))
    ->insertFields(array(
      'created' => REQUEST_TIME,
    ))
    ->updateFields(array(
      'updated' => REQUEST_TIME,
    ))
    ->execute();
  if ($status === \Drupal\Core\Database\Merge::STATUS_INSERT) {
    $queue->createItem($data);
  }
}
?>

This is also relevant for #1547008: Replace Update's cache system with the (expirable) key value store, which aims to re-unite update.module with the cache API; that is not possible right now due to these fetch_task entries. Not sure if this is the right approach though, maybe yet another API for the project information data would make more sense there. But that isn't possible either until we have this API here.

Comments

Would it be better to encapsulate this into the Queue interface (making it an abstract class instead), so that all queue backends might take advantage of their own unique-handling if applicable?

<?php
abstract class QueueInterface { // <= should probably be renamed ...
  abstract public function createItem($data);
  abstract public function numberOfItems();
  ...
  public function createUniqueItem($key, $data) {
    // Shouldn't the unique item be per queue?
    $status = db_merge('unique_queue_item')
      ->key(array('key' => $key))
      ->insertFields(array(
        'created' => REQUEST_TIME,
      ))
      ->updateFields(array(
        'updated' => REQUEST_TIME,
      ))
      ->execute();
    if ($status === \Drupal\Core\Database\Merge::STATUS_INSERT) {
      return $this->createItem($data);
    }
    return FALSE;
  }
  public function deleteItem($item) {
    // Should this remove the "lock" from "unique_queue_item"?
  }
}
?>

The code above is wrong in many ways, the most visible one being that this stub implementation hardwires db_merge(): different queue backends are not supposed to be tied to any particular storage backend.

But there is a very relevant comment inside: Shouldn't the unique item be per queue? I think for the sake of API consistency there are two considerations here:
1. The item can be unique only on a per-queue basis (I have the right to set the key "2", for example, in two different queues, even if it must remain unique within each of them);
2. I think that the "unique" property should be determined at the queue level (either by creating a new interface, or by adding it via isUnique() or similar getters/setters), so that when creating a new queue, exceptions such as UnsupportedQueueCapacityException("unique") can be thrown for backends that don't support it.

Granted, of course the stub should NOT be storage backend aware.
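To illustrate point 1 above, per-queue uniqueness can be modelled as a compound (queue name, key) identifier. The following is only an illustrative sketch, not existing core API: the in-memory registry here stands in for a unique_queue_item table with a compound primary key, and all names are hypothetical.

```php
<?php
// Illustrative only: per-queue uniqueness via a compound (queue name, key)
// identifier, standing in for a compound primary key on unique_queue_item.
class UniqueKeyRegistry {

  protected $keys = array();

  // Returns TRUE when (queue, key) was not registered before.
  public function acquire($queue_name, $key) {
    $compound = $queue_name . ':' . $key;
    if (isset($this->keys[$compound])) {
      return FALSE;
    }
    $this->keys[$compound] = time();
    return TRUE;
  }

  // Drops the "lock", as discussed for deleteItem() above.
  public function release($queue_name, $key) {
    unset($this->keys[$queue_name . ':' . $key]);
  }

}
```

With this, the key "2" can be acquired in two different queues at once, but only once per queue until it is released.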

edited :-)

We can't force queue storage backends themselves to implement uniqueness; it won't work with a 'proper' queue runner like beanstalkd at all. This was discussed in the original update module issue. So keeping the queue API more or less as it is and adding a separate layer on top seemed like the only way to do this: then you can run any queue runner, and if you're using redis and want queues and queue tracking in the same storage, you can.

Possible we could have this new API extend from the queue interface though, I don't think there's any particular reason it shouldn't.
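A minimal sketch of what such a separate layer on top could look like, assuming a simplified queue interface and an in-memory key tracker. All names here are hypothetical, not existing core API; the point is only that the wrapped backend never needs to know about unique keys.

```php
<?php
// Hypothetical sketch: a uniqueness layer wrapped around any queue backend,
// so the backend itself (e.g. beanstalkd) stays unaware of unique keys.
interface SimpleQueueInterface {
  public function createItem($data);
  public function numberOfItems();
}

// Trivial in-memory backend, standing in for SystemQueue, beanstalkd, etc.
class MemoryQueue implements SimpleQueueInterface {

  protected $items = array();

  public function createItem($data) {
    $this->items[] = $data;
    return TRUE;
  }

  public function numberOfItems() {
    return count($this->items);
  }

}

class UniqueQueueWrapper {

  protected $queue;
  // Stands in for the 'unique_queue_item' tracking storage.
  protected $seen = array();

  public function __construct(SimpleQueueInterface $queue) {
    $this->queue = $queue;
  }

  // Only enqueue when the key has not been seen before, mirroring the
  // db_merge() STATUS_INSERT check in the snippets above.
  public function createUniqueItem($key, $data) {
    if (isset($this->seen[$key])) {
      return FALSE;
    }
    $this->seen[$key] = time();
    return $this->queue->createItem($data);
  }

  public function numberOfItems() {
    return $this->queue->numberOfItems();
  }

}
```

Because the tracking lives in the wrapper, the same pattern works whether the underlying queue is the database, beanstalkd, or redis (where the tracking could even share storage with the queue itself).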

So, last year I needed a queue that could optionally provide the functionality of a unique and priority queue in addition to a standard queue. Since this was all up in the air at the time, I decided to build one that would work for now and integrate with whatever was built later on.

Since that seems to have not gone anywhere, would it be worth contributing my code? I've sandboxed it for you here: http://drupal.org/sandbox/ottawadeveloper/1948226

It provides a new static method (UniqueQueue::get()) for building queues that support both unique items and prioritized items (with both being optional), and it provides a default DB-based implementation of an abstract class that could be extended to support any storage method necessary. The interface is very similar to the current queue interface (there are a few things I haven't implemented yet, like releaseItem() and createQueue(), both of which should be done); however, it overloads the methods with additional parameters for the unique and priority functionality.

I chose to write my own full implementation that relies on a new queue table with the proper columns, as the alternatives seemed to be altering the core table (which I try to avoid) or doing a join with a second table (more expensive), and it also seemed difficult to integrate with the generic framework. It would likely be possible to create a version that relies on a regular queue as well (either by doing the join or by extending the queue table) as a wrapper around a DrupalQueueInterface object, if people feel that's preferable, but it makes sense to me that the storage requirements are different enough to completely reimplement the storage methods in a unique framework for performance reasons.

As I'm going to need this for some other projects I have on the go, I'm perfectly happy to commit to maintaining it if there is interest :).