When unflagging a content with a flag with some attached fields, the data from these fields are not deleted. The same behavior happens when deleting a user : the flags the user has done are deleted, but the fields data remain as zombies.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

joachim’s picture

Version: 7.x-3.0 » 7.x-3.x-dev
Status: Active » Postponed (maintainer needs more info)

It works for me with the 'Confirmation form' link type.

What link type are you using?

MasterWalker’s picture

Status: Postponed (maintainer needs more info) » Active

The flagging is done with the 'Confirmation form' link type, but for some reasons the unflagging is done programmatically with the $flag->flag('unflag', $id, $account) method (the user can't unflag directly). Again it does not work either when deleting a user account.

joachim’s picture

Can you do some debugging?

_delete_flagging() should be getting called, which then calls field_attach_delete(). That should delete field values.

MasterWalker’s picture

I tested using Devel's 'Execute php code' with a simple 'unflag' operation, and it worked (the fields were deleted). My bad.

So here is what I think happened : when testing an account deletion, I saw the flags' fields were not deleted. I then tried to unflag programmatically in hook_user_delete, but it did'nt work ; so I assumed unflagging would not delete the fields. However I was misled by the fact that flag_user_delete was called before my module's hook implementation, and thus I was working on empty flaggings.

Now that I've had a deeper look at the Flag module's code, I can see that field_attach_delete() is not called on user deletion ; neither for the flags the user has done nor for the ones he has received. Am I missing something ?

For now I just changed the hook_user_delete execution order using hook_module_implements_alter, I unflag every flags related to the user being deleted, and I get what I want. But I feel like I'm missing something again.

joachim’s picture

Title: Unflagging does not delete attached fields data » deleting a Flagging via deleting an entity does not delete Flagging fields data

Thanks for digging that out!

User deletion and account removal is handled by a common helper, flag_user_account_removal().

That in turns calls _flag_entity_delete(), which is the real culprit.

_flag_entity_delete() is a shared helper for:

- hook_user_cancel() / hook_user_delete(), via flag_user_account_removal()
- hook_entity_delete()
- hook_node_delete()

So it actually affects deletion of any entity.

I can reproduce this by creating a node, flagging it with a confirmation form to enter Flagging field values, then deleting the node. The flagging field values remain in the DB.

Fixing this is going to involve finding, then loading all the flaggings for the entity deleted in_flag_entity_delete(), then calling field_attach_delete(). I hope that doesn't lead to recursive hook invocation! (Technically, we should also invoke hook_entity_delete() at this point, but that definitely WOULD be recursive, so we'll leave that to another issue and another day.)

This is going to mean deleting entities is an even heavier operation. But then there's already a problem in core when deleting large numbers of entities.

joachim’s picture

Oh wait, there's flag_user_account_removal() which has the problem that flaggings done BY the user (rather than TO the user) have this problem too!

MasterWalker’s picture

From what you've said I suppose this will take some times before this is fixed. I'll just put my workaround here, but I don't know about the performance (you wrote about a core problem with massive deleting of entities, my code was not tested but for a few user accounts only).

I'm using these two functions to get the flaggings :

/**
 * Get the flags done by a user and unflag them
 */
function _delete_user_flags($account, $type){
	$flagged_entities = flag_get_user_flags($type, NULL, $account->uid, NULL, TRUE);
	foreach($flagged_entities as $flagged){
		foreach($flagged as $entity_id => $struct){
			$flagging = entity_load_single('flagging', $struct->flagging_id);
			$flag = flag_get_flag($flagging->flag_name);
			$flag->flag('unflag', $entity_id, $account, TRUE);
		}
	}
}
/**
 * Get the flags done on an entity and unflag them
 */
function _delete_entity_flags($entity_id, $type){
	$flags = flag_get_entity_flags($type, $entity_id);
	foreach($flags as $uid => $flagged){
		foreach($flagged as $flag_name => $struct){
			$flag = flag_get_flag($flag_name);
			$flag->flag('unflag', $entity_id, user_load($uid), TRUE);
		}
	}
}

Then I can implement hook_user_delete and hook_entity_delete :

/**
 * hook_user_delete
 */
function mymodule_user_delete($account){
	_delete_user_flags($account, 'profile2'); // we need to know the available types
	_delete_user_flags($account, 'node');
}
/**
 * hook_entity_delete
 */
function mymodule_entity_delete($entity, $type){
	$ids = entity_extract_ids($type, $entity);
	_delete_entity_flags($ids[0], $type); 
}

Of course as I said above, I need to make my module execute mymodule_user_delete before flag_user_delete :

/**
 * hook_module_implements_alter
 */
function mymodule_module_implements_alter(&$implementations, $hook) {
  switch ($hook) {
    case 'user_delete':
      $mymodule = $implementations['mymodule'];
      unset($implementations['mymodule']);

      $flag = $implementations['flag'];
      unset($implementations['flag']);

      $implementations['mymodule'] = $mymodule;
      $implementations['flag'] = $flag;
    break;
	// I didn't find the need to do the same with hook_entity_delete.
	default:
	break;
  }
}

As you can see, the problem with this code is that we need to know what types of entity the user may have flagged. I couldn't find a better solution.

joachim’s picture

Status: Active » Needs review
FileSize
10.85 KB

Here's some tests, which fail as expected.

Status: Needs review » Needs work

The last submitted patch, 2052947-6.flag_.flagging-field-delete.tests-only.patch, failed testing.

joachim’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 2052947-6.flag_.flagging-field-delete.tests-only.patch, failed testing.

joachim’s picture

First time was failing because I'd forgotten to push a docs clean-up commit.

Second fail is as expected: 3 lots of field data are left in the field table as zombies.

apotek’s picture

I'm wondering if the topic title of this issue report still really reflects the issue.

In an abstract fashion, the problem as I see it is that when it comes to cleanup, the flag module bypasses both its own D6-ish flag->[action] api AND the D7 entity api.

flag_user_account_removal() for example (called in a hook_user_delete()) removes flags by direct db queries and doesn't in turn invoke any hook or other action that modules can hook into to ensure that flagging fields are dealt with or handled, or just trigger any other event off of the removal of a flag. The flag module's own hooks aren't even called (no module_invoke_all('flag_unflag') for example. it would not be necessary to call these hooks explicitly, if the account removal call implemented the removal of flags via their own flag object api where each flag removed (unflagged) would trigger a field_attach_delete() somewhere down the line, which other modules could then hook into to ensure cleanup of orphaned bundle/field data.

What's interesting is that if you look at the flag_flag->flag/unflag logic, you see this, where a related problem is explicitly recognized:

<php>
        try {
          // Note the order: We decrease the count first so hooks have accurate
          // data, then invoke hooks, then delete the flagging entity.
          $this->_decrease_count($entity_id);
          module_invoke_all('flag_unflag', $this, $entity_id, $account, $flagging);
          // Invoke Rules event.
          if (module_exists('rules')) {
            $event_name = 'flag_unflagged_' . $this->name;
            // We only support flags on entities.
            if (entity_get_info($this->entity_type)) {
              $variables = array(
                'flag' => $this,
                'flagged_' . $this->entity_type => $entity_id,
                'flagging_user' => $account,
                'flagging' => $flagging,
              );
              rules_invoke_event_by_args($event_name, $variables);
            }
          }
          $this->_delete_flagging($flagging);
          $this->_unflag($entity_id, $flagging->flagging_id);
</php>

So it looks like, for very understandable reasons, flag module is somehow straddling the world of D6 hooks and D7 entities, and in the case of garbage collection, things are falling through the gaps because of it.

What would be nice would be to have a sense from the maintainers in which direction they want to solve this, so we know in which direction to create patches/hacks in order to keep our data clean in the meantime. Looking at the recent addition of flag_entity_delete and changes to flag_node_delete is giving some indication.

So I would imagine that a call like flag_user_account_removal() could be refactored to use flag_entity_delete --> _flag_content_delete and we should be able to resolve this.

If the flag module handles removal of user flags and entity/bundle data that is attached to those flags, then, in most cases, another module that defines a certain flag type, might not need to implement hook_user_delete() at all, and thus not worry about execution order etc, since the flag_user_delete() would trigger hooks that can be tapped into instead.

joachim’s picture

> bypasses both its own D6-ish flag->[action] api AND the D7 entity api.

Argh, you're quite right. We're bypassing our own API. (Though hey, who are you calling D6-ish? ;)

The problem is one of scalability.

If the user unflags a node, we call the unflag API, which does a bunch of checks, invokes some hooks and rules, then tells the entity system to delete the Flagging, which in turn tells FieldAPI to delete field data.

If we delete a user, we have to run that whole chain for each flagging that user ever created, taking each flagging at a time. If your user only flagged a handful of nodes, we're fine. If they flagged dozens or hundreds, the process will take too long and die.

apotek’s picture

I get the point about the scalability, believe me. I've got over 1.4 million users in the users table :).

But when deleting a user, and thus deleting their flags, any bundle data attached to that flag is left dangling around, which, in the long run creates a different problem that can affect scalability & management: orphaned data.

I'm wondering if there's a way to, at the bottom of the more efficient direct db calls you are making in hook_user_delete() to add a line or two of code that would trigger an entity-delete cycle on the entities removed, or, if that's too scary slow, create a custom module_invoke_all('flag_user_delete_garbage_cleanup') call that would allow another module familiar with its specific flagging entity removal needs to handle it via it's own calls.

Just thinking out loud, but still inside the (text) box.

apotek’s picture

I get the point about the scalability, believe me. I've got over 1.4 million users in the users table :).

But when deleting a user, and thus deleting their flags, any bundle data attached to that flag is left dangling around, which, in the long run creates a different problem that can affect scalability & management: orphaned data.

I'm wondering if there's a way to, at the bottom of the more efficient direct db calls you are making in hook_user_delete() to add a line or two of code that would trigger an entity-delete cycle on the entities removed, or, if that's too scary slow, create a custom module_invoke_all('flag_user_delete_garbage_cleanup') call that would allow another module familiar with its specific flagging entity removal needs to handle it via it's own calls.

Just thinking out loud, but still inside the (text) box.

kaizerking’s picture

Issue summary: View changes

Is this patch working?

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 8: 2052947-6.flag_.flagging-field-delete.tests-only.patch, failed testing.

joachim’s picture

See the comment I made when I posted the patch:

> Here's some tests, which fail as expected.

brunodbo’s picture

If they flagged dozens or hundreds, the process will take too long and die.

Could we delete the flaggings' field data in a batch operation? We could add a function that runs the batch operation, and then call it in flag_entity_delete, flag_node_delete, and flag_user_delete, none of which delete flaggings' field data as far as I can tell.

indigoxela’s picture

The problem still exists: when a flaggable entity is deleted, the flaggings themselves get purged from the database, but helper function _flag_entity_delete() ignores attached field data for performance reasons.

What about a completely different approach?
A new api hook in flag module, that hands over the flagging ids. That gives module developers a chance to take care of these data.
Such a hook would require an additional db_select in the helper function.

With the flagging ids given, it's possible to use, for instance, the drupal queue in a custom module to remove orphaned field data on cron runs.
And a hook doesn't overload the flag module itself with functionality only needed in special use cases (flagging has fields).

Right now it's rather tricky to get rid of the orphaned data.
Either by fiddling around in the hook_entity_delete order (as descibed in #7), or by doing some sort of database garbage collection on cron runs (database table compare, remove field data when flagging_id doesn't exist anymore).