I have several importers on my site, which are set to run "as often as possible". When cron runs, some of them seem to update every item even though nothing has changed. If I run the importers manually from the standalone form, everything works as intended, returning "There are no new nodes".

I've traced this down to the hash check in FeedsProcessor::process(). The stored hash is loaded as it should be, but the hash for the current item is calculated differently. This seems to be due to a static variable in FeedsProcessor::hash(), which stores the serialized mappings. I think the static variable gets its initial value from whichever importer is triggered first. The rest of the importers then reuse that value instead of their own mappings.
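To illustrate the suspected cause, here is a minimal sketch of how a static variable inside a method can leak one importer's mappings into another's hash. SketchProcessor and its constructor are hypothetical stand-ins, not the actual Feeds code; the only assumption carried over from the report is that hash() combines the serialized item with the serialized mappings.

```php
<?php
// Hypothetical stand-in for a processor; NOT the real FeedsProcessor class.
class SketchProcessor {
  protected $mappings;

  public function __construct(array $mappings) {
    $this->mappings = $mappings;
  }

  public function hash(array $item) {
    // A static variable in a method is shared by every instance of the class
    // for the whole request, so only the first caller's mappings are stored.
    static $serialized_mappings;
    if (!$serialized_mappings) {
      $serialized_mappings = serialize($this->mappings);
    }
    return md5(serialize($item) . $serialized_mappings);
  }
}

$item = array('title' => 'Example');
$second_mappings = array('title' => 'title', 'body' => 'body');

// Two importers with different mappings, processed in the same request
// (as would happen during a single cron run).
$first = new SketchProcessor(array('title' => 'title'));
$second = new SketchProcessor($second_mappings);

$hash_first = $first->hash($item);
$hash_second = $second->hash($item);

// The hash the second importer would produce if it ran on its own.
$expected_second = md5(serialize($item) . serialize($second_mappings));
```

In this sketch the second importer ends up with the same hash as the first and no longer matches the value it would compute when run alone, which would be consistent with items looking "changed" only during cron.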

I haven't done any thorough research, but I removed the static variable to test this theory, and everything then works as it should.

Comments

olofbokedal’s picture

Status: Active » Needs review
FileSize
658 bytes

This patch simply removes the static variable, causing the mappings to be serialized every time.
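As a rough sketch of the approach described here (not the patch itself), the hash can be computed from the current instance's mappings on every call. UncachedSketchProcessor is a hypothetical class name, and the md5-over-serialized-data shape is an assumption based on the description in this issue.

```php
<?php
// Hypothetical stand-in; NOT the real FeedsProcessor class.
class UncachedSketchProcessor {
  protected $mappings;

  public function __construct(array $mappings) {
    $this->mappings = $mappings;
  }

  public function hash(array $item) {
    // No static cache: the mappings of this instance are serialized on
    // every call, so importers cannot influence each other's hashes.
    return md5(serialize($item) . serialize($this->mappings));
  }
}

$item = array('title' => 'Example');
$a = new UncachedSketchProcessor(array('title' => 'title'));
$b = new UncachedSketchProcessor(array('title' => 'title', 'body' => 'body'));

$hash_a = $a->hash($item);
$hash_b = $b->hash($item);
$hash_a_again = $a->hash($item);
```

Each importer now gets a hash that depends only on its own mappings, regardless of the order in which importers run.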

olli’s picture

Version: 7.x-2.0-alpha5 » 7.x-2.x-dev
Status: Needs review » Reviewed & tested by the community

#1 solves this.

twistor’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
1.07 KB

We really shouldn't get rid of the cache entirely.
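One way to keep a cache without sharing it between importers would be to store the serialized mappings on the instance instead of in a static variable. This is only a sketch under that assumption; the thread does not show what the attached patch actually does, and InstanceCachedSketchProcessor, $serializedMappings and $serializeCalls are hypothetical names.

```php
<?php
// Hypothetical stand-in; NOT the real FeedsProcessor class.
class InstanceCachedSketchProcessor {
  protected $mappings;

  // Per-instance cache of the serialized mappings.
  protected $serializedMappings = NULL;

  // Counter used only to show how often the mappings are serialized.
  public $serializeCalls = 0;

  public function __construct(array $mappings) {
    $this->mappings = $mappings;
  }

  public function hash(array $item) {
    // Serialize the mappings once per instance rather than once per request.
    if ($this->serializedMappings === NULL) {
      $this->serializedMappings = serialize($this->mappings);
      $this->serializeCalls++;
    }
    return md5(serialize($item) . $this->serializedMappings);
  }
}

$item = array('title' => 'Example');
$a = new InstanceCachedSketchProcessor(array('title' => 'title'));
$b = new InstanceCachedSketchProcessor(array('title' => 'title', 'body' => 'body'));

$hash_a = $a->hash($item);
$hash_a_again = $a->hash($item);
$hash_b = $b->hash($item);
```

The trade-off in this sketch is an extra property to maintain in exchange for skipping repeated serialize() calls on the mappings.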

olli’s picture

Thanks for looking at this.

#4 makes sense. Would #1 break something or just be a little slower?

olli’s picture

+++ b/plugins/FeedsProcessor.inc
@@ -763,11 +768,10 @@ abstract class FeedsProcessor extends FeedsPlugin {
    *  Empty/NULL/FALSE strings return d41d8cd98f00b204e9800998ecf8427e

I guess that has not been true for a while...

olli’s picture

Hm. Do we need an update function for this?

Additionally, would it make sense to cache the hash of serialized mappings?

olofbokedal’s picture

Without any actual testing, I believe that #4 would work, but I don't see caching giving any significant performance improvement, since all it saves is a call to serialize(). #1 is simpler to understand and maintain, while #4 means the serializing is only done once.

Either way, this is a small change. Both patches would work; the choice is up to the maintainer. This won't need an update function.

twistor’s picture

Status: Needs review » Fixed

You're absolutely right; a quick test shows that the savings are trivial.

http://drupalcode.org/project/feeds.git/commit/8dd1ca3

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.