Download & Extend

Hash check fails when running multiple importers on cron

Project:Feeds
Version:7.x-2.x-dev
Component:Code
Category:bug report
Priority:normal
Assigned:Unassigned
Status:closed (fixed)

Issue Summary

I have several importers in my site, which is set to run "as often as possible". When cron is executed, some of them seems to be updating every item even though nothing has changed. If I run the importers manually from the standalone form, everything works as intended, returning "There are no new nodes".

I've traced this down to the hash check, in FeedsProcessor::process(). The stored hash is loaded as it should, but the hash for the current item is being calculated differently. It seems as this is due to a static variable in FeedsProcessor::hash(), which stores the serialized mappings. I think that the static variable is getting it's initial value based on the importer that is triggered at first. The rest of the importers, will then use that static variable.

I haven't done any thorough research, but I've removed the static variable to test my thoughts, and everything is indeed working as it should.

Comments

#1

Status:active» needs review

This patch simply removes the static variable, causing the mappings to be serialized every time.

AttachmentSizeStatusTest resultOperations
1848726-1-feeds-hash_check_cron.patch658 bytesIgnored: Check issue status.NoneNone

#2

#3

Version:7.x-2.0-alpha5» 7.x-2.x-dev
Status:needs review» reviewed & tested by the community

#1 solves this.

#4

Status:reviewed & tested by the community» needs review

We really shouldn't get rid of the cache entirely.

AttachmentSizeStatusTest resultOperations
feeds-serialized-mappings-1565890-4.patch1.07 KBIgnored: Check issue status.NoneNone

#5

Thanks for looking at this.

#4 makes sense. Would #1 brake something or just be a little slower?

#6

+++ b/plugins/FeedsProcessor.inc
@@ -763,11 +768,10 @@ abstract class FeedsProcessor extends FeedsPlugin {
    *  Empty/NULL/FALSE strings return d41d8cd98f00b204e9800998ecf8427e

I guess that has not been true for a while...

#7

Hm. Do we need an update function for this?

Additionally, would it make sense to cache the hash of serialized mappings?

#8

Without any actual testing, I believe that #4 would work, but I can't see that caching would mean any significant performance improvements, since it's a simple matter of serializing. #1 is simpler to understand and maintain, but #4 would mean that the serializing is only done once.

However, this is a small change. Both patches would work, the choice is up to the maintainer. This won't need an update function.

#9

Status:needs review» fixed

You're absolutely right, a quick test shows that the savings are trivial.

http://drupalcode.org/project/feeds.git/commit/8dd1ca3

#10

Status:fixed» closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.