Duplicate Feed Items
jbsarma - February 2, 2009 - 21:18
| Project: | Aggregation |
| Version: | 6.x-1.5 |
| Component: | Code |
| Category: | feature request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | active |
Description
Every time cron runs, the module fetches all the items in the feed, not just those that are new since the last cron run. It imports the duplicates as well, even though those items already exist and are unchanged in every respect.
How can I prevent duplicate entries?
Thanks.

#1
Installed this tonight, and I'm having the exact same issue. I think perhaps the "title as GUID" comparison code isn't quite working right.
#2
Can you send me the URL to the feed or attach it here for testing?
#3
For me it is: http://www.worldofwarcraft.com/rss.xml. To my novice eyes it seems to be formatted properly, but the feed:
- produces duplicate items
- only grabs 4 of 12 items listed
Judging from the error below and a scan of the code, it seems you probably need to put quotes around the title used as the GUID and escape any single or double quotes within it. Any title that contains the word AND is probably going to cause problems.
- user warning: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'Wrath of the Lich King Keyboard and Gaming Surfaces AND 1234209036 - n.created >' at line 1 query: _aggregation_add_item /* admin : _aggregation_add_item */ SELECT n.nid AS nid FROM node n, aggregation_item ai WHERE ai.nid = n.nid AND ai.story_guid = New Wrath of the Lich King Keyboard and Gaming Surfaces AND 1234209036 - n.created >= 0 in /home/powrslns/public_html/kookyguides.com/sites/all/modules/aggregation/aggregation.module on line 1398.
- user warning: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'Shop Talk - Arena Ratings AND 1234209036 - n.created >= 0' at line 1 query: _aggregation_add_item /* admin : _aggregation_add_item */ SELECT n.nid AS nid FROM node n, aggregation_item ai WHERE ai.nid = n.nid AND ai.story_guid = Blizzard Shop Talk - Arena Ratings AND 1234209036 - n.created >= 0 in /home/powrslns/public_html/kookyguides.com/sites/all/modules/aggregation/aggregation.module on line 1398.
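The logged query shows the title sitting in the SQL text without quotes, so words like AND are parsed as SQL. A minimal sketch of the alternative, binding the value as a parameter instead of pasting it into the query string, is below. It uses Python's sqlite3 module purely for illustration and is not the module's PHP code. The table layout is an assumption taken from the columns visible in the logged query, and `find_existing_item` and `demo` are hypothetical names.

```python
import sqlite3


def find_existing_item(conn, story_guid, now):
    """Return node ids whose GUID matches, with all values passed as bound parameters."""
    cur = conn.execute(
        "SELECT n.nid FROM node n, aggregation_item ai "
        "WHERE ai.nid = n.nid AND ai.story_guid = ? AND ? - n.created >= 0",
        (story_guid, now),
    )
    return [row[0] for row in cur.fetchall()]


def demo():
    # Assumed schema: only the columns that appear in the logged query.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, created INTEGER)")
    conn.execute("CREATE TABLE aggregation_item (nid INTEGER, story_guid TEXT)")

    title = "New Wrath of the Lich King Keyboard and Gaming Surfaces"
    conn.execute("INSERT INTO node (nid, created) VALUES (1, 1234200000)")
    conn.execute(
        "INSERT INTO aggregation_item (nid, story_guid) VALUES (1, ?)", (title,)
    )

    # The title contains "and", but it is treated as data, not as SQL.
    return find_existing_item(conn, title, 1234209036)
```

Because the title never becomes part of the SQL text, reserved words and quote characters inside it cannot change how the query is parsed.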
#4
If I understood more about module development I would attempt it myself. I imagine wherever story_guid is used there may be issues. But as it is, these might be of help:
http://api.drupal.org/api/function/db_escape_string/6
http://ca2.php.net/manual/en/function.addslashes.php
http://ca2.php.net/manual/en/function.stripslashes.php
#5
Well, the GUID is passed through a CRC, so no titles should get into the GUID check. Also, db_query does auto-escaping. But thanks for the thoughts :)
Please check out the new release and let me know whether the duplicate feed items problem is solved and whether you're still getting the MySQL errors you posted.
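The CRC approach mentioned here could look roughly like the sketch below: each GUID (or the title, when it stands in for the GUID) is reduced to an integer checksum, and only items with an unseen checksum are kept. This is written in Python for illustration. Using CRC32 specifically is an assumption, since the thread only says "a CRC", and `guid_key` and `new_items` are hypothetical names, not functions from the module.

```python
import zlib


def guid_key(guid):
    """Reduce a GUID string (or a title used as the GUID) to a CRC32 integer."""
    return zlib.crc32(guid.encode("utf-8"))


def new_items(items, seen_keys):
    """Return the items whose key is not yet in seen_keys, recording their keys."""
    fresh = []
    for item in items:
        key = guid_key(item["guid"])
        if key not in seen_keys:
            seen_keys.add(key)
            fresh.append(item)
    return fresh
```

With this shape, running the same feed through `new_items` twice with the same `seen_keys` set yields nothing the second time, which is the behaviour the original report asks for. Since the comparison is on an integer, the wording of the title does not matter.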
#6
I was talking about when you use "title as GUID" to check whether there are new items. The problem is with the word AND in the title "New Wrath of the Lich King Keyboard and Gaming Surfaces", or any other reserved SQL term used in a title. From the code I wasn't sure whether the title used as the GUID was put in quotes and, if it was, whether you are already escaping apostrophes etc.
I had 8 feeds set up when I tried the new release. I didn't see the old error, but I got this new one, and I don't know which feed it came from. I guess we need to check whether an alias already exists before trying to insert it. 8^)
user warning: Duplicate entry 'category/feed-categories/hunters/feed-' for key 2 query: path_set_alias /* admin : path_set_alias */ INSERT INTO url_alias (src, dst, language) VALUES ('taxonomy/term/820/feed', 'category/feed-categories/hunters/feed', '') in /home/powrslns/public_html/kookyguides.com/modules/path/path.module on line 112.
user warning: Duplicate entry 'category/feed-categories/love-air/feed-' for key 2 query: path_set_alias /* admin : path_set_alias */ INSERT INTO url_alias (src, dst, language) VALUES ('taxonomy/term/832/feed', 'category/feed-categories/love-air/feed', '') in /home/powrslns/public_html/kookyguides.com/modules/path/path.module on line 112.
user warning: Duplicate entry 'category/feed-categories/warhammer/feed-' for key 2 query: path_set_alias /* admin : path_set_alias */ INSERT INTO url_alias (src, dst, language) VALUES ('taxonomy/term/844/feed', 'category/feed-categories/warhammer/feed', '') in /home/powrslns/public_html/kookyguides.com/modules/path/path.module on line 112.
I've turned off all feeds except the Warcraft one. It no longer produces the error from #3, but it only grabs 4 of the 11 items in the feed. It still stops after grabbing the item "New Wrath of the Lich King Keyboard and Gaming Surfaces", so even though no error is produced, there may still be a problem with SQL reserved words.
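The suggestion in this comment, checking whether an alias already exists before inserting it, can be sketched as follows. Again this is Python with sqlite3 for illustration only, not the path module's code. The unique key on (dst, language) is an assumption inferred from the "Duplicate entry ... for key 2" messages, and `set_alias` and `demo_alias` are hypothetical names.

```python
import sqlite3


def set_alias(conn, src, dst, language=""):
    """Insert an alias unless (dst, language) is already taken. Returns True if inserted."""
    row = conn.execute(
        "SELECT 1 FROM url_alias WHERE dst = ? AND language = ?", (dst, language)
    ).fetchone()
    if row is not None:
        return False  # alias already exists, so skip the insert
    conn.execute(
        "INSERT INTO url_alias (src, dst, language) VALUES (?, ?, ?)",
        (src, dst, language),
    )
    return True


def demo_alias():
    conn = sqlite3.connect(":memory:")
    # Assumed schema: a unique key over (dst, language).
    conn.execute(
        "CREATE TABLE url_alias "
        "(src TEXT, dst TEXT, language TEXT, UNIQUE (dst, language))"
    )
    src = "taxonomy/term/820/feed"
    dst = "category/feed-categories/hunters/feed"
    first = set_alias(conn, src, dst)   # inserted
    second = set_alias(conn, src, dst)  # skipped instead of raising a duplicate-key error
    return first, second
```

The second call returns False rather than triggering a duplicate-entry warning.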
#7
Update: I think the error in #6 was caused by already having a term "feed" with sub-terms 01 to 09. I changed the term to rssfeed and am no longer getting that error. This was done as a temporary fix to use Views: http://drupal.org/node/337890.
The feed at http://www.worldofwarcraft.com/rss.xml is still only yielding 4 of the 11 available items right now.