Every time cron runs, the module fetches all the items: not just those that are new since the last cron run, but also duplicates of items that are already there and unchanged in every respect.
How can I prevent duplicate entries?
Thanks.
Comments
Comment #1
khawaja commented
Installed this tonight, and I'm having the exact same issue. I think perhaps the "title as GUID" comparison code isn't quite working right.
Comment #2
Ashraf Amayreh commented
Can you send me the URL to the feed or attach it here for testing?
Comment #3
khawaja commented
For me it is: http://www.worldofwarcraft.com/rss.xml
To my novice eyes it seems to be formatted properly.
- produces duplicate items
- only grabs 4 of 12 items listed
From this error and a scan of the code, it seems you probably need to put quotes around the title used as GUID and escape any single or double quotes within it. Any title that has the word AND in it is probably going to cause problems.
Comment #4
khawaja commented
If I understood more about module development I would attempt it myself. I imagine wherever story_guid is used there may be issues. But as it is, these might be of help:
http://api.drupal.org/api/function/db_escape_string/6
http://ca2.php.net/manual/en/function.addslashes.php
http://ca2.php.net/manual/en/function.stripslashes.php
Comment #5
Ashraf Amayreh commented
Well, the GUID is passed through a CRC, so no titles should get into the GUID check. Also, db_query does auto-escaping. But thanks for the thoughts :)
Please check out the new release and let me know if the duplicate feed items problem is solved and whether you're still getting the MySQL errors you posted.
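For readers following along, here is a minimal sketch of the two ideas in this comment: keying each item on a CRC of its GUID (or its title when there is no GUID), and letting the database layer escape values through placeholders. It is written in Python with sqlite3 purely for illustration and is not the module's PHP code. The aggregation_item table and story_guid column names come from this thread; the schema, the title column, and the guid_key and import_items helpers are assumptions.

```python
import sqlite3
import zlib


def guid_key(item):
    """Return a CRC32 of the item's GUID, falling back to its title."""
    raw = item.get("guid") or item["title"]
    return zlib.crc32(raw.encode("utf-8"))


def import_items(conn, items):
    """Insert only items whose key is not stored yet; return how many were added."""
    added = 0
    for item in items:
        key = guid_key(item)
        # Placeholders (?) mean words like "and" or quotes in a title are
        # treated as data, never as SQL.
        seen = conn.execute(
            "SELECT 1 FROM aggregation_item WHERE story_guid = ?", (key,)
        ).fetchone()
        if seen is None:
            conn.execute(
                "INSERT INTO aggregation_item (story_guid, title) VALUES (?, ?)",
                (key, item["title"]),
            )
            added += 1
    conn.commit()
    return added


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE aggregation_item (story_guid INTEGER PRIMARY KEY, title TEXT)"
)

feed = [
    {"title": "New Wrath of the Lich King Keyboard and Gaming Surfaces"},
    {"title": "Blizzard's \"holiday\" update"},
]
first_run = import_items(conn, feed)   # simulates the first cron run
second_run = import_items(conn, feed)  # simulates the next cron run, same feed
```

With this approach a second run over an unchanged feed should add nothing, and a title containing "and" or quote characters never reaches the SQL text itself.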
Comment #6
khawaja commented
I was talking about when you use "title as guid" to check whether there are new items or not. The problem is with the word AND in this title, "New Wrath of the Lich King Keyboard and Gaming Surfaces", or any other reserved SQL term used in any title. From the code I wasn't sure whether the title as guid was put in quotes or not. And if it was, are you already escaping apostrophes etc.?
I had 8 feeds set up when I used the new release. I didn't see the old error but this new one. But I don't know which feed it was from. I guess we need to check whether an alias already exists or not before trying to insert it. 8^)
I've turned off all feeds except the warcraft one. It is no longer producing the error from #3, but it is only grabbing 4 of 11 items in the feed. It's still stopping after grabbing this item: "New Wrath of the Lich King Keyboard and Gaming Surfaces". So even though no error is produced, there is possibly still a problem with SQL reserved words.
Comment #7
khawaja commented
Update: I think the error in #6 was caused by already having a term "feed" with sub-terms 01 to 09. I changed the term to rssfeed and am no longer getting that error. This was done as a temporary fix to use views. http://drupal.org/node/337890.
http://www.worldofwarcraft.com/rss.xml is still only grabbing 4 of the available 11 items in the feed right now.
Comment #8
whan commented
The duplication problem I face with the Aggregation module is this: when I update a feed_item node that has already been imported from an RSS feed URL by the Aggregation module, the updated feed_item node is imported again as a duplicate on the next cron run that finds a new item in the RSS feed.
Thank you.
Comment #9
whan commented
I've enabled the Aggregation module on my Drupal installation and was able to import blog posts from Blogger into my Drupal database. However, I've edited a few of the blog posts that were imported as Feed items into my Drupal application, to change the text/image alignment. Now, when I publish a new blog post on Blogger and run cron, the Feed item nodes that were edited are imported again. I've noticed that the 'story_guid' field in the 'aggregation_item' table becomes '0' when a Feed item node is updated/edited.
Thank you.
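To illustrate the failure mode described here, a minimal Python sketch of a save routine that keeps the stored story_guid when an edit arrives with a missing or zero value. This is not the fix from any linked issue and not the module's code; the save_item helper, the in-memory store, and the sample values are hypothetical.

```python
def save_item(store, nid, changes):
    """Update a stored feed item while keeping its existing story_guid."""
    current = store[nid]
    # A missing or zero GUID in the edit is ignored so the stored one survives.
    guid = changes.get("story_guid") or current["story_guid"]
    store[nid] = {**current, **changes, "story_guid": guid}
    return store[nid]


# Hypothetical stored item, keyed by node id.
store = {1: {"story_guid": 305419896, "title": "Old title"}}

# An edit that changes the title and (wrongly) carries story_guid = 0.
edited = save_item(store, 1, {"title": "New title", "story_guid": 0})
```

If the GUID survives the edit, the next cron run can still match the node against the feed entry instead of importing it again.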
Comment #10
whan commented
http://drupal.org/node/964384 has the fix for this issue.
Thanks