All the other news feeds migrated perfectly from 4.x to the DRUPAL-5 aggregator, and I have not seen this happen with other feeds imported to the new system. We have one new feed, however, from the BlogSpot site http://thewee3.blogspot.com/atom.xml. The feed itself is fine, the new aggregator reports reading the stories, and the headlines appear in the feed block, but all the stories have the same URL (the first story's link).

We have deleted the feed and re-added it, and get the same results.

Comment  File                         Size       Author
#40      aggregator_Atom10-D6.patch   743 bytes  torsten

Comments

Ashraf Amayreh’s picture

Assigned: Unassigned » Ashraf Amayreh
Category: bug » feature
Status: Active » Postponed

I hope you haven't mixed up the aggregation module with another one, because you mentioned aggregator in your post. The aggregation (not aggregator) module currently supports RSS only. ATOM and RDF are being considered, but there's no deadline set since I'm pretty short on time.

You can easily add handling for ATOM and/or RDF, and if you'd like advanced customizations then you can contact me directly as well. Please check the readme file for details.

teledyn’s picture

oops, sorry, maybe I did choose the wrong one; I didn't realize there were two until you pointed it out -- here's what I have in the .info file:

$ more aggregator.info 
; $Id: aggregator.info,v 1.3 2006/11/21 20:55:33 dries Exp $
name = Aggregator
description = "Aggregates syndicated content (RSS, RDF, and Atom feeds)."
package = Core - optional
version = VERSION

I don't have the contrib module installed, just this one from the core DRUPAL-5-1 sources.
is it possible to move this bug to the other project's issues?

teledyn’s picture

Title: Strange behaviour with Blogspot feeds » Strange Aggregator behaviour with Blogspot feeds
Project: Aggregation » Aggregator2
Version: 5.x-1.3 » master
Category: feature » bug
Status: Postponed » Active

Don't know if this is Aggregator2 or Aggregator Node; the core DRUPAL-5-1 module just calls itself "Aggregator" but the Issues Tracker has no such module and neither of the Aggregator modules has a Version listing for 5.1

Ashraf Amayreh’s picture

I believe you should report the bug in Drupal's bug tracking system; if the bug does exist, it is considered a core bug.

Choose "Support" tab on drupal's site then check under "Bug Reports" on how to do that.

teledyn’s picture

Ah ... so Issues is not for Bugs. Got it. Thanks.

teledyn’s picture

What gives??? I did just as you said, and it leads me right back here to the Issues pages!!

Ashraf Amayreh’s picture

Project: Aggregator2 » Drupal core
Version: master » 6.x-dev
Component: Code » aggregator.module
Assigned: Ashraf Amayreh » Unassigned

Sorry for that, seems I got you mixed up. Issues are for bugs, Aggregator module is a core module, so you won't find it listed in the module's section. Instead, it's considered a core module.

Core modules are considered "Drupal". This includes all the modules that come with a default drupal installation, whether enabled by default or not.

So you simply needed to change the project from the "Project" drop-down from "aggregator2" to "Drupal". And change the component drop-down to "aggregator.module". I'll do that right now, but just wanted to clear that up :-)

teledyn’s picture

Thanks. I have now confirmed this bug on another 5.1 installation; in 4-7 the blogspot atom feeds were simply ignored (zero entries) but in the 5-1 the results appear to be unpredictable, sometimes null stories, sometimes all stories assigned to the wrong URL (url of the first story?)

there is a workaround: blogspot feeds have an rss2 option: use the alternate feeds/posts/default?alt=rss and the feed will integrate properly.
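
A minimal sketch of that workaround, assuming the blog is served from the root of its hostname. The helper name blogger_rss_feed_url() is made up for illustration and is not part of aggregator.module:

```php
<?php
// Hypothetical helper (not part of Drupal): derive the RSS 2.0 feed URL
// for a Blogger/Blogspot blog from its base address.
function blogger_rss_feed_url($blog_url) {
  // Drop any trailing slash so the path separator is not doubled.
  $base = rtrim($blog_url, '/');
  return $base . '/feeds/posts/default?alt=rss';
}

echo blogger_rss_feed_url('http://thewee3.blogspot.com/'), "\n";
```

The resulting URL is what you would paste into the aggregator's feed URL field in place of the atom.xml address.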

Ashraf Amayreh’s picture

Status: Active » Closed (fixed)

I guess this is solved now. closing...

leeksoup’s picture

Version: 6.x-dev » 5.1
Status: Closed (fixed) » Active

I'm still seeing this bizarre behavior with Atom feeds from a Blogger blog. I can see the posts come through just fine, but the urls are all messed up -- all point to the same one (I think the one for the first post).

I tried the workaround suggested by teledyn in #8, but then all I get from the feed is the titles and post times w/o any content. I looked at the xml file at the default?alt=rss url for the blog and it looks fine -- it shows the first paragraph of each post.

smrogers’s picture

I am having the same experiences as #10. With the RSS feed, you only get the titles. With the atom feed, you get the wrong links to the blogspot posts but you get teasers.

CSCharabaruk’s picture

I'm having similar behaviour... Usually all that'll appear through aggregator is the URL of an old article, but sometimes the title and teaser text for the most recent articles show up with the old article URLs. I'm getting this both through the original feed and through FeedBurner's version of it.

Ashraf Amayreh’s picture

Did anyone test/confirm this problem with the "aggregation" module?

If so, does it happen with the same URL in the issue description or a different one (please point out the troublesome URL if the problem is also valid in the aggregation module).

Thanks

skeptict’s picture

Version: 5.1 » 5.2

I'm seeing this behavior too. When I add the atom.xml feed to News Aggregator, it correctly picks up title information about the latest post, but the link it provides is to the oldest post on the feed. I don't know if Blogger's implementation of Atom is non-standard or what, but it doesn't seem to happen with other platforms (Wordpress blog atom feeds, for instance, seem to work fine).

sokrplare’s picture

Version: 5.2 » 4.7.1

I'm experiencing the same thing with Drupal 4.7.1 (and I tested with the same results in 5.1) on one of my client websites.

For reference here is the feed with inaccurate info - note how all of the links go to the same page which is the bottom feed entry:
http://www.waeagles.com/?q=aggregator/sources/2

That feed references a feedburner link which appears to work perfectly:
http://feeds.feedburner.com/eagleforum/JLEj

The original feed is available at the following URL, and it has the same problem:
http://www.eagleforum.org/blog/atom.xml

On the site I have a second aggregation set up with that one at:
http://www.waeagles.com/?q=aggregator/sources/1

Any thoughts? I'm going to take a look at the code, but I'd love to see a patch for this!

sokrplare’s picture

Version: 4.7.1 » 5.2

Sorry, I'll leave this with 5.2, but recognize that it appears to be an issue with older versions too.

sokrplare’s picture

Okay, I found a workaround using FeedBurner. The problem is somehow related to the fact that with Atom the "link" field for each entry isn't picked up in the parsing. I didn't look into exactly why this is, but because aggregator works fine with RSS I switched my FeedBurner setup over to that. Here is how:

  1. Login to http://www.feedburner.com/ (create an account if needed)
  2. Add a feed with default options (see feedburner for how to do this)
  3. From the "My Feeds" page click on the feed you want to change to RSS
  4. Click on the "Optimize" tab
  5. Under "Services" on the left click on "SmartFeed"
  6. Deactivate SmartFeed
  7. On the left again click on "Convert Format Burner"
  8. Choose RSS 2.0 from the drop-down and click "Save"

That did the trick for me!

72dpi’s picture

yes, this is a bummer.

I opted to go for: /feeds/posts/default?alt=rss at the end of my URL,
so, at least the user gets to the right article,
rather than just print out the Content & a wrong link.

Hope someone works out how to fix this (without having to rely on feedburner)...

rares’s picture

Version: 5.2 » 5.7

I can confirm that this bug with the Atom format persists in the 5.7 version. The FeedBurner and ?alt=rss workarounds are good, but I think this is a problem that developers should work on.

RahDick’s picture

Status: Active » Needs review

I have found that the error occurs in the "LINK" case handling of the aggregator_element_start() function (line 607).
When you add $items[$item]['LINK'] = $attributes['HREF']; to the "else"-Statement, too, the bug is gone. I'm absolutely not sure what side-effects this may cause due to a lack of test cases. Can anyone have a look at this?

kikko’s picture

Sorry, can you explain with more extended code?

    case 'LINK':
      if ($attributes['REL'] == 'alternate') {
        if ($element == 'ITEM') {
          $items[$item]['LINK'] = $attributes['HREF'];
        }
        else {
          $channel['LINK'] = $attributes['HREF'];
        }
      }
      break;

This is the part you are talking about. What change do I have to make?
Thanks and sorry

bensheldon’s picture

Version: 5.7 » 6.1
Status: Needs review » Active

@RahDick Yep, I think that's the issue too---though the broader issue is why is that if-statement having to get to the else.

I too was running into this issue, and I'm not sure why blogger's Atom Feeds seem to have it as opposed to any other Atom feed. The one irregularity I did find is that Blogger seems to use single-quotes in their feed, rather than double-quotes, but I'm no XML genius so I don't know if that's what's bunging stuff up.

    case 'LINK':
      if ($attributes['REL'] == 'alternate') {
        if ($element == 'ITEM') {
          $items[$item]['LINK'] = $attributes['HREF'];
        }
        else {
          $items[$item]['LINK'] = $attributes['HREF']; // ++++++
          $channel['LINK'] = $attributes['HREF'];
        }
      }
      break;

But that seems like a hack because the issue is really that if-statement. Why isn't the parser correctly reading $element == 'ITEM'?

Also, there is some weird flow to database insertions:

1) It seems like when you first create the feed in aggregator and update it for the first time, no new Items are added but the Feed's Link is set to the last item's Link (probably because of that wonky if-statement above). But the Feed's Link is added to the Feed *after* (or rather, it's never re-retrieved) it tries to insert the items. This is a problem because the Add Item logic says "we have to have a link, if we don't, use the Feed's link" but it doesn't have the feed link (even improperly set) so no items are inserted.
2) But... if you then say "delete items" on the aggregator admin screen (even tho there are no items---this is because the aggregator sends a request that says "I checked on this date, is there anything newer?....Nope, ok, never mind" so nothing new will be pulled since it barfed on parsing last time),
3) Now if you refresh the feed, it *will* add the items!!!---though the links will all be set to the feed's link (which is the last item's link from the first time the feed was refreshed, so still wrong).

Whew. So that's where I'm at.
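
To make the fallback in step 1 concrete, here is a rough sketch of what "use the Feed's link when an item has none" amounts to. The function name and array layout are assumptions for illustration, not the actual aggregator.module code:

```php
<?php
// Simplified illustration only; the function name and array layout are
// assumptions and not the real aggregator.module code.
function sketch_resolve_item_links(array $items, $feed_link) {
  $links = array();
  foreach ($items as $item) {
    // An item without its own LINK inherits the feed-level link.
    $links[] = !empty($item['LINK']) ? $item['LINK'] : $feed_link;
  }
  return $links;
}

// Parsed entries whose LINK was never filled in by the parser.
$items = array(
  array('TITLE' => 'Newest post'),
  array('TITLE' => 'Older post'),
);
// Feed link that was (wrongly) overwritten with an entry's URL.
$feed_link = 'http://example.blogspot.com/2007/01/first-post.html';

$links = sketch_resolve_item_links($items, $feed_link);
print_r($links);
```

With every item missing its own LINK, all of them end up pointing at whatever the feed's link happens to be, which matches the symptom in this issue.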

cdiggity’s picture

Is there any action on this front? Will a fix be included in the next version of drupal? This bug is a year old!

eean’s picture

Version: 6.1 » 6.3

I'm experiencing odd issues with BlogSpot feeds myself. In the log it said new items were found for one of them, but no items are downloaded at all.

lennart’s picture

I can confirm.

This issue still exists in Drupal 6.3.

It's not just with blogspot.com though, but also with other ATOM-feeds I've tried.

mustafau’s picture

yogayak’s picture

The article above still did not solve the issue.

In my case, I will try the feedburner way out.

love and light

ericjam’s picture

Priority: Normal » Critical

Has no one resolved this? I cannot aggregate blogspot feeds correctly for the life of me. I've even tried alt=rss and it doesn't really do much. Basically all brackets are stripped out of tags, leaving html behind. It's not an encoding issue, is it?

teledyn’s picture

I rather expect this bug is not going to be fixed. For this and other similarly embarrassing reasons, I've long since ceased using drupal.

liminalspace’s picture

Version: 6.3 » 5.3
Status: Closed (duplicate) » Active

OMG indeed. I now also need to provide a blogger/blogspot feed to one of my client's sites - an international school - one of their students has a blog on blogger and will be attending the big conference in Copenhagen (COP15) and reporting about it in her blog - I would have preferred she blogged directly using the school's drupal site, but now she is using blogger and I can hardly tell her to change it because I am having technological issues....

.... I've tried ALL the above suggestions, done a bunch of googling on different blogger rss feed formats.. none of them work.... Same problem as above.... alt=rss, even rss.xml at the end - still nothing..

please does anyone have any solutions? or any other modules that will work for parsing a blogger feed?????

help~

-Anti-’s picture

Version: 5.3 » 6.14

Same problem. I did a drupal site for a guy who has a successful blogspot blog, and one of the main conditions in the brief was that he could continue to use the blogspot and the posts would be aggregated onto the drupal site. I had previously tested the aggregation with a few other feeds and it seemed to be ok, so I said it was no problem.

Stupidly, I yet again trusted drupal devs to provide a working script, when it is absolutely plain to me by now that you can't trust drupal with even the simplest of features without testing the f*** out of it first. Really, is there *anything* in drupal that is more than 90% functional? Drupal really sucks sometimes. I really have to stop using it; I'm just waiting for the last straw... maybe this is it.

hinanui’s picture

I am using Drupal 6 and I still can't get feeds from any Blogspot blog, even by adding "/feeds/posts/default?alt=rss" at the end of the blog URL.
Has anyone found a real fix for this issue?? I can't believe it has not been resolved yet!

justinchev’s picture

Subscribe

rayz90’s picture

I've got the same issue with youtube feeds: http://gdata.youtube.com/feeds/api/users/rayz90/uploads/

dMaggot’s picture

I stumbled upon this bug recently using the UDPlanet module, where blog posts from Blogger feeds aren't even displayed in the planet page.

The problem appears to be with Blogger's Atom feeds, and the ?alt=rss fix only works on some blogs as far as I have tested, probably because of a per-user setting that enables or disables RSS feeds. If you cannot find the setting (my friend who helped me debug this couldn't), you can use a service like http://atom2rss.semilogic.com to convert that feed to RSS.

Notice, though, that some Atom feeds like the ones in Wordpress work pretty well with the Aggregator module, so maybe this is just something Blogger should deal with (if that's not too much to ask).

David E. Narvaez

abamarus’s picture

Version: 6.14 » 6.16

I am having the same issue with the aggregator and atom feeds. This time it's the atom feed from a you tube channel that's giving me all the same links whatever the actual content.

dmyurych’s picture

This seems to be intermittent. I was consistently getting a 0 items message when trying to update items for a Blogger feed (Atom or RSS). But then it suddenly just started working without any changes to the feed configuration.

I did find a temporary workaround until it did start working. I set up a Feedburner feed (http://feedburner.google.com) but used the RSS feed from Blogger instead of the Atom feed (i.e. ?alt=rss at the end of the feed URL). Once I had this set up, I added the feed in the Aggregator module of Drupal, using the Feedburner feed instead.

Perhaps getting the Feedburner feed working had something to do with the Blogger feed now suddenly working. I can't be sure.

--
Darrel Yurychuk

Sturmey’s picture

Version: 6.16 » 6.17

This is still an issue in 6.17

This has been an ongoing issue since 4.x and I really don't think blogger or blogspot really care that much that drupal can't handle a proper atom.xml file. Everyone else seems fine with the feeds that are generated, but drupal fails with this. I think I'm going to switch over to joomla, I know that they can handle these feeds.

dddave’s picture

Title: Strange Aggregator behaviour with Blogspot feeds » Strange Aggregator behaviour with Blogspot feeds / Atom feeds not pulled

I think this issue can be the main hub for this long lasting issue.

Marked the following issues as dupes:
#342043: Feed items do not get fetched
#784160: Unable to pull content from certain feeds.
#541026: aggregator module does not show feeds from blogspot.com
#616206: Only 1 of 9 items are aggregated. might be worth reading because it contains some quality input and not mainly bitching and moaning.

Note: Any further reports about feed aggregation not working is NOT helpful as long as it does not contain quality debugging info or (*hold your breath*) a patch. The current state might be frustrating but you can check out modules from contrib for feed aggregation or pay some wizard to fix this (and after that please provide it to the community).

torsten’s picture

File size: 743 bytes

Hi,

here's a patch which gets the Atom 1.0 feed http://www.heise.de/newsticker/heise-atom.xml working for me.
The "case 'LINK'" section in function aggregator_element_start looks a bit weird now.
Maybe the developer of the aggregator module can have a look at this:

Based on aggregator.module in Drupal 6.17:

    case 'ID':
      if ($element != 'ITEM') {
        $element = $name;
      }
      break;  // <-- added this line with break
    case 'LINK':
      if (!empty($attributes['REL']) && $attributes['REL'] == 'alternate') {
        if ($element == 'ITEM') {
          $items[$item]['LINK'] = $attributes['HREF'];
        }
        else {
          $channel['LINK'] = $attributes['HREF'];
        }
      }
      else {  // <-- added 4 lines with this else branch
        $items[$item]['LINK'] = $attributes['HREF'];
        $channel['LINK'] = $attributes['HREF'];
      }
      break;

I've added some other RSS feeds to my web page using this patched aggregator module, with no side effects so far...

--
Torsten
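
Why the added break matters, as far as can be told from the code above: without it, case 'ID' falls through into case 'LINK', so an <id> element (which carries no rel or href attribute) would reach the new else branch. A toy example of that fall-through, with a made-up function name and no relation to the real handler beyond the switch layout:

```php
<?php
// Toy dispatcher, not the module's code: shows how a missing "break"
// lets case 'ID' run the code that belongs to case 'LINK'.
function sketch_visited_cases($name, $with_break) {
  $visited = array();
  switch ($name) {
    case 'ID':
      $visited[] = 'ID';
      if ($with_break) {
        break;
      }
      // No break: execution continues into the next case.
    case 'LINK':
      $visited[] = 'LINK';
      break;
  }
  return $visited;
}

echo implode(',', sketch_visited_cases('ID', FALSE)), "\n"; // prints "ID,LINK"
echo implode(',', sketch_visited_cases('ID', TRUE)), "\n";  // prints "ID"
```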

dddave’s picture

Status: Active » Needs review

;)

Branjawn’s picture

subscribe. This is a pain in the arse!

dddave’s picture

@#42

Please test the patch!

http://drupal.org/patch/apply

Branjawn’s picture

I'm sorry, I'm not qualified. I don't understand line commands and CVS and all that type of stuff. I'm just a self-taught church volunteer with very little computer background.

That said, I got my site up and running all by myself (foc4u.org)

So, for now, I'll wait...

p.s. there is a Drupal camp where I live in a few weeks and I requested a session on CVS, patches, setting up test environments, etc
p.s.s. that's correct, test environments... I've always done all my work on a live production site b/c I don't even know how to set up a dev site

Sturmey’s picture

@#43

I've worked in IT for several years, I am the admin for several web sites and managed terabyte sized databases and managed linux, UNIX and Windows servers. The link that you gave is NOT helpful to the average user. Drupal's documentation is aimed at experienced users, new drupal users are lost with most of it.

In all my years working with servers and web sites I have never found patch to be a straight forward solution.

Since you are clearly the drupal expert, have you tested the patch? Can you report your findings?

Clearly getting atom feeds to work with Drupal is not a priority. My recommendation is to switch to another CMS such as Joomla that can handle atom feeds as if they are native. Drupal is currently a few years behind Joomla from being "average user" friendly. For the sites that I maintain that require atom feeds I no longer use Drupal.

Realistically Drupal core should remove the Aggregator module since it is not a fully functioning module (evidence in above posts) and all bugs reports get closed without solutions.

dddave’s picture

re #45
I don't use this feature and therefore I am not testing this patch. I tried my best to organize this issue to ease the finding of a solution.

Do I read your post correctly that you did not test the patch but instead wrote a useless rant? Drupal is a community effort and therefore simply demanding a solution, whining about features not working correctly and not doing a lick to solve an issue is helping absolutely nobody.

This is all I am going to say about this.

Test this patch or stop derailing this issue. This patch needs enough testing before it can be included in a release. Thanks.

Mc Fly’s picture

TheoRichel’s picture

The patch doesn't work for me. Atom feeds still give 'no items', although... the official url for this site's feed is: http://john-ray.blogspot.com/feeds/posts/default . It gives 'no items'. However, when I add ?alt=rss it works, though it leaves <b> and <br> tags in the blocks I display on www.groenerekenkamer.nl/milieublogs .

nemsis’s picture

I just tested this patch on Drupal 6.19, It doesn't work for me with or without the ?alt=rss ending.
I get 0 items added and when I look at the address in a custom block from the snippets pages I get a very old article (about a month) address from the site I am trying to aggregate. All the other sites show the home page address of the aggregated site. As far as I know this is the only site putting out an Atom feed, all the rest are rss.
I don't know if this last bit is of any use but I thought it might be a clue for someone who actually knows what they are doing, unlike myself.
This is the PHP code from the snippets block I am using.

// Last update: May 7 2006
$result = db_query("SELECT a.title, a.url, a.fid, a.link
    FROM {aggregator_feed} a
    ORDER BY a.title");
$output = '<ul>';
while ($feed = db_fetch_object($result)) {
    $output .= '<li>' . l($feed->title, $feed->link) . '</li>';
}
$output .= '</ul>';
$output .= '<p><a href="/aggregator/sources">more</a></p>';
return $output;

Hope someone can figure this out, the aggregator works perfectly for so many other feeds it is a shame not to fix this ongoing problem.

Stoob’s picture

I have been drinking so much Drupal Kool-aid my tongue is blue. Despite my positive attitude and PHP knowledge, Blogspot feeds continue to not work on 6.19 for me with or without the patch. DRUPAL FTW

Sborsody’s picture

subscribe

OK, pulling blogspot atom feed works in D7 but not 6.20. Can we get some sort of backport?

klim_’s picture

subscribe

whan’s picture

Subscribe. Blogspot atom feed does not work in D6.20.

dddave’s picture

@52,53

Did you test the patch? If so: What are your findings?

codefactory’s picture

I can confirm the issue too with this feed
http://greek-economist.blogspot.com/feeds/posts/default
The patch did not work for me

planstoprosper’s picture

Version: 6.17 » 6.22

I am running Drupal 6.22, and I have been having the same problem with five Blogger feeds. I tested the patch from #40 on my site and it did not work.

The fix described in #20, #21 and #22 appears to have worked for one of the five problem feeds, but not the other four. FWIW, the one feed that is fixed is the only one of the five to have updated since the last time I ran cron. Since I don't control any of these feeds, I can only wait until they update to see if they start working.

For the one feed that is fixed, the new post has only one link in Aggregator, but the old posts have two links-- one with the correct URL and one with the old, incorrect URL. It seems that Aggregator thinks these are two separate posts because of the different URLs, although I assume they will expire normally, so this will only be a transitory problem.

I will keep my eye on my remaining four problem feeds, and update here once I determine that the #20 fix definitely has or has not worked for them.

michielu’s picture

Patch works for me, when I want to import phpBB feeds.

canishk’s picture

Subscribing...

Feeds are not showing the actual characters. Need a solution for feeds generated from vBulletin.

canishk’s picture

Status: Needs review » Active

Hi,

Some special html characters from the vBulletin feed are not showing correctly. Special html characters like “ ” and some French characters are displayed as squares. Can you please look into this issue? The vBulletin has the latin7 charset and I believe Drupal uses utf8. Let me know if there is anything I can do for a quick fix.

Thank You.

Anish
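
One possible approach to the charset mismatch, sketched under the assumption that the feed's source charset is known in advance ("latin7" is taken here to mean ISO-8859-13). The function name is hypothetical, this is not something aggregator.module does, and it relies on PHP's iconv extension:

```php
<?php
// Hypothetical pre-processing step (not part of aggregator.module):
// convert feed bytes from a known source charset to UTF-8.
function sketch_feed_to_utf8($data, $source_charset) {
  $converted = iconv($source_charset, 'UTF-8', $data);
  // iconv() returns FALSE on failure; fall back to the original bytes.
  return $converted === FALSE ? $data : $converted;
}

// "\xE4" is "ä" in ISO-8859-13 (Latin-7).
$utf8 = sketch_feed_to_utf8("K\xE4se", 'ISO-8859-13');
echo bin2hex($utf8), "\n";
```

If the feed declares its encoding in the XML prolog, that declared value would be the natural thing to pass as $source_charset; guessing the charset is much less reliable.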

Rewted’s picture

Title: Strange Aggregator behaviour with Blogspot feeds / Atom feeds not pulled » kernel.org rss.xl feed only lists 1 of 9 items.

http://www.kernel.org/kdist/rss.xml only lists 1 of 9 items, was reported by me back in 2009 http://drupal.org/node/616206

dddave’s picture

Title: kernel.org rss.xl feed only lists 1 of 9 items. » Strange Aggregator behaviour with Blogspot feeds / Atom feeds not pulled

...and I closed it as a dupe of this one. Please don't change the title of an established issue unless you provide some reasoning.

veteporlasombra’s picture

If you want to import Blogger feeds using the Drupal 6 Aggregator (core module) then you must use the RSS feed (just add "/feeds/posts/default?alt=rss" to the end of your blog's URL). If you don't then the aggregator will list posts correctly, but the links will all be to the same post.

from http://pblog.ebaker.me.uk/2009/02/blogger-feeds-and-drupal-6-aggregator....
or: http://drupal.org/node/98398#comment-469530

ParisLiakos’s picture

Priority: Critical » Normal

tjaoht’s picture

First off this comment pertains to Drupal 6.

For me the issue was in the function aggregator_element_start() of /modules/aggregator/aggregator.module. It was incorrectly detecting $element.

To diagnose you can change this:

    case 'LINK':
      if (!empty($attributes['REL']) && $attributes['REL'] == 'alternate') {
        if ($element == 'ITEM') {
          $items[$item]['LINK'] = $attributes['HREF'];
        }
        else {
          $channel['LINK'] = $attributes['HREF'];
        }
      }
      break;

to this:

    case 'LINK':
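      // Temporary debugging output; remove it once the enclosing element is known.
      // isset() avoids undefined-index notices for <link> tags without attributes.
      drupal_set_message($name . ' - ' . (isset($attributes['REL']) ? $attributes['REL'] : '') . ' - ' . $element . ' - ' . (isset($attributes['HREF']) ? $attributes['HREF'] : ''), 'error');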
      if (!empty($attributes['REL']) && $attributes['REL'] == 'alternate') {
        if ($element == 'ITEM') {
          $items[$item]['LINK'] = $attributes['HREF'];
        }
        else {
          $channel['LINK'] = $attributes['HREF'];
        }
      }
      break;

Then on your site navigate to /admin/content/aggregator and update the feed. Look at all of the lines containing:
LINK - alternate -

Then you'll just want to extend the if to include whatever is after the last dash on those lines...in my case it was SUMMARY so I changed the code to read:

    case 'LINK':
      if (!empty($attributes['REL']) && $attributes['REL'] == 'alternate') {
        if ($element == 'ITEM' || $element == 'SUMMARY') {          // <-- this line
          $items[$item]['LINK'] = $attributes['HREF'];
        }
        else {
          $channel['LINK'] = $attributes['HREF'];
        }
      }
      break;
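
A self-contained sketch of why extending the if helps. It assumes a simplified handler in which entering <summary> or <content> overwrites $element, so a <link> that follows inside the same <entry> no longer sees 'ITEM'. The function sketch_parse_links() and its $item_elements parameter are invented for this illustration and are not the module's real code:

```php
<?php
// Simplified model of the handlers discussed in this thread; it is NOT the
// real aggregator.module code, and the function name is made up.
function sketch_parse_links($xml, array $item_elements) {
  $state = array('element' => '', 'item' => -1, 'items' => array(), 'channel' => array());

  $start = function ($parser, $name, $attributes) use (&$state, $item_elements) {
    switch ($name) {
      case 'ENTRY':
        $state['element'] = 'ITEM';
        $state['item']++;
        break;
      case 'SUMMARY':
      case 'CONTENT':
        // Entering a child element replaces the "inside an item" marker.
        $state['element'] = $name;
        break;
      case 'LINK':
        if (!empty($attributes['REL']) && $attributes['REL'] == 'alternate') {
          if (in_array($state['element'], $item_elements)) {
            $state['items'][$state['item']]['LINK'] = $attributes['HREF'];
          }
          else {
            $state['channel']['LINK'] = $attributes['HREF'];
          }
        }
        break;
    }
  };
  $end = function ($parser, $name) use (&$state) {
    if ($name == 'ENTRY') {
      $state['element'] = '';
    }
  };

  // Case folding is on by default, so tag and attribute names arrive uppercased.
  $parser = xml_parser_create();
  xml_set_element_handler($parser, $start, $end);
  xml_parse($parser, $xml, TRUE);
  return $state;
}

// Blogger-style entries: <summary> appears before the alternate <link>.
$feed = "<feed xmlns='http://www.w3.org/2005/Atom'>"
  . "<link rel='alternate' href='http://example.blogspot.com/'/>"
  . "<entry><summary>One</summary>"
  . "<link rel='alternate' href='http://example.blogspot.com/one.html'/></entry>"
  . "<entry><summary>Two</summary>"
  . "<link rel='alternate' href='http://example.blogspot.com/two.html'/></entry>"
  . "</feed>";

$before = sketch_parse_links($feed, array('ITEM'));
$after = sketch_parse_links($feed, array('ITEM', 'SUMMARY'));

echo count($before['items']), ' ', $before['channel']['LINK'], "\n";
echo count($after['items']), ' ', $after['items'][0]['LINK'], "\n";
```

Under those assumptions, the first call stores no item links and leaves the channel link pointing at the last entry, while the second call gives each entry its own link and keeps the channel link on the blog's front page.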

Status: Active » Closed (outdated)

Automatically closed because Drupal 6 is no longer supported. If the issue verifiably applies to later versions, please reopen with details and update the version.