import rss feeds as nodes
bb - October 24, 2002 - 13:15
hi,
i'm looking for a way to import rss feeds as nodes (linked to specific terms of the taxonomy) into my community. i've come across several posts discussing this matter and was wondering about the status quo and whether someone is already working on this.
one post mentioned a module called rss_html, but i wasn't able to find any further information about it.
thanx,
bb

Needs to be done
You are right, see http://www.drupal.org/node.php?id=468
If nobody beats me to it
If nobody beats me to it, improving the aggregation module will be my next project after the node api, download page update, project module upgrade and phpdoc integration into Drupal. Too bad the code just doesn't write itself.
--
Kjartan
I've done a bit of work on RS
I've done a bit of work on RSS into nodes in the LiveJournal import module I've written. Email me. tom@rowan.me.uk
done - needs more review
see breyten's wonderful import module. please post your comments/bug fixes as this module is nearing CVS readiness. the idea is to replace the current import module with these modules.
no luck here
I've been trying to get this to work, but for some reason it won't add a feed correctly. It just never completes the task.
I've been working with the feed.module for about an hour now, and I can't figure out why it's not adding the feed I submit. No errors anywhere.
Any suggestions are welcome, though I realize this is still very much in development.
John
One Solution
Here's how I've added RSS feed bundles as nodes:
WARNING - This is not a pretty solution....
1.) Create RSS bundle of feeds
2.) Note the number of the bundle: go to "news by topic", click "more" for the bundle you just created, and read the bundle number from the URL (e.g. import/bundle/3)
3.) Create Taxonomy for your news pages
4.) Create book page node for news page
5.) Set this to PHP instead of HTML
6.) Paste the following code into the Body (change the $bid to the number of your bundle)
$bid = 3;  // number of your bundle (from step 2)
$where = array();
$output = "";
$blog_icon = "";
// Load the bundle and print its heading.
$bundle = db_fetch_object(db_query("SELECT * FROM bundle WHERE bid = %d", $bid));
echo "<h1>News</h1><h2>$bundle->title</h2>";
// Match items against each comma-separated keyword in the bundle.
$keys = explode(",", $bundle->attributes);
foreach ($keys as $key) {
  $where[] = "i.attributes LIKE '%". trim($key) ."%'";
}
$result = db_query_range("SELECT i.*, f.title AS ftitle, f.link AS flink FROM item i, feed f WHERE (". implode(" OR ", $where) .") AND i.fid = f.fid ORDER BY iid DESC", 0, variable_get("import_page_limit", 75));
while ($item = db_fetch_object($result)) {
  // Offer a "blog it" icon to users who maintain a personal blog.
  if (module_exist("blog") && user_access("maintain personal blog")) {
    $blog_icon = " ". l("<img src=\"". theme("image", "blog.gif") ."\" alt=\"". t("blog it") ."\" title=\"". t("blog it") ."\" />", "node/add/blog&iid=$item->iid", array("title" => t("Comment on this news item in your personal blog."), "class" => "blog-it")) ." ";
  }
  if ($item->title) {
    $output .= "<h3>$item->title</h3>";
  }
  $output .= "<p>";
  if ($item->description) {
    $output .= "$item->description";
  }
  $output .= "<br><a href=\"$item->link\">Read More</a> | <a href=\"$item->flink\">$item->ftitle</a> $blog_icon</p>";
}
// Note: $header is never set in this snippet, so the first box has no body.
theme("box", $bundle->title, $header);
theme("box", t("Latest news"), $output);
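For readers who want to follow the selection logic without the Drupal database layer, here is a minimal sketch in Python (not Drupal code) of what the query above appears to do: split the bundle's attributes on commas and keep every item whose attributes contain at least one of those keywords, newest first. The `Item` class, the `items_for_bundle` name and the sample data are hypothetical. Skipping empty keywords and matching case-insensitively are assumptions made for the sketch, not something the snippet above does explicitly.

```python
from dataclasses import dataclass


@dataclass
class Item:
    iid: int          # item id; a higher id means a more recent import
    title: str
    attributes: str   # comma-separated keywords attached to the item


def items_for_bundle(bundle_attributes, items, limit=75):
    """Return the items matching any bundle keyword, newest first."""
    # Split the bundle's keyword list. Empty keywords are skipped
    # (assumption), because an empty keyword would match every item.
    keys = [k.strip().lower() for k in bundle_attributes.split(",") if k.strip()]
    matched = [
        item for item in items
        if any(key in item.attributes.lower() for key in keys)
    ]
    # Mirrors "ORDER BY iid DESC" plus the row limit of the query.
    matched.sort(key=lambda item: item.iid, reverse=True)
    return matched[:limit]


items = [
    Item(1, "Kernel release", "linux, kernel"),
    Item(2, "New CMS version", "drupal, php"),
    Item(3, "Gardening tips", "plants"),
]
news = items_for_bundle("drupal, linux", items)
print([item.title for item in news])
```

Because the match is a plain substring test, a keyword such as "art" would also match an item tagged "party"; the LIKE '%...%' clauses in the snippet behave the same way.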
some eval of items as nodes module in contrib
For the last few days, I ran the
import module in contrib which stores RSS items as nodes. Great concept.
And I'm glad that I installed it. But I'm going to have to remove it until it's
more mature. Still, I'm really looking forward to using it in the future.
With that in mind, below is some feedback which I hope will be of help in its
further development. One thought that informs this analysis is that a news aggregator
which can promote feed items into the regular posts is a novel concept; it's
important to consider that to successfully avoid confusion among less frequent
site users, these nodes are going to have to be very clearly demarcated.
I never successfully got the login process to work correctly as described in
the install. As a workaround, I created a user with the correct permissions
and hard coded the user's name and password into the code where feed.module
checks the user. At this point, the module updated the feeds regularly according
to the cron settings.
effect updates were having on site content. The cron run overrides any changes
that I might have made to an individual node, such as adding a teaser break
comment to change the way that the item displays or tagging the item manually
with taxonomy terms. This is an interesting issue since on the other hand,
if an incoming item has changed on the originating site, then Drupal should
update it. And it's good that site admins cannot easily change text incoming
via RSS; those texts are copyrighted and controlled by the originating site
and should, I think, be controlled by them. This avoids potential intellectual
property and misrepresentation problems.
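One way to picture a middle ground between the two concerns above is to let the feed overwrite only the fields it owns and leave locally managed fields alone. The following is a minimal Python sketch of that idea, not a description of how feed.module works; the field names, the `FEED_OWNED` tuple and the `apply_feed_update` function are all hypothetical.

```python
# Fields controlled by the originating site (assumption for this sketch).
FEED_OWNED = ("title", "body", "link")


def apply_feed_update(node, incoming):
    """Return a copy of node with feed-owned fields refreshed from the feed.

    Anything not listed in FEED_OWNED, such as manually assigned taxonomy
    terms, is carried over unchanged.
    """
    updated = dict(node)
    for field in FEED_OWNED:
        if field in incoming:
            updated[field] = incoming[field]
    return updated


node = {
    "title": "Old title",
    "body": "Old text",
    "link": "http://example.com/1",
    "terms": ["politics"],   # tagged by hand by the site admin
}
incoming = {
    "title": "New title",
    "body": "New text",
    "link": "http://example.com/1",
}
merged = apply_feed_update(node, incoming)
print(merged["title"], merged["terms"])
```

Note that a teaser break added by hand lives inside the body text, so under this split it would still be lost on update; keeping it would need a separate locally owned field.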
it in place of where a site user is normally specified as the author with
each post. Otherwise, Drupal uses "Anonymous." From a user's standpoint,
since the node doesn't clearly indicate the originating site, this could be
confusing, especially when the incoming feed is a community site where many
different authors could be included in the feed items. I think it would be
better to use the RSS feed title (which can be modified by the admin). If
an author is specified in the feed, that is additional information which should
be included somehow either preceding or following the node content.
implementation, the node gets two "read more" links, one going to
the full node view and one back to the post on the original site (this is
mentioned in BUGS in contrib). One quick solution in the short term could
be to change the link to the original site to "original post."
actually be more cumbersome than the current box display for the core import
module. When comparing both, I much prefer the more compact version in the
standard news aggregation display for my daily news reading. Perhaps this
should be the default still for import.module, then there should be some other
way of accessing the node listing view. Easy enough for a site admin to do
as it is now by merely tagging all items with a taxonomy term and then using
the taxonomy url.
feeds, some of which are pretty active. All of the item nodes are picked up
by tracker module. View recent posts is fairly useless with incoming RSS items
greatly outnumbering the updates to the site from site users. Seems that there
might need to be a way of specifying which node types tracker module tracks.
option.
problem for feeds which are promoting their feeds, or have been promoted by
the admin, to the front page. One would assume that front page promoted items
are deemed important by the site admin and should probably still be included
in tracker displays and notifications. One solution might be to convert items
that are promoted to stories, rather than listing them as items. Of course,
then they would no longer show up in item displays. Or the tracker and notification
modules could be configured to include posts on the front page and any other
node types specified. Sort of a difficult issue whichever way that it's looked
at.
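The configuration idea above (track only chosen node types, but always include anything promoted to the front page) can be sketched as a simple filter. This is a Python illustration under assumptions, not tracker module code: the `tracker_listing` function, the dictionary keys and the "item" type name for imported feed items are hypothetical.

```python
def tracker_listing(nodes, tracked_types):
    """Keep nodes of a tracked type, plus any node promoted to the front page."""
    return [
        node for node in nodes
        if node["type"] in tracked_types or node["promoted"]
    ]


nodes = [
    {"nid": 1, "type": "story", "promoted": False},
    {"nid": 2, "type": "item", "promoted": False},   # ordinary feed item
    {"nid": 3, "type": "item", "promoted": True},    # feed item on the front page
    {"nid": 4, "type": "forum", "promoted": False},
]
recent = tracker_listing(nodes, tracked_types={"story", "forum"})
print([node["nid"] for node in recent])
```

With this rule a promoted feed item stays an item (so it still shows up in item displays) and is nevertheless visible in recent posts.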
should be stored as users, not nodes. The obvious advantage is that permissions
could then be applied to feeds. This potentially allows permissions to be
set which would allow Drupal sites to exchange nodes and have them posted
as the specific node type: forums, stories, events, images, etc. Or allow
taxonomy terms to be applied. Meanwhile, it would also automate the process
of having the feed name listed in the author field link directly to a feed
information page.
Any thoughts?
Alternative approach
Thanks for sharing your experience.
For the sake of curiosity: would life be better if news items were not nodes, but there were a "nodify" link (so to speak)? The "nodify" link would transform an 'RSS news item' into a 'news item node'.
One could either transform all news items to nodes or just a subset like the news that you want your users to comment on or that you feel is worth promoting to the front page. For one, the tracker page and the node administration page would be less polluted, and the notify module would not need to be altered.
Furthermore, Drupal could add a 'synchronize news item'-link to the node when it detects that the news item has been altered in the RSS feed.
I haven't really given this much thought, I'm just thinking out loud, but I think this approach could counter quite a few of the shortcomings you mentioned. Give it some thought.
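To make the "nodify" and "synchronize" ideas a little more concrete, here is a minimal Python sketch, assuming the node remembers a fingerprint of the news item it was created from. None of this exists in Drupal as far as this thread shows: the `fingerprint`, `nodify` and `needs_sync` names, the dictionary keys and the default "story" node type are all hypothetical.

```python
import hashlib


def fingerprint(item):
    """Hash the parts of a news item that should trigger a re-sync when changed."""
    text = item["title"] + "\n" + item["description"]
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def nodify(item, node_type="story"):
    """Create a node from a news item; the item itself stays in the aggregator."""
    return {
        "type": node_type,
        "title": item["title"],
        "body": item["description"],
        "source_link": item["link"],
        "source_hash": fingerprint(item),  # remembered for later comparison
    }


def needs_sync(node, item):
    """True when the feed's copy no longer matches what the node was built from."""
    return node["source_hash"] != fingerprint(item)


item = {
    "title": "Release announced",
    "description": "Short teaser text.",
    "link": "http://example.com/release",
}
node = nodify(item)
print(needs_sync(node, item))

# Later the feed delivers an edited version of the same item.
item["description"] = "Short teaser text, now corrected."
print(needs_sync(node, item))
```

Only items that someone has nodified would carry this bookkeeping, which keeps the cost of watching for changes limited to the promoted subset.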
re: Alternative approach
I think you are right. I think for many sites, keeping the functionality of the existing import module while creating a method for promoting certain items to nodes would be the best way to go. Specific feeds could be tagged for conversion automatically and there could be a radio button or checkbox on the Administration » content syndication » news aggregation » tag news items menu.
And instead of thinking along the lines that converted nodes should be promoted to the front page automatically, maybe they should be converted to nodes and then be subject to the default content publishing settings. This would better parallel the existing node posting/publishing system and process that sites are already following. Some sites might want those nodes to enter moderation, or might not want comments, to be consistent with posts by members on site. Then, once they are converted, an admin could always change those settings on a per node basis, in the same way that they can be changed now.
But then which type of nodes? It almost seems like there should be a choice between stories or forum topics since these are the two community posting node types available with Drupal. Choosing one or the other would exclude different community sites. Incorporating that thinking also leaves room for eventually specifying the node type in incoming feeds from other Drupal sites.
Then, what should be done on the import page? Should promoted nodes still be listed as news items? Or should there be a placeholder with a link to their node url? I'm in favor of a link with the news item to the node.
As for synchronizing imported nodes, I almost think this should be a global default setting. Admins/Drupal communities are probably going to want to control this depending on their own views about depublishing. And, in many cases, an incoming RSS feed item is itself a teaser. So in most cases, the converted node becomes a permanent item in the site, and provides opportunity for commenting, moderation, taxonomy, etc., but users are typically still going to need to follow the feed item link to the original post on the originating site (does this make sense?). Now it might be nice to have an update notice which appears with the node when the original post has been updated, but of course this creates overhead for Drupal to watch the nodes and feed items, and can this be watched indefinitely?
One other thought. Probably good to have a global switch in the aggregator admin to be able to ignore teasers. This plays into the idea that incoming RSS items are often teasers--why should they be reteasered again? Also would help usability since it eliminates that "read more" "read more" duplication if site admins wish.
unpublished nodes
I think you get the same effect (no bad interaction with Tracker and Notify) by importing new items as unpublished nodes.