Technorati-like RSS aggregator in Drupal?
dcoder - October 13, 2007 - 00:54
I need to build an RSS aggregator for a site with more than 6000 weblog feeds. Some feeds are broken, some never update, and some are very active.
Is it possible to make a Technorati-style aggregator site (with social networking bits) in Drupal? Will the aggregator2 module do it?
How can I aggregate this many feeds? Can I set up several cron jobs and 'chop' the aggregation into parts?
Any ideas or comments on how to do this will be greatly appreciated.
THANKS!
Nicolai

yes, you can but
Of course you could do it with Drupal, but with that many feeds you will very quickly reach a point where you will have to:
- dig deeper into what happens
- understand and measure what happens
- optimize what happens
- start over again
With 6000 feeds your cron jobs will probably time out because of PHP's execution time limit, so you will have to think about an efficient update and storage solution. You certainly do not want to update 6000 feeds over and over again just because one feed with broken syntax breaks the whole process.
Drupal can still be a good foundation for your project, but you get much more with Drupal than just an aggregator. This can be an advantage if you would like to have more, but it can also be a disadvantage, as you might not need features that will eat time and memory.
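To make the batching idea a bit more concrete, here is a minimal sketch in Python (not Drupal code, and not how aggregator2 works). It assumes a hypothetical `fetch` callable that returns the XML text for a URL, processes only one slice of the feed list per run, stops when a time budget is used up, and records a broken feed instead of letting it abort the run. The names `run_batch`, `batch_size` and `time_budget` are made up for illustration.

```python
import time
import xml.etree.ElementTree as ET


def parse_titles(xml_text):
    """Return the item titles of an RSS 2.0 document.

    Raises xml.etree.ElementTree.ParseError on broken XML.
    """
    root = ET.fromstring(xml_text)
    return [item.findtext("title", default="") for item in root.iter("item")]


def run_batch(feed_urls, fetch, start=0, batch_size=50, time_budget=20.0):
    """Update one slice of the feed list and report where to resume.

    fetch is any callable url -> XML text (hypothetical, e.g. a thin
    wrapper around an HTTP client). Returns (items, failed, next_start).
    """
    deadline = time.monotonic() + time_budget
    items, failed = {}, []
    pos = start
    end = min(start + batch_size, len(feed_urls))
    # Stop at the end of the slice or when the time budget is spent.
    while pos < end and time.monotonic() < deadline:
        url = feed_urls[pos]
        try:
            items[url] = parse_titles(fetch(url))
        except Exception as exc:
            # A broken feed is recorded and skipped, not fatal for the run.
            failed.append((url, str(exc)))
        pos += 1
    # Wrap around to the first feed once the whole list has been covered.
    next_start = pos if pos < len(feed_urls) else 0
    return items, failed, next_start
```

Each cron run would store `next_start` somewhere and pass it back in as `start` on the next run, so the 6000 feeds are covered in slices rather than in one long request.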
It all depends on what exactly you want your site to do.
thanks drupalista
Somebody told me about running a daemon process in the background. I'm also thinking about "selective parsing": parse the more frequently updated feeds more often and leave the rest for less frequent parsing, maybe using a 'parseable rating/index' or something?
If I do it with Drupal, can I just use aggregator2 for this (even if I start with some selected feeds rather than the whole bunch)?
Or maybe use something like Magpie/SimplePie and set up a separate application running in parallel? In that case, is Drupal's data model simple enough that I won't end up updating 8 tables manually just to be able to insert the items into the CMS?
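A rough sketch of what such a 'parseable rating' could look like, in Python and purely as an illustration: each feed keeps its own polling interval, which is halved when a check finds new items and doubled when it does not. The 15-minute floor, one-day ceiling and one-hour starting interval are assumptions, as are the names `FeedState`, `reschedule` and `due_feeds`.

```python
from dataclasses import dataclass

MIN_INTERVAL = 15 * 60        # assumed floor: 15 minutes
MAX_INTERVAL = 24 * 60 * 60   # assumed ceiling: 1 day


@dataclass
class FeedState:
    url: str
    interval: int = 60 * 60   # seconds between checks, starts at 1 hour
    next_check: float = 0.0   # Unix timestamp of the next scheduled check


def reschedule(feed, had_new_items, now):
    """Halve the interval if the feed had new items, double it otherwise."""
    if had_new_items:
        feed.interval = max(MIN_INTERVAL, feed.interval // 2)
    else:
        feed.interval = min(MAX_INTERVAL, feed.interval * 2)
    feed.next_check = now + feed.interval
    return feed


def due_feeds(feeds, now, limit):
    """Return up to `limit` feeds whose check time has passed, most overdue first."""
    due = [f for f in feeds if f.next_check <= now]
    due.sort(key=lambda f: f.next_check)
    return due[:limit]
```

With something like this, a daemon or cron job would call `due_feeds` on every run, parse only those feeds, and call `reschedule` for each one, so very active blogs drift toward the floor and dead ones toward the ceiling.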
thanks for your help, any idea will be greatly appreciated++
External Aggregator?
I'm thinking about a similar project and considering using Yahoo Pipes as the aggregator, which would then provide a single feed for my website. What I can't figure out is how to set up a live scrolling feed like the one they have on the Technorati front page.
thanks
Somebody told me about Pipes, but I don't know how I can integrate my code (and my users' registered feeds) with this service. Can I use its API to send over all the feeds, let Pipes aggregate and parse everything, and then receive just the final feed? Is this possible?
Thanks for this input, I think this could be the right direction (or at least the easiest and least expensive one).
i'm not a guru
I've just started playing with Pipes myself, but they seem to be able to do what you want. You might have to add input feeds manually, though. Or you can try employing an openkapow.com robot to do that. And don't ask me about the latter :) I just saw someone use it within their pipe to retrieve comments from LiveJournal posts, and it looks like it can do many other things.