Have RSS feeds from News aggregator work with Drigg
rek2 - December 24, 2007 - 06:38
| Project: | Drigg |
| Version: | 5.x-1.x-dev |
| Component: | Code |
| Category: | feature request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | closed |
Description
I would love to be able to pull RSS feeds with News aggregator and have them posted automatically
in the upcoming section, with the option to disable this once I get real users submitting stories.
Is this already possible somehow? I bet there could be a fast hack to do this, I know this is possible for the blog module..
Thanks

#1
Hi,
This would be a nice option. However, it's gonna have to be marked as postponed, since it's not planned right now.
I will work on it right away for $500.
If somebody submits a patch, however, the patch will be most welcome!
Sorry, I don't mean to be mean, but this is a secondary feature and there are tons of other important features that will need to be implemented first.
NOTE: this will eventually be implemented, even without bounty. The bounty will speed development up and will make my developer's life easier.
Bye,
Merc.
#2
No problem, that's understandable. Last night I dug a bit into the module myself and was able to make this work halfway,
but I'm not there yet.
When you have an RSS feed, you usually get an icon to blog the feed item on your blog. I modified it so that it talks to
Drigg and fills in the fields to submit a story.
I will let you know if I can get something else to work.
If it were 100 dollars I might be able to help... but 500... :-(
#3
#4
The site techmeme.com I believe has done this - and very successfully. They also add all related stories links - so it looks like it searches through rss feeds, filters according to keywords, and displays. Anyone else have some insight?
#5
I'm actually writing the patch myself. I already have it implemented in the Drigg admin menu: it lists the aggregator feeds you have, if any, and
lets you choose which ones should have their items automatically added as Drigg nodes, and under what username, tag, etc.
#6
Hi,
Fantastic!!!
Please send the patch my way -- I will be happy to review it and place it in the module!
Bye,
Merc.
#7
Hi,
Any news on this patch...?
The module is evolving a little... please make sure the patch applies to the latest version of Drigg -- or, even better, release it as an extra module. I am willing to maintain it for you, if it's reasonably well coded and if you want me to.
Merc.
#8
Hi, sorry, I was doing some projects for work and put this aside for a week. Now I am back to it. I am going to
download the latest stable release and work from there. I will see whether my changes work on it, and if not I will start from scratch.
Thanks.
Chris F.
#9
I am interested in sponsoring this but I need some specific features for the RSS parser.
I'd like it to scrape the first image (if any) from the feed (not just an image enclosure, anything in the html of the feed content), make sure it is not an ad and then store it on the local server to then be processed by imagecache or similar. A reference to the image would be stored with the node data in the database.
The RSS parser needs to be very flexible and perhaps have feed parsing wrappers like the Aggregator module. I have a heap of regexes for filtering out ads reliably, but if we were able to build our own feed wrappers, maybe the Drigg feed parser would not have to worry about implementing some of this stuff, as users could do it themselves.
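To make the request above concrete, here is a minimal Python sketch (not Drupal code) of picking the first non-ad image out of a feed item's HTML. The `AD_PATTERNS` list and the function name `first_content_image` are hypothetical placeholders; a real site would plug in its own ad-filtering regexes, and downloading or storing the image is left out.

```python
import re
from html.parser import HTMLParser

# Hypothetical ad patterns, for illustration only.
AD_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"doubleclick", r"/ads?/", r"adserver")
]


class _ImageCollector(HTMLParser):
    """Collects the src attribute of every <img> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def first_content_image(item_html):
    """Return the first image URL that matches no ad pattern, or None."""
    collector = _ImageCollector()
    collector.feed(item_html)
    collector.close()
    for src in collector.sources:
        if not any(pattern.search(src) for pattern in AD_PATTERNS):
            return src
    return None
```

The returned URL is what would then be fetched, stored locally and referenced from the node.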
I am running a site now which uses lots of cobbled together modules to do a similar thing to Drigg/Digg/Reddit. I rely hugely on RSS feeds for the up-to-date info, and the voting is more of a sideline than a killer feature. However, Drigg's related nodes, clean URLs without aliases and various other stuff are really useful to me.
One thing that would be a killer feature for Drigg is if it were as content agnostic as it was vote agnostic, then I could keep my current feed aggregator feeding in items and Drigg would still process them. But alas, it seems that Drigg only likes content that has been submitted through the form.
#10
Hi,
About this:
>One thing that would be a killer feature for Drigg is if it were as content agnostic as it was vote agnostic, then I could
>keep my current feed aggregator feeding in items and Drigg would still process them. But alas, it seems that Drigg only
>likes content that has been submitted through the form.
Yep, sorry, Drigg _needs_ to work on "news". If you look at the code, you will see why... it would need to be _much_ more complicated than what it is, and MUCH bigger, to work on any type of node.
The features you are requiring seem to be very time-consuming in terms of development. Also, it would be nice to make it contents-agnostic; that is, it should work with Drigg but also with any odd module.
Are you willing to pay for it _seriously_? (I am often asked for features which require 30 to 50 hours of development time, and offered $200 for them... so, I am just making sure)
Merc.
#11
Hi Merc,
I am a little confused by your reply, not sure if you are saying what I am asking for is possible or not.
I think what I am really after is something that can take any feed or user-submitted content, treat both as "news", and filter them using the vote algorithms, the related content/content suggestion algorithms, etc. I think it would be brilliant to then be able to add other node types to the system at will. Say I make an image gallery: I'd like to be able to let the image nodes be voted on and processed by the Drigg system.
I can very much imagine how this would be a *LOT* of work, and kind of drifting away from the Drigg system. Perhaps what I am really looking for is something so far removed from what Drupal does that it might be worth looking into making it from scratch.
S.
#12
Hi,
I think you might be right... It's definitely quite far from what Drigg does. However, I think Drupal (with extra_voting_forms and other modules) could be the way to go...
Merc.
#13
Hi Merc,
I am looking for this: using Drigg on any versatile content type.
Considering the rising standard that is the FeedAPI module (feedapi_node to be specific), I believe that making Drigg deal with any local content type should be enough to channel content from RSS input to User_karma. Am I wrong?
(Do not hesitate to tell me if I am not clear enough)
If you agree with this Drigg-FeedAPI compatibility: what would be the sponsorship price? [seriously]
Thanks for this huge module :-)
Alexandre
#14
Hi,
Drigg will only ever work with the content type from drigg_node.module -- sorry.
The best way for you to discover why is to actually try and change the source code so that it isn't.
To actually grab things from a feed, we need an extra Drigg module that does the job for Drigg -- probably using the FeedAPI. This will give it a lot of flexibility etc.
I would be REALLY happy to be proven wrong about this.
I am not sure how long it would take me to code it. Probably around $500? But, it might be up to $1000.
The best thing would be to get a bunch of users interested, and share the expense. However, I have been finding this a little hard to organise.
Bye!
Merc.
#15
Hi Merc,
I tried: in every SQL query (drigg_module), I replaced 'drigg' with the list of every type chosen in extra_voting_forms, using a little hack. It took me 2 hours to discover that the code is _really_ tied to the drigg type!
I can send you the file if you are curious about this - but it was just a testing/discovering phase.
So I agree with you about the code :-)
And because I am a better integrator than a coder, I come to you because I really need this, and I believe that the author is best placed to do the best job of this.
What are we talking about here: US dollar or Australian dollar? (I am from France, Europe)
Let me know about all this.
Alexandre
#16
Hi Merc,
Sorry for the bounty: I found my way out :-D
Here is the solution: what was really in my way was in fact the _REQUIRED_ status of the URL field.
So here is how to use FeedAPI_node to get RSS in:
1 - Change the URL field's required setting from 'true' to 'false'.
2 - Drop the URL as Primary Key in the database (drigg_node table).
And... that's it!
No more duplicate-URL error blocking the way: here come the feeds!
Concerning FeedAPI_node: select the drigg type as the content type in which to create your content.
Of course, you can still create a usual Digg-like node with URL data.
Let me add this: not having to enter a URL now allows me to use your Drigg machine to receive proposals from visitors - in other words, Drigg can be used for conversation management :-D
In conclusion, I would say the required URL is wrong by design.
This requirement should be offered as an option in the drigg module, for people who want to enforce this usage.
Tell me what you think?
(this super-module is so cool :-)
Alexandre
#17
Hi,
Believe me, I am very glad you found a way! I am way overworked as I am right now...
However, it's not really sane for Drigg to drop the URL requirement. That's because:
* It's a key used for redirection if a person submits an existing story
* There mustn't be the same link twice in Drigg
Even when I expand Drigg so that links are not necessary, the "link" field will still be populated (with the node's url itself).
The right way of doing this, I think, is:
* Selecting several feeds
* Deciding which category and author should be given to each feed.
* Adding stories automatically in Drigg
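The three steps above could be sketched roughly as follows. This is a Python illustration only, not the Drigg or Drupal API: `FeedConfig`, `stories_from_items` and the item keys (`title`, `link`, `summary`) are hypothetical names, and the feed items are assumed to be already parsed into dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedConfig:
    """Per-feed settings: which category and author its stories receive."""
    feed_url: str
    category: str
    author: str


def stories_from_items(config, items):
    """Turn parsed feed items into story records ready to be saved."""
    stories = []
    for item in items:
        stories.append({
            "title": item["title"],
            "url": item["link"],  # the story link stays populated
            "summary": item.get("summary", ""),
            "category": config.category,
            "author": config.author,
        })
    return stories
```

Each selected feed gets one `FeedConfig`, and every refresh of that feed passes its new items through `stories_from_items`.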
Then, you get pretty much "automatic" sites. However, even with this there would be problems:
* You'd get loads, and loads, and loads of multiple stories
* If you add an HUB, the URL to the story will be the HUB's story rather than the story itself. This is known to be very annoying
* The summaries are nearly guaranteed to be crap most of the time
Bye,
Merc.
#18
Hi again Merc,
I forgot a detail: FeedAPI promotes the last n items and demotes the previous ones. This is not really compatible with Drigg: once stories are made popular, they disappear within 60 seconds or 5 minutes or... (depending on the cron setup).
To resolve this, we have to comment out the whole body of the function feedapi_node_feedapi_after_refresh
You should have:
function feedapi_node_feedapi_after_refresh($feed) {
  /*
  ...
  */
}
It should be OK, now.
Concerning this thread: http://drupal.org/node/205812 (Drigg and FeedAPI)
The issue comes from the required URL - not from FeedAPI.
> As you say :
When FeedAPI creates an item, it does not put the original article's URL in your "URL" field, but in another field.
When FeedAPI creates a second one, it finds that '' already exists and reports a duplicate (see Primary Key).
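The collision described above can be reproduced outside Drupal. Below is a minimal Python sketch using an in-memory SQLite table as a stand-in: the table name `story`, its columns, and the helper names are assumptions for illustration, not the real drigg_node schema.

```python
import sqlite3


def make_db():
    """Create an in-memory table whose primary key is the story URL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE story (url TEXT PRIMARY KEY, title TEXT)")
    return conn


def add_story(conn, url, title):
    """Insert a story; return False if the URL is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO story (url, title) VALUES (?, ?)", (url, title)
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this stand-in, the first item saved with an empty URL is accepted, and every later item with an empty URL is rejected as a duplicate, which matches the behaviour reported here.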
Hope it will help a few people.
I would like you to develop these lines:
Thanks for your time on this discussion.
Alexandre
#19
Hi,
What part is not clear in those lines?
Merc.
#20
Hi again,
What are you talking about exactly: loads, loads, loads... ? What kind of situation are you thinking of?
Drigg has nothing to load here - it is handled by FeedAPI. Am I missing something?
What is an HUB? Is it an acronym?
Why? How are summaries supposed to be built by Drigg?
Alexandre
#21
Hi,
------------------------
Then, you get pretty much "automatic" sites. However, even with this there would be problems:
* You'd get loads, and loads, and loads of multiple stories
What are you talking about exactly: loads, loads, loads... ? What kind of situation are you thinking of?
Drigg has nothing to load here - it is handled by FeedAPI. Am I missing something?
--------------------------
Situation: you get several feeds. A lot of them cover the same ground. You have, let's say, 10 or 15 different feeds. As a result, you end up with the same story multiple times. This gets quite drastic as the number of feeds increases.
Plus, you end up with the feed ads as well, which is quite annoying.
----------------------------------------
* The summaries are nearly guaranteed to be crap most of the time
Why? How are summaries supposed to be built by Drigg?
-----------------------------------------
A story's summary is what the user puts in. If you get the story's contents from RSS, you end up with lost formatting, weird tags, and sometimes even ads. In general, you're not gonna get something as good as a human submitting.
--------------------------------------
* If you add an HUB, the URL to the story will be the HUB's story rather than the story itself. This is known to be very annoying
What is an HUB? Is it an acronym?
--------------------------------------
A hub is a site that lists content rather than hosting it. Slashdot, Digg, Linuxtoday are hubs.
Situation: you add slashdot. Now, the "Story link" will point to the Slashdot story, rather than the "real" story. Result: somebody clicks on the story, and he's redirected to Slashdot's story rather than the "real" source.
These are not "real" showstoppers, but they are the reasons why it's just not very good to use RSS feeds to feed Drigg. It creates (fakely) busy pseudo-sites, rather than communities.
Bye,
Merc.
#22
I agree with you.
But these are content management issues, not technical issues.
These are important matters to handle: feed input should not be left open but managed.
When the site is launched, I will give you the URL so we can talk about it if you are interested.
What is true is that we are moving from node and user management to content and community management: that is so interesting to me!
Alexandre
#23
Hi, wondering how much it would take to get this done?
"Even when I expand Drigg so that links are not necessary, the "link" field will still be populated (with the node's url itself)."
#24
Hi,
philipsim, please open a separate feature request (with some details) and we'll talk over there.
I don't want to mess up this rss feature request with unrelated requests! :-D
The answer is "not much". It should only take 3-4 hours, between coding, testing, releasing.
Thanks,
Merc.
#25
Hi,
I must have not read the feature request well enough...
"I bet there could be a fast hack to do this, I know this is possible for the blog module.."
There was a "fast hack" to do this. It just took me all day to do it.
I hacked the aggregator form, so that it has extra options -- which are then used to configure the drigg_rss module. It's a simple module, less than 300 lines, which does pretty much everything.
It should be ready soon. However, it will need quite a bit of testing. It won't be production-ready for a while. But, it's going to be there...
Bye,
Merc.
#26
Hi,
I finished implementing this little monster. I have already committed the code to -dev.
Marking this as "closed", since it's done. It will be ready for testing tomorrow, once I release the latest Drigg.
Bye,
Merc.