Import.module: modifying to create a more sophisticated aggregator archive

jibbajabba - October 21, 2003 - 18:51

I've been thinking about feed aggregation inside a company intranet and am developing some ideas about how company weblogs can be aggregated and archived. It seems to me that this might be possible with import.module and node.module. I haven't really created Drupal modules on my own yet (except for one tiny module that makes my home page show recent weblinks, blogs, and images in separate lists), but I am hoping to develop the concept around import.module and perhaps get some suggestions on how to proceed.

My concept:
A business weblog aggregator is different from a client-side news feed reader. A business weblog aggregator would need to...

1) archive collected weblog entries for long-term storage and retrieval

2) provide a richer set of metadata for each entered record. A minimal set of metadata might include: author, title, publisher, URL, subject. Some of these metadata elements should be entered in an automated fashion.

3) provide some semi-automated means of classification via a controlled vocabulary (taxonomy), e.g. where terms occurring in the blog entry text (title, description) are compared with synonyms, phrases, or more complex boolean expressions that can be mapped to terms in the controlled vocabulary.
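
To make points 2 and 3 a bit more concrete, here is a rough sketch in Python rather than Drupal code. It is only an illustration under stated assumptions: the `Entry` record, the `parse_item` and `suggest_terms` helpers, and the sample vocabulary are all hypothetical names invented for this example, not part of import.module. It fills the metadata it can from a feed item automatically and suggests taxonomy terms by matching synonyms and phrases in the title and description. Boolean expressions are not covered.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary: taxonomy term -> synonyms and phrases.
VOCABULARY = {
    "Knowledge management": ["knowledge management", "km", "intranet"],
    "Syndication": ["rss", "aggregator", "feed", "syndication"],
}


@dataclass
class Entry:
    """Minimal metadata set for one archived weblog entry."""
    title: str
    url: str
    description: str
    author: str = ""      # stays empty when the feed item has no <author>
    publisher: str = ""   # supplied per feed, not per item
    subjects: list = field(default_factory=list)


def parse_item(item_xml, publisher=""):
    """Fill the metadata that can be read automatically from one <item>."""
    item = ET.fromstring(item_xml)

    def text(tag):
        # findtext returns None for a missing element, so fall back to "".
        return (item.findtext(tag) or "").strip()

    return Entry(
        title=text("title"),
        url=text("link"),
        description=text("description"),
        author=text("author"),
        publisher=publisher,
    )


def suggest_terms(entry, vocabulary=VOCABULARY):
    """Return vocabulary terms whose synonyms occur in the title or description."""
    haystack = f"{entry.title} {entry.description}".lower()
    suggestions = []
    for term, synonyms in vocabulary.items():
        for synonym in synonyms:
            # Whole-word match, so "km" does not match inside another word.
            if re.search(r"\b" + re.escape(synonym.lower()) + r"\b", haystack):
                suggestions.append(term)
                break
    return suggestions
```

In this sketch an editor would review the output of `suggest_terms` before saving it into `subjects`, which keeps the classification semi-automated rather than fully automatic.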

Now, while I think I might be able to handle feature 1 above, I am thinking that, as a non-programmer, there is no way I can do 2 and 3. So before investigating this further, I want to get your ideas about the possibility of doing this in Drupal.

-michael

not far off

moshe weitzman - October 22, 2003 - 12:29

the import.module suite in Contrib is quite close to achieving this. The most recent version is here.

this is quite plausible in drupal. this module hasn't been in active development for a while. hopefully someone will take an interest.

Breyten's module?

jibbajabba - October 22, 2003 - 13:33

Thanks, Moshe. Is this Breyten's module that I just noticed mentioned in drupal-dev? I'm starting to watch that list again, albeit in digest mode. From drupal-dev:

7. Improve Drupal's news aggregator; save news items as nodes, add support for (not)-Echo, RSS 2.0, etc.

Breyten implemented an import module that stores news items as nodes. It turns out that this isn't ideal (yet). Kristjan implemented a basic atom or not-echo module and we have added 's to our RSS feeds (RSS 0.92, not RSS 2.0).

yes, that is the one

moshe weitzman - October 22, 2003 - 13:38

please post your bug and feature list to the import project on drupal.org.

Excellent.

jibbajabba - October 22, 2003 - 14:07

Thanks, Moshe. I will be installing this on a new Drupal system soon to try it out.

Other import.module

Anonymous - October 22, 2003 - 14:27

Hello,

sorry to be anonymous, I still haven't had the time to register (lazy?).

Well, I'm rather new to Drupal, and have just made some experiments on my home Linux box.

I've been thinking for a while on the same idea.
I've installed the "other" import.module, but although it adds good new features, it's a bit annoying in other respects: the cron part, and the missing feature of the standard import.module that aggregates several feeds in groups.

I cannot promise too much at the moment, but I might support the development of a more sophisticated aggregator.

Francesco

[OT] now registered to Drupal...

francesco - October 22, 2003 - 14:32

not a big deal indeed...

 
 
