Project:New Aggregator for Drupal core
Version:7.x-0.1-alpha1
Component:Code
Category:task
Priority:normal
Assigned:Unassigned
Status:closed (fixed)

Issue Summary

Hi Aron,

I reviewed the code and added @todo comments inline and rolled them as a patch. I have ignored functionality that's not built yet.

Here is a summary of the most important issues:

1) deduplication and feed item by feed item vs batch saving feed items

I see in aggregator_light_aggregator_process() that feed items are processed in batch and deduplication is therefore opaque to the aggregator module - it would be good if add on modules could change the deduplication algorithm in some way. A possibility I'm seeing is to process item by item and make deduplication hook_aggregator_process() callback. What are your thoughts here?

2) Installation

I'm aware of that you're planning a per default content type that's configured right out of the box when you install aggregator. But: the remaining problem is that where users had to activate a single module before, now they have to activate 3: aggregator, aggregator_light and common_syndication - that's not ideal.

On top of that, we're having a tricky dependency here: aggregator does not depend on common_syndication parser, but it depends on _a_ parser - enabling it without any parser doesn't make sense.

Is there any new functionality in D7 that gives us a handle on this? I also think that we don't need to solve this right away, because it should be easy to factor in at a later point in development.

3) aggregator_light name

I think we could find a better name here. Should we call this aggregator_simple?

4) We should be able to make do without a settings hook - see my comment inline.

Code looks very clean. Great work so far.

AttachmentSize
aggregator_todo.patch6.97 KB

Comments

#1

1)
I don't like the idea of adding another $op. I suggest this:
"Another option, but less attractive, could be to make it the processor's responsibility to provide an API for deduping to other modules." - from the patch. This could be a faster way: passing the items together makes sense. (as for the parser)
2)
I see a way now to resolve this nasty dependency stuff: put parser_common into aggregator . But this is definitely not nice. I have to think about better solutions :)
3)
Whatever. It's easy to refactor.
4)
Yes, i agree, this was my fault. It will be fixed.

#2

1) Here's what's on my mind in regards to overridable deduping:

* If deduping is an extra callback step, saving becomes a little more complex and slower. Complexity could be negotiable. I wonder how much slower we become.
* If deduping is the processor's responsibility, every processor needs to implement their own deduping API. This can lead to inconsistencies. We would have to at least implement two API's - not nice.
* Both cases (deduping API in aggregator, deduping API in processor) could amount to the same performance penalty.
* For implementing modules different API's mean more complexity. On the other hand, implementing modules need to distinguish between which processors are in place because their deduplication algorithms will depend on them.

2) Let's keep syndication_parser seperate (i think we've discussed this). Check out my latest commits, I have added some configuration validation: http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/new_aggrega... and http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/new_aggrega...

3) Let's keep aggregator_light as name for the time being.

#3

Title:Review of aggregator alpha 1» Review 1 of aggregator alpha 1

Distinct name from 2nd review. http://drupal.org/node/282267

#4

Status:active» fixed

The issues are fixed or included in review3.

#5

Status:fixed» closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.

nobody click here