As per

http://lists.drupal.org/pipermail/development/2006-July/017414.html

I've moved the feed parsing code to it's own .inc file.

At this point is pretty much a straight cut and paste job.

In classic drupal style, the new API function drupal_retrieve_feed returns a nested array of channel information and items.

I've reworked aggregator.module to make use of this functionality.

This patch probably sits somewhere between "needs work" and "needs review." But it is a first attempt to get the ball rolling.

There are some todo's in the code and this doesn't have any of the feed generation stuff I talked about in the mailing list thread, but one thing at a time.

Ciao,
-Mark

Comments

budda’s picture

I've done this in the feedparser module (currently work in progress in CVS).

http://cvs.drupal.org/viewcvs/drupal/contributions/modules/feedparser/

I've not added a refactored aggregator.module just yet, but it's on the list of things to do.

Eventually it would be good to move the parsing inc file into Core for use by everything. The plan is to expand the .inc file to handle ATOM feeds in addition to RSS.

dries’s picture

I'm going to put this patch on hold for 1-2 days. I'm going to create a subdirectory for each of the modules first. Then the aggregator.module and aggregator.inc can live in the same directory.

eaton’s picture

Dries: do you think that it's a good idea to keep the aggregator.inc and aggregator.module tightly coupled? It seems that one of the key advantages of separating them is allowing other modules to completely replace aggregator.module without reimplimenting the parser. Moving this file into /includes means that it can be considered a 'given' for other modules to build on without requiring that the existing aggregation UI, etc be present.

walkah’s picture

Dries - I think feed parsing shouldn't be dependent on aggregator... general feed (RSS & Atom) parsing (as well as feed writing!) have applications in contrib outside of aggregator...

my $0.02CAD - but excited to see this moving forward.

dries’s picture

Sure, moving it to includes makes sense too. :)

mfredrickson’s picture

@budda: I did look at your module before starting this refactor. I decided to concentrate on the existing aggregator code as that has the benefit of core level scrutiny. Of course, it could use improvement! In the interests of combining effort, would you be interested in working on this .inc file when it hits? Let's get it into core first, and then we can start working on improvements. (Hint: I'm hoping for a patch review. :-)

budda’s picture

I've not actually touched much/if any of the parsing code i moved out of the aggregator.module.

I got the feeling such activity was a bit too radical for Core so soon so went about putting it into a contrib module first. I'm also going to be working with Drewish on this - he has nominated himself for ATOM support upgrades to the parser.inc file.

eaton’s picture

I agree with that approach: if we split out the inc file, efforts to improve the pure parsing side of things can focus on the core include file The need to include separate parsing libraries in other projects is exactly why this patch helps.

drumm’s picture

Status: Needs review » Needs work

Looks okay. I'm fine with feed.inc being put in the includes directory. But we should only include it as necessary (when feeds are parsed). And there are a couple lines comment out for posterity's sake. We don't care about posterity.

mfredrickson’s picture

what's the best way to include it then? generally, we don't have modules include /includes/*.inc files.

Should I put the parsing logic in feed.inc and the API call in common.inc?

Also, I was anticipating adding feed generation functions eventually. But one patch at a time...

LAsan’s picture

Version: x.y.z » 7.x-dev

How's the work around this patch code?

Moving to cvs.

mustafau’s picture

Status: Needs work » Needs review
StatusFileSize
new29.81 KB
new29.81 KB

I have updated the patch for CVS HEAD.

Changes are:

- Moved drupal_retrieve_feed() into common.inc.
- Implemented tests for feed.inc.
- Change _feed_parse_feed() to feed_parse().
- Rename aggregator_parse_feed() to aggregator_save_items().

Two of the tests fail because parser can not parse "<link href="http://example.org/"/>".

moshe weitzman’s picture

There is a new aggregator module in patch review. The current one is not long for this world. Please review the other if possible. Thanks.

mustafau’s picture

I am aware of the new aggregator patch. I have done my part of the review for it.

This is another approach to improving the current aggregator. If somehow new aggregator can not make it's way into the core this will be here as a resort.

mustafau’s picture

StatusFileSize
new29.28 KB

Updated the patch.

Improvements over #12:
Better error handling.
Support for alternate feed parsing libraries.
Test cases merged into one.

dries’s picture

Status: Needs review » Fixed

This one was committed to CVS HEAD earlier today. Thanks all.

mustafau’s picture

Status: Fixed » Needs review

@Dries: This patch is not committed. The one that was committed is: #16282: OPML Import for Aggregator

dries’s picture

Oops, this wasn't committed yet. Updated the wrong issue.

mustafau’s picture

Title: Move feed parsing to own .inc file, refactor aggregator.module » Move feed parser to includes/feed.inc
Category: task » feature
StatusFileSize
new28.79 KB

Re-roll.

mustafau’s picture

StatusFileSize
new28.82 KB

Latest patch is broken. Re-rolling.

mustafau’s picture

StatusFileSize
new29.65 KB

Added following functions to feed.inc

function feed_set_data($data = NULL) {}
function feed_get_data() {}
function feed_get_hash() {}
function feed_set_headers($headers = array()) {}
function feed_get_headers() {}
robin monks’s picture

Status: Needs review » Needs work

When trying to update a feed in aggregator after adding it:

* There is no new syndicated content from gmplanet.
* The feed from gmplanet seems to be broken, because of error " Invalid document end".

* notice: Undefined property: stdClass::$code in /s**s/development.**.ca/htdocs/d1/modules/aggregator/aggregator.module on line 503.
* notice: Undefined property: stdClass::$code in /s**s/development.**.ca/htdocs/d1/modules/aggregator/aggregator.module on line 504.

Needs work.

Robin

mustafau’s picture

StatusFileSize
new29.66 KB

Updated.

@Robin Monks: Can you try this one with another feed?

dmitrig01’s picture

Why aren't you using simpleXML?

robin monks’s picture

Status: Needs work » Needs review

True, since we're requiring PHP 5.2 or higher this version, SimpleXML is available for this. I agree that would be a good step, but, I don't see why it couldn't be done after this patch hits HEAD. I think adding any more complexity to this patch will make it miss D7.

I'll be able to test later this afternoon, thanks mustafau!

Robin

alex_b’s picture

Aron just posted a patch that replaces the existing parser with SimpleXML: http://drupal.org/node/302245 . It also moves parsing to an include file, but does not move it to includes/ - yet.

robin monks’s picture

Status: Needs review » Reviewed & tested by the community

#26: That patch is back to code needs work right now, they probably need a hand over there ;-)

Anyways:

This patch has been tested and WORKS! http://robinmonks.com/feed worked perfectly. Great work mustafau! This patch is now RTBC. I'd personally like to see this go in first, then get ported to SimpleXML, but, the powers that be will decide. This patch also survived the "aggregator" simpletest, 207 passes, 0 fails, 0 exceptions.

Robin

robin monks’s picture

Also note, this is a task, not a feature request AFAIK.

robin monks’s picture

Category: feature » task
mustafau’s picture

Component: aggregator.module » base system
StatusFileSize
new30.39 KB

With more documentation.

robin monks’s picture

Status: Reviewed & tested by the community » Needs work

Sorry, just found a new bug, if I update the same feed twice in a row I get this:

* notice: Undefined property: stdClass::$items in /srv/w*a/htdocs/d1/modules/aggregator/aggregator.module on line 472.
* notice: Undefined property: stdClass::$code in /srv/w*a/htdocs/d1/modules/aggregator/aggregator.module on line 504.
* notice: Undefined property: stdClass::$code in /srv/w*a/htdocs/d1/modules/aggregator/aggregator.module on line 505.

* There is no new syndicated content from Robin Monks' Blog.
* The feed from Robin Monks' Blog seems to be broken, because of error " Invalid document end".

This is a fresh head install, no previous data to mess anything up.

Robin

alex_b’s picture

#27 - Indeed it would be good to move first on this patch and then on SimpleXML - don't know how compatible the two approaches are - curious to see Aron's response here.

aron novak’s picture

StatusFileSize
new32.07 KB

I unified the two patches (#30 and SimpleXML for agg). aggregator.module tests are ok for me. Only one parser feed fails (no DC ns support currently for channel elements). Please test it.

SeanBannister’s picture

sub

mustafau’s picture

Status: Needs work » Needs review
StatusFileSize
new19.88 KB

I needed to kill some time. This is the result.

Very simplistic. Just move some functions from aggregator.parser.inc to include/feed.inc. Now we have a general feed parser and a way to override default parser.

Status: Needs review » Needs work

The last submitted patch failed testing.

lilou’s picture

Status: Needs work » Needs review
mustafau’s picture

Version: 7.x-dev » 8.x-dev
klausi’s picture

#35: feed_inc-72915-35.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, feed_inc-72915-35.patch, failed testing.

fabianx’s picture

Is this still an issue?

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

klausi’s picture

Issue summary: View changes
Status: Needs work » Closed (outdated)

This is now all in classes and probably no longer relevant.