The current RSS feed returns the entire text of the front page (or at least the teasers), including all HTML tags. For rapidly changing sites, this means that RSS polling is effectively downloading the entire front page over and over.

The attached patch adds two node configuration variables: one to strip the tags from the RSS descriptions, and a second to then also trim each description to just a lead-in substring. This can result in a 70-90% reduction in the size of /node/feed, for significant savings in bandwidth.
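
As a rough illustration of the two options described above, here is a minimal sketch in Python (not the patch itself, which is PHP inside Drupal). The function name `rss_description`, its parameters `strip` and `lead_in_chars`, and the regex-based tag stripping are all assumptions made for this example; the regex is a crude stand-in for PHP's `strip_tags()`.

```python
import re

# Crude tag matcher used only for illustration; it is not a full HTML parser.
TAG_RE = re.compile(r"<[^>]*>")


def rss_description(body, strip=True, lead_in_chars=None):
    """Return a feed description, optionally tag-stripped and trimmed.

    Both options are hypothetical stand-ins for the two configuration
    variables described in the issue.
    """
    text = body
    if strip:
        # Remove anything that looks like a tag, then collapse whitespace.
        text = TAG_RE.sub("", text)
        text = " ".join(text.split())
    if lead_in_chars is not None and len(text) > lead_in_chars:
        cut = text[:lead_in_chars]
        # If the cut lands inside a word, back up to the previous space.
        if not text[lead_in_chars].isspace() and " " in cut:
            cut = cut[:cut.rfind(" ")]
        text = cut.rstrip() + "..."
    return text


body = "<p>Hello <b>world</b>, this is a <a href='x'>story</a> teaser.</p>"
full = rss_description(body)
lead = rss_description(body, lead_in_chars=15)
```

Trimming only after stripping means the character budget is spent on visible text rather than on markup, and a cut can never land in the middle of a tag.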

There is one known bug: the new Textile module inserts a \001Textile\001 code at the top of the story, and this code is not stripped from the RSS description output (i.e., it is ignored by the tag-stripping function).

Comments

garym@www.teledyn.com

Oops ... this one may need more work. While it did work for 4.3.0, the code is not stripping the tags in CVS. I am investigating and will post an updated patch when I've solved it.

Bèr Kessels

Attachment: status new, size 3.48 KB

It does not work, because of line 712 in common.inc.

There, the content is passed through check_output(), so filters (e.g. BB) will re-insert HTML tags.
It would be best to call strip_tags() on the content _after_ check_output().
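
To show why the order matters, here is a small Python sketch (an illustration only, not Drupal code). `bbcode_filter` is a hypothetical stand-in for an output filter such as the BB one, and `strip_tags` here is a crude regex approximation of the PHP function of the same name.

```python
import re

# Crude tag matcher for illustration; PHP's strip_tags() is more thorough.
TAG_RE = re.compile(r"<[^>]*>")


def strip_tags(text):
    """Remove anything that looks like an HTML tag."""
    return TAG_RE.sub("", text)


def bbcode_filter(text):
    """Hypothetical output filter: turns [b]...[/b] into <b>...</b>."""
    return re.sub(r"\[(/?)b\]", r"<\1b>", text)


source = "A [b]bold[/b] claim"

# Stripping first finds no tags yet, so the filter re-inserts markup.
strip_then_filter = bbcode_filter(strip_tags(source))

# Filtering first and stripping last removes the generated markup too.
filter_then_strip = strip_tags(bbcode_filter(source))
```

In this sketch `strip_then_filter` still contains `<b>` tags, while `filter_then_strip` is plain text, which matches the behaviour reported above.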

A quick-n-dirty hack is attached.

Regards Ber

moshe weitzman

one poster says this patch needs more work, and the next offers a 'quick and dirty hack' ... removing from the patch queue.

Also, I personally see too much duplication between a 'teaser' and a 'lead-in'.

alexandreracine

Version: x.y.z » 4.5.0
Status: Active » Closed (fixed)

This is a very old version that is no longer supported.

In recent versions you can do just that with views.module.

Closing.