By jonwatson
Hi All,
I am attempting to use the core Aggregator module to pull in a news feed from Google News. I have created the feed and it pulls items no problem, but it makes no attempt to interpret the HTML in the feed. Rather, it just prints it out verbatim so I get a page full of ugly gibberish.
Something like this:
font style="font-size:85%;font-family:arial,sans-serif"brdiv style="padding-top:0.8em;"img alt="" height="1" width="1"/divdiv class=lhtable border=0 align=right cellspacing=0 cellpadding=0cellpadding=3 style="font-size:100%;font-family:arial,sans-serif"trtd width=80 align=center style="padding-left:6px;" valign=topa href="http://news.google.com/news/url?sa=Tct=us/1-0i-0fd=Rurl=http://seattlepi.nwsource.com/national/1104ap_as_kashmir_elections.html%3Fsource%3Dmypicid=1280449413ei=C3xFSejQDpfcMZHOpYcPusg=AFQjCNE0eD4RX0pd614zD4V6jHEyH2XbHA"img src=http://news.google.com/news?imgefp=Dpb4Ilnio_8Jimgurl=seattlepi.nwsource.com/dayart/aponline/9839.101India-Britain.sff.jpg width=77 height=80 alt="" border=1brfont size=-2Seattle Post Intelligencer/font/a/td/tr/tablea href="http://news.google.com/news/url?sa=Tct=us/1-0-0fd=Rurl=http://www.hindu.com/2008/12/15/stories/2008121558931200.htmcid=1280449413ei=C3xFSejQDpfcMZHOpYcPusg=AFQjCNHKUs2BqBMe93k3YmIDjEYRc2un4A"b“Pakistan must ensure its soil is not used for terrorist activities”/b/abrfont size=-1bfont color=#6f6f6fHindunbsp;-/font nobr22 minutes ago/nobr/b/fontbrfont size=-1Paramilitary personnel stand guard in Srinagar on Sunday. Curfew was clamped in the wake of Prime Minister Manmohan Singh’s visit to Kashmir./fontbrfont size=-1a href="http://news.google.com/news/url?sa=Tct=us/1-0-1fd=Rurl=http://www.upi.com/Top_News/2008/12/14/Brown_urges_Zardari_to_break_terror_links/UPI-24221229276820
Does anyone know how to stop this and get readable text?
Thanks
Jon
Comments
I am having trouble with
I am having trouble with the Google news feed too. I get gibberish, and when I click on a headline I get a page redirect that doesn't work.
I quit messing around with it after a while and use Feedburner now.
HI I tried feeding this
Hi
I tried feeding this Google News feed through Feedburner as well, but with the same result. Is that what you mean when you say you use Feedburner now?
Thanks
I went to feedburner.com and
I went to feedburner.com and got their feeds instead of pulling them directly from Google. Go to my site http://www.digitalmania-online.com and check the news at the bottom of the home page. They are all from Feedburner.
Turns out this isn't just a
Turns out this isn't just a Google News issue. I've tried feeds from various sites and they all show up as gibberish when I use the categories taxonomy to look at them.
I can't be the only one experiencing this. Anyone?
Known bug
This is actually a libxml2 bug: http://bugs.php.net/bug.php?id=45996
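For anyone trying to recognise the symptom: the sample pasted in the original post looks like feed HTML with its <, > and & characters missing. Here is a rough sketch in Python (not Drupal or PHP code) that reproduces that pattern, assuming the parser simply loses the characters that the &lt;, &gt; and &amp; entities should decode to. The function names and the sample string are made up for illustration.

```python
import re
from html import unescape

# The three entities whose decoded characters appear to go missing.
ENTITY = re.compile(r"&(lt|gt|amp);")

def healthy_parse(escaped: str) -> str:
    """What a working parser hands back: entities decoded into markup."""
    return unescape(escaped)

def buggy_parse(escaped: str) -> str:
    """Simulate the symptom (an assumption): <, > and & are dropped entirely."""
    return ENTITY.sub("", escaped)

# Hypothetical feed description, escaped the way it would sit inside the XML.
sample = ("&lt;b&gt;Headline&lt;/b&gt;&lt;br&gt;"
          "&lt;font size=-1&gt;22 minutes ago&lt;/font&gt;")

print(healthy_parse(sample))  # <b>Headline</b><br><font size=-1>22 minutes ago</font>
print(buggy_parse(sample))    # bHeadline/bbrfont size=-122 minutes ago/font
```

If your aggregator output looks like the second line, with tag names run together and no angle brackets, you are most likely looking at the same problem and not at anything specific to one feed.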