We've isolated an issue with our installation of Drupal and version 5.x-1.7 of this module. When a feed contains the tag <guid isPermaLink="true">, Leech doesn't read the feed. With this tag removed from the feed, Leech works correctly. We don't have this issue with the previous version of Leech on a different installation of Drupal. An example of a failing feed can be found here: http://www.aph.gov.au/library/rssbd_feed.xml

Comment  File                          Size     Author
#10      leech_news_parser_php5.patch  1.04 KB  alex_b

Comments

f1shnch1ps’s picture

Just to reiterate, the suspect code in an RSS feed that stops Leech from reading that feed is <guid isPermaLink="true">

cburschka’s picture

I'm not involved with Leech and don't know what might be the problem, but could you be more specific?

<guid isPermaLink="true">, by itself, is not well-formed XML. Do you mean the full guid element, which is <guid isPermaLink="true">[permanent link to the item]</guid>? That is, is the tag closed, and does it contain an identifier? If it isn't, the tag mentioned above would break the XML format and probably throw off the parser.

Could you try setting isPermaLink to "false" instead of "true"? Perhaps the parser can't handle permanent links correctly. Or if the identifier between the guid tags is not a URL but some other string, isPermaLink should be "false" anyway...

I hope that if these don't solve the problem, it will make it easier for the Leech maintainers to narrow down the cause.

cburschka’s picture

Okay never mind, from the linked page it looks like the guid tag is closed and well-formed, taking as an example:

<guid isPermaLink="true">http://www.aph.gov.au/library/pubs/bd/2006-07/07bd166.pdf</guid>

So I have nothing more to contribute.

f1shnch1ps’s picture

We could not resolve this issue with our MySQL 5/PHP 5 installation so we reverted to MySQL 4 and PHP 4 which fixed the issue. Go figure.

alex_b’s picture

Interesting that reverting to PHP 4/MySQL 4 did the job. Could you explain where exactly Leech failed to "read" the feed? Does it happen right away when you create the feed?

keesje’s picture

Hi,

I seem to have exactly the same problem, and yes, it occurs right after creating the feed.
In my case both the link and guid elements are provided, like this:

<item>
	
	<title>Bloemen en koeken voor koning Albert II</title>
	<link>http://www.vrtnieuws.net/cm/vrtnieuws.net/nieuws/binnenland/070627_Koning</link>
	<guid>http://www.vrtnieuws.net/cm/vrtnieuws.net/nieuws/binnenland/070627_Koning</guid>
	<description>Gisteren is de koning geopereerd aan het dijbeen en de heup. Voorlopig is alles nog rustig aan de ingang van het Sint-Jansziekenhuis. </description>

	<pubDate>wo, 27 jun 2007 08:22:00 +0200</pubDate>	
	
</item>

This example is taken from this feed:
http://rss.vrtnieuws.net/cm/vrtnieuws.net/nieuws/hoofdpunten

I'm really looking forward to a solution. I found several feeds that don't work, as far as I can tell because of this issue.

keesje’s picture

Please remove the double post, sorry about that.

I nailed down the cause of the error: it's in the pubDate element.

This works fine:

Wed, 27 Jun 2007 11:09:59 GMT

This one not:

wo, 27 jun 2007 08:22:00 +0200

keesje’s picture

Version: 5.x-1.7 » 5.x-1.8

More details:

I'm using version 1.8.

Line 283 of leech_news_parser.inc:
" $item->date = strtotime($data[$feed->has_dates][0]['VALUE']); // strtotime() returns -1 on failure"
does NOT return -1; it returns an empty string! This might tie in with the earlier comment about PHP versions.

My (probably too quick) conclusion:
PHP's strtotime() function does not parse non-English date/time strings like this one. When it fails (at least in this case), it does not return -1.

My hack as a temporary solution:
Replace lines 291-293 in leech_news_parser.inc with:

  // Fall back to the current time when strtotime() gave no usable timestamp.
  if (strlen($item->date) < 3) {
    $item->date = time();
  }

I'm on:

Apache version :
Apache/2.0.55 (Win32)

PHP version :
5.1.2

Loaded extensions :
bcmath, calendar, com_dotnet, ctype, date, ftp, iconv, odbc, pcre, Reflection, session, libxml, standard, tokenizer, zlib, SimpleXML, dom, SPL, wddx, xml, xmlreader, xmlwriter, apache2handler, mbstring, curl, gd, ldap, mysql, mysqli, PDO, pdo_sqlite, SQLite

MySQL version :
5.0.18-nt

alex_b’s picture

Assigned: Unassigned » alex_b
Status: Active » Needs review
Status  File size
new     1.04 KB

keesje76, great bug reporting, thank you very much.

I just read that strtotime(), as of PHP 5.1.0, returns FALSE when the time string is not valid. Before that it returned -1. This makes sense, because a Unix timestamp can also be negative and could therefore legitimately have the value -1. http://se.php.net/strtotime

In the attached patch, I added a version check to the parsing process - can you apply the patch and test it on your PHP 5 environment?
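
For anyone reading along without the patch at hand, a version check of the kind described above might look roughly like this. It is only a sketch and not the contents of leech_news_parser_php5.patch: the function name example_parse_feed_date() is made up for illustration, and falling back to the current time is an assumption borrowed from the temporary hack posted earlier.

```php
<?php
/**
 * Hypothetical helper (not part of Leech): turn a feed date string into a
 * Unix timestamp, falling back to the current time if it cannot be parsed.
 */
function example_parse_feed_date($date_string) {
  $timestamp = strtotime($date_string);

  // strtotime() signals failure with FALSE as of PHP 5.1.0 and with -1 in
  // earlier versions, so choose the failure marker by running PHP version.
  $failure = version_compare(PHP_VERSION, '5.1.0', '>=') ? FALSE : -1;

  if ($timestamp === $failure) {
    // Assumption: an unparseable date is replaced by "now".
    return time();
  }
  return $timestamp;
}
```

The strict comparison (===) matters here, because a loose comparison would also treat a valid timestamp of 0 as a failure on PHP 5.1.0 and later.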

alex_b’s picture

Any news?

keesje’s picture

Shit, I missed your reply in my tracker!! (Are replies on issues tracked at all??)

I see 1.9 has been released in the meantime...

I'm still going to test the patch on 1.8.

keesje’s picture

The patch works in the situation described in the post above, thanks.

I'm looking into another issue now; this one seems to be solved.

alex_b’s picture

1.8/1.9 doesn't make a difference here. How are your tests going? I will commit the patch as soon as you give me the green light.

keesje’s picture

I copied the updated code from the patched 1.8 into 1.9, and it seems to work fine in my testbed. Later this month it's going to be tested more heavily in production use. So it's fixed as far as I'm concerned, thanks again.

alex_b’s picture

Version: 5.x-1.8 » 5.x-1.x-dev
Status: Needs review » Fixed

This is committed to 5.x dev. Thank you guys for getting to the bottom of this one. http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/leech/leech...

Anonymous’s picture

Status: Fixed » Closed (fixed)