If pubDate is 0 or empty, use current time
alex_b - October 29, 2009 - 14:01
| Project: | Feeds |
| Version: | 6.x-1.x-dev |
| Component: | Code |
| Category: | bug report |
| Priority: | normal |
| Assigned: | mcaudy |
| Status: | closed |
| Issue tags: | Novice |
Description
#611578: Feeds from PubMed not getting parsed reports that feed http://eutils.ncbi.nlm.nih.gov/entrez/eutils/erss.cgi?rss_guid=1RCQjEc-E... does not show items on Managing News. This is due to the fact that it does not have pubDates defined for the items.
Needs to be fixed on the parser level (SimplePie and Syndication Parser).
This is a novice task.

#1
I will work on this.
#2
alex_b: I added the following "if" statement after the original timestamp "get_date("U")" on line 48 of FeedsSimplePie parser.inc:
48 $item['timestamp'] = $simplepie_item->get_date("U");
// Set timestamp to current date if 0 or empty or NULL.
if ($item['timestamp'] == "") {
$item['timestamp'] = mktime(0, 0, 0, date("m"), date("d"), date("y"));
}
Everything seems to work fine now. I have successfully parsed feeds from PubMed, HubMed, and Science, all of which were given timestamps = 0 before, and would not display. They all are getting non-"0" (presumably current) timestamps now, and they all display properly.
I will create a patch to submit later, and then also do the corresponding changes for FeedsSyndicationParser.inc.
Thanks for helping me solve this problem, so that users can now parse most or all RSS/Atom feeds from PubMed and other scientific journals.
#3
alex_b: does this revision seem likely to you to be sufficient? Or, do you think anything more would be required?
#4
I'd love to see a patch for FeedsSyndicationParser and FeedsSimplePieParser (may be FeedsSyndicationParser does not need any).
#5
alex_b: I will do the patch for FeedsSimplePieParser, and then look at the FeedsSyndicationParser to see if it needs any changes. I will try to do this by tomorrow, but I have some other commitments today.
#6
Here is the patch for FeedsSimplePieParser, to correct the pubDate = 0 problem with the parser.
This is the first patch that I have done for a project other than my own, so you might want to just look at the diff file. It is a very simple addition of a single "if" statement, so you can probably tell if diff file is correct just by looking at it.
The parser now works correctly for a variety of feeds that were not working before, so the patch seems to be OK.
#7
Here is the patch file attachment. It did not upload for some reason.
#8
The approach seems solid. Some issues:
- Use empty() for the if clause http://php.net/manual/en/function.empty.php (Adjust comment accordingly)
- There is an indentation problem (each indentation should consist of 2 spaces)
- Use FEEDS_REQUEST_TIME instead of mktime() - instead of mktime you could use time() but even better is using the PHP constant FEEDS_REQUEST_TIME as it is cheaper than a call to time().
#9
alex_b: Thanks for your code improvement suggestions:
- I will use empty();
- I was aware of the 2 space indentation convention. I ran the diff in terminal and copied the output into TextMate without noticing that it added more spaces;
- I have googled for 'FEEDS_REQUEST_TIME' and also searched php.net and drupal.org. I cannot find this constant anywhere. Instead, I find the following, from drupal.org/node/429458:
"Drupal 7 is using a constant REQUEST_TIME instead of time().
if (!defined('REQUEST_TIME')) {
define('REQUEST_TIME', time());
}"
Is this good? If not, can you point me to where I can find 'FEEDS_REQUEST_TIME' defined?
#10
FEEDS_REQUEST_TIME is defined in feeds.module
#11
OK, I see 'FEEDS_REQUEST_TIME' now. Since you said it is a 'PHP constant', I thought it was defined more generally, and looked for it outside the feeds module, at php.net.
#12
Here is the patch with the revised use of "empty()" and "FEEDS_REQUEST_TIME" .
I have tested this with several new feeds from sources that were previously lacking timestamps, and the revised patch also works to set any empty timestamp values to the current time. Therefore, they display properly.
I next will test whether the FeedsSyndicationParser needs similar revisions, or whether it is working correctly regarding setting timestamps and displaying the feed items properly.
#13
Preliminary Results for FeedsSyndicationParser: when FeedsSyndicationParser (with term extraction) is used, properly named feed nodes are created, but no feed-items are added to the feeds_data_syndication table, and the web page shows:
HTTP - 500 Internal Server Error
The page you are attempting to load returned an error.
The properly named feed nodes can be seen in the node table, and in tracker, but there are no feed items created and saved in the feeds_data_syndication table.
This testing was done on an app with the patched SimplePieParser. This should not affect the results for the FeedsSyndicationParser, but it is possible that it does. When I have time later, I will create a new virgin Managing News install (also with Feeds alpha5) and test the FeedsSyndicationParser again, but I suspect the results - and errors - will be the same.
If the errors are reproduced, I will created a new issue for FeedsSyndicationParser.
#14
I committed #12 with minor tweak to comment. Thank you.
#13: right, this would be a new issue.
#15
Automatically closed -- issue fixed for 2 weeks with no activity.