Item doesn't show link to source

funana - June 4, 2007 - 18:53
Project:Leech
Version:5.x-1.8
Component:leech
Category:bug report
Priority:normal
Assigned:alex_b
Status:postponed (maintainer needs more info)
Description

Hi,

I noticed that a lot of items have no backlinks to the source articles. I really could not figure out why they are published without backlinks. The XML sources are valid and contain the links.
At first I thought it had something to do with the length of the item body, because the items with the missing links are very short. But then I realized that I have some feeds that work fine with items that contain only a few words in the body...

Anyone with the same problem or a possible solution? May this effect be caused by another module, not by leech itself?

Any help appreciated!

#1

funana - June 12, 2007 - 15:51

Same problem on an updated 5.1 site. No links to item sources are displayed... I really don't understand this. Can anybody help me? Am I just too stupid, or am I missing something essential?
Any help is greatly appreciated!

#2

funana - June 12, 2007 - 16:08

Forgot to say that I have "Use source link" checked. Some items of the same feeds contain source links, some don't...

#3

alex_b - June 12, 2007 - 17:27
Assigned to:Anonymous» alex_b
Status:active» postponed (maintainer needs more info)

"Use source link" checked is fine.

Can you post a feed URL that produces source links and a feed URL that does not?

Can you check in your DB table "leech_news_item" whether the items without source URLs have a URL in the link column?
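
The check asked for above could be run as a query like the following. This is only a sketch: it assumes no table prefix is configured, the column names are the ones from the leech_news_item schema posted later in this thread, and the node ID 123 is a placeholder, not a value from this issue.

```sql
-- Items whose stored link is empty. The link column is NOT NULL,
-- so a missing link would be stored as an empty string.
SELECT nid, fid, guid, link
FROM leech_news_item
WHERE link = ''
ORDER BY nid DESC
LIMIT 20;

-- Inspect one specific item that shows no source link on the site.
-- 123 is a placeholder node ID; replace it with an affected node.
SELECT nid, fid, link, source_link
FROM leech_news_item
WHERE nid = 123;
```

If the second query returns no row at all, the item has no stored feed information, which is a different situation from a row with an empty link.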

#4

funana - June 12, 2007 - 19:03

Can you post a feed URL that produces source links and a feed URL that does not?

The problem is that items with missing source links and items which contain them both come from the same feed URL!

Can you check in your DB table "leech_news_item" whether the items without source URLs have a URL in the link column?

Yes, everything seems to be stored correctly... Every item got a link.

Alex, I wonder if it could be a problem that's caused by any other module... Could that be?
And another strange thing: I have a Drupal 5 site with Leech 5.x-1.x-dev and it works very well. No problems, every item has a backlink to the source article.
So I have the problem on every 4.7 site and on Drupal 5.1 with Leech 5.x-1.7.

Thank you.

#5

alex_b - June 13, 2007 - 20:20

Strange that it's stored but not displayed. Have you installed devel and looked at the feed item on the object tab? Does the leech_news_item object show up on nodes that don't display their source links? Does the leech_news_item object have a link property?

#6

funana - June 14, 2007 - 15:15

Thanks for your support. I will install devel and check it out.

#7

funana - July 8, 2007 - 19:45
Version:4.7.x-1.6» 5.x-1.8

Hi Alex,

Sorry for the delay... I installed Devel and compared an item containing the source link with an item which doesn't.
The item with the missing source link does not contain the "leech_news_item" link!

What do I have to do now?

Greetings,

Funana

#8

alex_b - July 12, 2007 - 14:19

Is the entire $node->leech_news_item property missing?

I imagine that the node we are talking about was leeched with a pre-1.8 version of leech, correct? Up to revision 1.4.2.28 there was a problem when leech did not finish during a cron run or when several cron runs executed leech in parallel. In these cases, nodes were created but leech did not store feed information for them. The effect was duplicate node items and probably also what you're describing.

Did this issue occur also with nodes created by the 1.8 version?
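
One way to look for nodes that were created without stored feed information is a query along these lines. It is a sketch under assumptions: no table prefix, the standard Drupal `node` table, and a node type name of 'leech_news_item', which is a guess that must be replaced with whatever type the feed items actually use on the site.

```sql
-- Feed item nodes that have no matching row in leech_news_item.
-- ASSUMPTION: 'leech_news_item' stands in for the real node type
-- of leeched items; check the type column of your node table.
SELECT n.nid, n.title, n.created
FROM node n
LEFT JOIN leech_news_item i ON i.nid = n.nid
WHERE n.type = 'leech_news_item'
  AND i.nid IS NULL
ORDER BY n.created DESC;
```

Comparing the `created` timestamps of the returned nodes with the date of the upgrade to 1.8 would show whether the newer version still produces them.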

#9

funana - July 12, 2007 - 14:31

Is the entire $node->leech_news_item property missing?

Yes!

I imagine that the node we are talking about was leeched with a pre-1.8 version of leech, correct? Up to revision 1.4.2.28 there was a problem when leech did not finish during a cron run or when several cron runs executed leech in parallel. In these cases, nodes were created but leech did not store feed information for them. The effect was duplicate node items and probably also what you're describing.

Did this issue occur also with nodes created by the 1.8 version?

Yes. It occurs with the 1.8 version. The duplicates have also been created.

#10

alex_b - July 13, 2007 - 11:14

Do you call wget with the -t 1 option?
http://drupal.org/node/150972

#11

funana - July 13, 2007 - 17:22

Sorry, I don't understand... The link you posted is a patch for 4.7?!
Please give me more information. I would really like to shed some light on this...

#12

alex_b - July 18, 2007 - 12:13

The link I posted is a patch for 4.7.x, 5.x and 6.x. It is actually just a patch to the readme file.

What you need to do is call your cron.php with the -t 1 flag of wget. Otherwise wget retries the call to cron.php if it does not return in time. It typically does not return in time if you've got lots of feeds and leech takes too long to finish. Those repetitions can produce duplicates in leech.

#13

funana - July 19, 2007 - 18:46

Duplicates are one thing, and I understand what you mean. But this doesn't help with the missing backlink bug... I had no crons running and no feeds active, then added a new feed which is valid, and no links appear. That's still the problem...

#14

alex_b - July 19, 2007 - 20:06

This is really strange. I just set up http://leech5.devseeddev.com/ and everything works fine...

Did you try a standard theme such as Garland?

#15

funana - July 21, 2007 - 16:58

Tried it, no success. Still no source links.

#16

alex_b - July 23, 2007 - 12:35

You probably do have some interaction with another module there. You said you're having this problem with every 4.7 install and with 5.x-1.8, but not on a 5.x dev site. Well, 5.x-1.8 and 5.x dev are practically identical (check out the CVS to see the differences). On top of that, I know the 4.7 leech versions really well (we are running several 4.7 projects here), and I never had this problem...

Alex

#17

funana - July 29, 2007 - 15:41

Hey Alex,

Could you do me a favour and try this feed on your test install?
http://www.nik-o-mat.de/index2.php?option=com_rss&no_html=1

I tried to disable all other modules and still had the same effect.
Once again I feel like I am missing something really obvious here...

Thank you!

#18

alex_b - July 30, 2007 - 17:15

#19

funana - July 31, 2007 - 12:37

Thank you very much Alex.

Although I still can't isolate the source of the error, I want to let you know that I really appreciate all your support and help with this issue. It's great to be supported that way and I still love this module :)

I know I'm being a pain, but how do you think I could solve this issue? Can I uninstall the leech.module and delete the leech MySQL tables without losing all the content? If so, I would try that and install the new version. Maybe this would solve the issue...

Thanks.

#20

alex_b - July 31, 2007 - 13:24

funana,

Can you send me the database schema (just the CREATE TABLE statements) of all your leech tables?

#21

funana - August 2, 2007 - 12:40

CREATE TABLE `leech` (
  `nid` int(10) NOT NULL default '0',
  `url` text NOT NULL,
  `refresh` int(10) NOT NULL default '0',
  `checked` int(10) NOT NULL default '0',
  `modified` int(10) NOT NULL default '0',
  `etag` varchar(255) NOT NULL default '',
  `mime` varchar(255) NOT NULL default '',
  PRIMARY KEY  (`nid`),
  UNIQUE KEY `url` (`url`(255))
) TYPE=MyISAM;

CREATE TABLE `leech_news_feed` (
  `nid` int(10) NOT NULL default '0',
  `template` int(10) NOT NULL default '0',
  `logo` varchar(255) NOT NULL default '',
  `link` text NOT NULL,
  `author` varchar(64) NOT NULL default '',
  `items_guid` tinyint(2) NOT NULL default '0',
  `items_status` tinyint(2) NOT NULL default '0',
  `items_update` tinyint(2) NOT NULL default '0',
  `items_delete` int(10) NOT NULL default '1000000000',
  `items_promote` int(10) NOT NULL default '1000000000',
  `items_date` tinyint(2) NOT NULL default '0',
  `links_display_mode` tinyint(3) unsigned NOT NULL default '0',
  PRIMARY KEY  (`nid`)
) TYPE=MyISAM;

CREATE TABLE `leech_news_item` (
  `nid` int(10) NOT NULL default '0',
  `fid` int(10) unsigned NOT NULL default '0',
  `link` text NOT NULL,
  `author` varchar(60) NOT NULL default '',
  `guid` varchar(255) NOT NULL default '',
  `source_link` varchar(255) NOT NULL default '',
  `source_xml` varchar(255) NOT NULL default '',
  `source_title` varchar(128) NOT NULL default '',
  PRIMARY KEY  (`nid`),
  KEY `fid` (`fid`),
  KEY `link` (`link`(255)),
  KEY `guid` (`guid`)
) TYPE=MyISAM;

CREATE TABLE `leech_opml` (
  `nid` int(10) NOT NULL default '0',
  `template` int(10) NOT NULL default '0',
  PRIMARY KEY  (`nid`)
) TYPE=MyISAM;

Thank you!

#22

alex_b - August 3, 2007 - 16:58

OK. I just saw that you're missing some columns in the leech table. This should be the schema:

CREATE TABLE `leech` (
  `nid` int(10) NOT NULL default '0',
  `url` text NOT NULL,
  `refresh` int(10) NOT NULL default '0',
  `checked` int(10) NOT NULL default '0',
  `modified` int(10) NOT NULL default '0',
  `etag` varchar(255) NOT NULL default '',
  `mime` varchar(255) NOT NULL default '',
  `adaptive` tinyint(4) default '0',
  `news_last_arrived` int(10) default '0',
  `avg_btw_news` float default '0',
  `deviation` float default '0',
  `num_of_tests` int(10) default '0',
  PRIMARY KEY  (`nid`),
  UNIQUE KEY `url` (`url`(255))
);

Please add the missing fields to your table and then tell me how it worked.
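
For reference, the columns that differ between the two `leech` definitions above could be added with a statement like this. It is a sketch, not an official upgrade path: it assumes no table prefix, and it is worth backing up the database first, since the module's own update routine may be the intended way to apply schema changes.

```sql
-- Add the five columns present in the expected schema
-- but absent from the posted one. Types and defaults are
-- copied from the CREATE TABLE statement above.
ALTER TABLE `leech`
  ADD COLUMN `adaptive` tinyint(4) default '0',
  ADD COLUMN `news_last_arrived` int(10) default '0',
  ADD COLUMN `avg_btw_news` float default '0',
  ADD COLUMN `deviation` float default '0',
  ADD COLUMN `num_of_tests` int(10) default '0';
```

Running `DESCRIBE leech;` afterwards shows whether the table now matches the expected column list.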

BTW: from which version of leech did you upgrade? This schema was already part of the first leech release for 5.x (1.6).

#23

skomdra - September 3, 2007 - 08:11

Hi,
I think I may have the same problem, so I checked the DB.
My 'leech' table is exactly as you described, but the problem remains: just the duplicated items show up without a link to the original article. I can't believe that cron causes this problem. Is it possible that the feed itself causes it? My other leech feed, which comes from FeedBurner, shows no problem. On the other hand, other readers, for example the Google front page, show the content of the problematic feed without any trouble. The only difference between the two feeds I publish on my site is the feed type: the FeedBurner one is RSS 2.0 and the Blogger one is Atom. So I am confused, please help.
Drupal 5.2 Leech 1.8.
Thanks

#24

Aron Novak - September 11, 2007 - 17:20

skomdra, can you give us an exact pair of feeds, I mean the following:
http://url/to/feed/which/is/problematic - http://url/to/feed/which/is/ok
If you provide this, we can try to reproduce your problem. Without it, we cannot take effective steps to solve the issue.
"I can't believe that cron causes this problem" - you cannot be sure about that. How many feeds are there on your site? What is the maximum execution time of a PHP script?

 
 
