I tried searching for this, as I'm thinking I don't have some checkbox checked correctly to stop Link Checker from checking what aren't actual links.
# # #
When I run Link Checker, it reports URLs as bad (500, 400, etc.) that aren't actual links within the node.
Example Page:
http://winetest.fermentedreviews.com/twin_fin_2005_cabernet_sauvignon_ca...
"www.twinfinswines.com" on that page comes up as "php_network_getaddresses: getaddrinfo failed: Name or service not known." Other pages on the site with non-links of www.xyz.com come up with various other error codes, based upon whatever the server sends back.
I'm guessing that every text entry of a plain www.xyz.com within each node is being checked? And I'm just getting the normal resultant errors (the checked Server doesn't want to talk to HEAD, etc.)?
Is there a way to have Link Checker only check items that are enclosed within <a> type markup?
I currently have checked:
Scan node types for links: {All node types selected}
Scan comments for links
Scan blocks for links
Check full qualified domain names only*
Extract links in and tags
Extract links in
tags
All other settings are default.
Any pointers on what to change?
Thanks,
Sam
*Thought this would exclude plain text, but it didn't.
Comments
Comment #1
Michael-IDA commented
Hmm, I guess I don't have enough permission to edit my own posts. The above was eaten a bit by a filter, here's the entire post in a code block so things aren't truncated or missing.
Comment #2
hass commented
By default, the module runs a URL filter to find as many links as possible... I can add an extra checkbox to disable this feature, but I think it would be better to fix these links. Otherwise, since version 2.2 you can disable checking of an individual link. As an aside, why do you have links in your content that do not link to a real URL?
"php_network_getaddresses: getaddrinfo failed: Name or service not known." is a clear statement: the DNS lookup for the domain name has failed. Server error 500 often comes from misconfigured servers. Try again with the GET method. You can change this in "edit link settings".
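The "try again with GET" advice can be sketched roughly as follows. This is only an illustration in Python, not Link Checker's own code: `check_with_fallback`, `send`, and `fake_send` are hypothetical names, and the automatic retry is an assumption made for the example (in the module the method is changed per link under "edit link settings").

```python
from typing import Callable, Tuple


# Hypothetical helper: `send(method, url)` returns the HTTP status code.
# Injecting it keeps the sketch independent of any particular HTTP library.
def check_with_fallback(url: str, send: Callable[[str, str], int]) -> Tuple[str, int]:
    """Check a URL with HEAD first; retry with GET if HEAD returns an error status."""
    status = send("HEAD", url)
    if status >= 400:
        # Some servers reject or mishandle HEAD but answer GET normally.
        return "GET", send("GET", url)
    return "HEAD", status


# A fake server (assumption for the demo) that answers 500 to HEAD and 200 to GET.
def fake_send(method: str, url: str) -> int:
    return 500 if method == "HEAD" else 200


print(check_with_fallback("http://example.com/", fake_send))
```

A DNS failure such as "Name or service not known" happens before any HTTP method is sent, so switching from HEAD to GET would not help in that case.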
Comment #3
hass commented
Just to note: if "Check full qualified domain names only" is checked, only URLs like http://example.com/foo/bar are extracted. If you'd like to extract local paths like /node/1234, you need to uncheck this.
Comment #4
Michael-IDA commented
Comment #5
hass commented
Could you please do not add CODE tags around all your postings... not that easy to read.
Teach the authors that visitors of their site do *not* like to read DEAD links. They are looking for working weblinks. Everything else makes no sense on the internet.
Link checker was initially designed to find as many links as possible. If your users add broken / buggy links to the site, they need to disable link checking for all these broken links they do not want to fix. I'm not sure what the problem is... this is what link checker has been built for!? If you do not like to see *broken* links, turn the link checker module off. Nevertheless, I would suggest telling the authors that links are references to other sites that need to exist. If they do not, the option is to turn off link checking *individually* for these buggy links. It doesn't matter how many nodes the site has, as you can have thousands of links in one node.
The URL filter (note: this one converts "www.example.com" into real links) runs *always* on all content. Today there is no way to turn this behaviour off. Maybe in a future version.
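To illustrate why plain text such as "www.twinfinswines.com" ends up being checked, here is a rough Python sketch of what a URL filter of this kind does. The pattern and the `autolink` name are assumptions for the example, not Drupal's actual filter, and the sketch only handles plain text that contains no existing markup.

```python
import re

# Assumed pattern for illustration: a bare host starting with "www." that is
# not already part of a longer URL (no word character, "/" or "." before it).
BARE_WWW = re.compile(r"(?<![\w/.])(www\.[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")


def autolink(text: str) -> str:
    """Wrap bare www.* host names in an anchor tag, as a URL filter would."""
    return BARE_WWW.sub(r'<a href="http://\1">\1</a>', text)


print(autolink("Visit www.twinfinswines.com for details."))
```

Once the text has been turned into an anchor like this, a link checker that scans the filtered output has no way to tell it apart from a link the author typed deliberately.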
Comment #6
hass commented
#497096: Links generated by input filter
Comment #7
Michael-IDA commented
"Could you please do not add CODE tags around all your postings... not that easy to read."
At least YOU COULD READ them. Without code tags, Drupal.org was removing content. Which I said previously, "The above was eaten a bit by a filter, here's the entire post in a code block so things aren't truncated or missing."
"Teach the authors"
You don't work in the real world do you? Nor get paid to do a job? Nor read up on Google's BS with nofollow?
http://gazebo.commonplaces.com/2009/06/a-few-words-on-relnofollow-or-ple...
"PageRank that would flow to that link simply ‘evaporates’ when you make that link nofollow."
I understand this is a "free" effort on your part, and your module is a great idea, but I'm sorry you think your opinion of how others should operate hinders its use.
Comment #8
hass commented
1. You complained about non-links that were checked in the past. OK, that is no longer the case! Non-links are not extracted if the URL filter is not enabled for a format (since 2.3), and the URL filter can be globally disabled in the latest dev or the upcoming 2.4.
2. Now you talk about nofollow. What the heck does this have to do with linkchecker? Linkchecker verifies whether the links on your site are broken or not. It doesn't care about search engine stuff, and I currently have no plans to add exclusion of links with the rel nofollow attribute. It would be easy to add this in D7, but I believe it's pretty useless for what link checker is intended to do.
After you have upgraded to the latest version, press the button 'Analyze content for links' on the link checker settings page to clean up the old "Non-Links" stuff.
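The behaviour described in point 1 (bare host names are extracted only when the URL filter is enabled for the format) can be sketched like this. It is a minimal Python illustration under stated assumptions: `LinkCollector`, `extract_links`, and the `url_filter_enabled` flag are hypothetical names, and the module's real extraction logic may differ.

```python
import re
from html.parser import HTMLParser
from typing import List

# Assumed pattern for a bare "www." host name in plain text.
BARE_WWW = re.compile(r"(?<![\w/.])www\.[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags and the text found outside of them."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []
        self.plain_text: List[str] = []
        self._in_anchor = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor += 1
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor:
            self._in_anchor -= 1

    def handle_data(self, data):
        if not self._in_anchor:
            self.plain_text.append(data)


def extract_links(body: str, url_filter_enabled: bool) -> List[str]:
    """Return real anchors always; add bare www.* hosts only if the filter is on."""
    collector = LinkCollector()
    collector.feed(body)
    collector.close()
    links = list(collector.hrefs)
    if url_filter_enabled:
        for chunk in collector.plain_text:
            links.extend("http://" + host for host in BARE_WWW.findall(chunk))
    return links


body = '<p>Winery: www.twinfinswines.com. Review: <a href="http://example.com/foo/bar">here</a></p>'
print(extract_links(body, url_filter_enabled=True))
print(extract_links(body, url_filter_enabled=False))
```

With the flag off, only the explicit anchor is returned, which is the behaviour the original post asked for.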