Broken links containing spaces are not recognized, e.g. <a href="/one two">Link</a> or <a href="one two">Link</a>.

This applies to the dev-version as well.

Other than that, nice module.

Comments

pawel_r’s picture

+1

hass’s picture

Version: 6.x-2.4 » 6.x-2.x-dev

Is one of you able to fix the regexes in _linkchecker_extract_links() and provide a patch, please? This should be a D6-only bug.

hass’s picture

Version: 6.x-2.x-dev » 7.x-1.x-dev

valid_url() returns FALSE for links with spaces, whereas http://www.example.com/foo%20bar/foo works. Not sure how we can fix this.

hass’s picture

drupal_encode_path() also raw-encodes colons, anchors (#), question marks, and other characters. url() does not work either.

This may work, but I expect that more characters than just spaces are affected, and that there will be side effects. Damn hacks, I hate them. The current design may be wrong.

Line 1356 ff:

  $links = array();
  foreach ($urls as $url) {
    // Fully qualified URLs.
    // HACK: Encode spaces before validation so that valid_url() returns TRUE
    // and the link gets added.
    if (valid_url(str_replace(' ', '%20', $url), TRUE)) {
      $links[] = $url;
    }
    // Skip mailto:, javascript:, etc.
    elseif (preg_match('/^\w[\w.+]*:/', $url)) {
      continue;
    }
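
To illustrate the trade-off discussed in this comment, here is a minimal standalone sketch. It does not call Drupal; sketch_encode_path() is a hypothetical stand-in that assumes drupal_encode_path() behaves like rawurlencode() with slashes restored, and sketch_encode_spaces() mirrors the str_replace() hack from the excerpt. The example URL is made up.

```php
<?php
// Hypothetical stand-in for drupal_encode_path(): raw-encode the whole
// string, then restore the slashes. Assumption, not the module's code.
function sketch_encode_path($path) {
  return str_replace('%2F', '/', rawurlencode($path));
}

// Mirrors the hack from the excerpt: only spaces are encoded.
function sketch_encode_spaces($url) {
  return str_replace(' ', '%20', $url);
}

$url = 'http://www.example.com/foo bar/foo?x=1#top';

// The colon, question mark, equals sign, and hash are encoded as well,
// so the result is no longer a usable absolute URL.
echo sketch_encode_path($url), "\n";

// Only the space is touched; scheme, query, and fragment survive.
echo sketch_encode_spaces($url), "\n";
```

Under these assumptions, the first call mangles the scheme and query delimiters, which is why a path encoder cannot simply be applied to a full URL, while the space-only replacement leaves every other character as it was.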

hass’s picture

Funny, drupal_http_request() can request URLs with spaces...

hass’s picture

Status: Active » Needs review

hass’s picture

Title: Broken internal links with spaces are not found » Links with spaces are not extracted
Status: Needs review » Fixed

Decided to go with the hack for now.

hass’s picture

Title: Links with spaces are not extracted » Links with spaces are not validated / add link failed

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

drupal_jon’s picture

Version: 7.x-1.x-dev » 7.x-1.0-beta1

Hi,

Not sure whether to reopen this issue or refer to the similar D6 issue #1525146: Space in filename triggers 404. Although links containing spaces are now added to the report, the link check itself seems to return an incorrect 404 if the link contains a space.

hass’s picture

Version: 7.x-1.0-beta1 » 7.x-1.x-dev

rob230’s picture

Still experiencing the problem reported by drupal_jon: files with a space in them incorrectly appear in the link report as 404, even though the link works. You can click on one of the 404 links in the link report and it successfully loads the page.

rob230’s picture

Actually, this is a content problem. Drupal's valid_url() function correctly returns FALSE if the URL has a space in it. The URLs in the content should use %20 instead of a space, even though most browsers handle a space without issue.
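
If the content is to be corrected rather than the checker, one rough way to do it is to rewrite the href values so that spaces become %20. The sketch below is an assumption-laden illustration, not part of the module: sketch_fix_href_spaces() is a hypothetical helper, it only handles double-quoted href attributes on <a> tags, and a regex is used for brevity where a real cleanup would more safely use a DOM parser.

```php
<?php
// Hypothetical helper: replace spaces with %20 inside double-quoted
// href attributes of <a> tags. Link text and other markup are untouched.
function sketch_fix_href_spaces($html) {
  return preg_replace_callback(
    '/(<a\b[^>]*\bhref=")([^"]*)(")/i',
    function ($m) {
      // $m[1] is everything up to the opening quote, $m[2] the URL,
      // $m[3] the closing quote.
      return $m[1] . str_replace(' ', '%20', $m[2]) . $m[3];
    },
    $html
  );
}

echo sketch_fix_href_spaces('<a href="/one two">Link text</a>'), "\n";
```

Applied to the two examples from the original report, this would turn "/one two" and "one two" into "/one%20two" and "one%20two", which valid_url() should then accept.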