Errors errors errors (and a UI glitch) [#89772]

I like the idea behind this module, but it doesn't work very well on pages of mine that have a lot of links. Following all the directions to the letter, I ended up a lot of errors. My /files/urlicon/ directory has several favicons now, but the pages that have a lot of links always time out (resulting in a blank page).

    * warning: parse_url(http:///favicon.ico) [function.parse-url]: Unable to parse url in /home/fabmin/public_html/includes/common.inc on line 366.
    * warning: parse_url(http:///en.wikipedia.org/wiki/Metaverse) [function.parse-url]: Unable to parse url in /home/fabmin/public_html/modules/contrib/urlicon/urlicon.module on line 68.
    * warning: fsockopen() [function.fsockopen]: php_network_getaddresses: getaddrinfo failed: Name or service not known in /home/fabmin/public_html/includes/common.inc on line 371.

It would be nice if it would make note of outgoing links and update the /files/urlicon/ directory every so often instead of doing it on each page load.

Finally, with urlicon.module is enabled the node ID is printed at the beginning of the node body.

Comment	File	Size	Author
#1	urlicon.patch	3.18 KB	jaydub

Comments

Comment #1

jaydub commented 18 October 2006 at 05:57

Status:

Active

» Needs review

Status	File	Size
new	urlicon.patch	3.18 KB

I have put together a patch to address this issue along with a few other items.

First off, this being a filter, it will not be run every time if the there is a filter cache entry and the cache test (which I do not know much about) will trigger whether or not this module will do its thing.

Changes:

1) I modified the regex for finding the URL. I didn't see the need to save each section of the match that was in the original regex. Now it just stores the contents of the HREF and the text of the <a> tag.

2) The issue of printing out the node ID was due to the drupal_set_message functional call. The code was simply passing the node ID to this function which results in not very informative output. I put in some text to indicate that this module was doing something.

3) Modified the urlicon_replace callback function to address the following:

After fetching the headers of the request for URL favicon, need to test if the response indicated the existance
of a favicon. So first test is if HTTP response code of 200. Also tested if Content-Length was > 0. Using drupal.org
as a test returned HTTP 200 but zero-length content. Not sure how to get the drupal.org favicon :)

Next test expanded the Content-Type from image/x-icon to also include text/plain. I came across this when testing
with www.nytimes.com . Not sure why they return text/plain but to make it work I added that.

Only if the above conditions are met is the CSS file edited.

TODO

Right now this filter will run for both the trimmed and full version of a node. This results in two <link> entries in the resulting HTML as well as double entries in the node's CSS file. Ideally would be able to just make a single entry.

Comment #2

sanduhrs

he/him

German

🇪🇺 Heidelberg, Germany, Europe

commented 18 October 2006 at 07:47

Assigned:

Unassigned

» sanduhrs

Thanks for the input. I will have a look at it tonight.
vg

Comment #3

sanduhrs

he/him

German

🇪🇺 Heidelberg, Germany, Europe

commented 18 October 2006 at 22:27

Status:

Needs review

» Fixed

Based on the patch I did some improvements on the code and commited it to HEAD.
- Improved performance if favicon already availlable.
- Improved regex
- Added support for Content-Type: text/plain
- suppress warnings in case of invalid urls

Thanks.

Comment #4

jaydub commented 19 October 2006 at 03:36

Status:

Fixed

» Active

I came across something else after submitting the patch to you...the regex will capture internal links which will not have the hostname. i see that you've added in an IF test for url['host'] however you could probably just add in the 'http' string as part of the regex to reduce the number of calls (helpful in a heavily internal linked page).

so, (.+?) becomes (http:\/\/.+?) or even just (http.+?). This is after the search for the 'href'. This will also mean that we will avoid looking up any non-http links such as ftp or mailto.

Comment #5

sanduhrs

he/him

German

🇪🇺 Heidelberg, Germany, Europe

commented 19 October 2006 at 09:37

Status:

Active

» Fixed

You're absolutely right.
Changed the regex accordingly and commited it.
Thanks.

Comment #6

(not verified) commented 2 November 2006 at 09:45

Status:

Fixed

» Closed (fixed)

Errors errors errors (and a UI glitch)

Comments

Comment #1

Comment #2

Comment #3

Comment #4

Comment #5

Comment #6

News items

Our community

Documentation

Drupal code base

Governance of community