I like the idea behind this module, but it doesn't work very well on pages of mine that have a lot of links. Following all the directions to the letter, I ended up a lot of errors. My /files/urlicon/ directory has several favicons now, but the pages that have a lot of links always time out (resulting in a blank page).
* warning: parse_url(http:///favicon.ico) [function.parse-url]: Unable to parse url in /home/fabmin/public_html/includes/common.inc on line 366.
* warning: parse_url(http:///en.wikipedia.org/wiki/Metaverse) [function.parse-url]: Unable to parse url in /home/fabmin/public_html/modules/contrib/urlicon/urlicon.module on line 68.
* warning: fsockopen() [function.fsockopen]: php_network_getaddresses: getaddrinfo failed: Name or service not known in /home/fabmin/public_html/includes/common.inc on line 371.
It would be nice if it would make note of outgoing links and update the /files/urlicon/ directory every so often instead of doing it on each page load.
Finally, with urlicon.module is enabled the node ID is printed at the beginning of the node body.
| Comment | File | Size | Author |
|---|---|---|---|
| #1 | urlicon.patch | 3.18 KB | jaydub |
Comments
Comment #1
jaydub commentedI have put together a patch to address this issue along with a few other items.
First off, this being a filter, it will not be run every time if the there is a filter cache entry and the cache test (which I do not know much about) will trigger whether or not this module will do its thing.
Changes:
1) I modified the regex for finding the URL. I didn't see the need to save each section of the match that was in the original regex. Now it just stores the contents of the HREF and the text of the <a> tag.
2) The issue of printing out the node ID was due to the drupal_set_message functional call. The code was simply passing the node ID to this function which results in not very informative output. I put in some text to indicate that this module was doing something.
3) Modified the urlicon_replace callback function to address the following:
After fetching the headers of the request for URL favicon, need to test if the response indicated the existance
of a favicon. So first test is if HTTP response code of 200. Also tested if Content-Length was > 0. Using drupal.org
as a test returned HTTP 200 but zero-length content. Not sure how to get the drupal.org favicon :)
Next test expanded the Content-Type from image/x-icon to also include text/plain. I came across this when testing
with www.nytimes.com . Not sure why they return text/plain but to make it work I added that.
Only if the above conditions are met is the CSS file edited.
TODO
Right now this filter will run for both the trimmed and full version of a node. This results in two <link> entries in the resulting HTML as well as double entries in the node's CSS file. Ideally would be able to just make a single entry.
Comment #2
sanduhrsThanks for the input. I will have a look at it tonight.
vg
Comment #3
sanduhrsBased on the patch I did some improvements on the code and commited it to HEAD.
- Improved performance if favicon already availlable.
- Improved regex
- Added support for Content-Type: text/plain
- suppress warnings in case of invalid urls
Thanks.
Comment #4
jaydub commentedI came across something else after submitting the patch to you...the regex will capture internal links which will not have the hostname. i see that you've added in an IF test for url['host'] however you could probably just add in the 'http' string as part of the regex to reduce the number of calls (helpful in a heavily internal linked page).
so, (.+?) becomes (http:\/\/.+?) or even just (http.+?). This is after the search for the 'href'. This will also mean that we will avoid looking up any non-http links such as ftp or mailto.
Comment #5
sanduhrsYou're absolutely right.
Changed the regex accordingly and commited it.
Thanks.
Comment #6
(not verified) commented