Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
What a neat module!
Just wondered (if it doesn't already) whether it could use the job_queue module to email the author of a node to say something like
"The following links on your page at: --node_url---
do no resolve to accessible pages:
http://xyz - Error 404 - Page not found
etc"
That way the authors of the node can do the work! ;-)
Comment | File | Size | Author |
---|---|---|---|
#43 | linkchecker_mail4.patch | 16.76 KB | r_honey |
#37 | linkchecker_mail3.patch | 16.71 KB | hass |
#34 | linkchecker_mail2.patch | 16.72 KB | hass |
#24 | linkchecker_mail.patch | 17.15 KB | hass |
#22 | linkchecker_mail-D6.patch | 17.15 KB | hass |
Comments
Comment #1
hass CreditAttribution: hass commentedGood idea... nevertheless I don't like this notification / spamming method very much. I've already have had the idea to use job_queue for a more reliable link check processing, but this is one more reason to use it :-)
Comment #2
hass CreditAttribution: hass commentedHow this should be build:
1. Make email notification user configurable. By default ON if globally activated. Add select box to user page.
2. Make sure we do not send one mail per unpublished node. All broken nodes should be collected first per author and then one email with all unpublished nodes listed should be send.
3. May add an early warning email before unpublishing a node.
4. Don't forget to limit this mailings to "3" for dead mail addresses and authors that don't care about their articles and links inside.
5. Limit mailings once per day/week/month.
Comment #3
scedwar CreditAttribution: scedwar commentedCheck out notifications - well worth hooking into that for most of this functionality.
http://drupal.org/project/notifications
Comment #4
hass CreditAttribution: hass commentedCan you provide a patch?
Comment #5
scedwar CreditAttribution: scedwar commentedpatches like this are far beyond my coding skills, sorry!
Comment #6
Anonymous (not verified) CreditAttribution: Anonymous commentedWant to bring this request back to life. I'm running a CMS comparison site, and authors are publisher who maintain their own listings. I really need to send them automatic emails on CRON runs in case their URLs are out of date.
So on CRON run, simply:
- check URL (implemented)
- if theshold e.g. >3 fails, then: (implemented)
- send email to author (* new)
With a custom email field with tokens. for author email, name & failed URL + node id.
If I have time, I'll patch it myself. May take a few weeks, very busy right now. (We all are, I know.)
Comment #7
hass CreditAttribution: hass commentedYeah, please see #2 for the proposed functionality... I hope to have nothing missed. Would be great to have a patch for the next release. Do not waste much time with point 4... it's an issue, but not so important :-).
Comment #8
chadd CreditAttribution: chadd commentedi would also like to see this implemented.
the functionality in #2 looks fine to me.
Comment #9
r_honey CreditAttribution: r_honey commentedHi everyone, the attached zip provides mailing functionality. It contains minor patches to 2 linkchecker files, linkchecker.module and linkchecker.admin.inc.
Then there is an additional file called linkchecker.mail.inc in the includes folder. This file contains the actual code for mailing on founding broken links.
The first patch to .module file adds a couple of lines to incorporate mailing at appropriate time during link checking. The patch to admin file adds options for mailing.
I am still testing these patches. But tought of sharing them here.
Comment #10
r_honey CreditAttribution: r_honey commentedAs it currently stands, the patched code sends out an email to each user whose node/comment was found to contain a broken link after a cron run.
Features in #2 are highly desirable. However, for implementing them, we might need to maintain when was last mail sent to a user. This can be accomplished by either adding a brand new table, linkcheker_mails, or by adding a last_mail column to the linkchecker_nodes and linkchecker_comments tables.
I would be willing to adapt code for incorporating these features. As it requires some changes to the module, it might be better if I can make these directly to the CVS. Wonder if the module owners would be willing to provide me CVS access for this module!!
Comment #11
hass CreditAttribution: hass commentedThank you for sharing your work. Do you know http://drupal.org/coding-standards?
1. Try to use require_once conditionally in hook_init()
2. Patch contains context sensitive bugs in translatable strings. (e.g. $email_token_help)
3. Always use db_placeholders() and never variables in an SQL strings.
4. Always make a comma on the end of arrays and a newline '#collapsible' => FALSE);, see core for examples for better code readability.
5. Code indentation does not meet drupal standards
6. All titles need to be upper-case first, lower case rest, e.g. t('Broken Link Email Subject') -> t('Broken link email subject')
7. Never post ZIP files, please. Post one patch. This module is also monitored by the code review and testing robot... so we know if the patch applies and all tests are still working.
8. if you post a patch, set the case to "needs review".
Comment #12
hass CreditAttribution: hass commentedThis patch looks a bit like a spam engine to me!?!? We need to make sure that mails are not send on every cron run. Many people may run cron every 5 minutes and this would cause status emails every 5 minutes. Individual users MUST be able to configure - send me an report once per day, once per week (default), once per month. I thought in past - using a table named linkchecker_users. Most of the points in #2 are really important and a must have to me. Without this features the functionality is not really usable for the most people. The day should also be customizable as many people don't like to get such type of to-do's on sunday/monday on their desk... :-)
Comment #13
r_honey CreditAttribution: r_honey commentedWell, maybe yes. As I said, satisfying some of the requirements in #2 required schema support, and I stopped short of making changes to the schema.
linkchecker_users with columns for whether a user wants the email, last time an email was sent etc. would satisfy most of the points in #2.
Regarding the coding standards, well I use Zend Studio 7 for development, and I have configured the development environment to mimic the coding standards as closely as possible (indentation etc.). But I was not able to completely set the options as pr the Drupal standards.
For db_placeholders(), the user input has been properly escaped before (db_escape_string). However, I do accept that I should have been more vigilant for things like capitalization, translatable string etc.
Comment #14
r_honey CreditAttribution: r_honey commentedOkay, the attached files should satisfy everything in #2, and in #12 (except early warning email before unpublishing a node).
There's a global Admin setting which controls whether linkchecker emails are sent at all. Users can further opt-in/opt-out of linkchecker emails, together with minimum number of days between emails and days on which to receive an email if at all.
The first of the 2 attached file patches linkchecker.module, linkchecker.admin.inc, and linkcehcker.install to add support for sending emails. The second is a completely new file that need to be put in the included folder with the name linkchecker.mail.inc.
As you would be knowing, cvs diff does not diffes newly added files. I develop an Win machine, and therefore a couple of options mentioned in doc pages (fakediff etc). did not apply to me.
The patches are for the latest version of 6.x-2.x branch of the code downloaded this morning by me. Everything is taken care of, except for providing an upgrade path to add the new schema table, linkchecker_users. I thought the module maintainers would be able to decide the upgrade path better. So, these patches should be tested on a new install of linkchecker.
To @hass, I have tried to follow coding standards (although I followed last time also):
1) I could not find a doc page saying require_once in hook_init.
2) $email_token_help was actually copied from the core user module. And this is how it was in the admin file for the core module also.
It might be possible that I left out one or two of your earlier suggestions. However, if if you think the patch adds value to the module, those issues can always be taken care of before committing to the CVS.
Comment #15
r_honey CreditAttribution: r_honey commentedStrange, but the files attached by me did not show up in the previously posted comment.
Comment #17
r_honey CreditAttribution: r_honey commented#15: mail.patch queued for re-testing.
Comment #19
hass CreditAttribution: hass commentedThere is no rule except the one I made :-). Normally it's best practice to include files only if they are required. So - if we send mails, we should include it - otherwise not. The logic will change in D7... but we can see this the hook_init workaround in many modules.
You are using 4 spaces for one tab, in Drupal we use 2 spaces for one tab.
Also make a comma after TRUE and a line break so that the bracket is in the same line where $form starts. Nitpicking, but clean code :-).
No idea who have written this, but if this is in core, it looks like a bug :-). I think it need to be: t('Available variables are: !site, !user_link, !uri, !uri_brief, !username, !mailto, !date.')
The only idea I can come up is - the variables are text and not placeholders... this may be a reason.
t('Broken link email subject')
t('Broken link email body')
I hope I will find some time soon to give this great feature a test.
I'm also using Windows and I have not heard that a file need to be faked... I'm using Eclipse... but your patches seem to have real issues... this need to fixed first. If you don't get the green light here - something should wrong with your patch.
Comment #20
r_honey CreditAttribution: r_honey commentedRegarding the $email_token_help variable, the text that is outside the t() are really placeholders, that the code replaces in the mail subject/body.
Because these are replacement patterns, maybe therefore they are outside theming.
Regarding patches, well I am really surprised here. The patches have been created by cygwin's cvs package by diffing them to the Drupal CVS. I really have no idea why these patches are having issues.
Even more confusing to me is the fact that once a patch has failed testing, why is there an option for retesting it. If it has failed once, it would fail again.
I acknowledge my deviation from Drupal coding standards. But apart from that, I really believe that the code does exactly what it advertises. Would wait to see your feedback on the code (functionality, not appearanace;)).
Comment #21
hass CreditAttribution: hass commentedI have fixed the last code style issues.
Fixed:
1. Update hook was missing to add linkchecker_users table to existing installations
2. Tested includes/linkchecker.mail.inc inclusion in attached patch and it works well with Eclipse. I suggest you change to Eclipse - it's a great tool :-)
Untested:
3. db_placeholder in mail.inc is completly untested
4. require_once should be enough to be added once in .module, no need in admin or have I missed something? module_load_include is the normal way to load the .inc. May change to hook_init(). Needs more investigation.
5. Have you tested the schema creation with 'foreign keys' definition? It's a new feature in D7... I'm not sure if it would cause failures in D6. Never tried it yet.
6. UI strings and schema strings should get a review.
7. We may need to remove the & from &$edit and &$user for PHP 5.3 compatibility
8. Variable naming in linkchecker_user may need to be prefixed with linkchecker and more general about linkchecker_mail_* - Need to think more about this.
9. We should add some tests if possible :-)
Comment #22
hass CreditAttribution: hass commentedThe patch is not correctly named. Re-attach for robot
Comment #24
hass CreditAttribution: hass commentedNew try without suffix
Comment #26
Anonymous (not verified) CreditAttribution: Anonymous commentedA patch with the name ending in -D6.patch will not queued for testing; that is why that patch is not tested. See the description for the allowed attachments reported at the bottom of this page.
Comment #27
r_honey CreditAttribution: r_honey commentedTo @hass:
1) Regarding hook_update, I mentioned it in the post earlier that I left hook_update to the module team (did not had a look at the update numbers ;))
2) Changing to Eclipse - Well, I will give it a try when I get time. But I think Zend Studio is a great IDE (I know it's further based on Eclipe).
3) I have found a minor issue in the code. In linkchecker.mai.inc, this line:
$receive_mail = $user->receive_mail == NULL ? 1 : $user->receive_mail == NULL;
should be changed to:
$receive_mail = $user->receive_mail == NULL ? 1 : $user->receive_mail;
4) I did not find a reference to db_placeholder in mail.inc. I was not quite clear what you meant here.
5) Yes, require_once should have been sufficient once in .module file.
6) Schema creation with foreign keys - Yes, on my D6 install, it did not cause any issues. I haven't gone into D6 core Schema API code, but my understanding is that in all Drupal API methods which expect arrays with specific keys (Form API, Schema API etc.), the core API simply ignores keys in the arrays it does not understand/expect.
So, just left that foreign key part to be able to use it on both D7 & D6 (where it should be ignored).
8) Variable naming in linkcheker_users - I am not sure, but probably you are talking of the table columns here. I don't see any reason for this. The module does not hook into user_load and loads the column data when a user objects load.
linkchecker_users is rather a private table for the module. The only place where data is fetched from this table is in mail.inc, in the _linkchecker_notify_broken_links method. Following is the select statement for this:
SELECT u.*, lu.receive_mail, lu.min_days, lu.days, lu.last_mailed
We can alias columns here if required.
9) I was thinking about scrip timeout. The current behavior of the module is stop once it uses over half of the maximum execution time. Do we need to change this to let's say for 1/3 of the execution time, if the module setting variable linkchecker_mail_broken is set the TRUE, meaning that mails are to be sent.
This is because sending mails can itself take some time, especially is an installation uses the smtp module to connect to an external smtp server (e.g. smtp.google.com) for sending out mails.
Comment #28
hass CreditAttribution: hass commentedComment #29
hass CreditAttribution: hass commented#24: linkchecker_mail.patch queued for re-testing.
Comment #30
hass CreditAttribution: hass commented@kiam: D6 code cannot tested at all or what does Setup environment: failed to drop checkout database. mean?
Comment #32
Anonymous (not verified) CreditAttribution: Anonymous commented@hass: I was referring to the description reported in the fieldset
; it could be the text needs to be updated, as the description seems to report that tests for Drupal 6 are not done (which doesn't seems true, as tests for Drupal 6 modules are correctly done). It could also be there is code that filters out the patches to test basing on the file name, and that code has not been removed.Comment #33
scubasmurf CreditAttribution: scubasmurf commentedsubscribing
Comment #34
hass CreditAttribution: hass commentedFixed Unix LF's
Comment #36
r_honey CreditAttribution: r_honey commentedhass, I pointed this out above also. This line:
$receive_mail = $user->receive_mail == NULL ? 1 : $user->receive_mail == NULL;
in linkchecker.mail.php needs to be changed to:
$receive_mail = $user->receive_mail == NULL ? 1 : $user->receive_mail;
Also, would it make sense to reduce the linkchecker script timeout if email sending is enabled in these lines in linkchecker.module:
We can reduce the time to $max_execution_time/3 or something suitable to leave sufficient time for email sending.
Comment #37
hass CreditAttribution: hass commented*Damn* robot, there was one Windows LF inside... also fixed #36.
I'm not sure about $max_execution_time/3. "$max_execution_time/3" is really a short timeframe - at least if people have only 60s $max_execution_time on shared hosts (nevertheless we are not able to support such *short* cron runs). Have you seen cron failing?
Sending Mails shouldn't be a problem at all... if a mail hasn't been send it should be send at a later cron run, isn't it? We should only prevent cron failures with this logic as this may break a site. I do not feel strong about a delayed email... We may can change the logic a bit that sending emails is also calculated into the time frame of $max_execution_time/2. D7 also have some changes we can use...
Comment #38
r_honey CreditAttribution: r_honey commentedI agree that sending emails should not be a priority that much. But one issue might be that if a timeout prevents emails from being sent for nodes that are found to contain broken links, subsequent cron runs would also flag those nodes with broken links, and unless the overall number of nodes with broken links reduce (either by members who received mail or by administrators), such members who did not receive mail on a cron run might also not receive it on subsequent runs.
Comment #39
hass CreditAttribution: hass commentedHow about running the mails upfront and than do the linkchecking? Than you are not sending the mails after a check, but in the very beginning of the next cron run one hour later and at least before you do the next checks. Dirty, but should work. Normally this is really no problem as links are only checked once per 4 weeks (default) and I strongly discourage using any other value shorter than 4 weeks...
Comment #40
r_honey CreditAttribution: r_honey commentedHmmm... that might actually be a good idea to spread out load during a cron run. As I said earlier, I myself use a third-party SMTP server for sending out mails on my Drupal installations. This might take some time depending upon the number of mails that need to be sent out.
Implementing this would require some changes to the code. One thing that immediately comes to my mind is to store a module variable linkchecker_scan_time. This stores the last timestamp when module scanned content for broken links. At the beginning of the cron run (for sending emails), we execute the Sql statement that fetches all users whose content have broken links and whose last email time is less than the last scan time. After an email is sent, the user's last email time is updated immediately (instead of a batch at the end when all emails have been sent).
This would ensure that no user gets duplicate emails in case email sending is terminated in between due to script timeout or something else (This is where I cherish ASP.NET more, the ability to create background threads instead of blocking the UI threads and imposing timeouts on the UI threads).
Comment #41
r_honey CreditAttribution: r_honey commentedAnother issue I found with the code... In the hook_user, we show the linkchecker settings only for registered users (but not while creating a new user etc.).
Therefore validate for hook_user should only validate if an existing user is being edited. Therefore, inside hook_user in linkchecker.module, 2 lines need to be added:
This means the validate case should change to:
Comment #42
hass CreditAttribution: hass commentedPlease update the patch
Comment #43
r_honey CreditAttribution: r_honey commentedResolution to #41
Comment #44
j0nathan CreditAttribution: j0nathan commentedsubscribing
Comment #45
itrivino CreditAttribution: itrivino commentedSubscribing. This feature should be part of the module. A great module!!
Comment #46
camaraderie CreditAttribution: camaraderie commentedSubscribing, unless anyone knows how to do this using Notifications?
Comment #47
rkodrupal CreditAttribution: rkodrupal commentedsubscribing
Comment #48
hass CreditAttribution: hass commentedD7 need to be done first
Comment #49
hass CreditAttribution: hass commented#43: linkchecker_mail4.patch queued for re-testing.
Comment #51
jastern CreditAttribution: jastern commented+1 vote for this patch! :-)
we have a drupal 6 installation with many pages, and while it is not a money-making site, it is a professional university site that faculty, staff, and students all count on, so unpublishing a node -- even temporarily -- is just not an option -- we would catch too much flack from our departments. it would be much better to send email notifications to authors with all the specifications mentioned above (esp #2).
we were just about to write our own rule or even a separate module to scan through the linkchecker table and find such pages, then email the author, but saw this patch.
what's the status of this patch?
thanks!
Comment #52
hass CreditAttribution: hass commentedSee #50
Comment #53
Yuri CreditAttribution: Yuri commentedWould be nice to have this as a Rules trigger, this gives the most flexibility and is probably the most simple solution.
Comment #54
backslash856 CreditAttribution: backslash856 commentedsubscribe
Comment #55
backslash856 CreditAttribution: backslash856 commentedI suppose patch #43 is for drupal 6 only, isn't it? For drupal 7 there are some solution or work around?
Thanks