The excellent link checker module has a function _linkchecker_unpublish_nodes($lid) that unpublishes nodes containing invalid links.

The flexibility of the module could be greatly enhanced by triggering actions based on the outcome of the link checks. Refer to the screenshot ... if the trigger mechanism were added to link checker, you could configure something like this:

  • Trigger: If link checker reports a 404
    => Action: unpublish node
  • Trigger: If link checker reports a 302
    => Action: Custom Action: change Node's Workflow State to Verify
  • Trigger: If link checker reports a 301
    => .....
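
The trigger/action pairs above amount to a lookup table from HTTP status code to a list of actions. A minimal, language-neutral sketch in Python follows; it is not Drupal's Trigger API, and the Node class, the action functions and the TRIGGERS table are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a content node."""
    nid: int
    published: bool = True
    workflow_state: str = "Live"


def unpublish_node(node: Node) -> None:
    """Action: unpublish the node."""
    node.published = False


def set_workflow_verify(node: Node) -> None:
    """Custom action: move the node to the "Verify" workflow state."""
    node.workflow_state = "Verify"


# Trigger table (assumed layout): HTTP status code -> actions to run.
TRIGGERS = {
    404: [unpublish_node],
    302: [set_workflow_verify],
}


def fire_triggers(status_code: int, node: Node) -> int:
    """Run every action assigned to status_code; return how many ran."""
    actions = TRIGGERS.get(status_code, [])
    for action in actions:
        action(node)
    return len(actions)
```

A status code with no assigned action (301 in this table) simply runs nothing, which matches the idea that an administrator assigns actions only to the codes they care about.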

Comments

hass’s picture

Please provide a patch.

ducdebreme’s picture

FileSize
3.32 KB

Hi Alex
I have attached a patch. You need Actions and Trigger enabled.
At /admin/build/trigger/linkchecker you can assign actions to be called when Linkchecker encounters specific HTTP codes.
Built against 6.x-2.1
PS Guide for building Triggers: http://drupal.org/node/375833
Thanks
Stefan

hass’s picture

I will take a look at your patch. I'm not a friend of using hook_init(), but in general it should be much better to use triggers, as the issue #249701: Email notification for node authors may also become obsolete with this change.

We should not forget to remove the current 301 and 404 settings and all outdated variables, and we need to migrate from the old settings to the new triggers via an update hook. Plus we should add some text to the old settings fieldset saying where the new settings have been moved to :-).

hass’s picture

Status: Active » Needs work

Patch needs work; the 404 handling also contains the call to _linkchecker_unpublish_nodes(). The more I think about _linkchecker_unpublish_nodes(), the less performant it looks to me. I need to think about bypassing the node_load API call and updating the node table directly... that should be much quicker if we think about 1000+ nodes having the same link in the content. But not using the API is also a no-go...

hass’s picture

The patch is missing the "actions" the module has currently implemented.

hass’s picture

I've renamed a few variables, the filenames and folders and added a help text.

This still needs actions to be implemented.

ducdebreme’s picture

Do you mean we should ...

  • ... move the current 301 link update functionality into an action?
  • ... remove the _linkchecker_unpublish_nodes() and use the standard node_unpublish_action instead?

I don't think we need to implement more actions, as there are plenty available. Or did I forget anything?

hass’s picture

Yes, I thought you'd like to do the same... better integration with D6 core functionality would be great and would solve many other possible feature requests (like notifications), too. And it would save the module from duplicating others' efforts... :-)

We only need to define actions depending on the number of failed checks (e.g. unpublish after 3 fails)... not sure if we are able to reuse the node_unpublish_action. We need to be able to unpublish more than one node per execution, but _linkchecker_unpublish_nodes may be adapted.
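
To make the "unpublish after 3 fails" idea concrete, here is a rough sketch in Python rather than Drupal code. It assumes that fails are counted consecutively and reset on a successful check, which the comment above does not specify; the names fail_counts, nodes_by_link, published and record_check are hypothetical. It also shows one link failure unpublishing several nodes in a single call.

```python
from collections import defaultdict

FAIL_THRESHOLD = 3  # assumed value, taken from the "3 fails" example

fail_counts = defaultdict(int)           # link id -> consecutive failures
nodes_by_link = {10: [1, 2, 3]}          # hypothetical link id -> node ids
published = {1: True, 2: True, 3: True}  # hypothetical node id -> status


def record_check(lid: int, ok: bool) -> list:
    """Record one check result; return node ids unpublished by this call."""
    if ok:
        # Assumption: a successful check resets the counter.
        fail_counts[lid] = 0
        return []
    fail_counts[lid] += 1
    if fail_counts[lid] < FAIL_THRESHOLD:
        return []
    # Threshold reached: unpublish every still-published node with this link.
    hit = [nid for nid in nodes_by_link.get(lid, []) if published.get(nid)]
    for nid in hit:
        published[nid] = False
    return hit
```

The function returns the affected node ids so that a caller could hand each one to further actions, for example a notification.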

ducdebreme’s picture

Hi Alex
I'm thinking about it. You will hear from me.
Stefan

hass’s picture

We should also implement triggers that:

1. Automatically change the request method from HEAD to GET if HEAD fails.
2. Disable link checking if GET also fails (maybe).
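
The two steps above could look roughly like the following Python sketch. It is only an illustration: the check_link function, the injected request callable and the rule "a status of 400 or higher counts as a failure" are assumptions, not the module's behaviour.

```python
from typing import Callable

# A requester takes (method, url) and returns an HTTP status code.
Requester = Callable[[str, str], int]


def check_link(url: str, request: Requester) -> dict:
    """Try HEAD first; retry with GET if HEAD fails; give up if both fail."""
    status = request("HEAD", url)
    if status < 400:
        return {"method": "HEAD", "status": status, "enabled": True}
    status = request("GET", url)
    if status < 400:
        # Remember GET as the method for future checks of this link.
        return {"method": "GET", "status": status, "enabled": True}
    # Both methods failed: flag the link so checking can be disabled.
    return {"method": "GET", "status": status, "enabled": False}
```

Passing the requester in as a parameter keeps the fallback logic separate from the HTTP client, so it can be exercised with a fake such as `lambda method, url: 405 if method == "HEAD" else 200`.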

hass’s picture

It may be required to add a dependency on the Rules module. It provides configurable conditions (http://drupal.org/node/298534), which we may need for the "unpublish after N fails".

hass’s picture

@ducdebreme: I've fixed nearly all remaining issues in the queue... if we are able to get a working patch, this could go in before the next release, planned within the next day(s). If you are not able to work on this feature, I may only add the feature #517174: Auto-update could prefix urls and integrate with other modules and then create a new release... the module is nearly feature complete and I'm not aware of any remaining features or bugs except cURL support, which is difficult to solve. Therefore it may be the release for a very long time, which may postpone this trigger stuff to the far future.

sinasalek’s picture

subscribed

Khalor’s picture

What's the status on this? Did it make it into dev?

hass’s picture

Nobody is working on it, and the issue is close to being closed for inactivity.

gnindl’s picture

Status: Needs work » Needs review
FileSize
5.73 KB

I further worked on patch #6 which is now included in this patch (you only need this patch to get it working).

Apart from the trigger there's now an action which changes the HTTP request method from HEAD to GET and conducts the request. If successful, the link in linkchecker_link is updated. This is needed for YouTube links as YouTube doesn't support HEAD requests yet.

I also changed the trigger providing a link object and the node object as context.

Testing:
1. Just go to your trigger interface, i.e. admin/build/trigger/linkchecker
2. On "Trigger: Content contains links with HTTP status code 404" assign the action "Process HTTP GET method if HEAD request fails"
3. Run cron

To get a specific row checked sooner, just set a lower timestamp in the last_checked column of the linkchecker_links table, as cron only runs over a subset of links (the oldest ones).
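
The testing tip above relies on cron picking the oldest links first. The following self-contained sketch uses Python with an in-memory SQLite table to show the effect. Only the table name linkchecker_links and the last_checked column come from this thread; the lid and url columns, the batch size and the helper names are assumptions, and the module's real query may differ.

```python
import sqlite3

BATCH_SIZE = 2  # assumed; the real per-cron limit is not stated in the thread

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE linkchecker_links "
    "(lid INTEGER PRIMARY KEY, url TEXT, last_checked INTEGER)"
)
db.executemany(
    "INSERT INTO linkchecker_links VALUES (?, ?, ?)",
    [
        (1, "http://example.com/a", 1000),
        (2, "http://example.com/b", 2000),
        (3, "http://example.com/c", 3000),
    ],
)


def next_batch(conn):
    """Return the lids a cron run would check: oldest last_checked first."""
    rows = conn.execute(
        "SELECT lid FROM linkchecker_links "
        "ORDER BY last_checked ASC, lid ASC LIMIT ?",
        (BATCH_SIZE,),
    )
    return [lid for (lid,) in rows]


def prioritize(conn, lid):
    """Move one link to the front of the queue by lowering last_checked."""
    conn.execute(
        "UPDATE linkchecker_links SET last_checked = 0 WHERE lid = ?", (lid,)
    )
```

Before prioritize(db, 3) the next batch contains links 1 and 2; afterwards link 3 is picked up first.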

Status: Needs review » Needs work

The last submitted patch, linkchecker-trigger-change-http-get-action.patch, failed testing.

hass’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, linkchecker-trigger-change-http-get-action.patch, failed testing.

sebos69’s picture

Hi, I'm really interested in seeing this feature included in link checker!

spade’s picture

subscribe

DynV’s picture

Subscribing

I'm especially interested in a trigger that fires when a node is unpublished on a file not found error (after the chosen number of times it occurs, so just as it unpublishes). When I remember I have that option enabled, it gets on my nerves as I wonder how long a node has been unpublished (without my knowledge); assigning an action to that trigger would reassure me.

hass’s picture

Version: 6.x-2.x-dev » 7.x-1.x-dev

jelo’s picture

Actions/rules integration would be absolutely fantastic for this module and would open up so many possibilities. I just posted in the feeds module about options to use linkchecker:
http://drupal.org/node/1240366
If you automatically import content with feeds from external sources, this module could be superb in making sure that these external resources have not been removed.

Linkchecker seems to have 3 components:
- scanning the selected nodes for links (internal / external)
- checking the links (triggered through cron)
- error handling

It seems to me that error handling could be better handled completely by rules, i.e. removing it from the global settings. That would allow us to be way more granular. Right now, I can only set "unpublishing nodes" for all nodes and fields at the same time. With rules, I might be able to say:
- if any link on node type y returns 404, send an email to token
- if a specific link in a CCK field returns 404, unpublish the parent node (e.g. for my feed import example)

Would it be possible to:
- create a trigger/event "Broken link is detected"
- add condition how often the link has to be recorded as broken
- add conditions for the http response headers (404, 301 etc.)
- add condition in what element the broken link was detected (e.g. body field, block, specific CCK field etc.)
- there should already be plenty of actions to handle any responses to finding broken links
I am not able to write/extend the patch myself, but would be happy to test this!
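
As a rough illustration of the event and conditions listed above, here is a small Python sketch. It is not the Rules module API; BrokenLinkEvent, Rule, dispatch and the field name field_source are hypothetical and only show how the three conditions (status code, fail count, element) could be combined.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class BrokenLinkEvent:
    """Hypothetical payload for a "Broken link is detected" event."""
    url: str
    status_code: int
    fail_count: int
    element: str  # e.g. "body", "block", or a field name


@dataclass
class Rule:
    action: Callable[[BrokenLinkEvent], None]
    status_codes: Optional[Set[int]] = None  # None matches any code
    min_fails: int = 1
    elements: Optional[Set[str]] = None      # None matches any element

    def matches(self, event: BrokenLinkEvent) -> bool:
        """True only if every configured condition holds for the event."""
        if self.status_codes is not None and event.status_code not in self.status_codes:
            return False
        if event.fail_count < self.min_fails:
            return False
        if self.elements is not None and event.element not in self.elements:
            return False
        return True


def dispatch(event: BrokenLinkEvent, rules: List[Rule]) -> int:
    """Run the action of every matching rule; return the number that ran."""
    ran = 0
    for rule in rules:
        if rule.matches(event):
            rule.action(event)
            ran += 1
    return ran
```

With one rule that mails on any 404 and a second that unpublishes only after 3 fails in field_source, a first-time 404 in the body would run just the mail action.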

JCB’s picture

Rules integration would be an excellent feature.

Would be great if it can be added to D6 as well.

VladimirAus’s picture

Issue summary: View changes
Status: Needs work » Postponed (maintainer needs more info)

Task was updated more than 9 years ago. 🏰
Marking as outdated.
Please update description and info if it is still valid.