For link checker module integration I need the ability to exit the httprl_send_request() process after a specified time. Anything not completed is no problem at all: as long as _linkchecker_status_handling() has not been executed and updated a timestamp in the local database, the link gets selected again for checking on the next cron run.

Use case:
0. Link checker checks links via cron in a background task...
1. We can add 1,000 links to httprl_request()
2. After a limited time the check process needs to exit cleanly so that cron does not fail. The current implementation works this way:

    if ((timer_read('page') / 1000) > ($max_execution_time / 2)) {
      break; // Stop once we have used over half of the maximum execution time.
    }
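
For readers without the surrounding loop, here is a minimal standalone sketch of the same half-budget check. It assumes plain PHP outside Drupal, so it measures elapsed time with microtime() instead of timer_read('page'); the function name process_with_time_budget() and its parameters are hypothetical and not part of linkchecker or httprl.

```php
<?php
// Hypothetical helper (not linkchecker or httprl code): run $check on each
// item until more than half of $max_execution_time (seconds) has elapsed.
function process_with_time_budget(array $items, $check, $max_execution_time) {
  $start = microtime(TRUE);
  $done = array();
  foreach ($items as $key => $item) {
    // Stop once we have used over half of the maximum execution time.
    if ((microtime(TRUE) - $start) > ($max_execution_time / 2)) {
      break;
    }
    $done[$key] = call_user_func($check, $item);
  }
  // Items missing from $done were never started and can be retried later.
  return $done;
}
```

Anything left out of the returned array simply stays unchecked, which matches the intent above: unfinished links get picked up again on the next cron run.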

Aside from this, could we add a callback function to httprl_send_request()? E.g. I need to call _linkchecker_status_handling($link, $response); after a link check has been completed. Today I would need to wait until httprl_request() has completed all checks, and then the time left to update all results in the linkchecker_links table may not be enough, depending on the number of results. And then cron will be logged as crashed again :-(.
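
To illustrate the kind of per-result callback being asked for, here is a rough sketch in plain PHP. It is not the httprl API: check_links_with_callback(), $fetch and $on_complete are hypothetical names, and the requests run one after another here rather than in parallel.

```php
<?php
// Hypothetical sketch, not the httprl API: hand every response to a
// callback as soon as it is available instead of collecting all results.
function check_links_with_callback(array $links, $fetch, $on_complete) {
  $handled = 0;
  foreach ($links as $link) {
    $response = call_user_func($fetch, $link);
    // In linkchecker this is where _linkchecker_status_handling($link, $response)
    // would be called, so the result is stored right away.
    call_user_func($on_complete, $link, $response);
    $handled++;
  }
  return $handled;
}
```

The point of the design is that results already handled stay stored even if the run is cut short afterwards.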


Comments

hass’s picture

As an interim solution I'm adding only 8 links, processing them, running the status code handling logic and then running the next 8 checks. I checked nearly 980 links in blocking mode within 2 minutes, which is really great. This is not benchmarked, but it is around 15-20 times faster than core, which was only able to check around 60-80 links per 2 minutes.
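
The batching described above could be sketched roughly as follows. This is an illustration under assumptions, not the actual patch: check_links_in_batches(), $send_batch and $handle_status are hypothetical names, and $send_batch is assumed to return responses keyed the same way as the batch it was given.

```php
<?php
// Hypothetical sketch of the interim approach: send 8 links, handle their
// status codes, then move on to the next 8.
function check_links_in_batches(array $links, $send_batch, $handle_status, $batch_size = 8) {
  $checked = 0;
  // TRUE preserves the original keys so responses can be matched to links.
  foreach (array_chunk($links, $batch_size, TRUE) as $batch) {
    // Assumption: $send_batch blocks until the whole batch has completed.
    $responses = call_user_func($send_batch, $batch);
    foreach ($responses as $key => $response) {
      call_user_func($handle_status, $batch[$key], $response);
      $checked++;
    }
  }
  return $checked;
}
```

Because the status handling runs after every batch, at most one batch of results is unsaved at any time if the run has to stop.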

See http://drupal.org/node/380052#comment-5526984 for a proof-of-concept integration patch. It works in general, but because of the many inconsistency bugs in the httprl module it will not work reliably until the response object of httprl has been fixed.

mikeytown2’s picture

Status: Active » Postponed

#1268096: Implement a rate limiter needs to be coded up so only 8 URLs are processed at a time. Once that is done, httprl will then process all links in the queue until they are all done or until the global timeout is reached. This is going to be postponed until the rate limiter is in.

mikeytown2’s picture

Status: Postponed » Active
mikeytown2’s picture

Status: Active » Fixed

This has been committed.

hass’s picture

Thanks!

How about executing a callback function/hook with params after a link check has been completed?

mikeytown2’s picture

Open a feature request

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary: View changes

Added more info