http://drupal.org/project/httprl
I'm working through some issues that recently came up in HTTPRL, but it's something to look into using for your crawler. HTTPRL is a standalone module for the most part. Inside the .module file it can call these functions and constants from Drupal:
drupal_convert_to_utf8() - Only if stream_socket_client failed (I can work around this, but doing an include of includes/unicode.inc is all that's needed).
drupal_generate_test_ua() - Only if $GLOBALS['drupal_test_info'] is set (this bit of code I could kill).
variable_get - in bootstrap.inc
request_uri - in bootstrap.inc
VERSION - in bootstrap.inc
HTTP_REQUEST_TIMEOUT - in includes/common.inc. I can selectively define this if needed.
MENU_CALLBACK - in includes/menu.inc. Only called if httprl_menu() is called.
MENU_NORMAL_ITEM - in includes/menu.inc. Only called if httprl_menu() is called.
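As a rough illustration of "selectively defining" the items above when Drupal is not bootstrapped, here is a minimal sketch of a shim include file. It is not code from HTTPRL: the idea of a separate shim file, the placeholder VERSION string, and the simplified request_uri() are assumptions made for illustration, and the HTTP_REQUEST_TIMEOUT value is assumed to mirror Drupal 7 core and should be checked against your core version.

```php
<?php
// Hypothetical shim, not part of HTTPRL: define only what is still missing,
// so including it is harmless when Drupal has already been bootstrapped.

if (!defined('VERSION')) {
  // Placeholder value (assumption); Drupal core sets the real version string.
  define('VERSION', '7.x');
}

if (!defined('HTTP_REQUEST_TIMEOUT')) {
  // Assumed to mirror Drupal 7's includes/common.inc; verify against your core.
  define('HTTP_REQUEST_TIMEOUT', -1);
}

if (!function_exists('variable_get')) {
  // Reads from the global $conf array, falling back to $default.
  function variable_get($name, $default = NULL) {
    global $conf;
    return isset($conf[$name]) ? $conf[$name] : $default;
  }
}

if (!function_exists('request_uri')) {
  // Simplified stand-in (assumption): returns REQUEST_URI when the server
  // provides it, otherwise '/'.
  function request_uri() {
    return isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : '/';
  }
}
```

The menu constants are left out on purpose, since they only matter when httprl_menu() is called, which implies a fuller Drupal bootstrap anyway.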
Comments
Comment #1
perusio commented: Yes, but it requires bootstrapping Drupal to a higher level in order to use the module, does it not? I thought about it, and that was the main reason I left it out. I could do a simple `require` or `include` of the module files, but I'm uncertain which Drupal API functions it makes use of. Are the two you state above the only Drupal API functions used?
It certainly would be nice to have it as a simple parallel option. The Nginx Lua module is very powerful but it's not the route most people feel comfortable following, I suspect.
EDIT: Can you create a version of the module that has an include file providing the needed stuff from the Drupal API, therefore not requiring bootstrapping Drupal above the DB layer?
Comment #2
mikeytown2 commented: Module loading code for D7.
You also need the patch from #1426886-1: Allow HTTPRL to operate at the database bootstrap level, or you can grab the latest version from git.
Comment #3
perusio commented: OK, thanks. It's now on the TODO list and it will be part of the next release.
Comment #4
perusio commented: OK, thinking out loud: provide a `--with-httprl` option that can be empty or be passed a /path/to/module, for people who don't wish to install another module but just want to take advantage of `httprl`'s parallel abilities for the crawler. When `--with-httprl` is empty, it assumes that the module is installed.
Comment #5
mikeytown2 commented: The code for httprl has been fairly stable. Have you thought about implementing an option for using it?
Comment #6
perusio commented: I have, but I've been lacking time :( I recently moved from one country to another and am only now recovering my work rhythm. I'll do it ASAP.
Hopefully there will be a Drupal meetup in April here in Paris, and I will talk about microcaching with `httprl` as an option.
Comment #7
mikeytown2 commented: The latest dev of httprl requires no Drupal bootstrap now.
Comment #7.0
mikeytown2 commented: Add in more things that are in core.