Hi there,

Could anyone recommend an approach to warming the Authcache cache for each role? Since each role has a separate cache, how should this be done? Any help would be appreciated.

John

Comments

johnald’s picture

I would also like to ask whether it is possible to add data to a specific role's Authcache cache through PHP, more specifically through a web crawler, as suggested here: https://drupal.org/node/1681700
Which function (either built-in or from the Authcache module) would allow me to do this?

znerol’s picture

Please take a look at the Cache Warmer drush extension. It supports crawling with authenticated users.

The most prominent problem I see at the moment is that Authcache will not save a page to the cache if the has_js cookie is not present in the request. However, it is possible to simply pretend that the cookie is there by directly manipulating the $_COOKIE superglobal, e.g. in settings.php (see also #2092185: no_cache_reason: "Caching disabled for session (nocache cookie set)"):

if ($_SERVER['REMOTE_ADDR'] == "your.crawler.ip.here") {
  $_COOKIE['has_js'] = '1';
}

jason.fisher’s picture

With that has_js tip, I am successfully using HTTPRL Spider now to crawl as a masqueraded user. It is not necessarily secure (currently relying only on ip_address() matching 127.0.0.1), but it works.

https://drupal.org/node/2172933

Anonymous’s picture

Version: 6.x-1.0-rc1 » 7.x-1.x-dev
Priority: Major » Normal
Status: Active » Needs review

I put together a couple of shell scripts that use wget and an XML sitemap generated by the XML Sitemap module to crawl and cache Authcache pages for both anonymous and authenticated users. I created an account (named 'authentic') specifically for the authenticated-user script. I run both scripts daily from cron. This site only uses https, so to warm the authenticated page cache for http you would also need to request those pages over http. In addition, if you have multiple combinations of user roles, you will need to repeat this for each combination, because Authcache prefixes cached pages with a unique ID for each combination of roles.

For anonymous users:

#!/bin/sh

export DISPLAY=:0

/usr/local/bin/wget -q https://example.com/sitemap.xml --no-cache -O - | grep -Eo "https://example\.com[^<]+" | /usr/local/bin/wget --header "Cookie: has_js=1" -U "cachewarmer" -q -i - -O /dev/null --wait 1

For authenticated users:

#!/bin/sh

export DISPLAY=:0

site=https://example.com/
name=authentic
pass='<password>'
cookies=/var/www/example/example_com-cron-cookies.txt

touch /var/www/example/example_com-ba-cookies.txt
touch /var/www/example/example_com-cron-cookies.txt
chmod 0600 /var/www/example/example_com-ba-cookies.txt
chmod 0600 /var/www/example/example_com-cron-cookies.txt

/usr/local/bin/wget -O /dev/null --save-cookies /var/www/example/example_com-ba-cookies.txt --keep-session-cookies --load-cookies $cookies "${site}user"
/usr/local/bin/wget --keep-session-cookies --save-cookies $cookies --load-cookies $cookies -O /dev/null \
        --post-data="name=$name&pass=$pass&op=Log%20in&form_id=user_login" \
        "${site}user"
		
sess1=$(grep SSESS $cookies | awk -F '\t' '{ print $6}')
sess2=$(grep SSESS $cookies | awk -F '\t' '{ print $7}')
cookieheader=$(print "Cookie: has_js=1;$sess1=$sess2")

/usr/local/bin/wget -q --keep-session-cookies --save-cookies "$cookies" --load-cookies "$cookies" https://example.com/sitemap.xml --no-cache -O - | grep -Eo "https://example\.com[^<]+" | /usr/local/bin/wget --header "$cookieheader" -U "cachewarmer" -q -i - -O /dev/null --wait 1

I hope this will help somebody. Perhaps a script like this can be incorporated into Authcache?
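Since the crawl has to be repeated for each combination of roles, the cookie and sitemap handling could be factored into small helpers. The following is only a sketch under stated assumptions: the function names session_cookie_header and sitemap_urls are hypothetical (they are not part of Authcache or wget), the cookie file is assumed to be in the tab-separated Netscape format that wget --save-cookies writes, and the sitemap is assumed to list pages in <loc> elements.

```shell
#!/bin/sh

# Hypothetical helper: print a Cookie header containing has_js=1 and the
# first Drupal session cookie (name starting with SESS or SSESS) found in
# the Netscape-format cookie file given as $1.
# Field 6 is the cookie name, field 7 its value.
session_cookie_header() {
  awk -F '\t' '$6 ~ /^S?SESS/ {
    printf "Cookie: has_js=1; %s=%s\n", $6, $7
    exit
  }' "$1"
}

# Hypothetical helper: read sitemap XML on stdin and print one URL per
# line, taken from the <loc> elements.
sitemap_urls() {
  grep -Eo '<loc>[^<]+</loc>' | sed -e 's/<loc>//' -e 's/<\/loc>//'
}
```

With one login account and one cookie file per role combination, the output of session_cookie_header could be passed to wget --header, and the output of sitemap_urls could be piped into wget -i - in the same way as in the scripts above.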

NWOM’s picture

HTTPRL Spider in combination with XMLRPC Page Load allows you to crawl all entities on a site with any user ID of your choosing. You can also define non-entity paths to crawl, as discussed here: #2684849: Crawl Pages with specific Views Exposed Filters active

I also added a feature request to potentially allow dynamic caching based on site usage here: #2684945: Provide Cache Miss View (Uncached Pages)

NWOM’s picture

Version: 7.x-1.x-dev » 7.x-2.x-dev
Status: Needs review » Active

Setting to Active, since no patch is attached.

vadym.kononenko’s picture

As far as I can see, 'XMLRPC Page Load' cannot warm the cache, because xmlrpc() only uses the POST method and Authcache excludes POST requests from caching.

See: xmlrpc() -> _xmlrpc()

  $options['method'] = 'POST';

See: authcache_authcache_request_exclude()
See: authcache_builtin_cacheinc_retrieve_cache_page()

  // Only GET and HEAD requests allowed.
  if (!($_SERVER['REQUEST_METHOD'] === 'GET' || $_SERVER['REQUEST_METHOD'] === 'HEAD')) {
    return FALSE;
  }

firewaller’s picture

The Force JS module does what comment #2 suggests, and it works for me when using Cache Warmer.

firewaller’s picture

One caveat with using the Cache Warmer module (and any other drush-based cache warming systems) is that Authcache skips active cache backends for drush commands. When using authcache_debug, after I run the cache-warmer drush command I get in the authcache_debug dblog: "Excluded: No active cache backend."

Upon looking for the above error in Authcache, I see that the reason there is no active cache backend is due to the module ignoring drush entirely:

/**
 * Initialize the cache backend module.
 */
function authcache_backend_init($module, $vary_header, $initial_key) {
  if (drupal_is_cli()) {
    return FALSE;
  }
  ...
}

It makes sense that Authcache is not initialized during normal drush commands, but is it possible to implement a bypass specifically for cache warming drush commands?

znerol’s picture

Status: Active » Closed (outdated)

No activity, closing this.