Hi, does the Redis module offer failover to regular Drupal DB caching if Redis is offline? Thanks!

Comments

pounard’s picture

Category: support » task

Actually it doesn't.

Switching this to task, sounds like a reasonable feature.

rahim123’s picture

Thanks very much for looking into this.

This is definitely not an elegant or recommendable solution, but I ended up using this hack in my settings.php:

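exec("redis-cli -h 127.0.0.1 -p 22253 ping", $output);
// $output stays empty when redis-cli fails, so check the index before comparing.
if (isset($output[0]) && $output[0] == "PONG") {
    // All the code to configure Drupal to use Redis for caching.
}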

Thanks again for the Redis integration for Drupal!

greggles’s picture

@pounard - any sense of how this should be implemented? The solution in #2 seems suboptimal to me since it would do an exec on every page load. At a minimum it seems like a variable should be set/get which controls whether to use Redis or not and, if that variable is disabling the cache, Redis availability could be checked on a certain percentage of page views.

omega8cc’s picture

We do this in the global.inc file included in every settings.php file:

$redis_up = FALSE;
if (file_exists('/var/run/redis.pid')) {
  $redis_up = TRUE;
}

Then:

if ($redis_up) {
    $conf['redis_client_interface'] = 'PhpRedis';
    $conf['redis_client_host'] = '127.0.0.1';
    $conf['redis_client_port'] = '6379';
   // etc.
}

Not ideal, and /var/run path needs to be listed in the open_basedir if used, but it works and file_exists() is cached in PHP (at least when the file exists).

You could use just: if (file_exists('/var/run/redis.pid')) {} instead of if ($redis_up) {}, but we do this to be able to re-use the variable instead of checking the file existence every time.

greggles’s picture

@omega8cc - thanks. So, I guess that means that redis runs on every webserver. Is that right?

omega8cc’s picture

@greggles - Yeah, this check will work only when Redis runs locally. Note that it is important to make sure that Redis is really available, especially when you disable Drupal caching. We have found that all caching may end up effectively disabled (ouch!) when Redis is UP but the password, if one is used, doesn't match. In this case checking just for /var/run/redis.pid is not enough.

I think that the proper failover should periodically (not on every request, but say, every 10 seconds) check Redis response (probably with ping, so it would work also when Redis is on another system/machine) and store the result locally, even as a local pseudo "pid" file, to avoid database overhead.

[EDIT] And this monitor should be just a simple bash script, run from cron, while in PHP we could still just check for the "pid" existence.
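
A minimal sketch of the PHP side of that idea, assuming a hypothetical cron-run monitor that touches a flag file each time Redis answers a ping. The helper name, the flag path and the 30-second freshness window are illustrative assumptions, not part of the Redis module. Checking the file's age, and not only its existence, avoids trusting a flag left behind by a monitor that has itself stopped running.

```php
<?php
/**
 * Returns TRUE when the monitor's flag file exists and was refreshed recently.
 *
 * Hypothetical helper: $flag_file and $max_age are illustrative assumptions.
 * $now can be passed in to make the check easy to exercise in isolation.
 */
function redis_flag_is_fresh($flag_file, $max_age = 30, $now = NULL) {
  $now = ($now === NULL) ? time() : $now;
  if (!is_file($flag_file)) {
    return FALSE;
  }
  $mtime = filemtime($flag_file);
  // A flag older than $max_age seconds means the monitor stopped confirming Redis.
  return $mtime !== FALSE && ($now - $mtime) <= $max_age;
}

// In settings.php: only configure the Redis backend while the flag is fresh.
if (redis_flag_is_fresh('/var/run/redis-ok.flag')) {
  // $conf['redis_client_interface'] = 'PhpRedis';
  // etc.
}
```

As with the pid file check, the flag's directory has to be listed in open_basedir if that restriction is in use.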

pounard’s picture

Re, sorry long time no answer! I'm quite busy these days!

I think the safest implementation would be to nest the Redis backend in a dummy implementation that catches exceptions and acts as a null implementation in case of any error. The problem is that we cannot chain with another backend (the database, for example) because cache entries would get out of sync.

EDIT: The implementation where we ping the Redis server could be a viable alternative, but it cannot work alone: if you use the Redis server after it went down and before you do the ping, you will get exceptions and the site will break. Both need to coexist.

greggles’s picture

EDIT: The implementation where we ping the Redis server could be a viable alternative, but it cannot work alone: if you use the Redis server after it went down and before you do the ping, you will get exceptions and the site will break. Both need to coexist.

This is where my proposal in #3 shines - by using a Drupal variable to indicate whether Redis is working well the variable can be set by code that catches that exception. Then the next request will not attempt to use the Redis cache.

pounard’s picture

Oh okay, I didn't catch that! I still think this code should belong to the code around the cache backend and not the cache backend itself.

omega8cc’s picture

@greggles - But how would you access/read that variable to decide whether to use Redis or not, without bootstrapping Drupal high enough from the settings.php level? It sounds like a rather expensive method.

[EDIT] And then remember also about cache_bootstrap bin.

greggles’s picture

Yep...good point :/

pounard’s picture

True. I'm thinking about writing a generic cache backend decorator that would work with any backend, able to catch exceptions and switch to a null object implementation in case of error. Still not the best solution, but that would definitely be a good failover. The price would be a more complex configuration in the settings.php file, but it could live in its own module and be usable by everyone. What do you think of this solution?

This decorator could be highly configurable, using for example APC or a file to store its configuration locally if available, and do that on a per frontend basis.
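
A rough sketch of what such a decorator could look like, under stated assumptions: the interface below is a cut-down, hypothetical stand-in so that the example runs on its own (a real Drupal 7 version would implement DrupalCacheInterface), and the class name FailSafeCache is made up for illustration. Once the wrapped backend throws, the decorator behaves as a null cache for the rest of the request.

```php
<?php
/**
 * Minimal stand-in for a cache backend interface (hypothetical, for this sketch).
 */
interface SketchCacheInterface {
  public function get($cid);
  public function set($cid, $data);
  public function clear($cid);
}

/**
 * Decorator that degrades to a no-op cache once the wrapped backend fails.
 */
class FailSafeCache implements SketchCacheInterface {
  private $backend;
  private $failed = FALSE;

  public function __construct(SketchCacheInterface $backend) {
    $this->backend = $backend;
  }

  public function get($cid) {
    // FALSE (a cache miss) is what a null cache would always answer.
    return $this->guard(function ($backend) use ($cid) {
      return $backend->get($cid);
    }, FALSE);
  }

  public function set($cid, $data) {
    $this->guard(function ($backend) use ($cid, $data) {
      $backend->set($cid, $data);
    }, NULL);
  }

  public function clear($cid) {
    $this->guard(function ($backend) use ($cid) {
      $backend->clear($cid);
    }, NULL);
  }

  /**
   * Runs $operation against the backend, or returns $fallback after a failure.
   */
  private function guard($operation, $fallback) {
    if ($this->failed) {
      return $fallback;
    }
    try {
      return $operation($this->backend);
    }
    catch (Exception $e) {
      // Stop talking to the broken backend for the rest of this request.
      $this->failed = TRUE;
      return $fallback;
    }
  }
}
```

The failure flag here lives only in memory, so every request pays for one failed call; storing it in APC or a local file, as suggested above, would be the next step.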

bibo’s picture

Sorry to be a possible party pooper, but wouldn't a MySQL cache failover in Drupal's case mean that the cache may contain stale data?

I mean, all cache flushes go to the cache bin backend which happens to be in use at the moment the flushing is executed. If the cache bin(s) change (Redis->MySQL or memcached->MySQL) after a flush, the other cache may contain stale data and lead to pretty much any kind of weird problems. I can only imagine that a secondary cache (probably only MySQL) also needs to be flushed in sync. Or do you know another well performing way out of that?
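
To make the concern concrete, here is a toy illustration in which two plain PHP arrays stand in for the Redis and MySQL cache bins. Nothing here touches Drupal or Redis; the bin contents are made up.

```php
<?php
// Two plain arrays stand in for the cache bins (illustrative only).
$redis_bin = array();
// An entry left over in MySQL from an earlier period when it was the active bin.
$mysql_bin = array('menu' => 'old menu');

// Normal operation: Redis is active, so writes and flushes only reach Redis.
$redis_bin['menu'] = 'new menu';
unset($redis_bin['menu']);

// Redis goes down and settings.php silently falls back to the MySQL bin.
$active_bin = $mysql_bin;
$served = isset($active_bin['menu']) ? $active_bin['menu'] : NULL;
// $served now holds 'old menu': the flush above never reached this bin.
```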

Also, from what I've read, Redis is very stable, and not so easy to bring down, so a fallback might not be that necessary.

pounard’s picture

Back from vacation, time to make this active again.

pounard’s picture

#13 is right. I don't think it's worth trying to implement such a failover by cascading cache backends: the complexity would be too high and normal site runtime would be less performant. IMO the best approach is to either fail or behave as a no-op cache backend. Any suggestions, or does anyone have arguments I didn't think of that would prove me wrong? Has anyone experienced such failures often enough to need a downgrade?

In the meantime, I'll explore ways to downgrade to no cache, to see whether it's easy to make generic and usable with other cache backends.

generalconsensus’s picture

Not sure if it's much help but this works for me in production to gracefully degrade...

$redis = new Redis();
$redis->connect('127.0.0.1', 6379);
if ($redis->IsConnected()) {
  $redis->auth('optional');
  $response = $redis->ping();
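  // strpos() returns 0 (falsy) for a bare 'PONG', and some PhpRedis versions return TRUE from ping().
  if ($response === TRUE || strpos($response, 'PONG') !== FALSE) {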
    $conf['redis_client_interface'] = 'PhpRedis'; //Choose your poison
    $conf['redis_client_host'] = '127.0.0.1';
    $conf['redis_client_port'] = 6379;
    $conf['redis_cache_socket'] = '/tmp/redis.sock';
    $conf['redis_client_password'] = 'optional';
    $conf['cache_prefix'] = $_SERVER['SERVER_NAME'] . '_';
    $conf['cache_backends'][] = 'sites/all/modules/contrib/redis/redis.autoload.inc';
    $conf['cache_class_cache'] = 'Redis_Cache';
    $conf['cache_class_cache_menu'] = 'Redis_Cache';
    $conf['cache_class_cache_bootstrap'] = 'Redis_Cache';
    $conf['lock_inc'] = 'sites/all/modules/contrib/redis/redis.lock.inc';
  }
  $redis->close();
}

omega8cc’s picture

#16 works perfectly for us. Thanks for sharing!

generalconsensus’s picture

@omega8cc Just as an FYI, the client viewing the site might end up having to wait an inordinate amount of time before the site hits the backup database cache. I believe this is due to the PHP Redis driver's timeout...

omega8cc’s picture

Yeah, there is a price to pay in the forced delay caused by the default timeouts, but I think it is OK (and the timeout could probably be lowered), since the failover only needs to fall back to the other caching backend as soon as possible, not instantly. Besides, the SQL backend will initially have cold/empty bins (or worse yet, some garbage left there), so the switch is not going to be a smooth experience no matter what. But that's OK, since it serves the purpose of having one caching backend or the other active ASAP, to avoid having effectively no caching at all if Redis goes down for any reason. And the PING/PONG check is far better and more reliable than my previous dirty and heavy trick with the Redis pid file check.
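
On lowering the timeout: PhpRedis accepts a connection timeout in seconds as the third argument to connect(). A minimal sketch, assuming the PhpRedis extension is loaded and a local Redis on the default port; the 0.5 second value is only an illustration and should be tuned per setup.

```php
<?php
// Assumption: PhpRedis is installed; host, port and the 0.5 s timeout are illustrative.
$redis_up = FALSE;
if (class_exists('Redis')) {
  $redis = new Redis();
  try {
    // Third argument: connection timeout in seconds (float).
    if (@$redis->connect('127.0.0.1', 6379, 0.5)) {
      $response = $redis->ping();
      // Depending on the extension version, ping() yields '+PONG' or TRUE.
      $redis_up = ($response === TRUE || strpos((string) $response, 'PONG') !== FALSE);
      $redis->close();
    }
  }
  catch (Exception $e) {
    // Treat any connection or protocol error as "Redis is down".
    $redis_up = FALSE;
  }
}

if ($redis_up) {
  // $conf['redis_client_interface'] = 'PhpRedis';
  // etc.
}
```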

pounard’s picture

The best approach would probably be a backend chain, using a chain of responsibility pattern for the read operations and a chain of command pattern for the write operations. This way the most viable backend, even if it is not being read, would always be up to date (and in theory we're not supposed to write that often, so it should not be a bottleneck). But that would be an issue for another module to resolve.
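
A small sketch of that chaining idea, with heavy caveats: both class names are hypothetical, the in-memory ArrayBackend exists only so the example runs on its own, and none of this is code from the Redis module. Reads stop at the first backend that answers; writes and clears are sent to every backend.

```php
<?php
/**
 * In-memory backend used only to exercise the sketch (hypothetical).
 */
class ArrayBackend {
  public $items = array();
  public $down = FALSE;

  public function get($cid) {
    $this->check();
    return isset($this->items[$cid]) ? $this->items[$cid] : FALSE;
  }

  public function set($cid, $data) {
    $this->check();
    $this->items[$cid] = $data;
  }

  public function clear($cid) {
    $this->check();
    unset($this->items[$cid]);
  }

  private function check() {
    if ($this->down) {
      throw new Exception('backend unavailable');
    }
  }
}

/**
 * Reads fall through the chain; writes and clears go to every backend.
 */
class ChainedCache {
  private $backends;

  public function __construct(array $backends) {
    $this->backends = $backends;
  }

  public function get($cid) {
    foreach ($this->backends as $backend) {
      try {
        $item = $backend->get($cid);
        if ($item !== FALSE) {
          return $item;
        }
      }
      catch (Exception $e) {
        // This backend is unavailable: let the next one answer.
      }
    }
    return FALSE;
  }

  public function set($cid, $data) {
    foreach ($this->backends as $backend) {
      try {
        $backend->set($cid, $data);
      }
      catch (Exception $e) {
        // Skip the failed backend; the others still receive the write.
      }
    }
  }

  public function clear($cid) {
    foreach ($this->backends as $backend) {
      try {
        $backend->clear($cid);
      }
      catch (Exception $e) {
        // Skip the failed backend; the others are still cleared.
      }
    }
  }
}

$redis_like = new ArrayBackend();
$db_like = new ArrayBackend();
$chain = new ChainedCache(array($redis_like, $db_like));
$chain->set('menu', 'fresh menu');
$redis_like->down = TRUE;
// Answered by the second backend while the first one is down.
$item = $chain->get('menu');
```

Note that a backend which was down during a clear keeps its old entries and will serve them again once it comes back, so the desync problem raised earlier in this thread still has to be handled separately.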

generalconsensus’s picture

One other thing to be aware of for folks with authentication built into their Redis box: you're going to want to set your authentication below the IsConnected method so that you don't get false positives with regard to the ping/pong response, as shown below:

if ($redis->IsConnected()) {
  $redis->auth('whowantstogotothehospital\123111');
  $response = $redis->ping();
  if ($response === TRUE || strpos($response, 'PONG') !== FALSE) {

One more point. If your password has backslashes " \ " you will need to escape them in your:

$conf['redis_client_password'] = 'whowantstogotothehospital\\123111';

vinmassaro’s picture

FWIW, the solution in #16 does not work for me with Redis 7.x-2.11. I'm not sure if this used to work, or just has not really been tested. Here is how I tested:

1. Log into site with Redis enabled, verify keys are created in Redis via redis-cli
2. Stop Redis daemon
3. Refresh page in site and receive PHP Fatal error: Call to undefined function cache_get() in /data01/d7/includes/module.inc on line 723

We are currently testing Redis with a master and a slave, with Redis Sentinel and were hoping that if Redis on the master went down, Drupal would fall back to MySQL until Redis Sentinel promoted the slave to master. Below is our settings.php config:

$redis = new Redis();
$redis->connect($redis_host, 6379);
if ($redis->IsConnected()) {
  $response = $redis->ping();
  if ($response === TRUE || strpos($response, 'PONG') !== FALSE) {
    $conf['redis_client_interface'] = 'PhpRedis'; //Choose your poison
    $conf['redis_client_host'] = $redis_host;
    $conf['redis_client_port'] = 6379;
    $conf['cache_backends'][] = 'sites/all/modules/contrib/redis/redis.autoload.inc';
    $conf['cache_default_class'] = 'Redis_Cache';
    $conf['lock_inc'] = 'sites/all/modules/contrib/redis/redis.lock.inc';
    $conf['path_inc'] = 'sites/all/modules/contrib/redis/redis.path.inc';
 
    // Do not use Redis for cache_form or cache_metatag bins.
    $conf['cache_class_cache_form'] = 'DrupalDatabaseCache';
    $conf['cache_class_cache_metatag'] = 'DrupalDatabaseCache';
  }
  $redis->close();
}

vinmassaro’s picture

Followup to my issue in #22: It turns out that $redis->IsConnected(); returns 1 even if Redis is down, so it goes on to attempt the redis ping which causes a fatal error. I refactored this snippet to connect and then try/catch on the Redis ping which throws an exception if Redis is down. I repeated the test in #1 and saw no white screens or errors when taking Redis down and failing over to MySQL.

$redis = new Redis();
$redis->connect($redis_host, 6379);

try {
  $redis->ping();
}
catch (Exception $e) {
}
if (!isset($e)) {
  $conf['redis_client_interface'] = 'PhpRedis'; //Choose your poison
  $conf['redis_client_host'] = $redis_host;
  $conf['redis_client_port'] = 6379;
  $conf['cache_backends'][] = 'sites/all/modules/contrib/redis/redis.autoload.inc';
  $conf['cache_default_class'] = 'Redis_Cache';
  $conf['lock_inc'] = 'sites/all/modules/contrib/redis/redis.lock.inc';
  $conf['path_inc'] = 'sites/all/modules/contrib/redis/redis.path.inc';

  // Do not use Redis for cache_form or cache_metatag bins.
  $conf['cache_class_cache_form'] = 'DrupalDatabaseCache';
  $conf['cache_class_cache_metatag'] = 'DrupalDatabaseCache';
}

$redis->close();

omega8cc’s picture

I have just tested this again and the recommended failover just works, no extra tricks were required:

  $redis = new Redis();
  $redis->connect('127.0.0.1', 6379);
  if ($redis->IsConnected()) {
    header('X-CONNECTED-Redis: YES');
    $redis->auth('password');
    $response = $redis->ping();
    if ($response === TRUE || strpos($response, 'PONG') !== FALSE) {
      $redis_up = TRUE;
    }
    $redis->close();
  }
  else {
    header('X-CONNECTED-Redis: NO');
  }

While Redis was running, the request generated expected debugging header:

X-CONNECTED-Redis: YES

After shutting down redis-server it automatically switched to mysql backend and updated/activated all involved cache db tables properly, while generating header:

X-CONNECTED-Redis: NO

This means that something else must be incorrect with your setup, configuration or even the PHP extension used, because the logic itself is correct and works as designed. Or rather, this logic works but was previously tested only with a single Redis server. No idea how it may or may not work with Redis multi-instance setups, though.

omega8cc’s picture

I have just tested your method with catch (Exception $e) and it also works with a single Redis instance, as expected. Actually, it doesn't work for me at all after further testing, while the original method works just fine. Probably because there is nothing to catch when Redis is completely down.

vinmassaro’s picture

@omega8cc: You're right, this doesn't work for me now when I test locally. I'm going to do some more testing next week to try to see why we saw this behavior. Thanks for testing.

vinmassaro’s picture

@omega8cc: Ok, we made progress on our issue today. The reason the initial code did not work for us is because we have an F5 load balancer in front of Redis, with Drupal connecting to a virtual IP. When we stopped the Redis daemon, Drupal was still receiving a successful connection response to the load balancer, which explains why I was still receiving 1 from $redis->IsConnected() in #1824146-23: Redis failover?. We're going to see if there's anything we can do about the response from the load balancer, but if not, I think this is why the try/catch code worked for us. Let me know what you think.

vinmassaro’s picture

@omega8cc: We're now using a profile at the F5 that checks all Redis hosts before responding to the client, so we get a proper response from $redis->IsConnected(); now. Thanks for testing since it led us to debugging this issue further!

omega8cc’s picture

Thank you for the update!

pounard’s picture

Status: Active » Closed (won't fix)
Related issues: +#2556097: Support cluster connections

This module will never provide any database fallback: it would be too dangerous to risk reading stale data. Nevertheless, for high-availability configurations, 3.x versions can connect to a sharding proxy transparently and cluster support is planned, see #2556097: Support cluster connections.

socialnicheguru’s picture