I have a site which is heavily geo-localized. Because of this I'm not using Drupal's standard block or page cache. Each page is served upon request, and I'm using contrib and custom caching techniques to keep my page loads to a reasonable level. Because of this I need to fight for every millisecond in a page request.

I've recently been moving my site off php-memcached and on to redis, and during this process I'm running some benchmarks with XHProf.

I noticed that redis (using the PhpRedis extension) appears to be about as fast as memcache, if not faster. That's wonderful, as I get additional features and persistence for free with redis. One downside I am noticing, though, is a large increase in memory consumption per page request. This doesn't bother me, as memory is cheap.

One area of this redis module that stands out as a place to improve is that Redis_Cache_PhpRedis::get seems to consistently spend about 50% of its time in PHP's unserialize() (at least in PHP 5.3). I've noticed this on pretty much every page request I've tested. If this could be reduced, Redis could outperform memcache by up to around 100%.

Example:

Redis_Cache_PhpRedis::get took 84ms and was called 103 times.
unserialize() took 40ms and was called 101 times.
Redis::hGetAll took 43ms and was called 103 times.

With memcache this page also makes 103 calls, which take roughly 80ms in total. With memcache I was not using the igbinary serializer, as I believe there was a bug in it somewhere that made it perform much worse than without it.

From reading documentation on the subject, though, igbinary_unserialize should be the fastest option for unserializing data (not for serializing it, though). The PhpRedis extension also supports igbinary serialization. I'm mostly curious whether anyone else has experience with this subject, and whether it would be a good topic to look into.

https://github.com/nicolasff/phpredis

phpize
./configure [--enable-redis-igbinary]
make && make install

Some benchmarks are linked below, although the test data is much too small for what we'd be doing in something like EntityCache. We should gather some real-world benchmarks from live sites.

Sample benchmarks, and a good read overall:
http://www.niden.net/2011/11/fast-serialization-of-data-in-php-how.html

Curious to hear others chime in on this subject.
--

Unrelated other information on PHP serialization in ApacheSolr module
https://drupal.org/comment/7401672#comment-7401672

Comments

j0rd’s picture

Additionally, from looking at the code (and correct me if I'm wrong), it looks like we're holding on to both the serialized and unserialized data in the $cached object.

    if ($cached->serialized) {
      $cached->data = unserialize($cached->data);
    }

Would it not save memory to unset($cached->serialized) after unserializing? I assume it's no longer used.
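A minimal sketch of that idea, using a hypothetical stdClass entry rather than the module's real cache object. Note that in the snippet above the assignment already replaces the serialized string with the decoded value, so the only thing left to drop afterwards is the small flag:

```php
<?php
// Hypothetical cache entry shaped like the one in the snippet above.
$cached = new stdClass();
$cached->data = serialize(array('title' => 'Example', 'nid' => 42));
$cached->serialized = 1;

if ($cached->serialized) {
  // The serialized string is overwritten by the decoded value here, so only
  // one copy of the payload stays reachable through $cached.
  $cached->data = unserialize($cached->data);
}

// The flag is not needed once the data has been decoded.
unset($cached->serialized);

var_dump($cached);
```

Whether this saves anything measurable would need profiling; the flag itself is only an integer.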

---

I noticed that PhpRedis does offer various serialization options, and that this module defaults to a hybrid approach (its own serialization, with the PhpRedis serializer set to none). This would make igbinary a very easy drop-in.

$redis->setOption(Redis::OPT_SERIALIZER, Redis::SERIALIZER_NONE);   // don't serialize data
$redis->setOption(Redis::OPT_SERIALIZER, Redis::SERIALIZER_PHP);    // use built-in serialize/unserialize
$redis->setOption(Redis::OPT_SERIALIZER, Redis::SERIALIZER_IGBINARY);   // use igBinary serialize/unserialize
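As a rough sketch, a site could check at runtime which of these options is actually available before choosing one. The helper name below is hypothetical, and the assumption that Redis::SERIALIZER_IGBINARY only exists when PhpRedis was built with --enable-redis-igbinary is mine; verify it against your build. Since the module serializes data itself, turning on a PhpRedis serializer as well would presumably encode values twice, so this is only for illustration:

```php
<?php
/**
 * Returns the name of the PhpRedis serializer constant to use, or NULL when
 * the redis extension is not loaded. Hypothetical helper, not module code.
 */
function sketch_pick_phpredis_serializer() {
  if (!extension_loaded('redis')) {
    return NULL;
  }
  // Assumption: this constant is only defined when PhpRedis was compiled
  // with igbinary support.
  if (defined('Redis::SERIALIZER_IGBINARY')) {
    return 'Redis::SERIALIZER_IGBINARY';
  }
  return 'Redis::SERIALIZER_PHP';
}

$name = sketch_pick_phpredis_serializer();
if ($name === NULL) {
  echo "redis extension not loaded\n";
}
else {
  // constant() resolves the class constant by name; the value would then be
  // passed to $redis->setOption(Redis::OPT_SERIALIZER, ...).
  echo "would use $name (value " . constant($name) . ")\n";
}
```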

Here are some issues with using the igbinary serializer in phpredis, though, which I assume is perhaps one of the reasons the redis module here rolled its own:
https://github.com/nicolasff/phpredis/issues/246
https://github.com/nicolasff/phpredis/issues/81

Since this module only supports the use of basic redis features as a cache store though, I do not believe this is an issue.

---

I'm also hearing rumors, not confirmed by benchmarks, that PHP 5.4 and 5.5 may have improved serialize performance, and also that igbinary may have problems in 5.4 and 5.5.

---

Additionally, here's @catch on the serialize topic:
https://drupal.org/comment/7571125#comment-7571125

pounard’s picture

Nice, thanks. I should probably make it possible to switch the serialisation mechanism in configuration. I should do more reading about this.

pounard’s picture

From what I remember, igbinary may cause some trouble because it's not compatible with PHP serialization. PHP serialization is supposedly stable and has been backward compatible across many versions, if I remember correctly.

pounard’s picture

Leaving this open for discussion. I need to think, play with it, test it, then see if it is doable without too much complexity. The problem is that if we rely on PhpRedis to serialize our data by itself, we must keep if()'s around the place for the case where we don't. This means that get() and set() operations will suffer from more complexity within the backend itself. So, every improvement brings a counterpart, which is often complexity and harder maintenance. Need to think about that one.

j0rd’s picture

As for implementation, the easiest and most maintainable way would be to check a configuration option and then, everywhere this module uses serialize/unserialize when storing and getting data from redis, replace those calls with

igbinary_serialize() and igbinary_unserialize()
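A minimal sketch of what that could look like. The function names and the $use_igbinary flag are hypothetical stand-ins for whatever configuration option the module would expose; the code falls back to PHP serialization when the igbinary extension is not loaded:

```php
<?php
/**
 * Encodes a cache value, using igbinary only when it is both requested and
 * available. Hypothetical helper, not part of the redis module.
 */
function sketch_cache_encode($data, $use_igbinary) {
  if ($use_igbinary && function_exists('igbinary_serialize')) {
    return igbinary_serialize($data);
  }
  return serialize($data);
}

/**
 * Decodes a value written by sketch_cache_encode() with the same flag.
 */
function sketch_cache_decode($raw, $use_igbinary) {
  if ($use_igbinary && function_exists('igbinary_unserialize')) {
    return igbinary_unserialize($raw);
  }
  return unserialize($raw);
}

$value = array('nid' => 1, 'tags' => array('a', 'b'), 'score' => 1.5);
$raw = sketch_cache_encode($value, TRUE);
$roundtrip = sketch_cache_decode($raw, TRUE);
var_dump($roundtrip === $value);
```

The flag has to stay the same between writes and reads, otherwise existing entries cannot be decoded.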

I would ignore using the protocol implementations with PHPRedis, although I'd be curious to see if they make any performance impact.

I'm also curious to see serialize() and unserialize() benchmarks vs igbinary on PHP 5.5, as I haven't seen those, and this may become a non-issue if they're around the same speed.

mja’s picture

Hey guys,

I have (unofficially) replaced serialize() and unserialize() in the module code with the igbinary equivalent, and anecdotally I can say that things seem faster. Although it's worth saying that igbinary wasn't really developed for speed - it was developed to keep serialised data as small as possible, which is helpful when putting large objects in memory.

@pounard is right to be cautious though - if you switch between the PHP version and the igbinary version, you will end up with a useless (and effectively corrupt) cache. You can't even flush the cache using Drupal functions, because they won't find any valid keys. You have to use FLUSHDB from the Redis CLI. So it's not really something I would want to put in the config options, not unless you want a nice fat issue queue to deal with! It might be best as an install-time config option, as j0rd suggests.
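To illustrate the mismatch, here is a small sketch of what PHP's unserialize() does when handed data that was not written by serialize(). If the igbinary extension is missing, a hard-coded binary string stands in for an igbinary payload; that stand-in is an assumption for the demo, not real cache data:

```php
<?php
// Use a real igbinary payload when the extension is loaded, otherwise an
// arbitrary binary blob standing in for one (assumption for illustration).
$payload = function_exists('igbinary_serialize')
  ? igbinary_serialize(array('a' => 1))
  : "\x00\x00\x00\x02\x14\x01\x11\x01a\x06\x01";

// unserialize() cannot parse this and returns FALSE; the @ hides the
// notice/warning it raises about the unreadable input.
$result = @unserialize($payload);
var_dump($result); // bool(false)
```

Every entry written with the other serializer would come back like this, which is why the old entries are effectively lost after a switch.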

j0rd’s picture

@mja, igbinary was actually developed with "unserialize" speed in mind. It should perform better than standard PHP unserialize. But it can take longer on the serialize side of things. I'm very interested in how it performs with our "Drupal" data on the unserialize speed front.

So to recap: potentially slower cache writes, but faster cache reads.

The unserialize millisecond reduction was why I initially brought this up, memory reduction is a bonus, but not what I'm interested in personally.

Would love to see some XHProf screenshots if that would be possible between pre-cached page hits.

The way to test would be to find the pages which use the redis cache the most. Turn off "page and block cache" in performance settings. Hit the page 3 times to prime the caches. Then profile this function call with XHProf, both with and without igbinary:

Redis_Cache_PhpRedis::get

I'd also probably attempt to do the same with page & block caches on, as while I'm not interested in those numbers, others might be.
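For a quick check outside XHProf, a rough timing loop along these lines could be used. The sample data and the iteration count are made up for illustration, and the igbinary measurement only runs when the extension is loaded; real cache entries from a live site would give more meaningful numbers:

```php
<?php
/**
 * Times $iterations calls of a decode callable and returns milliseconds.
 * Hypothetical helper for a rough comparison, not a rigorous benchmark.
 */
function sketch_time_decode($decode, $raw, $iterations) {
  $start = microtime(TRUE);
  for ($i = 0; $i < $iterations; $i++) {
    call_user_func($decode, $raw);
  }
  return (microtime(TRUE) - $start) * 1000.0;
}

// Made-up sample standing in for a cached Drupal structure.
$sample = array();
for ($i = 0; $i < 200; $i++) {
  $sample["hook_$i"] = array('template' => "node--$i", 'weight' => $i, 'enabled' => TRUE);
}

$iterations = 1000;
$results = array(
  'unserialize' => sketch_time_decode('unserialize', serialize($sample), $iterations),
);
if (function_exists('igbinary_unserialize')) {
  $results['igbinary_unserialize'] = sketch_time_decode('igbinary_unserialize', igbinary_serialize($sample), $iterations);
}

foreach ($results as $name => $ms) {
  printf("%s: %.3f ms for %d calls\n", $name, $ms, $iterations);
}
```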

If you don't have time, no problem, but if you did, it would be welcome.

PS, here's the benchmarks from the page http://www.niden.net/2011/11/fast-serialization-of-data-in-php-how.html

ENCODING
=======
Shitty Hardware
---------------------
serialize() [all]: Size: 1567 bytes, 42.437592029572 time to encode
json_encode() [all]: Size: 462 bytes, 9.9569129943848 time to encode
igbinary_serialize() [all]: Size: 478 bytes, 18.053789138794 time to encode
=============================================
Good Hardware
--------------------
serialize() [all]: Size: 1092 bytes, 13.614814043045 time to encode
json_encode() [all]: Size: 462 bytes, 7.7341570854187 time to encode
igbinary_serialize() [all]: Size: 478 bytes, 5.6470530033112 time to encode
DECODING
=======
Shitty Hardware
--------------------
unserialize() [all]: 31.01339006424 time to decode
json_decode() [all]: 14.574991941452 time to decode
igbinary_unserialize() [all]: 10.734386920929 time to decode
=====================================
Good Hardware
--------------------
unserialize() [all]: 11.949328899384 time to decode
json_decode() [all]: 9.9836950302124 time to decode
igbinary_unserialize() [all]: 4.4029591083527 time to decode
mja’s picture

@j0rd - really interesting stats! Thanks a lot for posting. Great to see json_decode in there too.

Not sure if I will get round to doing something this thorough in the next few days, but will see how it goes.

pounard’s picture

JSON won't keep references or typing information (except for scalar types), so it's not a good alternative for serialization. Even if it seems to be the fastest one (probably due to the type reduction it does, and the fact that it will fail on objects), it's not worth it.
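A small sketch of the type loss, with a made-up class. In this particular case json_encode() does not raise an error, but the class of the object is gone after the round trip, while serialize()/unserialize() restores it:

```php
<?php
// Made-up value object for the demonstration.
class SketchPoint {
  public $x = 1;
  public $y = 2;
}

$original = array('point' => new SketchPoint(), 'count' => 3);

$via_php = unserialize(serialize($original));
$via_json = json_decode(json_encode($original), TRUE);

var_dump($via_php['point'] instanceof SketchPoint); // bool(true)
var_dump(is_array($via_json['point']));             // bool(true)
```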

igbinary does not only save memory on the PHP side; it also produces much more compact output than PHP serialization (which doesn't compress anything) and may save *a effing lot* of disk and memory space on the Redis side.

I'd advise anyone who deals with an *effing lot of data* to use it, but you can use it transparently by just telling PHP to use it as the serialization handler, without calling the igbinary functions.

Aside from that, I don't know if PhpRedis actually serializes the data by itself. I hope not, because I don't serialize scalar values myself, precisely to save those few bytes.

larsdesigns’s picture

According to this article: Predis is clearly not a good choice: http://alekseykorzun.com/post/53283070010/benchmarking-memcached-and-red...

The obvious reason is that it is written in PHP and not C such as phpredis.

j0rd’s picture

@larsdesigns, yep. But no one is talking about using Predis in here. So I'm unsure how your comment is valid.

larsdesigns’s picture

I guess there is no valid point to this thread. So I thought I would add two cents.

vinmassaro’s picture

@pounard: in #9 when you say "just telling PHP to use as serialization handler without calling the igbinary functions", are you referring to just running phpredis configure with --enable-redis-igbinary? We are going to add Redis to our stack soon in a multi-tenant Drupal environment with a few hundred sites and want to see the best space savings possible.

pounard’s picture

@vinmassaro no, I'm saying use the http://pecl.php.net/package/igbinary PHP extension at the PHP level, not only at the Redis level; then everything serialized and unserialized will use igbinary transparently (it is supposed to work this way). See https://github.com/igbinary/igbinary/#how-to-use for documentation. If Redis doesn't use it after that, then yes, enable igbinary in PhpRedis and retry.

pounard’s picture

Status: Active » Closed (works as designed)

Hum closing the issue, hope the discussion here was enough for everyone. Feel free to open it back if you want to continue to discuss the topic.

socialnicheguru’s picture

Is there any advantage to using this module, https://www.drupal.org/project/igbinary_cache?

JvE’s picture

Status: Closed (works as designed) » Active

everything serialized and unserialized will use igbinary transparently

No, installing the igbinary PHP extension does not make PHP use igbinary transparently; it just makes the igbinary_* functions available.
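This is easy to check: serialize() keeps producing the plain PHP text format whether or not the extension is loaded. A minimal sketch:

```php
<?php
$value = array('a' => 1);

// Plain PHP serialization format, regardless of installed extensions.
$php = serialize($value);
var_dump($php === 'a:1:{s:1:"a";i:1;}'); // bool(true)

if (function_exists('igbinary_serialize')) {
  // The igbinary output is a different, binary format that has to be
  // requested explicitly through the igbinary_* functions.
  var_dump(igbinary_serialize($value) === $php); // bool(false)
}
```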

I changed the serialize and unserialize calls in Cache.php to igbinary_serialize and igbinary_unserialize.
This gave me a 10-30% performance increase on most pages.
Profiling with blackfire showed a 50% reduction of the time spent unserializing.
Redis memory usage also decreased by 50%.

I think it may be worth it to have the option available to choose an alternative serializer.

bojanz’s picture

@JvE
Would be great to know the PHP version tested.
I've managed to find 0 benchmarks that are for modern PHP versions (5.5, 5.6), and I've seen that 5.4 attempted to optimize serialization performance.

JvE’s picture

I tested on CentOS 6 with the php56u, php56u-pecl-redis and php56u-pecl-igbinary packages from the IUS repository.
Using Redis 3.0.5 from the remi repository.

Edit: I used the script from the mentioned benchmark (https://niden.net/post/fast-serialization-of-data-in-php), removed the json en/decode and ran it on my machine:

serialize() [strings]: Size: 105 bytes, 2.0046539306641 time to encode
igbinary_serialize() [strings]: Size: 64 bytes, 2.3350989818573 time to encode
====================
serialize() [integers]: Size: 121 bytes, 2.9640078544617 time to encode
igbinary_serialize() [integers]: Size: 58 bytes, 1.8331098556519 time to encode
====================
serialize() [booleans]: Size: 62 bytes, 2.1183650493622 time to encode
igbinary_serialize() [booleans]: Size: 27 bytes, 1.5669319629669 time to encode
====================
serialize() [floats]: Size: 307 bytes, 9.000009059906 time to encode
igbinary_serialize() [floats]: Size: 142 bytes, 1.9521350860596 time to encode
====================
serialize() [mixed]: Size: 105 bytes, 3.1574158668518 time to encode
igbinary_serialize() [mixed]: Size: 50 bytes, 1.9139108657837 time to encode
====================
serialize() [objects]: Size: 326 bytes, 4.6532421112061 time to encode
igbinary_serialize() [objects]: Size: 177 bytes, 3.9466071128845 time to encode
====================
serialize() [all]: Size: 1092 bytes, 19.418648004532 time to encode
igbinary_serialize() [all]: Size: 478 bytes, 9.3856239318848 time to encode
====================
=:==:==:==:==:==:==:==:==:==:==:==:==:==:==:==:==:==:==:==:=
unserialize() [strings]: 1.9994440078735 time to decode
igbinary_unserialize() [strings]: 1.5736892223358 time to decode
====================
unserialize() [integers]: 2.6240360736847 time to decode
igbinary_unserialize() [integers]: 1.7326591014862 time to decode
====================
unserialize() [booleans]: 1.9784460067749 time to decode
igbinary_unserialize() [booleans]: 1.4351291656494 time to decode
====================
unserialize() [floats]: 8.6046590805054 time to decode
igbinary_unserialize() [floats]: 1.780898809433 time to decode
====================
unserialize() [mixed]: 3.55735206604 time to decode
igbinary_unserialize() [mixed]: 1.6083300113678 time to decode
====================
unserialize() [objects]: 5.1788020133972 time to decode
igbinary_unserialize() [objects]: 3.7346758842468 time to decode
====================
unserialize() [all]: 18.727533102036 time to decode
igbinary_unserialize() [all]: 6.7537951469421 time to decode
====================

I then used something closer to real-life Drupal data and tested it with a theme_registry cache entry:

serialize() [reg]: Size: 53983 bytes, 3.2227749824524 time to encode
igbinary_serialize() [reg]: Size: 30422 bytes, 4.3486812114716 time to encode
=:==:==:==:==:==:==:==:==:==:==:==:==:==:==:==:==:==:==:==:=
unserialize() [reg]: 6.0846838951111 time to decode
igbinary_unserialize() [reg]: 3.6590559482574 time to decode
pounard’s picture

@JvE thanks for the details and correcting my mistakes.

pounard’s picture

Status: Active » Postponed (maintainer needs more info)

@JvE did you try setting this into your php.ini file:

; Use igbinary as session serializer
session.serialize_handler=igbinary

Anyway, I don't plan to support igbinary directly in the module.

andypost’s picture

Version: 7.x-2.5 » 7.x-3.x-dev

I suppose it's better to have this logic in a separate module: https://www.drupal.org/project/igbinary_cache
We just need to make sure that the redis module has the igbinary support that is needed, at least at compile time.

mkalkbrenner’s picture

Version: 7.x-3.x-dev » 8.x-1.x-dev
Status: Postponed (maintainer needs more info) » Active
Related issues: +#839444: Make serializer customizable for Cache\DatabaseBackend

First, sorry I'm only working with the 8.x version.

I see two problems with the current implementation:
1. igbinary is not used for serialization /unserialization
2. the data doesn't seem to be compressed. Here's an interesting article: http://labs.octivi.com/how-we-cut-down-memory-usage-by-82/

I would like to provide a patch that allows the injection of the Drupal\igbinary\Component\Serialization\IgbinaryCompressSerialize serializer that addresses both problems. (I can also provide just compression for people that don't have the igbinary php extension.)

I think that the patch for the redis module should follow the pattern as proposed in #839444: Make serializer customizable for Cache\DatabaseBackend, especially https://www.drupal.org/node/839444#comment-11838553

What do you think about that?

pounard’s picture

The 7.x version of the module does compression already. I also took the liberty of creating a not yet stable but working backend for Drupal 8 that uses common code for 7 and 8: https://github.com/makinacorpus/redis-bundle. There are a few ideas in this package that Berdir might use for the 8.x version of the module.

mkalkbrenner’s picture

Status: Active » Needs review

Here's a first patch that allows the serializer of the cache backend to be swapped.
Besides igbinary, the igbinary module also provides a compressing PHP serializer that could easily be injected if the igbinary extension isn't available.
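To make the idea concrete, here is a self-contained sketch of two interchangeable serializers, one plain and one compressing. The interface and class names are hypothetical and are not the ones from the patch, the igbinary module, or Drupal core; the compressing variant assumes PHP's zlib extension (gzcompress/gzuncompress) is available:

```php
<?php
// Hypothetical interface; only the encode()/decode() pair matters here.
interface SketchSerializer {
  public function encode($data);
  public function decode($raw);
}

class SketchPhpSerializer implements SketchSerializer {
  public function encode($data) {
    return serialize($data);
  }
  public function decode($raw) {
    return unserialize($raw);
  }
}

// Same PHP serialization, but compressed with zlib before storage.
class SketchCompressedPhpSerializer implements SketchSerializer {
  public function encode($data) {
    return gzcompress(serialize($data));
  }
  public function decode($raw) {
    return unserialize(gzuncompress($raw));
  }
}

// Made-up, repetitive entry standing in for something like a registry cache.
$entry = array_fill(0, 200, array('theme' => 'bartik', 'template' => 'node'));

$sizes = array();
foreach (array(new SketchPhpSerializer(), new SketchCompressedPhpSerializer()) as $serializer) {
  $raw = $serializer->encode($entry);
  $sizes[get_class($serializer)] = strlen($raw);
  printf("%s: %d bytes, round trip ok: %s\n", get_class($serializer), strlen($raw), var_export($serializer->decode($raw) === $entry, TRUE));
}
```

Because the cache backend only calls encode() and decode(), either object can be handed to it without further changes.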

berdir’s picture

Looks fine to me.

+++ b/src/Cache/PhpRedis.php
@@ -196,7 +204,7 @@ class PhpRedis extends CacheBase {
     // Let Redis handle the data types itself.
     if (!is_string($data)) {
-      $hash['data'] = serialize($data);
+      $hash['data'] = $this->serializer->encode($data);
       $hash['serialized'] = 1;
     }
     else {

This snippet seems to be repeated quite often; I'm wondering if we can move it to a helper method on the parent?
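Such a helper could look roughly like the sketch below. All class and method names are hypothetical, the serializer is a trivial stand-in, and the value stored for 'serialized' in the string case is an assumption, since the else branch is not visible in the diff:

```php
<?php
// Trivial stand-in for the injected serializer.
class SketchPlainSerializer {
  public function encode($data) {
    return serialize($data);
  }
  public function decode($raw) {
    return unserialize($raw);
  }
}

// Hypothetical parent class holding the repeated encode/decode logic once.
abstract class SketchCacheBase {
  protected $serializer;

  public function __construct($serializer) {
    $this->serializer = $serializer;
  }

  // Builds the data part of the hash to store: strings are kept as-is,
  // everything else goes through the serializer and is flagged.
  protected function encodeData($data) {
    if (!is_string($data)) {
      return array('data' => $this->serializer->encode($data), 'serialized' => 1);
    }
    // Assumption: strings are stored unflagged.
    return array('data' => $data, 'serialized' => 0);
  }

  // Reverses encodeData() for a hash read back from storage.
  protected function decodeData(array $hash) {
    return $hash['serialized'] ? $this->serializer->decode($hash['data']) : $hash['data'];
  }
}

// In-memory backend, only here to exercise the helpers.
class SketchArrayBackend extends SketchCacheBase {
  protected $store = array();

  public function set($cid, $data) {
    $this->store[$cid] = $this->encodeData($data);
  }

  public function get($cid) {
    return isset($this->store[$cid]) ? $this->decodeData($this->store[$cid]) : FALSE;
  }
}

$backend = new SketchArrayBackend(new SketchPlainSerializer());
$backend->set('plain', 'just a string');
$backend->set('structured', array('a' => 1));
var_dump($backend->get('plain'), $backend->get('structured'));
```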

+++ b/redis.services.yml
@@ -1,7 +1,7 @@
     class: Drupal\redis\Cache\CacheBackendFactory
-    arguments: ['@redis.factory', '@cache_tags.invalidator.checksum']
+    arguments: ['@redis.factory', '@cache_tags.invalidator.checksum', '@serialization.phpserialize']
   redis.factory:

So you have to override the service definition manually to get igbinary support? Or will the module do that automatically?

mkalkbrenner’s picture

This snippet seems to be repeated quite often; I'm wondering if we can move it to a helper method on the parent?

I agree, but I think the Predis implementation requires some more refactoring or needs to be removed. It doesn't implement some required abstract methods. Therefore I suggest opening a dedicated issue.

So you have to override the service definition manually to get igbinary support? Or will the module do that automatically?

The module already contained a ServiceProvider to do everything magically. But for the same arguments as written in https://www.drupal.org/node/839444#comment-11846090 I removed it for the moment. Currently you need to provide a piece of custom code or your own service.yml.
But the plan is to introduce some settings and to re-add the ServiceProvider. The default serializer for redis will be PHP compressed.
Once done, you can safely depend on the igbinary module if you like. This way you get at least compression even if you don't have the igbinary PHP extension.
But I think the current patch is a first step anyway.

berdir’s picture

Status: Needs review » Fixed

I think this is fine to commit now based on what we decided in the core issue. Looking forward to trying out igbinary ;)

  • Berdir committed cca69ab on 8.x-1.x authored by mkalkbrenner
    Issue #2143149 by mkalkbrenner: PHPRedis and igbinary support for read...
berdir’s picture

I did notice one unfortunate thing, this does break custom bootstrap_container_definition definitions to store the container in redis.

So if someone is looking for this, this is the new definition:

  $settings['bootstrap_container_definition'] = [
    'parameters' => [],
    'services' => [
      'redis.factory' => [
        'class' => 'Drupal\redis\ClientFactory',
      ],
      'cache.backend.redis' => [
        'class' => 'Drupal\redis\Cache\CacheBackendFactory',
        'arguments' => ['@redis.factory', '@cache_tags_provider.container', '@serialization.phpserialize'],
      ],
      'cache.container' => [
        'class' => '\Drupal\redis\Cache\PhpRedis',
        'factory' => ['@cache.backend.redis', 'get'],
        'arguments' => ['container'],
      ],
      'cache_tags_provider.container' => [
        'class' => 'Drupal\redis\Cache\RedisCacheTagsChecksum',
        'arguments' => ['@redis.factory'],
      ],
      'serialization.phpserialize' => [
        'class' => 'Drupal\Component\Serialization\PhpSerialize',
      ],
    ],
  ];

Need to document this properly, help welcome :)

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.