I have a site which is heavily geo-localized. Because of this I'm not using Drupal's standard block or page cache. Each page is served upon request, and I'm using contrib and custom caching techniques to keep my page loads to a reasonable level. Because of this I need to fight for every millisecond in a page request.
I've recently been moving my site off php-memcached and on to Redis, and during this process I've been running some benchmarks with XHProf.
I noticed that Redis (using the PhpRedis extension) appears to be about as fast as memcache, if not faster. That's wonderful, as I get additional features and persistence for free with Redis. One downside I am noticing, though, is a large increase in memory consumption per page request. This doesn't bother me, as memory is cheap.
One area of this redis module that stands out as a place to improve is that Redis_Cache_PhpRedis::get seems to consistently spend about 50% of its time in PHP's unserialize() (at least in PHP 5.3). I've noticed this on pretty much every page request I've tested. If this could be reduced, then Redis would outperform memcache by around 100%.
Example:
Redis_Cache_PhpRedis::get took 84ms and was called 103 times.
unserialize() took 40ms and was called 101 times.
Redis::hGetAll took 43ms and was called 103 times.
In Memcache this page also makes 103 calls, which take roughly 80ms. With memcache I was not using the igbinary protocol, as I believe there was a bug in it somewhere which caused it to perform much worse than without it.
From reading documentation on the subject, though, igbinary_unserialize should be the fastest for unserializing data (not for serializing it, though). The PhpRedis extension also supports igbinary serialization. I'm mostly curious whether anyone else has experience with this subject, and whether it would be a good topic to look into.
https://github.com/nicolasff/phpredis
phpize
./configure [--enable-redis-igbinary]
make && make install
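As a rough sketch of how the igbinary support built above could be picked up from PHP: the Redis::OPT_SERIALIZER option and the Redis::SERIALIZER_* constants are phpredis's own, while the two example_* function names are hypothetical and only for illustration. The sketch assumes the igbinary constant is only defined when the extension was built with --enable-redis-igbinary, and it expects an already connected Redis instance.

```php
<?php
/**
 * Picks the phpredis serializer constant to use.
 *
 * Hypothetical helper, not part of phpredis or the redis module.
 * Returns NULL when the redis extension is not loaded at all.
 */
function example_phpredis_serializer() {
  if (!extension_loaded('redis')) {
    return NULL;
  }
  // Assumption: this constant only exists in builds with igbinary support.
  if (defined('Redis::SERIALIZER_IGBINARY')) {
    return Redis::SERIALIZER_IGBINARY;
  }
  return Redis::SERIALIZER_PHP;
}

/**
 * Tells an already connected Redis instance to serialize values itself.
 */
function example_enable_serializer(Redis $redis) {
  $serializer = example_phpredis_serializer();
  return $redis->setOption(Redis::OPT_SERIALIZER, $serializer);
}

$serializer = example_phpredis_serializer();
echo $serializer === NULL
  ? "redis extension not loaded\n"
  : "phpredis serializer constant: $serializer\n";
```

Note that letting phpredis serialize everything is a different approach from serializing in the module itself, so the two should not be mixed on the same keys.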
Some benchmarks, although the test data is much too small for what we'd be doing in something like EntityCache. We should gather some real-world benchmarks from live sites.
Sample benchmarks and a good read overall:
http://www.niden.net/2011/11/fast-serialization-of-data-in-php-how.html
Curious to hear others chime in on this subject.
--
Unrelated other information on PHP serialization in ApacheSolr module
https://drupal.org/comment/7401672#comment-7401672
| Comment | File | Size | Author |
|---|---|---|---|
| #25 | 2143149_swap_serialization_for_caches.patch | 6.38 KB | mkalkbrenner |
Comments
Comment #1
j0rd commented
Additionally, from looking at the code (and correct me if I'm wrong), it looks like we're holding on to both the serialized and unserialized data in the $cache object.
Would it not save memory to unset($cache->serialized) after unserializing it? I assume it's no longer used.
---
I noticed that PhpRedis does offer various serialization options, and that this module defaults to using a hybrid approach (with the PhpRedis serializer set to none). This would make igbinary a very easy drop-in.
There are some issues when using the igbinary protocol with phpredis, though, which I assume is perhaps one of the reasons the redis module here rolled its own:
https://github.com/nicolasff/phpredis/issues/246
https://github.com/nicolasff/phpredis/issues/81
Since this module only supports the use of basic redis features as a cache store though, I do not believe this is an issue.
---
I'm also hearing rumors, unconfirmed by benchmarks, that PHP 5.4 and 5.5 may have improved serialize performance, and also that igbinary may have problems in 5.4 and 5.5.
---
Additionally, here's @catch on the serialize topic:
[#7571125]
https://drupal.org/comment/7571125#comment-7571125
Comment #2
pounard
Nice, thanks. I should probably make it possible to switch the serialization mechanism in configuration. I should do more reading about this.
Comment #3
pounard
From what I remember, igbinary may cause some trouble because it's not compatible with PHP serialization. PHP serialization is supposedly stable and has been backward compatible across many versions, if I remember correctly.
Comment #4
pounard
Leaving this open for discussion. I need to think about it, play with it and test it, then see if it is doable without too much complexity. The problem is that if we rely on PhpRedis to serialize our data by itself, we must keep if()'s around the place for the case where we don't. This means that get() and set() operations will suffer from more complexity within the backend itself. So every improvement brings a counterpart, which is often complexity and harder maintenance. Need to think about that one.
Comment #5
j0rd commented
As for implementation: the easiest and most maintainable way would be to check a configuration option and then, everywhere this module uses serialize()/unserialize() when storing and getting data from Redis, replace them with igbinary_serialize() and igbinary_unserialize().
I would skip the protocol implementations in PhpRedis, although I'd be curious to see if they make any performance difference.
I'm also curious to see serialize() and unserialize() benchmarks vs igbinary on PHP 5.5, as I haven't seen those, and this may become a non-issue if they're around the same speed.
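The configuration-switch approach from this comment could be sketched roughly as below. The example_* function names are hypothetical and not part of the redis module; igbinary_serialize() and igbinary_unserialize() are the functions provided by the igbinary extension, and the sketch falls back to PHP's native functions when that extension is missing.

```php
<?php
/**
 * Returns TRUE only when igbinary was requested and is actually available.
 *
 * In Drupal 7 the $requested flag might come from a variable_get() call;
 * that variable would be a new, assumed setting.
 */
function example_use_igbinary($requested) {
  return $requested && function_exists('igbinary_serialize');
}

/**
 * Serializes a cache value with the selected serializer.
 */
function example_cache_serialize($data, $use_igbinary) {
  return $use_igbinary ? igbinary_serialize($data) : serialize($data);
}

/**
 * Unserializes a cache value with the selected serializer.
 */
function example_cache_unserialize($blob, $use_igbinary) {
  return $use_igbinary ? igbinary_unserialize($blob) : unserialize($blob);
}

// Round trip with whichever serializer is available on this machine.
$use_igbinary = example_use_igbinary(TRUE);
$blob = example_cache_serialize(array('nid' => 42, 'title' => 'Example'), $use_igbinary);
$data = example_cache_unserialize($blob, $use_igbinary);
```

The two formats are not interchangeable, so the flag has to stay the same for as long as existing cache entries are in Redis.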
Comment #6
mja commented
Hey guys,
I have (unofficially) replaced serialize() and unserialize() in the module code with the igbinary equivalents, and anecdotally I can say that things seem faster. It's worth saying, though, that igbinary wasn't really developed for speed: it was developed to keep serialised data as small as possible, which is helpful when putting large objects in memory.
@pounard is right to be cautious though. If you switch between the PHP version and the igbinary version, you will end up with a useless (and effectively corrupt) cache. You can't even flush the cache using Drupal functions, because they won't find any valid keys. You have to use FLUSHDB from the Redis CLI. So it's not really something I would want to put in the config options, not unless you want a nice fat issue queue to deal with! It might be best as an install-time config option, as j0rd suggests.
Comment #7
j0rd commented
@mja, igbinary was actually developed with "unserialize" speed in mind. It should perform better than standard PHP unserialize, but it can take longer on the serialize side of things. I'm very interested in how it performs with our "Drupal" data on the unserialize speed front.
So to recap: potentially slower cache writes, but cache reads should be faster.
The unserialize millisecond reduction was why I initially brought this up, memory reduction is a bonus, but not what I'm interested in personally.
Would love to see some XHProf screenshots if that would be possible between pre-cached page hits.
To test, find the pages which use the most Redis cache. Turn off "page and block cache" in the performance settings. Hit the page 3 times to prime the caches. Then XHProf this function call both with and without igbinary:
Redis_Cache_PhpRedis::get
I'd also probably attempt to do the same with page & block caches on, as while I'm not interested in those numbers, others might be.
If you don't have time, no problem, but if you did, it would be welcome.
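For a quick check outside XHProf, a standalone micro-benchmark along these lines could compare unserialize speed. This is only a sketch: the example_time_calls() helper and the synthetic $entry array are made up for illustration, the array is merely a stand-in for a large cache entry, and the igbinary run is skipped when the extension is not installed.

```php
<?php
/**
 * Times $iterations calls of $callable($arg), in milliseconds.
 *
 * Hypothetical helper for illustration only.
 */
function example_time_calls($callable, $arg, $iterations) {
  $start = microtime(TRUE);
  for ($i = 0; $i < $iterations; $i++) {
    call_user_func($callable, $arg);
  }
  return (microtime(TRUE) - $start) * 1000.0;
}

// Synthetic stand-in for a large nested cache entry.
$entry = array();
for ($i = 0; $i < 500; $i++) {
  $entry["hook_$i"] = array(
    'function' => "theme_example_$i",
    'variables' => array('a' => NULL, 'b' => $i),
  );
}

$iterations = 200;
$results = array();
$results['php'] = example_time_calls('unserialize', serialize($entry), $iterations);
if (function_exists('igbinary_unserialize')) {
  $results['igbinary'] = example_time_calls('igbinary_unserialize', igbinary_serialize($entry), $iterations);
}

foreach ($results as $name => $ms) {
  printf("%-10s %.2f ms for %d unserialize calls\n", $name, $ms, $iterations);
}
```

Numbers from a script like this only hint at the trend. Profiling real cache entries on a real page request remains the better test.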
PS, here are the benchmarks from this page: http://www.niden.net/2011/11/fast-serialization-of-data-in-php-how.html
Comment #8
mja commented
@j0rd - really interesting stats! Thanks a lot for posting. Great to see json_decode in there too.
Not sure if I will get round to doing something this thorough in the next few days, but will see how it goes.
Comment #9
pounard
json won't keep references and typing information (except for scalar types), so it's not a good alternative for serialization. Even if it seems the fastest one, which is probably due to the type reduction it does and the fact that it will fail on objects, it's not worth it.
igbinary not only saves memory, it also has a greater compression ratio than PHP serialization (which actually doesn't compress anything) and may save *an effing lot* of disk and memory space on the Redis side.
I'd advise anyone who deals with *an effing lot of data* to use it, but you can use it transparently by just telling PHP to use it as the serialization handler, without calling the igbinary functions.
Aside from that, I don't know if PhpRedis actually serializes the data by itself. I hope not, because I don't serialize scalar values myself, since I want to save those few bytes.
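The idea of leaving scalar values unserialized can be sketched as follows. This is an illustration, not the module's actual code: the example_* function names and the 'data' / 'serialized' field names are assumptions chosen for the sketch.

```php
<?php
/**
 * Builds the fields to store for a cache value.
 *
 * Scalars are stored raw with a flag of 0; everything else is serialized.
 */
function example_encode_entry($data) {
  if (is_scalar($data)) {
    return array('data' => $data, 'serialized' => 0);
  }
  return array('data' => serialize($data), 'serialized' => 1);
}

/**
 * Restores the cache value from the stored fields.
 */
function example_decode_entry(array $fields) {
  if (empty($fields['serialized'])) {
    // Stored raw. A real Redis hash hands every field back as a string,
    // so integers, floats and booleans would lose their type here.
    return $fields['data'];
  }
  return unserialize($fields['data']);
}

$raw = example_encode_entry('plain string');
$packed = example_encode_entry(array('a' => 1, 'b' => array(2, 3)));
$restored = example_decode_entry($packed);
```

The flag costs one extra field per entry, but it spares the serialize() and unserialize() calls for plain strings.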
Comment #10
larsdesigns commented
According to this article, Predis is clearly not a good choice: http://alekseykorzun.com/post/53283070010/benchmarking-memcached-and-red...
The obvious reason is that it is written in PHP and not in C, unlike phpredis.
Comment #11
j0rd commented
@larsdesigns, yep. But no one is talking about using Predis in here. So I'm unsure how your comment is valid.
Comment #12
larsdesigns commented
I guess there is no valid point to this thread. So I thought I would add two cents.
Comment #13
vinmassaro commented
@pounard: in #9 when you say "just telling PHP to use as serialization handler without calling the igbinary functions", are you referring to just running phpredis configure with --enable-redis-igbinary? We are going to add Redis to our stack soon in a multi-tenant Drupal environment with a few hundred sites and want to see the best space savings possible.
Comment #14
pounard
@vinmassaro no, I'm saying use the http://pecl.php.net/package/igbinary PHP extension at the PHP level, not only at the Redis level; then everything serialized and unserialized will use igbinary transparently (it is supposed to work this way). See https://github.com/igbinary/igbinary/#how-to-use for documentation. If Redis doesn't use it after that, then yes, enable igbinary in phpredis and retry.
Comment #15
pounard
Hmm, closing the issue; I hope the discussion here was enough for everyone. Feel free to reopen it if you want to continue discussing the topic.
Comment #16
socialnicheguru commented
Is there an advantage to using this module, https://www.drupal.org/project/igbinary_cache?
Comment #17
JvE commented
No, installing the igbinary PHP extension does not make PHP use igbinary transparently, it just makes the igbinary_* functions available.
I changed the serialize and unserialize calls in Cache.php to igbinary_serialize and igbinary_unserialize. This gave me a 10-30% performance increase on most pages.
Profiling with blackfire showed a 50% reduction of the time spent unserializing.
Redis memory usage also decreased by 50%.
I think it may be worth it to have the option available to choose an alternative serializer.
Comment #18
bojanz commented
@JvE
Would be great to know the PHP version tested.
I've managed to find 0 benchmarks that are for modern PHP versions (5.5, 5.6), and I've seen that 5.4 attempted to optimize serialization performance.
Comment #19
JvE commented
I tested on CentOS 6 with the php56u, php56u-pecl-redis and php56u-pecl-igbinary packages from the IUS repository.
Using Redis 3.0.5 from the remi repository.
Edit: I used the script from the mentioned benchmark (https://niden.net/post/fast-serialization-of-data-in-php), removed the json en/decode and ran it on my machine:
I then used something closer to real-life Drupal data and tested it with a theme_registry cache entry:
Comment #20
pounard
@JvE thanks for the details and for correcting my mistakes.
Comment #21
pounard
@JvE did you try setting this in your php.ini file:
Anyway, I don't plan to support igbinary directly in the module.
Comment #22
andypost
I suppose it's better to have this logic in a separate module: https://www.drupal.org/project/igbinary_cache
We just need to make sure that the redis module has support for igbinary, which is needed at least at compile time.
Comment #23
mkalkbrenner
First, sorry, I'm only working with the 8.x version.
I see two problems with the current implementation:
1. igbinary is not used for serialization/unserialization
2. the data doesn't seem to be compressed. Here's an interesting article: http://labs.octivi.com/how-we-cut-down-memory-usage-by-82/
I would like to provide a patch that allows the injection of the Drupal\igbinary\Component\Serialization\IgbinaryCompressSerialize serializer that addresses both problems. (I can also provide just compression for people that don't have the igbinary PHP extension.)
I think that the patch for the redis module should follow the pattern as proposed in #839444: Make serializer customizable for Cache\DatabaseBackend, especially https://www.drupal.org/node/839444#comment-11838553
What do you think about that?
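To illustrate the compression half of this proposal, here is a minimal sketch of a serializer that compresses PHP-serialized data. The ExampleCompressSerialize class is hypothetical and is not the igbinary module's implementation; it assumes PHP's zlib extension is available and uses an arbitrary compression level of 6.

```php
<?php
/**
 * Hypothetical serializer: PHP serialization followed by zlib compression.
 */
class ExampleCompressSerialize {

  /**
   * Serializes and compresses a value.
   */
  public static function encode($data) {
    return gzcompress(serialize($data), 6);
  }

  /**
   * Decompresses and unserializes a value produced by encode().
   */
  public static function decode($raw) {
    return unserialize(gzuncompress($raw));
  }

}

// Repetitive data, as cache entries often are, compresses well.
$value = array_fill(0, 1000, 'repetitive cache payload');
$plain = serialize($value);
$packed = ExampleCompressSerialize::encode($value);

printf("plain: %d bytes, compressed: %d bytes\n", strlen($plain), strlen($packed));
```

Compression trades CPU time on every read and write for memory on the Redis side, so whether it pays off depends on the data.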
Comment #24
pounard
The 7.x version of the module does compression already. I also took the liberty of creating a not yet stable but working backend for Drupal 8 that uses common code for 7 and 8: https://github.com/makinacorpus/redis-bundle. There are a few ideas in this package that Berdir might use for the 8.x version of the module.
Comment #25
mkalkbrenner
Here's a first patch that allows the serializer of the cache backend to be swapped.
Besides igbinary, the igbinary module also provides a compressing PHP serializer that could easily be injected if the igbinary extension isn't available.
Comment #26
berdir
Looks fine to me.
This snippet seems to be repeated quite often; I'm wondering if we can move it to a helper method on the parent?
So you have to override the service definition manually to get igbinary support? Or will the module do that automatically?
Comment #27
mkalkbrenner
I agree, but I think the Predis implementation requires some more refactoring or needs to be removed. It doesn't implement some required abstract methods. Therefore I suggest opening a dedicated issue.
The module already contained a ServiceProvider to do everything magically. But for the same arguments as written in https://www.drupal.org/node/839444#comment-11846090 I removed it for the moment. Currently you need to provide a piece of custom code or your own service.yml.
But the plan is to introduce some settings and to re-add the ServiceProvider. The default serializer for redis will be PHP compressed.
Once done, you can safely depend on the igbinary module if you like. This way you get at least compression even if you don't have the igbinary PHP extension.
But I think the current patch is a first step anyway.
Comment #28
berdir
I think this is fine to commit now based on what we decided in the core issue. Looking forward to trying out igbinary ;)
Comment #30
berdir
I did notice one unfortunate thing: this does break custom bootstrap_container_definition definitions that store the container in Redis.
So if someone is looking for this, this is the new definition:
Need to document this properly, help welcome :)