yes, can we have this please? we hit a bug in the latest php module, where keys passed to memcache greater than 250 are causing php to segfault. that's a bug a in the C module, but passing anything longer than 250 is a bug in this module.
chx's patch looks good to me, and we're using it in production.
Posted by Swampcritter on September 2, 2009 at 1:48pm
Any idea on when this patch can be committed w/ justinrandell's modification? I have it deployed for a couple of smaller production sites and so far it has definitely fixed our memcache long key problem.
Posted by ball.in.th on October 27, 2009 at 3:19pm
Hi,
Is there a particular reason why the $full_key has to be urlencode()? Does memcache support binary key? In fact, urlencoding could make the key a lot longer. For example, if the key is a string "http://drupal.org/สวัสดีดรูพัลด้วยภาษาไทยก็ยาวเกินแล้ว", urlencode it would make it become "http://drupal.org/%E0%B8%AA%E0%B8%A7%E0%B8%B1%E0%B8%AA%E0%B8%94%E0%B8%B5%E0%B8%94%E0%B8%A3%E0%B8%B9%E0%B8%9E%E0%B8%B1%E0%B8%A5%E0%B8%94%E0%B9%89%E0%B8%A7%E0%B8%A2%E0%B8%A0%E0%B8%B2%E0%B8%A9%E0%B8%B2%E0%B9%84%E0%B8%97%E0%B8%A2%E0%B8%81%E0%B9%87%E0%B8%A2%E0%B8%B2%E0%B8%A7%E0%B9%80%E0%B8%81%E0%B8%B4%E0%B8%99%E0%B9%81%E0%B8%A5%E0%B9%89%E0%B8%A7" totaling 342 characters.
Here is the patch with sha1. I don't think we need to go sha256 and risk more CPU usage for an operation that happens for every cache_get/set. Unless someone benchmarks them and finds the difference to be marginal.
Comments
#1
related to #355299: errors like Failed to set key: cache_content-content%3A148498%3A148498
#2
yes, can we have this please? we hit a bug in the latest php module, where keys passed to memcache greater than 250 are causing php to segfault. that's a bug a in the C module, but passing anything longer than 250 is a bug in this module.
chx's patch looks good to me, and we're using it in production.
#3
woops, that patch should be:
<?phpreturn strlen(urlencode($full_key)) > 250 ? $prefix . $bin . '-' . md5($key) : urlencode($full_key);
?>
because the urlencode'd key can be longer than the straight full key.
#4
Any idea on when this patch can be committed w/ justinrandell's modification? I have it deployed for a couple of smaller production sites and so far it has definitely fixed our memcache long key problem.
-- M
#5
Hi,
Is there a particular reason why the $full_key has to be urlencode()? Does memcache support binary key? In fact, urlencoding could make the key a lot longer. For example, if the key is a string "http://drupal.org/สวัสดีดรูพัลด้วยภาษาไทยก็ยาวเกินแล้ว", urlencode it would make it become "http://drupal.org/%E0%B8%AA%E0%B8%A7%E0%B8%B1%E0%B8%AA%E0%B8%94%E0%B8%B5%E0%B8%94%E0%B8%A3%E0%B8%B9%E0%B8%9E%E0%B8%B1%E0%B8%A5%E0%B8%94%E0%B9%89%E0%B8%A7%E0%B8%A2%E0%B8%A0%E0%B8%B2%E0%B8%A9%E0%B8%B2%E0%B9%84%E0%B8%97%E0%B8%A2%E0%B8%81%E0%B9%87%E0%B8%A2%E0%B8%B2%E0%B8%A7%E0%B9%80%E0%B8%81%E0%B8%B4%E0%B8%99%E0%B9%81%E0%B8%A5%E0%B9%89%E0%B8%A7" totaling 342 characters.
#6
Hi,
I've tested memcache with keys in utf8 and it worked fine. So, there's no need to urlencode the key at all.
<?php
$memcache = new Memcache;
$memcache->connect("localhost", 11211);
$tmp_object = new stdClass;
$tmp_object->str_attr = "test";
$tmp_object->int_attr = 123;
$key = "blahชุมชนคอบอลอันดับ 1";
$memcache->set($key, $tmp_object, false, 10);
echo "Data from the cache:<br />\n";
var_dump($memcache->get($key));
?>
#7
subscribing
#8
Subscribing.
#9
This is a problem.
For the long explanation see very long URL aliases not correctly cached in memcache.
I created a patch in #756926: Extra long URLs combined with unicode causes cid clashes and use it successfully on a large site. The other issue is marked as duplicate of this one.
I am attaching it here again. It is in use on a large site. That site has no prefixes for memcache though.
#10
kbahey, why do you urlencode() before md5()ing? it doesn't seem like there's any need.
#11
No good reason really other than it was there in the first place. I just took an md5 of whatever was there.
We could take out the url encoding with no ill effect.
What I want people to test really is the prefix stuff. If someone is using prefixes, then can they check if this works or not?
#12
Here is a better patch.
1. It does not urlencode the URL anymore, it is not needed.
2. It preserves the prefix and bin as they are without URL encoding.
Can someone give it a spin please?
#13
md5 is not necessarily the most robust hash function - at least use sha1, better sha256.
#14
Here is the patch with sha1. I don't think we need to go sha256 and risk more CPU usage for an operation that happens for every cache_get/set. Unless someone benchmarks them and finds the difference to be marginal.
#15
Slightly modified version committed to the 7.x branch:
http://drupal.org/cvs?commit=359644
#16
Also committed to the 6.x branch:
http://drupal.org/cvs?commit=359646
#17
Automatically closed -- issue fixed for 2 weeks with no activity.