I'm running cron every 15 minutes to pick up new changes, and for any user that has a tweet in the last 24 hours I'm getting a duplicate key error for the most recent tweet. The error looks as follows:
Duplicate entry '1999325571842000' for key
'PRIMARY'
query: INSERT INTO twitter (twitter_id, screen_name, created_at,
created_time, text, source, truncated) VALUES (1.999325571842E+15,
'sdague', 'Tue Nov 09 14:07:31 +0000
2010', 1289311651, 'I love poll questions in classes that
are designed to drive everyone to the right answer. The wrong answers are
always funny.', '<a
href=\"http://microblog-purple.googlecode.com\"
rel=\"nofollow\">mbpidgin</a>',
0) in /data/site/mhvlug.org/html/includes/common.inc on line 3528.
This is generating a lot of errors in my logs and my email.
| Comment | File | Size | Author |
|---|---|---|---|
| #14 | twitter-duplicate_key_error-966748-14.patch | 605 bytes | agileware |
Comments
Comment #1
bugster commentedDitto problem,
twitter_save_status test query if the row exits does return 1 for the id (1.999325571842E+15) <-using the reporter's id.
So somewhere in the cron flow this test fails and the status is saved normally?
Comment #2
Drupal Centric commentedI'm also getting a duplicate key error:
Duplicate entry '2340839216906200'; for key 1 query: INSERT INTO twitter (twitter_id, screen_name, created_at, created_time, text, source, truncated) VALUES.....
in /includes/common.inc on line 3528.
Comment #3
protoplasm commentedSame issue exactly as #2 on line 3528. It is also breaking cron.
I emptied the twitter table and even when the tweets regenerate, the same 3 tweets remain duplicates. These are not the last three ids but 3 different ones in the table-with large id intervals between them.
Comment #4
nosweat commentedI think I was experiencing the same issue. I found this:
twitter.inc on line 71
changed into
since $status->id is a float, not an int ?
What do u guys think?
Comment #5
protoplasm commentedBingo!!
Of course, that makes total sense.
This time I ran cron and no errors. Thank you so much.
I'll continue to watch, but I suspect that this is the fix.
Comment #6
Drupal Centric commentedYep #4 works for me :o)
Comment #7
xurizaemonI'm not sure %f is the correct fix, because it looks like it's not actually storing accurate status IDs in the local storage.
We'll get conflicts (eventually, current odds are 1:10,000 for each status by the look of things?).
Using %f (or '%s') in twitter_status_save's duplicate check I see a match, but I think it's wrong behaviour still.
Using %d in same I see -
IMO we should see 9403130370330624 in both queries, not just the first one.
Currently this is {twitter}.twitter_id -
Comment #8
xurizaemonI've opened a separate issue regarding my comment above.
#985544: {twitter}.twitter_id incompletely stored (final digits are zeroes) due to json_decode limitation in PHP<5.3
Comment #9
Gilneas commentedSubscribing same issue as explained in header topic.
Comment #10
kjholla commentedLook through my comments: #5 and #8 in the thread #985544: {twitter}.twitter_id incompletely stored (final digits are zeroes) due to json_decode limitation in PHP<5.3.
I think, with the modifications to twitter.lib.php that I have provided in that thread, this Duplicate key problem should be solved too.
Regards,
KH
Comment #11
xurizaemonUploaded a patch of kjholla's work @ #985544: {twitter}.twitter_id incompletely stored (final digits are zeroes) due to json_decode limitation in PHP<5.3. Please test & report back there.
Comment #12
wiifmPatch #4 worked for me and at least removed the nasty PHP error on cron run
Comment #13
Gilneas commentedThe patch works for me, too.
Comment #14
agileware commentedHere is a patch of the solution in #4.
It will apply to 6.x-3.0-beta3 and 6.x-3.x-dev cleanly.
It also isn't CNW anymore as grobot said his problem in #7 is a different issue (see #8).
The patch fixes the problem for me too.
Comment #15
hedac commentedthank you #4 !
Comment #16
karensmallwood commented#4 thank you :-)
Comment #17
Anonymous (not verified) commentedSweet! #4 totally fixed this issue for me.
Comment #18
WickedJ commentedThx, works fine.
Comment #19
raintonr commentedString, not float?
Re #4... I would suggest treating the ID as a string (%s) rather than a float (%f) would be safer. As one doesn't have to mathematically manipulate these IDs there really is no reason not to do that.
Comment #20
chia commentedCorrect query should be
it should %n (numeric) and not %f
Comment #21
raintonr commentedFrom http://api.drupal.org/api/drupal/includes--database.mysql-common.inc/fun...
Comment #22
agileware commentedThe twitter_id field in the database is numeric, not a string so to treat it as a string would also require a db schema change.
What is wrong with just using %f?
Comment #23
raintonr commentedFloats won't work because they are not precise. There's an explanation of floating point accuracy here:
http://en.wikipedia.org/wiki/Floating_point#Accuracy_problems
From the PHP manual:
Given the last ID I can currently see on Twitter is 17 digits you can see the problem if a float is only accurate to 14 digits. Using 14 digits one cannot determine the difference between two IDs close together and the issue being discussed will appear.
It appears that MySQL can store large integers accurately enough, but PHP cannot handle the conversion. Thus, one can safely leave the DB alone and tell PHP to handle this field as a string. Clearly not ideal, but certainly better than using a PHP float.
Comment #24
agileware commentedThanks for the explanation, sounds like a better fix then.
We'll need a new patch then for testing.
Comment #25
michaek commentedYep, %s will be necessary because both integers and floats have a size limit because of the number of bits that represent them. According to the PHP documentation, the limit for an integer is around 2 billion on a 32-bit system, and float to about 14 decimal digits on a 64-bit system.
http://php.net/manual/en/language.types.integer.php
http://www.php.net/manual/en/language.types.float.php
Comment #26
ayesh commented%s worked for me, as described in #19
and now tweet importing works like a charm.
Cheers!
Comment #27
xurizaemon@michaek, there seem to be a group of related but not identical issues in Twitter module relating to large ids.
#491794: Twitpocalypse bug (will not import if twitter_id greater than 2,147,483,647)
#966748: Duplicate key error on normal import (that's this issue, listed here for completeness)
#985544: {twitter}.twitter_id incompletely stored (final digits are zeroes) due to json_decode limitation in PHP<5.3
I have a feeling there are some others which are plain dupes too :)
@eaton's writeup on the topic from long ago (recommends numeric with precision 20 and scale 0)
Comment #28
michaek commentedYou're right - a lot of problems have been unaddressed for long enough to collect quite a lot of issue reports (with their relationship to the actual issue often obscure). The solution to this issue seems to be getting somewhat baroque, see #985544: {twitter}.twitter_id incompletely stored (final digits are zeroes) due to json_decode limitation in PHP<5.3.
Comment #29
krisrobinson commented#19 Also worked for me it seems.
Comment #30
michaek commentedPushed to dev.