I'm running cron every 15 minutes to pick up new changes, and for any user that has a tweet in the last 24 hours I'm getting a duplicate key error for the most recent tweet. The error looks as follows:

Duplicate entry '1999325571842000' for key
'PRIMARY'
query: INSERT INTO twitter (twitter_id, screen_name, created_at,
created_time, text, source, truncated) VALUES (1.999325571842E+15,
'sdague', 'Tue Nov 09 14:07:31 +0000
2010', 1289311651, 'I love poll questions in classes that
are designed to drive everyone to the right answer. The wrong answers are
always funny.', '<a
href=\"http://microblog-purple.googlecode.com\"
rel=\"nofollow\">mbpidgin</a>',
0) in /data/site/mhvlug.org/html/includes/common.inc on line 3528.

This is generating a lot of errors in my logs and my email.

Comments

bugster’s picture

Ditto problem,

twitter_save_status test query if the row exits does return 1 for the id (1.999325571842E+15) <-using the reporter's id.

So somewhere in the cron flow this test fails and the status is saved normally?

Drupal Centric’s picture

I'm also getting a duplicate key error:

Duplicate entry '2340839216906200'; for key 1 query: INSERT INTO twitter (twitter_id, screen_name, created_at, created_time, text, source, truncated) VALUES.....
in /includes/common.inc on line 3528.

protoplasm’s picture

Same issue exactly as #2 on line 3528. It is also breaking cron.

I emptied the twitter table and even when the tweets regenerate, the same 3 tweets remain duplicates. These are not the last three ids but 3 different ones in the table-with large id intervals between them.

nosweat’s picture

I think I was experiencing the same issue. I found this:

twitter.inc on line 71

 if (db_result(db_query("SELECT 1 FROM {twitter} WHERE twitter_id = %d", $status->id))) {

changed into

 if (db_result(db_query("SELECT 1 FROM {twitter} WHERE twitter_id = %f", $status->id))) {

since $status->id is a float, not an int ?

What do u guys think?

protoplasm’s picture

Bingo!!

Of course, that makes total sense.

This time I ran cron and no errors. Thank you so much.
I'll continue to watch, but I suspect that this is the fix.

Drupal Centric’s picture

Yep #4 works for me :o)

xurizaemon’s picture

Status: Active » Needs work

I'm not sure %f is the correct fix, because it looks like it's not actually storing accurate status IDs in the local storage.

We'll get conflicts (eventually, current odds are 1:10,000 for each status by the look of things?).

Using %f (or '%s') in twitter_status_save's duplicate check I see a match, but I think it's wrong behaviour still.

SELECT 1 FROM twitter WHERE twitter_id = 9402275642470000
UPDATE twitter SET twitter_id = 9402275642470000, screen_name = 'worldfreelance', created_at = 'Tue, 30 Nov 2010 00:24:11 +0000', created_time = 1291076651, text = 'Drupal Site Project by mazsola: 1. You will install drupal on our test server. Only bid if you are comfortable i... http://bit.ly/fXzBRh', source = '&lt;a href=&quot;http://twitterfeed.com&quot; rel=&quot;nofollow&quot;&gt;twitterfeed&lt;/a&gt;' WHERE twitter_id = 9402275642470000

Using %d in same I see -

SELECT 1 FROM twitter WHERE twitter_id = 9403130370330624
INSERT INTO twitter (twitter_id, screen_name, created_at, created_time, text, source, truncated) VALUES (9403130370330000, 'work_freelance', 'Tue, 30 Nov 2010 00:27:35 +0000', 1291076855, 'Freelance Drupal Job - Drupal Site Project http://bit.ly/gTdlny', '&lt;a href=&quot;http://www.managetwit.com&quot; rel=&quot;nofollow&quot;&gt;ManageTwit&lt;/a&gt;', 0)

IMO we should see 9403130370330624 in both queries, not just the first one.

Currently this is {twitter}.twitter_id -

      'twitter_id' => array(
        'description' => t("Unique identifier for each {twitter} post."),
        'type' => 'numeric',
        'unsigned' => TRUE,
        'precision' => 20,
        'scale' => 0,
        'not null' => TRUE,
        'default' => 0
      ),
 
xurizaemon’s picture

Gilneas’s picture

Subscribing same issue as explained in header topic.

kjholla’s picture

Look through my comments: #5 and #8 in the thread #985544: {twitter}.twitter_id incompletely stored (final digits are zeroes) due to json_decode limitation in PHP<5.3.

I think, with the modifications to twitter.lib.php that I have provided in that thread, this Duplicate key problem should be solved too.

Regards,
KH

xurizaemon’s picture

wiifm’s picture

Patch #4 worked for me and at least removed the nasty PHP error on cron run

Gilneas’s picture

The patch works for me, too.

agileware’s picture

Status: Needs work » Needs review
StatusFileSize
new605 bytes

Here is a patch of the solution in #4.
It will apply to 6.x-3.0-beta3 and 6.x-3.x-dev cleanly.

It also isn't CNW anymore as grobot said his problem in #7 is a different issue (see #8).

The patch fixes the problem for me too.

hedac’s picture

thank you #4 !

karensmallwood’s picture

#4 thank you :-)

Anonymous’s picture

Sweet! #4 totally fixed this issue for me.

WickedJ’s picture

Thx, works fine.

raintonr’s picture

String, not float?

Re #4... I would suggest treating the ID as a string (%s) rather than a float (%f) would be safer. As one doesn't have to mathematically manipulate these IDs there really is no reason not to do that.

chia’s picture

Correct query should be

 if (db_result(db_query("SELECT 1 FROM {twitter} WHERE twitter_id = %n", $status->id))) {

it should %n (numeric) and not %f

raintonr’s picture

Valid %-modifiers are: %s, %d, %f, %b (binary data, do not enclose in '') and %%

From http://api.drupal.org/api/drupal/includes--database.mysql-common.inc/fun...

agileware’s picture

String, not float?

Re #4... I would suggest treating the ID as a string (%s) rather than a float (%f) would be safer. As one doesn't have to mathematically manipulate these IDs there really is no reason not to do that.

The twitter_id field in the database is numeric, not a string so to treat it as a string would also require a db schema change.

What is wrong with just using %f?

raintonr’s picture

What is wrong with just using %f?

Floats won't work because they are not precise. There's an explanation of floating point accuracy here:

http://en.wikipedia.org/wiki/Floating_point#Accuracy_problems

From the PHP manual:

The size of a float is platform-dependent, although a maximum of ~1.8e308 with a precision of roughly 14 decimal digits is a common value (the 64 bit IEEE format).

Given the last ID I can currently see on Twitter is 17 digits you can see the problem if a float is only accurate to 14 digits. Using 14 digits one cannot determine the difference between two IDs close together and the issue being discussed will appear.

It appears that MySQL can store large integers accurately enough, but PHP cannot handle the conversion. Thus, one can safely leave the DB alone and tell PHP to handle this field as a string. Clearly not ideal, but certainly better than using a PHP float.

agileware’s picture

Status: Needs review » Needs work

Thanks for the explanation, sounds like a better fix then.

We'll need a new patch then for testing.

michaek’s picture

Yep, %s will be necessary because both integers and floats have a size limit because of the number of bits that represent them. According to the PHP documentation, the limit for an integer is around 2 billion on a 32-bit system, and float to about 14 decimal digits on a 64-bit system.

http://php.net/manual/en/language.types.integer.php
http://www.php.net/manual/en/language.types.float.php

ayesh’s picture

%s worked for me, as described in #19
and now tweet importing works like a charm.

Cheers!

xurizaemon’s picture

@michaek, there seem to be a group of related but not identical issues in Twitter module relating to large ids.

#491794: Twitpocalypse bug (will not import if twitter_id greater than 2,147,483,647)
#966748: Duplicate key error on normal import (that's this issue, listed here for completeness)
#985544: {twitter}.twitter_id incompletely stored (final digits are zeroes) due to json_decode limitation in PHP<5.3

I have a feeling there are some others which are plain dupes too :)

@eaton's writeup on the topic from long ago (recommends numeric with precision 20 and scale 0)

michaek’s picture

You're right - a lot of problems have been unaddressed for long enough to collect quite a lot of issue reports (with their relationship to the actual issue often obscure). The solution to this issue seems to be getting somewhat baroque, see #985544: {twitter}.twitter_id incompletely stored (final digits are zeroes) due to json_decode limitation in PHP<5.3.

krisrobinson’s picture

#19 Also worked for me it seems.

michaek’s picture

Status: Needs work » Closed (fixed)

Pushed to dev.