A small error in the _get_token() method results in non-unique tokens being returned. This means that when listing all the records in a repository, an infinite loop can be entered.
A microtime of 0.65573800 1195483767
yields 656 + 1195483767 = 1195484423.
Just under two seconds later,
a microtime of 0.65399921 1195483769
yields 654 + 1195483769 = 1195484423, the same token.
I've changed the _get_token function so that it also multiplies the seconds by the same factor as the fraction of a second, making collisions far less likely (two requests would now have to arrive within 1/10000 of a second of each other to receive the same token).
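To make the collision and the fix concrete, here is a small Python sketch. The original code is PHP, and the exact scale factors (×1000 in the buggy version, ×10000 in the patched one) are assumptions inferred from the numbers above:

```python
def buggy_token(frac: float, secs: int) -> int:
    # Assumed original scheme: fraction rounded to milliseconds, then
    # added directly to the Unix timestamp. Requests a couple of
    # seconds apart can collide.
    return round(frac * 1000) + secs

def fixed_token(frac: float, secs: int) -> int:
    # Patched scheme: scale the seconds by the same factor as the
    # fraction, so the sub-second part occupies its own digits.
    # A collision now requires two requests within 1/10000 s.
    return round(frac * 10000) + secs * 10000

# The two microtime() readings from the report, ~2 seconds apart:
a = buggy_token(0.65573800, 1195483767)  # 656 + 1195483767
b = buggy_token(0.65399921, 1195483769)  # 654 + 1195483769
print(a, b)  # both 1195484423: the same resumption token
```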
Also included in the patch is an improvement to the SQL, which avoids executing the full query just to determine the total number of available results; a COUNT query is executed instead.
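The idea behind the SQL change can be sketched in Python with SQLite (the table and column names here are hypothetical, not the repository's actual schema):

```python
import sqlite3

# In-memory database standing in for the repository's record store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, datestamp TEXT)")
conn.executemany("INSERT INTO records (datestamp) VALUES (?)",
                 [("2007-11-19",)] * 500)

# Before: fetch every row just to learn how many there are.
total_slow = len(conn.execute("SELECT * FROM records").fetchall())

# After: let the database do the counting and return a single value.
total_fast = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]

print(total_slow, total_fast)  # 500 500
```

Only the single-row count crosses the database boundary in the second form, which matters once the result set is large.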
Finally, one more point: we're using UTF-8 for our databases, and we've found that the output of an OAI query is encoded twice. Commenting out the utf8_encode function (a change not included in this patch) fixes this. What we'd like to know, however, is how the character set in use is defined.
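The double-encoding symptom can be reproduced in Python. The key assumption is that PHP's utf8_encode() treats its input as Latin-1, so applying it to text that is already UTF-8 mangles every multi-byte character:

```python
text = "é"                   # one character, U+00E9
once = text.encode("utf-8")  # correct output: b'\xc3\xa9'

# Simulate utf8_encode() running on already-encoded bytes: each byte is
# reinterpreted as a Latin-1 character and then re-encoded as UTF-8.
twice = once.decode("latin-1").encode("utf-8")

print(once)   # b'\xc3\xa9'
print(twice)  # b'\xc3\x83\xc2\xa9', which renders as the mojibake "Ã©"
```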
| Comment | File | Size | Author |
|---|---|---|---|
| | oai2.patch | 716 bytes | sdrycroft |
Comments
Comment #1
rjerome commented: I've added this patch and commented out the UTF-8 encoding. You're right, this would be redundant. I ported this code from http://physnet.uni-oldenburg.de/oai/, and to be honest I don't know how the character set is being defined.
Comment #2
rjerome commented: Further to this, I've changed the token generation again to use PHP's uniqid() call, which produces a unique 13-character string, rather than trying to derive the token from the time value.
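For readers unfamiliar with it, PHP's uniqid() returns a 13-character hex string derived from the current time in microseconds. A rough Python analogue (an approximation, not the exact PHP algorithm) looks like this:

```python
import time

def uniqid_like() -> str:
    # Hex of the current Unix time in microseconds; for present-day
    # timestamps this comes out at about 13 hex digits, similar in
    # spirit to PHP's uniqid().
    return format(int(time.time() * 1_000_000), "x")

a = uniqid_like()
time.sleep(0.001)  # any measurable gap yields a different microsecond count
b = uniqid_like()
print(a, b)  # two distinct tokens
```

Because the value changes every microsecond rather than every millisecond, two requests no longer need to be artificially combined from seconds and fractions, which is what made the original scheme collide.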