I have duplicate entries in my aggregator. Actually, I have 10 entries that are identical.
I deleted them from the MySQL table, but they came back the next time cron ran. I updated to the CVS version and the problem persisted.
See the problem here:
http://www.gassavers.org/aggregator/
Can anyone offer a solution? I would like to keep the aggregator, but not if it keeps pulling duplicate entries.
Comments
Comment #1
Prometheus6 commented:
Got a URL for the feed itself?
Comment #2
pfaocle commented:
This has happened to me a few times, both with 4.6.3 and recent HEAD versions. The feed responsible was http://www.demolicious.org/node/feed
Comment #3
Mateo commented:
http://www.gassavers.org/aggregator/sources/1
http://www.gassavers.org/aggregator/opml
Comment #4
breyten commented:
Mateo, it's because (in your case) the permalink URLs are more than 255 characters long, so they get cut off when the items are saved. Not sure what we should do about it.
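To illustrate why a cut-off link could produce a fresh copy on every cron run, here is a minimal sketch. It is not the aggregator module's code: the functions `store_link()`, `is_duplicate()` and `cron_run()` and the example.com URLs are hypothetical. It assumes that the duplicate check compares the full incoming link against the stored value, and that the column silently truncates to 255 characters.

```php
<?php
// Hypothetical model, not the real aggregator code.

// Simulates saving into a 255-character column that silently truncates.
function store_link(array &$table, string $link): void {
  $table[] = substr($link, 0, 255);
}

// Assumed duplicate check: compares the full incoming link to stored values.
function is_duplicate(array $table, string $link): bool {
  return in_array($link, $table, true);
}

// One simulated cron pass: store the item unless it is already known.
function cron_run(array &$table, string $link): void {
  if (!is_duplicate($table, $link)) {
    store_link($table, $link);
  }
}

$short = 'http://example.com/item/1';
$long = 'http://example.com/search?q=' . str_repeat('x', 300);

$short_items = [];
$long_items = [];
for ($i = 0; $i < 3; $i++) {
  cron_run($short_items, $short);
  cron_run($long_items, $long);
}

echo count($short_items), "\n"; // 1: later runs recognise the stored link
echo count($long_items), "\n";  // 3: the truncated value never matches
```

Under these assumptions the short link is stored once, while the long link is stored again on every pass, which matches the symptom of deleted rows reappearing after cron.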
Comment #5
Mateo commented:
Thank you for the reply. I will just create multiple feeds using different keywords.
Comment #6
dopry commented:
The best solution would be:
ALTER TABLE `aggregator_item` CHANGE `link` `link` TEXT NOT NULL;
I think that, according to the MySQL docs, it's no more expensive in disk space than VARCHAR.
Does anyone know how to roll DB changes as a patch? I figure I should patch database.mysql, but I don't know the new update.php stuff yet.
This applies to 4.6.5 and HEAD.
Comment #7
dopry commented:
For 4.6.5:
function update_X() {
  $ret = array();
  if ($GLOBALS['db_type'] == 'mysql') {
    // MySQL can change the column type in place.
    $ret[] = update_sql("ALTER TABLE {aggregator_item} CHANGE link link TEXT NOT NULL");
  }
  elseif ($GLOBALS['db_type'] == 'pgsql') {
    // PostgreSQL: rename the old column, add a TEXT column, copy the data,
    // restore the constraints, then drop the old column.
    $ret[] = update_sql("ALTER TABLE {aggregator_item} RENAME link TO link_old");
    $ret[] = update_sql("ALTER TABLE {aggregator_item} ADD link TEXT");
    $ret[] = update_sql("UPDATE {aggregator_item} SET link = link_old");
    $ret[] = update_sql("ALTER TABLE {aggregator_item} ALTER link SET NOT NULL");
    $ret[] = update_sql("ALTER TABLE {aggregator_item} ALTER link SET DEFAULT ''");
    $ret[] = update_sql("ALTER TABLE {aggregator_item} DROP link_old");
  }
  return $ret;
}
For 4.7.0-beta4:
function system_update_X() {
  $ret = array();
  if ($GLOBALS['db_type'] == 'mysql') {
    // MySQL can change the column type in place.
    $ret[] = update_sql("ALTER TABLE {aggregator_item} CHANGE link link TEXT NOT NULL");
  }
  elseif ($GLOBALS['db_type'] == 'pgsql') {
    // PostgreSQL: rename the old column, add a TEXT column, copy the data,
    // restore the constraints, then drop the old column.
    $ret[] = update_sql("ALTER TABLE {aggregator_item} RENAME link TO link_old");
    $ret[] = update_sql("ALTER TABLE {aggregator_item} ADD link TEXT");
    $ret[] = update_sql("UPDATE {aggregator_item} SET link = link_old");
    $ret[] = update_sql("ALTER TABLE {aggregator_item} ALTER link SET NOT NULL");
    $ret[] = update_sql("ALTER TABLE {aggregator_item} ALTER link SET DEFAULT ''");
    $ret[] = update_sql("ALTER TABLE {aggregator_item} DROP link_old");
  }
  return $ret;
}
I'm not sure how to roll this as a real patch... any takers?
Comment #8
Dries commented:
We pretty much standardized on URLs being no longer than 255 characters ... Of course, we could change that.
URLs longer than 255 characters are not likely to occur; I suggest dropping the priority of this problem. It doesn't affect most people, and when it does, it doesn't render your site useless.
Comment #9
Dries commented
Comment #10
Morbus Iff commented:
Duplicate checking in general needs work - see the new approach outlined here.
Comment #11
mfarroyo commented:
I am new to Drupal, so I have not looked into why there are duplicate entries in the aggregator_item table. Instead, I created a MySQL index that hides the duplicates. The MySQL statement is as follows:
ALTER IGNORE TABLE aggregator_item ADD UNIQUE INDEX(fid,title);
This is issued to the drupal database (mysql -u -p drupal). If anyone should know any untoward effects of that index, I would appreciate some feedback.
Thanks
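One possible side effect of a unique index on (fid, title) can be sketched in memory. This is only a model of the constraint, not MySQL itself: `insert_ignore()` and the example.com links are hypothetical, and the sketch assumes titles are compared exactly. It suggests that two different items from the same feed that happen to share a title would be treated as duplicates.

```php
<?php
// Hypothetical in-memory model of a UNIQUE(fid, title) constraint.
// Returns true if the row was accepted, false if it was rejected.
function insert_ignore(array &$rows, int $fid, string $title, string $link): bool {
  $key = $fid . "\0" . $title;
  if (isset($rows[$key])) {
    return false; // same feed and same title: rejected
  }
  $rows[$key] = $link;
  return true;
}

$rows = [];
// First copy of an item is accepted.
$first = insert_ignore($rows, 1, 'Weekly update', 'http://example.com/week-1');
// A true duplicate is rejected, which is the intended effect.
$repeat = insert_ignore($rows, 1, 'Weekly update', 'http://example.com/week-1');
// A different item with the same title in the same feed is also rejected.
$other = insert_ignore($rows, 1, 'Weekly update', 'http://example.com/week-2');
// The same title in another feed is still accepted.
$other_feed = insert_ignore($rows, 2, 'Weekly update', 'http://example.com/week-2');

var_dump($first, $repeat, $other, $other_feed);
```

So feeds that reuse a title for distinct entries could silently lose items under such an index.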
Comment #12
magico commented:
What should be done about this? Is this a work in progress in HEAD?
Comment #13
magico commented:
Closing this in favour of #10.