delete items on a per feed basis

zis - March 5, 2007 - 22:53
Project: Leech
Version: 4.7.x-1.6
Component: leech_news
Category: feature request
Priority: normal
Assigned: Aron Novak
Status: won't fix
Description

My site has ~100 feeds to leech. Most of them have a retention period of 2 weeks.
Old items are simply not being deleted. Nothing has been deleted for more than a month. I have no idea why. Any ideas?
The problem is quite urgent, my database is growing very quickly.

#1

Aron Novak - March 7, 2007 - 21:41
Assigned to: Anonymous » Aron Novak
Status: active » fixed

Thanks for reporting this!
Actually, this feature was simply not implemented. You can find the fix in CVS.
You can patch your file according to this: http://cvs.drupal.org/viewcvs/drupal/contributions/modules/leech/leech.m...

#2

zis - March 8, 2007 - 11:38

Thx.
I'll test it soon.

Perhaps it would be better to alias the table names in the query. Check the snippet below.

<?php
// Delete items that are too old.
$result = db_query("SELECT f.items_delete, n.created, i.nid
                    FROM {node} AS n, {leech_news_item} AS i, {leech_news_feed} AS f
                    WHERE n.nid = i.nid AND f.nid = i.fid");
$now = time();
while ($node = db_fetch_object($result)) {
  // items_delete is the feed's retention period in seconds.
  if (abs($now - $node->created) > $node->items_delete) {
    node_delete($node->nid);
  }
}
?>

#3

Aron Novak - March 12, 2007 - 18:54

Thanks for this advice!
I changed this part of the code, it's really much nicer :)
Have you tested it? Any experience?

#4

zis - March 14, 2007 - 13:08
Status: fixed » needs work

I tested it, and it is working all right.
But I think it would be better to check for old items one feed at a time rather than checking the whole database at once. This way, after a feed is refreshed, only that feed is checked for old items.

Because I had a couple thousand old items, the script timed out a couple of times before deleting them all.
One more thing: I can't seem to find a hook_delete function for feed items. I might have missed it, though.
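
The per-feed check suggested above could look roughly like the sketch below. It is plain PHP so that it runs without Drupal. The function name `sketch_expired_nids` and the row layout (`nid`, `fid`, `created`) are assumptions made for illustration, not code from the module. In the module, the returned node IDs would presumably be passed to node_delete() right after the feed is refreshed.

```php
<?php
// Hypothetical helper, not part of leech: returns the node IDs of the items
// that belong to feed $fid and are older than $retention seconds.
function sketch_expired_nids($items, $fid, $retention, $now) {
  $expired = array();
  foreach ($items as $item) {
    if ($item['fid'] == $fid && ($now - $item['created']) > $retention) {
      $expired[] = $item['nid'];
    }
  }
  return $expired;
}

// Assumed sample data: two items in feed 7, one item in feed 8.
$now = 2000000;
$two_weeks = 14 * 24 * 60 * 60;
$items = array(
  array('nid' => 11, 'fid' => 7, 'created' => 500000),   // old, feed 7
  array('nid' => 12, 'fid' => 7, 'created' => 1900000),  // recent, feed 7
  array('nid' => 13, 'fid' => 8, 'created' => 500000),   // old, other feed
);

// Only feed 7 is checked, so the old item in feed 8 is left alone.
print_r(sketch_expired_nids($items, 7, $two_weeks, $now));
```

Only the items of the feed that was just refreshed are inspected, so a single run never has to walk the whole item table.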

#5

Aron Novak - March 21, 2007 - 16:23
Status: needs work » fixed

Thanks for trying out the solution.
Yes, you mentioned that the script timed out. But I think that could not have happened if the deletion had been working properly.
hook_delete: node_delete() does the job you mentioned.

#6

zis - March 22, 2007 - 16:30
Status: fixed » needs review

I really think it should be set up to delete items on a per feed basis.

If a site aggregates 100 items/day, one cron run will have to delete 100 items at once, which is very time consuming.
If deletion is done on a per feed basis, the 100 items will be spread across multiple runs.
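
To make the arithmetic concrete, here is a small sketch in plain PHP. It assumes that a cron run only cleans up the feeds it has just refreshed. The function name `sketch_deletions_per_run`, the schedule layout and the numbers are all made up for illustration and do not come from the module.

```php
<?php
// Hypothetical illustration, not leech code: how many deletions each cron
// run performs when a run only cleans up the feeds it has just refreshed.
// $expired_per_feed maps feed ID => number of expired items.
// $schedule maps cron run number => list of feed IDs refreshed in that run.
function sketch_deletions_per_run($expired_per_feed, $schedule) {
  $per_run = array();
  foreach ($schedule as $run => $fids) {
    $per_run[$run] = 0;
    foreach ($fids as $fid) {
      if (isset($expired_per_feed[$fid])) {
        $per_run[$run] += $expired_per_feed[$fid];
      }
    }
  }
  return $per_run;
}

// Assumed numbers: 100 expired items spread evenly over four feeds,
// with one feed refreshed per cron run.
$expired_per_feed = array(1 => 25, 2 => 25, 3 => 25, 4 => 25);
$schedule = array(1 => array(1), 2 => array(2), 3 => array(3), 4 => array(4));
print_r(sketch_deletions_per_run($expired_per_feed, $schedule));
```

With these assumed numbers each run deletes 25 items. If all four feeds were handled in a single run, that run would delete all 100.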

#7

Aron Novak - April 10, 2007 - 21:12

Unfortunately, I don't have time right now to implement your suggestion about per feed deletion (the suggestion is good). If you implement it, I would be glad if you could send a patch :) Ask me if you need any help with this.

#8

Aron Novak - January 2, 2008 - 19:39
Title: Old items not deleted » delete items on a per feed basis
Category: bug report » feature request
Priority: critical » normal
Status: needs review » won't fix

I would like to suggest FeedAPI to you; this feature is already implemented there. Have a look at the attached screenshot.

Attachment: feedapi-per-feed-delete-interval.png
Size: 19.37 KB
 
 
