Nice little utility module! Unfortunately, it didn't help as much as I'd hoped. My hosting_task_log table is still over 1GB. This appears to be mostly due to the daily 'backup' and 'backup delete' tasks that hosting_backup_queue and hosting_backup_gc generate on all sites.

I wonder if we couldn't allow deletion of tasks of certain types from live sites over a certain threshold... IMO, task logs are mostly useful when there's a warning or error, and the successes don't bring a whole lot of value. So maybe just remove older successful tasks?

hosting_task also has all those (mostly) useless queued tasks. I wonder if we could just remove those wholesale?

One concern is that the task node revision for a deleted log would still exist... Maybe we could add an entry to hosting_task_log of type 'garbage_collected' so that if someone came across said task node revision, there'd be an indication of what happened.
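The marker idea could look roughly like the following. This is only a minimal sketch in Python with SQLite, assuming a simplified hosting_task_log table with vid, type, message and timestamp columns (the real Aegir schema may differ); the names SCHEMA and collect_task_log are hypothetical.

```python
import sqlite3

# Assumption: a simplified stand-in for the real hosting_task_log table.
SCHEMA = """
CREATE TABLE hosting_task_log (
    lid INTEGER PRIMARY KEY,
    vid INTEGER,
    type TEXT,
    message TEXT,
    timestamp INTEGER
);
"""


def collect_task_log(conn, vid, now):
    """Replace the log rows of one task revision with a single marker row."""
    with conn:  # one transaction: the delete and the marker go together
        deleted = conn.execute(
            "DELETE FROM hosting_task_log "
            "WHERE vid = ? AND type <> 'garbage_collected'",
            (vid,),
        ).rowcount
        if deleted:
            # Leave a trace so anyone viewing the revision sees what happened.
            conn.execute(
                "INSERT INTO hosting_task_log (vid, type, message, timestamp) "
                "VALUES (?, 'garbage_collected', ?, ?)",
                (vid, f"{deleted} log entries removed by garbage collection", now),
            )
    return deleted
```

Running it a second time on the same revision deletes nothing and adds no second marker, so it should be safe to call from a recurring job.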

Thoughts?

Attached file: revisions_cleanup.jpg (161.93 KB), from comment #6 by omega8cc
Comments

ergonlogic’s picture

Note: see #2053929: Trim hosting_task tables, where I suggest experimenting with the ideas above in contrib, either here in hosting_task_gc or in a sandbox project.

ergonlogic’s picture

Also, I'm well aware that adding this functionality will probably quadruple the code-base of this module. So, I'm happy to help maintain it, if there's interest. Also, I was thinking this functionality could fit nicely into hosting_tasks_extra...

Dane Powell’s picture

Great ideas. I'm not sure I'm comfortable deleting logs based on whether there were errors or not. Partly because I've run into issues in the past where errors and warnings aren't handled properly as such. Partly because there might be useful information even in successful logs (for instance, seeing what directories / makefiles / build paths were used, etc...). However, this still might be a useful option for people.

I think I'd rather use a combination of time and number of revisions. For instance, only keep task logs for the latest revision of a task node, and revisions less than 1 month old. This would cut the size of the hosting_task_log table tremendously.

I tried to code up a prototype of this, but my SQL-fu was not strong today... I had trouble finding a solution that didn't involve running tens of thousands of SQL queries.

I'd love any help you can provide.
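One way to avoid a query per revision is a single set-based DELETE. The sketch below is in Python with SQLite and rests on several assumptions: table names follow Drupal 6 conventions (node, node_revisions), hosting_task_log rows are keyed by vid, the columns are simplified, "1 month" is approximated as 30 days, and the rule is read as "delete logs only for revisions that are both non-current and older than the cutoff". The names make_demo_db and prune_task_logs are hypothetical.

```python
import sqlite3

DAY = 24 * 3600
MONTH = 30 * DAY  # assumption: "1 month" approximated as 30 days


def make_demo_db():
    """Build an in-memory database with a simplified, assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE node (nid INTEGER PRIMARY KEY, vid INTEGER);
        CREATE TABLE node_revisions (
            vid INTEGER PRIMARY KEY, nid INTEGER, timestamp INTEGER);
        CREATE TABLE hosting_task_log (
            lid INTEGER PRIMARY KEY, vid INTEGER, message TEXT);
    """)
    return conn


def prune_task_logs(conn, now, max_age=MONTH):
    """Delete logs of revisions that are neither current nor recent."""
    with conn:
        cur = conn.execute(
            """
            DELETE FROM hosting_task_log
            WHERE vid IN (
                SELECT r.vid
                FROM node_revisions r
                JOIN node n ON n.nid = r.nid
                WHERE r.vid <> n.vid        -- not the node's current revision
                  AND r.timestamp < ?       -- and older than the cutoff
            )
            """,
            (now - max_age,),
        )
    return cur.rowcount  # number of log rows removed
```

The subquery picks out every stale revision in one pass, so the database does the matching instead of a loop issuing one DELETE per revision. On a large table it would probably still be worth deleting in batches.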

j0nathan’s picture

Hi. We would be interested in that feature too. I'm sorry I cannot help coding.

j0nathan’s picture

Maybe the information that would be deleted from the DB can go into a text file. Just an idea.

omega8cc’s picture

There is a module for this: Revision Deletion. We have been using it in BOA for a long time already. Here is our default config:

[Image: revisions_cleanup.jpg]

Dane Powell’s picture

Status: Active » Closed (won't fix)

@omega8cc That's awesome, thanks! I'll update the project page to point to that.

Dane Powell’s picture

FYI, it seems that the 'revisions per page' setting actually controls how many revisions are deleted on each cron run, which means that you'll only be deleting 8 revisions per day. Maybe that's enough for you, but that's totally insufficient for our site. There's an issue about this: #1166368: Explanation of Cron Limit Needed on Admin Form