New indexes on flood.timestamp and history.timestamp for cron runs. [#165061]

This morning it was brought to our attention, forcefully so, that the crun runs that clear out watchdog and history tables take quite awhile to complete. After tossing around several ideas about this, I eventually actually looked at the tables and noticed that while they are preened with a DELETE FROM

WHERE timestamp, there is no index on timestamp so it is doing a full table scan. Below are some numbers with and without the timestamp index, I would request that there be one by default.

Without index:
mysql> delete from test1 where timestamp < 1185832802;
Query OK, 33896 rows affected (2 min 56.80 sec)

With Index:

mysql> delete from test where timestamp < 1185832802;
Query OK, 33896 rows affected (20.13 sec)

Insert benchmarks without index: (1032 watchdog inserts)

Run1: .797 seconds
Run2: .759 seconds
Run3: .612 seconds

With index: (1032 watchdog inserts)

Run1: 1.07 seconds
Run2: 1.32 seconds
Run3: 1.7 seconds

A performance hit, but a trade-able one considering the gains on the other side.

Comment	File	Size	Author
#24	timestamp_indexes-165061_D7-24.patch	1.53 KB	basic
#22	timestamp_indexes-165061_D7-21.patch	1.27 KB	basic
#21	timestamp_indexes-165061-16.patch	760 bytes	basic
#16	timestamp_indexes-165061-16.patch	912 bytes	basic
#1	timestamp_indexes.patch	2.92 KB	damien_vancouver

Comments

Comment #1

damien_vancouver commented 11 September 2007 at 00:19

Category:	feature	» bug
Status:	Active	» Needs review

Status	File	Size
new	timestamp_indexes.patch	2.92 KB

I just ran into this too... deadlocking due to extra logging I added filling the watchdog tables, then the qeuries to DELETE FROM watchdog WHERE timestamp < XXXXXX were timing out due to the extra rows in watchdog, then causing MySQL deadlocks, which then slowed down the site and caused monitoring errors....... you get the idea My mistake was definitely me putting in way too much watchdog logging, my DB jumped in size from 3 gigs to over 20 (!).

SO - under normal circumstances this should not be a major problem. However, it is definitely still a problem in my opinion. Often when you need this efficiency the most is when something has gone wrong and you have lots of rows... it should be fixed I think and Drupal should never ever be doing full table scans of a table no matter what.

I've attached a patch for DRUPAL-5 (works fine against drupal 5.2) so indexes will be added on the following:

{flood}.timestamp
{history}.timestamp
{watchdog}.timestamp

The patch also adds a new update for system.module, #1023, which creates the same indexes. If you are installing this patch onto an existing Drupal site, you will have to visit the update.php and run available updates (system #1023 will create your indexes).

As for Drupal 6, HEAD seems to not have this problem for watchdog or history - but it does still apply for {flood}, which is deleting based on timestamp still. In theory {flood} should not get a huge number of rows so that it really matters, but again it probably should be fixed. If this patch makes any progress I can create one for HEAD with just the {flood} changes.

Comment #2

drumm

he/him

NY, US

commented 3 December 2008 at 18:19

Title:	New indexes on watchdog.timestamp and history.timestamp for cron runs.	» New indexes on flood.timestamp and history.timestamp for cron runs.
Version:	5.x-dev	» 7.x-dev
Status:	Needs review	» Needs work

HEAD does have a query on {history}.timestamp, so I think it might be affected.

I am a bit cautious about applying this to 5.x. {watchdog} has many more inserts than deletes, and indexes slow down inserts.

Comment #3

Crell commented 4 December 2008 at 07:15

Indexes do add time on insert, but it's very spread out compared to the large whammy of a big select, update, or delete. I'm inclined to eat the time on insert and just add the indexes.

Comment #4

nnewton commented 8 April 2009 at 23:36

They do slow down inserts. I am actually mildly mixed on this, but only mildly. Most people don't need this index at all, however, anyone using a replication setup that can't fall too far behind will require it. It is not an option for them. So, maybe this should just be rolled into the slave patch for D5.

Comment #6

Crell commented 21 October 2009 at 05:48

Issue tags:

+Performance

Narayan, is this Performance-necessary for D7? I say we either move forward and just do it (since we DO support slaves in D7 natively) or mark it won't fix and let the super-high-end sites figure it out themselves. Your call.

Comment #7

nnewton commented 22 October 2009 at 01:39

For watchdog yet, I'll have to test again with history. I'll do that.

Comment #8

catch

he/him

English

commented 27 October 2009 at 08:10

For watchdog we have syslog.module now - so I'd rather have the index there if that's still valid, and avoid potential crashing, than have it optimized to be super-fast on inserts in general - since if you need that you can use another logging module anyway.

Comment #9

Crell commented 12 November 2009 at 05:59

What's the verdict, oh people who run bigger servers than I do? :-)

Comment #10

killes@www.drop.org commented 4 July 2010 at 20:22

I think we should have this in standard Drupal. Even if it slows down sites a bit: if you have a small site you won't have that many inserts.

Comment #11

nnewton commented 6 July 2010 at 21:06

I'd say this is worth it. If someone is that worried about a single index on the watchdog table, they probably shouldn't be using the db backend for watchdog.

-N

Comment #12

Crell commented 6 July 2010 at 21:23

Component:

mysql database

» dblog.module

OK, let's do then. Refiling over to the right component, since this isn't really a MySQL driver question.

Comment #13

irandream commented 26 March 2011 at 17:14

Assigned:

Unassigned

» irandream

anyone using a replication setup that can't fall too far behind will require it. It is not an option for them.
طراحی وب سایت

Comment #14

DanPir commented 28 March 2011 at 10:39

Comment #15

sun

German

Karlsruhe

commented 15 April 2011 at 15:52

subscribing

Comment #16

basic commented 20 December 2011 at 03:03

Version:	7.x-dev	» 8.x-dev
Status:	Needs work	» Needs review

Status	File	Size
new	timestamp_indexes-165061-16.patch	912 bytes

Adding timestamp index to dblog and node modules for watchdog and history tables.

The flood table now has an index including timestamp (added to D7 in #485974: Improved security: rate limit login attempts.).

Comment #17

Crell commented 20 December 2011 at 06:20

Status:

Needs review

» Needs work

Looks like the line endings in #16 are AFU.

Also, update functions needed.

Comment #18

basic commented 20 December 2011 at 06:47

Status:

Needs work

» Needs review

Are updates required for D8?

Not sure what's going on with the line endings... oh odd, 'git diff > patch' kept the color codes. How nice.

Comment #19

Crell commented 20 December 2011 at 07:26

Unless I missed a memo we still need to have an upgrade path from Drupal 7 sites, so yes.

Comment #20

catch

he/him

English

commented 20 December 2011 at 07:46

Issue tags:

+Needs backport to D7

I think we should backport this to 7.x and add the update functions there tbh.

Comment #21

basic commented 20 December 2011 at 08:24

Status	File	Size
new	timestamp_indexes-165061-16.patch	760 bytes

Agreed. Here's the D8 patch without color codes

Comment #22

basic commented 20 December 2011 at 08:56

Version:

8.x-dev

» 7.x-dev

Status	File	Size
new	timestamp_indexes-165061_D7-21.patch	1.27 KB

Here's an attempt at the D7 backport with updates.

Comment #23

damien tournoud commented 20 December 2011 at 10:22

Assigned:

irandream

» Unassigned

Those two upgrades need to be inside a updates-7.x-extra group. For example:

/**
 * @defgroup updates-7.x-extra Extra system updates for 7.x
 * @{
 */

/**
 * @} End of "defgroup updates-7.x-extra"
 */

Comment #24

basic commented 20 December 2011 at 10:31

Status	File	Size
new	timestamp_indexes-165061_D7-24.patch	1.53 KB

Something like this? (edit: i'm not sure why git is mucking with my whitespace)

Comment #25

mgifford

he/him

English

commented 23 November 2012 at 17:49

Issue tags:

-Performance, -Needs backport to D7

#24: timestamp_indexes-165061_D7-24.patch queued for re-testing.

Comment #26

23 November 2012 at 18:01

Status:	Needs review	» Needs work
Issue tags:		+Performance, +Needs backport to D7

The last submitted patch, timestamp_indexes-165061_D7-24.patch, failed testing.

Comment #27

candelas commented 27 July 2013 at 16:02

any news on this?

Comment #28

danblack commented 4 April 2014 at 21:07

fixed - watchdog / flood

watchdog table deletions have been improved without adding an index. The flood in the system module now and a expiration that is indexed however it could of been implemented similar to watchdog however I can't see anyone complaining about the performance on this table and @damien_vancouver in comment #1 seem happy with indexs.

history table

history based deletions are in modules/node/node.module are still DELETE FROM history WHERE timestamp < X and still don't have a timestamp based deletion. The trick used for watchdog table won't work as it is a table that is updated and not inserted on.

First thing to check is if this table is innodb. If it is myisam, every insert,update, delete will be a table lock so you won't get much thoughput.

Options for improvement here (aside from adding a timestamp index are):

Second, is if the delete is running at the default transactional isolation level of repeatable-read it may be getting more locks than required. In this case (not generally) a lower read-commited/read-uncommited (a transaction is only every going to update the timestamp and not roll back so uncommited reads are probably useful) will restrict locks to those rows it intends to delete.
An isolation mechanism will need to be written in a db abstraction as this is mysql specific db_query("set tx_isolation="read-uncommitted'") (postgres implementation)).

Even with a limited number row locks, with a big delete is that every update of a row in the middle of being deleted will be locked. DELETE FROM can support DELETE FROM history WHERE timestamp < X LIMIT Y on mysql and sqlite(if compiled with it) but not postgresql which would of been a nice way to delete in batches providing the opportunity for other instances to perform updates between loop iterations (perhaps the availability DELETE LIMITS can be exposed from the driver if exists and places like this can use it?).

To perhaps reduce some locking for the period it takes to determine all rows where timestamp < X;

CREATE TEMPORARY TABLE history_delete(`uid` int(11) NOT NULLm `nid` int(11) NOT NULL, PRIMARY KEY (`uid`,`nid`))
INSERT INTO history SELECT uid,nid FROM history WHERE timestamp < X

and then

DELETE FROM  history USING history,history_delete WHERE history.uid=history_delete.uid and history.nid=history_delete.nid;
 delete would reduce the locking to a primary key and perhaps save deadlocks for the length of time for this INSERT statement.

my only other suggestion is set a lower session locktimeout (also in need of abstraction mysql/postgres) in node_tag_new and catch and ignore any deadlock/locktimeout errors and loose a bit of data.

Comment #29

dagmar

Spanish

Argentina

commented 14 April 2017 at 19:19

Component:

dblog.module

» history.module

Comment #30

14 April 2017 at 19:19

Status:

Needs work

» Closed (outdated)

Automatically closed because Drupal 7 security and bugfix support has ended as of 5 January 2025. If the issue verifiably applies to later versions, please reopen with details and update the version.

New indexes on flood.timestamp and history.timestamp for cron runs.

Comments

Comment #1

Comment #2

Comment #3

Comment #4

Comment #6

Comment #7

Comment #8

Comment #9

Comment #10

Comment #11

Comment #12

Comment #13

Comment #14

Comment #15

Comment #16

Comment #17

Comment #18

Comment #19

Comment #20

Comment #21

Comment #22

Comment #23

Comment #24

Comment #25

Comment #26

Comment #27

Comment #28

fixed - watchdog / flood

history table

Comment #29

Comment #30

News items

Our community

Documentation

Drupal code base

Governance of community