node_tag_new() currently does a merge query on every authenticated page view, to update the {history} table.

afaik history is used for only three purposes:

1. To determine if a node is 'new' for that user or not - whether they ever visited it.

2. To determine if a node has new comments since the last time the user visited it.

3. To determine if the node was updated since the last time the user visited it.

It seems like we could check the node updated and last comment timestamp times, then compare them to the history table, then only do an insert, if one of those three conditions is met - and possibly also every few hours regardless of how many times they've visited the node. This would remove the insert on any repeat page views, and it's not likely to have any real noticeable cost when the value actually needs to be updated. This would be a similar approach to that taken for updating user_access().

Comments

catch’s picture

Here's a patch. We only update history if it's > 30 minutes since the last time we updated, if the node was updated, or if a comment was posted. That should avoid any hammering from people hitting refresh but still keep it accurate enough.

Completely unrealistic benchmarks but they show the difference from removing the insert, benchmarked with auth session cookie. However they're very disappointing. The query takes around 7ms on my system, which ought to be in the region of 10% of request time, but I'm only seeing a couple percent improvement. It may be the benchmarks are wrong though?

HEAD:

Concurrency Level:      1
Time taken for tests:   76.494 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      9082000 bytes
HTML transferred:       8578000 bytes
Requests per second:    13.07 [#/sec] (mean)
Time per request:       76.494 [ms] (mean)
Time per request:       76.494 [ms] (mean, across all concurrent requests)
Transfer rate:          115.95 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       1
Processing:    66   76   8.5     74     128
Waiting:       60   69   8.0     67     115
Total:         66   76   8.5     74     128

Patch:

Concurrency Level:      1
Time taken for tests:   75.006 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      9082000 bytes
HTML transferred:       8578000 bytes
Requests per second:    13.33 [#/sec] (mean)
Time per request:       75.006 [ms] (mean)
Time per request:       75.006 [ms] (mean, across all concurrent requests)
Transfer rate:          118.25 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:    67   75   7.2     73     145
Waiting:       60   68   6.6     66     116
Total:         67   75   7.2     73     145
catch’s picture

Priority: Normal » Minor
Status: Active » Needs review

Hmm, profiling shows it's about 3% unpatched and 0.3% patched - that at least means the node_last_viewed() and node_load() are very little overhead, however that means the benchmarks are about right, at least with a 1 node, 2 users data set.

However I also realised we have #165061: New indexes on flood.timestamp and history.timestamp for cron runs. open, which if we patch it (and we probably should for stability) will mean slower inserts on this table. i benchmarked with an index on history.timestamp though, and got very similar results to above, so I dunno if this is worth it really. It's relatively harmless but might be a no-op.

dries’s picture

No patch?

catch’s picture

StatusFileSize
new1.45 KB

Argh.

alexanderpas’s picture

Issue tags: -Perfomance +Performance

spelling

dries’s picture

This makes sense and I support it. It sounds a little complex. Everywhere we call node_tag_new() we call it using node_tag_new($node->nid); so it seems like we could pass in $node instead of $nid and simplify the logic drastically. I support us doing that, even at this stage. node_tag_new() is not really called by other modules, AFAIK.

catch’s picture

Status: Needs review » Needs work

Yeah that would massively simplify the patch, and I agree it doesn't really get used as an API function. Will also see about generating large numbers of dummy rows in {history} to see if I can get those inserts dragging some more for benchmarks. CNW for now.

catch’s picture

Status: Needs work » Needs review
StatusFileSize
new1.36 KB

Here it is with the param change.

catch’s picture

StatusFileSize
new1.63 KB

Now with the threshold as a variable - that way sites which need this 100% accurate, or sites which want to update it less frequently can do so without hacking core. We don't need a UI for the variable though. I haven't tested this in all possible situations yet.

dries’s picture

This looks good, but let's add a small code comment to explain why we are doing this.

catch’s picture

StatusFileSize
new1.86 KB

Put the comment back in from previous patches, sloppy work on my part :(

dries’s picture

Status: Needs review » Fixed

Committed to CVS HEAD. Thanks.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

rjbrown99’s picture

Version: 7.x-dev » 6.x-dev
Status: Closed (fixed) » Needs review
StatusFileSize
new1.12 KB

I like avoiding INSERT/UPDATE queries so I rolled a slightly different backport for D6. Instead of modifying node_tag_new (which would break a number of widely used contrib modules such as ctools) I added the avoidance logic directly to node_show(). It won't catch all cases but should work for the biggest offender.

Doubtful this ever makes its way into D6 but I thought it was worth posting back in case someone else has a desire to use it.

igor.ro’s picture

I suggest to move update request to the shutdown function.
This will make bulk update in once.

@Dries what do you think about this?

igor.ro’s picture

StatusFileSize
new1.19 KB

Status: Needs review » Needs work
Issue tags: -Performance

The last submitted patch, drupal-node_tag_new-667152-16.patch, failed testing.

arunvs’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work
Issue tags: +Performance

The last submitted patch, drupal-node_tag_new-667152-16.patch, failed testing.

ianthomas_uk’s picture

Version: 6.x-dev » 7.x-dev

The patch on 7.x-dev was rolled back in https://drupal.org/node/704362

I can't see this being patched in 6.x at this stage, so setting version to 7.x-dev, although 8.x might be more appropriate. The equivalent function in 8.x is history_write() in history.module.

catch’s picture

Version: 7.x-dev » 8.x-dev

Yes this would need to go into 8.x first.

xjm’s picture

Component: node.module » node system
Issue summary: View changes

(Merging "node system" and "node.module" components for 8.x; disregard.)

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

  • Dries committed f97a92d on 8.3.x
    - Patch #667152 by catch: optimize node_tag_new().
    
    

  • Dries committed f97a92d on 8.3.x
    - Patch #667152 by catch: optimize node_tag_new().
    
    

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

  • Dries committed f97a92d on 8.4.x
    - Patch #667152 by catch: optimize node_tag_new().
    
    

  • Dries committed f97a92d on 8.4.x
    - Patch #667152 by catch: optimize node_tag_new().
    
    

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.6 was released on August 2, 2017 and is the final full bugfix release for the Drupal 8.3.x series. Drupal 8.3.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.4.0 on October 4, 2017. (Drupal 8.4.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.4.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.4 was released on January 3, 2018 and is the final full bugfix release for the Drupal 8.4.x series. Drupal 8.4.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.5.0 on March 7, 2018. (Drupal 8.5.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.5.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.6 was released on August 1, 2018 and is the final bugfix release for the Drupal 8.5.x series. Drupal 8.5.x will not receive any further development aside from security fixes. Sites should prepare to update to 8.6.0 on September 5, 2018. (Drupal 8.6.0-rc1 is available for testing.)

Bug reports should be targeted against the 8.6.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.8.x-dev

Drupal 8.6.x will not receive any further development aside from security fixes. Bug reports should be targeted against the 8.8.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.9.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

  • Dries committed f97a92d on 9.1.x
    - Patch #667152 by catch: optimize node_tag_new().
    
    

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.7 was released on June 3, 2020 and is the final full bugfix release for the Drupal 8.8.x series. Drupal 8.8.x will not receive any further development aside from security fixes. Sites should prepare to update to Drupal 8.9.0 or Drupal 9.0.0 for ongoing support.

Bug reports should be targeted against the 8.9.x-dev branch from now on, and new development or disruptive changes should be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.2.x-dev

Drupal 8 is end-of-life as of November 17, 2021. There will not be further changes made to Drupal 8. Bugfixes are now made to the 9.3.x and higher branches only. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.15 was released on June 1st, 2022 and is the final full bugfix release for the Drupal 9.3.x series. Drupal 9.3.x will not receive any further development aside from security fixes. Drupal 9 bug reports should be targeted for the 9.4.x-dev branch from now on, and new development or disruptive changes should be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.9 was released on December 7, 2022 and is the final full bugfix release for the Drupal 9.4.x series. Drupal 9.4.x will not receive any further development aside from security fixes. Drupal 9 bug reports should be targeted for the 9.5.x-dev branch from now on, and new development or disruptive changes should be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

acbramley’s picture

Status: Needs work » Fixed

This was committed but the issue was never closed. Marking as fixed (although the function no longer exists)

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.