Problem/Motivation

In #3252126: Do not apply updates during cron if there are DB updates in the staging area, we introduced detection of new or changed update functions in staged modules. Currently, we just use simple hashing of the .install and .post_update.php functions to check for changes, but this can produce incorrect results if comments, whitespace, or runtime code like hook_requirements changes in those files. We figured that's good enough for MVP, but it might be nice to introduce something smarter.

We should solve this for stable in Drupal core.

Proposed resolution

Use low-level built-in PHP parsing (token_get_all()) to detect new update functions in the .install and .post_update.php files.

This will NOT detect removed or changed update functions. That's on purpose. Changed update functions are irrelevant -- updates are only ever run once, so they're not really something you ever change. And removed update functions are also not relevant to our purposes -- they are gone, so they don't affect anything. We only really need to detect new updates.

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

phenaproxima created an issue. See original summary.

effulgentsia’s picture

I think adding nikic/php-parser to core has other good use cases too, such as parsing annotations or PHP 8 attributes. If there isn't a core issue already for adding this library to core, let's open one.

wim leers’s picture

Priority: Normal » Major

Isn't core already depending on nikic/php-parser?

This seems like it potentially is worthy to be tagged core-mvp, since it has such a big impact?

I think this is Major at minimum?

Can we convert this into a Plan and create multiple child Tasks?

tedbow’s picture

Just to be clear

Currently, we just use simple hashing of the .install and .post_update.php functions to check for changes, but this can produce incorrect results if comments, whitespace, or runtime code like hook_requirements changes in those files.

I believe all of these could only produce false positives, determining there is an update when there is not. Not false negative, determining there is not an update when there is.

I think this case is much less critical because:

  1. This means we are not going missthat a code update has a DB update and then allow it to run in cron
  2. It seems very unlikely that we would determine security update does have a DB update when it does not and therefore not run it in cron. This is because security updates don't ship with unrelated fixes so we won't get

    comments, whitespace, or runtime code like hook_requirements changes in those files.

    if there weren't functional changes in those files

So this may be a reason for this not be major. I personally don't see why this would need to be core-mvp(unless everything we would ever want the system to be is core mvp😜). It doesn't seem changing this later would cause any API changes

wim leers’s picture

Issue tags: +Security improvements

I figured it was potentially core-mvp because it has significant security implications. Not because we missed something, but because we would prevent a security update from getting installed precisely because this is overly cautious…

tedbow’s picture

Assigned: Unassigned » wim leers

Isn't core already depending on nikic/php-parser?

It seems to be in composer/Metapackage/CoreRecommended/composer.json but nowhere else core so I am not sure what this mean. If it is a direct core dependency shouldn't it be in `core/composer.json`.

does that mean we should make a core issue to add the dependency to core before we put work into using in our module. Assigning to @Wim Leers because maybe you know how this works

wim leers’s picture

Title: Use static analysis to detect new or changed update functions » Use static analysis to detect new or changed update functions, to reduce false positives in StagedDBUpdateValidator
Assigned: wim leers » Unassigned
Status: Active » Postponed

Turns out it's a dev-only dependency for phpunit. But this does seem like a low-risk external dependency.

I do think that Major is still accurate based on #5. I still think that reducing false positives on this front is something with indirect security implications.

However, after inspecting the code (\Drupal\package_manager\Validator\StagedDBUpdateValidator), I do see that we can be 99% confident that this would be an isolated change, without the need for API changes/BC breaks. Plus, it's marked @internal.

So let's explicitly postpone this to a later phase?

wim leers’s picture

Assigned: Unassigned » tedbow
Status: Postponed » Active

So let's explicitly postpone this to a later phase?

Since @tedbow didn't respond after a week, explicitly assigning to him.

tedbow’s picture

Status: Active » Postponed

Yes I think we can postpone this for reasons stated in #7

tedbow’s picture

Issue tags: +core-post-mvp

I think we could do this after the first core commit but maybe @Wim Leers disagrees

wim leers’s picture

I agree 😊

So … is this a must-have for beta/stable, or a should-have for stable?

wim leers’s picture

Status: Postponed » Needs review
Issue tags: +Usability

Marking Needs review to get @tedbow's response to #11.

tedbow’s picture

Issue summary: View changes

I think for stable. Not sure how we should tag issues core-stable-blocker and core-beta-blocker?

wim leers’s picture

Issue tags: +beta bl

Why not just stable blocker and beta blocker? The implication is that it's referring to core?

wim leers’s picture

Issue tags: -beta bl

Oops! Was just testing it out :P

tedbow’s picture

Version: 8.x-2.x-dev » 3.0.x-dev
tedbow’s picture

Assigned: tedbow » Unassigned
Status: Needs review » Active
Issue tags: +beta blocker
tedbow’s picture

I noticed that 10.0.8 to 10.0.9 actually has a change to system_requirements that would flag has a DB update when it is not

tedbow credited FizCS3.

tedbow’s picture

crediting @FizCS3 for finding in testing

tedbow’s picture

Issue tags: +Pittsburgh2023
phenaproxima’s picture

I dug into this a bit and I think it actually wouldn't be too tricky.

It would require us to create a node visitor (as documented) that detects functions in certain files, matching certain naming patterns, and simply compares the lists of found functions. Obviously this would only be done if nikic/php-parser were installed; otherwise, we'd fall back to hashing.

But generally I think this would be relatively simple.

tedbow’s picture

Status: Active » Needs work
phenaproxima’s picture

Assigned: Unassigned » tedbow
Issue summary: View changes
Status: Needs work » Needs review
wim leers’s picture

@phenaproxima mentioned token_get_all(). It looks REALLY promising: https://3v4l.org/SX0rT

Any sequence like this:

T_FUNCTION
T_WHITESPACE
T_STRING

we'd need the value of T_STRING for, and … we'd have a complete list! 🥳

phenaproxima’s picture

Issue summary: View changes
phenaproxima’s picture

Title: Use static analysis to detect new or changed update functions, to reduce false positives in StagedDBUpdateValidator » Use static analysis to detect new update functions, to reduce false positives in StagedDBUpdateValidator
Issue summary: View changes
tedbow’s picture

Issue tags: +Needs manual testing

re #18 we should manually test that this new method would not have the false positive for the DB updates that current method has

tedbow’s picture

Assigned: tedbow » phenaproxima
Status: Needs review » Needs work

I few comments in the MR

phenaproxima’s picture

Assigned: phenaproxima » tedbow
Status: Needs work » Needs review
tedbow’s picture

Assigned: tedbow » phenaproxima
Status: Needs review » Needs work
phenaproxima’s picture

Assigned: phenaproxima » tedbow
Status: Needs work » Needs review
tedbow’s picture

Assigned: tedbow » phenaproxima
Status: Needs review » Needs work
phenaproxima’s picture

Assigned: phenaproxima » tedbow
Status: Needs work » Needs review
phenaproxima’s picture

For manual testing:

Drupal 10.0.9 changed some code in system.install, compared to 10.0.0. (The code in question is part of hook_requirements.) No update functions were added that I can see. So this is a perfect pairing of versions to prove that this static analysis is smarter than what we previously had.

phenaproxima’s picture

To manually test the status quo, I did the following:

composer create-project drupal/recommended-project:10.0.0 au
cd au
composer config minimum-stability alpha
composer require drush/drush:^11
composer require drupal/automatic_updates:^3
./vendor/bin/drush si -y
./vendor/bin/drush en -y automatic_updates
./vendor/bin/drush cset -y automatic_updates.settings unattended.level patch
./vendor/bin/drush auto-update

As expected, this died with the error:

 [error]  The update cannot proceed because possible database updates have been detected in the following extensions.
System
User
phenaproxima’s picture

To test this branch, I did the following:

composer create-project drupal/recommended-project:10.0.0 au
cd au
composer config minimum-stability alpha
composer require drush/drush:^11
composer config repo.au vcs https://git.drupalcode.org/issue/automatic_updates-3253828.git
composer require drupal/automatic_updates:dev-3253828-use-static-analysis
./vendor/bin/drush si -y
./vendor/bin/drush -y en automatic_updates
./vendor/bin/drush cset -y automatic_updates.settings unattended.level patch
./vendor/bin/drush auto-update

This time, it died with the following:

 [error]  The update cannot proceed because possible database updates have been detected in the following extensions.
User

However, this makes perfect sense: running git diff 10.0.0:core/modules/user/user.post_update.php 10.0.9:core/modules/user/user.post_update.php shows me that post-update functions were added to the User module!

This is already a lot more promising! It detected actual update functions in User, but didn't complain about System.

tedbow’s picture

Assigned: tedbow » Unassigned
Status: Needs review » Reviewed & tested by the community
Issue tags: -Needs manual testing

Thanks for testing!

phenaproxima’s picture

  • phenaproxima committed 8db8ae3a on 3.0.x
    Issue #3253828 by phenaproxima, tedbow, Wim Leers, FizCS3, effulgentsia...
phenaproxima’s picture

Status: Reviewed & tested by the community » Fixed
wim leers’s picture

That looks amazing! 🤩

wim leers’s picture

Issue tags: +needs profiling

I am a bit curious if this is sufficiently fast for large codebases, but considering how little code is typically in *.install and *.post_update.php files, I'm not very concerned. Still, this might come back to haunt us at some point in the future in some extreme cases…

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.