Problem/Motivation
Path aliases are currently represented as arrays, and they are using a custom storage class, which means they can't be integrated in modern workflows built around Drupal's Entity API, such as Content Moderation or Workspaces.
Proposed resolution
Convert path aliases to a new path_alias
content entity type. The new entity type will be revisionable and non-translatable, and will also be made publishable in a followup issue: #3007669: Add publishing status to path aliases
In order to keep this non-trivial patch to a reasonable size, no methods from the current path.alias_storage
service will be deprecated in this issue. That work will happen in a followup: #2233595: Deprecate the custom path alias storage
For the same reason, other conversion to use the Entity API will also be done in followup issues:
#3009014: Convert path alias migrations to use entity:path_alias destination
#3007832: Convert custom path alias forms to entity forms
#3011807: Convert the path alias admin overview to a list builder
Note about performance
Operating with an entity is heavier than just sending arrays around. This patch doesn't change any of the queries that happen in bootstrap phase (which is most sensitive to performance regressions). All the queries (even functions) stay the same. Same is true for URL generation.
Remaining tasks
Review the patch.
User interface changes
Nope.
API changes
API addition: a new path_alias
revisionable content entity type.
Release notes snippet
Custom URL aliases are now provided by a new path_alias
revisionable content entity type. The path.alias_storage
service has been kept in place for backwards compatibility, and its hooks have been deprecated.
If you have the contrib Pathauto module enabled, you must update to the latest version when you update to Drupal 8.8.0. Failure to update Pathauto when updating core could result in data loss.
Comment | File | Size | Author |
---|---|---|---|
#195 | interdiff-195.txt | 862 bytes | amateescu |
#195 | 2336597-195.patch | 102.84 KB | amateescu |
#193 | interdiff-193.txt | 857 bytes | amateescu |
#193 | 2336597-193.patch | 102.72 KB | amateescu |
#191 | interdiff-191.txt | 1013 bytes | amateescu |
Comments
Comment #1
slashrsm CreditAttribution: slashrsm commentedI looked into this a bit. The more time I spent with it the more it looked that it wouldn't be much harder to go all the way to "alias entities". I realize it is a bit crazy idea, but it gives us a lot of power and flexibility from the other hand. The way I implemented it in this patch doesn't change bootstrap preloads, etc so it shouldn't affect bootstrap performance.
Attached (proof of concept) patch converts path aliases to content entities, continues terminology standardization that we started as part of #2002126: [meta] Refactor the alias subsystem (classes in \Drupal\Core\Path) (path => alias) and adds two minor fixes that are needed to correctly install content entity that is defined in Core namespace.
I managed to install site, create alias for a node and navigate node page. Didn't touch tests, migration, ...
Now tear it apart! Ready, set, go! :)
Comment #2
slashrsm CreditAttribution: slashrsm commentedComment #3
slashrsm CreditAttribution: slashrsm commentedI initially made it revisionable but that introduced dependency on user.module due to revision_uid base field so I decided to remove that.
Comment #4
Dave ReidRe-applying tags and priority from #1854284: Path aliases are no longer arrays and cannot pass additional data to path hooks since this fixes a regression between D7 and D8.
Comment #6
slashrsm CreditAttribution: slashrsm commentedAnother step forward but still work in progress.
Comment #7
BerdirAre you sure that removing them is correct and not replace with installEntitySchema('path') ?
Did not look at anything else yet, just noticed that.
Comment #9
slashrsm CreditAttribution: slashrsm commentedSome more progress.
Comment #11
slashrsm CreditAttribution: slashrsm commentedComment #12
slashrsm CreditAttribution: slashrsm commentedComment #15
slashrsm CreditAttribution: slashrsm commentedFixed two more tests; only one left. Moved installation of entities in core namespace to install_base_system().
Comment #17
slashrsm CreditAttribution: slashrsm commentedComment #18
slashrsm CreditAttribution: slashrsm commentedYay! I'll try to do some performance testing/profiling to see if this introduces any regression.
At this point container ends up with uncomplete list of namespaces. I'm not sure where this rebuild should happen but it apparently fixes the last failing test (field_storage_config entity not found).
Comment #19
slashrsm CreditAttribution: slashrsm commentedAlso, there are still some things that could be done. Edit, delete, listing, ... are currently not working as "real" entity handers. Since all those pages still work as they should I'd rather do this kind of tasks as follow-ups to prevent the patch from growing even bigger.
Comment #20
slashrsm CreditAttribution: slashrsm commentedComment #21
slashrsm CreditAttribution: slashrsm commentedComment #22
slashrsm CreditAttribution: slashrsm commentedDid some ab benchmarking and profiling:
ab benchmark
HEAD
Patch
Profiling
HEAD
patch
Comment #23
slashrsm CreditAttribution: slashrsm commentedReroll and summary update.
We could provide BC layer by keeping path.alias_storage service and existing CRUD functions in place and operating against alias entity tables directly.
This issue is contrib blocker and was worked on during Amsterdam, which hopefully makes it a bit more acceptable.
Comment #25
slashrsm CreditAttribution: slashrsm commentedComment #26
Fabianx CreditAttribution: Fabianx commentedMight need some more profiling, not yet reviewed in depth, yet.
Comment #27
Wim LeersThis would also solve #2480077: Using the URL alias UI to change aliases doesn't do necessary invalidations: path aliases don't have cache tags.
Comment #28
BerdirI really doubt that :)
It would give you a default cache tag for the alias, but that's easy enough to build when you don't have an entity too. That's not the hard problem here (that is things like that you can *add* an alias for something that didn't have one before, for example...)
Comment #29
Wim LeersRight, but aliases being entities means that we don't have to retrofit aliases with cache tags in some hacky way; they'll be able to use cache tags in exactly the same way as others can today.
Comment #30
jibranAwesome patch but we need a beta evaluation criteria and probably a reroll.
Comment #33
saki007sterTrying rerolling. Big reroll please review it, not sure if done the right thing here.
Comment #34
saki007sterComment #36
slashrsm CreditAttribution: slashrsm at Examiner.com commentedI honestly think this will unfortunately need to wait until D9. We could potentially do this in one of 8.X releases, but we'd have to do it in a non BC-breaking way.
Comment #37
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedNo, as this is BC compatible data structure wise it is internal re-factoring also unblocks some contrib projects, so likely 8.0.x or 8.1.x material.
Comment #38
slashrsm CreditAttribution: slashrsm at Examiner.com commentedI don't think this is BC-compatible. At least how it is done in current patch (underlying data structure and parts of API are significantly different). I discussed this with @berdir in Amsterdam last year and his opinion was that it is not realistic to expect this to go in after feature freeze. That was also the main reason why I stopped working on this patch.
Comment #39
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedYeah, I remembered wrong.
However without the rename of source to path I think it could be BC compatible.
Comment #40
googletorp CreditAttribution: googletorp as a volunteer commentedComment #41
slashrsm CreditAttribution: slashrsm at Examiner.com commentedThis is 8.1.x material I guess.
Comment #44
phenaproximaDiscussed with @catch, @moshe, and @timmillwood on IRC. We all agree that this is a good direction for path aliases, and we agree (so far) on the following implementation points that I want to repeat here so that we have it in writing:
More thoughts and ideas are of course welcome! :)
Comment #46
nevergone CreditAttribution: nevergone commentedAnd now?
Comment #47
timmillwoodBefore this issue, we're working in #2856363: Path alias changes for draft revisions immediately leak into live site to provide a interim solution to get path aliases working with forward revisions.
Comment #51
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThis functionality is needed by the Workspace module, so I rewrote the patch with an approach that keeps full BC for now. At least some tests pass, let's see what the testbot has to say.
Comment #53
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedAll the update tests are failing because of #2830326: Broken link to 'Put your site into maintenance mode' on update.php results in WSOD. The problem is that we are trying to generate a link to the maintenance mode page without a working path alias storage (that's being added in an update function), so that issue is a hard blocker for this one.
Other changes in this patch:
- dropped the "publishable" part because I feel that needs a separate issue so it can be discussed properly
- added a quick update path which just creates an entity for each existing entry in the
url_alias
table. A more elaborate version is needed so we can take advantage of the multilingual capability of the new entity type and group existing per-language aliases into a single path_alias entity somehow- fixed a bunch of kernel tests that need to install the schema for the new
path_alias
entity typeLet's see where we're at with all the test fixes.
Comment #55
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedTurns out #2830326: Broken link to 'Put your site into maintenance mode' on update.php results in WSOD doesn't help very much, so I opened #3006086: update.php should not process path aliases instead.
This patch is the "for review" one from #53 combined with #3006086-2: update.php should not process path aliases.
Comment #56
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedIt would help to attach the patch as well :)
Comment #58
Wim Leers@amateescu++
From 181 to 32 failures, nice!
Comment #59
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedFinally managed to find a solution for #3006086: update.php should not process path aliases, so we just need a few more test fixes here. The combined patch includes #3006086-16: update.php should not process path aliases.
Comment #61
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedUpdated he combined patch to use the latest one from #3006086: update.php should not process path aliases and added the required REST test coverage.
There should be single fail left,
\Drupal\Tests\rest\Functional\Update\EntityResourcePermissionsUpdateTest::testBcEntityResourcePermissionSettingAdded()
, and the fail is caused by\Drupal\filter\FilterPermissions::permissions
trying to generate a URL with path processing enabled, which means that the blocker issue is maybe not done yet :/Comment #63
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedIt seems that the REST test coverage check wants the test classes in very specific locations, and I was missing the hal_json parts :/
Also looked into the failure from
EntityResourcePermissionsUpdateTest::testBcEntityResourcePermissionSettingAdded()
and it's just another example of trying to use Drupal's API before running the database updates, when the system is in an unknown/broken state. The fix is pretty easy though, we can check the output of rest's permission callback directly instead of going through the user permission service.Comment #64
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedComment #65
heddnJust had a concern come to me. When running a migration from D6 or D7, today the path alias migration takes really long. Do we expect this to get even worse after converting to a full entity? And do we have tests of a migration after the conversion is in place? I would expect more changes in those tests, so I wonder if we have enough coverage to test this. Or maybe the API changes we have here are transparent enough that those tests don't need to change. But I do see some direct database calls in the kernel tests, so it makes me wonder.
Comment #66
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThat's exactly what's happening :) The current path alias destination plugin works with the current APIs, which are not impacted in any way. It's also a very good test that we are not breaking BC with this patch.
However, we should definitely update the default migrations to use a regular entity destination plugin instead of the custom path_alias one in a followup issue.
Comment #67
heddnAh, my causal glance was in error. On more thorough study, I see that direct db call is against the d6 source.
Database::getConnection('default', 'migrate')
Let's make sure we get that follow-up opened? Even though it will be postponed for now.
Great work. It nice to see it working so painlessly on the API front.
Comment #68
benjifisher@amateescu, @heddn:
I just opened #3009014: Convert path alias migrations to use entity:path_alias destination to track the migration updates discussed in #65-#67.
Comment #69
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThis patch needed a quick reroll.
Comment #70
catchNot a comprehensive review because I just ran out of time in the middle of it, but some minor things:
This could use ::invokeAllDeprecated() now, see https://www.drupal.org/node/2881531
This should use escapeLike(): https://api.drupal.org/api/drupal/core%21lib%21Drupal%21Core%21Database%...
For consistency this should have an accessCheck(FALSE);
Comment #71
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThanks, @catch! All good points :)
Comment #73
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedBased on the test fails from #71, I think we should deprecate the hooks in a followup issue, where we can also replace all their usages: #3011003: Deprecate the old path alias hooks
Comment #74
catchI'm not 100% on doing the hook deprecations in a follow-up because it leaves us with two parallel APIs, but can also see good arguments for keeping the patch below 90kb. Here's a bit more of a review otherwise:
This should probably be deprecated - are usages removed here?
It's confusing (to me at least) that lookupPathSource() and lookupPathAlias() have the same method signatures.
This is only used once so could skip assigning it?
This is concatenating % before and after $keys as well as using escapeLike(), as well as replacing * with % - should it be stripping * instead?
Comment #75
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedWell, to be honest, all of #3007661: Modernize the path alias system has to be done for 8.7 if we want to end up with a real improvement for the path alias system (i.e. usable with workspaces) :) A patch for everything listed in there would be huge, so I'm trying to split up the work in as many logical pieces as I can. I'm aware that every patch should leave the system in a shippable state but in this case I'm learning this subsystem as I go, and I can't really think of another way do it.
For example, after working on #3007832: Convert custom path alias forms to entity forms in the past couple of days, I figured that we need another issue for converting the path alias admin listing to an entity view builder, and also that some methods from the current alias storage will not be needed anymore in the new entity-based storage (
aliasExists()
,languageAliasExists()
andgetAliasesForAdminListing()
), but I can only be confident about removing them after those patches are done and reviewed.1. Usages are not removed here, and I've re-purposed #2233595: Deprecate the custom path alias storage for deprecating the current storage as well as converting its usages.
2. This is just carried over from the current implementation, but I agree that it's very confusing. Fixed.
3. Right, same in
lookupPathSource()
. Fixed.4. I think the
*
character is used as a wildcard in the custom filtering code used by the alias overview form, so we can't strip it :)Comment #76
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedOk, #3011807: Convert the path alias admin overview to a list builder is done and now I'm 100% sure that we don't need
aliasExists()
,languageAliasExists()
andgetAliasesForAdminListing()
in the new storage.Comment #78
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedIt would help if I ran at least some tests before posting a patch :/
The interdiffs in #76 and the one from this comment might look confusing, and the reason is that I'm trying to minimize the changes in the "for review" patch now that the path forward for the followups is a bit more clear.
Comment #80
catchDiscussed with Alex Pott, we both think it's OK to deprecate the hooks in a follow-up.
Comment #81
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented@catch, great! I'll start working on that part now and it will most likely be ready before this one gets in :)
Comment #82
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedTurns out that the patch to deprecates the old path hooks is ~10K: #3011003-2: Deprecate the old path alias hooks. Let me know if you would prefer to fold it into this issue or keep it separate.
Comment #83
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedWhile waiting for reviews here, I started to work on updating Pathauto for this change: #3012050: Prepare for the conversion of path aliases to entities
Comment #84
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedA few improvements based on various things that were discovered in all the followup issues.
Comment #85
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedComment #87
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThis should be better.
Comment #88
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedFixed that CS issue.
Comment #90
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedNeeded a reroll because of #2342699: SqlContentEntityStorage tries to update identity/serial values by default.
Comment #91
catchI think it's worth folding #3011003: Deprecate the old path alias hooks in here given it's already ready - the hook invocations are changed here then changed again, so it should be less than a 10k addition to this patch.
Comment #92
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedSure, let's do that :)
Comment #93
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedWe have a new approach in #3006086: update.php should not process path aliases so let's try a new combined patch as well.
Comment #94
catchI think this needs a re-roll - just committed the update.php patch.
I reviewed again, and only found one thing to comment on otherwise.
I think we should install the schema in a hook_update_N() then migrate the entities and drop the table in the post update. That way we'll start the post updates with a coherent schema that just needs a data migration.
Comment #95
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedFixed #94 and added an upgrade path test.
I'll start writing a change record for all of #3007661: Modernize the path alias system because I think it would be more discoverable via search engines than separate ones for each issue from that plan.
Comment #96
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThe CR is at https://www.drupal.org/node/3013865, added it to the patch.
Comment #97
larowlanThe issue summary talks about 'passing around arrays' as being a motivation here, but because of the interface we're still doing so. Do we want to add a new interface (and implement it with this class) that allows an entity based return. Otherwise we're not really addressing the original problem.
Note to self and other reviewers: this is only used by UI elements like forms and not in the critical path, so there is no performance cost of adding the entity load
Is there a reason why we didn't just align there field names?
do we need accessCheck(FALSE) here too?
not a huge fan of these 'getters that set things'
Removing this is a BC break, its a public function. We need to retain it and trigger an error (and a legacy test to verify as such)
Should this call the parent and merge the result?
Can we get a more descriptive message here, it is pretty blunt. Something like 'hook_path_delete is deprecated, use hook_ENTITY_TYPE_update for the path_alias entity type instead' - also we need legacy tests for this.
these are already defined on AliasStorage, is there a way we can re-use with inheritance here?
note to self and other reviewers: these match the existing schema indices
So now we get into BC policy grey areas. We don't have a policy on whether hook implementations constitute API in https://www.drupal.org/core/d8-bc-policy
It is 100% legitimate to use hook_module_implements_alter and remove menu_link_content's implementations here, substituting them with your own custom implementation that may call the original functions. Dynamic entity reference does that for field module. Removing these functions would be a fatal for any modules/custom code that was doing so.
I discussed with @catch and he noted that hook implementations are considered internal, so I updated the BC policy as such.
However the removal of these functions (and the system ones) needs to be part of the change record.
why are these changes needed? - can we comment as to why we're calling the controller direct and bypassing the user permissions service?
According to https://www.drupal.org/core/d8-bc-policy#tests we should be using deprecation here. Modules could be making use of this utility. And that means we need a legacy test too.
Can you elaborate on how this happens in the installer with this patch? I see a lot of 'installEntitySchema()' changes in tests so I does that mean its no longer automatic, which probably means the original intent of this code (avoid a lot of Exceptions) is no longer occurring?
can we do away with these altogether now by using the entity specific cache tag when we write the cache?
As we saw with the taxonomy tree update to entity field, we really need to be testing this with a substantial data-set and with alternate languages. Can we build a custom data set that is more expansive for this update test. This issue has the potential to be as disruptive as that one. Once bitten twice shy etc
Comment #98
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThanks for the review!
Re #97:
1. We are adding a new class,
\Drupal\Core\Path\PathAliasStorage
, and the old one is kept only to preserve BC for the previous API. Note that the existing alias storage will be deprecated in #2233595: Deprecate the custom path alias storage.3. The nice thing about writing new APIs is that we are able to improve things, including naming :) The current field names are based on a discussion with @catch and we both think they make more sense.
4. Yup! Added it.
5. I don't see any other way around that..
6. It's also marked
@internal
, so no one should rely on it. And in this case, I think a hard break is preferable instead of having modules think that they can still work with aurl_alias
table and its schema.7. Not sure. I wanted to minimize the changes that might be problematic performance-wise in this patch. But we can discuss it in a followup.
8.
\Drupal\Core\Extension\ModuleHandler::triggerDeprecationError()
already provides a better and explicit message regarding the deprecated hook implementations, but I changed it a bit anyway to be more specific.9. I don't think so. The existing storage interface clashes with the new entity storage-based one at least on
load()
,save()
anddelete()
.11. Added a note in the change record about the hook implementations that have been renamed :)
12. Added a comment to explain why.
13. I found only one usage of this class in contrib: http://grep.xnddx.ru/search?text=UrlAliasFixtures . Is it ok if I submit a patch for that module instead? It will be easier for everyone :)
14.
install_profile_modules()
callsinstall_core_entity_type_definitions()
which installs the newpath_alias
entity type provided by core.15. I'm not sure, maybe this can be discussed in #2480077: Using the URL alias UI to change aliases doesn't do necessary invalidations: path aliases don't have cache tags along with point 7. above.
16. Expanded the test coverage :)
Comment #100
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedForgot that the testbot installs Drupal in a subfolder. Let's try this one.
Also forgot to address #97.8: added legacy tests for the deprecated hooks.
Comment #101
mondrakeWe are trying to avoid direct db calls to entity tables, see e.g. #3012599: Replace all db calls to aggregator_feed and aggregator_item tables with Entity APIs. Suppose these ones can be converted to EntityQuery instead?
Comment #102
Wim LeersCurrently just a spectator, just wanted to thank everyone here for the significant recent progress! 👏😃
Comment #103
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented@mondrake, good point, I'll do that in #2233595: Deprecate the custom path alias storage since we're moving that test method to a different test class over there.
Comment #104
vasi1186 CreditAttribution: vasi1186 at Amazee Labs commentedHi all!
That's again some really great work! I've also tried this patch and it worked good, migration from the old aliases table worked perfect.
Comment #105
Wim LeersFascinating! Can we document why this is the case?
list_cache_tags = { "route_match" }
to the annotation. ✅Shouldn't this say that it'll be removed in Drupal 9.0.0?
Let's use
RestPermissions::class . ':permissions'
instead to simplify future refactoring. ✅Looks like
@expectedDeprecation
is missing? ✅Hah, another entity type that needs to override these relatively obscure edge case properties :)
Great to see you were able to write this test without changing anything in the existing base classes!
I'd argue that switching on method is unnecessary here :)
And in fact, in the case of
getExpectedUnauthorizedAccessMessage()
, this is what the base implementation already does for you. ✅This is copy/pasted from
\Drupal\Tests\node\Functional\Rest\NodeXmlCookieTest::testPatchPath()
, but does not make sense here. ✅Items with a ✅ have been addressed in this reroll.
Comment #106
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThanks, Wim!
Re #105:
1. Because that's what the current alias storage does :)
https://cgit.drupalcode.org/drupal/tree/core/lib/Drupal/Core/Path/AliasS...
https://cgit.drupalcode.org/drupal/tree/core/lib/Drupal/Core/Path/AliasS...
3. Sure, why not. The full deprecation message can be seen in the interdiff, in the
@expectedDeprecation
annotations.The rest of the changes look good to me from #105.
Comment #107
Wim LeersSounds good! And…
Hah, oops! 😅 Thanks for silently fixing it for me ❤️
Next steps:
Comment #108
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedI pinged @Berdir and he will try to look at this patch sometime :)
As reviewed and mentioned by @larowlan in #97.2, the parts of the alias system that are in the critical path are not changed in this patch, they're just moved from the old alias storage to the new entity based one, so I don't think there's anything else we can do in terms of performance here..
Comment #109
Wim Leers👌 That's what I thought, thanks for confirming!
Comment #110
dawehnerI'm curious why we don't use
assert($path_alias instanceof PathAliasInterface)
?There is no checking for the return value, this would cause a fatal error when something passes along a wrong PID. Is this fine?
It would be nice to document why we don't want
access()
hereIs there a reason you don't use
language()->getId()
?I'm wondering whether this path alias entity should be marked as internal. I would expect that the prefered interacton is still via the path alias storage class?
Could/should we add constrains that validate that these paths are valid?
This comment should explain why, not what :)
It feels like using
source()
would be a better name aka. more obvious of what is going on.I'm confused about the conditions here. == points towards a string, but > points towards a number, but the constant is 'und'?
Comment #111
BerdirBetter to know than to guess :)
Test scenario, disabled page and data cache bins but enabled render cache and dynamic page cache, so enough to hit the alias check but not too much more. Testing with a plain ab -n 500. A node at /node/1 and aliased as /foo.
Both HEAD and with the patch have around 60 requests/s (16ms per request), varies a bit, sometimes as low as 55requests/s but the difference between the patch is definitely within the variance between each run. At first I tested with the data cache bin still enabled but then realized I was hitting the route cache and inbound path processing didn't even run. But having that enabled or disabled did not result in a measurable difference with ab, so there's simply too much other stuff going on that .
That's not very surprising, just like @amateescu said, that code path is basically unchanged.
Still need to review it, one thing I did run into is the constructor change in AliasStorage, once when applying the patch and running drush updb, it died with a fatal error on the old constructor argument. That's a tricky situation, one thing we could try is remove the type hint and check what we get but even that would break subclasses I think.
Comment #112
dawehnerIn general this is an easy to avoid problem, so let's just avoid it.
Comment #113
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented@dawehner, thanks for the review!
Re #110:
path
feels better. @Berdir, any opinion on this point?Also fixed the problem mentioned in #111 with the old
AliasStorage
constructor.Comment #114
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedRegarding #110.8, how about calling the field
system_path
instead of justpath
->$path_alias->getSystemPath()
/$path_alias->setSystemPath()
? That should make it's purpose perfectly clear.Comment #116
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedRerolled for a small conflict with #2659940: Extension System, Part III: ThemeExtensionList and ThemeEngineExtensionList.
Comment #118
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedFixed that test as well.
Comment #120
jibranReroll. I'd like to get this fixed before Drupal 9.
Comment #122
jibranSo entity without the title field cokes up JSON:API tests. Also,
Drupal\Tests\jsonapi\Functional\PathAliasTest::testRelationships
test doesn't handle 'und' languange very well. I'll ask JSON:API experts a ping in slack to help me with it.Comment #123
jibranHere is some details about what I commented in #122:
Even though there is no uid or revsion_uid field for PathAlias entity. I still have to mention this because
\Drupal\Tests\jsonapi\Functional\ResourceTestBase::setUpFields()
adds a user ref field.PathAlias sets default value of 'langcode' to 'und' but
\Drupal\Tests\jsonapi\Functional\ResourceTestBase::doTestRelationshipMutation
show 405 for 403 PATCH request.Drupal\Tests\jsonapi\Functional\PathAliasTest::testPostIndividual
fails with following error.Comment #124
Wim LeersGo @jibran! 🥳
Comment #125
jibranLet's try this.
Comment #127
jibranSome more progress.
Comment #129
jibranHT by @gabesullice worked great. This patch should be green unless I broke something in JSON:API. I'm going to credit @gabesullice for helping me out in the slack and during API first meeting.
Comment #131
jibranComment #132
jibranReroll after #3043907: DatabaseCacheBackend::ensureBinExists() does not properly handle exceptions, #3067196: Properly deprecate LinkGeneratorTrait, #2875394: Replace all calls to db_query, which is deprecated, #3039026: Deprecate file_directory_temp() and move to FileSystem service and #3073872: Allow for Layout Builder sections to be given administrative labels
Comment #134
jibranA new fail after #3065663: Several post-update functions try to update config entity types without checking if they exist
Comment #135
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThis should fix the test failure.
Comment #136
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedDid a self-review and cleaned up a few things.
Comment #137
jibranI think we need to get #3012050: Prepare for the conversion of path aliases to entities in committable condition to commit this patch to the core.
Comment #138
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedFixed the description of the
alias
base field, as pointed out by @chx, and brought over a few improvements from #3007832: Convert custom path alias forms to entity forms that should be part of this initial patch.Comment #139
jibranThis is NW to fix the JSON:API fail here https://www.drupal.org/pift-ci-job/1391907.
Comment #140
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedIt can stay at NR so people can review it.
Comment #141
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedFixed the JSON:API thing.
Comment #142
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedWhile working on #3012050: Prepare for the conversion of path aliases to entities, I found that there's a need for looking up a path alias by system path (source) in a few places, most notably
\Drupal\path\Plugin\Field\FieldType\PathFieldItemList::computeValue()
(core) and\Drupal\pathauto\AliasStorageHelper::loadBySource()
(contrib).The
\Drupal\Core\Path\AliasStorageInterface::lookupPathAlias()
from core is already doing that, but it's only returning the alias string, while those two methods above need a few more information (likeid
andlangcode
), so I changed the corresponding methods from the newly introduced\Drupal\Core\Path\PathAliasStorage
entity storage class to return a fullpath_alias
entity object, which is usable in all the places where we need to find a path alias for a given system path.Important note for reviewers: This change means that
AliasManager::getPathByAlias()
andAliasManager::getAliasByPath()
now have to do an additional (entity) cache get if the given system path or alias are not in the statically cached list of aliases. However,AliasManager::getAliasByPath()
is very aggressive with populating that static cache via$storage->preloadPathAlias()
, so I don't think that the entity cache get is actually reached in most cases (i.e. when the "preload cache" is warm).Comment #143
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedFWIW, #3012050-12: Prepare for the conversion of path aliases to entities passes all tests, so Pathauto is ready for this core change :)
Comment #144
Wim Leers#141: that looks an awful lot like the changes being made in #3057175: Implementation of user name in JSON:API can result in overwriting data! :) That should land very soon, and will make this patch slightly simpler again 🥳
(Thanks to @jibran for pointing me to #141!)
Comment #146
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented@Wim Leers, glad to hear that!
Fixed the ordering order :)
Comment #147
jibranDo we need to deprecate this class and its methods? Maybe in a follow-up?
Comment #148
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedSee #2233595: Deprecate the custom path alias storage :)
Comment #149
jibranThanks!
I was thinking about the performance impact of the conversation. Just putting my thoughts down here:
The only things concerning for me are changes like these:
We are replacing a simple select query with EFQ + loading of the entity using entity API but static and built-in entity cache will help here so I think this is not awfully concerning.
Recently, I worked on a client project on which we installed the redis module. Ever since we changed all the cache backends to redis,
MySQL url_alias select
(~1000 calls per min on ~200 requests per min) is the second most time consuming query afterMySQL key_value select
. After path aliases conversion to entitiesMySQL url_alias select
will go away becuase of built in entity cache in Durpal 8. This was the most interesting thing for me which intrigued me to work on this patch back in #120.Do we need to benchmark this patch to highlight its importance of the performance gains after path aliases conversion?
Comment #150
jibranFor me this issue is RTBC but it would be great to get some overall reviews from API-First team, entity subsystem maintainer other than @amateescu, @Berdir as pathauto maintainer and @catch as Path subsystem maintainer.
Comment #151
Wim Leers#149: That's a super super interesting perspective rooted in real world challenges. Thank you for sharing that!
@Fabianx added the #26, @amateescu removed it in #100. I also asked about this in #107.2. @amateescu responded in #108 that it doesn't because the parts of the alias system in the critical path are not changed, which framework manager and core committer @larowlan already pointed out in #97.2.
tag inBut … that seems to contradict what @jibran wrote in #149, where he specifically calls out that changing the critical path to using Entity API's loading + built-in caching improves performance for him. @amateescu, can you clarify?
Comment #152
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented@Wim Leers, @Berdir profiled the impact of this patch on the critical path in #111 and the results were good, as expected, because it's not changed. Note that _saving_ a path alias is not in the critical path at all :)
Comment #153
Wim LeersAha! GREAT! I'll do a deep dive on the patch then. Meanwhile, it'd be super useful to get an updated issue summary. That will be necessary for this to be committed anyway.
(The change record at https://www.drupal.org/node/3013865 already looks 👍 Although I think a valuable addition would be to include a recommendation for how a contrib module can support both 8.7 and 8.8 at the same time: presumably by implementing both hooks and maximally reusing code? Or perhaps not? That's exactly why a recommendation would be valuable!)
Comment #154
Ghost of Drupal PastMy interest in this patch comes from, if I gather correctly, this will make aliases work with the workspaces module which we need. We currently coded up a drafter module to work around the problem of workspaces not being ready (I will release this shortly, I already wrote a blogpost too). Without further rambling:
Amazing, incredible work. A few nits:
The post update function scares me... I mean, yeah, it's using the entity API is real nice but the performance implications scare me a bit. I feel this could be solved with a few INSERT ... SELECT at a much much higher speed. Was this option considered?
Comment #155
Wim Leers#3057175: Implementation of user name in JSON:API can result in overwriting data just landed, the part of the patch added in #141 should become much simpler now. :)
Comment #156
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedI was hoping for that as well, but apparently that's not the case, we still have to complement the changes that were done in that issue.
Comment #158
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedI shouldn't have tried to fix JSON:API's test coverage in this patch, reverted most of the changes from #156 and kept only what's strictly needed.
Comment #159
jibranHere is an interesting point about upgrade path, in pervious patch releases we have updated existing entities(block content, menu link, taxonomy terms) to revisionable/publishable and we have seen developers battling with the upgrade paths for those entities.
In this case, the upgrade path is pretty straight-forward:
Installing a new entity type is way simpler than upgrading an existing one. It also has fewer edge cases.
I can't predict the future and it is just an observation so don't quote me on this. :)
Comment #160
BerdirStarting with a code review to get back into it. I'll try to reply to some comments above, need to catch up on what exactly I did in regards to performance testing and if the logic is really still the same (see below).
Also, I've been testing the upgrade path on a site with around 700k or so aliases and yes, as #159 said, this update function is simpler than existing entity => revisionable update functions, *but* that doesn't mean is faster. Based on some rough calculations, it would take about 2h for this site (and I'm sure there are larger ones out there) and increasing the entity batch size only helped marginally.
So what we agreed on is to switch to raw INSERT queries. I don't think we can do a single INSERT INTO ... SELECT FROM, because we also need to generate UUID's, but with a few bulk insert queries, I still expect this to be massively faster than saving entities.
could use language()->getId()?, I see that came up before, but not sure if that follow-up issue you promised exists? :)not quite sure about the comment here and also another one above. The necessary generic statement ((all) api functions should be able to access all entities) isn't actually always true.
Maybe just: // Ignore access restrictions for this API?
we use the storage here 3x, maybe store in $storage, then you could possibly even do something like $storage->delete($storage->loadMultiple())?
is there a reason we're not yet updating these other functions?
does an empty default really make sense, does that just reflect the existing schema?
hm. I haven't fully thought this through, but we had a few really weird problems in the past when making fields not revisionable.
They are not translatable, but the language can still change, typically between language specific and all languages. that would mean that doing this in a workspace would also immediately affect the default revision?
This would likely be another case where the API from #2580551: Optimize getCommentedEntity() would come in handy.
One of the slowest parts about loading an entity is accessing the first value through a field, as we have to load the field definitions, initialize multiple objects and so on.
huh. if we actually have a custom implementation and don't just set the key, would it make sense to do something like "$source => $alias" instead of just alias?
Hm, wondering if we want to use a different name for this, as in context of an entity storage, it doesn't actually pre-"load" (the entity)?
Below we have the lookupX methods, those then however do an actual full load. I'm wondering if that changed since my previous (performance) review, because I'd definitely expect that to be a visible performance issue.
this. seems a bit strange now, apart from the problem that entity query is slower than a DB query, this seems doable as an entity query + load if that's really what we want to do?
isn't this an API change?
This uses the existing getAliasesForAdminListing() method and desn't have to change it. Should that method ensure that they properties remains the same?
I guess as a follow-up, we'd look into converting this into an entity list builder + view?
I guess the comment could be be slightly improved to say it needs to have the entity schema installed or so, but it's a deprecated class anyway (but the same might exist in the new base class)
I was hoping we'd finally find a way to do this better (not have hooks in system.module), but the entity system doesn't have events yet, so.. :)
the comment will be a bit confusing when commit, because the "also need to" only makes sense in the context of this diff when you know the old value. Maybe explain why both need to be uninstalled or just drop the also?
How nice, 20 new test classes :p
I'm sure Wim will not like it, but one option could be to split out the path entity json/rest (?) test coverage to a follow-up, then we could skip all those label-related changes here.
interesting location for these tests, too. Would IMHO make more sense for them to be in rest/jsonapi, now we have Tests in core/tests that need those modules.
Comment #161
Berdir> @Wim Leers, @Berdir profiled the impact of this patch on the critical path in #111 and the results were good, as expected, because it's not changed.
Yes, just like I suspected :) Yes, I did test it back then, but the lookup* methods back then returned the raw value, so really nothing changed. That apparently changed now, which definitely does change the code in the critical path.
Yeah, no, that's not how this works, I wish it were that simple :)
First, you still have a query to find the alias ID, that isn't going away. If the entity is then cached, then in the best-case scenario, you end up with the same amount of queries. *If* the entity is cached. If not, then you end up with two (?) additional queries against the base and the revision table to then actually load the data, instantiate the entity object, and then access the information through it, which needs to get the field definitions, instantiate field item and list objects, ....
There are two relevant scenarios for performance of the alias storage:
A) Every incoming request (excluding things that where a internal page cache hit) needs to do a lookup by alias to translate it back into the system path. That happens very early in the request and is very much in the critical path. redirect.module has a similar scenario, what we did there is add \Drupal\redirect\RedirectRepository::findMatchingRedirect(), that with minimal dependencies executes a raw query with query() to find a possibly existing redirect ID. The difference there is that actually finding a redirect and returning it isn't really in the critical path anymore, because most requests won't have that. For path, that's different, having an alias is the rule not the exception and it needs to be as fast as possible as well.
B) The other side is that every URL we *build* while creating a page needs to look up if there is an alias, so we have lots of queries against that table on sites with a lot of teasers, menu links and so on. We spent *a lot* of time optimizing that in D8 and before, for example with the whitelist (unchanged by this) and also the preloading. I didn't profile it yet ( will do so on my large site after we have an upgrade path that doesn't take 2h), but I expect that we have a considerable performance hit there when the preload cache isn't ready yet. Which will be the case for many requests as that is basically per path.
As commented above, the lookup/preload methods are currently inconsistent in what they return. My basic, not fully thought through, proposal would be to keep these functions that really are in the critical path as they are now, with direct DB queries and returning raw array results in place. I'd even suggest to keep them as a separate AliasStorage service/interface. And then we only deprecate all the stuff that isn't actually in the critical path and the entity system does better anyway (save(), load(), delete()*, that admin listing stuff). Going back to the original use case, that would allow to implement a decorator of this service that stores the data in redis/memcache and then you could really get to a point where you can avoid queries against the url_alias/path_alias table at all. Given your redis/memcache instance has enough memory to store all that information.
* For delete(), I realized that we could run into scalability issues, simple use case: Deleting all DE aliases because you just removed that language (not that we actually do that right now...). You might have 100k of them. And right now delete() is a DELETE query with conditins, so that would kinda work. But the new implementation needs to load all entities and then delete them. I don't think we can or have to do something about it, except maybe document that in the CR, that you should use batch for that? (pathauto btw already does that, as it also has to clean up some of its own data).
Comment #162
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThanks @Charlie ChX Negyesi and @Berdir for the reviews!
Posting just a quick patch for now which changes the update function to do raw insert queries, and also fixes #160.6. I'll write a more detailed reply in a followup comment :)
Comment #164
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedHere we go:
Re @Charlie ChX Negyesi in #154:
2. I discussed this with @catch when I started working on this patch last year, and the consensus was that we should rename the fields/columns to something that makes (more) sense, for example "source" to "path". The "pid" to "id" change was based on the fact that the entity type ID is "Path Alias", so the short version would have to be "paid", and we both though that a simple "id" would be better :)
3. That comment no longer exists because the
langcode
field is revisionable now.4. The update path was changed to raw insert queries in #162, and apparently it's ~20x faster according to @Berdir's test.
Re @Berdir in #160:
langcode
field to be revisionable.preloadPathAlias
is no longer part of the path alias entity storage class ;)Re #161: I removed all the performance-sensitive methods from the new entity storage class and kept them in the existing
path.alias_storage
service. Let's decide whether we want to move them to a different repository class/service in #2233595: Deprecate the custom path alias storage.Comment #165
Ghost of Drupal PastI was thinking
INSERT INTO path_alias SELECT *,uuid() FROM url_alias
or some variant thereof. PostgreSQL hasuuid_in(md5(random()::text || clock_timestamp()::text)::cstring);
per https://stackoverflow.com/a/21327318/308851. This wouldn't be 20% faster, it'd be ~infinitely faster :)Comment #166
BerdirFWIW, it's ~20x faster (based on the estimated 2h in the previous patch) not 20% :)
If you think a db-agnostic upgrade path is possible then that could be interesting, there's also an issue to add a revision UUID and the upgrade path is one of the things it got stuck on. but my personal opinion is that this is fast enough.
A single query that would write 700k or possibly millions of records at once might run into other limits, e.g. transaction size on complex mysql setups with multiple servers?
Comment #167
Ghost of Drupal PastMy bad, sorry. Sure, let's keep like this. Let's explore my idea in a separate issue. And let's hope people do statement/hybrid replication :)
Comment #168
jibranExcellent analysis @Berdir.
This observation is spot on I was running some before and after scenarios and it always comes back to time spent querying
url_alias
or after the patchpath_alias
. The patch did add two extra queries tokey_value
table on each page load.Comment #169
BerdirI guess the AliasStorage ands its current state with this patch is likely the trickiest part of this patch.
With our strict deprecation rules now, it does feel a bit like cheating as what we are doing here is basically soft-deprecating a bunch of methods like save(), delete() and load() that we definitely want to get rid of. Actually deprecating them would force us to replace all usages.
On the other side, there are good arguments to do it like this:
* We have to stop *somewhere* with this initial patch. If we convert this, what's next? convert the alias forms to entity forms? convert the overview controller to a list builder/view?
* Doing so would considerably increase (double?) the size of this already non-trivial patch, there are about 50 usages of the alias storage service at the moment in core, mostly in tests.
I guess the decision will be on whether the current state of the patch is something we could live with for 8.8 (and 8.9 and 9.0) in the worst case of not bein able to land the follow-ups in time.
Wondering if we want to drop these variable/documentation renames (for now anyway), they're the only changes in this interface. For the return value, we also keep the existing keys and while these might actually be the methods that stick, we can still do that in the follow-up that will deprecate all methods on this service that we no longer want to maintain.
Actually defining 'alias' as the label entity key would probably have the same effect and not require quite a few changes in the rest/json base classes. But we discussed that we might want to change this implementation and e.g. include more than one field, so I'm fine with keeping it like this.
This will break a bunch of contrib tests, which is fine, the annoying part is that they can't just add this line and move on because that will in turn fail tests on 8.7.
Maybe we can add a snippet to the change record that combines it with a class/entity type exists check before trying to install it?
Hm, last time I checked this I focused only on the comment, but now I'm wondering, aren't we cheating here?
The call chain where this fails is pretty wild, it's actually the uninstallation of block, which in turn uninstalls first block_content, then we delete the basic default bundle, which in turn deletes its default field, and this finally calls the menu_link_content hook and there $entity->uriRelationships().
So this specific case is self-inflicted by uninstalling a module before running updates, but couldn't this also happen in a real update function as well, before our update runs? You're not *supposed* to use the entity type, but installing/uninstalling modules in update hooks isn't uncommon and who knows what that does.
What about adding a try/catch in that hook instead? maybe with a watchdog_exception() and explanation why it can fail? That makes the test pass too.
PS: This also has one of the those post updates that you mentioned.
I suppose its fine to remove this, we basically have to remove the fixtures and http://grep.xnddx.ru/search?text=PathUnitTestBase doesn't find anything except that single core class using it. And there's no reason to do so.
Comment #170
BerdirAh, the change record needs a few updates (the part about the storage providing some methods isn't true anymore). Maybe we want to wait until we mention the AliasStorage at all until we really deprecate any of these methods?
And I'm sure this is important enough to also need a release note snippet in the issue summary. Speaking of that, maybe you can cover some of my notes above in there and specifically explain what is done here in this issue and what's done follow-ups?
Comment #171
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThanks, @Berdir! I've done all the updates to the CR and IS, added a release note snippet and fixed the points from #169 except for 5.
For that point, I started to implement your proposal and do a try/catch in
menu_link_content_entity_predelete()
, but then I remembered that we recently made a change to not use the alias system during updates exactly for this purpose, so the scenario for uninstalling a module in an update function should work fine. The change intestViewsPostUpdateExposedFilterBlocksWithoutBlock()
is, as you said, self-inflicted because we choose to uninstalling a module _outside_ of the regular db update process :)Comment #172
Berdir@see should only be used in a docblock as a standalone link, I'd just write this as "after they become publishable in URL."
maybe we can explain what you said in #171 in this comment, in case someone else is going to wonder about it as well?
Comment #173
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedFixed both points from #172 :)
Comment #174
BerdirOk, I think this is ready.
This doesn't change anything in the queries that we execute in the critical path, only saving/deleting/loading with conditions is done as path entities, and those aren't used when viewing an entity, only when e.g. editing aliases or nodes.
I've also tested the update path on a very large site, which lead to refactoring it to use direct database queries and making it much faster with that. Options using a single query were discussed but I think this is good enough.
There are quite a few follow-ups but as discussed extensively, we need to split up this patch somehow and this seems like a good compromise. The next steps are defined and the plan for AliasStorage is to basically deprecate everything except those three lookup/preload methods. But for that we need to convert the migration, forms and the overview first, or we have to to touch quite a few things there twice.
Comment #175
catch@jibran #168
Is this still the case? (I'm assuming not since we've kept the critical path the same) And if so which queries are these?
There are a lot of follow-ups here, but as with Berdir's comment in #174 this seems like the best compromise for splitting things up. This patch, and the one to make things publishable, have the biggest impact on the schema and potential data issues, and are cleaning up the technical debt of path aliases being a custom table.
The other follow-ups (entity list builder, entity form) are further clean-up of technical debt that this patch enables, but doesn't really introduce (i.e. we already have those custom forms etc., we're just not getting rid of them yet). What it does introduce is a discrepancy/duplication where we have custom forms for an entity, which could be confusing for someone learning the path alias system from scratch, but that seems like an OK issue to introduce vs. having an unreviewable patch.
One thing to note is that publishable path aliases will still need to happen in 8.8.x, or be deferred to 9.1.x, because we're trying to have a moratorium on schema changes between 8.8.x and 8.9.x/9.0.x.
Very glad that the update path has been optimized to direct queries here that will save a lot of stress for a lot of people.
I've reviewed the patch a few times before, and reviewed the last few interdiffs, and don't have anything to complain about.
Comment #176
catchAlso tagging WI critical.
Comment #177
Berdir> Is this still the case? (I'm assuming not since we've kept the critical path the same) And if so which queries are these?
Correct, that was based on a patch from before #162 which passed the queries through to the entity storage. And the two queries are from initializing the installed field storage/entity type definitions, which are kept in key_value, so every entity storage we access on a page results in two extra key_value queries. That's a side effect of using the installed definitions in 8.7 that I think we didn't fully realize back then.
Comment #178
jibranYeah, I can confirm before and after SQL query count is the same now.
Comment #179
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedComment #180
plachThe patch looks great! Aside from a few minor issues and some totally non-blocking nits this seems good to go to me.
However, one thing that I was wondering is why we need alias management to be baked in
\Drupal\Core
rather than in its own module. Of course, this design decision is pre-existing but the refactoring initiated here could be a good chance to extract all the alias management logic to a newpath_alias
module. It seems that with the deprecation follow-ups we would already be rewriting/getting rid of most of the alias-related code in\Drupal\Core\Path
and in thepath
module.If we decided to go this way, it might make sense to create the new module here, but it would definitely not be a blocker as long as we do that before 8.8.0.
Nit: the second
if
s could be replaced withelseif
s to improve clarity :)To match the previous logic exactly shouldn't we add a
->range(0,1)
here as we do in::load()
?Why are we removing the try/catch blocks in these cases?
I don't understand this: isn't
trim($string, '')
a noop? If not, can we update the comment?Could we check for
EntityPublishedInterface
instead of hard-coding thepath_alias
type?Nit: we could replace these with
static::$resourceTypeName
.Nit: we could type hint with
PathAliasInterface
(and$path_alias
) here.We already have
system_update_8802()
committed now. These need to be renumbered and all references updated accordingly.I think we need an early return when the storage is not an instance of
PathAliasStorage
.Nit: this bit confused me at first sight, what about
$last_url_alias
?What about using the
entity_update_backup
setting to drop this conditionally?Nit: as above, we could type hint with
PathAliasInterface
(and$path_alias
) here.Comment #181
Berdir> However, one thing that I was wondering is why we need alias management to be baked in \Drupal\Core rather than in its own module. Of course, this design decision is pre-existing but the refactoring initiated here could be a good chance to extract all the alias management logic to a new path_alias module. It seems that with the deprecation follow-ups we would already be rewriting/getting rid of most of the alias-related code in \Drupal\Core\Path and in the path module.
Hm. I think that might have been a good idea if we'd define a new system. I'm also a bit confused by the last sentence. 80% of Drupal\Core\Path *is* alias management (AliasStorage, AliasManager, AliasWhiteList) and 100% of path.module (it's nothing but the UI for path aliases), so there would be very little left in either place.
We'd have to move all of that to a the module (actually, we would IMHO instad move it to path.module), meaning we'd have to deprecate all of these services and would need to make the module either required or would require everything that looks up aliases to explicitly check for that module. We also have dependencies to Drupal\Core\Path\Alias.. in \Drupal\Core\EventSubscriber\PathSubscriber (which shouldn't be in that namespace but that's a different topic) and \Drupal\Core\PathProcessor\PathProcessorAlias, that would need to move too. As well as code in system.module: the hooks (these would be nice ;)), conditions (moving plugins is annoying) and the site information form (this can't move)
In short, I don't think that's worth doing and alias handling is such a central part of Drupal that I think it is fine to keep it there.
Comment #182
plach@Berdir:
Ideally decoupling all subsystems and leaving
\Drupal\Core
only absolutely required stuff should lead to a better system (more maintainable, more testable, blah, blah, blah), but I agree that we should definitely evaluate the effort of such a refactoring. The changes you described more or less match my expectations from my cursory look and didn't seem hugely problematic to me.Personally I think we should try to reduce the amount of entity definitions we have in
\Drupal\Core
(ideally to 0) and make what's in there just glue/adapter code between\Drupal\Component
and core modules. I understand though that this might not be the right time to focus on this stuff, as -alpha1 is looming, I was just pointing out that investing some time now would save (way) more time later.Comment #183
plachTo clarify, I'd be fine even with temporary having backwards dependencies between
\Drupal\Core\Path
and thepath
/path_alias
module (I thought a new module might make dealing with BC easier), as long as the plan is completing the refactoring.Comment #184
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedAddressed all the points from #180, thanks for taking a look!
url_alias
table. Now that the entity system installs it for us, they're no longer needed :)trim($var, '')
is a no-op, so fixed!old_url_alias
instead :)While working and testing all the changes above, I realized that the
path
(previouslysource
) field is not "read -only" for an URL alias, so it needs to be revisionable as well. Fixed that and added test coverage that it's copied properly to the revision table.post*
methods on thePathAlias
entity type, and it's certainly nice to get rid of scattered code from the system module.Comment #185
BerdirHm, This is actually the delete method, which currently doesn't have a range?
With aliases for multiple languages, it is definitely possible right now that you'd want to delete multiple aliases at once when e.g. deleting a node? So I don't think this is correct, wondering if we actually have test coverage for it.
was wondering about that line, but I just assumed you copied it from somewhere and then didn't look at it more closely. Agreed that the extra trim isn't necessary and doesn't do anything.
nitpick: I guess "if requested" or "if configured" makes more sense, it's always "possible"?
Comment #186
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedDiscussed the
range()
stuff with Berdir and he mentioned that in HEAD we only load a path alias before executing the delete statement so we can pass it along to the delete hook. Fixed that and also point 3. from #185.Comment #188
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedOne test failure from #184 shows that we do have test coverage for the range() stuff, because it passes (locally) with the patch from #186 :)
Fixed the other fails.
Comment #190
plachThanks for the updates!
But won't we be able to support content moderation at that point? Or will that happen in a later iteration? It seems checking the publishable interface would be useful regardless, I guess it's not even possible to support an entity type not implementing that...
Why don't we do this in
PathAliasStorage
itself, so we can inject the services?We're never dropping the table now :)
Btw, what about
'old_' . uniqueid() . '_'
as prefix to reduce the likelihood of collisions?Comment #191
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedYup, we will be able to support CM, but that also means some more UI work, which should happen in a later iteration.
CM actually supports non-publishable entity types, it's just not that useful for them :)
Not injecting the service was one of the goals for doing it in the entity class, and the reason is that we want to prevent (or at least lessen the chances of) a circular dependency with the alias manager service.
Oops, fixed :)
Comment #193
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedSigh.. since we are now generating a unique table name, we can't reliably determine whether the table exists, so we need to drop that assertion and trust the code :)
Comment #194
BerdirAs mentioned on slack, I think "$schema->findTables('old%url_alias')" or so should work?, e.g. as an assertCount(1, ...)
For integration with Content Moderation, I think the problem basically is that there's no way to actually test/see a draft alias. So while technically possible, there's little point in doing it (Similar to problems with media entities). Unlike workspaces, where we should be able to intercept the alias lookup when being in a non-default workspace.
Comment #195
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedYup, that works ;)
Comment #196
plachThanks, back to RTBC!
Leaving the final word to @catch since he's reviewed this many times and is the subsystem maintainer (among the rest :)
Ouch, I hope we can clean this up while working on the various deprecation issues.
Comment #197
BerdirRTBC +1.
On the move-to-module topic. I do think that it would be a fairly complex task that would result in having to deprecate and move ~5 services, and updating a number of services depending on them and some tricky interactions e.g. in system.module. IMHO, it would be pretty challenging to manage that all in time (not just implement, but then also review again) with the various follow-ups that we have. On the other side, if we don't do it now it will likely be even harder to move it all again later. So not really something we can push to a follow-up. My recommendation would be to not move it, but if others (specifically @catch I guess) think it is a good idea then I'm not going to argue against it.
Comment #198
plachSame here, from the opposite side. If there were any way to make the move-to-module thing happen I'd be happier, but that's definitely not going to impact on my +1 on this :)
Thanks again!
Comment #200
catchSo I really like the idea of the entity type living in a module, but it opens up other problems, not the least of which is we'd need to make either path module or the new path_alias module required and have an update to force-enable it.
Agreed that doing it later is going to be even harder but we will probably need to figure out a change of module/provider for entity types eventually, even if it's not for path aliases in the first instance.
So it's a shame none of us thought of this earlier, but feels like something we can try to tackle during the 9.x cycle and shouldn't block this patch going in. Opened an issue here #3084874: Consider moving path aliases to a new path_alias module.
As before, the patch is a massive step forward and enables us to make lots of other improvements in further issues.
So..... Committed 099ce08 and pushed to 8.8.x. Thanks!
Comment #201
jibranYay!!! Congrats everyone it was a team effort.
Comment #202
catchComment #203
larowlanExpanded the release notes to add reference to pathauto update #3012050: Prepare for the conversion of path aliases to entities which will be needed by people as part of core update. I suspect that this will trip people up. I wonder if we can add a conflict entry to our composer.json for versions of pathauto that don't have #3012050: Prepare for the conversion of path aliases to entities to help out those who use composer. For those who use drush upc, not sure we can do anything.
Comment #204
larowlanOpened #3085062: Declare a conflict on pathauto versions that aren't compatible with 8.8.0 as a critical followup
Comment #205
xjmMaking the release note a bit more emphatic.
Comment #206
penyaskitoFor those writing upgrade tests in contrib:
if you used to verify anything before running
runUpdates()
, your calls todrupalLogin
ordrupalGet
will fail withRemove your calls to
drupalLogin
ordrupalGet
to get them back to green. See #3085889: Fix \Drupal\Tests\lingotek\Functional\Update\LingotekJobIdInvalidCharsPostUpdateTest after path aliases converted to entities as an example.Comment #207
xjmThanks @penyaskito. Maybe you could add that to the CR for the issue?
Making a small wording tweak to the release note.
Comment #209
penyaskitoAdded https://www.drupal.org/node/3013865#upgrade-tests