Executive Summary as of May 24, 2013

This issue has been patched and committed in D8. There has been a patch created in D7 and the remaining task is to create a unit test in D7.

The problem

The query we added in #106559: drupal_lookup_path() optimization - skip looking up certain paths in drupal_lookup_path() is not well indexed - on a Drupal 6/Pressflow site with 4.5m aliases it takes between 5 and 19 seconds (on their staging server). We knew this was a nasty query when we put it in, part of the reason it's using the variable system instead of cache_get()/cache_set() so it doesn't get blown away on cache clears.

Here's the EXPLAIN:

mysql> EXPLAIN SELECT SUBSTRING_INDEX(src, '/', 1) AS path FROM url_alias GROUP BY path;
+----+-------------+-----------+-------+---------------+------+---------+------+---------+----------------------------------------------+
| id | select_type | table     | type  | possible_keys | key  | key_len | ref  | rows    | Extra                                        |
+----+-------------+-----------+-------+---------------+------+---------+------+---------+----------------------------------------------+
|  1 | SIMPLE      | url_alias | index | NULL          | src  | 386     | NULL | 4586690 | Using index; Using temporary; Using filesort |
+----+-------------+-----------+-------+---------------+------+---------+------+---------+----------------------------------------------+

This same client (with a lot of contrib modules installed) has 138 unique system path prefixes on their site, i.e. the same query run on menu_router:

SELECT SUBSTRING_INDEX(path, '/', 1) AS path2 FROM menu_router GROUP BY path2;

138 rows in set.

Pressflow currently stores the whitelist in the cache system, and rebuilds it every time an alias is added/update/deleted - see https://code.launchpad.net/~catch-drupal/pressflow/path_slow_query for changes that bring that back to where D7 currently is, this led to the query showing up near the top of the slow query log since it was running a lot more often than necessary. However while looking at that I think there may be away around running this query at all.

So here's a suggestion:

Instead of array($path => TRUE, $path1 => TRUE); for the whitelist, we could store array ($path => TRUE, $path1 => TRUE, $path2 => FALSE);

paths set to FALSE are blacklisted rather than whitelisted.

In drupal_lookup_path(), if strtok($path) isn't found in the whitelist at all (neither blacklisted nor whitelist), run this query:
SELECT dst FROM url_alias WHERE src LIKE 'user%' LIMIT 0, 1; - that's the same logic as we have currently, except for one path root at a time.

EXPLAIN looks like this:

mysql> EXPLAIN SELECT SQL_NO_CACHE dst FROM url_alias WHERE src LIKE 'node%' LIMIT 0, 1;
+----+-------------+-----------+-------+----------------------+------+---------+------+--------+-------------+
| id | select_type | table     | type  | possible_keys        | key  | key_len | ref  | rows   | Extra       |
+----+-------------+-----------+-------+----------------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | url_alias | range | src_language_pid,src | src  | 386     | NULL | 154112 | Using where |
+----+-------------+-----------+-------+----------------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

A lot of rows examined, but not 4m+ and various different iterations of this query all completed in milliseconds via MySQL cli.

Same query for D8:

MariaDB [d8]> SELECT alias FROM url_alias WHERE source LIKE 'admin%' LIMIT 0, 1;
Empty set (0.00 sec)
MariaDB [d8]> EXPLAIN SELECT alias FROM url_alias WHERE source LIKE 'admin%' LIMIT 0, 1;
+----+-------------+-----------+-------+---------------------+---------------------+---------+------+------+-------------+
| id | select_type | table     | type  | possible_keys       | key                 | key_len | ref  | rows | Extra       |
+----+-------------+-----------+-------+---------------------+---------------------+---------+------+------+-------------+
|  1 | SIMPLE      | url_alias | range | source_language_pid | source_language_pid | 767     | NULL |    1 | Using where |
+----+-------------+-----------+-------+---------------------+---------------------+---------+------+------+-------------+
1 row in set (0.03 sec)

A default Drupal 8 install shows the following in menu_router:

MariaDB [d8]> SELECT SUBSTRING_INDEX(path, '/', 1) AS path2 FROM menu_router GROUP BY path2;
+--------------+
| path2        |
+--------------+
| admin        |
| batch        |
| comment      |
| file         |
| filter       |
| node         |
| overlay      |
| overlay-ajax |
| rss.xml      |
| search       |
| sites        |
| system       |
| taxonomy     |
| toolbar      |
| user         |
+--------------+
15 rows in set (0.03 sec)

So in the best case, we'd be replacing the SUBSTR_INDEX query + set, for 15 well indexed queries + set. In the worst case, it would replace that same query for 130+ well indexed queries + set (although likely the more cruft there is in {menu_router} the less likely sites will be linking to every single one of those paths - I doubt we ever call url() on overlay-ajax for example.

The most unique paths with a substring of 5 characters I could find in {menu_router} for any site was 130 by the way.

As well as those additional queries, we'd be increasing the size of the whitelist array from say 5-10 items to between 20 and 130, so there'd be a bit more memory cost as well.

The advantage is that with a lot of path aliases and a cold start, instead of a 5+ second gap building this whitelist where you can't serve any pages (and currently there's no locking so it could also be a stampede), assuming the LIKE 'foo%' query always runs in around 1ms or so, you'd have roughly flat performance - a very slight hump from the extra sets building the cache, but no single massive rebuild blocking everything.

This is another case where we'd be cumulatively building a cache entry as an array, so it is likely a use case for #402896: Introduce DrupalCacheArray and use it for drupal_get_schema() again (whether we stick with variable_set()/variable_get() or switched to the cache system).

Proposed solution

The current patch completely removes the slow query from the whitelist, replacing it with an indexed LIKE 'foo%' query (although one that will run more often).

Unfortunately url() does not have an option for "this path has already been altered, don't try to look up an alias", so this means it's valid to pass arbitrary paths to url(), not only things which match paths in {menu_router}. We need to fix that for 8.x, see #1297714: url() does not allow the caller to prevent alias lookups and alters for paths. Since you could end up with an unlimited number of paths from the strtok($path, '/') check and each of those could create an array key in the new cached item, we need to work around that.

So to deal with that, the patch extend DrupalCacheArray to instead of using strtok(), use a substring of the path. Since this could still be arbitrarily high on many sites, it starts at 5, but each time the size of the array reaches over 100, reduces the substring length by one character.

The result of this is that if you only call url() with paths in {menu_router} (which lots of sites will do), you save the slow query and still have an effective alias whitelist.

If you are calling url() with arbitrary paths (either broken code or on altered paths due to the API limitation), then the whitelist will gradually get less effective, but you still avoid the slow query.

Files: 
CommentFileSizeAuthor
#145 path-alias-whitelist-1209226-145.patch8.73 KBarnested
PASSED: [[SimpleTest]]: [MySQL] 40,355 pass(es).
[ View ]
#143 path-alias-whitelist-1209226-143.patch8.64 KBarnested
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed to login to test site.
[ View ]
#140 path-alias-whitelist-1209226-140.patch19.83 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 53,054 pass(es).
[ View ]
#140 path-alias-whitelist-1209226-140-interdiff.txt1.73 KBBerdir
#136 path-alias-whitelist-1209226-136.patch19.69 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 52,453 pass(es).
[ View ]
#133 path-alias-whitelist-1209226-133.patch33.34 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 52,273 pass(es).
[ View ]
#133 path-alias-whitelist-1209226-133-interdiff.txt1.83 KBBerdir
#129 path-alias-whitelist-1209226-129.patch17.59 KBBerdir
FAILED: [[SimpleTest]]: [MySQL] 49,889 pass(es), 4 fail(s), and 0 exception(s).
[ View ]
#129 path-alias-whitelist-1209226-129-interdiff.txt956 bytesBerdir
#128 path-alias-whitelist-1209226-128.patch17.58 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 48,999 pass(es).
[ View ]
#128 path-alias-whitelist-1209226-128-interdiff.txt2.13 KBBerdir
#127 path-alias-whitelist-1209226-127.patch17 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 48,984 pass(es).
[ View ]
#122 path-alias-whitelist-1209226-122.patch16.85 KBBerdir
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch path-alias-whitelist-1209226-122.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#122 path-alias-whitelist-1209226-122-interdiff.txt8.77 KBBerdir
#120 path-alias-whitelist-1209226-120.patch12.77 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 48,781 pass(es).
[ View ]
#120 path-alias-whitelist-1209226-120-interdiff.txt712 bytesBerdir
#116 1209226_path_alias_whitelist_cache_115.patch12.82 KBBerdir
FAILED: [[SimpleTest]]: [MySQL] 48,828 pass(es), 11 fail(s), and 0 exception(s).
[ View ]
#104 1209226_path_alias_whitelist_cache_104.patch12.25 KBmiro_dietiker
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 1209226_path_alias_whitelist_cache_104.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#97 1209226_path_alias_whitelist_cache_97.patch12.27 KBmiro_dietiker
PASSED: [[SimpleTest]]: [MySQL] 47,809 pass(es).
[ View ]
#95 1209226_path_alias_whitelist_cache_95.patch12 KBmiro_dietiker
FAILED: [[SimpleTest]]: [MySQL] 47,564 pass(es), 10 fail(s), and 0 exception(s).
[ View ]
#93 1209226_path_alias_whitelist_cache_93.patch11.96 KBmiro_dietiker
FAILED: [[SimpleTest]]: [MySQL] 47,620 pass(es), 9 fail(s), and 50 exception(s).
[ View ]
#91 alias-whitelist-1209226-91.patch11.14 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 42,126 pass(es).
[ View ]
#88 alias-whitelist-1209226-88.patch11.06 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 41,900 pass(es).
[ View ]
#88 alias-whitelist-1209226-88-interdiff.txt1.79 KBBerdir
#86 1209226-86-path-alias-whitelist.patch9.27 KBdagmar
FAILED: [[SimpleTest]]: [MySQL] 41,484 pass(es), 1 fail(s), and 2 exception(s).
[ View ]
#86 interdiff.txt2.27 KBdagmar
#84 path-alias-whitelist-1209226-84.patch9.28 KBBerdir
FAILED: [[SimpleTest]]: [MySQL] 40,362 pass(es), 1 fail(s), and 2 exception(s).
[ View ]
#84 path-alias-whitelist-1209226-84-interdiff.txt660 bytesBerdir
#82 1209226-82-path-alias-whitelist.patch8.5 KBbeejeebus
FAILED: [[SimpleTest]]: [MySQL] 36,690 pass(es), 1 fail(s), and 0 exception(s).
[ View ]
#78 path_alias_whitelist_1209226.patch8.47 KBcatch
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed attempting to get list of tests from run-tests.sh.
[ View ]
#76 path_alias_whitelist_1209226.patch8.47 KBcatch
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in core/includes/path.inc.
[ View ]
#75 path_alias_whitelist_1209226.patch8.48 KBcatch
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in core/includes/path.inc.
[ View ]
#69 path_cache.patch7.73 KBcatch
PASSED: [[SimpleTest]]: [MySQL] 34,028 pass(es).
[ View ]
#65 path_cache.patch7.91 KBcatch
PASSED: [[SimpleTest]]: [MySQL] 33,994 pass(es).
[ View ]
#64 path.patch7.82 KBcatch
PASSED: [[SimpleTest]]: [MySQL] 33,759 pass(es).
[ View ]
#61 path.patch7.78 KBcatch
PASSED: [[SimpleTest]]: [MySQL] 33,636 pass(es).
[ View ]
#58 path.patch7.76 KBcatch
PASSED: [[SimpleTest]]: [MySQL] 33,617 pass(es).
[ View ]
#55 path.patch7.63 KBcatch
FAILED: [[SimpleTest]]: [MySQL] 33,387 pass(es), 18 fail(s), and 3,236 exception(es).
[ View ]
#51 path.patch5.65 KBcatch
PASSED: [[SimpleTest]]: [MySQL] 33,453 pass(es).
[ View ]
#49 path.patch4.92 KBcatch
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch path_2.patch. This may be a -p0 (old style) patch, which is no longer supported by the testbots.
[ View ]
#45 path.patch4.51 KBbeejeebus
FAILED: [[SimpleTest]]: [MySQL] 33,407 pass(es), 4 fail(s), and 0 exception(es).
[ View ]
#43 path.patch4.51 KBcatch
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in includes/path.inc.
[ View ]
#41 path.patch4.49 KBcatch
FAILED: [[SimpleTest]]: [MySQL] 33,421 pass(es), 4 fail(s), and 0 exception(es).
[ View ]
#40 path.diff4.49 KBcatch
FAILED: [[SimpleTest]]: [MySQL] 33,391 pass(es), 19 fail(s), and 0 exception(es).
[ View ]
#35 path.whitelist.patch7.96 KBbeejeebus
PASSED: [[SimpleTest]]: [MySQL] 33,290 pass(es).
[ View ]
#32 whitelist.patch8.41 KBcatch
PASSED: [[SimpleTest]]: [MySQL] 33,269 pass(es).
[ View ]
#30 whitelist.patch8.41 KBcatch
FAILED: [[SimpleTest]]: [MySQL] 33,155 pass(es), 1 fail(s), and 0 exception(es).
[ View ]
#28 whitelist.patch7.9 KBcatch
FAILED: [[SimpleTest]]: [MySQL] 33,146 pass(es), 7 fail(s), and 30 exception(es).
[ View ]
#26 whitelist.patch8.36 KBcatch
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed to enable simpletest module.
[ View ]
#25 path_alias_whitelist.patch7.94 KBcatch
PASSED: [[SimpleTest]]: [MySQL] 33,151 pass(es).
[ View ]
#23 path_alias_whitelist.patch8.4 KBcatch
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed to enable simpletest module.
[ View ]
#22 whitelist.patch5.57 KBcatch
PASSED: [[SimpleTest]]: [MySQL] 33,046 pass(es).
[ View ]
#20 whitelist.patch5.56 KBcatch
FAILED: [[SimpleTest]]: [MySQL] 33,037 pass(es), 15 fail(s), and 0 exception(es).
[ View ]
#18 whitelist.patch5.54 KBcatch
FAILED: [[SimpleTest]]: [MySQL] Drupal installation failed.
[ View ]
#16 whitelist.patch5.55 KBcatch
FAILED: [[SimpleTest]]: [MySQL] Drupal installation failed.
[ View ]
#15 whitelist.patch4.67 KBcatch
PASSED: [[SimpleTest]]: [MySQL] 32,909 pass(es).
[ View ]
#13 url_whitelist.patch4.61 KBcatch
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed to enable simpletest module.
[ View ]
#13 Desk 1_008.png71.17 KBcatch
#11 url_whitelist.patch4.61 KBcatch
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed to enable simpletest module.
[ View ]
#8 url_whitelist.patch4.6 KBcatch
PASSED: [[SimpleTest]]: [MySQL] 32,913 pass(es).
[ View ]
#6 url_whitelist.patch4.43 KBcatch
PASSED: [[SimpleTest]]: [MySQL] 32,877 pass(es).
[ View ]
#4 url_whitelist.patch10.79 KBcatch
PASSED: [[SimpleTest]]: [MySQL] 33,636 pass(es).
[ View ]
#2 path_slow_query_D8.patch3.93 KBcatch
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch path_slow_query_D8.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#2 path_slow_query_combined.patch10.29 KBcatch
FAILED: [[SimpleTest]]: [MySQL] 33,624 pass(es), 0 fail(s), and 1 exception(es).
[ View ]
#2 Desk 1_075.png63.49 KBcatch

Comments

Component:path.module» base system
Status:Active» Needs review
StatusFileSize
new63.49 KB
new10.29 KB
FAILED: [[SimpleTest]]: [MySQL] 33,624 pass(es), 0 fail(s), and 1 exception(es).
[ View ]
new3.93 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch path_slow_query_D8.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Here's an excerpt from the slow query log of a D6 site, then EXPLAIN and query times before and after:

Count         : 467  (18.78%)
66 Time          : 2715.258467 s total, 5.814258 s avg, 5.221062 s to 32.682901 s max  (10.28%)
67   95% of Time : 2505.725291 s total, 5.656265 s avg, 5.221062 s to 6.128902 s max
68 Lock Time (s) : 544.443 ms total, 1.166 ms avg, 29 µs to 523.531 ms max  (9.94%)
69   95% of Lock : 18.712 ms total, 42 µs avg, 29 µs to 57 µs max
70 Rows sent     : 31 avg, 31 to 31 max  (0.12%)
71 Rows examined : 4.96M avg, 4.95M to 4.97M max  (15.11%)

mysql> EXPLAIN SELECT DISTINCT SUBSTRING_INDEX(src, '/', 1) AS path FROM url_alias;
+----+-------------+-----------+-------+---------------+------+---------+------+---------+------------------------------+
| id | select_type | table     | type  | possible_keys | key  | key_len | ref  | rows    | Extra                        |
+----+-------------+-----------+-------+---------------+------+---------+------+---------+------------------------------+
|  1 | SIMPLE      | url_alias | index | NULL          | src  | 386     | NULL | 3884667 | Using index; Using temporary |
+----+-------------+-----------+-------+---------------+------+---------+------+---------+------------------------------+
1 row in set (0.00 sec)

(query takes 15 seconds on the dev hardware, averages at 5 seconds on production).

After:

mysql> EXPLAIN SELECT 1 FROM url_alias WHERE src LIKE 'node%' LIMIT 0, 1;
+----+-------------+-----------+-------+----------------------+------------------+---------+------+--------+--------------------------+
| id | select_type | table     | type  | possible_keys        | key              | key_len | ref  | rows   | Extra                    |
+----+-------------+-----------+-------+----------------------+------------------+---------+------+--------+--------------------------+
|  1 | SIMPLE      | url_alias | range | src_language_pid,src | src_language_pid | 386     | NULL | 189036 | Using where; Using index |
+----+-------------+-----------+-------+----------------------+------------------+---------+------+--------+--------------------------+
1 row in set (0.00 sec)

Query completes in < 5ms on the dev hardware. Still has to scan quite a lot of rows but it's significantly better.

Also a diffstat of the patch without the DrupalArrayCache hunk which should be added from #402896: Introduce DrupalCacheArray and use it for drupal_get_schema(). Ends up a net reduction of 6 lines of code.

includes/path.inc                           |   36 +++++++++++-----------------
modules/simpletest/drupal_web_test_case.php |    8 +++---
2 files changed, 19 insertions(+), 25 deletions(-)

The simpletest changes are because we are currently calling drupal_static_reset() after the host database has been moved out of the way, but before the child database has been installed. This means if the destructor of an object gets called (which will happen if it's held in drupal_static() storage) and the destructor queries the db, you get a lovely PDO exception telling you the table doesn't exist. Just changing the order of operations fixes this. This is related to #1188316: DrupalUnitTestCase incompatible with any classes with destructors that access the database but tests won't pass without it.

Also attaching a screenshot from xhprof of PathAliasWhitelist::resolveCacheMiss() on a default D8 install - this shows four queries for path aliases, 'node', 'user', 'admin' and 'toolbar'. Two of those are only exposed to admins, so an anonymous user on a cache miss would only end up with one extra query compared to now.

Note this also takes the whitelist out of the variable system and puts it back into cache - it's been in variables previously to avoid this query running just because there was a cache clear.

Status:Needs review» Needs work

The last submitted patch, path_slow_query_combined.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new10.79 KB
PASSED: [[SimpleTest]]: [MySQL] 33,636 pass(es).
[ View ]

ThemeLinksUnitTest is calling all kinds of functions that rely on global scope etc. and only narrowly avoids querying the db, so this patch just makes it use DrupalWebTestCase.

Category:task» bug
StatusFileSize
new4.43 KB
PASSED: [[SimpleTest]]: [MySQL] 32,877 pass(es).
[ View ]

Exactly the same patch minus DrupalCacheArray now that's been committed. Also 5-20 second queries in core are a bug report.

Issue tags:+needs backport to D7

Fixing tag.

StatusFileSize
new4.6 KB
PASSED: [[SimpleTest]]: [MySQL] 32,913 pass(es).
[ View ]

Added some PHPdoc for the new class.

Status:Needs review» Needs work

+    $exists = (bool) db_query_range('SELECT 1 FROM {url_alias} WHERE source LIKE :path', 0, 1, array(':path' => $offset . '%'))->fetchField(); misses db_like

Was this benchmarked? That's a lot of getOffset magic calls.

Finally, while I know $offset is proper on an interface level, for a PathAliasWhitelist I would've expected something like path_root, first_path_part, path_arg_zero etc

discussed making the path lookup code into a pluggable class with catch in IRC, very rough interface idea being:

<?php
interface DrupalPathRegistryInterface {
  function
save(array $path);
  function
load($pid);
  function
delete($pid);
  function
getNormalPath($path, $path_language = NULL);
  function
getPathAlias($path, $path_language = NULL);
}
?>

EDIT: created a new issue here: #1269742: Make path lookup code into a pluggable class.

Status:Needs work» Needs review
StatusFileSize
new4.61 KB
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed to enable simpletest module.
[ View ]

I added db_like() and changed the $offset argument to $path_root.

Have not benchmarked this. I doubt benchmarks will be meaningful however I'll post profiling data from a page with a lot of links assuming this patch passes tests.

Status:Needs review» Needs work

The last submitted patch, url_whitelist.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new71.17 KB
new4.61 KB
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed to enable simpletest module.
[ View ]

Fixed the patch.

Also profiling for chx. 69 calls to DrupalCacheArray::offsetExists() is 369 microseconds (wall time in xhprof) - equates to 5.3 microseconds per call. 200ms out of the 369 is spent in array_key_exists() in offsetGet() - we might want to make that an isset + array_key_exists() instead of straight array_key_exists() - but I'd want some more subsystems converted to this to see the actual distribution (that pattern is slower if isset() returns false a lot).

Status:Needs review» Needs work

The last submitted patch, url_whitelist.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new4.67 KB
PASSED: [[SimpleTest]]: [MySQL] 32,909 pass(es).
[ View ]

Grr. Fixed the patch but uploaded the wrong one :(

StatusFileSize
new5.55 KB
FAILED: [[SimpleTest]]: [MySQL] Drupal installation failed.
[ View ]

Alright so the current approach here can lead to cache pollution if you have url() called with lots and lots of URLs that have different content before the first /. This wasn't a problem with the original single query approach since it'd only ever store what it found, not what it didn't find as we need to do building this over time.

So... here's a new version. Instead of strtok($path, '/') it uses an arbitrary substring of the path. This starts at 5 characters, but if $this->storage gets too big, it cuts it down by one and starts again, until it gets down to a single character. Once it's at a single character we can't cut it down any longer, but in any remotely normal usage there should not be too many unique single characters.

This same logic is in the latest releases of memcache (for dealing with wildcard logic) and has been working well there. I'd considered trying to put it into a dedicated class by itself but that's for another patch if at all.

Status:Needs review» Needs work

The last submitted patch, whitelist.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new5.54 KB
FAILED: [[SimpleTest]]: [MySQL] Drupal installation failed.
[ View ]

Status:Needs review» Needs work

The last submitted patch, whitelist.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new5.56 KB
FAILED: [[SimpleTest]]: [MySQL] 33,037 pass(es), 15 fail(s), and 0 exception(es).
[ View ]

Status:Needs review» Needs work

The last submitted patch, whitelist.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new5.57 KB
PASSED: [[SimpleTest]]: [MySQL] 33,046 pass(es).
[ View ]

This one should pass tests, or at least more tests.

StatusFileSize
new8.4 KB
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed to enable simpletest module.
[ View ]

Added explicit tests for the whitelist class to check the key shrinking behaviour and that the cache item gets written etc.

Status:Needs review» Needs work

The last submitted patch, path_alias_whitelist.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new7.94 KB
PASSED: [[SimpleTest]]: [MySQL] 33,151 pass(es).
[ View ]

That's going to fatal due to .htaccess cruft, sorry.

StatusFileSize
new8.36 KB
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed to enable simpletest module.
[ View ]

Removing debug.

Status:Needs review» Needs work

The last submitted patch, whitelist.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new7.9 KB
FAILED: [[SimpleTest]]: [MySQL] 33,146 pass(es), 7 fail(s), and 30 exception(es).
[ View ]

:(

Status:Needs review» Needs work

The last submitted patch, whitelist.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new8.41 KB
FAILED: [[SimpleTest]]: [MySQL] 33,155 pass(es), 1 fail(s), and 0 exception(es).
[ View ]

This should actually pass, missing one line.

Status:Needs review» Needs work

The last submitted patch, whitelist.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new8.41 KB
PASSED: [[SimpleTest]]: [MySQL] 33,269 pass(es).
[ View ]

Hmm, this passes locally. Trying one more time.

Discussed with irc in chx and he reckons supporting non-system paths being passed to url() is 'babysitting broken code'. I'm ambivalent on this - since while url() has an 'alias' option it does not have an 'altered' option so if you have a site that alters something-foo-bar to something/foo/bar that feels valid to pass to url() without the 'alias' option - in fact the alias option would be wrong since it's not one. We could look at changing the docs for this for D8 though. #15 is still valid for that original approach though.

Status:Needs review» Reviewed & tested by the community

We discussed this a lot and for now we do not have a better choice because anything better would require a change to url() signature -- namely changing alias to something like 'final' to stop not just aliasing but altering as well.

Issue summary:View changes

Issue summary:View changes

Updated issue summary.

Status:Reviewed & tested by the community» Needs review
StatusFileSize
new7.96 KB
PASSED: [[SimpleTest]]: [MySQL] 33,290 pass(es).
[ View ]

the patch at #32 didn't apply for me, here's a re-rolled one that does. please RTBC if i got the re-roll right.

Another possible way to do this, no time to write a patch though and may not get to it this week.

db_query("SELECT DISTINCT SUBSTRING_INDEX(path, '/', 1) AS path FROM {menu_router}");

That gets all the possible 'directories' for router items in the system, but on a much smaller table so it doesn't matter that the query isn't indexed. Or could even do it in PHP.

Use this as as the basis for the whitelist, then record whether a path has an alias or not using the same logic as now.

Since the list would be finite, if something's not in the list at all, we just don't check for it. Would need to clear that cache on menu_rebuild().

Subscribing. My new policy is to provide value on 'subscribe' posts, so here goes.

A SQL Query walks into a bar. In one corner of the bar are two tables. The Query walks up to the tables and asks: Mind if I join you?

I have nothing relevant to offer here, just quite interested in following the discussion.

I love #36. Given that we have this information during menu building we could directly store it and never run that query (aside from an hook)update_N function). You want $item['_parts']); inside _menu_router_build once it's completed (just before   // Apply inheritance rules.)

Status:Needs review» Needs work

Yeah let's do #36. Figuring it out from the router sounds good although we need to be able to rebuild it on a cache miss as well - but could possibly just invoke hook_menu() and hook_menu_alter() ourselves even?

This same problem of needing to distinguish between nothing and nothing came up with #1011614: Theme registry can grow too large for MySQL max_allowed_packet and memcache default slab size as well, and a similar approach worked there (store the keys of the theme registry only then build out from there).

StatusFileSize
new4.49 KB
FAILED: [[SimpleTest]]: [MySQL] 33,391 pass(es), 19 fail(s), and 0 exception(es).
[ View ]

Here's a start on #36, not passing tests yet though.

Status:Needs work» Needs review
StatusFileSize
new4.49 KB
FAILED: [[SimpleTest]]: [MySQL] 33,421 pass(es), 4 fail(s), and 0 exception(es).
[ View ]

And this should pass.

Status:Needs review» Needs work

The last submitted patch, path.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new4.51 KB
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in includes/path.inc.
[ View ]

Well maybe not. Thought of one other logic issue, will look at the rest later.

Status:Needs review» Needs work

The last submitted patch, path.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new4.51 KB
FAILED: [[SimpleTest]]: [MySQL] 33,407 pass(es), 4 fail(s), and 0 exception(es).
[ View ]

one character fix to make path.inc compile.

Status:Needs review» Needs work

The last submitted patch, path.patch, failed testing.

I think your patch will rebuild the menu router on every page load.

It calls parent::__construct() before checking if (empty($this->storage)) so that should only ever happen on cache misses.

Status:Needs work» Needs review
StatusFileSize
new4.92 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch path_2.patch. This may be a -p0 (old style) patch, which is no longer supported by the testbots.
[ View ]

Looked at the test failures.

update.php calls drupal_bootstrap_full(), this calls path initialize, which is hitting field_ui_menu(), which is calling entity_get_info(), which is undefined because the entity system moved to a module and isn't enabled yet.

Status:Needs review» Needs work

The last submitted patch, path.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new5.65 KB
PASSED: [[SimpleTest]]: [MySQL] 33,453 pass(es).
[ View ]

Added a comment to clarify what's happening in __construct()

Status:Needs review» Needs work

What's missing from the patch is that the parent constructor will fill in the storage if there is something to fill. But you are likely to rebuild the menu unnecessarily. I suggested pushing the parts right where they are available readily parsed into parts during router rebuild ... somewhere :)

Do you mean in addition to the logic here or instead? If we did it instead it won't catch cache clears. I agree with trying to do this from a performance angle but it could get a bit messy.

we store menu masks in a variable. cant we store this in a variable too?

Status:Needs work» Needs review
StatusFileSize
new7.63 KB
FAILED: [[SimpleTest]]: [MySQL] 33,387 pass(es), 18 fail(s), and 3,236 exception(es).
[ View ]

Here's a version with the variable. I couldn't think of a better name than 'menu_path_roots' but there ought to be a better one.

Also did some manual testing and profiling with this, all looks to be working fine and tests passed locally for me.

Forgot to mention, this means we'll need an update when this is backported to D7 - the update will need to run the original query except against the {menu_router} table and store the variable - that way we're not invoking any hooks in the update and we don't get left with an empty variable.

Status:Needs review» Needs work

The last submitted patch, path.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new7.76 KB
PASSED: [[SimpleTest]]: [MySQL] 33,617 pass(es).
[ View ]

Right patch would help.

Question. $query->condition('u.source', db_like($root) . '%', 'LIKE') ->range(0, 1); does this need /% ?

I think so yeah, if we don't have a path whitelisted, someone calls url('node/1');

That gives us a path root of 'node'.

We then want to know if there's any alias in the system at all for any path starting with 'node', so LIKE 'node%' will find one of those.

StatusFileSize
new7.78 KB
PASSED: [[SimpleTest]]: [MySQL] 33,636 pass(es).
[ View ]

Spoke to chx about the new variable, we can save a bit of memory by using array_values() when we set the variable.

Implementation looks good to me, just one question...

+++ b/includes/path.inc
@@ -355,6 +347,76 @@ function current_path() {
+    // FALSE. This ensures that paths which do not exist in the router are not
+    // looked up, and that paths that do exist in the router are only looked
+    // up once. Since url() can be called during maintenance mode, avoid

"that paths that do exist in the router". Not sure I understand this. Everything in $this->storage *does* exist in the router, no? Looking at resolveCacheMiss(), what it actually checks is if there is a path *with an alias* with that prefix.

Maybe that can be mentioned more explicitly here?

-24 days to next Drupal core point release.

Hmm yeah the comment isn't great, basically we store three states:

NULL - exists in the router but not whitelisted or blacklisted yet.

FALSE - blacklisted.

TRUE - whitelisted.

By having the array with those three possible values, it means that if you do url('12345'); and that's not in the router, it's not going to run the query at all - this was mainly just about explaining why we initialize the entire array to NULL rather than just querying and storing TRUE/FALSE (which is what earlier patches did but is dangerous given what url() allows for).

Priority:Normal» Major
StatusFileSize
new7.82 KB
PASSED: [[SimpleTest]]: [MySQL] 33,759 pass(es).
[ View ]

Re-rolled for /core. Also bumping priority to major - the query can take multiple seconds to complete, and until it does so, blocks any request to the site that calls url() which could be all of them.

StatusFileSize
new7.91 KB
PASSED: [[SimpleTest]]: [MySQL] 33,994 pass(es).
[ View ]

Discussed this with chx in irc and he pointed out adding an extra array key to be parsed with list() is not safe for backwards compatibility. So instead menu_router_build() and _menu_router_build() get an optional $save parameter.

Status:Needs review» Reviewed & tested by the community

I like it. Very, very much like it finally.

Assigned:Unassigned» webchick

Since this is for backport to D7 and my patch, assigning to Angie.

Status:Reviewed & tested by the community» Needs work

I had started reviewing this the other night with catch on IRC and I pointed out:

+++ b/core/includes/path.inc
@@ -355,6 +347,76 @@ function current_path() {
+    if (empty($this->storage) && !defined('MAINTENANCE_MODE')) {

What happens when we're in maintenance mode?

And he said "Oh, you found a bug." :) So marking needs work for now.

6 days to next Drupal core point release.

Status:Needs work» Needs review
StatusFileSize
new7.73 KB
PASSED: [[SimpleTest]]: [MySQL] 34,028 pass(es).
[ View ]

Alright, fix the no longer needed MAINTENANCE_MODE check and updated the comment. This was previously in there because a cache miss could trigger a menu_rebuild(), but that is no longer the case.

Status:Needs review» Needs work

this is ugly. i don't mean catch's work, i just mean the whole path lookup system :-(

looks like we're missing some of the implications of coupling with menu rebuilds here. AFAICS unless new router item additions clear the path whitelist cache, we can refuse to look for aliases for valid system paths, so i think we need something in _menu_router_build() to wipe or check-and-wipe-if-new-roots or something.

also, drupal_path_alias_whitelist_rebuild() doesn't look right when called from path_* CRUD functions.

1. add a path alias for 'admin/people' --> 'mypeeps'. we write out 'admin' => TRUE to the whitelist cache, all good, number of rows in url_alias == 1.

2. delete the 'mypeeps' alias, number of rows == 0. in drupal_path_alias_whitelist_rebuild(), we don't clear path_alias_whitelist cache, because we get a match for 'admin' in the whitelist.

also, i think we need to wrap a single instance of the PathAliasWhitelist class, or use drupal_static to grab the instance we build in drupal_lookup_path(). otherwise we load another one in drupal_path_alias_whitelist_rebuild() (another hit on persistent cache, *without* anything we've learned from this request :-(), then depending on the flow, we might just go ahead and fetch another one. if we just add a PathAliasWhitelist::reset() or similar we can fix that.

i'll try to catch catch and chx in IRC to hash this out.

Assigned:webchick» Unassigned

Unassigning webchick.

catch this is still problematic for me on a D6+Pressflow site, with a similar situation to your original issue description and comment #2. Might you have a workable start to a D6 fix for this? I checked your github tree and it doesn't look like you committed anything there. The original launchpad code is here but that just backs it out of cache and into a variable. Just trying to ease a bit of work in hacking at it myself. Thanks!

The master branch of that tree is just the same as Pressflow, https://github.com/tag1consulting/pressflow6/tree/path_alias_slow_query should be the right branch for D6 work.

Excellent, thank you for the pointer and for contributing the fix to the community! This will definitely be helpful.

Status:Needs work» Needs review
StatusFileSize
new8.48 KB
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in core/includes/path.inc.
[ View ]

Coming back to this.

looks like we're missing some of the implications of coupling with menu rebuilds here. AFAICS unless new router item additions clear the path whitelist cache, we can refuse to look for aliases for valid system paths, so i think we need something in _menu_router_build() to wipe or check-and-wipe-if-new-roots or something.

Added this check (untested).

also, drupal_path_alias_whitelist_rebuild() doesn't look right when called from path_* CRUD functions.

1. add a path alias for 'admin/people' --> 'mypeeps'. we write out 'admin' => TRUE to the whitelist cache, all good, number of rows in url_alias == 1.

2. delete the 'mypeeps' alias, number of rows == 0. in drupal_path_alias_whitelist_rebuild(), we don't clear path_alias_whitelist cache, because we get a match for 'admin' in the whitelist.

I think this is fine - it just means we'll get some pointless path alias requests until the cache is built from scratch again, for one very specific case.

also, i think we need to wrap a single instance of the PathAliasWhitelist class, or use drupal_static to grab the instance we build in drupal_lookup_path(). otherwise we load another one in drupal_path_alias_whitelist_rebuild() (another hit on persistent cache, *without* anything we've learned from this request :-(), then depending on the flow, we might just go ahead and fetch another one. if we just add a PathAliasWhitelist::reset() or similar we can fix that.

Went for drupal_static().

Also did PSR-0-ification after the DrupalCacheArray -> CacheArray conversion.

StatusFileSize
new8.47 KB
FAILED: [[SimpleTest]]: [MySQL] Invalid PHP syntax in core/includes/path.inc.
[ View ]

Argh forgot to add some files before rolling the final patch, proper roll this time.

Status:Needs review» Needs work

The last submitted patch, path_alias_whitelist_1209226.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new8.47 KB
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed attempting to get list of tests from run-tests.sh.
[ View ]

Fixing trailing whitespace and the syntax error.

Status:Needs review» Needs work
Issue tags:-Performance, -scalability, -needs backport to D7

The last submitted patch, path_alias_whitelist_1209226.patch, failed testing.

Status:Needs work» Needs review

#78: path_alias_whitelist_1209226.patch queued for re-testing.

Status:Needs review» Needs work
Issue tags:+Performance, +scalability, +needs backport to D7

The last submitted patch, path_alias_whitelist_1209226.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new8.5 KB
FAILED: [[SimpleTest]]: [MySQL] 36,690 pass(es), 1 fail(s), and 0 exception(s).
[ View ]

fixed old function signature of PathAliasWhitelist::set(), lets see if this will get somewhere with the bot.

Status:Needs review» Needs work

The last submitted patch, 1209226-82-path-alias-whitelist.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new660 bytes
new9.28 KB
FAILED: [[SimpleTest]]: [MySQL] 40,362 pass(es), 1 fail(s), and 2 exception(s).
[ View ]

Not 100% sure this is correct but it does fix the test.

Status:Needs review» Needs work

The last submitted patch, path-alias-whitelist-1209226-84.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new2.27 KB
new9.27 KB
FAILED: [[SimpleTest]]: [MySQL] 41,484 pass(es), 1 fail(s), and 2 exception(s).
[ View ]

Re-rolled #82 including change by @Berdir.

I tried to install Drupal with this patch applied, and seems this change #1477446: Move lock backend to PSR-0 code was not included in this patch. Change lock_adquire by lock()->adquire() fix the issue.

Status:Needs review» Needs work

The last submitted patch, 1209226-86-path-alias-whitelist.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new1.79 KB
new11.06 KB
PASSED: [[SimpleTest]]: [MySQL] 41,900 pass(es).
[ View ]

Ok, this should pass the tests.

- Had to prevent drupalGet() from converting the source path to an alias, which makes sense.
- The forum tests had to wrong calls (drupalGet('/')) so offsetGet($offset) was called with FALSE. Fixed the calls, we might also want to add a check for this, not sure...

+++ b/core/includes/path.incundefined
@@ -385,27 +379,20 @@ function current_path() {
  * @return
- *   An array containing a white list of path aliases.
+ *   An instance of the PathAliasWhitelist class.
...
-  variable_set('path_alias_whitelist', $whitelist);
-  return $whitelist;
+  cache()->delete('path_alias_whitelist');
+  $whitelist = new PathAliasWhitelist('path_alias_whitelist', 'cache');

Not true, nothing is returned?

Not sure if the documentation is wrong or the code.

hmm it doesn't return it, it adds a new whitelist instance to the drupal_static() cache. It probably makes sense to return in the 7.x backport for bc, but I don't see a reason to in 8.x.

StatusFileSize
new11.14 KB
PASSED: [[SimpleTest]]: [MySQL] 42,126 pass(es).
[ View ]

Ok, here is a re-roll that removes the inconsistent return, uses state() for the root parts and also fixes the rebuild function, which was IMHO completely broken. It created a new instance and assigned it to $whitelist, which was only defined if a $source was passed in and it was *not* by reference.

I think now it should actually work. I believe it only "worked" previously because it assigned the empty return value to $cache['whitelist'] in case of a wipe call which completely disabled the whitelist handling. At least that's how I understood it.

Backreference to issue porting variables => state, postponed:
#1798884: Convert path_alias_whitelist to state system

This is still current:
#1269742: Make path lookup code into a pluggable class

StatusFileSize
new11.96 KB
FAILED: [[SimpleTest]]: [MySQL] 47,620 pass(es), 9 fail(s), and 50 exception(s).
[ View ]

Rerolled, clashed with #1798884: Convert path_alias_whitelist to state system

Some things described from Berdir seems not to be fixed in his latest patch. Discussed, fixed.

Added and fixed some doc, added missing public decorator.

Reviewed in depth, looks good. Found many inconsistencies with DrupalCacheArray that was renamed to CacheArray, followup:
#1831646: Document rename DrupalCacheArray to CacheArray

Added clear() function to CacheArray to cleanly abstract the clearing process. This was ugly.

Providing new version.

Status:Needs review» Needs work

The last submitted patch, 1209226_path_alias_whitelist_cache_93.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new12 KB
FAILED: [[SimpleTest]]: [MySQL] 47,564 pass(es), 10 fail(s), and 0 exception(s).
[ View ]

Next try. Fixed clear() that missed some reset parts.

Status:Needs review» Needs work

The last submitted patch, 1209226_path_alias_whitelist_cache_95.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new12.27 KB
PASSED: [[SimpleTest]]: [MySQL] 47,809 pass(es).
[ View ]

Debugged, fixed. :-)
Prepopulation of the cache was missing!

Status:Needs review» Needs work
Issue tags:-Performance, -scalability, -needs backport to D7

The last submitted patch, 1209226_path_alias_whitelist_cache_97.patch, failed testing.

Status:Needs work» Needs review

Status:Needs review» Needs work

The last submitted patch, 1209226_path_alias_whitelist_cache_97.patch, failed testing.

Status:Needs work» Needs review
Issue tags:+Performance, +scalability, +needs backport to D7

Yessssss! Passed!

+++ b/core/includes/menu.incundefined
@@ -3706,6 +3710,18 @@ function _menu_router_build($callbacks) {
+      state()->set('menu_path_roots', array_values($path_roots));

Shouldn't this use the new clear method?

Otherwise the updates to this look great.

StatusFileSize
new12.25 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch 1209226_path_alias_whitelist_cache_104.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Hmm you catched the point.. :-)

The separation was still not perfect.
I dropped the cache() call as it should be completely wrapped by the class .. and path.inc provides functional wrappers.
Switched to drupal_clear_path_cache() that doesn't do too much more.

Additionally found a typo LoadMenuPathRoots -> loadMenuPathRoots().

Let's see what testbot thinks about it.

+++ b/core/includes/menu.incundefined
@@ -3706,6 +3710,18 @@ function _menu_router_build($callbacks) {
+    if ($path_roots != state()->get('menu_path_roots')) {
+      state()->set('menu_path_roots', array_values($path_roots));
+      drupal_clear_path_cache();
+    }
+    _menu_router_save($menu, $masks);

What I'm worried about is that this introduces a call from menu.inc to path.inc, and I'm not sure that it's good to have this dependency.

Also, do we need this for the new router system as well?

We can only solve this by introducing a hook in the menu system. (e.g. hook_menu_router_save)
The path system is dependent on the menu building.

Generally, extracting the roots is here done in menu.inc. But the only reason why we're doing this is the whitelisting of the path.inc.
I consider this a followup issue as this dependency was here already before. It currently just became visible.

EDIT: So if we want to reduce the dependency, we also need to pass extraction of the menu_roots to path.inc and even add a second hook. Possibly we can cover it with one, if we pass $menu...

Posting from phone but the generator patch depends on the pluggable path inc patch which should be easier with this patch in. It'd be better if menu.inc provided a hook to react to so the cache could clear there.

:-) crosspost!

Followup or in this issue?

I don't quite see how a hook would be helpful here, since path.inc is not a module, and the hook implementation would have to be invoked regardless of whether path.module is enabled or not.

For now, I think it would be better to move along with this hard-coded call from menu.inc to path.inc. Both operate at full bootstrap. Both will be converted into services either way, so the only real requirement we have are good tests to prevent this code from being mistakenly deleted without modern replacement. ;)

Path aliasing is likely to move to a listener as part of the new router changes - in effect it'll be implementing the replacement of hook_url_outbound_alter(), if that happens it ought to be possible to provide it from a module. Since there's multiple patches being worked on to refactor both these systems I think the hard-coded call is probably OK too - it's that or taking another performance hit from building and throwing away menu routers in path.inc.

Issue tags:+Needs tests

Looks like #973436: variable_set() should rebuild the variable cache, not just clear it also introduced a clear method but called it remoteStorage() and only added it to the specific cache array implementation.

@sun: There are no tests for this functionality yet, I think. Sounds like we need some specific tests that create/remove aliases for something like system/... or user/.. and then check if this was correctly updated. Adding tag.

Status:Needs review» Needs work
Issue tags:+Performance, +Needs tests, +scalability, +needs backport to D7

The last submitted patch, 1209226_path_alias_whitelist_cache_104.patch, failed testing.

Doesn't apply locally either:

$ git apply 1209226_path_alias_whitelist_cache_104.patch
error: patch failed: core/includes/path.inc:9
error: core/includes/path.inc: patch does not apply

Yes the path system is now completely rewritten so this needs complete overhaul.. (but is still relevant!)
(The previous patch still is the best start for a later D7 backport.)

Status:Needs work» Needs review
StatusFileSize
new12.82 KB
FAILED: [[SimpleTest]]: [MySQL] 48,828 pass(es), 11 fail(s), and 0 exception(s).
[ View ]

Here is a re-roll. @katbailey probably isn't going to like this much yet as it doesn't inject stuff yet, but I *really* like this, the code is way easier now that it's all encapsulated in a single class. Also moved the whitelist class into the new Drupal\Core\Path namespace.

Had to trick a bit in the Path unit tests by manually setting the menu roots variable.

Still would like to add better tests for the actual whitelist handling, so that we have an explicit test that makes sure the whitelist is updated correctly as aliases are added and removed. Also not yet sure how to integrate it with the new router system (is that necessary to get it in in atm?)

Status:Needs review» Needs work

The last submitted patch, 1209226_path_alias_whitelist_cache_115.patch, failed testing.

A lot of things don't work with the new router system yet, so that's not a requirement no.

+++ b/core/lib/Drupal/Core/Path/AliasWhitelist.phpundefined
@@ -0,0 +1,109 @@
+      debug($roots);

A debug statement :-)

+++ b/core/lib/Drupal/Core/Path/AliasWhitelist.phpundefined
@@ -0,0 +1,109 @@
+    debug($this->storage);

Another debug :-)

Rest looks really awesome! I'm also dreaming of tests for these cases. I tried for a short time without too much success of a simple / clean approach.

Status:Needs work» Needs review
StatusFileSize
new712 bytes
new12.77 KB
PASSED: [[SimpleTest]]: [MySQL] 48,781 pass(es).
[ View ]

Yes, and removing those also seems to fix the tests.

This looks like a great improvement - but yes, you guessed it, I am curious as to why the AliasWhitelist isn't being injected into the AliasManager :-)

StatusFileSize
new8.77 KB
new16.85 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch path-alias-whitelist-1209226-122.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Inject all the things :)

- Defining AliasWhitelist as a service now, and inject cid, bin, state and database (not sure about having to inject cid/bin)
- Injecting that to AliasManager
- Fixing some comments
- Added tests for the whitelist

Let's see if this works. A bit worried about the __destruct() in a service as that's breaking my lock/form_state patches. But if that would be a problem here, the above patch should already fail as well I think.

Attached are some coding/documentation standard thoughts from reading through the patch in #122:

+++ b/core/includes/menu.incundefined
@@ -2688,8 +2687,11 @@ function menu_router_rebuild() {
+ *   (Optional) Save the new router to the database. Defaults to FALSE.

My understanding is that '(Optional)' should be '(optional)' [lower case].

+++ b/core/lib/Drupal/Core/Path/AliasManager.phpundefined
@@ -78,14 +70,10 @@ class AliasManager implements AliasManagerInterface {
-  public function __construct(Connection $connection, KeyValueFactory $keyvalue) {
+  public function __construct(Connection $connection, AliasWhitelist $whitelist) {
     $this->connection = $connection;
-    $this->state = $keyvalue->get('state');

I believe this needs a doclock with @param directive for each parameter. Same with other constructor in this patch.

+++ b/core/lib/Drupal/Core/Path/AliasWhitelist.phpundefined
@@ -0,0 +1,128 @@
+  public function clear() {
+    parent::clear();
+    $this->loadMenuPathRoots();
+  }
+}
diff --git a/core/lib/Drupal/Core/Utility/CacheArray.php b/core/lib/Drupal/Core/Utility/CacheArray.php

I am not sure, but @sun, @xjm and @tim.plunkett have all said that there should be a blank line before the of a class declaration. I am not sure, but they say that it is part of our coding standards.

Status:Needs review» Needs work
Issue tags:-Performance, -Needs tests, -scalability, -needs backport to D7

The last submitted patch, path-alias-whitelist-1209226-122.patch, failed testing.

Status:Needs work» Needs review

Status:Needs review» Needs work
Issue tags:+Performance, +Needs tests, +scalability, +needs backport to D7

The last submitted patch, path-alias-whitelist-1209226-122.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new17 KB
PASSED: [[SimpleTest]]: [MySQL] 48,984 pass(es).
[ View ]

Re-roll, guzzle had to add it's stuff right where I made my change as well :p

StatusFileSize
new2.13 KB
new17.58 KB
PASSED: [[SimpleTest]]: [MySQL] 48,999 pass(es).
[ View ]

Updated to fix the documentation issues outlined in #128.

I think this is ready.. any final reviews? :)

StatusFileSize
new956 bytes
new17.59 KB
FAILED: [[SimpleTest]]: [MySQL] 49,889 pass(es), 4 fail(s), and 0 exception(s).
[ View ]

Another one to fix two copied comments in AliasTest, did not fix the original that file needs a cleanup follow-up to avoid code duplication and fix those wrong and missing comments. Not a problem introduced here.

Status:Needs review» Needs work
Issue tags:+Performance, +Needs tests, +scalability, +needs backport to D7

The last submitted patch, path-alias-whitelist-1209226-129.patch, failed testing.

Argh, here we go. Segmentation faults.

Segmentation fault
FATAL Drupal\views\Tests\Handler\FilterInOperatorTest: test runner returned a non-zero error code (139).
Segmentation fault
FATAL Drupal\views\Tests\Handler\FilterNumericTest: test runner returned a non-zero error code (139).
Segmentation fault
FATAL Drupal\views\Tests\Handler\FilterEqualityTest: test runner returned a non-zero error code (139).
Segmentation fault
FATAL Drupal\views\Tests\Handler\FilterStringTest: test runner returned a non-zero error code (139).

And the same views Filter tests as e.g. in the form state issue.

Which means we are blocked on #512026: Move $form_state storage from cache to new state/key-value system and possibly #1858616: Extract generic CacheCollector implementation and interface from CacheArray too although we can avoid that by overriding __destruct() if we have to.

Status:Needs work» Needs review
StatusFileSize
new1.83 KB
new33.34 KB
PASSED: [[SimpleTest]]: [MySQL] 52,273 pass(es).
[ View ]

Re-roll that includes the latest service destructor patch and uses it, including an override of __destruct() to avoid the segfaults. Ugly but the cache collector will remove the need for that.

Patch applies nicely. What do we need to do to get this marked RTBC?

#1891980: Add Service termination API to reliable terminate/shut down services in the container needs to get in, which should hopefully happen soon as it's now RTBC. Then I will re-roll this issue and it should be ready for final reviews/RTBC.

StatusFileSize
new19.69 KB
PASSED: [[SimpleTest]]: [MySQL] 52,453 pass(es).
[ View ]

Ok, the service destructor is in, here's a re-roll.

Patch from #136 applies, doesn't break anything (install) so far and works.

Issue tags:-Needs tests

Maybe someone can do some extended manual testing and confirm that it is working. Basically, what you have to do is add and remove aliases for different root paths (node, user, admin, ...) and make sure that they are found and correctly resolved without having to do manual cache clears and report back with what you did exactly.

There is test coverage for the same thing, so removing the Needs tests tag. I think this is ready.

+++ b/core/modules/system/lib/Drupal/system/Tests/Path/AliasTest.php
@@ -148,4 +150,43 @@ function testLookupPath() {
+    $this->assertNull($whitelist['invalid']);

Just a small nitpick here, I'd use randomName() for generating an invalid path. Other than that, it looks great to me.

StatusFileSize
new1.73 KB
new19.83 KB
PASSED: [[SimpleTest]]: [MySQL] 53,054 pass(es).
[ View ]

Updated the patch to use randomName() for invalid rooth paths and alias names. There is no conflict potential there because user/admin don't have a length of 8.

Note that the array syntax will be replaced with a CacheCollector from #1786490: Add caching to the state system once it is in. It doesn't matter which way this gets in, I can re-roll/do a follow-up to update it either way round. Both issues need to go in anyway.

Status:Needs review» Reviewed & tested by the community

I think we're good to go here.

Version:8.x-dev» 7.x-dev
Status:Reviewed & tested by the community» Patch (to be ported)

I worked on this a fair bit but enough people reviewed/updated the patch I'm OK committing it.

Committed/pushed to 8.x, moving to 7.x for backport.

Status:Patch (to be ported)» Needs review
Issue tags:-needs backport to D7
StatusFileSize
new8.64 KB
FAILED: [[SimpleTest]]: [MySQL] Failed to run tests: failed to login to test site.
[ View ]

I backported the patch to Drupal 7 (actually I started off with last D7-version of the patch and carefully implemented all further changes that made sense in D7).

Status:Needs review» Needs work

The last submitted patch, path-alias-whitelist-1209226-143.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new8.73 KB
PASSED: [[SimpleTest]]: [MySQL] 40,355 pass(es).
[ View ]

Woops. Left out the $save = TRUE flag on menu_router_build(TRUE) in menu_rebuild()

Status:Needs review» Needs work

Looks good, but I think those tests need to be backported as well. We'll have to make it a normal webtest though, as 7.x doesn't have DUBT :(

Issue summary:View changes

Updated issue summary.

Issue summary:View changes

added executive summary so that people know where things stand.

Issue tags:+Needs tests

The path in #145 still applies nicely in D7. Looks like this would be a good fix.

Still waiting on simple tests to get into Core.