So after checking out the custom redirect module in drupalorg.module, it completely hits me that for the most part we can and should store redirects using hook_menu or hook_menu_alter() and it allows us to not use hook_init() which could be a performance hit for large sites that don't have thousands of redirections.
Comments
Comment #1
deviantintegral commentedSeems like a good idea to me. I'm not sure why you would need to use hook_menu_alter() at all.
Comment #2
dave reidUsing hook_menu_alter rather than hook_menu would allow us to override existing paths like we want to do with #615008: "Forced" path redirect - override the existing path or alias
Comment #3
gregglesThis only kind of makes sense to me. I believe this just shifts the performance problem from a database lookup during hook_init to a database lookup when the menu system fires up. I don't think this really gains very much benefit from a performance perspective and may actually be worse.
Is there anything beyond the menu system that would benefit aside from being more logically consistent?
Comment #4
deviantintegral commentedI wonder if there would more benefits from working some kind of caching into the lookup logic. Might be worth running some benchmarks on a large number of paths and redirects to see where (if?) the bottleneck is.
Comment #5
dave reidBasically I think of it this way. The vast majority of page requests are valid links and not redirections. Let's estimate that number at 95% vs 5% redirections. Currently, because we use hook_init() it causes a redirect lookup (and possibly more if source paths with query strings are involved) that in 95% of the time does not apply. Whereas if we integrate into the menu router, we only lookup redirects on those 5% of requests that are *only* redirections. This is the way the redirections.module in the drupalorg module project does it and I think it makes a lot of sense for a large site with not a huge amount of redirects. Obviously if a larger part of your page requests are redirections, using hook_init() makes more sense. The point would be to give an option to switch between these methods and use a smart default to start with.
Comment #6
gregglesGiving an option seems like feature bloat that is likely to introduce more bugs in the long run.
Putting those redirects (the 5%) into the menu router tables will make the menu lookups on each page slower (I don't know percents, but the more data in the table the slower it is). As I think on it more, perhaps it will be faster because it is doing one query on one dataset - but given that the menu uses some wildcards in the lookups it might also be a slower comparison.
Comment #7
moshe weitzman commentedInteresting. I have a feeling that if we switch to menu_router, we are going to see a trickle of bug reports about mysterious slow sites. Those will be sites that have dozens of aliases for same node for some reason. Maybe they like the current year+month in the url, so they alias every month. We could tell those folks to use a different solution, IMO. I kind of like using menu_router for the reason that it saves the DB hit.
Comment #8
dave reidComment #9
carlos8f commentedSwitching to menu_router is a drastic change, and while it would cut the overhead of hook_init(), it also bloats the menu_router table and the caching system associated with that, especially if the number of redirects grows over time.
Still supporting the path_redirect table would have to be an option. I've dealt with migrating several large WordPress blogs to Drupal, requiring thousands of path_redirect entries to preserve permalinks. Obviously all that can't go in the menu_router.
It might be nicer to create a new contrib module with this approach, so existing path_redirect sites can live peacefully. menu_router would only be for savvy site managers that want to skip the per-request overhead.
Comment #10
dave reidLots of people seem to be missing the point. It'd be an option, not just replacing hook_init().
Comment #11
carlos8f commentedMaybe I have missed the point, but I did read the comments :)
I still feel that menu_router isn't quite appropriate, so how about this instead (pseudo code):
Obviously this only works when the redirects table is very small. When switching to the new mode, the table could be stored in {variables} which has already been fetched from the cache. Alternatively a cache_get() might be used, but that usually means another query unless you're running memcached or the like.
Forgive me if this is too simplistic, I admit I've only just glanced at path_redirect under the hood.
Comment #12
crea commentedMy thoughts:
1) There's already enough bloat in the menu router system. We don't want to grow it even more.
2) With hook_init() approach we could store redirects outside of DB e.g. in memory-based cache backend. With menu router system we can't do that obviously.
3) As a general performance improvement, I suggest to implement a new feature: #791754: Export redirects to webserver rules
Comment #13
dave reidDon't think I made myself quite clear. This is an option, not a one or the other change. :)
Comment #14
dave reidMoving to the new project. Out of scope for a path_redirect 1.0.
Comment #15
dave reidBetter yet, I've figured out how to get the redirects to be stored in page cache instead, which is even less work than storing in the menu router, so abandoning this method.