Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Feel free to change the title and relevant project information if I categorized incorrectly.
This is a placeholder task to avoid collisions of database sprinters and the search sprinters.
Comment | File | Size | Author |
---|---|---|---|
#81 | 394182docb.patch | 1.03 KB | jhodgdon |
#77 | 394182doc.patch | 1.03 KB | jhodgdon |
#71 | dbtng-search-reverse.patch | 36.58 KB | davyvdb |
#68 | search_dbtng.patch | 48.58 KB | chx |
#66 | search_dbtng.patch | 48.46 KB | chx |
Comments
Comment #1
Dave ReidComment #2
Crell CreditAttribution: Crell commentedmradcliffe says he'll be a while, so this is open again.
Comment #3
BerdirOk, uploading a first version of the patch.
The problematic part is do_search() and everything that is related.
It *seems* to work fine, however, there are a few API changes in do_search() itself, search_parse_query(), _search_parse_query() and hook_rankings() (which is not yet documented, btw)
Comment #5
BerdirI forgot to mention that the tests fail because of #430904: PagerDefault displayes PHP notice when multiple Pagers are used on the same page, not a bug in this patch..
Comment #6
BerdirPager fix has been commited, retesting this...
Comment #8
BerdirOk, this evolved quite a bit...
The above patch just converted do_search() internally to DBTNG but that doesn't really allow to use what DBTNG provides. I discussed this a bit with DamZ and we agreed that this should probably be a class.
So I went ahead and did the obvious: Converting do_search() and the related functions to a Search DB Extender. However, there were a few problems and limitations, namely:
- The extender only works for the search_index table
- The order of the extenders is now relevant. Because the current implementation of the DefaultPager increases a counter for every call of execute(), it "paged" both the first pass and the second pass query, so while everything worked, theme('pager', NULL) returned nothing. This means that the DefaultPager extender needs to be "outside" of the Search.
The next idea was to extend SelectQuery directly, however, that doesn't work as we don't know which SelectQuery is actually used because each DB Driver can overwrite that class.
Because of this, I made a "special" DB Extender that is created directly (by calling do_search()) and does solve the above two points:
- There is no way to use it on a different table, because the SelectQuery object is created in the constructor
- Because of that, the Search Extender is always the most inner extender
Now, what does the SearchQuery class provide, compared to do_search()
node.module can now simply do:
All tests pass on MySQL and PostgreSQL, now that the dwr() bug is fixed.
TODO:
- I don't like how hook_ranking() currently works. Imho, it would be nice if we could pass $query directly to the hooks, so that they could add ranking stuff (especially joins) directly. But I'm not sure how to do that, so that only enabled scores are added.
* According to my tests, search score normalization works for everything except comments but I don't think that's a problem inside SearchQuery.
Comment #10
Berdirmore bugs that I forgot to include... The tests should pass with the patch at #467984: DBTNG bugfixes applied.
Comment #11
Berdir#467984: DBTNG bugfixes is commited, testing again...
Comment #12
chx CreditAttribution: chx commentedIn overall, using the extender facility here IMO is very nice. There are a few thing though that could be refined
=& $
and Dries agreed to stick the reference to the variable. Core needs fixing if it does otherwise.do_search()->extend('PagerDefault')
I doubt we ever want to run a search unpaged so ->extend('PagerDefault') should be inside do_search. If we want to leave an "out" add an argument defaulting to TRUE, $use _pager. This will give a reason for do_search to exist, too. However, it needs a rename as it no longer does a search.Comment #13
BerdirWell, the tests don't need a pager ;)
What about:
db_*something* would conform with the other DBTNG functions which return a SomethingQuery object.
Fully agree with the other points, will re-roll asap.
Comment #14
Crell CreditAttribution: Crell commentedOn the contrary we do want the pager to be specified outside of the search extender... The goal is to potentially have several pagers to choose from that use different algorithms. Why hard-code the search extender to just one of them if we don't have to?
Comment #15
BerdirAccidently changed the status...
Comment #16
BerdirNew version...
- changed =& to = &, as this is a de facto standard in core (we should document that somewhere if it isn't already). chx had a good reason for it, =& can be easily confused with &= and that is a completely different thing.
- removed OUTER
- renamed do_search() to db_search(). I tend to agree with Crell regarding the pager. It would also be possible to add more extenders, a TableSort for example so it imho doesn't make sense to handle PagerDefault specially.
- renamed firstPass() to executeFirstPath()
- renamed *query* to *expression*
- updated the example in search.api.php
Questions..
- Still looking for a different approach for hook_rankings()...
- Can someone point me to contrib modules that do implement hook_search() on their own? I'd like to make sure that SearchQuery can be adopted to those too.
Comment #18
BerdirRe-roll, not sure what it did not work.
Comment #20
BerdirD'oh!
Forgot the -N switch in my two latest patches, so SearchQuery was missing...
Comment #21
BerdirComment #22
chx CreditAttribution: chx commentedI ran out of things to complain about.
Comment #23
webchickI'd like Crell to give this a final +1 before committing. But overall, this looks like great work!!
Some minor things as I was glancing through quickly; I'm going to need a good half hour or so to review this patch once it's marked RTBC 'for reals':
Extra space before $values.
Ugh. Really? :\ That syntax is not very intuitive...
supports
extended
I can tell we are missing test coverage.... ;) It's $this->executeFirstPass();
Comment #24
BerdirNo, it's not. But neither is the resulting sql query: "SELECT COUNT(*) FROM (SELECT sid FROM....)". DamZ was thinking about making this the default count query: #423888: Use subqueries for ->countQuery(), at least for MySQL.
Good catch. That check isn't needed anymore anyway, as I moved the i.relevance replacement into execute().
I'll update the patch soon. I also wrote Doug Green and he'd like to review the patch too.
Comment #25
BerdirRe-roll with the things in #23 fixed.
As I said, search.module does pretty crazy stuff and I don't understand every bit, especially that search expression parsing and I'd like to get Doug Green's review for this.
Comment #26
Crell CreditAttribution: Crell commentedI barely understand search.module, but I'll try to give this a review tonight at least for the DB coding style bits.
Comment #27
BerdirWe can leave it for now, the patch is already huge, but what about the following changes to hook_ranking, for a follow-up issue...
- Merge it into SearchQuery
- make it search type agnostic, so that modules can add scores to all search types. the name would change to something like hook_ranking_$type()
- Change how it does handle things like joins and arguments to have better DBTNG compatibility. A possible way would be to move complex things to hook_search_alter() implementations, SearchQuery would just tag the query so that the correct hooks are executed. I'm not sure if that is good DX :)
Comment #28
BerdirAnother small improvment.
When looking at the implementation of location_search's hook_search, I noticed that options can be more complex than what node.module uses. So I added a new method, extractOption($option) which just extracts (and removes) the value from the search string and simply returns the value.
Comment #29
robertDouglass CreditAttribution: robertDouglass commentedGreat! I'm glad someone stepped up to do this! It will make it easier to try to refactor the code for more extensibility and performance.
Comment #31
BerdirRe-roll.
Comment #33
BerdirRe-roll.
Comment #35
BerdirThe last patches missed the statistics.module changes and it wasn't earlier because the search tests weren't run.
Comment #36
Dries CreditAttribution: Dries commented- I stared at this patch for 30 minutes and it looks good.
-
This function is normally only called by each module that supports the
is no longer valid. It's a class now.- I'd like to brainstorm some more about using the Extenders. I can see why we'd want to use an object, but I'm not sure I 100% buy into using the Extender system. It's not like it cleans up a huge amount of code. Also, it feels like regular database queries and search algorithms are two different categories of problems so it feels slightly awkward to think of search as a query -- it could be a lot more complicated than that, which is already illustrated by the execute() function which executes multiple database queries. It is no longer a single query so it gives me a "forced feeling" -- it is not all that pure, if you will, and not what a user of the API might expect. I feel it goes beyond 'decoration'.
- The documentation of SearchQuery explain the algorithm, but it doesn't show me the potential of being a SelectQuery object. Maybe we could add a paragraph of text to articulate the reason for using an Extender?
Comment #37
BerdirI'm neither. And the discussion in #497804-5: [meta] Search entities (nodes, terms, etc.) within the administrative interface shows that we might need a generic search api, that needs to work for other search backends too. However, I have currently no idea on how to do that.
A few points why an Extender might be the right way:
- It still allows access to the actual query object and it can be extended with other Extenders. Given the above discussion, this might not be a good idea, but currently, we need it: http://api.drupal.org/api/function/node_search/7. As I already said, I have no idea on how to make things like the taxonomy options, which require a JOIN, generic enough to work with apache solr and other backends.
- Executing a second query in execute() *does* look strange, however, PagerDefault is doing exactly the same with the count query. And this is very similiar, we check if there are possible matches for our search query and detect the maximal search score.
Comment #38
Dries CreditAttribution: Dries commentedIt is limiting to think of search as a query you can extend or decorate. It's still somewhat different with PagerQuery because for the pager, you're altering/extending a given query -- because the input is a query, and because it doesn't really matter what query you give as input, it correctly feels like decorating to me.
What we're doing in this patch, feels like reducing a search algorithm to a single specific query with some helper queries. Calling that a decoration seems odd.
As you mention, not all search algorithms use (sort of) a single query -- in fact, many of the more advanced search solutions don't use a relational database at all. That said, as far as I can see, this patch does not prevent alternative search algorithms or a generic search API..
Comment #40
BerdirTestbot, I don't believe you :)
Comment #42
BerdirJust a re-roll..
Regarding the approach, as I've said, I'm not sure. The extender approach was the only one that I could get to work without big workarounds. If it would be an own class, we would still need to allow access to the actual query object, and for that, we would have to duplicate the functionality of the extenders. And while maybe not in theory, that's practically a decoration. And that's what I've reused of the extender stuff.
The second thing is that it does not present itself as an Extender to the outside, you invoke it by calling db_search() and not db_select('some_table')->extend('Search').
But I'd like to hear some more opinions too, especially from people who are familiar with search.module and/or DBTNG.
Comment #43
robertDouglass CreditAttribution: robertDouglass commentedI'm still digesting the architecture discussion but the current code is a massive improvement in terms of legibility and order.
One small note, since it jumped out at me:
I think you need to separate the body and the bits you're adding with a space (this is a core bug), else we risk concatenated words (last one in the body and the first one in the comment/taxonomy). This doesn't happen most of the time because of HTML, but it's possible.
Comment #45
BerdirRe-roll, also added the suggested change of #43 as I'm touching these lines anyway...
Comment #46
joshmillerI spent about 30 minutes going through the patch. I don't understand everything about the code, but I read the comments very carefully and looked for code formatting inconsistencies. It looks like a lot, but they are very minor. Hopefully this helps.
hook_search();
hook_search(); case "search"
"...(such as negative..."
"Type of search." (remove the) and "This maps" (remove the "is")
Missing extra " * " between definition and @var
"if the first pass query..." shouldn't "path" be "pass" ?
"atleast" is two words: "at least"
This function isn't defined in the patch and doesn't look like "DBTNG" so I'm putting it in this list of fixes. Not sure if its a problem, just noting it.
We're using the variable "$warning" to hold the value of "$match[2]" for the sole purpose of determining if there should be a warning. Sounds like it should at the very least be TRUE or FALSE. At the very most maybe turn it into a generic warning function that can collect other warnings in the future
This error message seems redundant.
I know this isn't part of the patch, but search_update_totals()'s comments don't include periods.
Shouldn't "->key" be on a newline?
Ok, so that took longer than I expected. This is my first really big code review, hopefully it doesn't sound too nit-picky. And on a side note there was a whole bunch of code in there that was extremely WTF. When people above are saying they "don't get it" I totally understand. I was just like -- what the hell is going on here? Maybe in a follow-up patch we could more thoroughly document the more "this is weird, but we need it because" sections.
Josh
Comment #47
BerdirThanks for the thorough review! Lots of nice catches...
A few short notes, I will re-roll asap.
7:
It is DBTNG, and so are db_or() and db_xor() :) http://api.drupal.org/api/function/db_and/7
8+9:
That error handling stuff is really ugly I know. I just "exploded" ($warning instead of $query[3]) a return key with 10 different things in it (see http://api.drupal.org/api/function/search_parse_query/7). The thing is, this patch is already big and I'm not sure how many changes I should add and what should be done in follow-up patches. #258998: do_search refactoring: performance, legibility, move validation to validation/submit handlers of form. tried to improve these things a bit but got stalled, maybe I can merge some parts of that patch into this.
PS: There are very few who actually understand that search expression parsing and query building stuff inside those functions ;)
Comment #48
BerdirFixed all of #46 except 7, as that is not wrong and 8+9 as I need to think about that. I agree that fapi stuff should be removed from the search query class, but I'm not yet sure how.
Comment #49
webchickBumping to critical. We really need to get this finished up before code freeze IMO so we can drop the DBTNG backwards compatibility layer.
Comment #50
webchickTagging.
Comment #51
Crell CreditAttribution: Crell commentedIMO dropping the BC layer is going to happen anyway, it doesn't have to be before code freeze. The documented API won't be changing in the process.
Comment #52
BerdirMaybe, but this patch *will* change search.module API, and it's not a minor change :)
Comment #54
chx CreditAttribution: chx commentedI spent hours figuring out that this patch works just http://drupal.org/node/496500#comment-1961346 broke it. I have included the rollback of that issue here.
Comment #56
chx CreditAttribution: chx commentedI suspect a bot fluke. I installed HEAD and worked.
Comment #58
chx CreditAttribution: chx commentedStupid me! the search.extender.inc was missing. But why did that fail HEAD ? Curious.
Comment #59
TheRec CreditAttribution: TheRec commentedSubscribing.
Comment #60
chx CreditAttribution: chx commentedSo yeah,this passes but again: I have rolled back #496500: Remove global placeholder counter inside it.
Comment #61
chx CreditAttribution: chx commentedThat has been rolled back.
Comment #62
chx CreditAttribution: chx commentedSo, to clarify I changed nada in the patch, I have actually diffed my patch to Berdir's and found that I left in a bit of debug and no other change. I am calling this ready -- I worked several hours with the patch to figure out what's wrong and found no glaring holes it's a very fine piece of work.
Comment #63
webchickI stared at this for a good hour tonight, and had a review written up but lost it when I clicked "cancel" rather than "paste" *smacks self*
But it basically amounted to "this patch hurts my wee head" so I will try looking at this again tomorrow a bit more refreshed, unless Dries wants to jump in and commit it first. Honestly, I don't see this getting many more reviews/progress before code freeze with Berdir out of commission.
Comment #64
chx CreditAttribution: chx commentedVultures.
Comment #65
chx CreditAttribution: chx commentedComment #66
chx CreditAttribution: chx commentedLet's try review so the bot picks it up.
Comment #67
webchickI spent another hour or so with this patch tonight, and apart from some confusion around the db_search() function in #62, which was since removed after Larry pointed out that extenders should never instantiate themselves, only be appended to existing queries, I could only find a few minor commenty things.
However, as a result, I'm pretty sure this is now going to fail:
So will wait for testing bot to come back green before proceeding with a commit.
Here are the minor things which either you could fix or I could. The only one I could use your help on is the regex comment. I know that was copy/pasted and so technically isn't your domain, but I highly doubt anyone touches this again for the next 36 months, so let's fix it here. ;)
{search_index}.type
{search_index}.word
They rest of these are just "array"
comment please.
Beer-o-mania starts in 2 days! Don't drink and patch.
Comment #68
chx CreditAttribution: chx commentedComment #69
webchickOk, this addresses all of my feedback, though in reading through the issue I'm not quite sure how to respond to the comments Dries made in #36 and #38. The final approach is not really different from what was done back then. I'm obviously reviewing it at a higher-level than Dries, but I will say that I was mostly able to follow along with what this patch was doing, which was amazing to me given it's search.module. ;) But feels safer to move to RTBC and let Dries have one last look before pressing the button.
Curiously, #66 didn't fail. We're probably lacking test coverage in our search system.
AWESOME work, btw. Once this is committed, it officially marks the end of non-DBTNGified code in core, afaik. :)
Comment #70
Dries CreditAttribution: Dries commentedMy comments still hold, but I'm comfortable committing this for the sake of making good progress. We can have a meta/architecture discussion in the Drupal 8 release cycle.
Comment #71
davyvdb CreditAttribution: davyvdb commentedThis just broke HEAD. Can't install anymore. Reversing this patch solves the issue.
Comment #72
TheRec CreditAttribution: TheRec commentedIndeed, that solves the problem at install. Now it's time to find what is wrong with the patch... and maybe why testbot did not report this (either bad testing or missing test I guess) ;)
Comment #73
cburschkaThe patch is supposed to add a new file named search.extender.inc, and it has to be added with
cvs add
. Registry chokes on non-existant files with empty filectime (maybe a separate bug there) Yay for mad detective skills. :)Comment #74
cburschkasearch.extender.inc is now in CVS, so this can be closed.
Comment #76
jhodgdonI'm reopening this -- the search API changed quite a bit with this patch (search modules used to be able to call do_search() and they should do something different now), and this is not currently mentioned in the module upgrade guide. It should have been documented before the issue was allowed to be marked "fixed".
Comment #77
jhodgdonI've added a section to the module upgrade guide - please review.
http://drupal.org/update/modules/6/7#search-api
Also, here's a quick patch to a small part of the search doc, to fix reference to now nonexistent hook_search().
Comment #78
BerdirTwo things could be improved in the example.. (I can't edit the wiki page myself..)
a) To recieve the same result as in D6, you should call limit(10)
b) execute() just returns the result object, to have an array with nid's as in D6, you should call $query->execute()->fetchCol()
Comment #79
jhodgdonThanks! Please check if the example is better now.
Comment #80
BerdirThat seems to contain a c too much :)
This review is powered by Dreditor.
Comment #81
jhodgdonGood catch Berdir, thanks! A reminder of why we have patch reviews. :)
Here's a fixed patch (I hope with no typos this time).
Comment #82
jhodgdonAnd Berdir, did you have any other comments on the Module Upgrade Guide section http://drupal.org/update/modules/6/7#search-api or do you think it is OK now?
Comment #83
catchLast patch is good.
Comment #84
Dries CreditAttribution: Dries commentedCommitted to CVS HEAD. Thanks!