Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
I would like to be able to make some search indexes read-only, so that an external service can index data, and then that data can be used in Drupal via Search API. I've written a patch that adds a 'read only' flag to Search API index entities. This was straightforward, but involved touching several places in the code.
This issue comes out of #1064884: Add support for indexing non-entities, starting with comment #23.
Comment | File | Size | Author |
---|---|---|---|
#4 | read_only_flag-1138992-4.patch | 19.29 KB | drunken monkey |
#3 | read_only_flag-1138992-3.patch | 13.96 KB | becw |
#2 | read_only_flag-1138992-1.patch | 13.96 KB | becw |
Comments
Comment #1
drunken monkeyAs written in #1064884-24: Add support for indexing non-entities, I still see some issues with this patch that need to be fixed or discussed.
Subscribing, for now.
Comment #2
becw CreditAttribution: becw commentedHere's an updated version of the "read-only" patch from #1064884: Add support for indexing non-entities, with the following changes (comments quoted from comment 24 on issue #1064884):
I've added two methods to
SearchApiIndex
,::queueItems()
and::dequeueItems()
, and called them inSearchApiIndex::postCreate()
,SearchApiIndex::postDelete()
,search_api_index_enable()
, andsearch_api_index_disable()
. Is that what you intended when you said "split the 'Remember items to index' logic"?I also removed my modifications to
search_api_mark_dirty()
; I had removed it because it was only called in one place, and refactored its functionality because using a merge query would ensure that items would get added to an index on update if they weren't already present. This should now be handled by the::queueItems()
and dequeue... methods. I would consider moving all "mark dirty" functionality into the index class itself, perhaps by passing an array of ids intoSearchApiIndex::queueItems()
(with similar logic to the$ids
parameter ofentity_load()
).I didn't call
SearchApiIndex::removeIndex()
so that the server wouldn't delete items. I don't think it would make much sense for a read-only index to switch servers, but then again the point of Search API is to abstract the service details, so it's possible that someone might want to change the type of server where data was being indexed, but not change searches and views based on the Search API index. However, this functionality is pretty far out since the only service I'm working on right now is Solr.In an ideal world, I would avoid creating a Sarnia server entirely and just use features present in Search API Solr's server.
I've removed any read-only-flag-related access restrictions to the fields and processors tabs. This intertwining of index and search functionality was party of what made it very hard to separate the indexer and searcher functionalities of the index class.
Done.
Done.
Comment #3
becw CreditAttribution: becw commentedOops, forgot to update a method call. Fresh patch attached.
Comment #4
drunken monkeyYes, that's what I meant. However, I moved the whole logic for index updates from the to
hook_search_api_index_update()
, as such reactions to property changes are all executed there. This also ensure this is called when someone doesn't use theSearchApiIndex::update()
method or thesearch_api_index_[en/dis]able()
functions.Also, we should use similar code in
hook_enable()
andhook_disable()
.The
search_api_mark_dirty()
is an API function that could also be used by other modules, so we shouldn't remove it unless necessary. And regarding items being already present in the item list: a) this now shouldn't be the case anyways; and b), we can just dequeue all items before queueing them, to make sure. This removes both the join and the @TODO from the method.I think instead of restricting access to the "Status" tab we should just explain things there.
See the attached patch for my suggestions. Sorry that there's also a bit of unrelated clean-up, but when I stumbled upon those things, I just thought I should fix them right away.
Hm, and now that I think about it – why not track dirty items even when the index is disabled or in read-only mode? Would certainly help if we don't have to re-index all items when re-enabling the index (or unsetting the read-only flag).
Comment #5
becw CreditAttribution: becw commentedYour changes make sense, and work in my testing.
I think that part of the idea behind a read-only index is that Search API may or may not know what's in the index. I agree that it makes sense not to re-index content when switching an index out of read-only mode, if possible, but I think that while an index is in read-only mode, and especially when a read-only index is created, no attempt should be made to add indexed items to
{search_api_items}
.I don't think that a join is a terrible thing to have (there is an index on the 'item_id' and 'index_id' columns, after all) so perhaps we could:
SearchApiIndex::queueItems()
for entities with base tablesSearchApiIndex::dequeueItems()
for entities that we have to queue usingentity_load()
, and document the disadvantagehook_entity_update()
usesearch_api_mark_dirty()
on existing items (ie, leave it alone)hook_entity_insert()
Well, that sounds less good now I've written it.
Comment #6
drunken monkeySo, other suggestions? Or is the patch therefore OK as it is?
As indexes aren't re-enabled that often (and probably even more rarely set to read-only and back), I don't think the performance matters that much, whether we use a join or a simple
db_delete()
(which also will be pretty quick). Deleting and re-adding the items is probably just my taste, because this is then handled consistently for both cases (direct db query andentity_load()
).Comment #7
becw CreditAttribution: becw commentedI think the patch is ok as it is. Tracking indexed items while an index is in read-only mode adds too much complexity.
Comment #8
drunken monkeyOK, I now committed this. Thanks for the input!
One line in the test file had to be deleted, but otherwise the tests ran fine, too.