Indexing Fails When Content Types are Excluded from Search Indexing
| Project: | Search config |
| Version: | 6.x-1.x-dev |
| Component: | Code |
| Category: | bug report |
| Priority: | critical |
| Assigned: | Unassigned |
| Status: | closed |
I have over 20 content types on my site, only two of which (story and page) should be indexed. I checked all of the boxes to prevent the other types from being indexed, but found that indexing never progressed past 2%. After some digging around, I found an error in the module's logic.
node_update_index looks for all nodes that are not yet in the search dataset and indexes them.
search_config_update_index removes entries in the search dataset that match selected content types.
This explains why, when I ran search_cron over and over again, it kept attempting to index the same 500 nodes: node_update_index would enter them into the dataset and search_config_update_index would remove them again.
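The loop described above can be reproduced with a small model. This Python sketch is not Drupal code: cron_run, the dataset set, the "event" content type, and the batch size of 3 are hypothetical stand-ins for search_cron, the search_dataset table, an excluded type, and the cron limit.

```python
def cron_run(nodes, dataset, excluded_types, batch_size):
    """Model one cron pass: index a batch, then purge excluded types."""
    # Stand-in for node_update_index: pick nodes not yet in the dataset.
    pending = [nid for nid in sorted(nodes) if nid not in dataset][:batch_size]
    dataset.update(pending)
    # Stand-in for search_config_update_index: remove excluded types again.
    purged = {nid for nid in dataset if nodes[nid] in excluded_types}
    dataset.difference_update(purged)
    return pending


# Hypothetical site: nodes 1-6 are an excluded type, 7-8 should be indexed.
nodes = {nid: "event" for nid in range(1, 7)}
nodes.update({7: "story", 8: "page"})
dataset = set()

first = cron_run(nodes, dataset, {"event"}, batch_size=3)
second = cron_run(nodes, dataset, {"event"}, batch_size=3)
print(first, second, dataset)
```

In this model both passes select the same three excluded nodes, the dataset is empty again after each pass, and the story and page nodes are never reached.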
Is there another way to remove a node from search_dataset while still marking it as "indexed"? Perhaps leaving it in the dataset but simply clearing the text associated with it?
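A rough model of that idea, assuming an excluded node keeps its row but is stored with empty text. This is a Python sketch rather than Drupal code: cron_run_keep_rows, the dict standing in for search_dataset, the "event" type, and the batch size are all hypothetical, and whether the real search query would ignore blank rows is not checked here.

```python
def cron_run_keep_rows(nodes, dataset, excluded_types, batch_size):
    """Model one cron pass that blanks excluded nodes instead of deleting them."""
    # Select nodes that have no row in the dataset yet.
    pending = [nid for nid in sorted(nodes) if nid not in dataset][:batch_size]
    for nid in pending:
        node_type, body = nodes[nid]
        # Excluded types keep a row (so they count as indexed) but no text.
        dataset[nid] = "" if node_type in excluded_types else body
    return pending


# Hypothetical site: nodes 1-6 are an excluded type, 7-8 should be indexed.
nodes = {nid: ("event", "event text") for nid in range(1, 7)}
nodes[7] = ("story", "story text")
nodes[8] = ("page", "page text")
dataset = {}

passes = [cron_run_keep_rows(nodes, dataset, {"event"}, batch_size=3)
          for _ in range(3)]
print(passes)
```

Because every processed node keeps a row, each pass in this model moves on to new nodes, and the story and page nodes are reached on the third pass.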

#1
This is a known issue. There is no way to effectively prevent indexing. You can alter the database query that does the search, but that's about it. To get around the bug with cron, I plan to add a throttle (about 100) to search_config_update_index. I just haven't gotten around to it yet.
#2
This should now be fixed in commit http://drupal.org/cvs?commit=216032. Please test and let me know.
#3
Works for me!