It would be practical, both for the Facet API integration (#1182614: Integrate with Facet API) and generally, to be able to index all parents along with a taxonomy term. This should be easily possible with a simple data alteration, which could also be coded generically enough to work for other hierarchies.

It would also be possible to add this to the "Aggregated fields" data alteration, but I doubt that would be a good idea.
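
As a rough illustration of what "indexing all parents along with a term" means, here is a minimal, self-contained sketch in plain PHP. It is not the data alteration itself: the `$parents` map and the `collect_with_ancestors()` helper are hypothetical names made up for this example, and a real implementation would read the hierarchy from the entity's properties instead of a hard-coded array.

```php
<?php
// Hypothetical parent map (assumption for this sketch):
// term ID => list of direct parent term IDs.
$parents = array(
  1 => array(),
  2 => array(1),
  3 => array(2),
  4 => array(2, 1),
);

/**
 * Returns $id followed by all of its ancestors, without duplicates.
 *
 * Walks the hierarchy breadth-first, so terms with several parents
 * (or parents reachable via more than one path) are only listed once.
 */
function collect_with_ancestors($id, array $parents) {
  $seen = array($id => TRUE);
  $queue = array($id);
  while ($queue) {
    $current = array_shift($queue);
    $direct = isset($parents[$current]) ? $parents[$current] : array();
    foreach ($direct as $parent) {
      if (!isset($seen[$parent])) {
        $seen[$parent] = TRUE;
        $queue[] = $parent;
      }
    }
  }
  return array_keys($seen);
}

// The values that would be indexed for an item tagged with term 3.
$indexed_values = collect_with_ancestors(3, $parents);
```

With values expanded like this, a search or facet on a parent term would also match items tagged only with one of its descendants.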


Comments

zambrey’s picture

Subscribe

dmiric’s picture

sub

drunken monkey’s picture

Status: Active » Needs review

This one works for me, please test / review.

I've also sneaked in a few other minor fixes, you can just ignore those. It would just be good to know whether search_api_extract_fields() still works the same with the change (it should now just need far fewer entity loads).

zambrey’s picture

I finally have time to test this feature.
So I have applied the patch and enabled the "Index hierarchy" data alteration for one category.

I have this error when trying to reindex data:

Fatal error: Unsupported operand types in .../sites/all/modules/search_api/contrib/search_api_db/service.inc on line 495

As you can see I'm using DB backend with PHP 5.2.6.

drunken monkey’s picture

Hm, I can't reproduce this. I take it you use the latest version of all modules?
Then please add debug($item); return FALSE; at the beginning of SearchApiDbService::indexItem(), index a single item manually and post the result.
Also: Does this happen for all items, or just for some? (Kind of complicated to tell that, I guess. You can index a single item manually and see whether the error occurs. Then you could also mark items indexed by adding return TRUE; as the first line in indexItem() and see if it also occurs when indexing other items manually. Or call search_api_index_specific_items() directly from code.)

zambrey’s picture

Attaching debug message. It's kinda big but hope it will help a little.
I'm using SearchAPI taken from git with this patch applied, Entity API 7.x-beta10, Views latest dev.

EDIT: enabled only this one category for indexing and the error is gone, attaching debug with that also.
But now only two items are indexed and I have many messages in logs like this:

SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '8' for key 'PRIMARY'

I will try to find which field causes fatal error mentioned in #4.

zambrey’s picture

Okay, I can't reproduce the error but still only 2 nodes are indexed.

drunken monkey’s picture


Ah, I knew I messed something up in search_api_extract_fields(). Restored that to the original version, I should just fix it properly, in a different issue.
The SQL exception is really weird, this really should not hap— oh, wait. Please go to the Fields tab, save the form and see whether this fixes the error.

The error could also stem from an interaction between the "Aggregated fields" data alteration and this one. See if the error occurs when enabling only those two data alterations and no other fields.
Otherwise, yes, please find out what fields cause the errors.

drunken monkey’s picture

Ah, of course you can't reproduce it: as you just wrote, you changed the fields you index, so you already saved the Fields form as I suggested. Sorry, didn't think of that.
Anyways, seems like that has to be fixed somehow. The issue here seems to be that when the type is changed, the server won't be notified about it.

drunken monkey’s picture

OK, now I'm confusing myself completely. To summarize, the two bugs:

- SQL exception when indexing most items: I'd guess this is due to stale field type data in the index. When activating the alteration and saving the "Fields" tab, this should vanish. (But you say, they didn't, so maybe I need a new theory …) Do the entries that are reported as duplicates correspond to the node nids you try to index?
- Fatal error: This I think could come from the interaction between the two data alterations.

zambrey’s picture

Okay, it works now BUT you must re-enable field that you want to index with hierarchy.

Here's detailed info:
- after applying latest patch category wasn't indexed (it was enabled - first WTF)
- resaving field configuration (haven't changed any field) and disabling hierarchy caused field to be indexed again (NICE)
- enabling hierarchy for category caused #4 error (second WTF)
- resaving field configuration (haven't changed any field) - only two nodes are indexed (getting #6 exceptions)
- disabling and enabling field finally makes all nodes to be indexed with all hierarchy (Yay!)

drunken monkey’s picture

Thanks for investigating and explaining the problem! I think I now understand what might be wrong here.
The attached patch hopefully fixes the bug. I've also included a note that all fields should be de-selected before disabling the data alteration – otherwise there's hardly a way to keep this from triggering in this case. (A flaw in the current framework which will have to be fixed separately.)

marvil07’s picture

I just wanted to point out that maybe it is better to rely on an external module for doing conversions with fields. Actually, cpliakas and I are starting a discussion about it. So this alteration could be a plugin inside the Converter module.

I hope I am getting this right :-p

drunken monkey’s picture

I heard about the Converter module from Chris already, but from the description it doesn't sound like those kinds of things would fall into its area of expertise. This is, after all, more about changing meta data than changing the data format.
Also, Converter isn't usable yet, so we're gonna need an interim solution anyway. However, once Converter is stable enough, it'd be definitely great to integrate it into the Search API (probably with a generic "Converter" data alteration). After all, I've had only positive experiences with Chris' generic frameworks so far. ;)

(By the way: Are you gonna be in London? We'll probably do a Search API BoF there.)

zambrey’s picture

I tried to use this on another site and it works brilliantly!
Thank you very much for this! Awesome work.

marvil07’s picture

@drunken monkey: I see your point, and I am glad you are open to integration in the future. About London, visa stuff leaves me out this time :-(, but hopefully I will keep tracking search activity as best I can (this project's issue queue is growing exponentially, good work!).

dmiric’s picture

Hey, thanks for solving this problem. I just can't find how to make this work. I updated Search API to the latest dev version and applied the patch. Everything passed OK.

Only problem is I don't know how to use it.

Where are the options?

on admin/config/search/search_api/index/nodes_3/fields

i dont see anything new

Can you please write a few steps on how to make this work? Maybe I'm misunderstanding what this patch should be doing?

What I think it does is: when I enable indexing of nodes, it should give me an option somewhere to say that, for that node, this vocabulary should index the complete hierarchy. I looked over every option there is again and can't find anything new.

drunken monkey’s picture

Go to the index's "Workflow" tab. There is now a new data alteration, "Index hierarchy", which lets you do that.

dmiric’s picture

Yeah, I checked there and there's no such option :( ...

Should I roll both patches or just the last one? And does it work on the latest dev or only on the beta build?

drunken monkey’s picture

Ah, my apologies! I didn't remember to re-roll this after committing #1064884: Add support for indexing non-entities – a working patch should be attached.

And thanks for reviewing!

davidseth’s picture

Just applied patch at #20 and no "Index hierarchy" shows up in the data alteration area. :(

drunken monkey’s picture

Do you have the latest module version, and did you clear the cache?

davidseth’s picture

I am using the facet_api integration branch. Your patches applied cleanly. I assumed I needed to use that branch to get integration with Facet API in the first place. Maybe you can shed some light on that? Thanks.

drunken monkey’s picture

I am using the facet_api integration branch. Your patches applied cleanly. I assumed I needed to use that branch to get integration with Facet API in the first place. Maybe you can shed some light on that? Thanks.

You need the branch for Facet API integration – however, this patch is in principle independent of that development. You can index hierarchies without it – you just won't get the corresponding hierarchical facets.
And I just tested it, it works with both the normal Search API and the Facet API integration branch / fork. If you cleared the cache and it still doesn't appear in the workflow (even though you have taxonomy term references on the indexed entity), maybe check the logs for what might be going wrong. Or debug the search_api_admin_index_workflow() function to see what goes wrong there.

davidseth’s picture

@drunken monkey: I have got it to work! I had to change two things though.

First there is no entry in search_api_search_api_alter_callback_info() for the new class: SearchApiAlterAddHierarchy. So I added:

  $callbacks['search_api_alter_add_hierarchy'] = array(
    'name' => t('Index Hierarchies'),
    'description' => t('Index Hierarchies for Facet API.'),
    'class' => 'SearchApiAlterAddHierarchy',
  );

to the end of the definitions. This now meant I could see the option, but then I kept getting errors. I have Field Collection module in place and it was causing errors so I added (at line 211):

        if (empty($entity_info[$type]) || $type == 'field_collection_item') {
          continue;
        }

Note the $type == 'field_collection_item' addition. Is there any reason that this doesn't simply filter only on taxonomy_vocabary?

After I made those two changes above and re-indexed everything worked perfectly!

Cheers,

David

drunken monkey’s picture

First there is no entry in search_api_search_api_alter_callback_info() for the new class: SearchApiAlterAddHierarchy.

This probably means you made some mistake while applying the patch. See the last-but-one change in the patch linked in #20.

Note the $type == 'field_collection_item' addition. Is there any reason that this doesn't simply filter only on taxonomy_vocabary?

Yes – as the functionality is not really specific to taxonomy terms (by the way, for those, filtering on "taxonomy_term" would be the right thing to do), I figured we could just as well allow all other hierarchical information, too.

What kind of errors did field collection items cause? Hard-coding a special case is definitely not the way to go here, so we should look for why this causes errors.
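
To illustrate the generic (not taxonomy-specific) approach mentioned in this comment, here is a minimal, self-contained sketch in plain PHP. It is not the code from the patch. It assumes, purely for illustration, that a property counts as "hierarchical" when its type refers back to the entity type that owns it, optionally wrapped in `list<...>`. The `$property_info` array and the `find_hierarchical_properties()` helper are hypothetical.

```php
<?php
// Hypothetical property info (assumption for this sketch):
// entity type => property name => property type.
$property_info = array(
  'taxonomy_term' => array(
    'name' => 'text',
    'parent' => 'list<taxonomy_term>',
  ),
  'node' => array(
    'title' => 'text',
    'author' => 'user',
  ),
  'comment' => array(
    'parent' => 'comment',
    'node' => 'node',
  ),
);

/**
 * Finds properties that point back to their own entity type.
 *
 * Returns an array keyed by entity type, listing the names of all
 * self-referencing properties; types without any are left out.
 */
function find_hierarchical_properties(array $property_info) {
  $result = array();
  foreach ($property_info as $type => $properties) {
    foreach ($properties as $name => $property_type) {
      // Strip a "list<...>" wrapper so multi-valued parents also match.
      $inner = preg_replace('/^list<(.+)>$/', '$1', $property_type);
      if ($inner === $type) {
        $result[$type][] = $name;
      }
    }
  }
  return $result;
}

$hierarchical = find_hierarchical_properties($property_info);
```

Under that assumption, no entity type needs to be special-cased: types without a self-referencing property simply yield nothing.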

davidseth’s picture

#26 - This is the error I get when it is parsing 'field_collection_item' field types:


EntityMetadataWrapperException: Missing data values. in EntityMetadataWrapper->value() (line 73 of /Users/david/web/gccsi/contactmgr-git/contactmgr/app/sites/all/modules/contrib/entity/includes/entity.wrapper.inc).

drunken monkey’s picture

When does the error occur – during indexing, or already when configuring the data alteration?
And are you using the latest version of the Entity API? If I'm not mistaken, a similar bug was recently fixed there, and this does seem like an Entity API problem.

davidseth’s picture

Status: Needs review » Reviewed & tested by the community

Okay, updated to latest Entity API dev and it works.

BTW, the original error occurred when configuring the data.

This patch works well, please commit.

Cheers,

David

drunken monkey’s picture

Status: Reviewed & tested by the community » Fixed

BTW, the original error occurred when configuring the data.

Ah, thought so.

Great to hear it now works. Thanks for reviewing!
Committed.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.