Hi joachim et al,

I've been working on pulling the vocabularies from the distributor site, and loading them into the retriever site.

Previously this was achieved either with Deploy (a push method), or by setting up the vocabularies manually on the retriever site and pulling only the terms that had been tagged on distributed nodes. Tell me if I am wrong and there is an alternative? :-)

The former means you have to temporarily disable key authentication and then deploy the vocabularies from the distributor site, whereas the latter means you only get tags that have been selected (you could have one node with all tags, but that is messy).

What I have created is a patch for the content retriever module and also the services > taxonomy service module, which enables a remote load of a vocabulary and its tree so that they can be pulled to the retrieval site.

The vocabulary is loaded onto the site along with all new tags. If the vocabulary/tags exist then they are just updated.
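To illustrate the "create if new, update if it exists" behaviour, here is a rough sketch in Python rather than Drupal PHP, purely to show the logic. It is not the code from the patch. Plain dicts stand in for the retrieval site's taxonomy storage, `upsert_vocabulary` is a hypothetical name, and matching on the distributor's `vid`/`tid` is an assumption made for the example; the patch may match on something else.

```python
def upsert_vocabulary(local, remote_vocab, remote_terms):
    """Create or update a vocabulary and its terms in `local`.

    `local` is a dict standing in for the retrieval site's storage
    (hypothetical). Returns (terms_created, terms_updated).
    """
    vid = remote_vocab["vid"]  # assumption: match on the distributor's ID
    entry = local.setdefault(vid, {"vocabulary": {}, "terms": {}})

    # Existing vocabulary details are overwritten, new ones are added.
    entry["vocabulary"].update(remote_vocab)

    created = updated = 0
    for term in remote_terms:
        tid = term["tid"]
        if tid in entry["terms"]:
            # Term already exists locally: just update it.
            entry["terms"][tid].update(term)
            updated += 1
        else:
            # New term: add a copy so later remote changes don't leak in.
            entry["terms"][tid] = dict(term)
            created += 1
    return created, updated
```

Running it twice against the same vocabulary should add nothing the second time round, only refresh what is already there.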

There is currently just a saveVocabulary service, which means saving remotely (i.e. a push method like Deploy), whereas content retriever works with a pull method. I have therefore implemented a loadVocabulary service, which as I said loads the vocabulary remotely to get its details. This works well alongside getTree, so we can get all the terms in the vocabulary and populate them on the retrieval site.
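The pull flow described here could be sketched roughly as follows. Again this is Python for illustration only, not the patch itself. `FakeDistributor` is a hypothetical stand-in for the distributor's services endpoint, and the `taxonomy.` prefix on the method names is an assumption about how loadVocabulary and getTree would be exposed.

```python
class FakeDistributor:
    """Hypothetical stand-in for the distributor site's services endpoint."""

    def __init__(self, vocabularies, trees):
        self._vocabularies = vocabularies  # vid -> vocabulary details
        self._trees = trees                # vid -> list of terms

    def call(self, method, vid):
        # Only the two services discussed above are modelled.
        if method == "taxonomy.loadVocabulary":
            return self._vocabularies[vid]
        if method == "taxonomy.getTree":
            return self._trees[vid]
        raise ValueError(f"unknown method: {method}")


def pull_vocabulary(remote, vid):
    """Pull a vocabulary's details, then all of its terms, from the distributor."""
    vocabulary = remote.call("taxonomy.loadVocabulary", vid)
    terms = remote.call("taxonomy.getTree", vid)
    return {"vocabulary": vocabulary, "terms": terms}
```

The point is simply that the retrieval site initiates both calls, so nothing has to be pushed from the distributor.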

Let me know what you think, or if you have any comments/concerns. :-)

Thanks,

Luke

PS. I've posted the services patch here for now so you can have a look. I will put it on their queue once I see if there is interest.

Comments

joachim’s picture

content_retrieverTaxonomy.patch -- if at all possible I'd rather see this handled in its own module and hooked into the main processes.

It's ages since I've looked at all this code, so I'm not sure how doable that would be.

Also, I must remember to commit some of the stuff I was working on -- I made a UI where you can pick exactly which of the incoming node's properties you want to keep and which to discard.

luketsimmons’s picture

OK cool,

Thing is, the vocabulary pull is part of the main processes, i.e. the way I've set it up, it appears in the manual control section.

Are you suggesting I move it out and we have a completely new section for retrieving vocabularies?

Thanks,

Luke

joachim’s picture

The manual control section is really meant just for debugging. I expanded it hugely simply to save my sanity when setting up a new retrieval site, so I could pinpoint where things were going wrong. It's meant to be stepped through so you can check in order: the basic connection, the user account, the existence of nodes to retrieve.

The core operation of CD is meant to happen on cron. I'm not sure whether there's a way of hooking in extra operations to happen when CD does its cron run.

luketsimmons’s picture

Ah OK, yeah, I've been using the manual operation whilst in the development phase. Once we launch our set of sites the plan is obviously to run retrievals with cron, as you say.

I've added the taxonomy bit before the node retrieval, which seemed logical to me, as you would want to retrieve your vocabularies and terms before loading nodes.

That said, I do need to add the taxonomy retrieval to the cron run too. I'm sure there is a way to hook it in, I'll take a look. It would mean that leaving the new vocabulary retrieval bit in the manual section as per the patch makes sense for development/testing phases.
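As a rough idea of the ordering I have in mind for the cron run, here is a sketch in Python (illustration only, not how Content Distribution actually runs its cron). All the names are hypothetical, and stopping at the first failed step is my own assumption, on the basis that nodes shouldn't be loaded if their vocabularies and terms didn't arrive.

```python
def run_retrieval(steps):
    """Run (name, callable) steps in order; stop at the first one that fails.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, step in steps:
        if not step():
            break  # assumption: later steps depend on earlier ones
        completed.append(name)
    return completed


def retrieve_vocabularies():
    # Placeholder: would pull vocabularies and terms from the distributor.
    return True


def retrieve_nodes():
    # Placeholder: would pull the distributed nodes.
    return True


# Vocabularies and terms first, then the nodes that reference them.
CRON_STEPS = [
    ("vocabularies", retrieve_vocabularies),
    ("nodes", retrieve_nodes),
]
```

The same list of steps could then be walked one at a time from the manual control section for debugging.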

It's no biggy if the taxonomy bits don't get committed to the main release for some time as the patch should work for now for people who may want this functionality.

Thanks,

Luke