Hi,

If 'Tagging' is used in combination with the 'Extractor' module to automatically suggest tags, certain (possibly important) keywords/tags are not recognized. I think this is because stemming is not supported. Example: a vocabulary contains the term "Book" and the text contains the string "Books", so the existing term "Book" is not suggested. Another example of stemming: "engineering", "engineered", and "engineer" would all be reduced to the stemmed form "engineer".

This is not trivial to solve, and IMHO it is language dependent (English needs a different stemmer than German). A well-known approach to stemming used by full-text indexers is the Porter algorithm. Disclaimer: I'm using this with terms in German, so English might be better supported.
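
To illustrate the idea, here is a minimal sketch in Python of how stem-based matching could suggest "Book" for a text containing "Books". It is not the Porter algorithm and not Extractor's actual code: the suffix list and the names naive_stem and suggest_terms are made up for this example, and it only handles a few simple English endings.

```python
import re

# Hypothetical, deliberately naive suffix list for English.
# This is NOT the Porter algorithm, only an illustration of the idea.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Lowercase the word and strip at most one common English suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words such as "is" survive.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def suggest_terms(vocabulary, text):
    """Return vocabulary terms whose stem equals the stem of a word in the text."""
    # Split the text into runs of letters (Unicode-aware, so umlauts are kept).
    text_stems = {naive_stem(w) for w in re.findall(r"[^\W\d_]+", text)}
    return [term for term in vocabulary if naive_stem(term) in text_stems]


suggestions = suggest_terms(
    ["Book", "Engineer", "Garden"],
    "Books about engineering and engineered parts",
)
print(suggestions)  # ['Book', 'Engineer']
```

A real implementation would swap naive_stem for a proper stemmer chosen per language, since a suffix list like this one will not work for German terms.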

Greetings, -asb

Comments

eugenmayer’s picture

Project: Extractor » Tagging
Version: 6.x-1.0-alpha1 » 6.x-1.x-dev
Component: Code » Backend

Well, this is more an issue / feature request for Extractor, not for Tagging. Tagging itself only implements the suggestions API; other modules should be able to implement it easily.

You can even alter the results of Extractor and add keywords using a stemming algorithm, or add features to Extractor itself.

asb’s picture

Project: Tagging » Extractor
Version: 6.x-1.x-dev » 6.x-1.0-alpha1
Component: Backend » Code

Moving this feature request into the 'Extractor' issue queue.