Hi,
if 'Tagging' is used in combination with the 'Extractor' module to automatically suggest tags, certain (possibly important) keywords/tags are not recognized. I think this is because stemming (?) is not supported. Example: a vocabulary contains the term "Book" and the text contains the string "Books", so the existing term "Book" is not suggested. Another example of stemming: "engineering", "engineered", and "engineer" would all be reduced to the stemmed form "engineer".
This is not trivial to solve, and IMHO it is language dependent (English needs a different stemmer than German). A well-known approach to stemming used by full-text indexers is the Porter algorithm. Disclaimer: I'm using this with terms in German, so English might be better supported.
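To illustrate the idea, here is a minimal sketch in Python of matching vocabulary terms against text by comparing stems. It is not code from Tagging or Extractor, and it is not the Porter algorithm: `naive_stem`, `suggest_terms`, and the suffix list are hypothetical names and rules made up for this example, and they only cover a few English suffixes. German would need different rules.

```python
# Naive suffix-stripping stemmer: a simplified stand-in, NOT the Porter algorithm.
# The suffix list is a hypothetical, English-only assumption for illustration.
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word):
    """Lowercase the word and strip the first matching suffix, if any."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def suggest_terms(vocabulary, text):
    """Return the vocabulary terms whose stem matches the stem of a word in text."""
    text_stems = {naive_stem(w.strip('.,;:!?"()')) for w in text.split()}
    return [term for term in vocabulary if naive_stem(term) in text_stems]


print(suggest_terms(["Book", "Engineer", "Car"], "She engineered three books."))
```

With this approach the existing term "Book" would be suggested for a text containing "Books", because both reduce to the same stem. Multi-word terms and irregular forms are not handled by this sketch; a real stemmer for the target language would be needed for that.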
Greetings, -asb
Comments
Comment #1
eugenmayer commented
Well, this is more an issue / feature request for Extractor, not for Tagging. Tagging itself only implements the suggestions API; other modules can easily implement it.
You can even alter the results of Extractor and add keywords using a stemming algorithm, or add features to Extractor.
Comment #2
asb commented
Moving this feature request into the 'Extractor' issue queue.