Background

First, to show that I've done some research, I've read through this:

http://drupal.org/node/266511

I've also searched the forums for previous entries.

My employer has given me the opportunity to create an online dictionary for the Sami language. My data are a collection of audio files (one file per letter in the Sami alphabet) and a list of all the words read, with their word class and translation into Norwegian. I've split those files and created the mappings between the words and the files where the words are read.

Now I need to concentrate on importing these into Drupal. The total number of words in the wordlist is 35 000 entries, give or take, and I will probably be asked to expand the translation to English, Finnish, Swedish and Russian as well.
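To make the import step concrete, here is a minimal sketch of how the word list and the word-to-audio mappings could be joined into one import-ready CSV before anything touches Drupal. It is written in Python only for illustration. The column names (word, word_class, translation_nb, audio_file) and the function names are my own assumptions, not something any Drupal import module requires.

```python
import csv
import io


def build_import_rows(wordlist, audio_map):
    """Join word entries with the audio clip where each word is read.

    wordlist:  iterable of (word, word_class, norwegian) tuples.
    audio_map: dict mapping word -> path of its audio clip.
    Returns (rows, missing), where missing lists words without a clip.
    """
    rows, missing = [], []
    for word, word_class, norwegian in wordlist:
        clip = audio_map.get(word)
        if clip is None:
            # Keep the entry, but remember that its audio is missing.
            missing.append(word)
        rows.append({
            "word": word,
            "word_class": word_class,
            "translation_nb": norwegian,  # hypothetical column name
            "audio_file": clip or "",
        })
    return rows, missing


def rows_to_csv(rows):
    """Serialize the joined rows to CSV text with a header line."""
    fieldnames = ["word", "word_class", "translation_nb", "audio_file"]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Keeping the translation in its own column should make it straightforward to add further columns later if English, Finnish, Swedish and Russian are requested, and the list of words without audio gives a quick sanity check before importing roughly 35 000 entries.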

I've also been playing with WordNet and have read through http://groups.drupal.org/node/8516 and http://drupal.org/node/299599. With the Taxonomy Manager module and the Term Relation Types module, the Taxonomy module can be used to create the WordNet linkings easily.

I also found PHP code to extract info from the WordNet lex files, but I don't know whether it only returns the data for one term or whether it can be used to scan the whole system and import it.

In actuality I have everything I need to write an import script and just try it, yet I submit my doubts to this group for consideration ;)

Questions

From the comparison of G2 with Glossary, it seems G2 is better suited for large dictionaries, right?

Of course Drupal is no match for the lex files when it comes to WordNet, but in real life, working off a web server, I believe the difference between looking up each term directly and using them as nodes is negligible, and the nodes will have extra features (like linking to the sound files, editing by an editor, participation in workflows, easy translation etc).

Will taxonomy be able to handle all the relations? I am not sure how many relations there will be, but I believe I will have to use a module that automatically creates a term for each word (I can't remember its name) and then insert the relationships into the Term Relation Types (trt) module. There will be a lot of relationships: I believe each word will have at least one, so the total will probably be around the number of lemmas in the dataset (1 504 077).
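To get a feel for the numbers before loading anything into taxonomy, the relations could first be handled as plain (source, relation type, target) triples and checked against the list of terms. The sketch below is Python for illustration only. The triple layout and the function names are assumptions of mine and do not reflect how the Term Relation Types module stores its data.

```python
from collections import Counter


def validate_relations(terms, relations):
    """Separate relations whose two endpoints exist from dangling ones.

    terms:     iterable of known term names.
    relations: iterable of (source, relation_type, target) triples.
    Returns (valid, dangling) as two lists of triples.
    """
    known = set(terms)
    valid, dangling = [], []
    for source, rel_type, target in relations:
        if source in known and target in known:
            valid.append((source, rel_type, target))
        else:
            # At least one endpoint has no term yet, so it cannot be linked.
            dangling.append((source, rel_type, target))
    return valid, dangling


def count_by_type(relations):
    """Count how many relations there are of each relation type."""
    return Counter(rel_type for _, rel_type, _ in relations)
```

Running something like this over the full dataset would give the real relation count per type up front, which seems more useful for judging whether taxonomy can cope than my guess above.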

From my experience with Drupal this is of course doable, but I want to cast a net and see if I catch anyone who has done this before.

Anyway, the deadline for my project is Sunday this week. ;)

Paul