Is anyone using a Word doc to HTML converter? I've got about 7,500 documents I want to convert to book pages! Something that runs on Linux would be ideal, since I think scripting the conversion would be easier there, e.g. in Perl.

Since I want to put these into Drupal book pages, it would also be ideal to be able to limit the tags used during the conversion process.

Thanks in advance!

Comments

subspaceeddy’s picture

I used to use wvware (http://sourceforge.net/projects/wvware/) for this purpose a number of years ago but haven't looked at it for ages. It runs on Linux and worked fine for me. It converts into a number of different formats, including HTML, and it should be possible to write something server-side that filters uploads through this, takes the text and creates a book node. I notice their latest release was Oct 2006 so it's still being developed.
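
For a one-off batch of thousands of files, a small script around wvware's `wvHtml` command is probably enough. The sketch below is untested against a real install and assumes the invocation `wvHtml <input.doc> <output.html>`; the function names and the folder layout are made up for illustration.

```python
import subprocess
from pathlib import Path


def plan_conversions(src_dir):
    """Pair every .doc file under src_dir with an output .html file name."""
    src = Path(src_dir)
    jobs = []
    for doc in sorted(src.rglob("*.doc")):
        # Fold sub-folder names into the file name so outputs do not collide.
        rel = doc.relative_to(src).with_suffix(".html")
        jobs.append((doc.resolve(), "_".join(rel.parts)))
    return jobs


def convert_all(src_dir, out_dir):
    """Run wvHtml once per document and return the inputs that failed."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    failed = []
    for doc, html_name in plan_conversions(src_dir):
        # Assumed invocation: wvHtml <input.doc> <output.html>
        result = subprocess.run(["wvHtml", str(doc), html_name], cwd=out)
        if result.returncode != 0:
            failed.append(doc)
    return failed
```

Calling `convert_all("word_docs", "html_out")` (both folder names are placeholders) would leave one HTML file per document in `html_out` and hand back a list of anything that needs a second look.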

I just had a quick look around SourceForge and noticed there seem to be a few projects now. This one looks interesting: http://sourceforge.net/projects/wordhtml/

Whatever the outcome, please keep us posted since I'm sure there are loads of interested parties here...

nishitdas’s picture

bookmarked

gurukripa’s picture

I would like to know how feasible this is.
Later I would like to import these books as Drupal book pages. Tell me if it's working, and how?

taqwa’s picture

I don't know about wvware, but ABC Amber makes some decent software for these types of conversions. You can get the trial version for 30 days, which has full functionality. This might not be the best way, but I usually convert my files to TXT, then use the "find and replace" feature in Dreamweaver to cut out the parts I don't need and create the proper tags. Finally I use HTML2Book to create the book pages.
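
If the converter emits HTML rather than TXT, the "cut out the parts I don't need" step can also be scripted. Here is a rough sketch using only Python's standard `html.parser`; the allowed tag list is a guess at what a book page needs, not anything Drupal prescribes, and it is not a security filter (Drupal's own input formats should still be applied on import).

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist; adjust to whatever the book pages should allow.
ALLOWED = {"p", "h2", "h3", "ul", "ol", "li", "strong", "em", "a"}
KEEP_ATTRS = {"a": {"href"}}        # every other attribute is dropped
DROP_CONTENT = {"style", "script"}  # these lose their text as well


class TagFilter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip = 0  # nesting depth inside <style>/<script>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip += 1
        elif tag in ALLOWED and not self.skip:
            keep = KEEP_ATTRS.get(tag, set())
            attr_text = "".join(
                f' {k}="{escape(v or "")}"' for k, v in attrs if k in keep
            )
            self.out.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip = max(0, self.skip - 1)
        elif tag in ALLOWED and not self.skip:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text inside disallowed tags is kept; only the tags themselves go.
        if not self.skip:
            self.out.append(escape(data, quote=False))


def limit_tags(html_text):
    """Return html_text with only the whitelisted tags left in place."""
    parser = TagFilter()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Word-style markup such as `<p class="MsoNormal"><span style="x">Hi</span></p>` should come out as a bare `<p>Hi</p>`.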

gurukripa’s picture

Thanks for replying.

My current problem is this: I have a soft copy of a book. It's a few thousand pages and it's in HTML. It's really 18 books merged into one, so there are 18 folders, each with hundreds of HTML pages.

Each page has forward and back links to the related pages, plus links to the index, the main page, etc.

As it stands, it would work fine uploaded as a mini website.

I would like to bring this in as Drupal nodes with all the features: meta tags, search, taxonomy, etc., and also commenting.

How can I do this?

Please help with ideas. If you have done something like this before, that would give me more confidence. Thanks.

taqwa’s picture

Check out this module:
http://drupal.org/project/import_html

I haven't used it but it looks like it might do the job.
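
Before running any importer it might help to take an inventory of what is there. This sketch has nothing to do with import_html's own code; it just assumes the layout described above (one folder per book, `.html` pages inside) and pulls each page's `<title>` so you can check the order and naming first. The function and class names are invented.

```python
from html.parser import HTMLParser
from pathlib import Path


class TitleGrabber(HTMLParser):
    """Collect the text inside the page's <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def inventory(root):
    """Map each book folder to a list of (file name, page title) pairs."""
    books = {}
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        pages = []
        for page in sorted(folder.glob("*.html")):
            grabber = TitleGrabber()
            grabber.feed(page.read_text(encoding="utf-8", errors="replace"))
            grabber.close()
            # Fall back to the file name when a page has no title.
            pages.append((page.name, grabber.title.strip() or page.stem))
        books[folder.name] = pages
    return books
```

Pages are listed in file-name order here, which may not match the reading order given by the forward and back links.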