We just launched a PDF to Word converter at pdftoword.pro

Since professional PDF to Word converters are extremely complex and costly, we decided to build one ourselves from open-source tools.

Don't expect a 1:1 copy of your PDF in Word format, but it will give you everything you need to make one with a little bit of work.

There are two options available:

  • Text & image extraction
  • OCR (text recognition)

Normally you will use text & image extraction. It tries to maintain the text layout, but fonts and font sizes are not extracted,
so you have to do a little bit of work to rebuild the document from the extracted text and images.
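
For readers who want to try this step locally, here is a minimal sketch of how it could be driven with the pdftotext and pdfimages tools credited below. It is written in Python for illustration only (the site itself runs on Drupal), and the function names and the output file names are assumptions, not our actual code. It assumes the Xpdf tools are on your PATH.

```python
import subprocess
from pathlib import Path


def text_command(pdf, txt):
    # -layout asks pdftotext to keep the physical text layout as well as it can;
    # fonts and font sizes are not part of the plain-text output.
    return ["pdftotext", "-layout", str(pdf), str(txt)]


def image_command(pdf, image_root):
    # pdfimages writes numbered files that start with image_root;
    # -j saves embedded JPEG (DCT) images as .jpg instead of converting them.
    return ["pdfimages", "-j", str(pdf), str(image_root)]


def extract(pdf, out_dir):
    # Hypothetical helper: runs both tools and returns the path of the text file.
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    txt = out_dir / "text.txt"
    subprocess.run(text_command(pdf, txt), check=True)
    subprocess.run(image_command(pdf, out_dir / "img"), check=True)
    return txt
```

Calling extract("report.pdf", "out") would leave the text in out/text.txt and the images next to it, ready to be reassembled by hand.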

OCR is the option to use when your PDF is a scanned document, since it recognizes the text in the page images.
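
The OCR route can be sketched in the same way: render each page to an image, then pass each image to Tesseract. This is an illustrative Python sketch, not our implementation. The 300 DPI setting, the "eng" language code, the file name pattern, and the function names are all assumptions. It assumes ImageMagick (with Ghostscript for PDF input) and Tesseract are installed. On ImageMagick 7 the command is usually `magick` rather than `convert`.

```python
import subprocess
from pathlib import Path


def rasterize_command(pdf, pattern, dpi=300):
    # -density before the input sets the rendering resolution for the PDF;
    # a pattern such as page-%03d.png yields one numbered image per page.
    return ["convert", "-density", str(dpi), str(pdf), str(pattern)]


def ocr_command(image, output_base, lang="eng"):
    # tesseract appends .txt to output_base; -l selects the language data.
    return ["tesseract", str(image), str(output_base), "-l", lang]


def ocr_pdf(pdf, out_dir, lang="eng"):
    # Hypothetical helper: returns the recognized text of all pages, in order.
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(rasterize_command(pdf, out_dir / "page-%03d.png"), check=True)
    pages = []
    for image in sorted(out_dir.glob("page-*.png")):
        base = image.with_suffix("")
        subprocess.run(ocr_command(image, base, lang), check=True)
        pages.append(base.with_suffix(".txt").read_text(encoding="utf-8"))
    return "\n".join(pages)
```

A higher DPI generally helps recognition on small print, at the cost of slower processing.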

Credits go to:
- Drupal 7 for the excellent, extremely modular platform.
- Conditional fields module for a really nice node form interface experience.
- PDF to Image module to deliver the images to the OCR processor.
- Tesseract OCR, including some language packs, maintained by Google.
- pdftotext, from the Xpdf package, for text extraction.
- pdfimages, also from the Xpdf package, for image extraction.
- ImageMagick for image conversion and scaling.
- JODConverter, the Java OpenDocument converter, for .txt to .doc conversion.
- Drupal Business theme for a nice and tidy front-end.

With all those projects we were able to build a free online tool to create Word documents out of PDF files!