Hello,

File Framework is working great on my intranet, but it cannot extract any metadata from PDF files, such as author or title. I tried it with a PDF file that contains this information.
The problem may be related to the handler: on the handlers page (admin/settings/file/handler), the PDF line shows no associated MIME type, but it shows that PDF is handled by the file_slideshow module. On the MIME types configuration page, however, I can see the application/pdf MIME type associated with file_module.

Do you know how I could retrieve this information?

Thank you.

Comments

miglius’s picture

There should not be a MIME type next to the PDF line. So the handlers page is correct.

You have to install "pdfinfo" on your server, and it must be in the PATH of the user the web server runs as. The module checks whether it can find pdfinfo in the PATH; if it does, it executes it and extracts the PDF information from the file.
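
For illustration only, the lookup-then-extract behaviour described above can be sketched in Python. This is not the module's own code (the module is a Drupal module written in PHP); the function names parse_pdfinfo_output and extract_pdf_metadata are hypothetical. The sketch assumes only that pdfinfo prints one "Key: value" pair per line, which is its usual output format.

```python
import shutil
import subprocess


def parse_pdfinfo_output(text):
    """Turn pdfinfo's "Key: value" lines into a dict of strings."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as timestamps stay intact.
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that contain no colon
            info[key.strip()] = value.strip()
    return info


def extract_pdf_metadata(pdf_path):
    """Return the metadata of pdf_path, or None if pdfinfo is not on the PATH."""
    pdfinfo = shutil.which("pdfinfo")  # searches the current user's PATH
    if pdfinfo is None:
        return None
    result = subprocess.run(
        [pdfinfo, pdf_path], capture_output=True, text=True, check=True
    )
    return parse_pdfinfo_output(result.stdout)
```

If extract_pdf_metadata returns None when run as the web server user, pdfinfo is not visible in that user's PATH, which would match the symptom reported here.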

Arto’s picture

Issue tags: +PDF
miglius’s picture

Status: Active » Postponed (maintainer needs more info)

Have you installed 'pdfinfo' on your server, and do you still have this issue?

eternaluxe’s picture

Hi. I'm having the same issue.
www-data is the user
$PATH is /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
pdfinfo is located at: /usr/bin/pdfinfo

miglius’s picture

Try running the pdfinfo command manually on the server against the file that is not working for you. Is the metadata extracted then?
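
As a rough aid for this kind of manual check, the sketch below tests whether an executable named pdfinfo can be found in a given PATH string, using the PATH reported for www-data earlier in this thread. The helper name find_on_path is hypothetical. The check runs as whichever user starts the script, so it does not prove that www-data itself is allowed to execute the binary.

```python
import shutil

# PATH as reported for the www-data user in this thread.
WEB_SERVER_PATH = (
    "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games"
)


def find_on_path(command, search_path):
    """Return the full path of command if it is executable in search_path, else None."""
    return shutil.which(command, path=search_path)


if __name__ == "__main__":
    location = find_on_path("pdfinfo", WEB_SERVER_PATH)
    print(location or "pdfinfo not found on the web server PATH")
```

If the binary is found but the module still extracts nothing, the next thing to look at is the output of pdfinfo for the specific file.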

Veggieryan’s picture

For those on CentOS 5:

"xpdf is gone from CentOS 5. Install poppler:

yum install poppler poppler-utils

as root.

Poppler, a PDF rendering library, it's a fork of the xpdf PDF..."

johanneshahn’s picture

Status: Postponed (maintainer needs more info) » Closed (cannot reproduce)

Please try the latest stable release.