BOUNTY $40 - count number of pages of uploaded PDF file [SOLVED]

By odyseg on 11 May 2012 at 10:36 UTC

PROBLEM:

When you click the 'Upload' button to upload a PDF file to a node of type 'copyzol', the site needs to count the number of pages in that PDF file. Bounty of $40. We really need this ASAP. Skype me at teodyseguin.

Thanks.

Comments

nice exercise

dman commented 11 May 2012 at 11:17

FYI, http://drupal.org/project/pdf_to_imagefield has code you can scour to find an answer to this.
There are two classes of PDF. In one, the page count can be scraped directly from the embedded metadata by string-scanning. In others (I think older encodings, or output from different encoders), the only reliable count comes from rendering each page and counting the results. That method requires you to have the tcpdf application installed on the server, which means you must have good control over the host, and it may take some liaison.

I'll leave the bounty for someone who wants to experiment with this as an easy job, but if you do get it sorted in a clean way - come join us as a maintainer of http://drupal.org/project/pdf_to_imagefield

_{.dan. is the New Zealand Drupal Developer working on Government Web Standards}

Drave Robber commented 11 May 2012 at 11:31

To my knowledge, by far the simplest (and not too resource-consuming, at least compared to sifting through the file with preg_match()) method is this. Requires ImageMagick.

(I'm leaving it open, too.)

criznach commented 11 May 2012 at 15:08

If you don't have decent hosting this may not be easy. As dman said, there are many different types of PDFs, and to accept any arbitrary file will require a robust solution. I see a few options...

TCPDF + FPDI - only supports up to PDF 1.4
TCPDF + FPDI PDF Parser - should work with 1.5+, but requires a 100 euro license and good hosting.
PDF to imagefield - requires ImageMagick - not all hosts support it.
ImageMagick - utilizes Ghostscript - could just use Ghostscript - again, not all hosts support it.
preg_match search - may be resource intensive, but any solution is going to load the file. Not sure if this will work with 1.5+
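
The 1.4 versus 1.5+ split in the list above can be checked before picking a strategy, because a PDF file starts with a `%PDF-x.y` header. Here is a minimal sketch, assuming the file is readable locally. The function names `pdf_header_version()` and `pdf_may_need_rendering()` are hypothetical and not part of any module mentioned in this thread. PDF 1.5 introduced compressed object streams, which is one reason plain string scanning can miss the page tree in newer files.

```php
<?php
// Hypothetical helper: returns the version from the "%PDF-x.y" header,
// or NULL if the file cannot be read or has no such header.
function pdf_header_version($filepath) {
    $fp = @fopen($filepath, 'rb');
    if (!$fp) {
        return NULL;
    }
    // The header sits at the start of the file, so 1 KB is plenty.
    $head = fread($fp, 1024);
    fclose($fp);

    if (preg_match('/%PDF-(\d+\.\d+)/', (string) $head, $m)) {
        return $m[1];
    }
    return NULL;
}

// Hypothetical helper: files newer than 1.4 may keep the page tree in
// compressed object streams, so treat them (and unreadable headers) as
// candidates for the slower rendering-based count.
function pdf_may_need_rendering($filepath) {
    $version = pdf_header_version($filepath);
    return $version === NULL || version_compare($version, '1.4', '>');
}
```

A caller could try the string scan first when `pdf_may_need_rendering()` returns FALSE and go straight to ImageMagick or Ghostscript otherwise.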

http://www.trailheadinteractive.com

dman commented 11 May 2012 at 21:58

Here's the code that uses ImageMagick's identify:
http://drupalcode.org/project/pdf_to_imagefield.git/blob/refs/heads/6.x-...
Caveats about installing Ghostscript etc. are on the project page.
But that's the slow version.
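
For reference, the `identify` approach boils down to shelling out and counting the lines that come back, since `identify` prints one line per page of a PDF. This is only a sketch, not the code from pdf_to_imagefield. The function name `pdf_page_count_identify()` is hypothetical, and it assumes `identify` is on the PATH, Ghostscript is installed so ImageMagick can read PDFs, and PHP is allowed to call `exec()`.

```php
<?php
// Hypothetical helper: returns the page count reported by ImageMagick's
// identify command, or 0 if the file is missing or the command fails.
function pdf_page_count_identify($filepath) {
    if (!is_file($filepath) || !function_exists('exec')) {
        return 0;
    }
    // escapeshellarg() guards against spaces and shell metacharacters
    // in uploaded file names.
    $command = 'identify ' . escapeshellarg($filepath);
    $output = array();
    $status = 1;
    exec($command, $output, $status);
    if ($status !== 0) {
        return 0;
    }
    // identify prints one line per page; ignore any empty lines.
    return count(array_filter($output, 'strlen'));
}
```

Returning 0 on failure lets the caller fall back to another method instead of trusting a bad count.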

I think the faster but more haphazard string approach is here (a grep, not a preg match):
http://drupalcode.org/project/pdf_to_imagefield.git/blob/refs/heads/7.x-...
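
The string approach can be sketched roughly as follows: count the `/Type /Page` objects in the raw bytes, much as `grep -c` would. This is an illustration rather than the code behind the link above. The function name `pdf_page_count_scan()` is hypothetical, and the scan only works when the page objects are stored uncompressed, which is often not the case for PDF 1.5+ files.

```php
<?php
// Hypothetical helper: counts "/Type /Page" entries in raw PDF data.
// The lookahead rejects "/Type /Pages", which marks page-tree nodes
// rather than individual pages.
function pdf_page_count_scan($data) {
    $matches = array();
    return (int) preg_match_all('/\/Type\s*\/Page(?![A-Za-z])/', $data, $matches);
}
```

It would be called as `pdf_page_count_scan(file_get_contents($filepath))`. A result of 0 means the scan found nothing and a rendering-based count is needed.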

odyseg commented 13 May 2012 at 08:36

Thanks for all the replies and suggestions. I have found a developer who solved this :)

criznach commented 13 May 2012 at 15:10

Care to share?

Sure

odyseg commented 15 May 2012 at 10:56

Here is the function we used to count the number of pages when you upload a PDF file:

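function copyzol_pdf_count_get_number_of_pages($filepath) {
    // Resolve the path relative to the current working directory.
    $filepath = realpath("./$filepath");
    $max = 0;

    // Fast path: scan the raw file for "/Count N" entries and keep the
    // largest value found.
    $fp = $filepath ? @fopen($filepath, 'r') : FALSE;
    if ($fp) {
        while (($line = fgets($fp, 255)) !== FALSE) {
            if (preg_match_all('/\/Count\s+([0-9]+)/', $line, $matches)) {
                $max = max($max, max(array_map('intval', $matches[1])));
            }
        }
        fclose($fp);
    }

    // Fallback: no count was found, so let the Imagick extension load
    // the file and report the number of pages.
    if ($max == 0 && $filepath && class_exists('Imagick')) {
        try {
            $im = new Imagick($filepath);
            $max = $im->getNumberImages();
        }
        catch (Exception $e) {
            // Imagick could not read the file; fall through to the default.
        }
    }

    // Assume at least one page if neither method produced a count.
    if ($max == 0) {
        $max = 1;
    }

    return $max;
}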

We have ImageMagick installed on our server to make this work.

iwant2fly commented 27 September 2012 at 02:43

Thank you. This code helped out greatly in a module we had developed.
