Project:Printer, e-mail and PDF versions
Version:7.x-1.x-dev
Component:User interface
Category:feature request
Priority:normal
Assigned:Unassigned
Status:postponed

Issue Summary

I would like to generate PDF versions of the stories on my site. My stories are static, and it makes more sense to generate a PDF file from a story once and save it on the server than it does to generate it on-the-fly each time the file is requested. Are the developers of this module planning to provide an option that allows for the saving of generated PDF files on the server, rather than having them streamed to the user?

My apologies if this has been covered elsewhere or is too far beyond the scope of this module.

Comments

#1

Title:Option to save PDF file to server?» Cache PDF file in server
Status:active» postponed

No. What you're referring to would mean developing some kind of 'cache' of PDF files. This could be done, but it would require tracking the revision date of the node and the generation date of the latest PDF, in order to invalidate the 'cached' copy if the node has changed since the PDF was last generated.

This would save some seconds of PDF generation time, but would waste a lot of space in the hard disk. Since in most situations the only limit is the hard disk space, this feature is probably not useful to the vast majority.
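
A minimal sketch of the invalidation check described above, in plain PHP. The function name and parameters are hypothetical and not part of the Print module. It assumes the node's last-changed time is available as a Unix timestamp (in Drupal 7 this would typically be $node->changed) and that the stored file's modification time marks when the PDF was generated.

```php
<?php
/**
 * Returns TRUE if a stored PDF can be served instead of regenerating it.
 * Hypothetical helper for illustration, not part of the Print module.
 *
 * @param string $pdf_path     Path of the stored PDF file.
 * @param int    $node_changed Unix timestamp of the node's last change.
 */
function pdf_cache_is_fresh($pdf_path, $node_changed) {
  if (!is_file($pdf_path)) {
    return FALSE;
  }
  // Drop PHP's cached stat data so a just-rewritten file is seen correctly.
  clearstatcache(TRUE, $pdf_path);
  $generated = filemtime($pdf_path);
  // A PDF generated in the same second as the edit is treated as stale, to be safe.
  return $generated !== FALSE && $generated > $node_changed;
}
```

Using the file's own modification time avoids a separate table of generation dates, at the cost of trusting the filesystem clock.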

#2

I run a fairly high traffic site. I am concerned that potentially hundreds of users trying to generate PDFs at once would significantly impact the performance and reliability of my site. I am interested in maintaining a repository of PDFs anyway, so the disk space is not an issue for me.

What I'm thinking now is that I would have the saving of a node trigger the generation of the PDF, which would then be saved on the server. This seems different enough from "Printer, e-mail and PDF versions" to warrant a separate, albeit fairly simple module.

Thanks for the feedback.

#3

Makes a lot of sense to me, Super G. Not a 'cache', but an archive of snapshots, in PDF, in parallel to the DB contents generating the nodes, which is dynamic and inherently lacks snapshots in the typical hosting MySQL situation. I agree with you: disk space is cheap and responsive relative to server CPU in the typical shared hosting environment, particularly with Drupal, which is so DB query intensive.

#4

I think that generating a PDF of all nodes would both slow down the site too much and take up too much space. Unless, of course, your users would really like to download all nodes as PDF.

I see these options:
Option 1: Saving all nodes as PDF
- Pros:
Faster for all users
- Cons:
Slower site as each node editing action forces a PDF generation
Huge disk space waste as PDFs are stored for all nodes, including nodes not accessed in several weeks/months

Option 2: PDF 'cache'
- Pros:
Faster for most users
Smaller disk space usage
Ability to configure cache size to leave only the top accessed PDFs
- Cons:
No speed-up for the first user to ask for the PDF of a node modified recently
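
A rough sketch of how the "configure cache size to leave only the top accessed PDFs" idea from Option 2 could work, in plain PHP. The function name, the array layout and the idea of counting hits per file are assumptions made for illustration; nothing like this exists in the module.

```php
<?php
/**
 * Picks which cached PDFs to delete so the cache fits a size limit,
 * keeping the most-accessed files. Hypothetical helper for illustration.
 *
 * @param array $entries   Map of file name => array('bytes' => int, 'hits' => int).
 * @param int   $max_bytes Maximum total size of the cache, in bytes.
 * @return array File names to delete, least-accessed first.
 */
function pdf_cache_select_evictions(array $entries, $max_bytes) {
  $total = 0;
  foreach ($entries as $entry) {
    $total += $entry['bytes'];
  }
  // Sort by access count, ascending, so the least popular PDFs go first.
  uasort($entries, function ($a, $b) {
    return $a['hits'] - $b['hits'];
  });
  $evict = array();
  foreach ($entries as $name => $entry) {
    if ($total <= $max_bytes) {
      break;
    }
    $evict[] = $name;
    $total -= $entry['bytes'];
  }
  return $evict;
}
```

The caller would then delete the returned files, for example during a cron run.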

#5

Well, every site is different. For me, server performance and reliability are paramount, and I have a lot of disk space, so I'm going with the "save all nodes as PDF" approach. I wrote a small custom module that triggers a "generate PDF" action when a node is saved. Right now, the module is specific to my site, but if someone wants to help me make it worthy of being listed on drupal.org, drop me a line.

#6

You don't understand..

Your solution makes your server LESS performant. Unless, of course, you rarely create/edit nodes, and all your users are downloading several PDFs all the time. Think about it, and you'll probably reach the same conclusions.

#7

Hi SuperG. I am interested in this small module which triggers the generation of a PDF when the node is saved. I am currently implementing a site where the content is mostly static. It should eventually be possible to modify your trigger so that it allows an administrator to request the generation of PDFs for specific node types. If you can share your module, I will look into extending it to allow generation of content "by request" of an administrator (or another user with sufficient permissions).

#8

If the nodes are rarely edited, it is the better solution.
For me, for example, that is the case.

Why not provide this functionality as an option for experienced users?
For example, we could choose between:
1- normal mode
2- generated on creation/update of the node
3- generated on the fly the first time, and on update afterwards

It just needs to warn the administrator about what he's doing.

I don't understand your opinion, jcnventura: an administrator is not a baby.

#9

Status:postponed» closed (won't fix)

No.. I'm not a baby but I'm going to act like one on this issue, because you are and you need to learn some manners.

Understand two things:

1. You're using software for FREE that is designed by me in my free time.
2. I decide what I do in my free time.

So, if you want this one, this feature now costs you 5000€ (five K euros). Normally, I'm willing to accept patches that other people do and keep on maintaining those for free. In this one case, even if that happens, I won't accept it. You can, of course, fork the module, and keep on maintaining it yourself.. It would actually create more free time for me, so I would welcome it.

João

#10

Oops, maybe it is my English, so I want to clarify.

When I said "an administrator is not a baby", it was not about you, but about the people who use "your" module.
I wanted to post a patch and I thought that you were opposed to this. That's why I said "I don't understand your opinion".

If you were hurt by what I wrote, sorry, it wasn't my intention.
I'm a developer; it was to help advance "your" project, which is used by many people on Drupal.

5000 euros lol

Peace and THX A LOT FOR YOUR WORK !!!

#11

Thanks for clearing it up.. Yes, it really annoyed me at the time, as the phrase only made sense to me as an insult.

The main reason why I don't provide such an option is not that I am fundamentally against it, as I think it would be extremely useful and increase performance in sites that use the PDF functionality. It's the lack of time to do it, and the fact no-one has done it for me. Of course, if someone did sponsor this it would probably move up in the list of stuff to do in my free time :)

It's easy to add a node API hook and have the module create a PDF file each time a node is edited/created. Doing that, however, would slow down the node creation/editing process so much that in extreme cases it may lead to PHP timeouts with unknown consequences (the worst being loss of the node contents). Also, some kind of interface must be provided to enable the user to re-create the PDFs for all the nodes when a newer/better version of the PDF tool is released. So, for this option, the best approach would be to use the Job Queue to schedule the creation of these PDFs during the next cron run (of course, if a user were to ask for the PDF before the cron execution, then he would get a delay, and the cron job would now be irrelevant).
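
The scheduling idea in the paragraph above could be sketched roughly as follows, in plain PHP. This is not the Job Queue module's API: an in-memory array stands in for the queue storage, and all function names and callbacks are hypothetical. It only illustrates two points: saving a node merely records a job, and a job is skipped at cron time if a user has already triggered the PDF in the meantime.

```php
<?php
/**
 * Schedules a node's PDF for generation. Saving the same node twice
 * keeps a single job, because jobs are keyed by node ID.
 */
function pdf_queue_add(array &$queue, $nid) {
  $queue[$nid] = TRUE;
}

/**
 * Processes the queued jobs, as a cron run would.
 *
 * @param array    $queue        Pending jobs, keyed by node ID.
 * @param callable $already_done Returns TRUE if an up-to-date PDF exists for the node ID.
 * @param callable $generate     Creates and stores the PDF for the node ID.
 * @return array Node IDs whose PDFs were generated in this run.
 */
function pdf_queue_run(array &$queue, $already_done, $generate) {
  $generated = array();
  foreach (array_keys($queue) as $nid) {
    // A user may have requested the PDF before cron ran, making the job irrelevant.
    if (!call_user_func($already_done, $nid)) {
      call_user_func($generate, $nid);
      $generated[] = $nid;
    }
    unset($queue[$nid]);
  }
  return $generated;
}
```

Because the slow PDF generation happens in pdf_queue_run() rather than in the node save itself, a timeout there cannot lose the node contents.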

As this could eventually lead to several gigabytes of wasted space on the server, I would prefer the cache option, whereby the first user to access a PDF would get the delay, but all the others would just download the cached copy... This would allow the PDF 'store' to be configurable, and all the commonly used PDFs would be instantly available. The problem with this is that handling a file cache is something that would probably benefit from a third-party module, but the one available (http://drupal.org/project/fastpath_fscache) doesn't seem to be actively maintained, and is used by almost no-one.
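
As a rough illustration of the cache option just described, here is a plain PHP sketch of "look in the store first, generate only on a miss". The function name, the node-NID.pdf file naming and the $generate callback (standing in for the module's real PDF generation) are all assumptions, not existing code.

```php
<?php
/**
 * Returns the PDF for a node, generating it only on a cache miss.
 * Hypothetical helper for illustration.
 *
 * @param string   $cache_dir    Existing, writable directory holding cached PDFs.
 * @param int      $nid          Node ID.
 * @param int      $node_changed Unix timestamp of the node's last change.
 * @param callable $generate     Returns the PDF contents as a string.
 */
function pdf_cache_get($cache_dir, $nid, $node_changed, $generate) {
  $path = $cache_dir . '/node-' . (int) $nid . '.pdf';
  clearstatcache(TRUE, $path);
  if (is_file($path) && filemtime($path) > $node_changed) {
    // Cache hit: every user after the first gets the stored copy.
    return file_get_contents($path);
  }
  // Cache miss or stale copy: this user pays the generation delay.
  $pdf = call_user_func($generate);
  file_put_contents($path, $pdf, LOCK_EX);
  return $pdf;
}
```

Only PDFs that someone actually requests end up on disk, which is what keeps the store smaller than generating a file for every node.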

João

#12

Status:closed (won't fix)» postponed

#13

Subscribing. Thanks

#14

subscribe

#15

I read a couple of threads about the caching issue and all point to this thread now. Has there been any progress on the caching of PDF files? João mentioned in some of the other issues that Drupal cannot handle delivery of cached PDF files. Is that still the case or might core updates have fixed this?

I can see this as a very useful feature, but I do see the performance concerns as well. In my case I only want very few PDFs to be cached, not all of them. I created a view to pull together various pages (like a print basket) which then can be printed into one PDF file. It works fine, but the delay while the PDF is created is a concern. The link to the PDF version is sent out in an email; recipients click on it and see the PDF (instead of individual pages with a print to PDF button on it). The few seconds it takes to load might signal to them that the page is not working correctly, and they might close the tab/browser. Ideally, I would be able to cache this file to improve the load time, but I am not concerned about all the other pages that do not need cached versions and work just fine with generation on the fly.

It seems that various people were interested in variations of the caching option. Should we set a different status and see if this can be moved forward?

Thanks, J.

#16

A status change would only force me to set it back.. If you write a file caching module or tell me of an existing module that does file caching, I would be willing to use it to store the generated PDF files and to look for them there before generating them.

#17

@Super G,

Hi,

I am interested in your module that could expand print.module; "a module you wrote to save a PDF output to a file on the server instead of saving it to the visitor's computer."

Could you upload it, so we could try to add it to print.module ?

Thank you

Leon

#18

Adding these lines just after $pdf = stream_get_contents($pipes[1]); (line 386) makes it work for wkhtmltopdf.

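$pdfoutput = $pdf;
// Relative path, resolved against the current working directory (normally the Drupal root).
// The print/ folder must already exist and be writable by the web server.
$filepath = "sites/default/files/print/" . $filename . ".pdf";
$fp = fopen($filepath, "w");
// Only write if the file could actually be opened.
if ($fp !== FALSE) {
  fwrite($fp, $pdfoutput);
  fclose($fp);
}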

#19

Version:6.x-1.7» 7.x-1.x-dev

@chipway-drupal: thank you. In D7, I changed your lines (in print_pdf.pages.inc) as follows:

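global $user;
$pdfoutput = $pdf;
// Per-user folder under the public files directory; it must already exist.
$filepath = "public://fileviewer/" . $user->uid . "/" . $filename;
// file_unmanaged_save_data() opens, writes and closes the file itself,
// so the separate fopen()/fclose() calls are not needed.
file_unmanaged_save_data($pdfoutput, $filepath, FILE_EXISTS_REPLACE);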

so that PDFs are saved to specific folders, which can be deleted when the user is deleted (Rules can create such folders upon the first login of a new user, and then delete them upon user deletion; at least this is the way I'm doing it, but please point me to a more functional method if this one is naive).

PS. If anybody is interested... I'm writing the generated PDFs to these /sites/default/files/fileviewer/[user-ID]/ folders because I'm using Fileviewer to have PDFs rendered in an html5 PDF viewer - instead of having them immediately available for direct download, or to screen. To explain further: I imagined that the PDF version of a node has to be stored and rewritten both by Print and Fileviewer in a "shared" folder (the mentioned /sites/default/files/fileviewer/[user-ID]/ folder), so that Fileviewer is able to always load the right PDF (there is always only one available) and convert it to PNGs, for direct HTML5 rendering. This way, no third party software i.e. Adobe Acrobat or Flash is needed to access a beautifully readable PDF document made in Drupal.

At the moment I am using File Field Sources to manually load the stored PDF to be rendered in the PDF viewer (we are always talking about a single PDF per node) at node save... but the idea is to have Rules do this job at node save (I don't know if this is possible... it should be a matter of controlling field contents at node save, but please point me in the right direction if you can).

And I would like to say hello to jcnventura: many thanks for your time!

#20

Hi miro marion,
Could you tell if this could be applied to D6?
Thanks!
Rosamunda

#21

Oh, I didn't realize that #18 seems to be applicable to 6.x :)

But I don't get one thing: shouldn't those lines be added to the .module file?
And how can I tell which node types this applies to?
Thanks!

#22

Hi again,
I've tried modifying print_pdf.pages.inc according to #18 and nothing happens. No PDF is created. (I've run cron and still nothing.)

#23

This is what I have now:

  if (is_resource($process)) {
    fwrite($pipes[0], $html);
    fclose($pipes[0]);

    $pdf = stream_get_contents($pipes[1]); // <==== LINE 354
    $pdfoutput = $pdf; // <==== NEW LINE ADDED
    $filepath = "sites/default/files/print/" . $filename . ".pdf"; // <==== NEW LINE ADDED
    $fp = fopen($filepath, "w"); // <==== NEW LINE ADDED
    fwrite($fp, $pdfoutput); // <==== NEW LINE ADDED
    fclose($fp); // <==== NEW LINE ADDED
    fclose($pipes[1]);

I've installed (and I'm using to generate the PDFs): wkhtmltopdf-0.9.9-static-i386.tar
I'm using the 6.x-1.12 version of Print.

#24

Now I've updated Print to the latest stable 6.x version, and it still won't do anything.

#25

bump?