Closed (won't fix)
Project:
ImageCache
Version:
6.x-2.x-dev
Component:
Code
Priority:
Normal
Category:
Feature request
Assigned:
Unassigned
Reporter:
Created:
29 Jan 2009 at 21:30 UTC
Updated:
24 Sep 2012 at 14:25 UTC
Comments
Comment #1
duntuk commented
Comment #2
egfrith commented
I'm interested in this issue, as I'm looking into writing some code to create a preview of PDF documents uploaded in a filefield, and to make these previews available to Views.
The conversion would require imagemagick to be installed, as gd can't convert pdf to jpg.
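As a rough illustration of the ImageMagick step, here is a minimal shell sketch. It assumes the `convert` binary is on the PATH and that ImageMagick has a Ghostscript delegate for reading PDFs; the function name `pdf_first_page` and the default width of 200 pixels are hypothetical, not anything from imagecache or imageapi.

```shell
#!/bin/sh
# Hypothetical helper: render the first page of a PDF to a raster image.
# Usage: pdf_first_page SRC.pdf DEST.jpg [WIDTH]
pdf_first_page() {
  src=$1
  dest=$2
  width=${3:-200}   # assumed default thumbnail width in pixels

  [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
  command -v convert >/dev/null 2>&1 || {
    echo "ImageMagick 'convert' not found" >&2
    return 2
  }

  # "[0]" selects the first page; -density sets the rendering resolution
  # used when the PDF is read; -thumbnail scales the result to WIDTH.
  convert -density 150 "${src}[0]" -thumbnail "${width}x" "$dest"
}
```

ImageMagick picks the output format from the destination extension, so `pdf_first_page report.pdf report.png` would write a PNG instead of a JPEG.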
Comment #3
egfrith commented
Hmmm... looks like PDF to jpg conversion isn't going to happen in imagefield module: #339266: Feature Request: Convert PDF to image. Perhaps in filefield module? Or as a contrib module? There is also the pdfstamper module, though this is not yet views enabled, and does more than I really want: #391308: Future direction of the module.
Comment #4
TyraelTLK commented
Subscribing
Comment #5
egfrith commented
To get this to work, a sequence of patches to the imageapi and imagecache modules is required:
Replace getimagesize($src) around line 412 in the imagecache module with imageapi_image_get_info($src).
Then, create an imagecache preset which contains the "Change File Format" action from imagecache_coloractions module. You can specify that the pdf (or any other type) is converted to jpeg, png or gif.
To move the work on this forward, reviews are needed of #416254: Add equivalent of image_get_info() at the toolkit level.
Comment #6
fei commented
Subscribing (thanks for the support)
Comment #7
egfrith commented
I've got this working - just. I've edited comment #5 so that it gives up-to-date instructions.
Comment #8
egfrith commented
There is now a patch for imagecache. Here are updated instructions for testing:
1. imageapi module: apply latest 6.x patch at #416254: Add equivalent of image_get_info() at the toolkit level
2. imagecache module: apply patch attached here.
3. imageapi module: apply patch at #375218: Changing file type with imagemagick
Then, create an imagecache preset which contains the "Change File Format" action from imagecache_coloractions module. You can specify that the pdf (or any other type) is converted to jpeg, png or gif.
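The patch steps above can be sketched as a small shell helper that checks each patch with a dry run before applying it. This is only a sketch: the function name `apply_in_order` is hypothetical, the patch paths are whatever files you downloaded from the linked issues, and it assumes a `patch` that supports `--dry-run` (GNU patch does).

```shell
#!/bin/sh
# Hypothetical helper: apply patches in the given order inside DIR,
# stopping at the first one that does not apply cleanly.
# Usage: apply_in_order DIR /abs/path/first.patch /abs/path/second.patch ...
# Patch paths must be absolute, because each patch is read after cd DIR.
apply_in_order() {
  dir=$1
  shift
  for p in "$@"; do
    # Check first, so a failing patch leaves the tree untouched.
    (cd "$dir" && patch -p0 --dry-run < "$p" >/dev/null) || {
      echo "would not apply cleanly: $p" >&2
      return 1
    }
    (cd "$dir" && patch -p0 < "$p") || return 1
  done
}
```

Run it from the modules directory, passing the imageapi and imagecache patches in the order listed in the steps.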
Comment #9
Alex Andrascu commented
All OK after applying the patches from steps 1 and 2 of #8, but the last one fails to apply against this version. Please review.
Comment #10
egfrith commented
I've just tested this with the latest -dev version of imageapi. The last patch applies OK for me, though with an offset:
$ patch -p0 < imageapi_375218-2.patch
patching file imageapi/imageapi_imagemagick.module
Hunk #1 succeeded at 111 (offset 9 lines).
Does it apply at all for you?
Comment #11
Alex Andrascu commented
I applied the patches in the order you describe in #8 with TortoiseSVN. Maybe that's why. Anyhow, I've applied the last one by hand and it seems to be working. Now I don't know how to use all this stuff to write a raw IM command to try a tiff->jpg conversion.
Thanks for the blitz reply :)
[Update]
I figure that we should make a cumulative patch combining
#416254: Add equivalent of image_get_info() at the toolkit level
#375218: Changing file type with imagemagick
for this to work without errors.
Comment #12
egfrith commented
Have you tried using imagecache_actions.module (as described at the end of #8)? It may not be how you want to do things in the long run, but it would confirm whether things are working. I'd be interested to know!
Re the patch, once you've confirmed things are working, perhaps it would make sense to post a new combined patch to #416254: Add equivalent of image_get_info() at the toolkit level.
Comment #13
Alex Andrascu commented
I guess we're very close now... I just lack some imagemagick skills.
[UPDATE]
Holy moly, this is working :)
It just doesn't append .jpg to the end of the file name.
It creates a jpg with the .tif extension. I wonder where the problem is.
Comment #14
egfrith commented
Great! Yes, the file has to have the original extension, otherwise imagecache will think it doesn't exist. As far as I can see, this doesn't cause problems when viewing in browsers.
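To confirm that a derivative really is a JPEG despite its .tif or .pdf name, one option is to look at the file's leading magic bytes rather than its extension. The sketch below is an illustration only; `sniff_image_type` is a hypothetical name, and it recognises just the formats discussed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: report a file's real type from its first four bytes,
# ignoring the extension.
# Usage: sniff_image_type FILE
sniff_image_type() {
  # Read the first four bytes as lowercase hex, e.g. "ffd8ffe0".
  magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
  case $magic in
    ffd8ff*)           echo jpeg ;;    # JPEG start-of-image marker
    89504e47)          echo png ;;     # "\x89PNG"
    47494638)          echo gif ;;     # "GIF8"
    25504446)          echo pdf ;;     # "%PDF"
    49492a00|4d4d002a) echo tiff ;;    # little- or big-endian TIFF
    *)                 echo unknown ;;
  esac
}
```

Running it on a generated derivative such as a converted file still named with a .tif extension should print `jpeg` if the conversion worked.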
Comment #15
Alex Andrascu commented
No, it doesn't :) But we need to fix this anyhow.
Comment #16
schildi commented
Not sure if this hint is helpful for your project, but
- converting PDF to JPG will drop the complete text (no more cut and paste)
- you will get the well-known JPEG artefacts around sharp edges
Maybe you could have a look at the DJVU format, which is also a raster format but preserves the text when converting from PDF. Text is still selectable. And it has some other advantages. For more background please see http://en.wikipedia.org/wiki/DJVU.
The disadvantage might be that the format is not as widespread today.
Comment #17
egfrith commented
Thanks for your hint, schildi. I hadn't thought about DJVU, which does have the advantages you mention over jpeg. However, is it viewable in a browser? And can imagemagick convert to it?
Your comment also reminds me that I've had problems with the jpegs that imagemagick has produced from some PDF files. On some machines I've used (all Linux) they have either not shown in the browser, or shown only partially. On other machines (again Linux) they have been fine. The workaround has been to convert the files to png rather than jpeg.
Comment #18
egfrith commented
@alex_andrascu: I agree fixing the filenames would be nice, but I think that it is a separate - and potentially very thorny - issue. I think I may have seen it discussed elsewhere, so it might be worth searching.
Comment #19
cbrody commented
I've got #8 to work using a CCK filefield and Views to display the imagecache converted images, but the images are each displayed multiple times in the view (as many times as there are images, e.g. three images results in each being displayed three times). Any hints?
Comment #20
schildi commented
On Linux it installs with some standalone applications (converters like cjb2) and a plugin for Firefox.
I checked this out and it worked well for me.
For a complete conversion cycle you can start from e.g. a jpeg or tif file and use one of the converters mentioned above to create the djvu file.
For example command lines see
Converting from png is also described as possible. You probably have to use "convert" to get a pbm stream and pipe the result through cjb2 (not checked).
Comment #21
vthirteen commented
subscribing
Comment #22
egfrith commented
@19 cbrody: I'm not sure that this is an issue with the code in this patch. To test whether it is, can you check that the multiple images are actually one image file? E.g. examine the HTML on the pages on which you have the multiple images displayed, and find the href of a converted image, and then view it in the browser on its own. Also, you could check the HTML of the page to make sure there aren't multiple hrefs to the same image.
If the image itself is fine, and there are multiple hrefs, perhaps there is a problem with the view?
Comment #23
cbrody commented
Hi egfrith, the img src and href are the same for all the images. Seems this could be a problem with Views, as I have it set to select distinct and group multiple values. The query is as follows:
SELECT DISTINCT(node.nid) AS nid,
  node_data_field_menu.field_menu_data AS node_data_field_menu_field_menu_data,
  node_data_field_menu.nid AS node_data_field_menu_nid,
  node.type AS node_type,
  node.vid AS node_vid
FROM node node
LEFT JOIN content_field_menu node_data_field_menu ON node.vid = node_data_field_menu.vid
WHERE node.status <> 0
Comment #24
agileware commented
Subscribing.
Comment #25
rachel_norfolk
subscribing
Comment #26
egfrith commented
I've merged the two imageapi patches, and fixed a problem with one of them which prevented images appearing the first time they were generated, leaving "Failed generating an image..." messages in the logs.
Here are updated instructions for using the patch:
1. imageapi module: apply latest 6.x patch at #416254: Add equivalent of image_get_info() at the toolkit level, #19
2. imagecache module: apply patch attached at #8.
Then, create an imagecache preset which contains the "Change File Format" action from imagecache_coloractions module. You can specify that the pdf (or any other type) is converted to jpeg, png or gif.
Other news: the changes to the core code now mean that this functionality should be in D7 with the imageapi_imagemagick module; see #269337: Support for more image types (PDF, TIFF, EPS, etc.).
Another point: it seems that some versions of Safari have a built-in PDF viewer, so JPEG or PNG files which have a .pdf ending aren't displayed, because the built-in viewer tries to display them as PDFs. At the moment, my best guess about what to do about this is to implement a wrapper module for imagecache that would map URLs such as imagecache_wrapper/files/test.jpg to imagecache/files/test.pdf ... but other ideas are welcome.
Comment #27
rachel_norfolk
I'm wondering if there is another way to approach this that might be more flexible.
In the flashvideo module, they have a flashvideo_cck module that takes the incoming video file in one cck field, converts it and sticks the .flv result into a second cck field. The module itself hides the appropriate fields on the input form from the authors.
If we were to implement the pdf --> .jpg system in a similar way, we would have access to the original pdf and also to the resultant jpg. It may even be possible to output multiple pages of the pdf into multiple occurrences of the jpg cck field.
I know what I'd like to do but I'd need guidance on how to do it. I am a willing volunteer to help, though...
Comment #28
rachel_norfolk
and I guess because the filename can then properly reflect the content, Safari will be okay...
Comment #29
egfrith commented
Thanks for your comments ricklawson.
At present we do have access to the original PDF - it's at a location like /files/original.pdf . The problem is that the file ending of the resultant JPEG or PNG file is also .pdf.
The solution you propose should fix this problem, but I'm wondering if it's more complicated than it needs to be? I was thinking of a bit of code that didn't have to insert anything into the database, but which would pretty much use the tools given by imagecache. Also, we would have to work out how to map different imagecache presets onto the CCK fields. And what would happen when the presets are altered or flushed?
It might be possible to create a module that effectively re-implements imagecache_cache() so that, if asked to generate presetname/files/original.jpg, it would look for /files/original.pdf if it couldn't find /files/original.jpg
http://drupalcontrib.org/api/function/imagecache_cache/6
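The fallback lookup described above can be illustrated outside Drupal with a small shell sketch. This is not imagecache code; `resolve_source` is a hypothetical name, and the only fallback tried is the .pdf ending mentioned in the comment.

```shell
#!/bin/sh
# Hypothetical helper: given the path a derivative was requested for,
# print the source file to convert. Prefers the exact path and falls back
# to the same base name with a .pdf ending.
# Usage: resolve_source /files/original.jpg
resolve_source() {
  requested=$1
  if [ -f "$requested" ]; then
    printf '%s\n' "$requested"
    return 0
  fi
  # Swap the requested extension for .pdf and try again.
  fallback="${requested%.*}.pdf"
  if [ -f "$fallback" ]; then
    printf '%s\n' "$fallback"
    return 0
  fi
  return 1   # neither the requested file nor a PDF source exists
}
```

Because the mapping is computed from the request path on each call, nothing has to be stored in the database, which is the property the approach above is after.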
Comment #30
boobaa
Subscribe
Comment #31
anrikun commented
Please have a look at:
Create PDF thumbnails with imagecache and ImageMagick while GD is still the default toolkit
And review it.
Comment #32
iva2k commented
This would be an awesome feature to have. Please also think of supporting other file types, or at least a roadmap to do it. Could it potentially employ the mimedetect module to recognize file types?
Once the feature is committed, I would be picking it up into iTweak Upload module. People there are requesting previews of other file types besides images #601896: Allow preview / thumbnail for PDF and other non-image attachments.
What I would like to have from imagecache is a function that returns TRUE if there is a preview image for any given file, either an image or any other supported type, like PDF. This will decouple nicely and make iTweak Upload's code support (without modifications) any future imagecache updates. See _itweak_upload_isimage() and itweak_upload_itweak_upload_preview() functions in itweak_upload.module - these are the ones I will modify/replace with corresponding imagecache call.
@drewish
Before I get too excited: would you consider committing a final patch from this issue into the imagecache project? What would be your requirements?
Comment #33
egfrith commented
There's now a solution for the problem with Safari (and some other browsers, it turns out) not displaying converted thumbnails. See #628146: File extensions don't match the actual MIME type - Some browsers do not display converted images. The code is in the attachment - it's not committed to a module yet.
Comment #34
Alex Andrascu commented
Any updates on this? We can't let this drop dead when we're so close.
Comment #35
egfrith commented
Hello Alex,
I think drewish (maintainer of imagecache and imageapi) has been focussing his effort on Drupal 7.
In order for the imageapi_imagemagick module to be ported fully to D7, a version of the patch for D6 at #416254: Add equivalent of image_get_info() at the toolkit level will have to be applied. I think the D6 patch is pretty much ready. I don't know what drewish's ideas are for the future of the D6 versions of these modules, but when he gets to work on imagecache for D7, it would be great from my point of view if he could commit the D6 patch beforehand.
Comment #36
Alex Andrascu commented
Wonderful news! Just looked at the patch at #416254: Add equivalent of image_get_info() at the toolkit level and I can't wait to play with it a little.
Thanks David.
Comment #37
sorensong commented
Very interested in seeing this functionality included, also. Thanks a lot!
Comment #38
mattwmc commented
RE: convert PDF to JPG support
So is this a yes or no for Drupal 6?
Comment #39
samdeskin commented
What do y'all think about displaying PDFs as HTML5 instead of JPGs?
Comment #40
egghunter commented
subscribing
Comment #41
dman commented
Guys.
It looks like imagecache is not the place for radical document conversions to happen. There are so many kinks that have to be built into the process it gets hard to handle.
http://drupal.org/project/pdf_to_imagefield
- is a proper solution, and takes the #27 approach.
Right now it's a total doc conversion (all pages), but it's just a small feature request to #798996: Just create an image of the first page?
I'd really suggest pulling this feature request out of imagecache - it doesn't fit. Suggest leave it here as 'by design' and concentrate efforts on a fuller solution over at that other module.
(You can still run imagecache effects on the results it produces)
Comment #42
shenzhuxi commented
@dman
I use poppler for the converting in my module http://drupal.org/project/fileviewer.
How does pdf_to_imagefield solve the same problem?
Comment #43
dman commented
Not the same approach.
If you use a browser utility to download and display the full PDF inline, that's fine.
Folk here want to create a jpeg - like of the first page - that can be used as a thumbnail or preview.
A different task.
Neither job is likely to be taken care of within imagecache module itself.
Comment #44
jejk commented
subscribe
Comment #45
develcuy commented
Alternative patch from frankiedesign at http://drupal.org/node/460132
Comment #46
develcuy commented
What about providing us with a way to extend imagecache so that a plugin module can provide support for PDF files? See the previous patch; perhaps it is a good starting point for creating a hook.
Comment #47
shenzhuxi commented
http://drupal.org/project/fileviewer
My module supports the image handler in Drupal 7 core for the PDF thumbnails now.
Comment #48
upupax commented
[deleted]
My fault.
Comment #49
romiomon commented
How can I convert a pdf to an image thumbnail on Drupal 7?
Please advise.
Comment #50
fizk commented
This feature should be in a separate module.
See:
http://drupal.org/project/pdf
http://drupal.org/project/fileviewer
http://drupal.org/project/pdf_to_imagefield
Comment #50.0
jwilson3
Add Drupal 7 solutions
Comment #51
jwilson3
Note: The FileViewer module page mentioned in previous comments now directs people to use http://drupal.org/project/pdf, which uses HTML5 to embed and display pdfs inline inside a page.