Hello Drupal community,
I'd like to submit this project that I created a few years ago to help a little project of mine. I basically needed to generate PDFs from my content. The content is not really complex but there are a few content types and nodes are linked together using node_reference and viewfield. What I needed was a XML export of my nodes that was able to represent those links and a way to transform that XML into FOP so that I could run FOP on it to generate PDFs. This is basically what my module does (XML export, XSL transformations and FOP).
User interface:
The module adds a "Export" tab to a node where one can select a template and an export format:
https://drupal.org/files/project-images/export_node.png
Through an admin interface, the module allows to upload XSL templates that will then show up in the radio list on the node "Export" tab:
https://drupal.org/files/project-images/export_templates.png
Architecture:
The module basically uses hook_node_* to cache the XML export of each node when they are saved. The cache is saved on the database in a "xml_export" table. On export, the cache is refreshed if it has not been created yet, otherwise it is just read. Fields of type "viewfield" are exported at runtime because it is not possible to refresh all related nodes whenever a view content changes. The module adds a "export_filename" hook that allows module to specify their own logic for a default file name.
Hope this module helps the community and I'm looking forward to hearing any feedback or improvement suggestions!
Franz
GIT repository: git clone --branch 7.x-1.x fterrier@git.drupal.org:sandbox/fterrier/2224051.git xml_export
Project page: https://drupal.org/node/2224051
Comment | File | Size | Author |
---|---|---|---|
#11 | exported_node.pdf | 5.12 KB | majorrobot |
#11 | node_xml.txt | 268 bytes | majorrobot |
#11 | hello_world_xsl.txt | 1.79 KB | majorrobot |
#9 | fop_error.txt | 3.21 KB | majorrobot |
Comments
Comment #1
PA robot CreditAttribution: PA robot commentedThere are some errors reported by automated review tools, did you already check them? See http://pareview.sh/pareview/httpgitdrupalorgsandboxfterrier2224051git
We are currently quite busy with all the project applications and we prefer projects with a review bonus. Please help reviewing and put yourself on the high priority list, then we will take a look at your project right away :-)
Also, you should get your friends, colleagues or other community members involved to review this application. Let them go through the review checklist and post a comment that sets this issue to "needs work" (they found some problems with the project) or "reviewed & tested by the community" (they found no major flaws).
I'm a robot and this is an automated message from Project Applications Scraper.
Comment #2
fterrier CreditAttribution: fterrier commentedHello,
I've cleaned up the code as suggested by the automated review tool.
Cheers,
Franz
Comment #3
fterrier CreditAttribution: fterrier commentedComment #4
phoang CreditAttribution: phoang commentedPlease find another way to get Temporary folder path from Drupal system file rather than hard code '/tmp' folder.
The code below will not work for windows file system.
Comment #5
fterrier CreditAttribution: fterrier commentedHi, thanks for the hint! I fixed the problem now and am using the file_directory_temp and drupal_tempnam functions.
Please have a look and let me know!
Comment #6
fterrier CreditAttribution: fterrier commentedComment #7
majorrobot CreditAttribution: majorrobot commentedHi fterrier,
It was a pleasure to review this module! It's nicely coded, and I can see that you've implemented the change from phoang above.
I did, however, get a warning when saving or updating nodes — and it appears to be related to the field type of a node's body. On my vanilla install, a node body by default is "text_with_summary," so when saving/editing a node, I get the following error:
Warning: call_user_func_array() expects parameter 1 to be a valid callback, function '_xml_export_format_text_with_summary' not found or invalid function name in _xml_export_field() (line 597 of [...]/sites/all/modules/xml_export/xml_export.module).
I tried removing the body field completely from the content type to see if that would work, and instead got the following:
Notice: Undefined property: stdClass::$body in _xml_export_xml_export_node() (line 579 of [...]/sites/all/modules/xml_export/xml_export.module).
So, I would recommend putting a conditional around lines 594-597 that makes sure a function for that field type exists. You may also want to put a conditional around any uses of $node->body, but that's probably not a huge deal (it's certainly not something I would have thought of).
Also, when I attempt to create a PDF, I get a Page Not Found and the following error in watchdog:
fop command: sites/all/modules/xml_export/fop/fop -fo [...]/temp/FOPSSIGvG -pdf [...]/temp/FOPNXuSbm exited with error 1, output: , message:
So, no real error message. However, I suspect this is either related to the xml stored in the db not being valid as a result of the above issue, or something unique to my local environment.
Thanks!
Comment #8
fterrier CreditAttribution: fterrier commentedHi majorrobot,
Thanks for taking the time to make such a thorough review! I'm happy that you like the module!
Funny enough, I had already corrected your 2 first remarks (with the undefined function and $body properties) but had forgotten to push the code. It's pushed now so you can check it out and let me know if it works better for you!
I've also added some debugging/error messages around the FOP part. I assume the problem you have comes from the fact that, either your "fop" file is not executable, or you were testing with a windows machine, which was not supported until now. I therefore changed the following:
- Better error messages in watchdog, now you also will see the output and eventual error messages of the "fop" command
- Support/detection of windows machine and switch to the fop.bat executable in that case
Hopefully now you should be able to generate PDF, or at least to find out why it's not working for you. Please give it one more try and let me know what you think!
Thanks again for the review!
Comment #9
majorrobot CreditAttribution: majorrobot commentedHi fterrier,
That's great! I can confirm that the conditionals you added prevented the errors and that XML is getting cached now. Also error reporting is much more helpful!
However, I'm still having issues generating a PDF. Again, it could be my local setup (a Mac running XAMPP, PHP 5.3.1), but this is seeming less likely, as I also tried it on our CentOS server and got the same error (which I'll attach here in a text file). I did a little research on the error, and it sounds like the most likely culprit is an invisible BOM marker at the beginning of the XML (and I definitely don't see any stray characters).
I definitely don't want to hold up your application if it's something unique to my environments, but it's probably something you should check out.
I also have a couple of other hints/suggestions that, in my opinion, are non-blockers (didn't mention these above, as I wanted to get the main issues to you first):
Thanks!
Comment #10
fterrier CreditAttribution: fterrier commentedHi majorrobot,
I've removed .gitignore, changed README.txt and used menu_get_object() instead of node_load(). Thanks for the hints!
Now remain the issues of FOP and the private directory file. For the latter, I have now changed my install so it uses the public directory instead of the private. Is the public directory configured on a new install? If not, then I'd have to review that. In any case, it would probably be more appropriate to use the private directory, since the file should not really be accessible. What would be a proper way to go about using the "private" directory? Should it just be documented that it has to be set? Or should an error be thrown that prevents installing the module?
For the FOP issue, it would be great if you could also attach the content of the temporary file generated by the module. In your case, you can see where it is if you look at the command-line: sites/all/modules/xml_export/fop/fop -d -fo /tmp/FOPWuMUtS -pdf /tmp/FOPt0jAvP
The first file is the input, in this case: /tmp/FOPWuMUtS. If that file does not exist any more, then simply run the export again and get a new file.
One thing to note is that FOP is generated out of specific FO XML, and would not work if you don't apply a proper XSL transformation first. Normally though, the FOP transform option should not be available if you don't select a template. Could I ask you to maybe also attach your XSL template and the XML export of your node? You can also get that one by exporting the node without applying transformation.
Thanks again for the thorough review!
Comment #11
majorrobot CreditAttribution: majorrobot commentedHi fterrier,
No problem! This has been pretty fun.
So, I can confirm that .gitignore is gone, README.txt is corrected and that you're making good use of menu_get_object()!
Re: the private directory, I completely agree that it should be used. I think a warning in the documentation would be good. For user experience, if a user does not have it enabled, it would also probably be best to let them enable the module, but not allow export/conversion until it is set. Otherwise, the module might feel "broken" to the user, even though this behavior is by design.
As for the FOP, I think I may be out of my league here, as I only have the most basic understanding of XSLT. But, I did get it to work. My understanding of XSLT being very limited, I gave myself a crash-course in XSLT and FO today and eked something out. I'll attach my source XML (from the db), my rudimentary XSL, and the resulting PDF so that you can confirm. Note that the system won't allow files with xsl or xml extensions to upload, so I've named these files "hello_world_xsl.txt" and node_xml.txt.
Coincidentally, when I did get output, I found an additional bug: the nodes' bodies are being stored in the db as "Array." This stems from line 613: $node->body is actually an array, not a string. I believe the best practices way to get the body value would be to treat it as other fields:
Thanks!
Comment #12
fterrier CreditAttribution: fterrier commentedHi majorrobot,
Thanks again for the quick reply! So there, we should be pretty close now :)
I re-changed directory from public to private and added checks. Now you can't upload a template if the directory is not set. You can still export though, but then the warning is displayed that the transform function is not available.
As for the FOP, I didn't check your files in detail (I did open them) but I gather if you managed to get it working, then there is not much more I could do :) I see what's wrong with the body field though and I fixed it. It was just a matter of implementing the one correct export function. It now works and exports the "value" as well as the "summary" properly (since the body has those 2 fields).
What do you think? Did I miss anything?
Thanks and have a great evening (if you're somewhere around this timezone).
Comment #13
majorrobot CreditAttribution: majorrobot commentedHi fterrier,
I guess I'm not around your timezone! It was midday here in New York when you posted that. :) I'm glad to reply as quickly as I can — I'm in the same boat with my own project application.
I can confirm that the $node->body is being parsed correctly (with summary). The private directory is still a bit of an issue, though, since it's still required in hook_requirements(). So, when I install via drush w/o that directory set, I get:
If I'm understanding correctly, you are now aiming in the direction that the module should install but not be completely functional if the private directory were not set (which was my recommendation). So, I think removing hook_requirements() would be necessary. Let me know if I'm misunderstanding.
If you end up deciding to keep the hook_requirements() and make the private directory setting a "hard" requirement, then I recommend updating $requirements['xml_export_directory']['description'] on line 18 of xml_import.install so that it mentions that the error may be a private file directory issue. Also, notice that $requirements['xml_export_directory']['value'] should be "The current value (e.g., version, time, level, etc). During install phase, this should only be used for version numbers, do not set it if not applicable," according to hook_requirements() docs. So, I would omit this element of the $requirements array.
Yes, almost there!
Comment #14
fterrier CreditAttribution: fterrier commentedHi majorrobot,
Good, so I've now removed the "hook_requirements" completely. You are right that it does not make sense to have it since it is now "optional" to configure the private directory.
I don't use drush myself but I guess it does the exact same thing as enabling the module through the interface? If yes, then everything should be fine now.
Let me know if it works for you and thanks again :)
Comment #15
majorrobot CreditAttribution: majorrobot commentedHi fterrier,
Perfect — it installs without a hitch now. I can see the warning on the template upload page re: the private directory. Good.
I do get a PHP error on the node export tab though, with no private directory set:
Warning: Invalid argument supplied for foreach() in xml_export_form() (line 179 of [...]/sites/all/modules/xml_export/xml_export.module).
Looks like this is because _xml_export_get_templates() only returns a value if $file_path is not null. I'd either return an empty array if $file_path is null or check if $templates is an array before operating on it in xml_export_form().
Almost there!
Re: drush — I wholeheartedly recommend getting acquainted with it. You will literally save hours of development time. Downloading, enabling, disabling, updating, and uninstalling modules is much faster, as is clearing caches, reverting features, etc. etc. I'm not sure how I ever lived without it.
Comment #16
fterrier CreditAttribution: fterrier commentedHi majorrobot,
Thanks! I've fixed the code and added a check in that spot so if the private directory is not set, it'll just skip the whole thing and issue the warning. So now, you shouldn't see the PHP error any more.
Let me know if it works for you!
Franz
Comment #17
majorrobot CreditAttribution: majorrobot commentedHi fterrier,
Great — that PHP error is gone now. But there may still be an issue. I see that part of the form is still on the page — 'filename' and 'submit.' I don't think there is an intended behavior for being able to submit this form without a private directory, correct? If I do submit it, I get PHP errors:
To me, it seems best to remove the form altogether for this case, but I may not be understanding the intention.
Thanks!
Comment #18
fterrier CreditAttribution: fterrier commentedHi majorrobot,
That's actually correct behaviour. If you don't have FOP you can still export your node in pure XML. I fixed that now and it works as expected (also without warnings).
I also got rid of various PHP notice and warnings that occurred when you export XML.
What do you think? Have a look and let me know!
Thanks again!
Franz
Comment #19
majorrobot CreditAttribution: majorrobot commentedHi fterrier,
That makes sense — and works without notices and warnings now. I think it looks good. I'd recommend making sure it's still good with the bot, and setting it to RTBC!
It's been great reviewing with you!
—majorrobot
Comment #20
fterrier CreditAttribution: fterrier commentedGreat! Thanks to you, I've learned quite a lot during this review!
I'm now setting it to RTBC, I've just fixed a few things from the bot.
Franz
Comment #21
stBorchertPlease do not set your own application to 'RTBC' (see Review process for Full Project Applications).
To speed up reviewing this application by one of the git administrators, it is strongly recommended that you review other project applications according to the review bonus program.
Comment #22
fterrier CreditAttribution: fterrier commentedHi @stBorchert,
Sorry about that and thanks for letting me know. I have set it myself to RTBC because of comment https://www.drupal.org/node/2224079#comment-8773947. But I guess @majorrobot should set my application to RTBC then? He has reviewed the module already (see comments).
I will start reviewing other projects, thanks for pointing that out.
Comment #23
sandiracy CreditAttribution: sandiracy commentedI have found several error for this project http://pareview.sh/pareview/httpgitdrupalorgsandboxfterrier2224051git
You should use db_delete() instead of delete query using db_query
Please follow drupal coding standard https://www.drupal.org/node/318
Comment #24
PA robot CreditAttribution: PA robot commentedClosing due to lack of activity. If you are still working on this application, you should fix all known problems and then set the status to "Needs review". (See also the project application workflow).
I'm a robot and this is an automated message from Project Applications Scraper.
Comment #25
mikl CreditAttribution: mikl commentedHolding back someone from contributing over tiny details like db_delete vs. db_query is silly. This is a perfectly good module, it works well.
Comment #26
cweagansThanks for your contribution!
I updated your account so you can promote this to a full project and also create new projects as either a sandbox or a "full" project.
Here are some recommended readings to help with excellent maintainership:
You can find lots more contributors chatting on IRC in #drupal-contribute. So, come hang out and stay involved!
Thanks, also, for your patience with the review process. Anyone is welcome to participate in the review process. Please consider reviewing other projects that are pending review. I encourage you to learn more about that process and join the group of reviewers.
Thanks to the dedicated reviewer(s) as well.
Comment #28
apadernoI am giving credits to the users who participated in this issue.
Comment #29
apaderno