Hello Drupal community,

I'd like to submit this project that I created a few years ago to help a little project of mine. I basically needed to generate PDFs from my content. The content is not really complex but there are a few content types and nodes are linked together using node_reference and viewfield. What I needed was a XML export of my nodes that was able to represent those links and a way to transform that XML into FOP so that I could run FOP on it to generate PDFs. This is basically what my module does (XML export, XSL transformations and FOP).

User interface:

The module adds a "Export" tab to a node where one can select a template and an export format:
https://drupal.org/files/project-images/export_node.png

Through an admin interface, the module allows to upload XSL templates that will then show up in the radio list on the node "Export" tab:
https://drupal.org/files/project-images/export_templates.png

Architecture:

The module basically uses hook_node_* to cache the XML export of each node when they are saved. The cache is saved on the database in a "xml_export" table. On export, the cache is refreshed if it has not been created yet, otherwise it is just read. Fields of type "viewfield" are exported at runtime because it is not possible to refresh all related nodes whenever a view content changes. The module adds a "export_filename" hook that allows module to specify their own logic for a default file name.

Hope this module helps the community and I'm looking forward to hearing any feedback or improvement suggestions!
Franz

GIT repository: git clone --branch 7.x-1.x fterrier@git.drupal.org:sandbox/fterrier/2224051.git xml_export
Project page: https://drupal.org/node/2224051

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

PA robot’s picture

Status: Needs review » Needs work

There are some errors reported by automated review tools, did you already check them? See http://pareview.sh/pareview/httpgitdrupalorgsandboxfterrier2224051git

We are currently quite busy with all the project applications and we prefer projects with a review bonus. Please help reviewing and put yourself on the high priority list, then we will take a look at your project right away :-)

Also, you should get your friends, colleagues or other community members involved to review this application. Let them go through the review checklist and post a comment that sets this issue to "needs work" (they found some problems with the project) or "reviewed & tested by the community" (they found no major flaws).

I'm a robot and this is an automated message from Project Applications Scraper.

fterrier’s picture

Status: Needs work » Needs review

Hello,

I've cleaned up the code as suggested by the automated review tool.

Cheers,
Franz

fterrier’s picture

Issue summary: View changes
phoang’s picture

Status: Needs review » Needs work

Please find another way to get Temporary folder path from Drupal system file rather than hard code '/tmp' folder.

The code below will not work for windows file system.

$tmpfo = tempnam('/tmp', 'FOP');
  file_put_contents($tmpfo, $fo);
  if ($type != 'xml') {
    $tmpout = tempnam('/tmp', 'FOP');
fterrier’s picture

Hi, thanks for the hint! I fixed the problem now and am using the file_directory_temp and drupal_tempnam functions.

Please have a look and let me know!

fterrier’s picture

Status: Needs work » Needs review
majorrobot’s picture

Status: Needs review » Needs work

Hi fterrier,

It was a pleasure to review this module! It's nicely coded, and I can see that you've implemented the change from phoang above.

I did, however, get a warning when saving or updating nodes — and it appears to be related to the field type of a node's body. On my vanilla install, a node body by default is "text_with_summary," so when saving/editing a node, I get the following error:

Warning: call_user_func_array() expects parameter 1 to be a valid callback, function '_xml_export_format_text_with_summary' not found or invalid function name in _xml_export_field() (line 597 of [...]/sites/all/modules/xml_export/xml_export.module).

I tried removing the body field completely from the content type to see if that would work, and instead got the following:

Notice: Undefined property: stdClass::$body in _xml_export_xml_export_node() (line 579 of [...]/sites/all/modules/xml_export/xml_export.module).

So, I would recommend putting a conditional around lines 594-597 that makes sure a function for that field type exists. You may also want to put a conditional around any uses of $node->body, but that's probably not a huge deal (it's certainly not something I would have thought of).

Also, when I attempt to create a PDF, I get a Page Not Found and the following error in watchdog:

fop command: sites/all/modules/xml_export/fop/fop -fo [...]/temp/FOPSSIGvG -pdf [...]/temp/FOPNXuSbm exited with error 1, output: , message:

So, no real error message. However, I suspect this is either related to the xml stored in the db not being valid as a result of the above issue, or something unique to my local environment.

Thanks!

fterrier’s picture

Status: Needs work » Needs review

Hi majorrobot,

Thanks for taking the time to make such a thorough review! I'm happy that you like the module!

Funny enough, I had already corrected your 2 first remarks (with the undefined function and $body properties) but had forgotten to push the code. It's pushed now so you can check it out and let me know if it works better for you!

I've also added some debugging/error messages around the FOP part. I assume the problem you have comes from the fact that, either your "fop" file is not executable, or you were testing with a windows machine, which was not supported until now. I therefore changed the following:
- Better error messages in watchdog, now you also will see the output and eventual error messages of the "fop" command
- Support/detection of windows machine and switch to the fop.bat executable in that case

Hopefully now you should be able to generate PDF, or at least to find out why it's not working for you. Please give it one more try and let me know what you think!

Thanks again for the review!

majorrobot’s picture

Status: Needs review » Needs work
FileSize
3.21 KB

Hi fterrier,

That's great! I can confirm that the conditionals you added prevented the errors and that XML is getting cached now. Also error reporting is much more helpful!

However, I'm still having issues generating a PDF. Again, it could be my local setup (a Mac running XAMPP, PHP 5.3.1), but this is seeming less likely, as I also tried it on our CentOS server and got the same error (which I'll attach here in a text file). I did a little research on the error, and it sounds like the most likely culprit is an invisible BOM marker at the beginning of the XML (and I definitely don't see any stray characters).

I definitely don't want to hold up your application if it's something unique to my environments, but it's probably something you should check out.

I also have a couple of other hints/suggestions that, in my opinion, are non-blockers (didn't mention these above, as I wanted to get the main issues to you first):

  1. You'll want to remove your .gitignore from the repo. It's not completely clear in the documentation, but from other reviews I've read, this will be an issue with project approval.
  2. Your README.txt refers to "cck_export/fop," when I think what you mean is "xml_export/fop." Oops!
  3. We don't usually have our private file directory set up on our installs, so we got PHP errors when trying to install the module until we added it. You may want to mention this in your documentation and on your module page.
  4. Just a tip, in the category of Drupal functions I wish someone had told me about 3 years ago: on line 271 you call node_load(). This works just fine, but you'll likely get better performance if you use menu_get_object(). Since you're on a node page, Drupal has already loaded the node. Calling menu_get_object() taps into this instead of attempting to load again. This works for other entities such as users, too.

Thanks!

fterrier’s picture

Status: Needs work » Needs review

Hi majorrobot,

I've removed .gitignore, changed README.txt and used menu_get_object() instead of node_load(). Thanks for the hints!

Now remain the issues of FOP and the private directory file. For the latter, I have now changed my install so it uses the public directory instead of the private. Is the public directory configured on a new install? If not, then I'd have to review that. In any case, it would probably be more appropriate to use the private directory, since the file should not really be accessible. What would be a proper way to go about using the "private" directory? Should it just be documented that it has to be set? Or should an error be thrown that prevents installing the module?

For the FOP issue, it would be great if you could also attach the content of the temporary file generated by the module. In your case, you can see where it is if you look at the command-line: sites/all/modules/xml_export/fop/fop -d -fo /tmp/FOPWuMUtS -pdf /tmp/FOPt0jAvP
The first file is the input, in this case: /tmp/FOPWuMUtS. If that file does not exist any more, then simply run the export again and get a new file.

One thing to note is that FOP is generated out of specific FO XML, and would not work if you don't apply a proper XSL transformation first. Normally though, the FOP transform option should not be available if you don't select a template. Could I ask you to maybe also attach your XSL template and the XML export of your node? You can also get that one by exporting the node without applying transformation.

Thanks again for the thorough review!

majorrobot’s picture

Status: Needs review » Needs work
FileSize
1.79 KB
268 bytes
5.12 KB

Hi fterrier,

No problem! This has been pretty fun.

So, I can confirm that .gitignore is gone, README.txt is corrected and that you're making good use of menu_get_object()!

Re: the private directory, I completely agree that it should be used. I think a warning in the documentation would be good. For user experience, if a user does not have it enabled, it would also probably be best to let them enable the module, but not allow export/conversion until it is set. Otherwise, the module might feel "broken" to the user, even though this behavior is by design.

As for the FOP, I think I may be out of my league here, as I only have the most basic understanding of XSLT. But, I did get it to work. My understanding of XSLT being very limited, I gave myself a crash-course in XSLT and FO today and eked something out. I'll attach my source XML (from the db), my rudimentary XSL, and the resulting PDF so that you can confirm. Note that the system won't allow files with xsl or xml extensions to upload, so I've named these files "hello_world_xsl.txt" and node_xml.txt.

Coincidentally, when I did get output, I found an additional bug: the nodes' bodies are being stored in the db as "Array." This stems from line 613: $node->body is actually an array, not a string. I believe the best practices way to get the body value would be to treat it as other fields:

$raw_field = field_get_items('node', $node, 'body');
$field = field_view_value('node', $node, 'body', $raw_field[0]);
$body = render($field);

Thanks!

fterrier’s picture

Status: Needs work » Needs review

Hi majorrobot,

Thanks again for the quick reply! So there, we should be pretty close now :)

I re-changed directory from public to private and added checks. Now you can't upload a template if the directory is not set. You can still export though, but then the warning is displayed that the transform function is not available.

As for the FOP, I didn't check your files in detail (I did open them) but I gather if you managed to get it working, then there is not much more I could do :) I see what's wrong with the body field though and I fixed it. It was just a matter of implementing the one correct export function. It now works and exports the "value" as well as the "summary" properly (since the body has those 2 fields).

What do you think? Did I miss anything?

Thanks and have a great evening (if you're somewhere around this timezone).

majorrobot’s picture

Hi fterrier,

I guess I'm not around your timezone! It was midday here in New York when you posted that. :) I'm glad to reply as quickly as I can — I'm in the same boat with my own project application.

I can confirm that the $node->body is being parsed correctly (with summary). The private directory is still a bit of an issue, though, since it's still required in hook_requirements(). So, when I install via drush w/o that directory set, I get:

$ drush en -y xml_export
is_dir(): Unable to find the wrapper "private" - did you forget to enable it when you configured PHP? file.inc:437                                                                                  [warning]
is_dir(): Unable to find the wrapper "private" - did you forget to enable it when you configured PHP? file.inc:437                                                                                  [warning]
Module xml_export doesn't meet the requirements to be enabled.                                                                                                                                      [error]
The XML Export directory, xml_export could not be created due to a misconfigured files directory. Please ensure that the files directory is correctly configured and that the webserver has         [error]
permission to create directories. (Currently using  Unable to create)

If I'm understanding correctly, you are now aiming in the direction that the module should install but not be completely functional if the private directory were not set (which was my recommendation). So, I think removing hook_requirements() would be necessary. Let me know if I'm misunderstanding.

If you end up deciding to keep the hook_requirements() and make the private directory setting a "hard" requirement, then I recommend updating $requirements['xml_export_directory']['description'] on line 18 of xml_import.install so that it mentions that the error may be a private file directory issue. Also, notice that $requirements['xml_export_directory']['value'] should be "The current value (e.g., version, time, level, etc). During install phase, this should only be used for version numbers, do not set it if not applicable," according to hook_requirements() docs. So, I would omit this element of the $requirements array.

Yes, almost there!

fterrier’s picture

Hi majorrobot,

Good, so I've now removed the "hook_requirements" completely. You are right that it does not make sense to have it since it is now "optional" to configure the private directory.

I don't use drush myself but I guess it does the exact same thing as enabling the module through the interface? If yes, then everything should be fine now.

Let me know if it works for you and thanks again :)

majorrobot’s picture

Hi fterrier,

Perfect — it installs without a hitch now. I can see the warning on the template upload page re: the private directory. Good.

I do get a PHP error on the node export tab though, with no private directory set:

Warning: Invalid argument supplied for foreach() in xml_export_form() (line 179 of [...]/sites/all/modules/xml_export/xml_export.module).

Looks like this is because _xml_export_get_templates() only returns a value if $file_path is not null. I'd either return an empty array if $file_path is null or check if $templates is an array before operating on it in xml_export_form().

Almost there!

Re: drush — I wholeheartedly recommend getting acquainted with it. You will literally save hours of development time. Downloading, enabling, disabling, updating, and uninstalling modules is much faster, as is clearing caches, reverting features, etc. etc. I'm not sure how I ever lived without it.

fterrier’s picture

Hi majorrobot,

Thanks! I've fixed the code and added a check in that spot so if the private directory is not set, it'll just skip the whole thing and issue the warning. So now, you shouldn't see the PHP error any more.

Let me know if it works for you!

Franz

majorrobot’s picture

Hi fterrier,

Great — that PHP error is gone now. But there may still be an issue. I see that part of the form is still on the page — 'filename' and 'submit.' I don't think there is an intended behavior for being able to submit this form without a private directory, correct? If I do submit it, I get PHP errors:

Notice: Undefined index: type in xml_export_form_validate() (line 263 of […]/sites/all/modules/xml_export/xml_export.module).
Notice: Undefined index: template in xml_export_form_validate() (line 264 of […]/sites/all/modules/xml_export/xml_export.module).
Notice: Undefined index: template in xml_export_form_submit() (line 275 of […]/sites/all/modules/xml_export/xml_export.module).
Warning: Invalid argument supplied for foreach() in xml_export_get_template() (line 294 of […]/sites/all/modules/xml_export/xml_export.module).
Notice: Undefined index: type in xml_export_form_submit() (line 281 of […]/sites/all/modules/xml_export/xml_export.module).
There was an error trying to generate the export. Please contact an administrator.

To me, it seems best to remove the form altogether for this case, but I may not be understanding the intention.

Thanks!

fterrier’s picture

Hi majorrobot,

That's actually correct behaviour. If you don't have FOP you can still export your node in pure XML. I fixed that now and it works as expected (also without warnings).

I also got rid of various PHP notice and warnings that occurred when you export XML.

What do you think? Have a look and let me know!

Thanks again!
Franz

majorrobot’s picture

Hi fterrier,

That makes sense — and works without notices and warnings now. I think it looks good. I'd recommend making sure it's still good with the bot, and setting it to RTBC!

It's been great reviewing with you!

—majorrobot

fterrier’s picture

Status: Needs review » Reviewed & tested by the community

Great! Thanks to you, I've learned quite a lot during this review!

I'm now setting it to RTBC, I've just fixed a few things from the bot.

Franz

stBorchert’s picture

Status: Reviewed & tested by the community » Needs review

Please do not set your own application to 'RTBC' (see Review process for Full Project Applications).

To speed up reviewing this application by one of the git administrators, it is strongly recommended that you review other project applications according to the review bonus program.

fterrier’s picture

Hi @stBorchert,

Sorry about that and thanks for letting me know. I have set it myself to RTBC because of comment https://www.drupal.org/node/2224079#comment-8773947. But I guess @majorrobot should set my application to RTBC then? He has reviewed the module already (see comments).

I will start reviewing other projects, thanks for pointing that out.

sandiracy’s picture

Status: Needs review » Needs work

I have found several error for this project http://pareview.sh/pareview/httpgitdrupalorgsandboxfterrier2224051git
You should use db_delete() instead of delete query using db_query
Please follow drupal coding standard https://www.drupal.org/node/318

PA robot’s picture

Status: Needs work » Closed (won't fix)

Closing due to lack of activity. If you are still working on this application, you should fix all known problems and then set the status to "Needs review". (See also the project application workflow).

I'm a robot and this is an automated message from Project Applications Scraper.

mikl’s picture

Status: Closed (won't fix) » Reviewed & tested by the community

Holding back someone from contributing over tiny details like db_delete vs. db_query is silly. This is a perfectly good module, it works well.

cweagans’s picture

Status: Reviewed & tested by the community » Fixed

Thanks for your contribution!

I updated your account so you can promote this to a full project and also create new projects as either a sandbox or a "full" project.

Here are some recommended readings to help with excellent maintainership:

You can find lots more contributors chatting on IRC in #drupal-contribute. So, come hang out and stay involved!

Thanks, also, for your patience with the review process. Anyone is welcome to participate in the review process. Please consider reviewing other projects that are pending review. I encourage you to learn more about that process and join the group of reviewers.

Thanks to the dedicated reviewer(s) as well.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

apaderno’s picture

Status: Closed (fixed) » Fixed

I am giving credits to the users who participated in this issue.

apaderno’s picture

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.