Description

Webform tracking collects data about your users and associates it with their webform submissions. It uses session_cache and hook_boot to avoid interference with Drupals page cache. If you want to actually see the resulting data you need to apply the patch in #2117285: Allow extra data to be added to submissions in result displays to webform (or implement a better solution, the data is stored in $submission->tracking).

There are no similar solutions beside using an external analytics software like Piwik or Google Analytics. The code conforms to Drupals coding standard and recommendations from https://drupal.org/writing-secure-code and pareview.sh have been followed.

It does not support webform-4.x at the moment.

Project

https://drupal.org/project/webform_tracking

Git

git clone --branch 7.x-1.x http://git.drupal.org/sandbox/phaer/2118153.git webform_tracking

Reviews of other projects

This project has been developed by more onion as part of Campaignion - our Drupal distribution for eCampaignign and Online Fundraising

Comments

phaer’s picture

Issue summary: View changes
phaer’s picture

Issue summary: View changes
phaer’s picture

Issue summary: View changes
phaer’s picture

Issue summary: View changes
Issue tags: +PAreview: review bonus
torotil’s picture

Upfront. This is great work ;)

Here's my review:

Licensing

The module doesn't bundle any non-GPLed files. (I took a look at all 4 for files included in the module).

Similar modules

Analytics Tracking seems to cover some minor parts of this modules features, namely: gathering of referers. I don't think that the overlap is big enough to block this project though.

Drupal API

Drupal API: The module makes heavy use of the DB-API and other Drupal APIs when applicable. I haven't found any parts that reimplement Drupal functionality.

Security:

  • The module stores additional tracking data for weform submissions. It might be sensible to introduce a new to allow permission for viewing that information. Currently the module presumes that any person authorized to see webform results is allowed to see the tracking data - which is probably fine for nearly all installations.
  • The module stores parameters (received via $_GET) in the database. It uses check_plain() to sanitize those parameters.

Coding style

A pareview doesn't show any recommendations.

  // Do not track if Do-Not-Track (DNT) is set, but ignore IE 10 as it sends
  // DNT by default and thus undermindes the concept.
  $ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
  if (empty($_SERVER['HTTP_DNT'])
      || preg_match('/MSIE\s*10/', $ua)
    || !variable_get('webform_tracking_respect_dnt', TRUE)) {

Perhaps the condition could be made a bit clearer by using temporary variables with meaningful names. Something along the lines of:

$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$respect_dnt = !preg_match('/MSIE\s*10/', $ua) && variable_get('webform_tracking_respect_dnt', TRUE);
if (empty($_SERVER['HTTP_DNT']) || !$respect_dnt) {

Code documentation

While some function have concise comments others only have the "Implements hook…()" line. Adding one or two sentences about what the function actually does - especially for the longer functions - would be perfect.

phaer’s picture

Thanks a lot for your review, I just pushed to new commits which include your recommendation regarding the condition as well as some new comments, I hope that makes it clearer.

itsmebhupendra’s picture

Hi,

I download and installed the module and other required module. It does not shows any extra information said in the above description. When I started to apply patch provide with the module to get the extra details, its gives an error "includes/webform.report.inc: No such file or directory".

phaer’s picture

Issue summary: View changes

Hi,

I am sorry for the confusion, you need to use the patch for webform-3.x in comment #5, someone has posted a version for 4.x which produces your error when used on webform-3.x. Webform_tracking support only 3.x at the moment, I will look into support for 4.x if needed. Would you use it?

The correct version of the patch is also included in webform_trackings git repo, I just updated it and tested it on a clean minimal-site (webform-3.20)

Thanks for your interest, I am looking forward to your review ;)

shyam kumar kunkala’s picture

  • Instead of using $_GET. It is recommended to use drupal_get_query_parameters() in module file.
phaer’s picture

Does it make a difference for this module? As fas as I can see, it's only an extra loop over all $_GET-variables in my use case. Could you maybe link me the recommendation, I did not find it so far. I will follow any strong recommendation by official documentation or an convincing argument why it's better to drupal_get_query_parameters() over $_GET but, on the other hand, I am a bit hesitant to add more code than needed to hook_boot specifically.

Thanks!

shyam kumar kunkala’s picture

Hi Phaer,

Its not madatory to use drupal_get_query_parameters() . As there is a drupal function, It's better to use it instead of core PHP functions. That's fine for me. But I have seen many $_GET -variables without check_plain etc like filters in webform_tracking.module. Fillter is mandatory for $_GET-variables.

Also I noticed that you used $_GET['q'] in line number 87 of webform_tracking.module. Instead of $_GET['q'] use current_path. As $_GET['q'] got depreciated in drupal 8. It will be difficult if any one wants to convert your module from drupal 7 to drupal 8.Refer: https://drupal.org/node/1659562

Thanks!

phaer’s picture

Hi,

You are right, recent commits introduced two unchecked $_GET-variables. Now all of them are check_plain()'ed. If you don't agree, line numbers would be helpful. :)

I can't use current_path() because the API documentation says

This function is not available in hook_boot() so use $_GET['q'] instead.

Personally, I won't have any need to port this module to drupal 8 soon, because we are not using it so far but I will look into it if someone requests a drupal 8 port.

I also fixed a bug which prohibited $_GET['tags'] from working correctly.

phaer’s picture

Status: Needs review » Reviewed & tested by the community

Now we have two weeks without any other problems reported and two good looking reviews, so I allow myself to set it to RTBC and hope thats okay. Feel free to put it back on "needs review" if you think the existing reviews are not enough.

phaer’s picture

Priority: Normal » Major
phaer’s picture

Priority: Major » Critical

I created this issue 7 weeks ago and the last response came over 5 weeks ago, so I set it to critical in accordance with https://drupal.org/node/894256

moreonion’s picture

Issue summary: View changes
mpdonadio’s picture

phaer, sorry for the delay, but the project review administrators have been busy. I cannot approve your application,
but I can give it a thorough review. In the future, though, let others change the status to RBTC on your issues.

Automated Review

FILE: /var/www/drupal-7-pareview/pareview_temp/webform_tracking.module
--------------------------------------------------------------------------------
FOUND 1 ERROR(S) AFFECTING 1 LINE(S)
--------------------------------------------------------------------------------
169 | ERROR | Inline comments must start with a capital letter
--------------------------------------------------------------------------------

Minor.

Manual Review

Add a link to automagically run pareview to the description above.

README files normally have .txt extensions in Drupal, but I have seen enough .md ones to say that this
isn't a problem.

Project page should mention that is only works with webform-3.x.

Project page should mention the soft dependency on geoip

.info file should reflect the proper dependency for webform. See https://drupal.org/node/542202#dependencies for how to
specify this.

Your hook_update_N don't follow the conventions in the API docs. Do these work? As this is a brand new module, you may also just
want to start out with the current schema.

Your module has a setting variable, but no hook_uninstall() to delete it.

You should move the link out of the #title in webform_tracking_form_webform_admin_settings_alter() and into a #description.

I thought session_cache_set() and session_cache_get aleady() handled the serialization? Did this change?

The logic in webform_tracking_boot() may not work with sites running out of multiple domains that don't use canonical
URLs, or potentially ones using Domain Access. See what happens here.

I would combine all of the session_cache_sets into a single array/object.

Do you need to handle hook_node_delete() to get rid of data when nodes are deleted? I don't recall whether
hook_webform_submission_delete() gets invoked from webform_node_delete().

I *strongly* suggest a separate permission for viewing this data, or potentially two: one for viewing full data, and the
other for viewing slightly anonymized data (eg, X out the last octet of the IP).

webform_tracking_load() can likely use ->fetchAllAssoc() instead of a loop.

Data filtering is normally done on output, but as the data source in this case may not be clear to others as actually being
user input, I would say that doing it before writing to the database is OK. Be prepared if the project admins disagree
with this assessment, as this does differ from Drupal practices.

The only way to opt out of this is with the DNT header?

Not seeing any third party code, security issues, project duplication issues, or major API issues, so leaving this as RBTC.

klausi’s picture

Status: Reviewed & tested by the community » Fixed

manual review:

  1. why do you need hook_boot() and cannot use some hook_form_FORM_ID_alter() when the actual webform is displayed? Please add a comment.
  2. check_plain() should only happen when data is printed as output to HTML, not when it is saved to the database. Since your module is also responsible for display it should then sanitize the user provided data before printing to HTML. See https://drupal.org/node/28984 . webform_tracking_boot() for example should contain no check_plain() because it does not print anything. webform_tracking_webform_results_extra_data() might be the place, or does the webform module apply sanitization anyway after the hook?
  3. I agree with all points made by mpdonadio.

No critical application blockers, so ...

Thanks for your contribution, phaer!

I updated your account so you can promote this to a full project and also create new projects as either a sandbox or a "full" project.

Here are some recommended readings to help with excellent maintainership:

You can find lots more contributors chatting on IRC in #drupal-contribute. So, come hang out and stay involved!

Thanks, also, for your patience with the review process. Anyone is welcome to participate in the review process. Please consider reviewing other projects that are pending review. I encourage you to learn more about that process and join the group of reviewers.

Thanks to the dedicated reviewer(s) as well.

phaer’s picture

Nice, thanks for the reviews and the permissions. I will follow most of your recommendations when I find the time.

The logic in webform_tracking_boot() may not work with sites running out of multiple domains that don't use canonical
URLs, or potentially ones using Domain Access. See what happens here.

Yes, I will test that and add at least a warning to the project page if there is no easy way to fix this.

I *strongly* suggest a separate permission for viewing this data, or potentially two: one for viewing full data, and the
other for viewing slightly anonymized data (eg, X out the last octet of the IP).

Yes, I planned to do that but did not implement it yet because it works without an extra permission for our use-cases but its definitly planned.

The only way to opt out of this is with the DNT header?

Yes. Or maybe by blocking cookies, depending on the sites session_cache settings.

why do you need hook_boot() and cannot use some hook_form_FORM_ID_alter() when the actual webform is displayed? Please add a comment.

Because, as far as I understand it, hook_boot() also runs on cached pages, but hook_form_FORM_ID_alter() does not.

check_plain() should only happen when data is printed as output to HTML, not when it is saved to the database. Since your module is also responsible for display it should then sanitize the user provided data before printing to HTML. See https://drupal.org/node/28984 . webform_tracking_boot() for example should contain no check_plain() because it does not print anything. webform_tracking_webform_results_extra_data() might be the place, or does the webform module apply sanitization anyway after the hook?

The rationale behind this seems to be to keep the original user input editable which is not possible for webform_tracking anyway so I don't really see the benefit in this case, but will follow Drupal conventions.

I updated your account so you can promote this to a full project and also create new projects as either a sandbox or a "full" project.

Thanks! Highly appreciated, but in this case my colleague torotil already published webform_tracking as a full project for me because the review process took a bit longer than expected: https://drupal.org/project/webform_tracking

klausi’s picture

Then you should delete your sandbox to avoid any confusion which project should be used. Thanks!

phaer’s picture

Issue summary: View changes

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.