Given that there are several duplicates of the same issue, but none that stood out to cover all parts of the problem, I wanted to create this "Master Issue" that could be used to cover all parts of this optimization & configuration effort.

Many users have complained of problems exporting aka downloading the results/submissions from webforms, either as CSV/delimited files or as Excel spreadsheets with either error messages or more commonly just blank screens after several minutes waiting. Further, the watchdog log / dblog may not record any error messages.

There are two parts to this problem:

  1. memory limits building the output
  2. timeouts because it takes too long to finish building the output

Solving Problem 1: Memory Limits

Resolving memory limits may be as simple as adjusting the .htaccess file where it says:

# PHP 5, Apache 1 and 2.
<IfModule mod_php5.c>
  php_value magic_quotes_gpc                0
  php_value register_globals                0
  php_value session.auto_start              0
  php_value mbstring.http_input             pass
  php_value mbstring.http_output            pass
  php_value mbstring.encoding_translation   0
</IfModule>

just add the following extra lines before the line:

  # New settings:
  php_value memory_limit 128M

Unfortunately some web hosts may limit your control over this and may not allow the memory_limit value be configured through a .htaccess file, in which case it may have to be modified in the system's primary php.ini file. The standard system requirements call for this to be 16M (i.e. 16 megabytes) but you may find a need for it to be 64M, 96M or even 128M depending on the number of records or fields in the webform.

Solving Problem 2: Timeout

The PHP configuration variable max_execution_time controls how long the PHP parser will allow a script to run before being forcibly terminated. By default this is set to 30 seconds, but depending on the number of records and fields this will probably not be sufficient. Similarly to the memory limit, this can be configured through the .htaccess file, as follows:

  # New settings:
  php_value max_execution_time 180 #seconds

Unfortunately, even this may not be sufficient as it may still take a very long time to complete.

Proposed Solution

The output generation is controlled by the webform_results_download() function in webform_report.inc which generates a delimited file in the desired format. Tracing through the file, the initial lines run rather quickly but the "foreach ($submissions..." block can still take several hundred miliseconds to save each record, which for a heavily used webform can still cause a timeout even with a generous max_execution_time. This structure needs to be optimized to be much faster.

Comments

damienmckenna’s picture

StatusFileSize
new0 bytes

I added a ton of microtimer() statements to the webform_results_download() function and recorded the execution time taken to output ten records on my MacBook Pro. The top row indicates what the field was to store, the second indicates the time taken to output that one cell; the first column contains the row number, second-last column indicates how long the fwrite() call took, the last column indicates how long the processing of the entire record took; the State select field stores a list of all US States, Canadian provinces and US outlying territories grouped with optgroups, while the Country select stores a list of ~240 countries.

As you can see, textfield and email fields fly through in no time, but selects take a very long time, especially if there are a lot of predefined options to check against.

damienmckenna’s picture

StatusFileSize
new1.7 KB

Oops, I renamed the file before uploading..

damienmckenna’s picture

Here's a patch for _webform_filter_values() that keeps track of whether it has already loaded the user profile, which saves on a large number of unneeded queries.

damienmckenna’s picture

StatusFileSize
new1.74 KB

FYI the first patch has boosted the local execution time from ~0.7s per row to ~0.29s, as seen in the attached CSV.

damienmckenna’s picture

Another patch, this one keeps a static array of the processed options variable from components/select.inc, indexed by the md5sum of the unprocessed options variable.

damienmckenna’s picture

StatusFileSize
new1.74 KB

FYI the second patch has given an approximate 100:1 speed optimization, as seen in the attached CSV.

damienmckenna’s picture

Status: Active » Needs review

I've tested both of the patches on a copy of our production database and I can now successfully output a 6877 record webform in a matter of minutes versus it timing out after ten minutes having barely gotten half-way. Please test the patches and provide feedback. Thank you.

damienmckenna’s picture

Title: Timeouts and memory errors exporting/downloading CSV/Excel spreadsheet of results » Timeouts and memory errors exporting/downloading CSV/Excel spreadsheet of results (D5, D6)
StatusFileSize
new4.83 KB

Please also test the attached cumulative patch for Dupal 5.

damienmckenna’s picture

Have identified a problem stemming from not separating the cached select lists based on whether they were set to be $flat or not. Here's a new patch for D6 that takes this into account.

damienmckenna’s picture

StatusFileSize
new4.94 KB

An updated patch for D5 based on the bug discovered for comment 10.

damienmckenna’s picture

FYI the problem identified in comment 10 would only be a problem when using option groups (optgroup).

leenwebb’s picture

FWIW, my issue at #407460: "Download" results link leads to blank screen is NOT a timeout issue. My webform had 5 fields, and only 4 submissions -- I'm pretty certain that if the server can handle views and aggregator and all of the other parts of that site, it's not a memory error that leaves this screen blank.

damienmckenna’s picture

Just to see, please try the patch and see if it makes a difference.

leenwebb’s picture

Oooh, turns out I had some other problem. I tried to apply the patch and got a big FAIL message. So I re-extracted the webform module and now everything works perfectly (sans patch). Thanks for your help!

(I always think it is so weird when that happens -- I expect that if some file gets corrupted I will end up with big PHP errors and a huge mess, not some stealth unobtrusive error.)

damienmckenna’s picture

leenwebb: What was the error message? D5 or D6? I patched against the relevant branch in CVS rather than the release versions.

leenwebb’s picture

D5; it was a patch error that said that Chunk 2 had failed during the webform.module patching. I figured that since I hadn't changed anything in the actual webform module, something was amiss! Hence the re-extraction of a re-downloaded zip and now everything works as expected.

fredcy’s picture

I came up with a similar patch (to #4) before seeing this issue, and for me it alone speeds things up enough to avoid the PHP timeout problem.

jakraska’s picture

Version: » 6.x-2.6

I applied the patches in #4 and #6 and it fixed my timeout issue - Havent noticed any bad effects yet. Thanks for the patch

damienmckenna’s picture

jakraska: thanks. Maybe we'll be able to twist Quicksketch's arm into pushing out an update to lock it in? :)

quicksketch’s picture

Status: Needs review » Needs work

I definitely like the improvements to _webform_filter_values(). There's no good reason to load that multiple times. The changes to select.inc look extremely suspicious though. Right now there's accounting in the static variable for "$flat = TRUE" and "$flat = FALSE", but it doesn't account for multiple select lists that are both flat or both not flat. It looks like this would erroneously return options for the first select list in the form for all the remaining select lists. Perhaps the $cid needs to be passed into the select list options.

A small code-style issue also: comments should be formatted as sentences, starting with capital letters and ending with periods.

quicksketch’s picture

Title: Timeouts and memory errors exporting/downloading CSV/Excel spreadsheet of results (D5, D6) » Improve efficiency of options in select.inc

Just a note that caching to _webform_filter_values() have been included in the 2.7 version. The select list options would still be good to cache, but they still require work per my comments in #21.

damienmckenna’s picture

Thanks, I'll look into it.

quicksketch’s picture

Category: bug » task
Status: Needs work » Closed (fixed)

Unless there's further interest in completing these changes to the select list options, this will probably go unchanged. The caching in _webform_filter_values() already has corrected the bulk of the problem this issue was trying to solve.

damienmckenna’s picture

I've not been able to get time to work on this since last Summer, I'll try out the latest release and see if there are any other improvements I can submit.

leprechau’s picture

Status: Closed (fixed) » Needs work

I would like to take a look at this issue some more. I am currently running webform-6.x-3.0-beta2 and routinely have 30k+ submissions to my monthly sweepstakes webform. If I disable the output of my two option selects for country and state/province I can download all results with just a few seconds total processing time. However, if I enable both of those selects the process times out even with 512M max script size and 600 seconds max execution time which just seems totally ridiculous. This is definitely something that needs more work.

leprechau’s picture

I just modified select.inc based somewhat on the earlier patch in #10 taking into account the comments in #21 and decided to just not cache non-flat select lists. I can now successfully generate and download a webform with 40k results resulting in a 9.5MB csv with just a few seconds parsing time. I am going to go ahead and make a patch and post it. It would be great to get this included in 3.0 final as I would almost consider it a bug.

leprechau’s picture

StatusFileSize
new4.25 KB

Here is the patch I am currently using...it is working fine for me both on webform display and when downloading results. Please comment on your thoughts.

quicksketch’s picture

Thanksk leprechau! I don't understand why $flat determines when an item is cached or not. For select lists, $flat = FALSE, but for both radios and checkboxes this is TRUE. So we're only caching for select lists? This patch also deviates pretty far from coding standards, if you could clean it up a bit that would be great.

leprechau’s picture

Yes, I was only interested in caching select lists. I can't see a situation where someone would have dozens of check/radio boxes listed down a page. I think most people when listing several dozen options would use a select list. I may be incorrect but it seemed rational to me.

I will re-read the drupal coding standards page and make changes where appropriate.

leprechau’s picture

Okay after some input from quicksketch on irc I think this should be committable. The code is cleaner and less involved than previously suggested patches.

quicksketch’s picture

Version: 6.x-2.6 » 6.x-3.0-beta2
Status: Needs work » Fixed
StatusFileSize
new3.54 KB

Thanks leprechau, I shortened up the variable name for readability (just $options until the end of the loop, then assign it to $cached_options[$flat][$md5]). Committed to both 3.x branches.

leprechau’s picture

Looks great, thank you.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Jorrit’s picture

Would it be possible to apply those changes to 6.x-2.x as well?

ldweeks’s picture

I'm running a site with a webform that currently has just over 500 submissions. It's a large form, and I get a fatal error when I try to download a CSV file.

I would also be very interested in a port of #32 to the 2.x branch. Thanks for your work!

EmanueleQuinto’s picture

Version: 6.x-3.0-beta2 » 6.x-2.9
Status: Closed (fixed) » Needs review
StatusFileSize
new3.14 KB

This patch is a backport of #32 and apply to Webform 6.x-2.9 (select.inc, v 1.22.2.30) and, at least, works for us.

Please disregard this attached patch, small typo at the end

EmanueleQuinto’s picture

StatusFileSize
new3.15 KB

This should work.

alanburke’s picture

Subscribe

alanburke’s picture

Patch at 38 is working fine for me on a 6.2.9 install.

machi27’s picture

hi, could anyone pls post the file itself (select.ini patched) for version 6.2.9, i dont have the tools to patch it here

thanks

quicksketch’s picture

Status: Needs review » Closed (fixed)

This will not be backported, I suggest upgrading to 3.x now that it is the recommended/stable release.

mrfelton’s picture

StatusFileSize
new3.2 KB

Here is the patch from #38 against 6.x-2.x that will apply from drush make, for those that need it.

luthien’s picture

I'm having the same problem with 6.x-3.18 version. Blank page for table view and download results view. Only 800 submissions, the form has several select.

memory_limit = 128M
max_execution_time = 30 (180 same results)