Download & Extend

Pager results page

Project:Webform
Version:5.x-2.x-dev
Component:Code
Category:feature request
Priority:normal
Assigned:Unassigned
Status:closed (fixed)

Issue Summary

We are using webform for handling sweepstakes with generally a lot of submissions. The last one had over 30k submissions. Once the webform_submitted_data gets over 400k rows it is basically impossible to export the data or even to view the results page.

I have two proposed fixes for this and a larger issue that needs to be addressed.

The first one is for webform.inc in which I changed the JOIN for webform_submissions and webform_submitted_data to an INNER JOIN instead of a LEFT JOIN. The LEFT JOIN takes significantly longer to run, and it seems to me that there is never going to be a case where there is a sid that exists in webform_submitted_data and not in webform_submissions.

mysql> SELECT sd.nid, sd.sid, s.submitted, sd.cid, sd.no, sd.data
-> FROM webform_submitted_data as sd
-> LEFT JOIN webform_submissions as s on (sd.sid = s.sid)
-> WHERE sd.sid = 828;
+------+-----+------------+------------+----+------------------------+
| nid | sid | submitted | cid | no | data |
+------+-----+------------+------------+----+------------------------+
| 8012 | 828 | 1172748445 | 1171467413 | 0 | Name |
| 8012 | 828 | 1172748445 | 1171467437 | 0 | Name |
| 8012 | 828 | 1172748445 | 1171467299 | 0 | email@example.com |
| 8012 | 828 | 1172748445 | 1171467327 | 0 | email@example.com |
| 8012 | 828 | 1172748445 | 1171467502 | 0 | address1 |
| 8012 | 828 | 1172748445 | 1171467523 | 0 | address2 |
| 8012 | 828 | 1172748445 | 1171467541 | 0 | city |
| 8012 | 828 | 1172748445 | 1171467561 | 0 | state |
| 8012 | 828 | 1172748445 | 1171467579 | 0 | zip |
| 8012 | 828 | 1172748445 | 1171467620 | 0 | 000-000-0000 |
| 8012 | 828 | 1172748445 | 1171468251 | 0 | gender |
| 8012 | 828 | 1172748445 | 1171468466 | 0 | b-day |
| 8012 | 828 | 1172748445 | 1172083375 | 0 | form_checkbox_1 |
| 8012 | 828 | 1172748445 | 1172083857 | 0 | form_checkbox_2 |
+------+-----+------------+------------+----+------------------------+
14 rows in set (1.03 sec)

Now run with INNER JOIN:

mysql> SELECT sd.nid, sd.sid, s.submitted, sd.cid, sd.no, sd.data FROM webform_submitted_data as sd INNER JOIN webform_submissions as s on (sd.sid = s.sid) WHERE sd.sid = 828;
+------+-----+------------+------------+----+------------------------+
| nid | sid | submitted | cid | no | data |
+------+-----+------------+------------+----+------------------------+
| 8012 | 828 | 1172748445 | 1171467413 | 0 | Name |
| 8012 | 828 | 1172748445 | 1171467437 | 0 | Name |
| 8012 | 828 | 1172748445 | 1171467299 | 0 | email@example.com |
| 8012 | 828 | 1172748445 | 1171467327 | 0 | email@example.com |
| 8012 | 828 | 1172748445 | 1171467502 | 0 | address1 |
| 8012 | 828 | 1172748445 | 1171467523 | 0 | address2 |
| 8012 | 828 | 1172748445 | 1171467541 | 0 | city |
| 8012 | 828 | 1172748445 | 1171467561 | 0 | state |
| 8012 | 828 | 1172748445 | 1171467579 | 0 | zip |
| 8012 | 828 | 1172748445 | 1171467620 | 0 | 000-000-0000 |
| 8012 | 828 | 1172748445 | 1171468251 | 0 | gender |
| 8012 | 828 | 1172748445 | 1171468466 | 0 | b-day |
| 8012 | 828 | 1172748445 | 1172083375 | 0 | form_checkbox_1 |
| 8012 | 828 | 1172748445 | 1172083857 | 0 | form_checkbox_2 |
+------+-----+------------+------------+----+------------------------+
14 rows in set (0.26 sec)

The patch for this is attached as webform_inc_0.patch

The second issue is that we need to paginate the submissions page so that the initial load of that page is not so slow. I will hopefully have some patches for this issue soon.

The final issue is that we need to figure out how to optimize these queries so that we don't have to iterate over these rows for each submission. In my testing, doing a large select for a nid instead of each sid show significant performance for exporting this data. I will test some code and get a patch asap.

If anyone has any additional thoughts / questions / answers, I would love to talk about this further.

steve rude

AttachmentSize
webform_inc_0.patch764 bytes

Comments

#1

Nice, thanks! I'll review and apply this patch went I have time. Please continue to contribute!

It would be nearly trivial to db_pager_query() the results page also as you suggested. Keep it coming :)

#2

I just realized we were using an older version of webform. The patch still applies, but there is now a _webform_fetch_submissions() that gets everything in one big query instead of connecting each time to the database. (ouch, we were using a very old version :| )

I am syncing ourselves with head now, and I am re-rolling the patch to include the INNER JOIN for _webform_fetch_submissions() as well as the singular version.

steve rude

AttachmentSize
webform_inc_1.patch 1.1 KB

#3

Haha, yeah I had a similar site :)

MTVUK uses webform for their competitions. It took 45 minutes to generate the CSV file before :D

It's now much faster with that single query.

#4

I did some additional testing with this based on the issue I was seeing with indices. I noticed that the WHERE clause using sd.nid causes a a pretty big delay in query time. If you use the s.nid instead it can see a pretty big performance boost. I will re-roll the patch as soon as I have a little more time.

mysql> SELECT COUNT(*) FROM webform_submissions as s INNER JOIN webform_submitted_data as sd ON sd.sid = s.sid WHERE s.nid = 8012;
+----------+
| COUNT(*) |
+----------+
|   442400 |
+----------+
1 row in set (0.36 sec)

mysql> SELECT COUNT(*) FROM webform_submissions as s INNER JOIN webform_submitted_data as sd ON sd.sid = s.sid WHERE sd.nid = 8012;
+----------+
| COUNT(*) |
+----------+
|   442400 |
+----------+
1 row in set (1.83 sec)

mysql> SELECT COUNT(*) FROM webform_submissions as s LEFT JOIN webform_submitted_data as sd ON sd.sid = s.sid WHERE sd.nid = 8012;
+----------+
| COUNT(*) |
+----------+
|   442400 |
+----------+
1 row in set (1.83 sec)

mysql> SELECT COUNT(*) FROM webform_submissions as s LEFT JOIN webform_submitted_data as sd ON sd.sid = s.sid WHERE s.nid = 8012;
+----------+
| COUNT(*) |
+----------+
|   442400 |
+----------+
1 row in set (0.39 sec)

steve

#5

I happen to need the pagination functionality, so I wrote my own patch, which I've attached here.

AttachmentSize
webform-paginate.patch 6.12 KB

#6

Title:problems with large amount of submissions» Pager results page
Status:active» needs work

Thanks Gribnif, could you make a setting in admin/settings/webform for the number of submissions displayed per page (with 0 being all results)?

#7

I've found that the default /node/X/results table is generally much more compact than the /node/X/results/table view, so I think these should have different per page settings. Agreed?

#8

Hmmm... or, better yet, allow the user to choose how many rows to display at once by adding a select list to the pages containing tables.

#9

gribnif,

I tried that patch for the latest version of webform (// $Id: webform.inc,v 1.48.2.12 2007/06/12 03:02:44) and webform_report (// $Id: webform_report.inc,v 1.15.2.3 2007/05/28 19:40:54) and I got the following error message.

Fatal error: Unsupported operand types in /includes/pager.inc on line 73

I know this issue is marked as HEAD, but, I just thought I'd check if anyone has got paged results (submissions) as opposed to just the one list.

ta,

Phil

#10

Marked http://drupal.org/node/250226 as duplicate. I'm still interested in this feature for anyone looking to implement it.

#11

Category:bug report» feature request
Status:needs work» active

The current version 5.2x uses the following query to get the submissions:

$query = 'SELECT s.*, sd.cid, sd.no, sd.data, u.name, u.mail, u.status '.
           'FROM {webform_submissions} s '.
           'LEFT JOIN {webform_submitted_data} sd ON sd.sid = s.sid '.
           'LEFT JOIN {users} u ON u.uid = s.uid '.
           'WHERE sd.nid = %d';

i think to get the pager working we need to split the submission_data in a separate query, or?

regards pebosi

#12

i got it working with changing the function like this:

function webform_get_submissions($nid, $header = NULL, $uid = NULL) {
  $query = 'SELECT s.*, u.name, u.mail, u.status FROM {webform_submissions} s
    LEFT JOIN {users} u ON u.uid = s.uid WHERE s.nid = %d';
   
  if ($uid) {
    $query .= ' AND u.uid = %d';
  }
 
  if (is_array($header)) {
    $query .= tablesort_sql($header);
  }
 
  $res = pager_query($query, 50, 0, NULL, $nid, $uid);

  $submissions = array();

  // Outer loop: iterate for each submission.
  while ($row = db_fetch_object($res)) {
    $submissions[$row->sid]->sid = $row->sid;
    $submissions[$row->sid]->submitted = $row->submitted;
    $submissions[$row->sid]->remote_addr = $row->remote_addr;
    $submissions[$row->sid]->uid = $row->uid;
    $submissions[$row->sid]->name = $row->name;
    $submissions[$row->sid]->status = $row->status;
   
    foreach (webform_get_submission($nid, $row->sid) as $submission_data) {
      $submissions[$row->sid]->data = $submission_data;
    }
  }

  return $submissions;
}

please review ;)

#13

Version:» 5.x-2.x-dev

#14

I implemented #12 and it seems to work fine. However, there are no pager links at the bottom of the page. So to advance to the next page you have to manually append ?page=1 to the end of the URL in the web browser.

#15

No you dont need to add the page to the url. just override the theme function " $output .= theme('table', $header, $rows) . theme('pager', NULL, 50, 0, $nid);" and change the last line to the following code:

  $output .= theme('table', $header, $rows) . theme('pager', NULL, 50, 0, $nid);
  result $output;

#16

sorry, the function should be "phptemplate_webform_results_submissions"

#17

Status:active» needs work

Could this be written as a patch for me to test? Also, the theme function should include the pager if necessary for the user, we can't be expecting people to theme the page if they have more than 50 results.

Finally, I would *love* if there was an option to adjust how many results are shown per page. Simply adding a few links at the top of the page for results per page: 10, 25, 50, 100, all. Each setting a param on the page ?count=x. Then pulling in that $_GET['count'] into the page query (make sure to check for integers first).

#18

i could create a patch but in the current dev Version the query has changed so that my changes from #12 dont work. Do you really need the submitted data in webform_get_submissions ?

#19

Yes, absolutely. We can't change the API function webform_get_submissions unless we create another whole branch (which would become Webform 3.0). Considering that we're still getting people migrated to 2.x, there's no way we can change this function currently.

#20

Status:needs work» needs review

I have form with thousands of submissions and I need this feature. So I've made a little patch for myself.
You could use it if you want. It's for version 5.x-2.3.
I tried to add paging only to result page — download option still gets all available submissions.

AttachmentSize
webform_paging.patch 3.49 KB

#21

Status:needs review» needs work

You should be using the Drupal function pager_query() (http://api.drupal.org/api/function/pager_query/6) rather than generating your own custom constructed query. "IN" clauses are also rather inefficient, but certainly faster than loading everything ;).

#22

You should be using the Drupal function pager_query() (http://api.drupal.org/api/function/pager_query/6) rather than generating your own custom constructed query. "IN" clauses are also rather inefficient, but certainly faster than loading everything ;).

I am using pager_query() for $pager_query result. I've tried to use this function for main query (SELECT s.*, sd.cid, sd.no...) but it isn't possible, because it get many rows for every submission — each with single field data. So pager_query not suitable in this case. So first I gets only submissions by pager_query, one submission for row, and then I use IN() for filter field data by selected submissions.

Maybe my solution not very elegant =) but I've tried to limit module code patching. With current queries I can't offer another variant.

#23

Ah, okay fair enough. I just browsed over the code and noticed the lack of pager_query() in that case. I think I'd be fine with this patch, though we should change the $paged variable to be something like $pager_count that determines the number of results per page. A value of 0 would mean all results.

#24

Ok, here is version with $pager_count variable. I think you right - this variant is better, but maybe you want to make for this number option in the module settings?
Plus, I've added paging to viewing results as table. All other things the same.

AttachmentSize
webform_paging_2.patch 4.64 KB

#25

Status:needs work» needs review

Awesome, thanks! I'll review it when I get a chance and put it in.

#26

Status:needs review» fixed

As per #6, I wanted users to be able to select the number of results per page. This revised patch does *not* make a setting for it at admin/settings/webform (though that would be great to add later). Based on the excellent patch by beholder, I've modified it slight so that it only defaults to 50 results per page, but a new option at the top of the page allows the user to change it to a higher number. The options to select number of results per page only shows up if there are more than 20 submissions (which is the smallest increment).

Thanks for all the work Gribnif, pebosi, and beholder! I've committed this patch and I hope to have a 2.4 version out soon, this was my one last feature hold up with the release.

AttachmentSize
webform_paging.patch 12.4 KB
webform_paging5.patch 10.5 KB

#27

yay! i can't believe it was 21 months ago that i opened this. :) good job.

#28

Status:fixed» closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.

nobody click here