Multipage forms with CCK
Sometimes forms are too long to fit on a single web page without looking sloppy. Many folks on the Drupal forums have expressed interest in building multipage or multistep forms, but it turns out it’s not the easiest thing to do. Let me set up what I’m trying to do. Then I’ll talk about what others have done. Finally we’ll delve into my solution.
The Goal
The handbook page “Dynamic and Multipage forms with Forms API” (http://drupal.org/node/101707) gives an excellent introduction and simple solutions to more complex form situations. The author describes three scenarios, one of which is the “large form divided into multiple steps for ease of use”. This is the scenario addressed here. You create a custom form so that in the hook_form you hide fields not to be displayed and in the hook_submit, you submit the form only after you’ve reached the last step.
But we need to go a couple steps farther. First, we are building our form using the Content Construction Kit (CCK). The CCK module stores data in special tables, but at a higher level, the content type is a node type. The solutions provided in the link above are given for generic non-node types. In fact, comments on that handbook page indicate people still don’t know how to do it for node types. My solution, though specific to CCK types, can be altered to work for other or custom node types.
Assume we have a very long CCK form with very many fields. When a user goes to create new content of this CCK type, they are presented with a subset of the fields on the first page, click submit, and are then presented with a different subset of fields on the second page, and so on. On the last step, all of the fields are finally submitted as a node. Additionally, each step does its own validation so that required fields or badly formatted input must be fixed before going to the next step. The interface to edit the form after submission is exactly the same as that for a new form, except that the previously entered values are already filled in.
Other attempts
There are a couple of modules that attempt multipage forms:
- Pageroute
- CCK Wizard
Pageroute
The pageroute module allows you to create a chain of pages that will be displayed, one after the other, in the order specified. This could certainly work for a multipage form, but the problem is that each form is submitted as a separate node. Finding a way to reference them all is somewhat messy. And forget about making a simple view!
CCK Wizard
The CCK Wizard module creates a CCK field which you insert into your CCK type to mark where your pages will be split. Recently the code has been upgraded to work with Drupal 5. But the code is not ready for production. In fact, I had trouble getting it to work at all.
That aside, because there are a limited number of fields you can have in a CCK type (due to the limited default range of the ‘weight’ field), it is most practical in long forms to create field groups for your fields. Then you separate the field groups by these Wizard fields. CCK Wizard works by changing field types to “hidden” in the $form array. Unfortunately, changing a field group to type “hidden” results in the fields within that field group not being propagated to the next step in the form. The field’s structure is lost, as are any values that you have saved in them.
So far, CCK Wizard has not addressed fieldgroups. If it eventually does (and fixes a couple other bugs) it might be a pretty good solution for multistep forms.
Form API 3.0
I haven't looked into Drupal 6.0 yet, but I kept reading this and that about FormAPI changes that will supposedly make multi-step forms much easier. I don't know anything concrete though.
Our Solution
Some Implementation Decisions
Our example CCK content type has so many fields that we either have to enlarge the range of field “weights” by modifying the core, or have to place the fields into field groups. I like the latter solution as it presents fewer problems for minor core version upgrades and is easier to organize. We are going to place multiple fields inside of field groups, and each field group represents a single step in the multistep form. In fact, other than the title and body, all fields are going to be in fieldgroups.
We are going to need to write PHP code that will implement the following three hooks:
- hook_node_form() – preparation for making a multi-step form
- hook_form_alter() – modifying the form to have only the fields we want
- hook_form_submit() – storing the form values from all steps into the node
It is therefore necessary to create our own module, which we will have to enable in the admin section. This module will not make any DB changes, so we don’t have to worry about future module revisions causing DB inconsistencies.
In order to ensure that our hooks get called, we need strict naming conventions for our hooks. We really only need the _node_form hook to be named after the machine-readable name of our content type (i.e., our_cck_type_node_form). But for consistency, we name the other hooks similarly.
General Flow
In this flow, a whole slew of details are missing. But it gives a general idea of where our module comes into play. Some familiarity with the FormAPI and the Form workflow is helpful to understand this.
- The user initiates creating new content of our CCK type (node/add/our_cck_type).
- drupal_get_form is called, which (several layers deep) calls our_cck_type_node_form(), which plays the role of our hook_form.
- At this point, none of our CCK fields are present. But we need to add our ‘step’ hidden field to the form, and a couple of other things.
- During the hook_form_alter step, the CCK module finally adds the CCK fields.
- Our hook_form_alter function is called, our_cck_type_form_alter(). Here we “hide” the fields that aren’t to be shown and make it so that our submit function is called instead of node_form_submit.
- The user enters values into the visible fields and clicks “Submit”.
- Our hook_submit function, our_cck_type_submit() is called (several layers deep in drupal_get_form) and the form data is saved for later use.
- If we are not on the last step, go back to the second item in this list (drupal_get_form is called again for the next step).
- The last step has been submitted, so we retrieve the previously stored form values and manually call node_form_submit() so that the node is created and the values are stored to the DB.
The Hooks
Here are our three hooks.
// Declared global explicitly: Drupal includes module files from inside a
// function, so a bare top-level variable would not be visible to the hooks.
global $max_step;
$max_step = 6;

/**
 * This gets called before the $form_values are filled with the CCK fields.
 * This is where I set the 'step' hidden field so I know what step I'm in.
 * There are also a couple of things in here that are necessary to do multipage
 * forms and are specified in the Drupal API.
 */
function our_cck_type_node_form ($node, $form_values = NULL)
{
  global $max_step;

  $form = array();
  // Determine the current step from the 'step' hidden field.
  if (!isset($form_values['step']))
    $step = 1;
  else
    $step = $form_values['step'] + 1;
  $form['#multistep'] = TRUE;
  $form['#redirect'] = FALSE;
  $form['step'] = array (
    '#type' => 'hidden',
    '#value' => $step,
  );
  // Save the form IDs from previous steps
  if (is_array($form_values))
  {
    foreach ($form_values as $k => $v)
      if (preg_match('/^build_id/', $k))
      {
        $form[$k] = array (
          '#type' => 'hidden',
          '#value' => $v,
        );
      }
  }
  // Save the form ID from the last step
  if ($step > 1)
    $form['build_id' . ($step - 1)] = array (
      '#type' => 'hidden',
      '#value' => $form_values['form_build_id'],
    );
  $form = array_merge($form, node_form($node));
  // This is where people land after submitting the form.
  if ($step == $max_step)
    $form['#redirect'] = 'user';
  return $form;
}

The function is our_cck_type_node_form(). This is not precisely a hook_form. Because CCK types are node types, adding a node is done in the menu callback for ‘node/add/’. That callback is [type]_form() if it exists, or node_add(). In our case, it’s the latter (otherwise it would be called our_cck_type_form). node_add() calls our_cck_type_node_form(), which is preferable because then the node core module will do access control checking and set up some initial variables.
What is done in this function is pretty much what the “Dynamic and Multipage …” handbook page says should be done. First I determine what step I’m on based upon the existence and value of the ‘step’ hidden field. The ‘#multistep’ field ensures that drupal will store the previous step’s values so that they are retrievable during hook_submit. Setting the ‘#redirect’ field to FALSE will prevent drupal from redirecting to a completely other page after the submit button is pressed. And then we set the ‘step’ hidden value in the form so that our module knows what step we are on.
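The step logic described above can be tried outside Drupal. The following is only a minimal sketch: example_next_step() is a hypothetical helper name (it is not part of Drupal or of the module on this page), and the submitted values are assumed to arrive as a plain array.

```php
<?php
/**
 * Hypothetical helper: work out which step to display next from the
 * values submitted by the previous step (NULL on the first page load).
 */
function example_next_step($form_values = NULL) {
  // No 'step' value yet means the form is being shown for the first time.
  if (!is_array($form_values) || !isset($form_values['step'])) {
    return 1;
  }
  // Otherwise the hidden field holds the step that was just submitted.
  return intval($form_values['step']) + 1;
}

echo example_next_step(), "\n";                    // first display
echo example_next_step(array('step' => 1)), "\n";  // after submitting step 1
```

Because hidden field values come back from the browser as strings, the sketch casts with intval() before adding one.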
We merge the currently created form with the form created by node_form (so that we get all the node goodies like UID, title, etc.). Our example node has six steps, specified in the $max_step variable. On the sixth step, we actually do want to redirect the page after submission, so that is done. I hardcoded ‘6’ into this example, but it could probably be done more generally by looking at the number of fieldgroups in the CCK type. Lastly we return the form, just as a good hook_node_form should do.
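As a rough illustration of deriving the number of steps from the fieldgroups instead of hardcoding it, here is a sketch that counts top-level keys starting with “group_”. The helper name example_count_steps() and the sample array are assumptions made for this example. Note also that, according to the flow described earlier, the fieldgroups only exist in the form once the form_alter stage has run, so a count like this would have to happen there (or be read from CCK's own configuration).

```php
<?php
/**
 * Hypothetical helper: count the top-level fieldgroup elements in a form
 * array, on the assumption that each group corresponds to one step.
 */
function example_count_steps($form) {
  $count = 0;
  foreach (array_keys($form) as $key) {
    // CCK fieldgroup keys start with "group_".
    if (strpos((string) $key, 'group_') === 0) {
      $count++;
    }
  }
  return $count;
}

// Stand-in for a real $form array; only the keys matter here.
$form = array(
  'title' => array(),
  'group_1' => array(),
  'group_2' => array(),
  'group_3' => array(),
  'body_filter' => array(),
);
echo example_count_steps($form), "\n";
```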
You’ll notice I skipped a big section having to do with ‘build_id’. I did that intentionally because it is better explained when I’ve gone through the submit hook.
/**
 * This gets called after the CCK fields are present. So I hide everything that
 * isn't supposed to be displayed on this step. The fields to be displayed are all in
 * field groups, one for each step. So I 'unset' all other fieldgroups. This gets rid
 * of all subfields. Then I do some munging of other fields.
 */
function our_cck_type_form_alter ($form_id, &$form)
{
  global $max_step;

  if ($form_id != 'our_cck_type_node_form' || arg(0) == 'admin')
    return $form;
  // There are six field groups in the CCK node, one for each step. This is not a coincidence.
  $group = array("none", "group_1", "group_2", "group_3", "group_4", "group_5", "group_6");
  $step = intval($form['step']['#value']);
  $prefix = $group[$step];
  // Here I 'unset' field groups in the form that are not associated with the current step. This
  // hides them, in effect.
  if (is_array($form))
    foreach ($form as $key => $element)
      if (preg_match('/^group_/', $key))
        if (!preg_match("/^$prefix/", $key))
          unset($form[$key]);
  // This makes drupal call my submit hook instead of node_form_submit
  $form['#submit'] = array('our_cck_type_submit' => array());
  // This sets the title to read-only so it can't be edited after page 1
  if ($step > 1)
    $form['title']['#attributes'] = array('readonly' => 'readonly');
  // This changes the submit button text.
  if ($step < $max_step)
    $form['submit']['#value'] = 'Go to Step ' . ($step + 1);
  else
    $form['submit']['#value'] = 'Save Form';
  // And this hides the node body
  $form['body_filter']['body']['#type'] = 'hidden';
  // This is removed because it messes up the page order
  unset($form['preview']);
  return $form;
}

All form_alter hooks are called after the form array has been put together and just before it is going to be rendered in HTML. It is an interesting bit of trivia that CCK adds all of its fields in a form_alter hook, because it is altering a node form. Our form_alter hook must be called after the CCK hook (I'm not currently sure how to ensure this, but it seemed to work fine by default. Perhaps it is the order in which the modules are enabled that matters?).
Since I only want to modify our form, I ensure we’re dealing with the ‘our_cck_type_node_form’. I also don’t want to make any changes to the form if we are in ‘admin’, because otherwise this causes problems when trying to edit the content type itself. So if we don’t pass either of these tests, we return the unaltered form.
After this, we know we are dealing with the form we want to alter, so we get some preliminary information first. We create our $group array, which coincides one-for-one with the names of the fieldgroups in our form (CCK automatically adds the “group_” prefix to the machine-readable name of the fieldgroup). The first element (“none”) is just an unused placeholder for index 0. We retrieve our step value from the hidden field (set in the node_form hook) and then we retrieve the fieldgroup name we are looking for.
Here’s the deal – we want to hide all fieldgroups except the one associated with this step. But we don’t want to hide all the rest of the node information (or maybe we do, I'm not sure?). So we go through the form array, looking for any fields that fit our generic name of a fieldgroup (simply “group_”). Now we need to decide whether this fieldgroup should be hidden, so we compare it to the field group that should be displayed in this step. If it doesn’t match, then we need to hide it. And we hide it by ‘unset’ting the entire fieldgroup form value. This way the fields don’t even show up in the HTML.
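The filtering idea can be sketched on a plain array, outside Drupal. This is an illustration under assumptions, not the module's code: example_keep_current_group() is a hypothetical helper, and it compares group names exactly rather than by prefix. The prefix match used in the form_alter hook is fine for six groups, but with ten or more a prefix of “group_1” would also match “group_10”.

```php
<?php
/**
 * Hypothetical helper: drop every top-level "group_*" element except the
 * one that belongs to the current step. Non-group elements are kept.
 */
function example_keep_current_group($form, $current_group) {
  foreach (array_keys($form) as $key) {
    $key = (string) $key;
    $is_group = (strpos($key, 'group_') === 0);
    // Exact comparison, so "group_1" does not also keep "group_10".
    if ($is_group && $key !== $current_group) {
      unset($form[$key]);
    }
  }
  return $form;
}

// Stand-in for a real $form array.
$form = array(
  'title' => array('#type' => 'textfield'),
  'group_1' => array('field_a' => array()),
  'group_2' => array('field_b' => array()),
  'group_10' => array('field_c' => array()),
);
$visible = example_keep_current_group($form, 'group_1');
print_r(array_keys($visible));
```

In this sample only the title element and group_1 remain; the other groups are removed from the array entirely, which mirrors the ‘unset’ approach described above.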
The “Dynamic and Multipage …” handbook page says that the fields should be hidden in order to propagate field values through the different steps. Well that’s one way to do it, but it doesn’t work with CCK fieldgroups. That is because changing a fieldgroup form array type value to “hidden” causes the fieldgroup to collapse, and all the fields under it are now lost. There are other things you can do, like recursing into fieldgroups and hiding non-fieldgroup elements, but that is not how we chose to do it here. As we will see when analyzing the submit hook, field values are propagated using the built-in PHP $_SESSION variable. And since the values are stored on the server, we can completely remove them from the form using ‘unset’.
After this we set ‘#submit’ so that after clicking the submit button, drupal will call our_cck_type_submit, instead of the default for nodes, node_form_submit. After this is a bit of custom cleanup stuff. First, I didn’t want the user to be able to edit the title after the first page, so I made the field read-only. Then I changed the text on the submit button depending upon what step we are in. Then I hid the body field. In this example CCK type, we still need the body field, but we don’t want the user to be able to see it during the form submission. It’s just a specific need I had for this specific application. Next I unset the ‘Preview’ button that comes by default with all node forms, since it really gets page order out of whack. And lastly, I return the altered form.
/**
 * Except when submitting the last step, the form values need to be stored in memory. On the last step,
 * the form values are finally stored in the database (as a node).
 */
function our_cck_type_submit($form_id, $form_values = NULL)
{
  global $max_step;

  // Here, I'm saving the form values that were just submitted in memory.
  $args = $_SESSION['form'][$_POST['form_build_id']]['args'];
  $_SESSION['myform'][$_POST['form_build_id']] = array('timestamp' => time(), 'args' => $args);
  // Get rid of data stored more than 24 hours ago.
  _clean_myform_sessions();
  // If we didn't just submit the last step, then we have nothing else to do.
  if ($form_values['step'] != $max_step)
    return;
  // The following several lines of code merge all previously saved form data into a single
  // array called $values.
  $values = array();
  $args = array();
  if (is_array($form_values))
    foreach ($form_values as $k => $v)
      if (preg_match('/^build_id\d/', $k))
        if (is_array($_SESSION['myform'][$v]['args']))
          $args = array_merge($args, $_SESSION['myform'][$v]['args']);
  if (is_array($_SESSION['form'][$_POST['form_build_id']]['args']))
    $args = array_merge($args, $_SESSION['form'][$_POST['form_build_id']]['args']);
  if (is_array($args))
    foreach ($args as $k => $v)
      if (is_array($v))
        $values = array_merge($values, $v);
  $values = array_merge($values, $form_values);
  // Now the entire form is in $values, let's finally submit the node
  return node_form_submit($form_id, $values);
}

/**
 * Remove form information that's at least a day old from the
 * $_SESSION['myform'] array.
 */
function _clean_myform_sessions() {
  if (isset($_SESSION['myform'])) {
    foreach ($_SESSION['myform'] as $build_id => $data) {
      // 86400 seconds = 24 hours.
      if ($data['timestamp'] < (time() - 86400)) {
        unset($_SESSION['myform'][$build_id]);
      }
    }
  }
}

This is by far the ugliest of the custom hooks and much of what is here is a kludge. But it works! This submit hook is going to be called every time the user clicks the submit button. But we don’t want to save any data in the database until the very last step. So assuming we are not on the last step, we want to save values submitted in the previous step to a more permanent location in memory. That location is the $_SESSION variable. First a bit of esoteric background.
Each instance of a form has a randomly generated, hopefully unique, “form_build_id” associated with it. This value is stored in the form as a hidden field. In a multistep form, the $form array from the previous step is saved in $_SESSION[‘form’] keyed on the value of ‘form_build_id’. This allows later retrieval of the form values in the case of multistep forms.
The problem is, for some reason after the submitted form is processed, the ‘form_build_id’ key in $_SESSION is deleted in the drupal core code. If we have more than two steps, that’s a problem. So, we have to save the form some other way. We store the form in a different place in $_SESSION, namely $_SESSION[‘myform’], and key the form on the ‘form_build_id’. This is how we ensure form values are propagated from step to step. That is what the first two lines of the function do.
Since we are in charge of $_SESSION[‘myform’], we need to make sure it gets cleaned up occasionally, otherwise over time we would run out of memory. So we call _clean_myform_sessions(), which is almost a direct copy of a similar function in the drupal core, called from drupal_get_form().
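To make the expiry rule easier to try in isolation, here is a sketch of the same idea that works on any array and takes the current time as a parameter. The name example_clean_saved_steps() and the sample build IDs are assumptions for illustration; the cutoff is assumed to be 86400 seconds (24 hours).

```php
<?php
/**
 * Hypothetical variant of the cleanup: works on any array of saved steps
 * and takes the current time as a parameter so it can be tested directly.
 */
function example_clean_saved_steps($saved, $now, $max_age = 86400) {
  foreach ($saved as $build_id => $data) {
    // Drop entries whose timestamp is older than the cutoff.
    if ($data['timestamp'] < ($now - $max_age)) {
      unset($saved[$build_id]);
    }
  }
  return $saved;
}

$now = 1000000;
$saved = array(
  'build-old' => array('timestamp' => $now - 90000, 'args' => array()),
  'build-new' => array('timestamp' => $now - 60, 'args' => array()),
);
$saved = example_clean_saved_steps($saved, $now);
print_r(array_keys($saved));
```

Only the recent entry survives; the entry saved 90000 seconds earlier is past the 24 hour cutoff and is removed.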
In the next lines of code, if the user isn’t on the last step yet, we return, which in essence tells drupal core that we have taken care of the data and that it can go forward. We have taken care of the data: we stored it in $_SESSION.
On the other hand, if the user had submitted on the last step, we finally get to store the data in the database. In order to ensure that all of the values submitted in all steps are stored in the database, we need to merge all of the previously submitted values with the current values. This is done by merging all of the $form arrays from all of the previous steps with the current form array. And those previous $form arrays had all been stored in $_SESSION[‘myform’].
Remember that the key to retrieving these $form arrays is the ‘form_build_id’. Well, each of these form steps have been getting a new ‘form_build_id’. So in order to retrieve all of the previous ‘form_build_id’s, we needed to maintain a list of them. Fortunately, we took care of that in our_cck_type_node_form(). This is the chunk of code we skipped. In that chunk of code we are creating custom hidden fields (named “build_id#”) containing the ‘form_build_id’s of previously saved form steps. We finally get to do something with them when submitting the last step.
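The bookkeeping of build IDs can be illustrated on plain arrays. In this sketch, example_build_id_fields() is a hypothetical helper and the ID strings are made up; it returns the name/value pairs that would be carried forward as hidden fields when building a given step.

```php
<?php
/**
 * Hypothetical helper: collect the "build_id#" values to carry forward
 * when building step $step from the previous step's submitted values.
 */
function example_build_id_fields($form_values, $step) {
  $fields = array();
  if (!is_array($form_values)) {
    return $fields;
  }
  // Keep every build ID that earlier steps already recorded.
  foreach ($form_values as $key => $value) {
    if (strpos((string) $key, 'build_id') === 0) {
      $fields[$key] = $value;
    }
  }
  // Record the build ID of the step that was just submitted.
  if ($step > 1 && isset($form_values['form_build_id'])) {
    $fields['build_id' . ($step - 1)] = $form_values['form_build_id'];
  }
  return $fields;
}

// Values posted by step 2, about to build step 3 (IDs are made up).
$posted = array(
  'step' => 2,
  'build_id1' => 'aaa111',
  'form_build_id' => 'bbb222',
);
print_r(example_build_id_fields($posted, 3));
```

For this sample the result holds build_id1 and build_id2, so by the last step the form carries one recorded ID per earlier step.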
So the several obscure lines of code in our_cck_type_submit() are a bunch of merging of $form arrays, current and past, into one array I called $values. This $values array contains everything that was submitted by this user in this form instance. To submit it, we simply call node_form_submit(), which will create the node and call the CCK submit hook to submit the fields to the database. The return value is a URL to which we are directed after submission. Since we set ‘#redirect’ to ‘user’ for step 6 in our_cck_type_node_form(), that is where the browser is redirected.
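The merging itself boils down to repeated array_merge() calls. Here is a simplified sketch under assumptions: each saved step is taken to be a flat array of field values (the real session data is nested under ‘args’), and example_merge_steps() and the field names are hypothetical.

```php
<?php
/**
 * Hypothetical helper: merge the values saved for each earlier step with
 * the values of the step that was just submitted.
 */
function example_merge_steps($saved, $build_ids, $current_values) {
  $values = array();
  foreach ($build_ids as $build_id) {
    if (isset($saved[$build_id]) && is_array($saved[$build_id])) {
      $values = array_merge($values, $saved[$build_id]);
    }
  }
  // The current step is merged last, so its values win on key clashes.
  return array_merge($values, $current_values);
}

$saved = array(
  'id-1' => array('title' => 'My form', 'field_name' => 'Ann'),
  'id-2' => array('field_city' => 'Oslo'),
);
$current = array('step' => 3, 'field_notes' => 'Done');
$all = example_merge_steps($saved, array('id-1', 'id-2'), $current);
print_r($all);
```

With string keys, array_merge() lets later arrays overwrite earlier ones, which is why the order of merging matters when the same field name appears in more than one step.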
Usability
So, how generic is this code? If I were to reuse this code, I would do the following:
- Global
- change the value of $max_step to reflect the number of steps in my CCK type
- change the name of the hook functions to match our new CCK type
- hook_node_form
- change where the user is redirected after submitting on the last step
- hook_form_alter
- change the $form_id comparison to match the name of the CCK type
- change the $group array to reflect the actual names of my fieldgroups
- change the submit function to be called using the ‘#submit’ form value to match the name of the CCK type
- Change some of the output options, like the submit button text and the body visibility.
There is nothing to be done to hook_submit (other than change the function name). Overall, these changes should take only a couple minutes.
Missing Capabilities/Drawbacks
There are some capabilities that are not given here. That is either because they are not possible with this implementation (drawback), or the solution is beyond my mental grasp at the moment (missing). I don’t really know which capabilities below are drawbacks, or are just missing. Some of these are pretty important capabilities for production-level online forms.
Partial submission – the user fills in part of the form, but wishes to submit what he/she has already input, saving the rest of the form for later. This also assumes some method of “finalizing” submission.
Jump to step – the user has partially submitted a form and has returned, and would like to pick up where he/she left off. The current implementation only supports going through the form serially, step by step.
Go back – the user gets past the first page, realizes he/she entered an incorrect value, and wants to go back to a previous step to make the correction.
Progress indicator – the user is presented with an indicator, possibly a progress bar or a list of all steps with the current step emphasized, or something like “Step 2 of 13”.
Multiple values – the user fills in the available fields but needs more so he/she hits submit. This new page is just like the one just submitted, except with additional blank fields available.
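Of the capabilities listed above, the progress indicator is probably the easiest to approximate, since the current step and the total are already known in the form_alter hook. The following is only a sketch of building the label text; example_progress_label() is a hypothetical helper, and how the text gets attached to the form is left open.

```php
<?php
/**
 * Hypothetical helper: build a short progress label from the current
 * step and the total number of steps.
 */
function example_progress_label($step, $max_step) {
  return 'Step ' . intval($step) . ' of ' . intval($max_step);
}

echo example_progress_label(2, 13), "\n";  // prints "Step 2 of 13"
```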
This is my first stab at writing a handbook page and my first stab at multipage forms. If others find this useful, that would be great. If you see anything that needs to be corrected, I will gladly do so. If you have questions about the implementation, I will try to answer them to the best of my ability. If you have better ways of doing these things, or have ideas for overcoming the missing capabilities/drawbacks, I’d love to see them.
This work has been done as a part of projects for Initsoft LLC

Module load order (or module weight) and CCK
For the solution listed here to work, your module code needs to execute after the CCK module. To change the order or weight associated with modules, see the instructions included here:
Howto: Update a module's weight (drupal.org)
A simpler way
Here is an alternate, simpler option:
1. create field groups (e.g. A, B,C or 1,2,3 or I., II., III.)
2. make all field groups collapsed by default
This solves: Partial submission, Jump to step, Go back, Multiple values, needless core hacking (unless you have more than 20 groups (-10 to +10 weights) and can't sort groups alphabetically or numerically to bypass that limitation)
This does not solve: Progress indicator, graceful Javascript degradation (you get a big ugly form like we're trying to avoid)