CCK checkbox groups do not sync correctly
| Project: | Salesforce |
| Version: | 6.x-2.x-dev |
| Component: | Code |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | needs review |
I've taken a first stab at synching multi-value fields. I hope this is useful for some other folks.
Use Case Addressed by This Patch
On the salesforce side, we have a multi-select picklist, and on the Drupal side, we have a multiple-value CCK field which displays as a set of checkboxes, one for each allowed value.
Current Behavior Fixed by Patch
When importing the field, all of the data is inserted as a semicolon-delimited list into value 0 of the CCK field. Not only is this incorrect, but it breaks the field in that, after importing, none of the checkboxes will show as checked, regardless of the choices the user has made for the field.
When exporting the field, only value 0 of the CCK field is sent to salesforce. Thus all but the first checkbox in Drupal are ignored when exporting.
Desired Behavior Achieved by the Patch
All of the data is preserved correctly when syncing in either direction. That is:
When importing the field, the semicolon-delimited data is parsed into multiple values, each of which is assigned to a value of the CCK field.
When exporting the field, all of the values are collected, and concatenated into a semicolon-delimited list for consumption by salesforce.
What the Patch Does
In order to achieve this fix, we had to make several changes.
Modify Import and Export Handlers
The signature for import and export handlers has been modified so they can be passed field definition information for both the salesforce and Drupal fields that they are synchronizing.
Since the field definition for salesforce fields now includes the field's type, handlers can use this information to process a particular synchronization. Similarly, since the field definition for Drupal fields now include whether the field supports multiple values, handlers can use that information in determining how to process a particular synchronization.
Thus the new signatures look like this:
_sf_node_export_cck_default($node, $fieldname, $drupal_field_definition, $sf_field_definition)
_sf_node_import_cck_default(&$node, $drupal_fieldname, $drupal_field_definition, $sf_data, $sf_fieldname, $sf_field_definition)Modify Calls to Handlers
Any code which calls an import or export handler has been updated to support the new signatures. To be clear, field definitions are extracted from object definitions, and those are obtained by calling salesforce_api_fieldmap_objects_load.
Since handlers require field definition information about both the salesforce and Drupal fields they are processing, anywhere which calls a handler must obtain the object definitions for both the salesforce and Drupal fields.
Modify Creation of Object Definitions
salesforce_api_fieldmap_objects has been modified so that it now includes 'type' => $field->type for each salesforce field in each object.
Similarly, sf_node_fieldmap_objects has been modified so that it now includes the following as part of the definition for each field in a node:
'type' => $field['type'],
'multiple' => $field['multiple'],Note that we are not currently taking advantage of including the Drupal field's type as part of its definition. However, this seemed like something that could easily be useful in future.
Refactoring
While in the process of applying the above changes, I also did some refactoring of the code. I did this because I have found the code pretty confusing to work with, and because I believe that part of that stems from too many variable names being overly general. All too often, I found myself having to do a fair amount of work to keep track of which variable held what information.
To mitigate this, I have modified variable names in an attempt to make abundantly clear things such as, whether a variable holds information pertaining to the Drupal field, or to the salesforce field, whether a variable holds an actual object (which is to say data) or an object's definition, and whether we are dealing with an entire object or only a single field. Similarly, I have tried to replace the variables $key and $variable with more specific variable names. The result is significantly longer variable names, but I think that price is worth paying in order to make the code easier to understand and work with.
Note: I only refactored functions which I was changing anyway. If people like the changes I made, I would be willing to do more such refactoring.
| Attachment | Size |
|---|---|
| salesforce_multiselect_fields.patch | 16.29 KB |

#1
One important thing to bear in mind when using the above patch is that your values must not have leading or trailing spaces. In particular, if your Drupal field uses the Allowed values list, and you enter both keys and labels, there must not be any spaces on either side of the | delimiter. That is, you need to enter "key|label" not "key | label".
If there are leading or trailing spaces, the result will be that the checkbox for those values will always be unchecked, regardless of whether or not the user selected them. I am specifically referring to the edit form for a node.
Note that it may look correct initially, but after one has exported to salesforce, and imported from salesforce, the broken effect will appear.
The reason for this is that salesforce runs a trim when it imports. So, if for example, one exports "key ", salesforce will convert that to "key". So far that's fine. However, this means that when one imports from salesforce, the Drupal field will receive the same value of "key". The problem is that when Drupal goes to determine which checkboxes are checked, it does not run a trim before doing so. This results in "key " not matching "key". Thus Drupal finds that the field value of "key " does not exist (even though the value "key" does), and thus the checkbox displays as unchecked. Finally, if the user then submits the form, the checkbox will in fact be unchecked, and the value will be lost.
So far as I can tell, the only way to change this would be to modify CCK so it ran a trim at the appropriate place. However, so long as that has not happened, the easy fix is to not have spaces on either side of one's | delimiter in the Allowed values list.
#2
This is more a bug than a feature. This requires extensive review, which I dont have time for. The patch mostly looks fine except for some offset and fuzz. It also has tabs instead of two space characters, and CRs instead of Drupal's standard unix-style line breaks. It applied though.
In order to make this patch easier to review you could move the variable renaming into another patch – though this is probably quite difficult and possibly not worthwhile.
#3
We should also make sure this fixes any similar issues with radio-groups, multiple-select CCK option fields (both checboxes and <selects>), multiple text fields and any other similar field types.
#4
Bevan, thanks for the feedback. I'm submitting a very slightly modified patch.
I've replaced tabs with 2 spaces.
I tried searching for \r, and couldn't find any. However I've saved using zend studio which is set to use unix line endings, so hopefully that'll fix that problem. (Fairly annoying if tortoise cvs is producing patches with windows line endings, especially as I don't see a setting where I can change that.)
I'm not sure what you mean by "offset and fuzz". If you clarify, maybe I'll be able to address it.
I hear you on breaking out the name changes as a separate patch, but it would be a lot of work, and I think the new names should make it easier to review the patch since they clarify what data is contained by the different variables and/or how we are using the data.
Although my example use case involves checkboxes, the larger issue is when the drupal field accepts multiple values: i.e. it is a question of data, not what widget is used to display that data. Since the patch addresses the data issue, it should work equally well for any multiple-value field, regardless of whether it uses a checkbox, select control, or some other widget for display. Note: it's because the data is independent of the display that you can change which widget you use for a field.
What is an open question is whether there are other types of salesforce fields which we need to handle. The current patch only deals with the multipicklist field type. I don't know what other salesforce fields might be relevant here. If somebody has some other salesforce multiple field types, I'm open to looking at them. In the absence of other field types to worry about, I'd like to go with the patch. Since we use a switch for salesforce field type, it will be easy to extend it if we encounter other salesforce field types which need to be handled.
#5
Sid, Thanks for your response. That makes sense. I agree with this in principle but don't have time to review sufficiently. I can commit if/when it gets RTBC though.
"offset and fuzz" are terms the patch command uses when successfully resolving conflicts due to minor changes elsewhere in the file. Make sure you are using latest DRUPAL-6--2 code and that you run cvs up -dP before testing/writing/rolling any patches.
The line ending issue s very minor.