Check for existing duplicates on object-create
| Project: | Salesforce |
| Version: | 6.x-2.x-dev |
| Component: | Code |
| Category: | feature request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | needs review |
Dependency Warnings
The sf_prematch module assumes that the patch found at http://drupal.org/node/506032 has been installed.
Although it doesn't require it, the sf_prematch module also implements the hook found at http://drupal.org/node/506012. Without that patch, rows will not be deleted from sf_prematch's database table, even if they are no longer needed.
Use case addressed by this patch
Corresponding objects are created in salesforce and Drupal: e.g. the same contact is created in both. Since these objects are independently created, there is no syncing relationship between them.
Current behavior addressed by this patch
When the Drupal object is exported, a new salesforce object is created for it to sync with. This means that there are now two corresponding salesforce objects for the one Drupal object, and that only the new salesforce object is in a synchronization relationship with the Drupal object.
Desired behavior achieved by this patch
Before creating a new salesforce object, attempt to find an already-existing salesforce object which corresponds to the Drupal object that is in the process of being exported. If a match is found, do not create a new salesforce object, but instead export to the found one, and create a synchronization relationship between the two objects.
The way it attempts to match is by looking for a salesforce object of the correct type where the value of some field(s) matches that of the Drupal object which is being exported.
Administrators are given the ability to choose one to three fields to use in matching, and to choose the logic for applying those fields.
Currently, the logic field is redundant. However, I anticipate the options for logic getting more sophisticated in future.
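To make the matching rule concrete, here is a minimal sketch of how one to three field/value pairs could be combined into a single SOQL query. The function name `example_prematch_build_soql()` and the AND-only logic are assumptions for illustration, not the code in the patch.

```php
<?php
/**
 * Hypothetical helper: build a SOQL query that finds objects of $sf_type
 * whose fields equal the given values. All criteria are ANDed together.
 *
 * @param string $sf_type  Salesforce object type, e.g. 'Contact'.
 * @param array $criteria  Salesforce field name => value (one to three entries).
 * @return string|null     The query, or NULL if the criteria are unusable.
 */
function example_prematch_build_soql($sf_type, array $criteria) {
  if (count($criteria) < 1 || count($criteria) > 3) {
    return NULL;
  }
  $conditions = array();
  foreach ($criteria as $field => $value) {
    // Refuse empty values: matching on a blank field could pair unrelated objects.
    if ($value === NULL || $value === '') {
      return NULL;
    }
    // Escape backslashes, then single quotes, for a SOQL string literal.
    $escaped = str_replace(array('\\', "'"), array('\\\\', "\\'"), $value);
    $conditions[] = $field . " = '" . $escaped . "'";
  }
  return 'SELECT Id FROM ' . $sf_type . ' WHERE ' . implode(' AND ', $conditions);
}
```

For example, passing `'Contact'` with an `Email` and a `LastName` criterion yields one query with two ANDed conditions. A future "logic" option could swap the `' AND '` glue for something richer.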
What the patch does
Add hook_sf_find_match
When a module is exporting a Drupal object, and that object does not yet have a corresponding salesforce object, the exporting module calls this hook to give other modules a chance to find a matching salesforce object. The patch modifies sf_node and sf_user so they call this hook when they are exporting an object that has no assigned sfid.
Note that the hook expects the first argument to be the action that is being taken: i.e. export or import. The current patch only deals with exporting, but there is no reason that future modifications could not implement matching on import (or other actions should some arise).
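As a rough illustration of the hook's shape, here is a sketch of an implementation in a hypothetical module named "example". The argument order follows the `module_invoke_all()` call in the patch (action, object type, Drupal object, fieldmap). The lookup function and its data are placeholders standing in for a real Salesforce query.

```php
<?php
/**
 * Hypothetical lookup standing in for a real Salesforce query.
 * Returns the Id of a Contact with the given email, or NULL.
 */
function example_lookup_contact_by_email($email) {
  // Illustrative data only; a real module would query Salesforce here.
  $known = array('pat@example.com' => '003000000000001AAA');
  return isset($known[$email]) ? $known[$email] : NULL;
}

/**
 * Sketch of hook_sf_find_match() for the hypothetical "example" module.
 *
 * Returns an array of matching sfids (empty if none), so that
 * module_invoke_all() can merge results from several modules.
 */
function example_sf_find_match($action, $type, $object, $map) {
  // Only handle the case the patch currently calls the hook for.
  if ($action != 'export' || $type != 'user' || empty($object->mail)) {
    return array();
  }
  $sfid = example_lookup_contact_by_email($object->mail);
  return $sfid ? array($sfid) : array();
}
```

Checking `$action` first means the same implementation keeps working unchanged if matching on import is added later.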
A new module
This patch includes sf_prematch, a new module specifically for the purpose outlined above. It implements hook_sf_find_match.
It also provides the administrative interface outlined above, and a new permission to control which roles can use that interface.
I implemented this as a module for two reasons.
First, doing so allowed me to minimize the modifications needed in the existing code. Second, this approach leaves room for other modules to implement the new hook differently. Basically, although it seems to me that the functionality provided here might well belong in core, I could see other people thinking differently, and I could see people wanting similar but somewhat different functionality. A new module seemed like the best way to address all of that.
Why two patches?
Although it seems to me that it makes sense to include the new module as part of the general salesforce distribution, I am open to being told otherwise. Thus I have included two versions of the patch. One version includes the module, and the addition of the hook to sf_user and sf_node. The other version includes only the changes to sf_user and sf_node, but not the new module. If it seems preferable to make the new module a separate project, I am open to doing that.
| Attachment | Size |
|---|---|
| salesforce_prematch_hook_only.patch | 2.68 KB |
| salesforce_prematch_hook_and_module.patch | 22.64 KB |

#1
Minor fix. Turns out I messed up dependencies in the module's info file. Apply this patch after the module patch to fix it.
#2
This would be a great feature for SF API core. I am unable to review at this time but would be happy to test and commit once it is ready. A few notes. This section is duplicated in both sf_user and sf_node modules.
+ $prematch_found = false;
+ if (empty($sfid)) {
+   // call hook_sf_find_match to give opportunity to try to match existing sf object instead
+   // of creating a new one.
+   $matches = module_invoke_all('sf_find_match', 'export', 'user', $account, $map);
+   // TODO: handle case where multiple implementers of hook return multiple results.
+   if (count($matches)) {
+     $sfid = $matches[0];
+     $prematch_found = true;
+   }
+ }
Let's move it to a wrapper in salesforce_api, salesforce_api_search_for_duplicates(), and address the TODO. That said, having the feature in its own module is useful for the reasons you mentioned, and also because not all websites will want this feature.
Please submit the changes to salesforce_api as a patch/feature of its own in a new issue node. Also, upload any entirely new files as the whole file, rather than as patches. This makes it easier for others to review and maintain the changes/new files.
Great feature!
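One possible shape for the wrapper suggested above, sketched under assumptions: the signature of `salesforce_api_search_for_duplicates()` and the helper `example_resolve_matches()` are hypothetical, and the policy for the TODO (accept a match only when every implementation agrees on a single sfid) is just one option.

```php
<?php
/**
 * Hypothetical helper: reduce the merged hook results to a single sfid.
 * Returns the sfid when all results agree, otherwise NULL.
 */
function example_resolve_matches(array $matches) {
  // Drop empty values and collapse repeated ids.
  $unique = array_values(array_unique(array_filter($matches)));
  // Zero results means no match; conflicting results are treated as
  // ambiguous, so the caller falls back to creating a new object.
  return count($unique) == 1 ? $unique[0] : NULL;
}

/**
 * Sketch of the proposed wrapper (signature is an assumption).
 * Must run inside Drupal, where module_invoke_all() is available.
 */
function salesforce_api_search_for_duplicates($action, $type, $object, $map) {
  $matches = module_invoke_all('sf_find_match', $action, $type, $object, $map);
  return example_resolve_matches($matches);
}
```

With this in place, sf_user and sf_node could each replace the duplicated block with a single call and treat a non-NULL return value as the sfid to export to.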
#3
#4
Also I don't think prematch is a very good/unambiguous name for this module, though I can't think of better ones right now. Ideas?
#5
salesforce_prematch_hook_only.patch works great for me, but I've not tested the full module. I've implemented my own hook_sf_find_match that's saving lots of headache. I would mark that patch RTBC if I could mark half of an issue... Instead of relegating this to its own module, what do people think about including this hook in core?
Can anyone think of a use case in which an optional reduce-duplicates feature would be undesirable?
Personally I will *always* use such a hook (or some variation thereof) in my Drupal+SF implementations.
#6
Okay. I agree in principle to implementing this as a hook. Please split out the hook and the new module into two separate issue nodes and patches, since the former is quick and simple, while the latter is larger and requires much more review.
Aaron, I wouldn't use this for SF integrations which are very simple and one-way only. The client may also have their own de-duping mechanism, and may have limited bandwidth. It's possible the module may falsely identify duplicates. Even if a highly distinctive field is used, such as email address, one email address might be shared between a couple or an entire household.
#7
Bevan, I think we're on the same page - I don't think the module should necessarily go into SF core, just the hook. You're definitely right that a soap-based prematch mechanism (or any prematch mechanism) might not be desirable or necessary for some use cases.
The hook by itself would provide the flexibility for maintainers to decide whether to enable the prematch module or developers to roll their own prematch mechanism.
This is good stuff, Sid. Thanks for your contrib.
#8
Okay, so let's split this into two patches/issues then: one for the core patch, and one for the pre-match module. Also, I'd like to come up with a less ambiguous name than prematch. Ideas?
#9
Bevan and Aaron,
Thanks for the feedback and encouraging words.
I think what you say makes sense.
Right now I'm pretty swamped, but I hope to get back for another run at all the SF stuff in the next month or two.
#10
See http://drupal.org/node/551910 for the hook as a standalone issue.