This is a bit of a support request; I hope someone can point me in the right direction.

I'd be happy to help update the documentation with a more detailed tutorial once I understand the process better myself.

I've got the Recommender API up and running and I'm working on developing a custom helper application. I'd like to create a similarity score between two different types of entities:
1) Job Seeker Profile (profile2 profile type)
2) Job Posting (node)

These share many of the same fields. For the sake of simplicity, let's say they share exactly two:
1) Industry (taxonomy term ID)
2) Salary (integer)

Essentially, a user will fill out a profile of preferences stating their desired salary and industry of employment. I'd like the Recommender API to then recommend Job Postings that are similar to their preferences based on the indicated fields (salary and industry). Is this possible?
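To make the goal concrete, here is a rough sketch in plain PHP of the kind of score I have in mind. It does not use the Recommender API at all. The function name job_matcher_similarity(), the array keys, the sample values, and the equal weighting of the two fields are all made up for illustration.

```php
<?php

/**
 * Hypothetical helper: scores how well a job posting matches a seeker profile.
 *
 * Both arguments are plain arrays with 'industry_tid' and 'salary' keys.
 * Salaries are assumed to be non-negative integers.
 * Returns a float between 0 (no match) and 1 (identical on both fields).
 */
function job_matcher_similarity(array $profile, array $posting) {
  // Industry is categorical: the term IDs either match or they do not.
  $industry_score = ($profile['industry_tid'] === $posting['industry_tid']) ? 1.0 : 0.0;

  // Salary is numeric: score by relative difference to the larger value.
  $largest = max($profile['salary'], $posting['salary']);
  $salary_score = ($largest > 0)
    ? 1.0 - abs($profile['salary'] - $posting['salary']) / $largest
    : 1.0;

  // Equal weighting of the two fields is an arbitrary choice.
  return ($industry_score + $salary_score) / 2;
}

// Made-up sample data: one seeker profile and two job postings keyed by nid.
$profile = array('industry_tid' => 12, 'salary' => 50000);
$postings = array(
  101 => array('industry_tid' => 12, 'salary' => 40000),
  102 => array('industry_tid' => 7, 'salary' => 50000),
);

// Score every posting against the profile, then sort best match first.
$scores = array();
foreach ($postings as $nid => $posting) {
  $scores[$nid] = job_matcher_similarity($profile, $posting);
}
arsort($scores);
```

With this sample data, posting 101 (same industry, lower salary) ranks above posting 102 (different industry, same salary). What I can't work out is how to get the Recommender API to produce an equivalent ranking.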

I've been working from fivestar_rec.module and the rec_example module to build a result set for comparison, but I'm not sure that I'm doing it correctly. My current hook_install() implementation looks like this:

/**
 * Implements hook_install().
 */
function job_matcher_install() {
  recommender_app_register(job_matcher_apps());
}

/**
 * Returns the recommender app definitions for the job matcher.
 */
function job_matcher_apps() {
  // Build a single result set that unions seeker profiles and job postings.
  $sql = "(SELECT 
            profile.uid AS uid,
            profile.pid AS eid,
            industryTable.field_job_industry_tid AS industry_tid 
          FROM 
            profile
          LEFT JOIN
            field_revision_field_job_industry industryTable
              ON profile.pid = industryTable.entity_id
              AND industryTable.entity_type = 'profile2'
          WHERE 
            profile.type='seeker_profile'
          GROUP BY 
            eid)
            
        UNION
        
          (SELECT 
            node.uid AS uid,
            node.nid AS eid,
            industryTable.field_job_industry_tid AS industry_tid 
          FROM 
            node
          LEFT JOIN
            field_revision_field_job_industry industryTable
              ON node.nid = industryTable.entity_id
              AND industryTable.entity_type = 'node'
          WHERE 
            node.type='job_posting' 
          GROUP BY 
            eid)        
        ";
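  // array(user_id, item_id, [preference], [timestamp])
  // [preference] and [timestamp] can be omitted.
  // These names must match the column aliases in $sql; the query selects no
  // timestamp column, so none is listed.
  $fields = array('uid', 'eid', 'industry_tid');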
  
  return array(
    'job_matcher' => array(
      'title' => st('Job Seeker profile to Job Posting matcher.'),
      'params' => array(
        'algorithm' => 'item2item',
        'sql' => $sql,
        'fields' => $fields,
        'performance' => 'memory',
        'preference' => 'score',
        //'similarity' => VALUES: 'auto' (default), 'cityblock', 'euclidean', 'loglikelihood', 'pearson', 'spearman', 'tanimoto', and 'cosine'
      ),
    ),

    'job_matcher_update' => array(
      'title' => st('Job Seeker profile to Job Posting matcher (incremental update).'),
      'params' => array(
        'algorithm' => 'item2item_increment',
        'base_app_name' => 'job_matcher',
        'sql' => $sql,
        'fields' => $fields,
        'performance' => 'memory',
        'preference' => 'score',
      )
    ),   
  );
}

I'm making the assumption that it's possible to determine the similarity of different types of entities as long as they share common fields.

But I'm confused on a few points:
1) What exactly should be returned by the SQL query? A result set of entity targets, or a combination of source and targets? What are all of the acceptable values for the score field?
2) If I'm doing an item2item similarity comparison, what is the purpose of defining a uid field? Why are users entering into the equation at all?
3) Can similarity be determined using more than one field as a basis of comparison?
4) I see that the 'preference' field can be used with either boolean or score. What if I'd like to use different criteria depending on the field that's being matched? For instance, I may want to match text strings for one field, compare the difference between integers in a second, and compare booleans in a third. Is that the purpose of the 'similarity' parameter?
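To illustrate question 4, this is roughly what I mean by per-field criteria, again as a plain PHP sketch rather than anything the Recommender API provides. The comparator functions, the 'job_title' and 'remote' fields, and the simple averaging are all hypothetical.

```php
<?php

/**
 * Hypothetical per-field comparators; each returns a score from 0 to 1.
 */
function job_matcher_compare_text($a, $b) {
  // Case-insensitive exact match on trimmed strings.
  return (strcasecmp(trim($a), trim($b)) === 0) ? 1.0 : 0.0;
}

function job_matcher_compare_integer($a, $b) {
  // Closer integers score higher; identical values score 1.
  $largest = max(abs($a), abs($b));
  return ($largest > 0) ? max(0.0, 1.0 - abs($a - $b) / $largest) : 1.0;
}

function job_matcher_compare_boolean($a, $b) {
  // Both values are cast to boolean before comparing.
  return ((bool) $a === (bool) $b) ? 1.0 : 0.0;
}

/**
 * Averages the per-field scores for the fields listed in $comparators.
 */
function job_matcher_compare(array $a, array $b, array $comparators) {
  $total = 0.0;
  foreach ($comparators as $field => $callback) {
    $total += call_user_func($callback, $a[$field], $b[$field]);
  }
  return count($comparators) ? $total / count($comparators) : 0.0;
}

// Map each field to the comparison that suits its data type.
$comparators = array(
  'job_title' => 'job_matcher_compare_text',
  'salary' => 'job_matcher_compare_integer',
  'remote' => 'job_matcher_compare_boolean',
);

// Made-up sample data.
$profile = array('job_title' => 'Editor', 'salary' => 60000, 'remote' => TRUE);
$posting = array('job_title' => 'editor ', 'salary' => 45000, 'remote' => FALSE);
$score = job_matcher_compare($profile, $posting, $comparators);
```

If the 'similarity' parameter applies one measure to the whole data set, then I assume something like this would have to happen before the data reaches the Recommender API.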

I'm starting to think I'm approaching this incorrectly. I don't understand how a single SQL query can produce a comparison between two different datasets.

Any help would be greatly appreciated! Thanks.

Comments

danithaca

I will take a look later.

grasmash

Thanks danithaca,

I know this is a fairly large support request. I'd really appreciate your help with it.

If the feature doesn't exist, I'd be happy to submit a patch or provide documentation for it.

I'm going to be developing this functionality over the next month one way or the other, and I'd like to find a way to build it on the Recommender API.

Once I understand the module a little better, I'd be happy to pitch in with development.

danithaca

Sounds good. I'll take a look and give you some suggestions.

mrfelton

@madmatter23 did you ever manage to get this going?

danithaca

Status: Active » Closed (duplicate)

Please use the new release 7.x-6.x.

You just need to define the "preference" table in hook_recommender_data() that stores the "profile" => "nodes" data. The "user id" will be the profile ID here. Computing recommendations should not run into problems, but displaying results in Views might.

See #2377519: Remove the restriction of "user" entity type for "user field" to follow up on this issue.