I have a lot of information in the NOTE section of a family tree and would like to be able to import it all easily. I will work on implementing this and post a patch. I am reading the documentation for both the module and GEDCOM, and I'm having a bit of trouble slogging through it all.

Currently it appears that each concatenation tag (CONC) is read in as its own fact. I would (if given no input) probably create a patch that combines the CONC tags into the preceding/parent tag and saves it there. Is there a reason I shouldn't do this?

Also, I will put a spot on the display for Notes.

Comments

seth.e.shaw

Oddly enough, I have been working on the CONT/CONC fact bit lately. I think I have the kinks worked out, although it needs more testing before being submitted to the repository (it is included below). I haven't been able to get the Source-as-node feature working well enough to check the update/new functionality while integrated into the module, although my controlled tests seemed fine. I think part of the problem is that I am also trying to integrate the Source notes with the Source node body.

Basically, we are trying to follow GEDCOM fairly closely. GEDCOM caps the number of characters per line, i.e. per fact (255 in GEDCOM 5.5, which is the figure the code below works from), and a line cannot contain newlines. CONT tags were created to represent the start of a new line, and CONC tags were created to continue excessively long strings. Another reason this is needed is that Notes are not the only tags that allow continuation through CONT and CONC; many others, such as addresses, permit them as well. We need functions that work for both Notes and these other facts.
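To make the CONT/CONC distinction concrete, here is a minimal sketch of folding a parent line and its continuation lines back into one string. The function name `example_fold_gedcom_value` and the sample lines are made up for illustration and are not part of the family module. It assumes the first line is the parent, that the parent carries no cross-reference id, and that a value follows its tag after a single space.

```php
<?php
/**
 * Hypothetical helper (not part of the family module): folds the CONT and
 * CONC lines directly below a GEDCOM line into that line's value.
 *
 * $gedcom_lines holds raw lines such as "1 NOTE text" or "2 CONC more".
 */
function example_fold_gedcom_value($gedcom_lines) {
  $text = '';
  $parent_level = NULL;
  foreach ($gedcom_lines as $raw) {
    // Level, tag, then an optional value after one delimiter space.
    if (!preg_match('/^\s*(\d+) (\w+)(?: (.*))?$/', rtrim($raw, "\r\n"), $m)) {
      continue; // Skip anything that does not look like a GEDCOM line.
    }
    $level = (int) $m[1];
    $tag = $m[2];
    $value = isset($m[3]) ? $m[3] : '';
    if ($parent_level === NULL) {
      // Assumption: the first usable line is the parent (e.g. NOTE).
      $parent_level = $level;
      $text = $value;
    }
    elseif ($level <= $parent_level) {
      break; // The next record has started.
    }
    elseif ($level == $parent_level + 1 && $tag == 'CONT') {
      $text .= "\n" . $value; // CONT starts a new line.
    }
    elseif ($level == $parent_level + 1 && $tag == 'CONC') {
      $text .= $value; // CONC continues the same line, with no separator.
    }
    // Other subordinate tags and deeper levels are ignored in this sketch.
  }
  return $text;
}
```

Called with `array("1 NOTE This is a long no", "2 CONC te about Alice.", "2 CONT Second line.")`, the two halves of "note" rejoin without a space and the CONT value lands on its own line.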

So, when a note is read, the code needs to find the parent tag and all of its subordinate CONT and CONC tags, then string them together, inserting a line break whenever it encounters a CONT tag. Conversely, when we create a new note or update an existing one, we need to break the string at its newlines (these become the CONT tags) and then, if any of the resulting strings is still too long, break it further into CONC tags.
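The two directions described above can be sketched as a pair of pure functions that never touch the database. The names `example_split_to_pieces` and `example_join_pieces` are hypothetical. Note that this sketch cuts at a fixed width with `str_split()` rather than wrapping at word boundaries, so that no spaces are lost when CONC pieces are rejoined; lengths are counted in bytes, so it could split a multibyte character across two pieces.

```php
<?php
/**
 * Hypothetical helper: breaks $text into tag/value pieces. The first piece
 * carries $parent_tag, each later paragraph starts with CONT, and any
 * overflow beyond $max_chars becomes CONC.
 */
function example_split_to_pieces($text, $max_chars, $parent_tag = 'NOTE') {
  $pieces = array();
  // Treat "\r\n" as one break so Windows line endings do not add empty CONTs.
  $paragraphs = preg_split('/\r\n|[\n\r\f]/', $text);
  foreach ($paragraphs as $i => $paragraph) {
    // An empty paragraph still needs one (empty) piece to keep the line break.
    $chunks = ($paragraph === '') ? array('') : str_split($paragraph, $max_chars);
    foreach ($chunks as $j => $chunk) {
      if ($j > 0) {
        $tag = 'CONC';
      }
      elseif ($i == 0) {
        $tag = $parent_tag;
      }
      else {
        $tag = 'CONT';
      }
      $pieces[] = array('tag' => $tag, 'value' => $chunk);
    }
  }
  return $pieces;
}

/**
 * Hypothetical inverse: CONT adds a newline before its value, CONC does not.
 */
function example_join_pieces($pieces) {
  $text = '';
  foreach ($pieces as $k => $piece) {
    if ($k > 0 && $piece['tag'] == 'CONT') {
      $text .= "\n";
    }
    $text .= $piece['value'];
  }
  return $text;
}
```

Splitting and then joining should give back the original text (with line endings normalised to "\n"), which is a cheap property to check before wiring anything like this into the insert and update paths.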

These strings then have to be inserted into the database respecting the order of fact_ids (the only sort mechanism we have right now; ugly, I know, but all we have). Otherwise, when they are called back out they might not be in order when reassembled. I try to conserve database inserts by recycling existing facts, changing the fact_code if need be.
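As a rough illustration of the recycling idea, the sketch below only plans the work and leaves the database alone. `example_plan_recycling` and its 'morph' / 'insert' / 'leftover' labels are hypothetical names. It assumes that newly inserted facts always receive a fid higher than every recycled one (e.g. an auto-increment column), since otherwise ordering by fid would not preserve the sequence.

```php
<?php
/**
 * Hypothetical planner: pairs the continuation pieces (CONT/CONC only, the
 * parent is updated separately) with existing fids in ascending order.
 *
 * Returns a list of operations:
 *   'morph'    reuse an existing fact, changing its code and value
 *   'insert'   no existing fact left, a new one is needed
 *   'leftover' an existing fact that the new text no longer needs
 */
function example_plan_recycling($existing_fids, $pieces) {
  // Lowest fid first, because reassembly orders by fid.
  sort($existing_fids, SORT_NUMERIC);
  $ops = array();
  foreach ($pieces as $piece) {
    if (count($existing_fids) > 0) {
      $ops[] = array(
        'op' => 'morph',
        'fid' => array_shift($existing_fids),
        'tag' => $piece['tag'],
        'value' => $piece['value'],
      );
    }
    else {
      $ops[] = array(
        'op' => 'insert',
        'fid' => NULL,
        'tag' => $piece['tag'],
        'value' => $piece['value'],
      );
    }
  }
  // Anything still unused would otherwise keep its stale text.
  foreach ($existing_fids as $fid) {
    $ops[] = array('op' => 'leftover', 'fid' => $fid, 'tag' => NULL, 'value' => NULL);
  }
  return $ops;
}
```

The 'leftover' case is the one that is easy to miss: when the new text is shorter than the old, the unused facts have to be blanked or removed, or they will reappear when the note is read back.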

/**
 * Retrieves and combines CONT & CONC fact texts
 */
function family_get_cont($fid){
  $text="";
  $results = db_query("SELECT {family_facts}.fact_code, {family_facts}.fact_value "
          .          "FROM {family_facts} "
          .          "INNER JOIN {family_relations} ON {family_facts}.fid = {family_relations}.fid1 "
          .          "AND {family_relations}.fid2 = %d "
          .          "WHERE {family_facts}.fact_code LIKE 'CONT' "
          .          "OR    {family_facts}.fact_code LIKE 'CONC' "
          .          "ORDER BY {family_facts}.fid ASC", $fid); //Order required for CONT & CONC
  while($attribs = db_fetch_array($results)) {
    if (strcmp($attribs['fact_code'], "CONT") == 0 ) {$text.="<br />";}
    $text .= $attribs['fact_value'];
  }
  return $text;
}

/**
 * Splits long strings into CONT & CONC fact texts for insert (the update case is below).
 * Returns the subfact's fid (I don't know that we
 * will ever need it, but just in case).
 *
 * I am not convinced it is working correctly, although controlled tests seem to work.
 */
function family_make_tag_cons($fid, $sub_fact, $full_string){
  $max_line_chars = 237; //255 (GEDCOM spec) - 8 (for level and tag) - 10 (for good measure) = 237

  //Break text into CONT blocks, treating "\r\n" as a single line break
  $paragraphs = preg_split("/\r\n|[\n\r\f]/", $full_string);

  //Create parent tag & value
  //wordwrap() swallows the space at each break; the break string "\n " puts it back
  //at the start of the next line so CONC pieces rejoin with the space intact
  $first_paragraph = array_shift($paragraphs);
  $lines = explode("\n", wordwrap($first_paragraph, $max_line_chars, "\n ") );
  $first_fid =  family_insert_fact($sub_fact, array_shift($lines));
  family_insert_relation($first_fid, $fid, 'FACT');

  //Create additional CONCs as needed for first paragraph
  foreach($lines as $line){
    $curr_fid = family_insert_fact('CONC', $line);
    family_insert_relation($curr_fid, $first_fid, 'FACT');
  }
  //For each CONT and trailing CONCs
  foreach($paragraphs as $paragraph) {
    $paragraph_lines = explode("\n", wordwrap($paragraph, $max_line_chars, "\n ") );
    $curr_fid = family_insert_fact('CONT', array_shift($paragraph_lines));
    family_insert_relation($curr_fid, $first_fid, 'FACT');
    foreach($paragraph_lines as $line){
      $curr_fid = family_insert_fact('CONC', $line);
      family_insert_relation($curr_fid, $first_fid, 'FACT');
    }
  }
  return $first_fid;
}

/**
 * Splits long strings into CONT & CONC fact texts for update.
 * Returns the updated fact's fid (I don't know that we
 * will ever need it, but just in case).
 *
 * I am not convinced it is working correctly, although controlled tests seem to work.
 */
function family_update_tag_cons($fid, $full_string){
  // Get CONT, CONC fodder
  $results = db_query("SELECT {family_facts}.fid "
          .          "FROM {family_facts} "
          .          "INNER JOIN {family_relations} ON {family_facts}.fid = {family_relations}.fid1 "
          .          "AND {family_relations}.fid2 = %d "
          .          "WHERE {family_facts}.fact_code LIKE 'CONT' "
          .          "OR    {family_facts}.fact_code LIKE 'CONC' "
          .          "ORDER BY {family_facts}.fid", $fid);

  $max_line_chars = 237; //255 (GEDCOM spec) - 8 (for level and tag) - 10 (for good measure) = 237

  //Break text into CONT blocks, treating "\r\n" as a single line break
  $paragraphs = preg_split("/\r\n|[\n\r\f]/", $full_string);

  //Create parent tag & value
  //wordwrap() swallows the space at each break; the break string "\n " puts it back
  //at the start of the next line so CONC pieces rejoin with the space intact
  $first_paragraph = array_shift($paragraphs);
  $lines = explode("\n", wordwrap($first_paragraph, $max_line_chars, "\n ") );
  family_update_fact($fid, array_shift($lines));
  foreach($lines as $line){
    if($attribs = db_fetch_array($results)){ // Use existing fact fodder
      family_morph_fact($attribs['fid'], 'CONC', $line);
    }
    else{ // Until we run out of fodder
      $new_fid = family_insert_fact('CONC', $line);
      family_insert_relation($new_fid, $fid, 'FACT');
    }
  }
  foreach($paragraphs as $paragraph) {
    $paragraph_lines = explode("\n", wordwrap($paragraph, $max_line_chars, "\n ") );
    if($attribs = db_fetch_array($results)){
      family_morph_fact($attribs['fid'], 'CONT', array_shift($paragraph_lines));
    }
    else{
      $new_fid = family_insert_fact('CONT', array_shift($paragraph_lines));
      family_insert_relation($new_fid, $fid, 'FACT');
    }
    //Loop over this paragraph's remaining lines, not the first paragraph's $lines
    foreach($paragraph_lines as $line){
      if($attribs = db_fetch_array($results)){
        family_morph_fact($attribs['fid'], 'CONC', $line);
      }
      else{
        $new_fid = family_insert_fact('CONC', $line);
        family_insert_relation($new_fid, $fid, 'FACT');
      }
    }
  }
  //Blank out fodder left over when the new text is shorter than the old, so stale
  //text is not reassembled by family_get_cont(); deleting these facts would be cleaner
  while($attribs = db_fetch_array($results)){
    family_morph_fact($attribs['fid'], 'CONC', '');
  }
  return $fid;
}

Tistur

Title: Importing NOTE fact » Wow, thanks

That really helps.

A super quick hack of family_line_indi at least shows me the notes. I'm not sure why I have to add 1 to the fact id - this may be because my GEDCOM generator is fairly old.

  // Display NOTE before spouse
  if(!empty($subfacts['NOTE'])){
    $content .= family_get_cont($subfacts['NOTE'][0]['fid'] + 1);
  }

Next piece to work on: putting facts into the node body for the person. Would the tables have to change on editing a node, or would it suffice to change the tables once before an export?

After that, maybe a checkbox when importing to turn this on and off.

Tistur

Title: Wow, thanks » NOTE, CONC, and CONT

pyutaros

Status: Active » Fixed

Setting to fixed.

Anonymous

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.