Add import feature to the 5.x-3.x branch. [#273135]

I had tried the previous version of this module and was able to import using ged files. There used to be an option to complete the import, but I am not sure how it can be done now. I noticed that you mentioned importing in your disclaimer, but there is no information about how to do it.

Also, is it possible for me to edit the union types for family groups? I attempted to do so but could not display or manage the fields.

Any assistance you can provide would be greatly appreciated.

Thanks,
Marcia

Comment	File	Size	Author
#22	import.inc_.txt	14.08 KB	Microbe

Comments

Comment #1

pyutaros commented 12 September 2008 at 00:37

Title:	how do you import ged files	» Add import feature to the 5.x-3.x branch.
Assigned:	Unassigned	» pyutaros
Category:	support	» feature

Started work on the import feature. The remainder of this thread details the progress and thought process behind developing it. The import feature will be based on the one from the 5.x-1.x branch. More detail as I go.

Comment #2

pyutaros commented 12 September 2008 at 02:20

For my own information, this is the code of the old import function. This is the part that parses the ged file and throws entries into the DB. Since we massively revamped the DB in version 5.x-3.x, this is what needs the major rewrite. There are some minor edits already, which I note here.

function family_import_form_submit($form_id, $form_values) {
  $file = file_check_upload('gedcom_file');
  if (!$file) {
    form_set_error('',t("Didn't get GED file"));
  }
  
  $fp = fopen($file->filepath , "r" );
  if (!$fp) {
    form_set_error('',t("Couldn't open GED file"));
  }

  
  //
  // Empty current content. This is useful for debugging, but more caution should be
  // done before deleting database in the working version
  //
  if ($form_values['replace'])
  {
    // all of the following truncate commands were changed for the 5.x-3.x version
    db_query("TRUNCATE {family_individual}");
    db_query("TRUNCATE {family_group}");
    db_query("TRUNCATE {family_location}");
    db_query("TRUNCATE {family_variable}");
    
    $q = db_query("select nid from {node} where type = 'family_individual'");
    $n = 0;
    while ($o = db_fetch_object($q)) {
      node_delete($o->nid);
      $n++;
    }
    drupal_set_message(t('Deleted @n family_individual nodes.', array('@n' => $n)));
    
	$q = db_query("select nid from {node} where type = 'family_group'");
    $n = 0;
    while ($o = db_fetch_object($q)) {
      node_delete($o->nid);
      $n++;
    }
    drupal_set_message(t('Deleted @n family_group nodes.', array('@n' => $n)));
    
	$q = db_query("select nid from {node} where type = 'family_location'");
    $n = 0;
    while ($o = db_fetch_object($q)) {
      node_delete($o->nid);
      $n++;
    }
    drupal_set_message(t('Deleted @n family_location nodes.', array('@n' => $n))); 
  }

  $rmin = $form_values['start']? $form_values['start']:0;
  $rcount = $form_values['nrecords']? $form_values['nrecords']:99999999;
  $rmax = $rmin + $rcount - 1;
  $rnum = 0;
  $rprocessed = 0;
  $lprocessed = 0;
  
  $lnum = 0;
  $gedcom_hier=array();   // References to GEDCOM parents on each level

I split the code up for ease of reference. This section contains the code that makes the DB entries. The funct

   while (!feof ($fp))
  {
    $gedline = fgets( $fp, 1024 );
    $lnum++;
    
    if (preg_match("/^\s*(\d+)\s*(?:@([^@]+)@)?\s*(\S+)\s*(.*\S)?\s*$/i", $gedline , $matches))
    {
      $level=$matches[1];
      if ($level == 0) ++$rnum;
      if ($rnum < $rmin) continue;
      if ($rnum > $rmax) break;
      if ($level == 0) ++$rprocessed;
      ++$lprocessed;
      $xref=$matches[2];
      $fact_code=$matches[3];
      $value=$matches[4];
      $gedcom_source=$gedline;
      $parent=$gedcom_hier[$level-1];
      //NEW IMPORT CODE STARTS HERE
	  //import creates a temp DB to help map relations
	  
	  if (strpos(";HUSB;WIFE;CHIL",$fact_code))
      {
        //
        // Lines that define only a relation
        //
        $gedcom_hier[$level]=NULL;
        preg_match("/@\s*([^@\s]+)\s*@/i", $gedline , $matches);
        $target_xref=$matches[1];
        $fid = db_result(db_query("SELECT fid FROM {family_facts} WHERE xref = '%s'", $target_xref));
        $relation=$fact_code;
      }
      else
      {
        //
        // Lines that define a fact
	// Every INDI fact gets a node
	//
        $fid = db_next_id('{family_facts}_fid');
        $nid=NULL;
        $relation="FACT";

        if ($fact_code == 'INDI') {
          unset($node);
          $node->type = "family_individual";
          $node->uid = $user->uid;
          $node->title = "Unknown";
          $node->status = 1;
          $node->moderate = 0;
          $node->comment = 2;
          $node->revision = 0;
          $node->fid = $fid;
          $node->xref = $xref;
          $node->value = $value;
          $node->gedcom_source = $gedcom_source;
          node_validate($node, $error);
          if (!node_access("create", $node)) {
            $error['access'] = message_access();
          }
          if ($error) {
   	    drupal_set_message(
              t(
		'Error at line @lnum of GED (@line): @error.',
              	array('@lnum' => $lnum, '@line' => $gedline, '@error' => print_r($error,true))
              )
	    );
          }
          else {
            node_save($node);
            $nid=$node->nid;
          }
          unset($node);
        }
	else
	{
          db_query("INSERT INTO {family_facts} (fid, nid, xref, fact_code, fact_value, gedcom_source)
	            VALUES (%d, %d, '%s', '%s', '%s', '%s')",
	  	    $fid, $nid, $xref,$fact_code,$value,$gedcom_source);
	}
        $gedcom_hier[$level] = $fid;
        $gedcom_source=null; // If the source is in fact table, no need to put it again in relation table
    	$nchanged++;
      }

      if ($level>0) {
        $rid = db_next_id('{family_relations}_fid');
        db_query("INSERT INTO {family_relations} (rid, fid1, fid2, relation_description,
                       gedcom_source) VALUES (%d, %d, %d, '%s', '%s')",
                       $rid,$fid, $parent, $relation, $gedcom_source);
      }
    }
  }
  fclose ($fp);

Obviously the entire methodology has to be changed here. I'll discuss ideas for accomplishing this in my next post.

  drupal_set_message(t('Processed @r records (@n lines) of GED.', array('@r' => $rprocessed, '@n' => $lprocessed)));
  if ($rnum > $rmax) drupal_set_message(t('Next start record: @r.', array('@r' => $rmax + 1)));
  else drupal_set_message(t('No more records to process'));

  //
  // Clean up: add appropriate Titles to nodes based on the NAME fact
  // -- An alternative to this is to cache created nodes until we
  //    have added all the family_facts.  That has the benefit of
  //    being easy to roll back if there are errors in the GED file
  //    but it uses more ram for a big import.  I may try this if
  //    my big GED still takes too long to import... --pf
  //
  $fids = db_query("SELECT f1.fid as fid, f1.nid as nid, f2.fact_value as name ".
  	"FROM {family_facts} f1, {family_facts} f2, {family_relations} r ".
	"WHERE f1.fact_code = 'INDI' AND f2.fact_code='NAME' AND f2.fid=r.fid1 AND r.fid2=f1.fid");
  while ($row = db_fetch_array($fids)) {
    $name=str_replace("/","",$row['name']); //Remove slashes around surname

    //UPDATE NAME
    $node=node_load($row['nid']);
    if ($node->title != $name)
    {
      $node->title = $name;
      node_save($node);
    }
  }
  return 'family';
}

Comment #3

pyutaros commented 12 September 2008 at 03:46

Okay, once again to aid in my thought processes, here is an example GED file. We need to get this into the existing DB structure, which will be listed in my next post.

0 HEAD
1 DEST ANSTFILE
1 GEDC
2 VERS 5.5
2 FORM Lineage-Linked
1 CHAR UTF-8
1 SOUR PhpGedView
2 NAME PhpGedView Online Genealogy
2 VERS 4.0.3 stable
1 DATE 26 Sep 2007
2 TIME 07:06:52
1 PLAC
2 FORM City, County, State/Province, Country
0 @I1@ INDI
1 NAME First1 Middle1 /Last1/
2 GIVN First1 Middle1
2 SURN Last1
1 SEX M
1 BIRT
2 DATE 26 JUL 1978
2 PLAC Place1
1 CHAN
2 DATE 07 FEB 2007
3 TIME 17:37:12
1 NCHI 1
1 OBJE @M2@
2 TITL First1
1 FAMS @F1@
1 FAMC @F2@
1 NAME First2 Middle2 /Last2/
2 GIVN First2 Middle2
2 SURN Last2
1 SEX F
1 BIRT
2 DATE 02 FEB 1978
2 PLAC Place2
1 CHAN
2 DATE 29 JAN 2007
3 TIME 11:27:17
1 FAMS @F1@
1 FAMC @F4@
0 @I5@ INDI
1 NAME First3 Middle3 /Last1/
2 GIVN First3 Middle3
2 SURN Last1
1 SEX F
1 BIRT
2 DATE 24 JUL 2006
2 PLAC Place3
1 CHAN
2 DATE 29 JAN 2007
3 TIME 10:17:44
1 FAMC @F1@
0 @F1@ FAM
1 MARR
2 DATE 27 MAY 2006
2 PLAC Place4
2 OBJE @M4@
1 CHAN
2 DATE 31 JAN 2007
3 TIME 18:22:52
0 @F2@ FAM
1 CHAN
2 DATE 07 FEB 2007
3 TIME 16:59:03
1 MARR
2 TYPE Religious
2 DATE 20 FEB 1971
2 PLAC Plac5
1 DIV
2 DATE 1994
0 @F4@ FAM
1 CHAN
2 DATE 04 FEB 2007
2 TIME 19:20:38
1 MARR
2 DATE 11 OCT 1975
2 PLAC Place6
0 @M10@ OBJE
1 FILE media/HPIM0631.JPG
1 TITL Title1
0 @M11@ OBJE
1 FILE media/HPIM0601.JPG
1 TITL Title2
0 @M12@ OBJE
1 FILE media/HPIM0566.JPG
1 TITL Title3
0 TRLR

The biggest problem I am seeing out of the gate is creating these relationship entries (FAMC @F1@) on the fly during import. Next post will really outline the new structure. Got to sleep for now.

Comment #4

Microbe commented 12 September 2008 at 22:34

Sorry for my bad communication recently; I haven't been able to do any work on the import feature. It looks like you are starting it, though. If you need help with any coding things I can find time to answer questions, but I don't think I can take on any major tasks at the moment.
Sorry
Peter

Comment #5

pyutaros commented 12 September 2008 at 23:17

Peter,
Please, no apologies necessary!!! :) You've been a tremendous help so far! I guess if you can help me get my thoughts organized here, I would greatly appreciate that as well.
Thanks again,
Jonathan

Comment #6

pyutaros commented 13 September 2008 at 21:26

Here's a quick sketch of the logic loop I'm thinking of for the new import code. Since the old version just dumped the entire file into the family_facts table, initial import was easier, but working with the data in Drupal required massive amounts of work and was still not gedcom compliant in its output. This brings to mind a few methods that might be used to import the data.

Method 1 - Dump ged file into a temp DB and then do further processing with calls to the temp DB.
Method 2 - Parse the ged file line by line. Submit a new node with each 0 level XREF. Append each node with the child level facts. Relationships are stored in a temporary field for each node, which is removed afterward when the relationships are built.
Method 3 - Mostly the same as Method 2, but would add the XREF field back to the tables permanently.

All that being said, a hybrid approach seems appropriate. I like the idea of the temp DB. Here's the proposed description of how the code will change.

Import feature begins at line 73. Line 132 (while (!feof ($fp))) begins the line by line evaluation. I'd like to leave the initial variable setting parameters in lines 134 thru 149. The ged files haven't changed, so how we set our initial variables can remain the same.

We will basically be replacing lines 154 to 223. We'll also want to skip the header section of the ged file. I think we can accomplish this by NOT doing anything unless the XREFs or Fact Codes we are looking for come up.

We have five different conditions we are checking for:

Is $fact_code = INDI (This establishes the beginning of an individual record. For a level 0 line such as "0 @I1@ INDI", the existing regex puts INDI in $fact_code and leaves $value empty. We could also first check for $level = 0 to establish that we're dealing with a parent level entry.)
Is $fact_code = FAM (This establishes the beginning of a family group record.)
Is $fact_code one of [NAME, GIVN, SURN, SEX, BIRT, MARR, DIV]
Is $fact_code one of [PLAC, DATE]
Is $fact_code one of [FAMS, FAMC]

On the other hand, it may be better to just monitor what $level the $gedline is at and pass "tokens" from one while loop to the next.

Level 0 - INDI
- Resets any existing "tokens" to NULL.
- Node is created.
- Passes NID, @A#@, and INDI. (By setting in a variable that is checked next time around.)
Level 1 - Fact Code NAME, SEX, BIRT, MARR, DIV, OR FAMS, FAMC
- Establishes what type of subfact information is being edited.
- Updates node record based on passed information if there is applicable data.
- Passes NID, @A#@, INDI, and Fact Code.
- FAMS and FAMC are relationship entries. These will need to be stored temporarily until the family groups are processed.
Level 2 - Fact Code GIVN, SURN, PLAC, DATE
- Updates node record based on passed information.
Level 3 - Our program does not work with this level.
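
The level-by-level walk above depends on splitting each GED line into its level, XREF, fact code and value. Here is a minimal sketch of that step, reusing the pattern from the old import loop. The function name family_parse_gedline is made up for illustration and is not part of the module.

```php
<?php
// Hypothetical helper (not in the module): splits one GED line into parts
// using the same regular expression as the old import loop.
function family_parse_gedline($gedline) {
  $pattern = '/^\s*(\d+)\s*(?:@([^@]+)@)?\s*(\S+)\s*(.*\S)?\s*$/i';
  if (!preg_match($pattern, $gedline, $matches)) {
    return NULL; // not a recognisable GED line
  }
  return array(
    'level' => (int) $matches[1],
    'xref' => $matches[2],       // '' when the line has no leading @XREF@
    'fact_code' => $matches[3],
    // The value group is optional, so it may be missing from $matches.
    'value' => isset($matches[4]) ? $matches[4] : '',
  );
}

// A level 0 record line: the record type lands in fact_code, value is ''.
$record = family_parse_gedline("0 @I1@ INDI\n");
// A relationship line: the target XREF (with its @ signs) lands in value.
$link = family_parse_gedline("1 FAMS @F1@\n");
```

Note that for "0 @I1@ INDI" the record type ends up in fact_code, not value, which matters when deciding which variable to test at level 0.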

Well, out of time again. I'm getting a better idea of how I want to approach this. If anyone has any input, please chime in.

Comment #7

Microbe commented 13 September 2008 at 23:21

I think a case/switch based system would be good with a very similar structure to what you have added

switch($level){

case 0:
     //save current node and clear variables
     //(on a level 0 line the record type is in $fact_code, $value is empty)
     switch($fact_code){
             case "INDI":
             case "FAM":
                   //remember which kind of record we are inside
                   $Current0Record = $fact_code;
                   break;
      }
      break;
case 1:
      $Current1Record = $fact_code;
      switch($Current0Record){
          case "INDI":
               switch ($fact_code){
                     case "NAME":
                          //save name
                          break;
                     case "BIRT":
                          //save birth
                          break;
                     // ....
               }
               break;
          case "FAM":
               switch ($fact_code){
                     case "FAMC":
                          //some sort of code to work out who is related to whom and save data
                          break;
                     case "FAMS":
                          //some sort of code to work out who is related to whom and save data
                          break;
                     // ....
               }
               break;
       }
       break;
case 2:
       switch ($Current1Record){
                     case "NAME":
                             switch ($fact_code){
                                     case "GIVN":
                                          //save name
                                          break;
                                     case "SURN":
                                          //save Surname
                                          break;
                                     // ....
                             }
                             break;
                     case "BIRT":
                             switch ($fact_code){
                                     case "PLAC":
                                          //save place
                                          break;
                                     case "DATE":
                                          //save date
                                          break;
                                     // ....
                             }
                             break;
                     // ....

       }
       break;
}

Level 3 will be skipped automatically.
Each piece of data can be saved and processed into the right location, since the variable switches make it work like a tree.
The only really difficult part is the family data. To handle it, I think you could save the xref in a database column, which can then be used to refer back to the individual, maybe? Not sure though.
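
To make the xref idea a bit more concrete, here is a rough sketch that keeps the xref-to-node mapping in a plain PHP array rather than a database column, and resolves the family links once every individual has been seen. Everything here is hypothetical: the function name, the array layout and the node IDs are invented for illustration.

```php
<?php
// Hypothetical sketch: none of these names exist in the module.
// $xref_to_nid remembers which node each GED xref became during import;
// $pending holds family links until every individual is known.
function family_resolve_links($xref_to_nid, $pending) {
  $resolved = array();
  foreach ($pending as $link) {
    $xref = trim($link['target'], '@'); // "@I72@" becomes "I72"
    if (isset($xref_to_nid[$xref])) {
      $resolved[] = array(
        'family' => $link['family'],
        'role' => $link['role'],
        'nid' => $xref_to_nid[$xref],
      );
    }
    // Links to unknown xrefs are skipped here; a real import might log them.
  }
  return $resolved;
}

// Made-up node IDs for three individuals.
$xref_to_nid = array('I72' => 101, 'I75' => 102, 'I76' => 103);
$pending = array(
  array('family' => 'F65', 'role' => 'HUSB', 'target' => '@I72@'),
  array('family' => 'F65', 'role' => 'CHIL', 'target' => '@I76@'),
  array('family' => 'F65', 'role' => 'CHIL', 'target' => '@I99@'), // unknown
);
$links = family_resolve_links($xref_to_nid, $pending);
```

The array keeps the xref out of the permanent tables, but it only lives for one request; a column (or a temporary table) would be needed if the import is split over several runs.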

Comment #8

pyutaros commented 20 September 2008 at 03:06

I am very slowly making progress on this. Some issues are cropping up, but I will ask questions when I finish the parts I know how to do. Commented lines are snags. I have only created the switch/case code for INDI nodes. I have not yet created the DB insert code. Very slow going due to other distractions.

	  switch($level){
	    case 0:
		  if ($current0record != NULL){
          //insert db values
		  }
		  $current0record = NULL;
		  
		  switch($fact_code){ //level 0 lines carry INDI/FAM in $fact_code; $value is empty
		    case 'INDI':
			case 'FAM':
			  $current0record = $value;
			  $current0xref = $xref;
			break;
		  }  
		break;
		case 1:
		  switch($current0record){
		    case 'INDI':
			  switch($fact_code){
			    case 'SEX':
				  $gender = $value;
				  $current1record = $fact_code;
				break;
				case 'NCHI':
				  $children_num = $value;
				  $current1record = $fact_code;
				break;
				case 'DEAT':
				case 'BIRT':
				case 'NAME':
				  $current1record = $fact_code;
				break;
				case 'FAMS':
				  $fams = $value;
				break;
				case 'FAMC':
				  $famc = $value;
				break;
			  }
			break;
			case 'FAM':
			break;
		  }
		break;
		case 2:
		  switch($current1record){
		    case 'NAME':
			  switch($fact_code){
			    case 'GIVN':
				//split data and create variables for first and middle names
				break;
				case 'SURN':
				  $lastname = $value;
				break;
			  }
			break;
			case 'BIRT':
			  switch($fact_code){
			    case 'DATE':
				//convert ged file date into YYYY-MM-DD format
				break;
				case 'PLAC':
				  $birthplace = $value;
				break;
			  }
			break;
			case 'DEAT':
			  switch($fact_code){
			    case 'DATE':
				//convert ged file date into YYYY-MM-DD format
				break;
				case 'PLAC':
				  $deathplace = $value;
				break;
			  }
			break;
		  }
		break;
	  }

Comment #9

Microbe commented 20 September 2008 at 10:14

A date conversion function that I quickly wrote:

function family_changeDateFormat($oldFormat){
 $dateData = explode(" ", $oldFormat);
 $day = $dateData[0];
 $year = $dateData[2];
 switch($dateData[1]){
  case "JAN":
     $month = "01";
     break;
  case "FEB":
     $month = "02";
     break;
  case "MAR":
     $month="03";
     break;
  case "APR":
     $month="04";
     break;
  case "MAY":
     $month="05";
     break;
  case "JUN":
     $month="06";
     break;
  case "JUL":
     $month="07";
     break;
  case "AUG":
     $month="08";
     break;
  case "SEP":
     $month="09";
     break;
  case "OCT":
     $month="10";
     break;
  case "NOV":
     $month="11";
     break;
  case "DEC":
     $month="12";
     break;
 }
 $newDate="$year-$month-$day";
 return $newDate;
}

It has been tried and tested, so it should work fine.
You can also use the explode function to split the names.
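
For GED files that contain partial dates (the sample file earlier in this thread has "2 DATE 1994"), a lookup-table variant might be worth considering. This is only a sketch: the name family_ged_date_to_iso is hypothetical, and returning a shortened string for a partial date is an assumption about what the caller wants to store.

```php
<?php
// Hypothetical variant of the date conversion, not the module's function.
// Handles "26 JUL 1978", "FEB 1971" and "1994"; anything else returns ''.
function family_ged_date_to_iso($ged_date) {
  $months = array(
    'JAN' => '01', 'FEB' => '02', 'MAR' => '03', 'APR' => '04',
    'MAY' => '05', 'JUN' => '06', 'JUL' => '07', 'AUG' => '08',
    'SEP' => '09', 'OCT' => '10', 'NOV' => '11', 'DEC' => '12',
  );
  $parts = preg_split('/\s+/', strtoupper(trim($ged_date)));
  $n = count($parts);
  // Full date: day, month, year. The day is zero-padded to two digits.
  if ($n == 3 && ctype_digit($parts[0]) && isset($months[$parts[1]]) && ctype_digit($parts[2])) {
    return sprintf('%04d-%s-%02d', $parts[2], $months[$parts[1]], $parts[0]);
  }
  // Month and year only.
  if ($n == 2 && isset($months[$parts[0]]) && ctype_digit($parts[1])) {
    return sprintf('%04d-%s', $parts[1], $months[$parts[0]]);
  }
  // Year only.
  if ($n == 1 && ctype_digit($parts[0])) {
    return sprintf('%04d', $parts[0]);
  }
  return ''; // qualifiers such as ABT or BEF are not handled in this sketch
}

$birthdate = family_ged_date_to_iso('2 FEB 1978'); // "1978-02-02"
```

Whether a partial value like "1994" fits the date column depends on the column type, so the caller may prefer to store an empty value instead.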

Comment #10

Microbe commented 20 September 2008 at 10:20

Having looked at the GEDCOM source that I have, it doesn't have GIVN and SURN lines after the NAME line, so maybe it would be safer to stick to splitting up the NAME line and not use the lines below. Not sure though? I will send you my GEDCOM source so you can see and decide.

Comment #11

pyutaros commented 20 September 2008 at 14:29

Peter,
Thanks for the GED file. That's a big help to see how another program sets up a GED file. We should definitely do the name handling at level 1 instead of level 2. I know the solution in either case involves preg_match, but I still need to read more to understand that. Also, excuse me for being dense, but as for your date function, should I just put that in common.inc and then call it with something like $birthdate = family_changeDateFormat($value)?
Thanks,
Jonathan

Comment #12

Microbe commented 20 September 2008 at 14:58

You're spot on for how to use the date function :)

I would use two explodes for the name splitting, maybe, as preg_match is hard to use (well, I tried and couldn't work out how).

I would use them as follows:

//split name value by / to separate surname
$splitName1 = explode("/", $value);
$SURNAME = $splitName1[1];

// split name by spaces
$splitName2 = explode(" ", trim($splitName1[0]));

// take the first name to be firstname
$FIRSTNAME = $splitName2[0];

// add all the other names together in a string (however many there are)
$MIDDLENAMES = implode(" ", array_slice($splitName2, 1));

This certainly isn't the most effective way to get it to work, and if you can get preg_match to work it will be more versatile.
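
For comparison, here is one possible preg_match version of the name split. It is a sketch only: family_split_name is a hypothetical name, it assumes the usual "Given names /Surname/" layout of a NAME value, and it ignores anything after the closing slash (such as a suffix).

```php
<?php
// Hypothetical helper, not part of the module.
// Assumes a NAME value shaped like "First Middle /Surname/".
function family_split_name($value) {
  $surname = '';
  $given = $value;
  // Capture the text before the first slash and the text between the slashes.
  if (preg_match('#^(.*?)/([^/]*)/(.*)$#', $value, $m)) {
    $given = $m[1];
    $surname = trim($m[2]);
  }
  // Split the given names on any run of whitespace, dropping empty pieces.
  $names = preg_split('/\s+/', trim($given), -1, PREG_SPLIT_NO_EMPTY);
  $first = count($names) ? array_shift($names) : '';
  return array(
    'firstname' => $first,
    'middlename' => implode(' ', $names), // '' when there are no middle names
    'lastname' => $surname,
  );
}

$name = family_split_name('First1 Middle1 /Last1/');
// $name['firstname'] is "First1", $name['middlename'] is "Middle1",
// $name['lastname'] is "Last1".
```

Because the surname is taken from between the slashes, multi-word surnames stay in one piece, and a NAME value without slashes simply yields an empty lastname.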

Comment #13

pyutaros commented 20 September 2008 at 16:02

Well, so much for any kind of standard in GEDCOM. Take a look at how the FAM records are referred to in your file, then look at mine.

YOURS

0 @F65@ FAM
1 HUSB @I72@
1 WIFE @I75@
1 CHIL @I76@
1 CHIL @I77@
1 CHIL @I78@
1 @E89@ MARR
1 CHAN
2 DATE 23 FEB 2008
3 TIME 23:41:54

MINE

0 @F9@ FAM
1 MARR
2 TYPE Religious
2 DATE 29 JUN 1974
2 PLAC Straide,Co Mayo,Ireland
1 CHAN
2 DATE 07 FEB 2007
3 TIME 17:03:38

Just an observation. It's still workable. I think also what I'm seeing here is that you never entered information like type and date into your original program. It definitely gives me plenty to think about when creating the export file, but that is a whole other story. Anyhow, back to work.

Comment #14

Microbe commented 20 September 2008 at 18:28

Oh dear...

I think it's only the positioning of the relations (HUSB, WIFE and CHIL on mine, and FAMS and FAMC on yours). I haven't added data like date, place and type, so those should come up the same.

I'm not entirely sure, but looking at the old family module source code, mine seems more like what it should be, as there are a lot of searches for CHIL and HUSB but none (as far as I have seen) for FAMS and FAMC, which yours uses.

I'm now really confused because GEDCOM is supposed to be a very defined standard. :(
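
As far as the GEDCOM 5.5 lineage-linked form goes, both directions are defined: HUSB, WIFE and CHIL inside a FAM record, and FAMS and FAMC inside an INDI record. An importer could accept either style by folding both into one structure. The sketch below shows the idea; family_add_link and the array layout are hypothetical and not taken from the module.

```php
<?php
// Hypothetical sketch: folds HUSB/WIFE/CHIL lines (inside a FAM record)
// and FAMS/FAMC lines (inside an INDI record) into one $families array.
function family_add_link(&$families, $record_type, $record_xref, $fact_code, $value) {
  $target = trim($value, "@ \t\r\n"); // "@I72@" becomes "I72"
  if ($record_type == 'FAM' && in_array($fact_code, array('HUSB', 'WIFE', 'CHIL'))) {
    $fam = $record_xref;
    $indi = $target;
    $role = ($fact_code == 'CHIL') ? 'children' : 'parents';
  }
  elseif ($record_type == 'INDI' && in_array($fact_code, array('FAMS', 'FAMC'))) {
    $fam = $target;
    $indi = $record_xref;
    $role = ($fact_code == 'FAMC') ? 'children' : 'parents';
  }
  else {
    return; // not a relationship line
  }
  if (!isset($families[$fam])) {
    $families[$fam] = array('parents' => array(), 'children' => array());
  }
  // A file that lists both directions would otherwise add each person twice.
  if (!in_array($indi, $families[$fam][$role])) {
    $families[$fam][$role][] = $indi;
  }
}

$families = array();
family_add_link($families, 'FAM', 'F65', 'HUSB', '@I72@');
family_add_link($families, 'FAM', 'F65', 'CHIL', '@I76@');
family_add_link($families, 'INDI', 'I72', 'FAMS', '@F65@'); // same link, other direction
family_add_link($families, 'INDI', 'I5', 'FAMC', '@F1@');
```

With this, a file like mine and a file like yours end up in the same shape, and the parents and children of each family group can be looked up by the family xref afterwards.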

Comment #15

pyutaros commented 20 September 2008 at 21:19

I'd say you're probably right about your file being closer to standard. I guess it may just be a statement on how phpGedView handles the data. They obviously have their own "standard" of GEDCOM compliance.

I had to end this in mid coding again. Mostly done. Have to insert DB variables. Also have to create node. Have to look at the node creation text that the import previously used. Then I suppose I'm going to have to figure out what I'm going to do with all the relationship data. Here's where it currently stands, along with the old node creation routine following. No more till Tues.

	  switch($level){
	    case 0:
		  switch($current0record){
		    case 'INDI':
			  db_query("INSERT INTO {family_individual} (vid, nid, title_format, firstname, middlename, lastname, gender, birthdate, birthplace, deathdate, deathplace, ancestor_group) VALUES (%d, %d, '%s',  '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s')", $node->vid, $node->nid, $node->title_format, $node->FORE, $node->MIDN, $node->SURN, $node->SEX, $node->BIRT_DATE, $node->BIRT_PLAC,$node->DEAT_DATE, $node->DEAT_PLAC, $node->GRUP);
			break;
			case 'FAM':
			  db_query("INSERT INTO {family_group} (vid, nid, title_format, marr_type, marr_date, marr_plac, div_date, div_plac, parent1, parent2) VALUES (%d, %d, '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s')", 
  $node->vid, $node->nid, $node->title_format, $node->MARR_TYPE, $node->MARR_DATE, $node->MARR_PLAC, $node->DIV_DATE, $node->DIV_PLAC,$node->PAR1, $node->PAR2);
			break;
		  }
		  $current0record = NULL;
		  
		  switch($fact_code){ //level 0 lines carry INDI/FAM in $fact_code; $value is empty
		    case 'INDI':
			case 'FAM':
			  $current0record = $value;
			  $current0xref = $xref;
			break;
		  }  
		break;
		case 1:
		  switch($current0record){
		    case 'INDI':
			  switch($fact_code){
			    case 'SEX':
				  $gender = $value;
				  $current1record = $fact_code;
				break;
				case 'NCHI':
				  $children_num = $value;
				  $current1record = $fact_code;
				break;
				case 'NAME':
				  //split name value by / to separate surname
				  $splitName1 = explode("/", $value);
				  $lastname = $splitName1[1];
				  // split name by spaces
				  $splitName2 = explode(" ", $splitName1[0]);
				  // take the first name to be firstname
				  $FIRSTNAME = $splitName2[0];
				  // add all the other names together in a string
				  $MIDDLENAME = $splitName2[1] . " " . $splitName2[2] . " " . $splitName2[3] . " " . $splitName2[4] . " " . $splitName2[5] . " " . $splitName2[6] . " " . $splitName2[7];
				  $current1record = $fact_code;
				break;
				case 'DEAT':
				case 'BIRT':
				  $current1record = $fact_code;
				break;
				case 'FAMS':
				  $fams_xref = $value;
				break;
				case 'FAMC':
				  $famc_xref = $value;
				break;
			  }
			break;
			case 'FAM':
			  switch($fact_code){
			    case 'MARR':
				case 'DIV':
				  $current1record = $fact_code;
				break;
			  }
			break;
		  }
		break;
		case 2:
		  switch($current1record){
		    case 'BIRT':
			  switch($fact_code){
			    case 'DATE':
				  $birthdate = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $birthplace = $value;
				break;
			  }
			break;
			case 'DEAT':
			  switch($fact_code){
			    case 'DATE':
				  $deathdate = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $deathplace = $value;
				break;
			  }
			break;
			case 'MARR':
			  switch($fact_code){
			    case 'TYPE':
				  $marr_type = $value;
				break;
				case 'DATE':
				  $marr_date = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $marr_plac = $value;
				break;
			  }
			break;
			case 'DIV':
			  switch($fact_code){
			    case 'DATE':
				  $div_date = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $div_plac = $value;
				break;
			  }
			break;
		  }
		break;
	  }

	// Every INDI fact gets a node
	//
        $fid = db_next_id('{family_facts}_fid');
        $nid=NULL;
        $relation="FACT";

        if ($fact_code == 'INDI') {
          unset($node);
          $node->type = "family_individual";
          $node->uid = $user->uid;
          $node->title = "Unknown";
          $node->status = 1;
          $node->moderate = 0;
          $node->comment = 2;
          $node->revision = 0;
          $node->fid = $fid;
          $node->xref = $xref;
          $node->value = $value;
          $node->gedcom_source = $gedcom_source;
          node_validate($node, $error);
          if (!node_access("create", $node)) {
            $error['access'] = message_access();
          }
          if ($error) {
   	    drupal_set_message(
              t(
		'Error at line @lnum of GED (@line): @error.',
              	array('@lnum' => $lnum, '@line' => $gedline, '@error' => print_r($error,true))
              )
	    );
          }
          else {
            node_save($node);
            $nid=$node->nid;
          }
          unset($node);
        }
	else
	{
          db_query("INSERT INTO {family_facts} (fid, nid, xref, fact_code, fact_value, gedcom_source)
	            VALUES (%d, %d, '%s', '%s', '%s', '%s')",
	  	    $fid, $nid, $xref,$fact_code,$value,$gedcom_source);
	}

Comment #16

Microbe commented 20 September 2008 at 21:48

My only comment is that the variables you are entering into the database need to be the same as the ones you are assigning values to,
e.g. the gender value goes to $gender, whereas you insert the value of $node->SEX.

Comment #17

Microbe commented 22 September 2008 at 19:43

oops sorry you said this in your post :(

Comment #18

pyutaros commented 27 September 2008 at 21:15

Okay. Here's where it stands today. All node creation and table insertion is complete. I believe the only nagging detail is going back and inserting the NID into the ancestor_group field for individuals. Should be small work, but I'm out of time. Also, I discovered a few flaws in how I am evaluating things. One such flaw: if children do not share PARENT1's surname, some relationship data may be skewed.

Finally, I ASSUME that group data will not appear in the GED file until the end. If this is not the case for a file, the import will fail. My head is definitely spinning. Here is the current code:

//NEW IMPORT CODE STARTS HERE
	  //import creates a temp DB to help map relations
	  
	  switch($level){
	    case 0:
		  switch($current0record){
		    case 'INDI':
			  //create title_format variable that has not yet been set.
			  $title_format = $firstname . " " . $middlename . " " . $lastname; //Will change with the implementation of tokens.
			  //Create family_individual node
			  unset($node);
              $node->type = "family_individual";
              $node->uid = $user->uid;
              $node->title = $title_format;
              $node->status = 1;
              $node->moderate = 0;
              $node->comment = 2;
              $node->revision = 0;
              node_validate($node, $error);
              if (!node_access("create", $node)) {
                $error['access'] = message_access();
              }
              if ($error) {
   	            drupal_set_message(
                  t('Error at line @lnum of GED (@line): @error.', 
				    array('@lnum' => $lnum, '@line' => $gedline, '@error' => print_r($error,true))
                  )
				);
              }
              else {
                node_save($node);
				$vid=$node->vid;
                $nid=$node->nid;
              }
              unset($node);
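			  //Insert relationship variables into the temporary table.
			  //db_query_temporary() only turns a SELECT into CREATE TEMPORARY TABLE, so the INSERT goes through db_query().
			  db_query("INSERT INTO {family_relations_temp} (nid, famc_xref, fams_xref) VALUES (%d, '%s', '%s')", $nid, $famc_xref, $fams_xref);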
			  //insert data into family_individual table
			  db_query("INSERT INTO {family_individual} (vid, nid, title_format, firstname, middlename, lastname, gender, birthdate, birthplace, deathdate, deathplace, children_num, ancestor_group) VALUES (%d, %d, '%s',  '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', %d, '%s')", $vid, $nid, $title_format, $firstname, $middlename, $lastname, $gender, $birthdate, $birthplace, $deathdate, $deathplace, $children_num, $famc_notset);
			break;
			case 'FAM':
			  //Find the group surname shared by the children of the group. - This may be incorrect in cases where the name has not been passed to the children.
			  //db_result() returns the single field, so the values can be passed on as plain numbers and strings.
			  $child_ref_nid = db_result(db_query("SELECT nid FROM {family_relations_temp} WHERE famc_xref = '%s'", $current0xref));
			  $group_surname = db_result(db_query("SELECT lastname FROM {family_individual} WHERE nid = %d", $child_ref_nid));
			  //Find Parents of group
			  $parent1 = NULL;
			  $parent2 = NULL;
			  $result = db_query("SELECT nid FROM {family_relations_temp} WHERE fams_xref = '%s'", $current0xref);
			  while ($parent = db_fetch_object($result)) {
			    $parent_surname = db_result(db_query("SELECT lastname FROM {family_individual} WHERE nid = %d", $parent->nid));
			    if ($parent_surname == $group_surname){ //compare with ==, a single = would assign
				  $parent1 = $parent->nid;
				} else {
				  $parent2 = $parent->nid;
				}
			  }
			  //Set title format
			  $parent1_firstname = db_result(db_query("SELECT firstname FROM {family_individual} WHERE nid = %d", $parent1));
			  $parent2_firstname = db_result(db_query("SELECT firstname FROM {family_individual} WHERE nid = %d", $parent2));
			  $title_format = $parent1_firstname . " and " . $parent2_firstname . " " . $group_surname; //Will change with the implementation of tokens.
			  //create group node
			  unset($node);
              $node->type = "family_group";
              $node->uid = $user->uid;
              $node->title = $title_format;
              $node->status = 1;
              $node->moderate = 0;
              $node->comment = 2;
              $node->revision = 0;
              node_validate($node, $error);
              if (!node_access("create", $node)) {
                $error['access'] = message_access();
              }
              if ($error) {
   	            drupal_set_message(
                  t('Error at line @lnum of GED (@line): @error.', 
				    array('@lnum' => $lnum, '@line' => $gedline, '@error' => print_r($error,true))
                  )
				);
              }
              else {
                node_save($node);
				$vid=$node->vid;
                $nid=$node->nid;
              }
              unset($node);
			  //insert variables into family_group table
			  db_query("INSERT INTO {family_group} (vid, nid, title_format, marr_type, marr_date, marr_plac, div_date, div_plac, parent1, parent2) VALUES (%d, %d, '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s')", 
  $vid, $nid, $title_format, $marr_type, $marr_date, $marr_plac, $div_date, $div_plac, $parent1, $parent2);
			break;
		  }
		  $current0record = NULL;
		  
		  switch($fact_code){ //level 0 lines carry INDI/FAM in $fact_code; $value is empty
		    case 'FAM':
			case 'INDI':
			  $current0record = $value;
			  $current0xref = $xref;
			break;
		  }  
		break;
		case 1:
		  switch($current0record){
		    case 'INDI':
			  switch($fact_code){
			    case 'SEX':
				  $gender = $value;
				  $current1record = $fact_code;
				break;
				case 'NCHI':
				  $children_num = $value;
				  $current1record = $fact_code;
				break;
				case 'NAME':
				  //split name value by / to separate surname
				  $splitName1 = explode("/", $value);
				  $lastname = $splitName1[1];
				  // split name by spaces
				  $splitName2 = explode(" ", $splitName1[0]);
				  // take the first name to be firstname
				  $firstname = $splitName2[0];
				  // add all the other names together in a string
				  $middlename = $splitName2[1] . " " . $splitName2[2] . " " . $splitName2[3] . " " . $splitName2[4] . " " . $splitName2[5] . " " . $splitName2[6] . " " . $splitName2[7];
				  $current1record = $fact_code;
				break;
				case 'DEAT':
				case 'BIRT':
				  $current1record = $fact_code;
				break;
				case 'FAMS':
				  $fams_xref = $value;
				break;
				case 'FAMC':
				  $famc_xref = $value;
				break;
			  }
			break;
			case 'FAM':
			  switch($fact_code){
			    case 'MARR':
				case 'DIV':
				  $current1record = $fact_code;
				break;
			  }
			break;
		  }
		break;
		case 2:
		  switch($current1record){
		    case 'BIRT':
			  switch($fact_code){
			    case 'DATE':
				  $birthdate = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $birthplace = $value;
				break;
			  }
			break;
			case 'DEAT':
			  switch($fact_code){
			    case 'DATE':
				  $deathdate = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $deathplace = $value;
				break;
			  }
			break;
			case 'MARR':
			  switch($fact_code){
			    case 'TYPE':
				  $marr_type = $value;
				break;
				case 'DATE':
				  $marr_date = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $marr_plac = $value;
				break;
			  }
			break;
			case 'DIV':
			  switch($fact_code){
			    case 'DATE':
				  $div_date = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $div_plac = $value;
				break;
			  }
			break;
		  }
		break;
	  }

Thanks,
Jonathan

Comment #19

pyutaros commented 1 October 2008 at 02:51

Okay. I haven't even remotely tested this yet. Here is the first draft of the import code. It has been committed to ver 5.x-3.x-dev. Download should be available after midnight.

//new ged file evaluation code
	  switch($level){
	    case 0:
		  switch($current0record){
		    case 'INDI':
			  //create title_format variable that has not yet been set.
			  $title_format = $firstname . " " . $middlename . " " . $lastname; //Will change with the implementation of tokens.
			  //Create family_individual node
			  unset($node);
              $node->type = "family_individual";
              $node->uid = $user->uid;
              $node->title = $title_format;
              $node->status = 1;
              $node->moderate = 0;
              $node->comment = 2;
              $node->revision = 0;
              node_validate($node, $error);
              if (!node_access("create", $node)) {
                $error['access'] = message_access();
              }
              if ($error) {
   	            drupal_set_message(
                  t('Error at line @lnum of GED (@line): @error.', 
				    array('@lnum' => $lnum, '@line' => $gedline, '@error' => print_r($error,true))
                  )
				);
              }
              else {
                node_save($node);
				$vid=$node->vid;
                $nid=$node->nid;
              }
              unset($node);
			  //Insert relationship variables into temporary database
			  db_query_temporary("INSERT INTO {family_relations_temp} (nid, famc_xref, fams_xref) VALUES (%d, '%s', '%s')", $nid, $famc_xref, $fams_xref);
			  //insert data into family_individual table
			  db_query("INSERT INTO {family_individual} (vid, nid, title_format, firstname, middlename, lastname, gender, birthdate, birthplace, deathdate, deathplace, children_num) VALUES (%d, %d, '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', %d)", $vid, $nid, $title_format, $firstname, $middlename, $lastname, $gender, $birthdate, $birthplace, $deathdate, $deathplace, $children_num);
			  //unset all INDI variables
			  unset ($current0record);
			  unset ($famc_xref);
			  unset ($fams_xref);
			  unset ($vid);
			  unset ($nid);
			  unset ($title_format);
			  unset ($firstname);
			  unset ($middlename);
			  unset ($lastname);
			  unset ($gender);
			  unset ($birthdate);
			  unset ($birthplace);
			  unset ($deathdate);
			  unset ($deathplace);
			  unset ($children_num);
			break;
			case 'FAM':
			  //Find the group surname shared by the children of the group. - This may be incorrect in cases where the name has not been passed to the children.
			  $result = db_query_temporary("SELECT nid FROM {family_relations_temp} WHERE famc_xref = '%s'", $current0xref);
			  $child_ref_nid = db_result($result); //db_result() returns the single field instead of a row object
			  $result = db_query("SELECT lastname FROM {family_individual} WHERE nid = %d", $child_ref_nid);
			  $group_surname = db_result($result);
			  //Find Parents of group
			  $result = db_query_temporary("SELECT nid FROM {family_relations_temp} WHERE fams_xref = '%s'", $current0xref);
			  while ($parent_row = db_fetch_object($result)) {
			    $parent_nid = $parent_row->nid;
			    $result2 = db_query("SELECT lastname FROM {family_individual} WHERE nid = %d", $parent_nid);
				$parent_surname = db_result($result2);
			    //compare the surnames (== rather than =, which would assign)
			    if ($parent_surname == $group_surname){
				  $parent1 = $parent_nid;
				} else {
				  $parent2 = $parent_nid;
				}
			  }
			  //Set title format
			  $result = db_query("SELECT firstname FROM {family_individual} WHERE nid = %d", $parent1);
			  $parent1_firstname = db_result($result);
			  $result = db_query("SELECT firstname FROM {family_individual} WHERE nid = %d", $parent2);
			  $parent2_firstname = db_result($result);
			  $title_format = $parent1_firstname . " and " . $parent2_firstname . " " . $group_surname; //Will change with the implementation of tokens.
			  //create group node
			  unset($node);
              $node->type = "family_group";
              $node->uid = $user->uid;
              $node->title = $title_format;
              $node->status = 1;
              $node->moderate = 0;
              $node->comment = 2;
              $node->revision = 0;
              node_validate($node, $error);
              if (!node_access("create", $node)) {
                $error['access'] = message_access();
              }
              if ($error) {
   	            drupal_set_message(
                  t('Error at line @lnum of GED (@line): @error.', 
				    array('@lnum' => $lnum, '@line' => $gedline, '@error' => print_r($error,true))
                  )
				);
              }
              else {
                node_save($node);
				$vid=$node->vid;
                $nid=$node->nid;
              }
              unset($node);
			  //insert ancestor group value into INDI nodes related to this group
			  db_query("UPDATE {family_individual} SET ancestor_group='%d' WHERE lastname='%s'", $nid, $group_surname);
			  //insert variables into family_group table
			  db_query("INSERT INTO {family_group} (vid, nid, title_format, marr_type, marr_date, marr_plac, div_date, div_plac, parent1, parent2) VALUES (%d, %d, '%s', '%s', '%s', '%s', '%s', '%s', '%d', '%d')", $vid, $nid, $title_format, $marr_type, $marr_date, $marr_plac, $div_date, $div_plac, $parent1, $parent2);
			  //unset all FAM variables
			  unset ($current0record);
			  unset ($current0xref);
			  unset ($child_ref_nid);
			  unset ($group_surname);
			  unset ($parent_surname);
			  unset ($parent1_firstname);
			  unset ($parent2_firstname);
			  unset ($vid);
			  unset ($nid);
			  unset ($title_format);
			  unset ($marr_type);
			  unset ($marr_date);
			  unset ($marr_plac);
			  unset ($div_date);
			  unset ($div_plac);
			  unset ($parent1);
			  unset ($parent2);
			break;
		  }
		  $current0record = NULL;
		  
		  switch($value){
		    case 'FAM':
			case 'INDI':
			  $current0record = $value;
			  $current0xref = $xref;
			break;
		  }  
		break;
		case 1:
		  switch($current0record){
		    case 'INDI':
			  switch($fact_code){
			    case 'SEX':
				  $gender = $value;
				  $current1record = $fact_code;
				break;
				case 'NCHI':
				  $children_num = $value;
				  $current1record = $fact_code;
				break;
				case 'NAME':
				  //split name value by / to separate surname
				  $splitName1 = explode("/", $value);
				  $lastname = $splitName1[1];
				  // split name by spaces
				  $splitName2 = explode(" ", $splitName1[0]);
				  // take the first name to be firstname
				  $firstname = $splitName2[0];
				  // add all the other names together in a string
				  $middlename = $splitName2[1] . " " . $splitName2[2] . " " . $splitName2[3] . " " . $splitName2[4] . " " . $splitName2[5] . " " . $splitName2[6] . " " . $splitName2[7];
				  $current1record = $fact_code;
				break;
				case 'DEAT':
				case 'BIRT':
				  $current1record = $fact_code;
				break;
				case 'FAMS':
				  $fams_xref = $value;
				break;
				case 'FAMC':
				  $famc_xref = $value;
				break;
			  }
			break;
			case 'FAM':
			  switch($fact_code){
			    case 'MARR':
				case 'DIV':
				  $current1record = $fact_code;
				break;
			  }
			break;
		  }
		break;
		case 2:
		  switch($current1record){
		    case 'BIRT':
			  switch($fact_code){
			    case 'DATE':
				  $birthdate = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $birthplace = $value;
				break;
			  }
			break;
			case 'DEAT':
			  switch($fact_code){
			    case 'DATE':
				  $deathdate = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $deathplace = $value;
				break;
			  }
			break;
			case 'MARR':
			  switch($fact_code){
			    case 'TYPE':
				  $marr_type = $value;
				break;
				case 'DATE':
				  $marr_date = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $marr_plac = $value;
				break;
			  }
			break;
			case 'DIV':
			  switch($fact_code){
			    case 'DATE':
				  $div_date = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $div_plac = $value;
				break;
			  }
			break;
		  }
		break;
	  }
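
A side note on the NAME handling in the draft above: gluing together array indexes 1 through 7 leaves trailing spaces in the middle name and touches offsets that may not exist. A minimal sketch of one alternative follows. split_gedcom_name() is a hypothetical helper name, not something in the module, and it assumes the usual GEDCOM convention of a surname wrapped in slashes.

```php
<?php
// Sketch only: split_gedcom_name() is a hypothetical helper, not part of the module.
function split_gedcom_name($value) {
  // GEDCOM wraps the surname in slashes, e.g. "Test E /McTester/".
  $parts = explode('/', $value);
  $lastname = isset($parts[1]) ? trim($parts[1]) : '';
  // Split the given names on whitespace and drop empty pieces.
  $given = preg_split('/\s+/', trim($parts[0]), -1, PREG_SPLIT_NO_EMPTY);
  $firstname = count($given) ? $given[0] : '';
  // Everything after the first given name becomes the middle name.
  $middlename = implode(' ', array_slice($given, 1));
  return array($firstname, $middlename, $lastname);
}

list($first, $middle, $last) = split_gedcom_name('Test E /McTester/');
list($first2, $middle2, $last2) = split_gedcom_name('/Smith/');
```

Because the middle name is built from whatever pieces are left, it works for any number of given names and comes out empty when there is only one.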

Comment #20

pyutaros commented 4 October 2008 at 00:35

Did a quick test. The new code creates nodes, but they are of an unknown type. Still trying to figure out what the problem could be; I think it has something to do with the node_save function.

Comment #21

pyutaros commented 8 October 2008 at 04:49

The problem was with the variable being used in the switch/case evaluation structure. $current0record was never being set, so the remaining values were not being set either. I corrected the variable and a few other errors (the include statement, duplicate functions). Now the nodes import as the proper type, but no data is coming through. I already did one test with echo statements, and it looks like all the variables are there. I am receiving errors like the following on import.

    * user warning: Table 'idkdbla4_testlab5.family_relations_temp' doesn't exist query: INSERT INTO family_relations_temp (nid, famc_xref, fams_xref) VALUES (1718, '@F2@', '') in /includes/database.mysql.inc on line 172.
    * user warning: Duplicate entry '1718-1718' for key 1 query: INSERT INTO family_individual (vid, nid, title_format, firstname, middlename, lastname, gender, birthdate, birthplace, deathdate, deathplace, children_num) VALUES (1718, 1718, 'Test E McTester', 'Test', 'E ', 'McTester', 'M', '2007-02-07', 'DeWitt Army Hosp,Ft Belvoir,VA', '', '', 1) in includes/database.mysql.inc on line 172.
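
For anyone following along, here is a small sketch of why the level 0 switch has to look at the tag and not the value. It reuses the regular expression from the import code; parse_gedcom_line() is a hypothetical helper written only for this illustration. It also shows something that may matter later: the record xref is captured without its @ signs, while FAMC/FAMS values keep them (compare '@F2@' in the first warning above), so the two would probably need normalizing before they are compared.

```php
<?php
// Sketch only: parse_gedcom_line() is a hypothetical helper, not part of the module.
// The pattern is the one used in family_import_form_submit().
function parse_gedcom_line($line) {
  $pattern = '/^\s*(\d+)\s*(?:@([^@]+)@)?\s*(\S+)\s*(.*\S)?\s*$/i';
  if (!preg_match($pattern, $line, $m)) {
    return NULL;
  }
  return array(
    'level' => (int) $m[1],
    'xref'  => $m[2],                      // '' when the line has no @XREF@
    'tag'   => $m[3],                      // $fact_code in the import code
    'value' => isset($m[4]) ? $m[4] : '',  // the trailing group may be absent
  );
}

// Sample lines made up for the illustration.
$indi = parse_gedcom_line("0 @I1@ INDI");
$famc = parse_gedcom_line("1 FAMC @F2@");
```

On a record line such as `0 @I1@ INDI`, the INDI lands in the tag and the value stays empty, which is why switching on $value never set $current0record.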

Here is the import code as it stands currently. Too cloudy to go any further tonight.

<?php
// $Id: import.inc,v 1.5.4.5.4.12 2008/10/08 04:36:28 pyutaros Exp $
require_once "includes/common.inc";
// import.inc
// Functions for importing GEDCOM files to database
// Using the data base defined in simple.mysql
// This may also be used as a temporary storage before making more processing
// for import to other database format.

ini_set('auto_detect_line_endings', true);

//Generate a form for uploading a GEDCOM file
function family_import() {
  return drupal_get_form('family_import_form');
}

function family_import_form() {
  $form['#attributes'] = array('enctype' => "multipart/form-data");
  $form['gedcom_file'] = array(
    '#type' => 'file',
    '#title' => t('GED file to upload'),
    '#size' => 40,
  );
  //$form['merge'] = array(
  //  '#type' => 'radios',
  //  '#title' => t('Merge options'),
  //  '#options' => array(t('replace existing data'), t('augment current data'), t('merge individuals by name')),
  //  '#default_value' => variable_get('family_import_replace', 1),
  //);

  $form['range'] = array(
    '#type' => 'fieldset',
    '#title' => t('Import range'),
    '#description' => t('Select a range of records (lines starting with 0) to import.  This allows breaking very large files into multiple import sessions.'),
  );
  $form['range']['start'] = array(
    '#type' => 'textfield',
    '#title' => t('First record to import'),
    '#size' => 10,
    '#maxlength' => 10,
    '#description' => t('Enter the number of the first record in the GEDCOM file to include in this import session'),
  );
  $form['range']['nrecords'] = array(
    '#type' => 'textfield',
    '#title' => t('Number of records to import'),
    '#size' => 10,
    '#maxlength' => 10,
    '#description' => t('Enter the number of records to process in this import session'),
  );
  $form['replace'] = array(
    '#type' => 'checkbox',
    '#title' => t('Replace existing GED data'),
    '#default_value' => variable_get('family_import_replace', 1),
  );
  $form['submit'] = array('#type' => 'submit', '#value' => t('Start Import'));
  return $form;
}

// Check the uploaded GEDCOM file
function family_import_form_validate($form_id, $form_values) {
  $file = file_check_upload('gedcom_file');
  if (!$file) {
    form_set_error('',t("Didn't get GED file"));
    return; // nothing to open
  }
  
  $fp = fopen($file->filepath , "r" );
  if (!$fp) {
    form_set_error('',t("Couldn't open GED file"));
    return; // don't fclose() a failed handle
  }
  fclose($fp);
}

// Parse the uploaded GEDCOM file
function family_import_form_submit($form_id, $form_values) {
  global $user; // $node->uid below reads the current user
  $file = file_check_upload('gedcom_file');
  if (!$file) {
    form_set_error('',t("Didn't get GED file"));
    return;
  }
  
  $fp = fopen($file->filepath , "r" );
  if (!$fp) {
    form_set_error('',t("Couldn't open GED file"));
    return;
  }

  
  //
  // Empty current content. This is useful for debugging, but more caution should be
  // done before deleting database in the working version
  //
  if ($form_values['replace'])
  {
    db_query("TRUNCATE {family_individual}");
    db_query("TRUNCATE {family_group}");
    db_query("TRUNCATE {family_location}");
    db_query("TRUNCATE {family_variable}");
    
    $q = db_query("select nid from {node} where type = 'family_individual'");
    $n = 0;
    while ($o = db_fetch_object($q)) {
      node_delete($o->nid);
      $n++;
    }
    drupal_set_message(t('Deleted @n family_individual nodes.', array('@n' => $n)));
    
	$q = db_query("select nid from {node} where type = 'family_group'");
    $n = 0;
    while ($o = db_fetch_object($q)) {
      node_delete($o->nid);
      $n++;
    }
    drupal_set_message(t('Deleted @n family_group nodes.', array('@n' => $n)));
    
	$q = db_query("select nid from {node} where type = 'family_location'");
    $n = 0;
    while ($o = db_fetch_object($q)) {
      node_delete($o->nid);
      $n++;
    }
    drupal_set_message(t('Deleted @n family_location nodes.', array('@n' => $n))); 
  }

  $rmin = $form_values['start']? $form_values['start']:0;
  $rcount = $form_values['nrecords']? $form_values['nrecords']:99999999;
  $rmax = $rmin + $rcount - 1;
  $rnum = 0;
  $rprocessed = 0;
  $lprocessed = 0;
  
  $lnum = 0;
  $gedcom_hier=array();   // References to GEDCOM parents on each level
  
  //declare variables for evaluation
  $current0record = NULL;
  
  while (!feof ($fp))
  {
    $gedline = fgets( $fp, 1024 );
    $lnum++;
    
    if (preg_match("/^\s*(\d+)\s*(?:@([^@]+)@)?\s*(\S+)\s*(.*\S)?\s*$/i", $gedline , $matches)) {
      $level=$matches[1];
      if ($level == 0) ++$rnum;
      if ($rnum < $rmin) continue;
      if ($rnum > $rmax) break;
      if ($level == 0) ++$rprocessed;
      ++$lprocessed;
      $xref=$matches[2];
      $fact_code=$matches[3];
      $value=$matches[4];
      $gedcom_source=$gedline;
      $parent=$gedcom_hier[$level-1];

	  //new ged file evaluation code
	  //next line is debug
	  //echo $level . "<br>";
	  switch($level){
	    case '0':
		  //next line is debug
		  //echo $current0record . "<br>";
		  switch($current0record){
		    case 'INDI':
			  //create title_format variable that has not yet been set.
			  $title_format = $firstname . " " . $middlename . " " . $lastname; //Will change with the implementation of tokens.
			  //next line is debug
		      //echo $title_format . "<br>";
			  //Create family_individual node
			  unset($node);
              $node->type = 'family_individual';
              $node->uid = $user->uid;
              $node->title = $title_format;
              $node->status = 1;
              $node->moderate = 0;
              $node->comment = 2;
              $node->revision = 0;
              node_validate($node, $error);
              if (!node_access("create", $node)) {
                $error['access'] = message_access();
              }
              if ($error) {
   	            drupal_set_message(
                  t('Error at line @lnum of GED (@line): @error.', 
				    array('@lnum' => $lnum, '@line' => $gedline, '@error' => print_r($error,true))
                  )
				);
              }
              else {
                node_save($node);
				$vid=$node->vid;
                $nid=$node->nid;
              }
              unset($node);
			  //Insert relationship variables into temporary database
			  db_query_temporary("INSERT INTO {family_relations_temp} (nid, famc_xref, fams_xref) VALUES (%d, '%s', '%s')", $nid, $famc_xref, $fams_xref);
			  //next line is debug
		      //echo "DB INSERT" . " " . $vid . " " . $nid . " " . $title_format . " " . $firstname . " " . $middlename . " " . $lastname . " " . $gender . " " . $birthdate . " " . $birthplace . " " . $deathdate . " " . $deathplace . " " . $children_num;
			  //insert data into family_individual table
			  db_query("INSERT INTO {family_individual} (vid, nid, title_format, firstname, middlename, lastname, gender, birthdate, birthplace, deathdate, deathplace, children_num) VALUES (%d, %d, '%s',  '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', %d)", $vid, $nid, $title_format, $firstname, $middlename, $lastname, $gender, $birthdate, $birthplace, $deathdate, $deathplace, $children_num);
			  //unset all INDI variables
			  unset ($current0record);
			  unset ($famc_xref);
			  unset ($fams_xref);
			  unset ($vid);
			  unset ($nid);
			  unset ($title_format);
			  unset ($firstname);
			  unset ($middlename);
			  unset ($lastname);
			  unset ($gender);
			  unset ($birthdate);
			  unset ($birthplace);
			  unset ($deathdate);
			  unset ($deathplace);
			  unset ($children_num);
			break;
			case 'FAM':
			  //Find the group surname shared by the children of the group. - This may be incorrect in cases where the name has not been passed to the children.
			  $result = db_query_temporary("SELECT nid FROM {family_relations_temp} WHERE famc_xref = '%s'", $current0xref);
			  $child_ref_nid = db_result($result); //db_result() returns the single field instead of a row object
			  $result = db_query("SELECT lastname FROM {family_individual} WHERE nid = %d", $child_ref_nid);
			  $group_surname = db_result($result);
			  //Find Parents of group
			  $result = db_query_temporary("SELECT nid FROM {family_relations_temp} WHERE fams_xref = '%s'", $current0xref);
			  while ($parent_row = db_fetch_object($result)) {
			    $parent_nid = $parent_row->nid;
			    $result2 = db_query("SELECT lastname FROM {family_individual} WHERE nid = %d", $parent_nid);
				$parent_surname = db_result($result2);
			    //compare the surnames (== rather than =, which would assign)
			    if ($parent_surname == $group_surname){
				  $parent1 = $parent_nid;
				} else {
				  $parent2 = $parent_nid;
				}
			  }
			  //Set title format
			  $result = db_query("SELECT firstname FROM {family_individual} WHERE nid = %d", $parent1);
			  $parent1_firstname = db_result($result);
			  $result = db_query("SELECT firstname FROM {family_individual} WHERE nid = %d", $parent2);
			  $parent2_firstname = db_result($result);
			  $title_format = $parent1_firstname . " and " . $parent2_firstname . " " . $group_surname; //Will change with the implementation of tokens.
			  //create group node
			  unset($node);
              $node->type = 'family_group';
              $node->uid = $user->uid;
              $node->title = $title_format;
              $node->status = 1;
              $node->moderate = 0;
              $node->comment = 2;
              $node->revision = 0;
              node_validate($node, $error);
              if (!node_access("create", $node)) {
                $error['access'] = message_access();
              }
              if ($error) {
   	            drupal_set_message(
                  t('Error at line @lnum of GED (@line): @error.', 
				    array('@lnum' => $lnum, '@line' => $gedline, '@error' => print_r($error,true))
                  )
				);
              }
              else {
                node_save($node);
				$vid=$node->vid;
                $nid=$node->nid;
              }
              unset($node);
			  //insert ancestor group value into INDI nodes related to this group
			  //db_query("UPDATE {family_individual} SET ancestor_group='%d' WHERE lastname='%s'", $nid, $group_surname);
			  //insert variables into family_group table
			  db_query("INSERT INTO {family_group} (vid, nid, title_format, marr_type, marr_date, marr_plac, div_date, div_plac, parent1, parent2) VALUES (%d, %d, '%s', '%s', '%s', '%s', '%s', '%s', '%d', '%d')", $vid, $nid, $title_format, $marr_type, $marr_date, $marr_plac, $div_date, $div_plac, $parent1, $parent2);
			  //unset all FAM variables
			  unset ($current0record);
			  unset ($current0xref);
			  unset ($child_ref_nid);
			  unset ($group_surname);
			  unset ($parent_surname);
			  unset ($parent1_firstname);
			  unset ($parent2_firstname);
			  unset ($vid);
			  unset ($nid);
			  unset ($title_format);
			  unset ($marr_type);
			  unset ($marr_date);
			  unset ($marr_plac);
			  unset ($div_date);
			  unset ($div_plac);
			  unset ($parent1);
			  unset ($parent2);
			break;
		  }
		  $current0record = NULL;
		  //next line is debug
		  //echo $fact_code . "<br>";
		  switch($fact_code){
		    case 'FAM':
			case 'INDI':
			  $current0record = $fact_code;
			  $current0xref = $xref;
			break;
		  }  
		break;
		case '1':
		  //next line is debug
		  //echo $current0record . "<br>";
		  switch($current0record){
		    case 'INDI':
			  //next line is debug
		      //echo $fact_code . "<br>";
			  switch($fact_code){
			    case 'SEX':
				  $gender = $value;
				  $current1record = $fact_code;
				break;
				case 'NCHI':
				  $children_num = $value;
				  $current1record = $fact_code;
				break;
				case 'NAME':
				  //split name value by / to separate surname
				  $splitName1 = explode("/", $value);
				  $lastname = $splitName1[1];
				  // split name by spaces
				  $splitName2 = explode(" ", $splitName1[0]);
				  // take the first name to be firstname
				  $firstname = $splitName2[0];
				  // add all the other names together in a string
				  $middlename = $splitName2[1] . " " . $splitName2[2] . " " . $splitName2[3] . " " . $splitName2[4] . " " . $splitName2[5] . " " . $splitName2[6] . " " . $splitName2[7];
				  $current1record = $fact_code;
				break;
				case 'DEAT':
				case 'BIRT':
				  $current1record = $fact_code;
				break;
				case 'FAMS':
				  $fams_xref = $value;
				break;
				case 'FAMC':
				  $famc_xref = $value;
				break;
			  }
			break;
			case 'FAM':
			  //next line is debug
		      //echo $fact_code . "<br>";
			  switch($fact_code){
			    case 'MARR':
				case 'DIV':
				  $current1record = $fact_code;
				break;
			  }
			break;
		  }
		break;
		case '2':
		  //next line is debug
		  //echo $current1record . "<br>";
		  switch($current1record){
		    case 'BIRT':
			  //next line is debug
		      //echo $fact_code . "<br>";
			  switch($fact_code){
			    case 'DATE':
				  $birthdate = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $birthplace = $value;
				break;
			  }
			break;
			case 'DEAT':
			  //next line is debug
		      //echo $fact_code . "<br>";
			  switch($fact_code){
			    case 'DATE':
				  $deathdate = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $deathplace = $value;
				break;
			  }
			break;
			case 'MARR':
			  //next line is debug
		      //echo $fact_code . "<br>";
			  switch($fact_code){
			    case 'TYPE':
				  $marr_type = $value;
				break;
				case 'DATE':
				  $marr_date = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $marr_plac = $value;
				break;
			  }
			break;
			case 'DIV':
			  //next line is debug
		      //echo $fact_code . "<br>";
			  switch($fact_code){
			    case 'DATE':
				  $div_date = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $div_plac = $value;
				break;
			  }
			break;
		  }
		break;
	  }
	}
  }
  fclose ($fp);
  
  drupal_set_message(t('Processed @r records (@n lines) of GED.', array('@r' => $rprocessed, '@n' => $lprocessed)));
  if ($rnum > $rmax) drupal_set_message(t('Next start record: @r.', array('@r' => $rmax + 1)));
  else drupal_set_message(t('No more records to process'));

  return 'family';
}
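
The code above calls family_changeDateFormat(), whose body is not shown in this thread. Purely as an illustration of the kind of conversion involved (the warning above shows a birthdate stored as '2007-02-07'), here is a hypothetical stand-in. gedcom_date_to_iso() is an invented name, the real function may behave differently, and this sketch only accepts exact dates written as day, three-letter month and four-digit year.

```php
<?php
// Sketch only: gedcom_date_to_iso() is a hypothetical stand-in, not the module's
// family_changeDateFormat(), which is not shown here.
function gedcom_date_to_iso($value) {
  $months = array(
    'JAN' => 1, 'FEB' => 2, 'MAR' => 3, 'APR' => 4, 'MAY' => 5, 'JUN' => 6,
    'JUL' => 7, 'AUG' => 8, 'SEP' => 9, 'OCT' => 10, 'NOV' => 11, 'DEC' => 12,
  );
  // Accept only exact dates of the form "7 FEB 2007".
  if (!preg_match('/^\s*(\d{1,2})\s+([A-Za-z]{3})\s+(\d{4})\s*$/', $value, $m)) {
    return '';
  }
  $mon = strtoupper($m[2]);
  // Reject unknown month names and impossible days such as 31 FEB.
  if (!isset($months[$mon]) || !checkdate($months[$mon], (int) $m[1], (int) $m[3])) {
    return '';
  }
  return sprintf('%04d-%02d-%02d', $m[3], $months[$mon], $m[1]);
}

$iso = gedcom_date_to_iso('7 FEB 2007');
$approx = gedcom_date_to_iso('ABT 1850');
$bad = gedcom_date_to_iso('31 FEB 2007');
```

Approximate or partial dates (ABT, BEF, year only) come back as an empty string here; how the module should store those is a separate decision.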

Comment #22

Microbe commented 9 October 2008 at 09:54

Status	File	Size
new	import.inc_.txt	14.08 KB

I have found that Drupal saves node variables using node_save(), so they don't have to be inserted afterwards (see import.inc). This still doesn't seem to work for family group nodes (no idea why). I also keep getting errors about temporary tables; I'm not sure what these do yet, so maybe you could look at that.

# user warning: Table 'retep992_drupal.test5_family_relations_temp' doesn't exist query: INSERT INTO test5_family_relations_temp (nid, famc_xref, fams_xref) VALUES (841, '', '') in /home/retep992/public_html/webdesign/includes/database.mysql.inc on line 174.
# user warning: Table 'retep992_drupal.test5_family_relations_temp' doesn't exist query: CREATE TEMPORARY TABLE F1 SELECT nid FROM test5_family_relations_temp WHERE famc_xref = '' in /home/retep992/public_html/webdesign/includes/database.mysql.inc on line 174.
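
Judging by the second warning, db_query_temporary() appears to wrap its query in CREATE TEMPORARY TABLE <name> SELECT ..., taking the last argument as the table name (hence the F1), so it cannot be used to INSERT into a table that was never created. One possible alternative, sketched below, is to keep the relations in a plain PHP array for the duration of the import. This is a swapped-in technique, not what the module does: the helper names are hypothetical, nid 1719 is made-up sample data, and the array would not survive between separate import sessions when a large file is split by record range.

```php
<?php
// Sketch only: these helpers are hypothetical and not part of the family module.
// $relations maps a node id to the family xrefs read from its FAMC/FAMS lines.
function relations_add(&$relations, $nid, $famc_xref, $fams_xref) {
  $relations[$nid] = array('famc' => $famc_xref, 'fams' => $fams_xref);
}

// Return the node ids whose $kind ('famc' or 'fams') pointer equals $xref.
function relations_find($relations, $kind, $xref) {
  $nids = array();
  foreach ($relations as $nid => $rel) {
    if ($rel[$kind] === $xref) {
      $nids[] = $nid;
    }
  }
  return $nids;
}

$relations = array();
relations_add($relations, 1718, '@F2@', '');   // child of family F2
relations_add($relations, 1719, '', '@F2@');   // spouse in family F2 (sample data)
$children = relations_find($relations, 'famc', '@F2@');
$parents = relations_find($relations, 'fams', '@F2@');
```

If the import has to span several sessions, a regular table created once at install time would be the safer choice.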

Comment #23

Microbe commented 9 October 2008 at 16:24

Made a couple of changes to your import script. It should now work fine. :)

// $Id: import.inc,v 1.5.4.5.4.12 2008/10/08 04:36:28 pyutaros Exp $
require_once "includes/common.inc";
// import.inc
// Functions for importing GEDCOM files to database
// Using the data base defined in simple.mysql
// This may also be used as a temporary storage before making more processing
// for import to other database format.

ini_set('auto_detect_line_endings', true);

//Generate a form for uploading a GEDCOM file
function family_import() {
  return drupal_get_form('family_import_form');
}

function family_import_form() {
  $form['#attributes'] = array('enctype' => "multipart/form-data");
  $form['gedcom_file'] = array(
    '#type' => 'file',
    '#title' => t('GED file to upload'),
    '#size' => 40,
  );
  //$form['merge'] = array(
  //  '#type' => 'radios',
  //  '#title' => t('Merge options'),
  //  '#options' => array(t('replace existing data'), t('augment current data'), t('merge individuals by name')),
  //  '#default_value' => variable_get('family_import_replace', 1),
  //);

  $form['range'] = array(
    '#type' => 'fieldset',
    '#title' => t('Import range'),
    '#description' => t('Select a range of records (lines starting with 0) to import.  This allows breaking very large files into multiple import sessions.'),
  );
  $form['range']['start'] = array(
    '#type' => 'textfield',
    '#title' => t('First record to import'),
    '#size' => 10,
    '#maxlength' => 10,
    '#description' => t('Enter the number of the first record in the GEDCOM file to include in this import session'),
  );
  $form['range']['nrecords'] = array(
    '#type' => 'textfield',
    '#title' => t('Number of records to import'),
    '#size' => 10,
    '#maxlength' => 10,
    '#description' => t('Enter the number of records to process in this import session'),
  );
  $form['replace'] = array(
    '#type' => 'checkbox',
    '#title' => t('Replace existing GED data'),
    '#default_value' => variable_get('family_import_replace', 1),
  );
  $form['submit'] = array('#type' => 'submit', '#value' => t('Start Import'));
  return $form;
}

// Check the uploaded GEDCOM file
function family_import_form_validate($form_id, $form_values) {
  $file = file_check_upload('gedcom_file');
  if (!$file) {
    form_set_error('',t("Didn't get GED file"));
    return; // nothing to open
  }
  
  $fp = fopen($file->filepath , "r" );
  if (!$fp) {
    form_set_error('',t("Couldn't open GED file"));
    return; // don't fclose() a failed handle
  }
  fclose($fp);
}

// Parse the uploaded GEDCOM file
function family_import_form_submit($form_id, $form_values) {
  global $user; // $node->uid below reads the current user
  $file = file_check_upload('gedcom_file');
  if (!$file) {
    form_set_error('',t("Didn't get GED file"));
    return;
  }
  
  $fp = fopen($file->filepath , "r" );
  if (!$fp) {
    form_set_error('',t("Couldn't open GED file"));
    return;
  }

  
  // Create the table that maps imported nodes to their FAMC/FAMS references, unless it already exists
  db_query("CREATE TABLE IF NOT EXISTS {family_relations_temp} (`nid` VARCHAR( 128 ) NOT NULL ,`famc_xref` VARCHAR( 128 ) NOT NULL ,`fams_xref` VARCHAR( 128 ) NOT NULL) ENGINE = MYISAM");

  //
  // Empty current content. This is useful for debugging, but more caution should be
  // done before deleting database in the working version
  //

  if ($form_values['replace'])
  {
    db_query("TRUNCATE {family_individual}");
    db_query("TRUNCATE {family_group}");
    db_query("TRUNCATE {family_location}");
    db_query("TRUNCATE {family_variable}");

    
    $q = db_query("select nid from {node} where type = 'family_individual'");
    $n = 0;
    while ($o = db_fetch_object($q)) {
      node_delete($o->nid);
      $n++;
    }
    drupal_set_message(t('Deleted @n family_individual nodes.', array('@n' => $n)));
    
	$q = db_query("select nid from {node} where type = 'family_group'");
    $n = 0;
    while ($o = db_fetch_object($q)) {
      node_delete($o->nid);
      $n++;
    }
    drupal_set_message(t('Deleted @n family_group nodes.', array('@n' => $n)));
    
	$q = db_query("select nid from {node} where type = 'family_location'");
    $n = 0;
    while ($o = db_fetch_object($q)) {
      node_delete($o->nid);
      $n++;
    }
    drupal_set_message(t('Deleted @n family_location nodes.', array('@n' => $n))); 
  }

  $rmin = $form_values['start']? $form_values['start']:0;
  $rcount = $form_values['nrecords']? $form_values['nrecords']:99999999;
  $rmax = $rmin + $rcount - 1;
  $rnum = 0;
  $rprocessed = 0;
  $lprocessed = 0;
  
  $lnum = 0;
  $gedcom_hier=array();   // References to GEDCOM parents on each level
  
  //declare variables for evaluation
  $current0record = NULL;
  
  while (!feof ($fp))
  {
    $gedline = fgets( $fp, 1024 );
    $lnum++;
    
    if (preg_match("/^\s*(\d+)\s*(?:@([^@]+)@)?\s*(\S+)\s*(.*\S)?\s*$/i", $gedline , $matches)) {
      $level=$matches[1];
      if ($level == 0) ++$rnum;
      if ($rnum < $rmin) continue;
      if ($rnum > $rmax) break;
      if ($level == 0) ++$rprocessed;
      ++$lprocessed;
      $xref=$matches[2];
      $fact_code=$matches[3];
      $value=$matches[4];
      $gedcom_source=$gedline;
      $parent=$gedcom_hier[$level-1];

	  //new ged file evaluation code
	  //next line is debug
	  //echo $level . "<br>";
	  switch($level){
	    case '0':
		  //next line is debug
		  //echo $current0record . "<br>";
		  switch($current0record){
		    case 'INDI':
			  //create title_format variable that has not yet been set.
			  $title_format = $firstname . " " . $middlename . " " . $lastname; //Will change with the implementation of tokens.
			  //next line is debug
		      //echo $title_format . "<br>";
			  //Create family_individual node
              unset($node);
              $node = new stdClass();
              $node->type = 'family_individual';
              $node->uid = $user->uid;
              $node->title = $title_format;
              $node->status = 1;
              $node->moderate = 0;
              $node->comment = 2;
              $node->revision = 0;
              node_validate($node, $error);
              if (!node_access("create", $node)) {
                $error['access'] = message_access();
              }
              if ($error) {
   	            drupal_set_message(
                  t('Error at line @lnum of GED (@line): @error.', 
				    array('@lnum' => $lnum, '@line' => $gedline, '@error' => print_r($error,true))
                  )
				);
              }
              else {
                $node->title=$title_format;
                $node->FORE=$firstname;
                $node->MIDN=$middlename;
                $node->SURN=$lastname;
                $node->SEX=$gender;
                $node->BIRT_DATE=$birthdate;
                $node->BIRT_PLAC=$birthplace;
                $node->DEAT_DATE=$deathdate;
                $node->DEAT_PLAC=$deathplace;
                node_save($node);
                $nid=$node->nid;
                //Insert relationship variables into temporary database
                db_query("INSERT INTO {family_relations_temp} (nid, famc_xref, fams_xref) VALUES (%d, '%s', '%s')", $nid, $famc_xref, $fams_xref);
              }
              unset($node);
			  //unset all INDI variables
			  unset ($famc_xref);
			  unset ($fams_xref);
			  unset ($vid);
			  unset ($nid);
			  unset ($title_format);
			  unset ($firstname);
			  unset ($middlename);
			  unset ($lastname);
			  unset ($gender);
			  unset ($birthdate);
			  unset ($birthplace);
			  unset ($deathdate);
			  unset ($deathplace);
			  unset ($children_num);
			break;
			case 'FAM':
			  //Find the group surname shared by the children of the group. - This may be incorrect in cases where the name has not been passed to the children.
                    $current0xref='@'.$current0xref.'@';
			  $result = db_query("SELECT r.lastname FROM {family_individual} r INNER JOIN {family_relations_temp} t ON r.nid=t.nid WHERE t.famc_xref = '%s'", $current0xref);
			  $group_surname = db_fetch_object($result);
                    //Debug Line Below
                    //drupal_set_message(t('FAM XREF @n', array('@n' => $current0xref))); 

			  //Find Parents of group
			  $result = db_query("SELECT nid FROM {family_relations_temp} WHERE fams_xref = '%s'", $current0xref);
                    while ($parent = db_fetch_array($result)) {
                      //Debug Line Below
                      //drupal_set_message(t('PARENT NID @n', array('@n' => $parent['nid'])));
			    $parents[] = $parent['nid'];
                    }
                    $parent1 = $parents[0];
                    $parent2 = $parents[1];

			  //create group node
              unset($node);
              $node = new stdClass();
              $node->type = 'family_group';
              $node->uid = $user->uid;
              $node->title = $title_format;
              $node->status = 1;
              $node->moderate = 0;
              $node->comment = 2;
              $node->revision = 0;
              node_validate($node, $error);
              if (!node_access("create", $node)) {
                $error['access'] = message_access();
              }
              if ($error) {
   	            drupal_set_message(
                  t('Error at line @lnum of GED (@line): @error.', 
				    array('@lnum' => $lnum, '@line' => $gedline, '@error' => print_r($error,true))
                  )
				);
              }
              else {
                $node->title=$title_format;
                $node->MARR_TYPE=$marr_type;
                $node->MARR_DATE=$marr_date;
                $node->MARR_PLAC=$marr_plac;
                $node->DIV_DATE=$div_date;
                $node->DIV_PLAC=$div_plac;
                $node->PAR1=$parent1;
                $node->PAR2=$parent2;
                node_save($node);
                $nid=$node->nid;
		    //insert ancestor group value into INDI nodes related to this group
		    //db_query("UPDATE {family_individual} SET ancestor_group='%d' WHERE lastname='%s'", $nid, $group_surname);
		    //insert variables into family_group table
		    //db_query("INSERT INTO {family_group} (vid, nid, title_format, marr_type, marr_date, marr_plac, div_date, div_plac, parent1, parent2) VALUES (%d, %d, '%s', '%s', '%s', '%s', '%s', '%s', '%d', '%d')", $vid, $nid, $title_format, $marr_type, $marr_date, $marr_plac, $div_date, $div_plac, $parent1, $parent2);
                //point every child of this group (matched by FAMC xref) at the new group node
	          $result = db_query("SELECT nid FROM {family_relations_temp} WHERE famc_xref = '%s'", $current0xref);
                while ($child = db_fetch_array($result)) {
                  //Debug Line Below
                  //drupal_set_message(t('CHILD NID @n', array('@n' => $child['nid'])));
			$childnid = $child['nid'];
                  db_query("UPDATE {family_individual} SET ancestor_group = %d WHERE nid =%d", $nid, $childnid);
                }
              }
       	        unset($node);
			  unset ($current0xref);
			  unset ($child_ref_nid);
			  unset ($group_surname);
			  unset ($parent_surname);
			  unset ($parent1_firstname);
			  unset ($parent2_firstname);
			  unset ($vid);
			  unset ($nid);
			  unset ($title_format);
			  unset ($marr_type);
			  unset ($marr_date);
			  unset ($marr_plac);
			  unset ($div_date);
			  unset ($div_plac);
			  unset ($parent1);
			  unset ($parent2);
                    unset ($parents);
	  		  break;
		  }
		  $current0record = NULL;
		  //next line is debug
		  //echo $fact_code . "<br>";
		  switch($fact_code){
		    case 'FAM':
			case 'INDI':
			  $current0record = $fact_code;
			  $current0xref = $xref;
			break;
		  }  
		break;
		case '1':
		  //next line is debug
		  //echo $current0record . "<br>";
		  switch($current0record){
		    case 'INDI':
			  //next line is debug
		      //echo $fact_code . "<br>";
			  switch($fact_code){
			    case 'SEX':
				  $gender = $value;
				  $current1record = $fact_code;
				break;
				case 'NCHI':
				  $children_num = $value;
				  $current1record = $fact_code;
				break;
				case 'NAME':
				  //split name value by / to separate surname
				  $splitName1 = explode("/", $value);
				  $lastname = $splitName1[1];
				  // split name by spaces
				  $splitName2 = explode(" ", $splitName1[0]);
				  // take the first name to be firstname
				  $firstname = $splitName2[0];
				  // add all the other names together in a string
				  $middlename = trim(implode(" ", array_slice($splitName2, 1)));
				  $current1record = $fact_code;
				break;
				case 'DEAT':
				case 'BIRT':
				  $current1record = $fact_code;
				break;
				case 'FAMS':
				  $fams_xref = $value;
				break;
				case 'FAMC':
				  $famc_xref = $value;
				break;
			  }
			break;
			case 'FAM':
			  //next line is debug
		      //echo $fact_code . "<br>";
			  switch($fact_code){
			    case 'MARR':
				case 'DIV':
				  $current1record = $fact_code;
				break;
			  }
			break;
		  }
		break;
		case '2':
		  //next line is debug
		  //echo $current1record . "<br>";
		  switch($current1record){
		    case 'BIRT':
			  //next line is debug
		      //echo $fact_code . "<br>";
			  switch($fact_code){
			    case 'DATE':
				  $birthdate = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $birthplace = $value;
				break;
			  }
			break;
			case 'DEAT':
			  //next line is debug
		      //echo $fact_code . "<br>";
			  switch($fact_code){
			    case 'DATE':
				  $deathdate = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $deathplace = $value;
				break;
			  }
			break;
			case 'MARR':
			  //next line is debug
		      //echo $fact_code . "<br>";
			  switch($fact_code){
			    case 'TYPE':
				  $marr_type = $value;
				break;
				case 'DATE':
				  $marr_date = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $marr_plac = $value;
				break;
			  }
			break;
			case 'DIV':
			  //next line is debug
		      //echo $fact_code . "<br>";
			  switch($fact_code){
			    case 'DATE':
				  $div_date = family_changeDateFormat($value);
				break;
				case 'PLAC':
				  $div_plac = $value;
				break;
			  }
			break;
		  }
		break;
	  }
	}
  }

  fclose ($fp);
  db_query("DROP TABLE {family_relations_temp}");  
  drupal_set_message(t('Processed @r records (@n lines) of GED.', array('@r' => $rprocessed, '@n' => $lprocessed)));
  if ($rnum > $rmax) drupal_set_message(t('Next start record: @r.', array('@r' => $rmax + 1)));
  else drupal_set_message(t('No more records to process'));

  return 'family';
}
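
For anyone following the parsing logic, here is a minimal standalone sketch of the two steps the loop above relies on: splitting a GEDCOM line into level, xref, tag and value with the same regular expression, and splitting a NAME value into first, middle and last name. The function names example_parse_gedcom_line() and example_split_gedcom_name() are hypothetical and are not part of the module. The sketch uses only plain PHP, with no Drupal calls.

```php
<?php
// Hypothetical helpers for illustration only; not part of Family Tree.

/**
 * Split one GEDCOM line into its parts.
 * Returns NULL when the line does not start with a level number.
 */
function example_parse_gedcom_line($gedline) {
  // Same pattern as in the import loop: level, optional @xref@, tag, optional value.
  $pattern = "/^\s*(\d+)\s*(?:@([^@]+)@)?\s*(\S+)\s*(.*\S)?\s*$/i";
  if (!preg_match($pattern, $gedline, $matches)) {
    return NULL;
  }
  return array(
    'level' => (int) $matches[1],
    'xref'  => $matches[2],   // empty string when the line has no @xref@
    'tag'   => $matches[3],
    // The value group is optional, so it may be missing from $matches.
    'value' => isset($matches[4]) ? $matches[4] : '',
  );
}

/**
 * Split a GEDCOM NAME value such as "John Paul /Smith/".
 * The surname sits between the slashes; the first given name is taken
 * as the first name and any remaining given names as the middle name.
 */
function example_split_gedcom_name($value) {
  $parts = explode('/', $value);
  $lastname = isset($parts[1]) ? trim($parts[1]) : '';
  // Split the given names on whitespace and drop empty pieces.
  $given = preg_split('/\s+/', trim($parts[0]), -1, PREG_SPLIT_NO_EMPTY);
  $firstname = count($given) ? $given[0] : '';
  $middlename = implode(' ', array_slice($given, 1));
  return array(
    'firstname'  => $firstname,
    'middlename' => $middlename,
    'lastname'   => $lastname,
  );
}

$line = example_parse_gedcom_line("1 NAME John Paul /Smith/\n");
$name = example_split_gedcom_name($line['value']);
print $name['firstname'] . '|' . $name['middlename'] . '|' . $name['lastname'] . "\n";
```

Note that the xref captured on a level 0 line comes back without the @ signs, while FAMC and FAMS values keep them. That is why the FAM case above wraps $current0xref in @ before querying the temp table.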

Comment #24

pyutaros commented 9 October 2008 at 23:17

Microbe, thank you very much for the update. I can see where I went wrong with node_save and the temp DB. I have updated CVS with ver 5.x-3.3. I have also tested it and it works beautifully. Thanks again. I will update the 6.x branch with the import file you submitted over there as well.
Jonathan

Comment #25

pyutaros commented 16 October 2008 at 03:22

Status:	Active	» Fixed

No complaints. I am marking as fixed. I should also add that 6.x is now the official "New Features" version of Family Tree 2. The only feature that will be added to 5.x (barring community backports), will be the pending export feature. 6.x will remain the official branch for new features until the 7.x code freeze, at which point we will begin developing the 7.x version.

Comment #26

Anonymous (not verified) commented 30 October 2008 at 03:41