Row highwater field is checked unprepared [#1538046]

Here's a simplified scenario of my case:

Source: CSV. Sample row: "Peter", "Pan", 1330855930
Destination: Node entity

The source column definitions:

array(
  0 => array('firstname', 'First name'),
  1 => array('lastname', 'Last name'),
  2 => array('updated', 'Last updated'),
);

Added this highwater mark:

$this->highwaterField = array(
  'name' => 'updated',
  'type' => 'int',
);

Now I expect that, on the second import, Migrate will skip all records that were previously imported and have not changed. I didn't touch the CSV file between the imports. I can see that {migrate_status}.highwater is filled with the greatest "updated" value from the source.

I used both:

$ drush migrate-import MyMigrate
$ drush migrate-import MyMigrate --update

Both import everything again. My intention is to keep my Drupal site in sync with the CSV, so this is not a one-time migration. Performance is therefore critical: I need only fresh data (new and updated records) to come in.

Comment	File	Size	Author
#4	migrate-highwater_prepare-1538046-4.patch	2.56 KB	mikeryan
#2	migrate-unprepared-highwater-field-1538046-2.patch	1.71 KB	claudiu.cristea

Comments

Comment #1

claudiu.cristea

Romanian

Arad 🇷🇴

commented 18 April 2012 at 12:22

It seems that Migrate uses the unprepared timestamp value when deciding whether to process the row. In my case I implemented a prepareRow() method to transform the CSV millisecond timestamps into regular UNIX timestamps.

  public function prepareRow($row) {
    // Convert milliseconds timestamp to UNIX timestamp.
    if (isset($row->updated)) {
      $row->updated = floor($row->updated / 1000);
    }
  }

In includes/source.inc we have:

      // 5. So, we are using highwater marks. Take the row if its highwater field
      //    value is greater than the saved marked, otherwise skip it.
      elseif ($row->{$this->highwaterField['name']} > $this->activeMigration->getHighwater()) {
        // Fall through
      }

So...

$row->{$this->highwaterField['name']} will be in milliseconds (unprepared).
$this->activeMigration->getHighwater() will be in seconds, because it was prepared before being stored.

I think the above elseif statement must use the prepared value of $row->{$this->highwaterField['name']}.
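
To make the mismatch concrete, here is a minimal standalone sketch in plain PHP. It is not Migrate code: the sample millisecond value and all variable names are assumptions chosen for illustration, and it only mirrors the comparison quoted above.

```php
<?php
// Hypothetical raw CSV value in milliseconds (assumed for illustration).
$raw_updated = 1330855930000;

// What a prepareRow() like the one above would produce, and what then
// ends up stored as the highwater mark after the first import.
$prepared_updated = (int) floor($raw_updated / 1000);
$stored_highwater = $prepared_updated;

// Comparing the unprepared value: milliseconds are always greater than
// the stored seconds, so the unchanged row is taken again.
$taken_unprepared = $raw_updated > $stored_highwater;

// Comparing the prepared value: not greater, so the unchanged row is skipped.
$taken_prepared = $prepared_updated > $stored_highwater;

var_dump($taken_unprepared, $taken_prepared);
```

With these assumed values the unprepared comparison takes the row and the prepared comparison skips it, which matches the behaviour described in this comment.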

Comment #2

claudiu.cristea

commented 18 April 2012 at 13:22

Title:	Highwater & CSV not working	» Row highwater field is checked unprepared
Assigned:	Unassigned	» claudiu.cristea
Status:	Active	» Needs review

Status	File	Size
new	migrate-unprepared-highwater-field-1538046-2.patch	1.71 KB

Here's a patch. I admit that cloning the $row object is not so nice, but I didn't want to mess up all the logic there by preparing the row at an earlier stage. This works for me.

Comment #3

mikeryan

he/him

English

Pittsfield, MA, USA

commented 21 April 2012 at 16:08

Priority:	Major	» Normal
Status:	Needs review	» Needs work

No, there's got to be a better way. Doubling the calls to prepareRow() for the sake of an edge case is not acceptable. Better would be to find a way to insert a call to prepareRow() just before testing the highwater mark, and make sure it doesn't get called again below if it was called here...
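
One way to picture the suggestion above is the following minimal sketch. It is not the code from any patch in this issue: should_import(), process_row(), the $prepare callback and the "prepared" flag are hypothetical names, and the timestamps are assumed sample values.

```php
<?php
// Hypothetical stand-in for prepareRow() that also counts its calls.
$prepare_calls = 0;
$prepare = function ($row) use (&$prepare_calls) {
  $prepare_calls++;
  if (isset($row->updated)) {
    // Convert an assumed milliseconds timestamp to seconds.
    $row->updated = (int) floor($row->updated / 1000);
  }
};

// Hypothetical highwater check: prepare once, just before the test.
function should_import($row, $highwater, $prepare) {
  $prepare($row);
  $row->prepared = TRUE;
  return $row->updated > $highwater;
}

// Hypothetical later processing step: skip the second prepare call
// if the row was already prepared during the highwater check.
function process_row($row, $prepare) {
  if (empty($row->prepared)) {
    $prepare($row);
  }
  return $row;
}

$row = (object) array('updated' => 1330855931000);
$imported = FALSE;
if (should_import($row, 1330855930, $prepare)) {
  process_row($row, $prepare);
  $imported = TRUE;
}
```

In this sketch the flag on the row is what guarantees a single prepare call per row; a real implementation could track that state differently.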

Comment #4

mikeryan

commented 23 April 2012 at 22:41

Status:	Needs work	» Needs review

Status	File	Size
new	migrate-highwater_prepare-1538046-4.patch	2.56 KB

The attached patch is what I was talking about - does it address the issue for you?

Thanks.

Comment #5

claudiu.cristea

commented 26 April 2012 at 14:31

Status:	Needs review	» Reviewed & tested by the community

Works as expected with #4. I had to apply it manually, since the branch is ahead.

Thank you!

Comment #6

mikeryan

commented 26 April 2012 at 19:51

Assigned:	claudiu.cristea	» mikeryan
Status:	Reviewed & tested by the community	» Fixed

Committed for D6 and D7, thanks!

Comment #7

claudiu.cristea

commented 27 April 2012 at 11:29

Status:	Fixed	» Needs work

Checked twice... Now existing, untouched records are updated on every import :(

I tested and found that this condition is satisfied every time:

      // 2. If the row is not in the map (we have never tried to import it before),
      //    we always want to try it.
      elseif (!isset($row->migrate_map_sourceid1)) {
        // Fall through
      }

This means that the existing rows don't have the migrate_map_sourceid1 property set at this point. I'm not sure whether this comes from your patch.

Comment #8

mikeryan

commented 28 April 2012 at 15:24

Status:	Needs work	» Fixed

The line comes from #1529362: Migrate not respecting existing map statuses. I don't see why a previously-imported record would not have migrate_map_sourceid1 set...

Comment #9

12 May 2012 at 15:30

Status:	Fixed	» Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.
