I've been doing more data imports recently, and core makes it painful to import when you want to control the uid and nid of the new records. Core automatically assumes that if you pass a nid or a uid to the save() function, then you performing update. The attached patch removes that assumption so that migation scripts can use node_save() and user_save(). All existing forms and scripts are unaffected.

Migration scripts get the new behavior by setting $node->is_new or $user->is_new before calling save(). This was formerly an internal property implemented just by node_save(). So I now expose that property and unify the logic between node_save() and user_save().

The change to drupal_write_record() is needed because currently that function refuses to accept a value for a serial field. See http://drupal.org/node/169982#comment-597094 for the justification. But we were too cautious there IMO. This is a special case and we should leave it up to the caller to handle the auto-increment if he performs special operations like these. Anyway, in my tests mysql actually increments the serial to one more than the nid/vid that was passed so this caution is not needed.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

stewsnooze’s picture

I think this is a good idea. I've reviewed this patch and it looks fine.

moshe weitzman’s picture

I tested this by pasting the following code into the Execute PHP block provided by devel.module. A small script would work just as well. The node was created just fine, with the specified nid.

$node->is_new = TRUE;
$node->nid=99999;
$node->title='foo';
$node->status=1;
$node->format = 1;
$node->vid=0;
$node->uid=1;
$node->body='bar';
$node->type='page';
node_save($node);

bjaspan’s picture

Unless I'm mistaken (and I haven't tested it), I think this will break on pgsql. pgsql serial columns are not really auto-increment; they are "use the next value from the attached sequence." If the sequence is at 10 and you insert 15, then 5 inserts later you will get a primary-key collision.

But I'd have to try it to be sure.

moshe weitzman’s picture

I mentioned earlier that this is for developers who know what they are doing. They are responsible for setting the pgsql sequences after the import.

stewsnooze’s picture

I think in most import scenarios developers will be using sequence ranging techniques where the Drupal sequences are set higher than any possible value from the import. Otherwise I don't see an easy way how they could possibly avoid primary key collisions. I know that's how I do my imports. This patch is important for enterprise users as well as people who are migrating from other systems but need to retain IDs in some way. You can control most things in Drupal, not being able to control the IDs seems wrong.

alex_b’s picture

I like the approach. This will help us out a lot with migrations where node or user ids matter.

#3 - I'm not worried about collisions within node_save or user_save, they can be handled on a higher level IMO (see #4). I think this behavior and the possibility of collisions should be documented though (in comments? node_save() doesn't have any parameter comments at all).

Ran a quick test. Creating a new user fails with some notices: http://skitch.com/alexbarth/i15q/users-localhost

The culprit seems to be where

if (is_object($account) && $account->uid) {

is substituted by

if (is_object($account) && $account->is_new) {
Dave Reid’s picture

Just some small code reviews for simplicity...
$node->is_new = empty($node->nid) ? TRUE : FALSE; should be $node->is_new = empty($node->nid);
$account->is_new = empty($account->uid) ? TRUE : FALSE; should be $account->is_new = empty($account->uid);

moshe weitzman’s picture

FileSize
5.05 KB

I have fixed the issues raised by Alex and Dave (thanks!). I also ran the node and user tests and identified a minor issue which is also fixed.

moshe weitzman’s picture

I forgot to mention that I also added Doxygen for node_save().

Here is a fragment for importing a new user with a specific uid.


$user = array(
  'uid' => 99999,
  'is_new' => TRUE,
  'status' => 1,
  'mail' => '99999@example.com',
  'name' => 'al',
  'pass' => 'secret',
);
user_save(NULL, $user);

Aron Novak’s picture

I tested the patch (on MySQL) and it seems to be fine.
#9 - the ease of import is nice

Aron Novak’s picture

Status: Needs review » Reviewed & tested by the community

forgot to change the status

moshe weitzman’s picture

FileSize
4.95 KB

Reroll to remove fuzz

webchick’s picture

Status: Reviewed & tested by the community » Needs work

Oooo. I like this patch very much. Some nits:

* Since you've done an awesome job unifying these two functions, we should also make sure their PHPDoc reads symmetrically as well. One says something about is_new being FALSE and the other says something about is_new being TRUE.
* Please include POC code in the patch in the form of a test. One "is_new" for $node and one for $user should suffice. In node.test and user.test, respectively.

If by chance someone sees this and is around tonight, this can make it into UNSTABLE-2. If not, then UNSTABLE-3 for sure. :)

snufkin’s picture

FileSize
8.03 KB

added two tests, for node and user module respectively to test if creating arbitrary content with is_new succeeds, and it seems so :)

also changed the documentation so it is worded the same for node_save and user_save when its talking about the new stuff being created.

webchick’s picture

Ok, since this is your first test I'm going to give you the Merciless Nitpicking Webchick Review (tm). ;) *cracks knuckles*

+class NodeImportTestCase extends DrupalWebTestCase {

a) Please add a line of PHPDoc above there that summarizes what the test case does.
b) Rather than name this NodeImportTestCase, how about NodeSaveTestCase. What you're calling "Import" here is actually just one facet of testing the node_save() function, and we could potentially expand this test case out to cover other uses as well.

      'name' => t('Import test'),
      'description' => t('Test node_save for importing content.'),

That means these lines need to be updated. Should be node_save() (with the parentheses, to signal that it's a function).

+    // create a user that is allowed to post, we'll use this to test the submission

a) All inline comments should start with a capital letter, and end with a period.
b) Replace the comma with a semi-colon.

+    $web_user = $this->drupalCreateUser(array('create article content'));
+    $this->drupalLogin($web_user);
+    $this->web_user = $web_user;

Normally, you only need to do this stuff in setUp() if it will be used in multiple testX functions, and this test case only has one. However, since we're planning on eventually using this test case for the rest of node_save(), this code is fine.

    $title = 'Import test title';

I would use $this->randomName() instead. That's more standard with what other parts of SimpleTest do. Check the function drupalCreateNode().

    $node->body  ='test body';

Same thing here, and also you have spacing issues (two spaces to the left of the = and none to the right)

    $node->type = 'article'; // notice if not provided

I'd take out those comments.

    $node->vid = 1; // notice if not provided

a) Here as well.
b) $node->vid gives a notice if not provided? That's not right. Sounds like a bug. Can you see if node_save() generates the notice without this patch? This strikes me as something we ought to fix as well in this patch if we're trying to make things easier for import scripts; hard-coding the vid is not an option in the 'real world'.
c) You shouldn't hard-code this to a number that might already exist in the database. I have my worries about $test_nid in that respect as well.

    // now lets test this import 

Should be "Now, let's test this import." But how about just "Test the import." :)

    $node_test = node_load(array('nid' => $test_nid));

a) $node_test is not very obvious. Let's call this $node_by_nid or something.
b) Can be shortened to node_load($test_nid);

    $this->assertTrue($node_test, t('Node load based on nid failed'));

While your assertion is fine by the traditional unit testing rules, you'll find that because of the way SimpleTest module displays messages, we tend to make the message reflective of what the test is testing, rather than the message it should display on failure. So change to something more neutral like "Node loaded based on node ID."

    $node_test2 = node_load(array('title' => $title));
    $this->assertTrue($node_test2, t('Node load based on title failed'));

See above. All of these comments also hold true for user.test as well.

      'name' => 'al',

$this->randomName(). Probably for password too.

    user_save(NULL, $user);

Aw. :( Sad panda. :( Let's think about a follow-up patch that would make this match node_save($node).

    $test_user = user_load(array('uid' => $test_uid));

a) Can be shortened to user_load($test_uid);
b) Again, $test_user not very clear.
c) Do we need to load by name as well?

snufkin’s picture

FileSize
8.84 KB

Thanks a lot for the thorough review! vid does not give a notice if omitted, i must have mistaken it with something.

Changes in this patch:

* added doc for the test cases and renamed them
* changed description and names
* comments corrected
* node and user object/array are now built based on drupalCreateNode() and drupalCreateUser()
* variable names are cleared up for the testing phase
* assert messages are fixed
* user is being tested three ways, as return from user_save, loading by uid and loading by name

snufkin’s picture

Status: Needs work » Needs review

changing to cnr

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

I've code reviewed and ran the provided tests. No failures for either so we are RTBC.

I'd just like to address one of webchick's items:

b) $node->vid gives a notice if not provided? That's not right. Sounds like a bug. Can you see if node_save() generates the notice without this patch? This strikes me as something we ought to fix as well in this patch if we're trying to make things easier for import scripts; hard-coding the vid is not an option in the 'real world'.

In my testing, the caller just has to set $node->vid to something in order to avoid a notice. I know that because I first tried this without setting $node->vid. The actual value is not important, and node_save() will create a new vid and save it in node and node_revisions tables. So this is not a bother for import script writers. Further, fixing this is out of scope IMO (if it really is a problem, it is extremely minor).

webchick’s picture

That seems strange. I've programmatically created all kinds of nodes in my day, and have never had to set vid, which made me think that perhaps this notice was introduced in this patch. And setting it to a static value, in any case, looks buggish. But I'll have to take a closer look next week when I get back.

snufkin’s picture

A bit more on the notice. the exact error message is:

notice: Undefined property: stdClass::$vid in /home/balu/projects/drupal/7/drupal/includes/common.inc on line 3345.

It only appears when one tries to send a node with is_new and nid set, without the patch. After applying the patch the notice does not appear anymore. I think this is related to the part in the patch where it does unset($node->vid);, i will try tracing the error when i have some more time.

moshe weitzman’s picture

FileSize
9.4 KB

That NOTICE no longer happens. I retried the $node listed at the top of this message, without a vid and it imported without a notice. Here is a reroll.

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

lilou’s picture

Status: Needs work » Needs review
Kevin Hankens’s picture

subscribing :)

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

This was RTBC before the great testbot debacle.

brainski’s picture

Can this patch also be applied to Drupal 6? Or is there a way to use this genius functionality with drupal 6?

moshe weitzman’s picture

We are using it successfully on D6 on Economist. The patch is a slightly different for D6 - not sure I can extract the right bits in order to post here.

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

moshe weitzman’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
9.4 KB

Fixed up a failing test due to node_load_multiple() patch. Back to RTBC.

Even with a cvs up, I am still seeing an *unrelated* test failure in testAliasTranslation(). The french language is not getting fully enabled.

catch’s picture

I can't reproduce the path.test failure with the patch applied, so looks good since the test bot is also happy.

moshe weitzman’s picture

FileSize
8.85 KB

Reroll - no changes.

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

catch’s picture

Status: Needs work » Needs review
catch’s picture

Status: Needs review » Needs work

These changes look fine. Minor point - we no longer have phpdoc for getInfo/setUp, so CNW for that.

moshe weitzman’s picture

Status: Needs work » Needs review
FileSize
8.8 KB

Reroll plus catch's suggested doxygen fix.

catch’s picture

Status: Needs review » Reviewed & tested by the community

These changes look fine, the new tests are good and it's hopefully a small step away from the ugliness in user/node_save()

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

moshe weitzman’s picture

Status: Needs work » Needs review
FileSize
7.92 KB

reroll. this was rtbc for 3 weeks and bit rotted. so i will rtbc this again once it passes testing. no changes.

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community
webchick’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +Needs documentation

Wow, I'm really sorry! I have no idea how this managed to fall off of my radar for this long. I guess my RTBC clean-up spurts and testing bot freak-outs were badly timed. :(

There was a couple minor formatting things I cleaned up and... committed to HEAD!

Marking "needs work" because this needs to be updated in the 6.x -> 7.x upgrade docs.

moshe weitzman’s picture

Status: Needs work » Fixed
Issue tags: -Needs documentation

No problem webchick. It took me quite a while to reroll.

I added docs to the chronological page. I can't quite stomach adding to category page because it thats a new level of dumbness to post duplicate content when we have a whole CMS and categorization system at our disposal. We would never inflict such a system on our clients. Perhaps I misunderstood and there is no duplication but it looks like it at first glance.

webchick’s picture

Yeah, the second (categorized) page is only there for Coder module developers to coordinate on the 6.x => 7.x upgrade path checks. They're the ones keeping that page in sync. Once code freeze is done, we'll likely switch and make that the default page, but until then it's useful to have the items chronologically listed.

Damien Tournoud’s picture

Assigned: moshe weitzman » Unassigned
Status: Fixed » Active

The change in drupal_write_record() broke some of the tests on (at least) SQLite. The reason is that some tests in Drupal pass drupal_write_record() a NULL value for a serial column (this is the case during the Text field test, for example).

This is unsupported on both PostgreSQL and SQLite, but MySQL will silently consider a NULL value as a synonym for a non present value.

Damien Tournoud’s picture

Status: Active » Needs review
FileSize
846 bytes

Here is a patch that fix the issue. I'm not completely convinced this is the way to go, and I'm hoping for some feedback.

yched’s picture

Patch in #445094: broken d_w_r() call on non-MySQL db in field_test.module should fix the problematic d_w_r calls in field_test.module

moshe weitzman’s picture

@Damien - does that just put back some code that I removed? I'm pretty sure that I had to remove that in order to fix some edge case. Lets fix the text field in yched's issue and leave this alone unless there are more problems.

moshe weitzman’s picture

Actually Damien's patch looks harmless if we want a quick fix for the test suite. It is debateable whether this should be fixed here or at the call sites.

Damien Tournoud’s picture

Status: Needs review » Fixed

@Moshe: I agree that this is not completely the fault of this particular patch (the bugs are in functions calling d_w_r() with invalid values). Let this issue die in peace. Josh opened #445214: dwr() fails when a NULL value is passed in a serial column (on PostgreSQL and SQLite), in which we will study how to correctly fix that.

yched’s picture

Independantly of #445214: dwr() fails when a NULL value is passed in a serial column (on PostgreSQL and SQLite), I think the d_w_r() changes in this patch introduced a problem in the node save sequence.

Take the following code, handling node creation (roughly simplified from the actual code in node_save):

if ($node->is_new) {
  drupal_write_record('node', $node);
  drupal_write_record('node_revision', $node);
}
db_update('node')
    ->fields(array('vid' => $node->vid))
    ->condition('nid', $node->nid)
    ->execute();

Incoming $node has no nid or vid property yet.
It comes out of the 1st d_w_r('node') with
- nid = [next nid] (because nid column is serial)
- vid = 0 (because vid column has default 0)
Then on SQLite, d_w_r('node_revision') writes vid = 0, instead of vid = [next vid] (previously, the incoming 0 value on the serial vid column would have been ignored). Thus, the {node}.vid value remains 0.
On a subsequent node creation, the first d_w_r('node') will fail because of duplicate vid = 0.
This is causing the failures webchick mentions in #445094-2: broken d_w_r() call on non-MySQL db in field_test.module

[edit: made the description above a little clearer]

Noe that MySQL properly inserts vid = [next vid].

Josh Waihi’s picture

Status: Fixed » Reviewed & tested by the community

#45++ I've tested and it fixes a large portion of what is broken on HEAD with PostgreSQL. Can we get this commited?

moshe weitzman’s picture

Status: Reviewed & tested by the community » Fixed
yched’s picture

Note that the behavior described in #50 is *not* related to NULL values. See comment #445214-6: dwr() fails when a NULL value is passed in a serial column (on PostgreSQL and SQLite).

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.