Hi, what are your thoughts about migrating existing Drupal fields to MongoDB? Are there plans for a migration sub-module?

Files:
Comment | File | Size | Author
#41 | field_collection_item.patch | 750 bytes | MantasK
#30 | mongodb_migrate_module_fix_1653202.patch | 856 bytes | mcrittenden
#29 | mongo-migrate-drush-v5-1653202.patch | 637 bytes | mcrittenden
#27 | mongo-migrate-drush-v4-1653202.patch | 4.53 KB | mcrittenden
#25 | mongo-migrate-drush-v3-1653202.patch | 3.95 KB | mcrittenden
#24 | mongo-migrate-drush-v2-1653202.patch | 3.39 KB | mcrittenden
#22 | mongo-migrate-drush-1653202.patch | 3.75 KB | mcrittenden
#4 | mongodb_migrate-1653202.patch | 1.2 KB | TwoD

Comments

Category: support » task
Priority: Normal » Critical
Status: Active » Needs review

This has just been pushed to drupal.org. Anyone care to review it?

Instructions: back up the database, MySQL and MongoDB both. Backing up the database is a good idea. Do not forget to back up before you even think of this. Now enable the mongodb_migrate module, run drush mongodb-migrate-prepare, then drush mongodb-migrate. You might need to run the latter a few times, or give it relevant options so it runs for a longer time. You can run several migrate workers at the same time.

Oh, fantastic! ;) Will review in a few days.

Likewise.

Attached file (new): 1.2 KB

So far I've been able to migrate taxonomy terms (after applying this patch), nodes and custom entities. Or at least that's what it says it's doing. The new MongoDB drupal.entity_name collections seem to contain only {'_id': 123, 'timestamp': 1234567890} entries after the import.

I'm not sure if the comments part of the patch is valid, haven't tested that yet.

Why is it filtering out all fields that don't have $field['storage']['details']['sql']? That seems to match almost all my fields, so most of them still show up as using field_sql_storage.

You have mongodb_field_storage enabled, don't you? I just added that as a requirement. Re {'_id': 123, 'timestamp': 1234567890}: those should be in a collection called migrate.node and so on. The fielded entities are stored in fields_current.node by mongodb_field_storage. As for not having $field['storage']['details']['sql'], I have removed that and replaced it with a better detection mechanism. Thanks for the feedback, please keep it coming.

As for the patch, do we really want to make this even slower...? I am not sure what makes it slower, not filtering on bundles or having to iterate... hm. I guess the taxonomy bundles are a Big Deal(TM) because if you have say a 'navigation' and a free tagging one... yeah. Thanks for the patch but I would rather string mangle the comment bundles into node bundles and avoid an unindexable CONCAT and also transform the taxonomy bundles into vids and avoid a JOIN. I will get to this ASAP.
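For illustration only, a minimal sketch of the comment-bundle string mangling mentioned above. It assumes Drupal 7's convention of naming comment bundles comment_node_<node type>. The function name example_comment_bundles_to_node_types() is hypothetical and not part of the module.

```php
<?php
/**
 * Hypothetical helper: turns comment bundle names into node type names.
 *
 * Assumes comment bundles are named "comment_node_<node type>", so that
 * stripping the prefix yields a value comparable to the node type column.
 */
function example_comment_bundles_to_node_types(array $bundles) {
  $prefix = 'comment_node_';
  $node_types = array();
  foreach ($bundles as $bundle) {
    // Skip anything that does not follow the comment bundle naming scheme.
    if (strpos($bundle, $prefix) === 0) {
      $node_types[] = substr($bundle, strlen($prefix));
    }
  }
  return $node_types;
}

$node_types = example_comment_bundles_to_node_types(
  array('comment_node_article', 'comment_node_page')
);
// $node_types can then be compared against a plain column, for example
// $query->condition('n.type', $node_types), instead of a CONCAT() expression.
```

The point of doing the conversion in PHP is that the database then filters on an ordinary column rather than on a computed expression.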

Title: Initial field migration to MongoDB? » Initial field migration to MongoDB

A bunch of bugfixes were committed today and after them I got confirmation from someone who has a saner setup than mine that this worked for her.

Sorry for the wait.

Yes, I had mongodb_field_storage enabled. I never got anything stored in fields_current.node etc. Not sure why, but I see now that it may have been an error with my setup.

Regarding the patch, that's what I had to do to get anything at all to import, since it started with the taxonomy terms I had and that just threw errors at me when it thought db columns were missing.

I'm going to see if I can try another migration today, pulling from git first.

Are there notes around about how this works or what will be different? The idea sounds great and I am anxious to have it working, but I have lots of custom code updating nodes on an existing site, so I am curious how that might be affected.

I have lots of SQL queries that I assume will have to change, and lots of custom cached data that I guess will have to be rewritten. Ouch.

Will migrate work on all the content types at once? All the fields, nodes and taxonomy?

Will the process take a very long time for about 600k nodes, and about 50 vocabularies, some with over a million rows?

Any tips to make this work would be very much appreciated.

I'm having a similar issue. I have installed mongodb and it seems to save the node record, but all other field data is being stored in my PostgreSQL database. Mongo doesn't seem to be saving anything for me apart from the watchdog reports.

What's the status of the patch from July? How do we see this proceed? I've got a few large taxonomies we'd like to migrate over.

The code in the git repo supposedly works. Just need more feedback.

Hi @chx, are you suggesting we/I try using code from git, or would it be the same to use .dev?

Also, is the problem only relating to migration to field storage, or field storage itself?

In other words, which version of MongoDB Storage should be used on a live site?

I would like to move over a million nodes to field storage and would appreciate tips for accomplishing this.

Would looping through and re-saving all the nodes accomplish putting a copy, along with all the fields into Mongo?
(suggested here: http://technosophos.com/content/loading-drupal-nodes-mongodb-drush)

Thanks, Jerry

-dev would be it. Moving over a million nodes will be interesting. My drush command does exactly that: it sets up the migration and resaves all nodes. I am interested in how this goes.

Thanks. I look forward to this working. But, I must be doing something wrong.

On a small test site I had Mongo working for everything other than fields using rc2. I enabled mongodb_field_storage and set
# Field Storage
$conf['field_storage_default'] = 'mongodb_field_storage';
(not sure what this does yet)
and this test site appears to have saved a new node into Mongo.

Today I installed
version = "7.x-1.0-rc2+6-dev"
over version = "7.x-1.0-rc2"

and cleared Drupal and Drush cache, but get this error:

The drush command 'mongodb-field-update' could not be found.

I see this command mentioned in mongodb_migrate/mongodb_migrate.drush.inc in
$items['mongodb-migrate']
'description' => 'Migrates fields. Run mongodb-field-update first.',
, but not the command itself.

What does "Run mongodb-field-update" mean? Should it be a Drush command?

I then tried the following:
drush mongodb-migrate-prepare
drush -v mongodb-migrate

Use of undefined constant ASC - assumed 'ASC' mongodb_migrate.drush.inc:84 [notice]
WD php: PDOException: SQLSTATE[42S22]: Column not found: 1054 Unknown column 'node_type' in 'where clause': SELECT e.cid AS cid [error]
FROM
{comment} e

Use of undefined constant ASC - assumed 'ASC' mongodb_migrate.drush.inc:84 [notice]
WD php: PDOException: SQLSTATE[42S22]: Column not found: 1054 Unknown column 'node_type' in 'where clause':

This patch
http://drupal.org/files/mongodb_migrate-1653202.patch
is not in .dev

Yeah, comments are a problem, see above. I will see whether I can get a patch going this weekend. Sorry for not getting to it earlier; you can try that patch.

Hi chx,

In your opinion, is mongodb_field_storage ready for use on sites that need comments on the nodes?

Would the patch be for migrate or for mongodb_field_storage? In other words, will mongodb_field_storage work for new nodes? I have several new sites related to the one with over a million nodes where I would not need migrate, but I am concerned about jumping in over my head.

I am not sure how to proceed as I am confused about how this works. Is there an overview available?

Are the fields stored twice, or only in MongoDB? Would it be possible to revert back to MySQL if it came to that?

I have a major problem with performance with the large site. In your opinion, would mongodb_field_storage be a huge benefit for authenticated users?

Thanks for any and all help.

I would like to migrate if possible. I have about
1.6 million nodes
44 vocabularies with 30 million terms

I don't need to migrate comments, but I need to support comments after the migration.

Is this possible with the existing code?

Thanks, Jerry

@chx, I have a few free days sponsored to work on this for a client. Would you mind providing an update on where things are at with this issue and I can try and pick up where things have left off? IRC is fine too, just let me know.

Status: Needs review » Needs work

Just drop the bundle filtering, there's no point in it. Just have a list of entity types and that's it.

Assigned: Unassigned » mcrittenden

Not sure we can remove the bundle filtering altogether. For example, in my setup, for comments $entity_info['entity keys']['bundle'] is "node_type" which obviously isn't useful. Anyways, digging in.

Status: Needs work » Needs review
Attached file (new): 3.75 KB

Here are some fixes:

- Added a README (I think a lot of people miss that you need to change $conf['field_storage_default'] as I did)
- Fixed some of the drush command comments
- Included patch from #4 except removed this line for comments as it was failing. Not sure what the correct fix is for that.

$query->condition('CONCAT("comment_node_", n.type)', $bundles);

...so now it's missing that condition altogether which is a bug, but I'm not sure what the correct fix is for that. Can't compare to $entity_info['entity keys']['bundle'] for comments because that apparently == 'node_type' for some reason.

Any thoughts on that?

Status: Needs review » Needs work

Just saw chx's comment above about avoiding the JOINs and also realized the comment JOIN is currently pointless, so back to CNW.

Status: Needs work » Needs review
Attached file (new): 3.39 KB

Bundle filtering removed. Ready for review.

Attached file (new): 3.95 KB

Running with --timeout="0" should disable timeout. Fix for that included in this patch. Should be it for now, pending review.
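As a rough sketch of the behaviour described here (not the code from the patch), a timeout check that treats 0 as "no limit" could look like the following. The function name example_timeout_reached() and its parameters are hypothetical.

```php
<?php
/**
 * Hypothetical helper: decides whether a worker should stop.
 *
 * @param int $started  Unix timestamp at which the run began.
 * @param mixed $timeout  Allowed run time in seconds; 0 disables the limit.
 * @param int $now  Current Unix timestamp.
 */
function example_timeout_reached($started, $timeout, $now) {
  // Command line options arrive as strings, so normalise to an integer first.
  $timeout = (int) $timeout;
  // A timeout of 0 (or less) means the run is never cut short.
  if ($timeout <= 0) {
    return FALSE;
  }
  return ($now - $started) >= $timeout;
}
```

A migration loop would call this once per entity with time() as the last argument and break out when it returns TRUE.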

I haven't committed this as there's some debug code left, but also: care to remove the code that assembles the bundles? We only need entity types. Thanks.

Attached file (new): 4.53 KB

Like this? (Sorry about the debug code)

Status: Needs review » Active

I have committed this, thanks! Further feedback is warmly appreciated.

Status: Active » Needs review
Attached file (new): 637 bytes

Not sure how I missed this.

Attached file (new): 856 bytes

Fix for entity_load loading empty mongo fields in initial migration instead of populated mysql fields. This + the patch in #29 is confirmed to work well on my end.

Status: Needs review » Active

Committed, thanks, let's see whether there's more feedback.

Status: Active » Fixed

Closing this per chx's request in IRC since my rather large migration went well.

In this approach is the data left behind in the mysql tables lost in terms of functionality?

I have about 600 fields with data I'd like to move over (the performance increases are already invaluable before even moving these over, I can only imagine what benefits this will result in).

Yes -- as a matter of principle, never delete data when migrating. Who knows what it'll be useful for later. There are ample tools out there to delete whatever you want.

Yes, their functionality will be lost, or yes, their data will be moved over?

In other words, if I have an existing field and a user goes in to create a new entity with that field, will it be saved in the MySQL or the Mongo database with this migration tool, since it is already an existing field with previous data?

I gave it a shot and answered my own question. The data is transferred.

Migration continually fails after migrating users and will not continue. My guess is that something like field_collections comes up next, resulting in the failure on line 85, which calls fetchField(). Perhaps there should be some sort of ignore function to deal with this sort of thing.

<?php
Migrating user 514
PDOException: SQLSTATE[42000]: Syntax error or access violation: 1064 You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'AS
FROM
e
WHERE
( > '0')
ORDER BY  ASC
LIMIT 1 OFFSET 0
' at line 1: SELECT e. AS
FROM
{} e
WHERE
( > :db_condition_placeholder_0)
ORDER BY  ASC
LIMIT 1 OFFSET 0
; Array
(
    [:db_condition_placeholder_0] => 0
)
in drush_mongodb_migrate() (line 85 of /sites/all/modules/mongodb/mongodb_migrate/mongodb_migrate.drush.inc).
?>
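The query in the error above has an empty table name and an empty ID column, which suggests an entity type whose info lacks a base table or an ID key. A minimal sketch of the kind of "ignore" check suggested here, assuming entity info shaped like the array returned by Drupal 7's entity_get_info(). The function name example_migratable_entity_types() and the sample entity type example_remote_thing are hypothetical.

```php
<?php
/**
 * Hypothetical helper: keeps only entity types that can be queried by ID.
 *
 * Entity types without a 'base table' or without an 'id' entity key cannot
 * be walked with a "SELECT id FROM table ORDER BY id" style query, so they
 * are skipped instead of producing a broken SQL statement.
 */
function example_migratable_entity_types(array $entity_info) {
  $types = array();
  foreach ($entity_info as $entity_type => $info) {
    if (!empty($info['base table']) && !empty($info['entity keys']['id'])) {
      $types[] = $entity_type;
    }
  }
  return $types;
}

// Sample data standing in for entity_get_info() output.
$info = array(
  'user' => array(
    'base table' => 'users',
    'entity keys' => array('id' => 'uid'),
  ),
  'example_remote_thing' => array(
    'entity keys' => array(),
  ),
);
$types = example_migratable_entity_types($info);
```

Whether skipping such types silently is acceptable depends on the site; logging the skipped type names would make the omission visible.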

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Status: Closed (fixed) » Active

Hi

I have issues with migrating a custom entity type that I have defined in my module.
1. When migrating, it creates only this collection in MongoDB: migrate.my_entity_type
2. I cleared that table and tried to see what happens when I load that entity. It selects the basic info from MySQL, but when it attaches fields in field_attach_load, the storage is already MongoDB and of course the fields are not there.

I tried to change the storage of those fields. Then the fields are loaded, but during migration still only one collection is created (see 1.). Is it possible that this happens because I use a custom controller which extends DrupalDefaultEntityController?

my save function is:

  public function save($entity) {
    // Decide up front whether this is an insert; checking with empty()
    // avoids a notice when nbiid is not set on a new entity.
    $is_new = empty($entity->nbiid);
    if ($is_new) {
      $entity->created = time();
    }
    module_invoke_all('entity_presave', $entity, 'node_blocks_item');
    // An empty primary key list makes drupal_write_record() insert a new row.
    $primary_keys = $is_new ? array() : 'nbiid';
    drupal_write_record('node_blocks_item', $entity, $primary_keys);
    if ($is_new) {
      field_attach_insert('node_blocks_item', $entity);
      $invocation = 'entity_insert';
    }
    else {
      field_attach_update('node_blocks_item', $entity);
      $invocation = 'entity_update';
    }
    module_invoke_all($invocation, $entity, 'node_blocks_item');
    return $entity;
  }

I saw that field_collection uses EntityAPIController. I will try to use that tomorrow and see what happens then.

I solved the saving part by adding 'save callback' => to my entity definition, but the fields are still empty. If I change the storage to SQL, it loads the values but also saves the fields back to SQL, so the entity doesn't have those fields in MongoDB storage. And if I leave MongoDB storage for the fields, it loads empty values and then saves empty values to the entity. Any hints on how this works with nodes, field collections and other types?

btw with field collections it also was not working - host entity was not loaded. So I made some small change:

<?php
if ('field_collection_item' == $entity_type) {
  // Don't save the host entity.
  $entity->save(TRUE);
}
else {
  entity_save($entity_type, $entity);
}
?>

Status:Active» Needs review
StatusFileSize
new750 bytes
Test request sent.
[ View ]

Oh. All seems to work now. Loading was not working because I was debugging code outside drush so this code:
if (defined('DRUSH_VERSION') && $age == FIELD_LOAD_CURRENT && in_array($entity_type, $all_types)) {
was not working.

So the only issue I was facing was with field collection items. Not sure if it is only my case. Anyway, attaching a patch.

As it stands, I can't get multiple migrate workers to play nicely together. They keep running into race conditions and throw MongoCursorExceptions for duplicate key errors.

Or should I be doing something different than calling drush in multiple terminal windows?

Status: Needs review » Closed (fixed)