The purpose of this module is to provide a way to move data between Drupal websites.
Quick testing
Please note that the easiest way to test this module is to use a single Drupal install:
- Install the module into a testing D6 or D7 site.
- Export current users, taxonomy terms or nodes to a dataset file.
- Alter either the users, terms or nodes.
- Go to the Import tab and import the dataset file which was created.
- Check that the alterations have been reverted and the data matches the state it was in when the dataset was exported.
NB - By default the dataset files are written to sites/default/files/data_export_import, and the import functions look in this directory for files to import. To move data from one Drupal site to another, move the dataset files out of sites/default/files/data_export_import on the site where they were exported and into sites/default/files/data_export_import on the site which needs to import them. Once they are in that directory the module will find them automatically.
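The transfer step described above is just a file copy between the two sites' directories. A minimal sketch, assuming the default directory layout mentioned above (the function name and site-root arguments are hypothetical, and the actual transfer could equally be scp, rsync or FTP):

```python
import shutil
from pathlib import Path

def transfer_datasets(source_root, dest_root):
    """Copy every dataset file from the exporting site's
    data_export_import directory into the importing site's
    directory, where the module will find them automatically."""
    src = Path(source_root) / "sites/default/files/data_export_import"
    dst = Path(dest_root) / "sites/default/files/data_export_import"
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for f in src.iterdir():
        if f.is_file():
            shutil.copy2(f, dst / f.name)  # preserve timestamps too
            copied.append(f.name)
    return sorted(copied)
```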
Description
If the module is installed it is then very easy to export users, taxonomy terms or nodes to data files. These files can then be imported into a Drupal site which has the module installed.
This makes it easy to move what we would normally think of as data (users, taxonomy terms, nodes) between Drupal sites.
Key points
* When nodes/terms/users etc are imported they are assigned the same ID number as they had when they were exported.
* In the D6 version, files attached to nodes via attachments or CCK fields are exported to the data file and imported into the receiving site.
* In the D7 version, attached files are not working yet.
* Checks are carried out to ensure vocabularies exist and that content types have matching fields etc.
* Batch API is used on node export/import to deal with large datasets.
* After an import the data in the receiving site is an exact match to the data which was exported to the file. In principle this works like rsync with the --delete option.
* Easy to test. Install on test Drupal site, export nodes/terms/users to files, delete some data, import the data file, admire the nodes etc which re-appear.
* A drush interface has been added so this module can be used from the command line.
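The "rsync with --delete" behaviour in the key points above can be sketched as follows. This is an illustration of the intended semantics in Python, not the module's actual PHP code; the function name and record format are invented for the example:

```python
def plan_import(exported, existing):
    """Given the exported dataset and the receiving site's current
    data (both keyed by the original ID), decide what to create,
    update and delete so the receiving site ends up as an exact
    match of the export - like rsync with the --delete option."""
    create = {i: d for i, d in exported.items() if i not in existing}
    update = {i: d for i, d in exported.items()
              if i in existing and existing[i] != d}
    # Anything on the receiving site that was not in the export is removed.
    delete = {i for i in existing if i not in exported}
    return create, update, delete
```

For example, plan_import({1: 'a', 2: 'b'}, {2: 'x', 3: 'c'}) yields create {1: 'a'}, update {2: 'b'} and delete {3} - node 3 only exists on the receiving site, so it is removed.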
Images which show part of the admin interface are available here.
Overview
Data export and import is not easy to achieve as Drupal stores configuration and data in the same database. Also, Drupal can be extended and customised which makes creating a standard data export/import module effectively impossible.
This module provides profiles for exporting/importing the main data items; nodes, taxonomy terms and users. These profiles make it possible to export these items into dataset files and then import these files into a different Drupal site.
This module is customisable and extendable by the addition of new profiles to take into account different Drupal builds. This means that if a Drupal instance has a very specialised set-up it will be possible to create a new dataset profile to export/import data.
As standard the following data items can be exported and imported.
- Users.
- Taxonomy terms.
- Nodes, selectable by content type.
When exporting, the data is saved in files which are placed in the current files/ directory. When importing, the data is used in a way which ensures that the receiving Drupal instance ends up with matching data. Careful checks are made to ensure IDs match and that the data remains consistent.
Effectively this module is similar to the backup_migrate module, except that the profiles go much further. Currently the backup_migrate profiles only allow a choice of which tables will be exported/imported and a few other settings. This module uses the Drupal API to read data and produce rich data files which contain all the data needed to reproduce a particular dataset, and it uses the Drupal API again to recreate the data.
The primary usage for this module is envisioned to be exporting datasets from a live Drupal site and importing this live data into a new, updated version of the site which developers have been working on - i.e. a beta version. Once the import has completed, the newly developed site can become the new live site. This has the advantage that it is possible to fall back to the original live site if there is an unforeseen problem between the new version of the site and the live data.
Very importantly, this module is also extensible with additional profiles enabling the export/import of data of any format. Because Drupal can be extended and customised in any way it may be that certain sites contain exotic content types or even very specific custom data. Users will be able to create profiles which will be able to export/import for their specific site in conjunction with this module.
It is hoped that it will be possible to set up a way for module users to be able to share profiles so they can be re-used or extended as required.
Why this is different from existing import/export modules
Needed as part of a new type of deployment plan
The focus of this module is to provide a tool to help with the way Drupal sites are developed, tested and deployed.
Most deployment plans push updates from dev to testing to live. This means that it is not possible to safely carry out large changes: if a site has been extensively redeveloped, there is a risk that when the changes are pushed to live the site will break. Of course, the changes can be tested on a copy of the live site beforehand, and a secondary copy of the live site can be kept to fall back on if the new live site breaks.
A better approach is to separately develop the new copy of the site as a beta copy - changes made by developers can be pushed up to a central beta copy of the site. This site can then be extensively developed and can be completely different from the existing live site. Using the Data Export Import module it will then be possible to export live data (users, nodes, taxonomy terms etc) from the existing live site and import that data into the beta site.
When the beta is ready to be deployed a final export of the live site data can then be imported to the beta site - and the beta site set to be the new live site. The advantages are:
- The beta site can be extensively tested with a full copy of the live data before going live.
- The existing live site is untouched and can be re-enabled if the beta site has any serious issues.
No existing module is suitable
After very extensive research into the existing import/export modules it was found that none of them were able to export and import data in a suitable way.
Deploy module
The big underlying assumption with the deploy module is:
'Deploy 6.x-1.x assumes that you have two exact clones of you code and configuration across your staging site and your production site (expect the Deploy and Services specific stuff). You can't have anything that differs the sites from each other. Nothing.'
The Data Export Import module is designed to allow data to be exported to dataset files which can then be imported into a different (e.g. beta) version of a site.
Also, Deploy transmits data directly via the services module. The Data Export Import module exports to dataset files which can be transferred in different ways - and data can be re-imported back in to the exporting instance which is useful for testing purposes. I.e. Developers can work on a site and make drastic changes to taxonomy terms and nodes etc and when testing is finished the original data can be re-imported from a dataset file.
Features module
The Features module is focussed on exporting/importing configuration settings. The Data Export Import module is focussed on exporting and importing content-type data such as nodes, taxonomy terms, users etc.
Stager
Focussed on pushing from staging to live. Also it is Drupal 7.x only.
Backup and Migrate
The Backup and Migrate module works at the database table level. Since there are so many interdependencies between tables it is not possible to cleanly extract datasets in a way which means they can be re-imported. The Data Export Import module works at the API level to recreate datasets exactly.
Staging
From the staging project page:
'This module do not work very well on sites that contains user generated content, on live site. So if you need this modules's name or url or what ever, let me know.'
Migrate
The primary focus of the Migrate module is the importing of data from non-Drupal sources. Also, it depends on several modules, including Dbtng, Elements and Autoload. It also relies on Drush, and although Drush is an excellent tool it is not used by all developers.
The Data Export Import module is designed to export/import Drupal data and to be a standalone module which just uses the Drupal API to carry out its functions, without needing to rely on other modules. Also, the user interface is set up as cleanly as possible to enable relatively inexperienced Drupal developers to export and import datasets.
The Migrate module also requires that module developers add extensions to their modules. The Data Export Import module does not need any other extensions to any other modules. Indeed, the Data Export Import module is designed to be extensible itself - so developers can add custom profiles to match their exact site requirements.
Project page
http://drupal.org/project/data_export_import
Git repository
http://drupal.org/node/1278830/commits
Testing
NB - The Drupal 6 version is fully working and has been tested with very large datasets. The D7 version has been ported from the D6 version and has only been tested with small datasets.
Reviews of other projects
http://drupal.org/node/1804464#comment-6644034
http://drupal.org/node/1797916#comment-6704876
http://drupal.org/node/1827294#comment-6707932
| Comment | File | Size | Author |
|---|---|---|---|
| #42 | drupalcs-result.txt | 7.37 KB | klausi |
| #17 | response_to_review_20111214_2.txt | 9.06 KB | bailey86 |
| #16 | 0001-changes-from-code-review.patch | 31.83 KB | greenrover33 |
| #1 | coder-result.txt | 5.55 KB | klausi |
Comments
Comment #1
klausi commented
It appears you are working in the "master" branch in git. You should really be working in a version specific branch. The most direct documentation on this is Moving from a master branch to a version branch. For additional resources please see the documentation about release naming conventions and creating a branch in git.
Review of the master branch:
This automated report was generated with PAReview.sh, your friendly project application review script. Please report any bugs to klausi.
Comment #2
bailey86 commented
Thanks for the review.
I've created a new branch called 6.x-1.x - is this now OK? I haven't deleted the master branch yet.
Is deleting the master branch crucial? I see the code to do this is the following - but thought I'd check RE the best way to follow the standard etc.
I've found the switch to make the coder module stricter - I've run it and tidied up the code until no errors show up. However, it may be that your version of coder is better - mine does not pick up the errors RE t().
Will be grateful to have any further feedback.
Thanks.
Comment #3
bailey86 commented
I've set this to 'needs review'. Can I take it that this is what I'm supposed to do after fixing up the points raised by the reviewer?
Comment #4
bailey86 commented
This has now been run through coder_tough_love and no errors show.
Extra stubs have been put in place for the next dataset profiles.
Currently it is ready to test:
* Install module.
* Export taxonomy terms.
* Change something with some taxonomy terms. NB - Use a test vocabulary or change terms which are not crucial.
* Go to 'Import' tab and import the dataset file which was just exported and which should be listed there.
* Go back to the taxonomy terms and see them magically restored to as they were when exported.
Comment #5
doitDave commented
Hi,
Yes, this is exactly it. ;)
Automated review (Please keep in mind that this is primarily a high level check that does not replace but, after all, eases the review process. There is no guarantee that no other issues could show up in a more in-depth manual follow-up review.)
Review of the 6.x-1.x branch:
This automated report was generated with PAReview.sh, your friendly project application review script. Go and review some other project applications, so we can get back to yours sooner.
Manual review
Comment #6
doitDave commented
Comment #7
bailey86 commented
Code has been tidied and run past pareview_sh and coder (with coder_tough_love).
Master branch code has been pruned to a single README.txt file as advised.
Comment #8
doitDave commented
Hi,
as I just extended my review environment with the drupalcs script, here's what comes out now:
Review of the 6.x-1.x branch:
This automated report was generated with PAReview.sh, your friendly project application review script. Go and review some other project applications, so we can get back to yours sooner.
hth,
dave
Comment #9
bailey86 commented
Thanks for the previous review.
Code now passes coder, pareview_sh and code sniffer!
Export and import of taxonomy terms and pages is now complete.
Extra blank pages have been commented out to keep the interface cleaner.
Docs have been added to explain how to download and upload the dataset files.
Some stubs of code have been left in to start the export/import of another content type - these have been commented out in a compliant way. Next I will look to move the development of new content types to a development branch.
Comment #10
bailey86 commented
Code has been updated, cleaned up, and run through all coding standards tests.
Obscure bug related to updating pages has been fixed.
Comment #11
bailey86 commented
Hi,
This was set to 'needs review' over two weeks ago so I've taken the liberty to raise the priority level as per the guidelines.
Currently the code is polished and users can export then import taxonomy terms and pages. Should all be working fine and the interface is clean and easy to use.
I'm currently working on another branch to enable the export and import of all content types - but I'd like to get this project reviewed/tested/approved etc before merging in this new work as a development branch.
Thanks in advance for any reviews carried out.
Comment #12
bailey86 commented
Just dropped the priority back down - the pareview_sh module is reporting functions as needing to have the module name at the beginning of all functions. Unfortunately, this looks like it is also flagging up internal functions (those which start with an underscore).
Comment #13
bailey86 commented
All was fine - the code review tools were picking up things from the new branch which is not in this release. Will repost the reason I raised the priority to major.
This was set to 'needs review' over two weeks ago so I've taken the liberty to raise the priority level as per the guidelines.
Currently the code is polished and users can export then import taxonomy terms and pages. Should all be working fine and the interface is clean and easy to use.
I'm currently working on another branch to enable the export and import of all content types - but I'd like to get this project reviewed/tested/approved etc before merging in this new work as a development branch.
Thanks in advance for any reviews carried out.
Comment #14
bailey86 commented
The online service at:
http://ventral.org/pareview
has found some errors - will be working through them now. Will move the priority back up to major after the code has been tidied.
Hopefully this online service is a one-stop code review process now and will make the process of testing much simpler.
Comment #15
bailey86 commented
The review at:
http://ventral.org/pareview
is now passing with no errors. This online test includes pareview and code sniffer - and I think pareview runs coder as well.
Comment #16
greenrover33 commented
I have reviewed this module manually.
But perhaps someone else should review the taxonomy part again; I was getting tired while reviewing, so I may not have found everything.
Nearly everything I found is fixed in the attached patch.
I found some coding style issues.
1.)
http://drupal.org/coding-standards#array
Place closing brackets on a new line. For me this is easier to read and avoids problems when adding new array elements.
2.)
Array elements should be indented with 2 spaces.
3.)
I don't like debug code in committed code, but this is my personal opinion:
if ($debug) {
Some errors:
1.)
function data_export_import_import_form()
file: data_export_import.module
line: 197
$directory = "sites/default/files/data_export_import/pages";
Remove this line and avoid hard-coded paths - not everyone has their files directory under sites/default.
2.)
Missing files:
profiles/pages.inc
profiles/taxonomy_terms.inc
3.)
Avoid include_once; use module_load_include() instead.
4.)
function data_export_import_export_form_submit()
file: data_export_import.module
line: 320
$dataset_file = data_export_import_export_articles()
This calls a function which is not declared.
5.)
Include the filename inside the t() string - in other languages it may not belong at the end of the string.
check_plain(t("The Articles dataset was output to the file:") . $dataset_file)
should be:
t('The Articles dataset was output to the file: @dataset_file', array('@dataset_file' => $dataset_file))
6.)
function data_export_import_import_form_submit()
file: data_export_import.module
line: 403
Triggering on buttons which do not exist makes no sense:
if ($form_state['clicked_button']['#post']['taxonomy_terms'] != 'none') {
how to trigger button clicks within drupal:
http://api.drupal.org/api/drupal/includes--form.inc/function/_form_butto...
7.)
function data_export_import_export_pages()
file: page.inc
line: 26
If it is an object, use it as an object.
8.)
function data_export_import_export_pages()
file: page.inc
line: 44
If you do this with a content type with more than 100 nodes you will run out of PHP max execution time!
It is really not workable to save all nodes to the same file in a single request; use the Batch API:
http://api.drupal.org/api/drupal/includes--form.inc/group/batch/7
9.)
function data_export_import_export_pages()
file: page.inc
line: 60
Avoid serialize(); use var_export() or json_encode() instead.
Why use serialize() when there is no need to recreate classes? It is overkill.
json_encode() and var_export() are much faster. The Features module and some others prefer var_export().
If you are worried about security issues or white-page crashes, use json_encode().
10.)
function data_export_import_export_pages()
file: page.inc
line: 156
Don't compare arrays to each other.
It is better to compare the JSON or serialized strings, or better still the md5 hashes of them.
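The idea in point 10 - compare canonical serializations or their hashes rather than nested structures - is language-agnostic; here is a sketch in Python rather than PHP, where json.dumps with sorted keys stands in for a canonical serialization (function names are invented for the example):

```python
import hashlib
import json

def record_hash(record):
    """Canonicalise a record (sorted keys) and hash it, so two
    records can be compared with a cheap string comparison
    instead of a deep structural comparison."""
    canonical = json.dumps(record, sort_keys=True)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()

def records_match(a, b):
    return record_hash(a) == record_hash(b)
```

Sorting the keys matters: two records with the same fields in a different order should still hash identically.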
11.)
function data_export_import_export_pages()
file: page.inc
line: 171
t() does not need check_plain() before it:
http://api.drupal.org/api/drupal/includes--common.inc/function/t/6
12.)
function data_export_import_export_pages()
file: page.inc
line: 171
If a drupal_set_message() is an error, don't prefix it with "ERROR -"; set the second parameter to 'error'.
13.)
function data_export_import_export_pages()
file: page.inc
line: 205
count($file_content_as_array['nodes']) == 0
is better written as if (empty($file_content_as_array['nodes']))
for better performance and error handling.
14.)
function data_export_import_export_taxonomy_terms()
file: taxonomy_terms.inc
line: 35
Using the vid directly is dangerous because the vids may not be equal on different machines.
It is better to identify the vocabulary by its machine name,
especially when you have several developers on one project.
15.)
file: data_export_import.info
The dependencies are missing.
16.)
$synonym_list .= $synonym . "\n";
is faster to read than
$synonym_list = $synonym_list . $synonym . "\n";
questions
1.)
Overrides the standard drupal_write_record() function.
function data_export_import_drupal_write_record_via_insert_with_defined_id()
Why did you have to do that?
2.)
Why is data_export_import_taxonomy_get_tree_with_reset() better than
http://api.drupal.org/api/drupal/modules--taxonomy--taxonomy.module/func...
3.)
Have you tested the taxonomy import/export with translated terms?
Comment #17
bailey86 commented
Thanks for the review.
All points have been worked through - I've attached a document which outlines all the details of the work carried out and responses to the points raised.
Module fully passes the coding standards review at: http://ventral.org/pareview/httpgitdrupalorgsandboxbailey861278830git
Module is fully functional - users can export and import all taxonomy terms - and can export/import all standard page nodes.
Once the module is accepted I will be able to work on the next phase which is to export/import all content types and users.
Comment #18
klausi commented
Review of the 6.x-1.x branch
Comment #19
bailey86 commented
Thanks for the review. All points have been fixed or addressed so the module is ready for review again.
>>> REPLY
Comments have been changed.
>>> REPLY
I'd prefer to leave them in separate functions as later I may add some checks which would be carried out before the current forms are loaded.
>>> REPLY
The t() function has been added to 'Create dataset files' and elsewhere where needed. Menu items have been left as they are as pareview says 'Menu item titles and descriptions should NOT be enclosed within t()'.
>>> REPLY
Fixed.
>>> REPLY
Fixed.
>>> REPLY
Fixed.
>>> REPLY
Fixed - the file has been removed.
>>> REPLY
I've currently tested this module with 5,000 nodes and 5,000 taxonomy terms and the export and import ran fine. I would like to change the code to run functions under the Batch API and I've looked closely at the Batch API to see how it can be used.
I'm currently being sponsored to create this module for a company I am working for. If the module is accepted as a full Drupal module then they may be more likely to sponsor it further and myself or one of their full-time coders can add in the Batch API functionality.
>>> REPLY
I looked very closely at node_import.
The key difference is that this module will import the nodes and assign them the same ID number as they had when they were exported.
Also, it will only update or create the nodes (or taxonomy terms) necessary to make them match the nodes/terms which were exported. So if you have a dataset of 1,200 nodes - and the receiving instance already holds the first 1,000 nodes - then only the new 200 nodes will be imported. Node Import will re-import all the nodes and will create them with new ID numbers. So before any import it would be necessary to delete all the existing nodes.
Think of this module in terms of 'rsync --delete' for terms and nodes.
If the terms and nodes have the same ID number as the original data then any cross references and hard-coded links will still work OK. The sites I'm working on are huge multimedia sites with dozens of custom modules. And dozens of editors/reporters etc - and many of the content types (approx 17) use the node ID in their path.
This module is to be used as part of a whole dev/staging/live deployment process. I've written about the process in the description but basically the dev's all push site functionality/design changes up to a common beta site - and using this module live data can be exported from the current live site and imported into the beta site. The beta site becomes the new live site when all has been tested.
I.e. - using this module you pull data down from a live site to a beta rather than pushing code/module changes from a beta site to a live site.
>>> REPLY
Agreed - as you say that is D7 only - so I've had to override the current function to get the functionality needed in D6.
>>> REPLY
Fixed.
FINAL NOTES.
If you check out the all_content_types branch you'll see that I now have the module able to cleanly export/import any of the content types on a Drupal instance - they get listed automatically and the dataset files get named automatically. Standard CCK fields work fine. I am currently working on exporting then re-importing any files which have been attached to a node or are from a CCK field (FileField or ImageField).
If this module is accepted as a full Drupal module my current employer may be happy to sponsor further development either by myself or by one of their own full-time coders. Currently the 6.x-1.x branch exports/imports taxonomy terms and pages - I feel that the terms import/export on its own would be very useful to many. Further down the line we're looking to add a profile to export/import users as well.
Please note - the way the module is designed on the new all_content_types branch allows for site devs to create their own completely custom profiles by adding a single file containing the relevant functions. This will allow the export/import of ANY data for any site no matter how specialised any added modules are.
FURTHER WORK
As this module addresses a big issue for Drupal - the data content migration issue - I'm hoping it will be of use to the wider Drupal community who may wish to work with it and enhance it further.
CODE STANDARDS
I've run the code through the automated tests at http://ventral.org/pareview and it's looking fine. I'm sure you remember from the discussion I generated at http://groups.drupal.org/node/195303 about the coding standards tools!
Thanks for your review - I await any further feedback if you find the time.
And thanks for the heroic effort on the modules queue - it was getting quite large and seemed to be becoming a log jam.
Comment #20
bailey86 commented
The branch to export and import all content types has now been merged in.
This module will now detect all the site's content types and offer to export/import them. Files attached via the Upload module or by FileField or ImageField CCK fields will be output to the dataset file and can then be re-imported. There may be issues if the files are larger than 1MB - but I'm currently working to deal with larger files. If needed I may use batch API - but currently this module has been tested with 5,000 nodes in a dataset and all is well.
Using Batch API is obviously the way forward but may not be needed by my current employer.
Comment #21
bailey86 commented
There is now a branch which exports/imports using the batch API. The branch is called handle_large_files.
I will now clean up the code and make the batch API messages nicer. Once the code is compliant I will merge in the handle_large_files branch to the 6.x-1.x branch.
Comment #22
bailey86 commented
The main 6.x-1.x branch is now using the Batch API to deal with large datasets.
Comment #23
bailey86 commented
This module now exports and imports users.
Comment #23.0
bailey86 commented
The beginning of the description seems to give the impression that this module was just a framework - whereas in fact it is complete and ready for use as soon as it is installed.
Comment #23.1
bailey86 commented
Added link to admin interface images.
Comment #23.2
bailey86 commented
Quick change.
Comment #23.3
bailey86 commented
Tidy up.
Comment #24
bailey86 commented
This issue description has been updated to reflect the fact that the module is ready to be used out of the box for nodes, taxonomy terms and users.
Comment #24.0
bailey86 commented
Updated as the module has been extensively extended since it was initially uploaded.
Comment #25
bailey86 commented
A Drush interface has been added to enable the module to be used from the command line.
Comment #26
dpatte commented
I am interested in how this is progressing.
You say this module preserves nids and uids (and therefore presumably noderefs and userrefs). What happens during the sync process if the node id or user id already exists on the destination db?
Comment #27
bailey86 commented
Think in terms of the module acting like rsync with the --delete option.
If the node id or user id (or term id) already exists in the destination DB then those nodes/users are updated with the information in the incoming data file. The plan is that the nodes and users are first exported to a data file - then this data file is imported into a destination DB - and after the importation the users/nodes should be an *exact* match of the data which was exported from the first Drupal site.
This is useful for the following reason.
Say you have a live site which you export from - and it has a new node with id of 123.
You also have a beta copy of the site which is being tested - and a developer creates a test node with dummy data which has an id of 123.
If you import from the data file exported from the live site the data for node 123 will overwrite the data in the beta site. I.e. it will replace test/dummy data with data from the live site.
Similarly, if on the beta site a developer creates a test node of id 123456 which does not exist on the live site (which exported the data) then this node will be deleted from the destination DB.
Preserving the IDs of everything - terms/users/nodes - will (as you say) keep noderefs and userrefs working.
As mentioned - the idea is that terms/users/nodes will be exactly replicated between Drupal sites.
This is all part of a larger development/deployment plan which has been documented in the module. I'm currently working on enhancing the plan and should be posting a white paper when it is ready.
Comment #27.0
bailey86 commented
Updated the status of the drush interface.
Comment #28
klausi commented
Sorry for the delay, but you have not listed any reviews of other project applications in your issue summary as strongly recommended here: http://drupal.org/node/1011698
manual review:
Comment #29
bailey86 commented
Thanks for the review.
I've not been able to review other projects recently as I've been working hard on testing this module. It's been used to pull in 10GB dataset files and has been used in a large deployment scenario.
Also, I've been writing up a document called 'Drupal websites development and deployment strategies' which shows how this module can be used as part of large deployments. Once the document is finished it will be published as a white paper (or possibly a book!) and made available to the Drupal community.
Please note that I've aided the review process by helping to clear up the issue of coding standards testing. I raised the issue at http://groups.drupal.org/node/195303 and the conclusion RE using ventral.org has become part of the instructions on the 'Tips for ensuring a smooth review' page - http://drupal.org/node/1187664 - hopefully that shows I'm keen to help!
I've worked on your module review points. Here's my replies (not in the same order):
2. data_export_import_callback_overview(): all user facing text must run through t() for translation.
DONE.
I've changed the output in that function.
4. dei.drush.inc: make sure to use your full module name as function prefix to avoid name collisions with other modules.
DONE.
I've changed the drush.inc file so that the functions use the full module name as the prefix. The function names are related to the drush.inc file name so I've changed the file name as well.
1. data_export_import.module: don't execute PHP code globally here, as this will be called on every single page request. Include your files only in the appropriate hooks/functions.
and
3. data_export_import_menu(): why do you include files in hook_menu()? You don't need them there?
I've designed the module so it can be extended relatively easily by adding new profiles. If someone adds a new profile file into includes/profiles then it will create a new tab on the data_export_import interface and allow the user to export and import data.
Currently, the module enables user, taxonomy terms and nodes to be exported and imported. I wanted to allow for expansion by adding new profiles. This means that users can relatively easily add new profiles to handle any weird and wonderful custom data they may need to export/import.
As part of this profile plugin setup the code in data_export_import.module loops through all the files it finds in the includes/profiles directory and adds in the callback functions and the extra menu items (tabs). That is what the code is doing.
I don't really want to remove the ability to easily add new profiles - and can't see how it can be implemented differently. When the admin page is requested it needs to read the profiles directory to see what tabs to load and to have the callback functions available.
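A minimal sketch of that discovery loop, assuming one profile per .inc file in includes/profiles (the function name and array keys are illustrative, not the module's real code):

```php
<?php
// Illustrative sketch: derive one tab/menu entry per profile file found
// in includes/profiles. Each file name (e.g. nodes.inc) yields a tab
// title and a callback name following a fixed naming pattern.
function data_export_import_discover_profiles($dir) {
  $items = array();
  foreach (glob($dir . '/*.inc') as $file) {
    $name = basename($file, '.inc');
    $items[$name] = array(
      'title' => ucfirst($name),
      'page callback' => 'data_export_import_callback_' . $name,
      'file' => basename($file),
    );
  }
  return $items;
}
```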
Thanks again for the review - my current employer has invested heavily in this module and it would be great if we can get it approved as a full Drupal module.
Comment #30
joachim commented
> I don't really want to remove the ability to easily add new profiles
You shouldn't require people to put new files into your module's folder, because that will make updating this module to newer versions difficult.
You should either require other modules to implement an API hook (like hook_views_api) or expect them to make a file with a particular name pattern (eg 'MYMODULE.data_export_profile.inc'). The first is probably more robust.
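A rough sketch of the hook-based alternative (the hook name and return structure are hypothetical): the module would collect profiles from any implementing module rather than scanning its own folder:

```php
<?php
/**
 * Hypothetical sketch: gather profiles via an API hook instead of a
 * directory scan. Other modules would implement
 * hook_data_export_import_profiles() and return their profile info.
 */
function data_export_import_get_profiles() {
  $profiles = array();
  foreach (module_implements('data_export_import_profiles') as $module) {
    $profiles += module_invoke($module, 'data_export_import_profiles');
  }
  return $profiles;
}
```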
Is this just D6 for now?
Comment #31
bailey86 commented
Agreed.
Hmmm...
So - we can say that extending this module to handle other specific/custom types of data really should be done using hooks or sub-modules which depend on this module.
Initially I thought I'd only be able to support nodes of a couple of specific content types - but now it can support nodes of virtually all normal content types based on CCK fields. Since it can support taxonomy terms, users and all content types (nodes) then it is pretty much complete in itself.
At least by using the current separate profile files for terms, users and nodes it keeps the code cleanly structured.
So I agree - if users want to be able to export their own custom data sets then they can extend this module by using hooks - or better, by using sub-modules. Sub-modules would be better because they could be re-used - e.g. by creating a module called data_export_import_ups_delivery_rates.
This is just D6 currently - but - if it can be put forward as RTBC I can more likely get further sponsorship to produce a D7 version.
Comment #32
bailey86 commented
As mentioned before - if anyone is interested in a D7 version - and possibly a version which can export from D6 and import the datasets into D7 - then could they look to getting this module to RTBC - then I might be able to carry out further work.
Comment #33
bailey86 commented
This module is D6 only - I may need to create a D7 version soon but this depends on the priorities of the work required by my current employer. As this module has been tested and used by jono - http://drupal.org/user/97674 - it would be helpful to get the module accepted as a full Drupal module. Currently I'm waiting to get it set to RTBC status.
The same user (jono - http://drupal.org/user/97674) has added in some hooks which sound like the sort of thing you may need. I've asked him to send in a patch and when that is received I'll add it in to the code.
Comment #34
joachim commented
I've recently made something for D7 in a similar vein: http://drupal.org/project/migrate_parcel
Comment #35
bailey86 commented
Hi Klausi,
(Thanks for the restws module - it's being used by us and I'm hoping to feedback anything we can to that module. Currently I've added back some documentation on how to call REST calls).
I agree that this DEI module should be extended by the use of hooks - and jono - http://drupal.org/user/97674 - who has used the module successfully has said he can send in a patch to add hooks. I've requested the patch and will add it when I receive it.
This would mean that I could effectively stop all plans to extend this module by adding profile files - and therefore stop the PHP code in the module file from being called on every page request.
My current work priorities mean that I'm not able to make this change right now. However, I may need to produce a D7 version of this module. Hopefully, as jono has now tested and successfully used this module, I can get it accepted as a full Drupal module - this community acceptance could make it easier for me to get sponsorship to create the D7 version. In the D7 version I would definitely hard-code the current menus and remove the PHP code in the module file.
Comment #36
bailey86 commented
Hi,
You could use this Data Export Import module to export taxonomy terms to a file - which can then be imported into another Drupal instance. The import recreates the hierarchy of terms and gives the terms exactly the same ID numbers as when they were exported, so it is pretty complete. It also resets the hierarchy value for the vocabulary to match the imported terms, which covers the case where terms have more than one parent.
If you want to export to CSV you could add a hook to do that. This DEI module encodes the data to make it completely safe to store in a text file; a CSV file may struggle with some term data. Also, this DEI module can export/import nodes and users, and this data may have line endings etc. which would break a CSV file.
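To illustrate why an encoded flat file is safer than CSV here, a minimal sketch of one plausible encoding - serialize plus base64 is an assumption about the format, not necessarily what the module actually uses:

```php
<?php
// Each record is serialized and base64-encoded, so embedded newlines,
// commas and quotes (which would break a CSV row) survive as one safe
// line of plain text per record.
$term = array(
  'tid' => 7,
  'name' => "A term\nwith, \"awkward\" text",
  'parents' => array(0),
);
$line = base64_encode(serialize($term));       // one safe line per record
$restored = unserialize(base64_decode($line)); // round-trips exactly
```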
Similarly, key nodes and key users which are needed in a base copy of a site could be exported to data files which are then ready to be imported into a new site. The advantage of this module is that all ID numbers are maintained and so all links between content, terms and users would be maintained.
The advantage of exporting to a separate file is that the understanding is clear - nodes, terms or users are exported to actual 'data' files. These files can then be transferred to and imported into a new Drupal site and the module knows how to read these files.
I think we're looking at the same question here - which is being able to export blocks of 'data' as opposed to configuration/site settings etc. The blocks of data exported are: nodes (of any content type), terms and users. These are exported to files which can be re-imported into any other site. (For nodes, as long as the receiving site has the same content types defined, the nodes can be imported.)
This also enables a very important feature RE deployment. Data can be exported from a live site and then imported into a new version of a site - this enables a site to be extensively updated as a beta version and then to have the existing data brought in from the live site. See http://drupal.org/node/1330454 for how this module can be used as part of a deployment strategy. I'm working on a white paper on the whole plan for development and deployment of Drupal sites which I'm hoping to push up to d.o in the next couple of weeks.
This module has been extensively tested with gigabytes of data and nodes with 150MB of images/videos/file attachments/etc. Another user - jono http://drupal.org/user/97674 - has tested and used this module in production. Also, this module does not have any dependencies.
I suggest you install this module on a D6 test site and have a look at it - I feel it may do the sort of thing you require. I may need to work on a D7 branch soon and will let you know if that work gets started.
Regards,
Kevin Bailey
Comment #37
Anonymous (not verified) commented
I've tested this for exporting/importing book pages from site to site, and it works well for us.
It presumes the node IDs are the same across both sites, and that's important for preserving the book hierarchy.
I haven't tested it with taxonomies and users -- just book nodes -- and I haven't tested it with Drush. But for our use, it works well, and so I'm comfortable flagging it as "RTBC".
Kevin, thanks again for contributing this!
Comment #38
klausi commented
We are currently quite busy with all the project applications and I can only review projects with a review bonus. Please help me reviewing and I'll take a look at your project right away :-)
Comment #39
bailey86 commented
OK - thanks for the heads up.
I'm currently converting this module to D7 - which I'm sure will make it very useful - especially as I'm thinking it might be able to export data from a D6 site and load it into a D7 site.
Will see if I can look into the review bonus - but I may not have time.
BTW - Thanks for the restws module - I'm using it here at BFBS as part of an application which is rolling out set top boxes.
Comment #40
bailey86 commented
The port to Drupal 7 is now complete and I've tagged the code at 7.x-1.0.
This port is not yet as heavily tested as the D6 branch - but the D6 branch was tested with hundreds of thousands of records, some of which were up to 140MB in size (due to attachments). However, the main testing has been carried out, and users, terms and nodes are exported OK to files and then re-imported OK.
The module will soon be used as part of a development and deployment process, and so will be tested with a certain amount of live data.
I have significant documentation available on how to use this module as part of a Drupal development and deployment strategy, and I'm currently looking for somewhere to upload that document as a sort of white paper. It would need to be linked to this module - possibly with just a hyperlink.
Comment #40.0
bailey86 commented
Fixed link to screenshots.
Comment #40.1
bailey86 commented
Added notes about ease of testing.
Comment #40.2
bailey86 commented
Updated with notes on ease of testing.
Comment #41
bailey86 commented
I've updated the project page to reflect all the changes which have been made.
Please note - I've added a couple of paragraphs to explain how easy it is to test this module. I'll paste them here for reference:
Comment #42
klausi commented
Review of the 7.x-1.x branch:
This automated report was generated with PAReview.sh, your friendly project application review script. You can also use the online version to check your project. You have to get a review bonus to get a review from me.
manual review:
Comment #42.0
klausi commented
Added first review summary.
Comment #43
bailey86 commented
Thanks for the review. I'd just added the functionality to enable nodes to be exported and imported with their attached files, so debug code and code from the D6 version were still in there. My first task today was to tidy up the code to match coding standards!
Here are your points addressed:
1. DONE - Notes about other modules added to the projects page.
http://drupal.org/sandbox/bailey86/1278830
2. DONE - I've added the t() function to strings I found.
I've been sent back and forth a little on this - my understanding is that the tab names etc laid out in data_export_import.module should be in plain text.
Any others that I missed I'll change as soon as they are pointed out.
3. DONE - 'und' changed to LANGUAGE_NONE. I may need to look further into this to get the module to work well with the locale module. My current employer is UK based and so there is no requirement for the locale module to be used currently.
4. DONE - changed 'echo' to drupal_set_message().
5. DONE - hook_permission() implemented.
6. DONE - I've added the t() function to the message pointed out.
The second part of the message is a list of file names which have been exported and so would not need translating.
And finally - all code has been tidied up and it now passes the ventral.org PAReview - tests carried out and the module is working OK.
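For the LANGUAGE_NONE change in point 3 above, a hedged D7 example (the field name is illustrative):

```php
<?php
// D7 field values are keyed by language; LANGUAGE_NONE is the constant
// behind the 'und' key used for language-neutral fields.
$value = $node->field_example[LANGUAGE_NONE][0]['value'];
// field_get_items() is the locale-safe alternative: it resolves the
// correct language for you, which matters once the locale module is used.
$items = field_get_items('node', $node, 'field_example');
```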
Comment #43.0
bailey86 commented
Tidy up.
Comment #43.1
bailey86 commented
Added review link.
Comment #44
bailey86 commented
PAReview: review bonus tag added.
Comment #45
bailey86 commented
This had been set by Jono previously http://drupal.org/node/1330454#comment-6408376
Comment #46
klausi commented
Thanks for your contribution, bailey86!
I updated your account to let you promote this to a full project and also create new projects as either a sandbox or a "full" project.
Here are some recommended readings to help with excellent maintainership:
You can find lots more contributors chatting on IRC in #drupal-contribute. So, come hang out and get involved!
Thanks, also, for your patience with the review process. Anyone is welcome to participate in the review process. Please consider reviewing other projects that are pending review. I encourage you to learn more about that process and join the group of reviewers.
Thanks to the dedicated reviewer(s) as well.
Comment #47.0
(not verified) commented
Tidied up.
Comment #47.1
bailey86 commented
Updated the project page.