I was in need of a large-scale CMS, and Drupal was pointed out to me as something I should look into.
It seems like a wonderful CMS, but with one glaring problem (for my needs): the complete lack of an import feature. I have over 100,000 pages' worth of information that I want to import into several thousand categories.
It seems like I could set up categories as books and then add pages to those books. The problem is, I obviously can't *manually* create several thousand categories, sort over 100,000 pages into those categories, and then create each page by hand under the appropriate book.
Is there any way to create books on a mass scale, or to import several thousand pages at once into specified categories?
I can modify/prepare the information in any format/structure necessary. I just haven't found any modules (core or otherwise) that look like a viable solution.
If Drupal can't accomplish what I'm after, does anyone have recommendations for a CMS that could? I've been using MediaWiki, which works moderately well, but I want a CMS that is more advanced, more flexible, and easier to customize and upgrade.
Thanks in advance for any help anyone can offer!
Comments
Import / Export Modules
Have you explored the Import / Export Modules Page?
Importing content
Hi,
I haven't done anything on quite that scale, but I did recently import around 500 pages into a book structure and, on another site, around 3500 pages into a CCK/taxonomy setup. It wasn't a simple 'install one module, hit import, and done' process, but it wasn't all that difficult. Most of the work involved is in preparing your sources for import.
The node_import module - http://drupal.org/project/node_import - worked nicely for me when converting a Word doc to CSV and importing it as separate book pages. Because the pages had to respect the organisational hierarchy of the original Word doc, I imported the structure directly into the 'book' table of the database in a separate step.
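As a rough illustration of the preparation step, here is a minimal Python sketch that turns a flat outline into CSV rows carrying a parent reference, so the hierarchy survives the export. The column names (`id`, `parent_id`, `title`, `body`) and the `(depth, title, body)` input format are assumptions made for this example; they are not what node_import or the 'book' table actually expects, so they would need to be mapped to the real fields.

```python
import csv
import io


def outline_to_rows(outline):
    """Turn (depth, title, body) tuples into rows with id and parent_id.

    Depth 0 is a top-level page. Each deeper item is attached to the most
    recent item one level above it. A parent_id of 0 means "no parent".
    """
    rows = []
    last_at_depth = {}  # depth -> id of the most recent page at that depth
    for page_id, (depth, title, body) in enumerate(outline, start=1):
        # Forget deeper levels left over from a previous branch.
        for stale in [d for d in last_at_depth if d > depth]:
            del last_at_depth[stale]
        parent_id = last_at_depth.get(depth - 1, 0)
        last_at_depth[depth] = page_id
        rows.append(
            {"id": page_id, "parent_id": parent_id, "title": title, "body": body}
        )
    return rows


def rows_to_csv(rows):
    """Serialise the rows to a CSV string with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "parent_id", "title", "body"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical outline, e.g. derived from heading levels in the source doc.
outline = [
    (0, "Introduction", "Intro text"),
    (1, "Background", "Background text"),
    (1, "Scope", "Scope text"),
    (0, "Methods", "Methods text"),
    (1, "Setup", "Setup text"),
]
rows = outline_to_rows(outline)
csv_text = rows_to_csv(rows)
```

Keeping the parent reference as its own column makes it possible to load the page content and the structure in two separate passes, as described above.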
The 3500 pages were in a SQL export from a Typo3 site. I imported this into a temporary table in the Drupal database, set up my new CCK content type with the fields I wanted, and then used phpMyAdmin to copy the Typo3 columns into the corresponding CCK fields. For that project I also had to import calendar information into the 'date' table.
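The general pattern here is a staging table plus an `INSERT ... SELECT` that maps old columns onto new ones. The sketch below shows that pattern using Python's built-in sqlite3 module so it can be run anywhere. The table and column names (`staging_pages`, `target_content`, `old_id`, and so on) are hypothetical and do not reflect the real Typo3 or Drupal/CCK schema, which involves more tables than this.

```python
import sqlite3


def copy_staged_rows(conn):
    """Copy usable rows from the staging table into the target table.

    Returns the number of rows now present in the target table.
    """
    conn.execute(
        """
        INSERT INTO target_content (source_id, title, body)
        SELECT old_id, TRIM(old_title), old_body
        FROM staging_pages
        WHERE old_title IS NOT NULL
        """
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM target_content").fetchone()[0]


# In-memory database standing in for the real one (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE staging_pages (old_id INTEGER, old_title TEXT, old_body TEXT);
    CREATE TABLE target_content (source_id INTEGER, title TEXT, body TEXT);
    INSERT INTO staging_pages VALUES (1, ' About ', 'About us text');
    INSERT INTO staging_pages VALUES (2, 'Contact', 'Contact text');
    INSERT INTO staging_pages VALUES (3, NULL, 'Row without a title');
    """
)
copied = copy_staged_rows(conn)
titles = [row[0] for row in conn.execute("SELECT title FROM target_content ORDER BY source_id")]
```

Keeping the original id in a `source_id` column makes it easier to check the result against the old site and to re-run the copy if something goes wrong. On a real site, it is worth taking a database backup before running any statement like this.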
The final step was to import the categories into a Drupal taxonomy. There are two modules that can do this:
- http://drupal.org/project/taxonomy_csv - which does relatively simple imports from a CSV file.
- http://drupal.org/project/taxonomy_xml - which allows you to import/export entire taxonomy vocabularies, including hierarchy and description info.
I exported the relevant columns from the Typo3 data to a text file, which I then massaged into XML form with some regular expressions (Excel's XML export was too verbose to be worth it!) and then imported it using the taxonomy_xml module.
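For anyone who would rather script this step than use regular expressions, here is a minimal Python sketch that converts tab-separated lines into XML with the standard library's ElementTree, which also takes care of escaping characters such as `&`. The input layout (name, parent, description) and the element names (`vocabulary`, `term`, `name`, `parent`, `description`) are assumptions made for this example; they are not the format taxonomy_xml actually expects, so check the module's documentation for that.

```python
import xml.etree.ElementTree as ET


def terms_to_xml(lines):
    """Convert 'name<TAB>parent<TAB>description' lines into an XML string."""
    root = ET.Element("vocabulary")
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Pad short lines so missing columns become empty strings.
        name, parent, description = (line.split("\t") + ["", ""])[:3]
        term = ET.SubElement(root, "term")
        ET.SubElement(term, "name").text = name
        if parent:
            # Only top-level terms lack a parent element.
            ET.SubElement(term, "parent").text = parent
        ET.SubElement(term, "description").text = description
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data in the assumed tab-separated layout.
sample_lines = [
    "Fruit\t\tEdible plants",
    "Apples & Pears\tFruit\tPome fruit",
]
xml_text = terms_to_xml(sample_lines)
```

Building the tree with a library instead of string substitution guarantees well-formed output, at the cost of a little more setup than a quick search-and-replace.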
I guess what I'm trying to say is that you might need to use some other tools to prepare your import, rather than being able to use one module to import everything, but if you're not afraid of a bit of MySQL, you can successfully manage quite large, complex imports.
Thank you
Awesome post. Just what I was looking for, at least until a more formal import/export module is created. I also have an upcoming requirement for a mass import. FYI to the developers: getting the data into CSV format is not so tough for most of us end users. Getting it into Drupal books/stories/pages/blogs/forums is. I love the idea that there is work being done on the specific doc type filters. That would be like two home runs in one for me. Now I have to figure out how to use these chip in things for all the amazing capabilities here.
Very helpful, guys. Much appreciated.