I've been an observer of Drupal over the years, occasionally installing it on a personal site here or there but never using it as a professional-scale publication platform.

I'm a web developer for a local newspaper company here and there's been talk about us needing to shift to a more Web 2.0 friendly CMS in the future. I've been tasked with researching and finding possible alternatives for us concerning the CMS market. One of the first things that popped into my head was using Drupal to accomplish what we needed.

However, before I get started and dive into this head first, I wanted to ask the community a couple of questions and maybe have a concern or two explained.

The main thing that needs to be accomplished here is the nightly, automated import of our stories. Right now this happens with our current CMS. The HTML files are dumped into a folder, a PHP script scrubs the data and then inserts it into our database, which our CMS then handles. My main concern is the import part.

Is there a module or something similar that allows this kind of thing, or am I looking at a completely custom solution? Keep in mind these are newspaper stories with all the corresponding author info, categories and headlines, and all of that needs to be brought over and be functioning, searchable and readable in Drupal. Is this something I can realistically accomplish with Drupal?

Second, is Drupal a good choice for us? I know I'm going to have to spend some real time fine-tuning the whole CMS, but maybe there's a better alternative, or maybe Drupal just wouldn't work in my production environment?

Any information would be appreciated and hopefully we can find our home with Drupal.

- fawkstrot

Comments

profjk’s picture

Hi fawkstrot,
Updating the time stamp so that somebody notices the post and gives you informed support.
Best of luck.

cog.rusty’s picture

You may find some useful ideas and code for importing html files in http://drupal.org/project/import_html (not ready for D6 yet) and http://drupal.org/project/node_import. You may need to write your own code or help with what is available.
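
To give a rough idea of what the "scrub" half of such an import might look like, here is a minimal sketch, written in Python only to keep it short. It is not code from either module above. The file layout is purely an assumption for illustration (headline in an `<h1>`, author in `<p class="byline">`, category in a `<meta name="category">` tag), and `StoryParser` and `parse_story` are hypothetical names. Your publishing system's dump will almost certainly look different.

```python
from html.parser import HTMLParser


class StoryParser(HTMLParser):
    """Collects headline, byline, category and body text from one story file.

    Assumed (hypothetical) layout: <h1> holds the headline,
    <p class="byline"> the author, <meta name="category"> the category,
    and every other <p> is a body paragraph.
    """

    def __init__(self):
        super().__init__()
        self.fields = {"headline": "", "author": "", "category": "", "body": []}
        self._target = None  # field that the current text run belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") == "category":
            self.fields["category"] = attrs.get("content", "")
        elif tag == "h1":
            self._target = "headline"
        elif tag == "p" and attrs.get("class") == "byline":
            self._target = "author"
        elif tag == "p":
            self._target = "body"
            self.fields["body"].append("")  # start a new body paragraph

    def handle_endtag(self, tag):
        if tag in ("h1", "p"):
            self._target = None

    def handle_data(self, data):
        if self._target == "body":
            self.fields["body"][-1] += data
        elif self._target:
            self.fields[self._target] += data


def parse_story(html_text):
    """Return a dict of cleaned fields for one dumped story."""
    parser = StoryParser()
    parser.feed(html_text)
    parser.close()
    story = parser.fields
    story["headline"] = story["headline"].strip()
    story["author"] = story["author"].strip()
    # Inline tags such as <b> are dropped; only their text is kept.
    story["body"] = "\n\n".join(p.strip() for p in story["body"] if p.strip())
    return story
```

The resulting dictionary (headline, author, category, body) is the kind of record you would then hand to whichever import route you settle on. How that record is turned into a Drupal story with taxonomy and author info is the part the modules above, or your own code, would have to cover.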

That said, I am not sure why you want to keep the old workflow and make it a patchwork, since Drupal, with the help of some modules, is perfectly capable of managing content input in several ways, with version management, pending approval, filters to sanitize HTML pasted from Word documents, etc.

From what I read in your project description here, I really have no idea whether Drupal is suitable or not. I can only say that Drupal is the most versatile CMS around. They often call it a "framework". So,
- If some other CMS almost matches your requirements out of the box, then go for it.
- If not, then you can probably assemble what you want with Drupal.

fawkstrot’s picture

Thanks for the reply.

The old production flow is going to be largely replaced and updated should we move over to Drupal. However, the way we get stories is through a dump of HTML files generated by our publishing system. We're a newspaper first, a website second (with the newspaper stories, obviously). The reason we need to keep the HTML scrub+insert kind of workflow is that it's the only way to automate getting the stories published in the paper onto the website. It's not the prettiest solution but it works, and it's also the only form of automation our publishing system supports. I don't really view it as a patchwork, just one of the steps we're forced to do in order to play nice with our publishing system.

Beyond that, the content will fall into its typical workflow within Drupal, requiring approval, running through the filters, etc. In a nutshell, this is our paper pushed to the web.

Whether I like it or not, it will come down to me having to make sure that those dumped HTML files (they are hardly HTML, just pure stories) get inserted properly, hourly, so that they are in Drupal as stories, with proper taxonomy, author info, etc. How easy is inserting new story data into Drupal, in compliance with the ways Drupal wants table relations and things of that nature? Am I looking at a huge undertaking?

Drupal's versatility is the main reason I've placed it at the top of my list. As stated, I'm just hoping to get a feel for what I'm looking at.

profjk’s picture

Inserting stories into Drupal is a piece of cake! Please get going. Drupal takes plain text, filtered HTML and full HTML as input formats. It even eats PHP for breakfast!!
Good luck.