SavannahNow is the web arm of the Savannah Morning News, a print newspaper in Savannah, Georgia (USA). It's built on a customized Drupal 4.6.8 platform. Some statistics:

  • We imported 160,000+ users from an existing system using the user_import module.
  • SavannahNow gets about 2M page views per month.

More details are at http://ken.therickards.com/2006/06/04/tech-notes/.

Comments

jasonwhat’s picture

This is a pretty amazing site, and I think any information you can share with other Drupalers would help many. It's also a great example of Drupal's ability (front page material, imho). I'd love to see "Generator", and maybe the community could help you port some of this stuff to 4.7.

A couple of questions. You seem to have done a lot of custom work. Are any of these modules that could be contributed to the Drupal community?

Also, I'm very curious about how content gets on the site. I imagine that the paper's editors want the control, but probably don't know much about websites. What is the workflow for the newspaper content, and how does it get up there each day? What XML work did you do?

BlogAPI — we had to write a ton of custom XML parser/importer routines to process data. We run it all through XMLRPC and the BlogAPI (I think, this was another corner of the development).

I've always felt that allowing content import from Word would be a major turning point for Drupal to be more business friendly. Did your work involve anything along these lines?

agentrickard’s picture

Thanks. Two part answer.

1) Generator -- will eventually get cleaned up and ported. One limitation -- it presents the same pages to anonymous and logged-in users, so we use JavaScript to mimic logged-in status on flat pages.

2) Editorial content that flows into this section:

http://new.savannahnow.com/in_print/

Comes straight out of the editorial system (DTI PageSpeed) and flows via XML to Drupal's XML-RPC library. Content only moves when the editors flag it for the web. We then have a custom module that parses and stores the data.

The data is moved in NITF -- News Industry Text Format, an XML DTD.
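For anyone who hasn't seen NITF, here is a rough sketch of pulling the headline, byline, and body paragraphs out of a story. It is in Python rather than the PHP our module uses, the sample document is made up and heavily trimmed, and `parse_nitf` is a hypothetical name, so treat it as an illustration of the format and not as our parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed NITF story; real exports carry far more metadata.
SAMPLE_NITF = """<nitf>
  <head><title>City council approves budget</title></head>
  <body>
    <body.head>
      <hedline><hl1>City council approves budget</hl1></hedline>
      <byline>By A. Reporter</byline>
    </body.head>
    <body.content>
      <p>First paragraph of the story.</p>
      <p>Second paragraph of the story.</p>
    </body.content>
  </body>
</nitf>"""


def first_text(root, tag):
    """Return the stripped text of the first element named `tag`, or ''."""
    element = next(root.iter(tag), None)
    if element is None:
        return ""
    return "".join(element.itertext()).strip()


def parse_nitf(xml_text):
    """Extract the fields a CMS node would typically need from a NITF story."""
    root = ET.fromstring(xml_text)
    content = next(root.iter("body.content"), None)
    paragraphs = []
    if content is not None:
        paragraphs = ["".join(p.itertext()).strip() for p in content.iter("p")]
    return {
        "title": first_text(root, "hl1"),
        "byline": first_text(root, "byline"),
        "body": "\n\n".join(paragraphs),
    }


story = parse_nitf(SAMPLE_NITF)
print(story["title"])
```

The resulting dict maps fairly directly onto a node's title and body, with the byline stored however your story type handles it.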

We do some similar things with Movie and TV data from CinemaSource (a data vendor).

From the In Print section, the web editors can 'enhance' a story for online viewing. We're hoping to move people from 'reading yesterday's newspaper online' to reading news online. See http://new.savannahnow.com/node/103527 for an example.
--
http://ken.blufftontoday.com/
http://new.savannahnow.com/user/2
Search first, ask good questions later.

jasonwhat’s picture

I was wondering about this and asked once before, but didn't find the answer. Where is the documentation on Drupal's XML-RPC library? Since Word 2007 supports importing XML schemas, could it be pretty simple to write a module that imports from Word? Maybe it's time for a whole content import API or set of modules? I'd love to hear more about your experience with this particular aspect of your site.

agentrickard’s picture

Some better coders worked on that end of the project, so I can't really give you any pointers.

I do know that we use the blogapi module. Beyond that, I don't know.
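For anyone digging into this, the blogapi module answers the standard blogging XML-RPC calls, including the MetaWeblog API. A rough sketch of what a `metaWeblog.newPost` request looks like on the wire follows, in Python rather than PHP. The blog id, user name, password, and story text are placeholders, and `build_new_post_request` is a hypothetical helper, so this shows the general shape of the call and not how our importer is wired up.

```python
import xmlrpc.client


def build_new_post_request(blog_id, username, password, title, body, publish=True):
    """Serialize a metaWeblog.newPost call into an XML-RPC request body."""
    post = {"title": title, "description": body}
    return xmlrpc.client.dumps(
        (blog_id, username, password, post, publish),
        methodname="metaWeblog.newPost",
    )


# Placeholder values only; a real client would POST this body to the site's xmlrpc.php.
request_xml = build_new_post_request(
    "blog", "editor", "secret",
    "City council approves budget",
    "<p>First paragraph of the story.</p>",
)

# Round-trip the payload to confirm what the server would receive.
method_params, method_name = xmlrpc.client.loads(request_xml)
print(method_name)
```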

venkat-rk’s picture

Maybe it's time for a whole content import api or set of modules?

Jaza is working on an import/export API for Drupal as part of a Google SoC project. Maybe you should share your ideas with him.

eoneillPPH’s picture

I really really need that... I'm currently tasked with setting up a Drupal site for some of our DTI content. Eventually we hope reverse-publishing will be possible, but in the meantime, I need to find a way to parse DTI's nasty PageSpeed export format into something that can become a node with the CCK fields I've set up to spill the stories into.

Any possibility you could shoot the parser module my way? Our deadline is of course ridiculous -- like, August 5th (2008) and my vacation is next week, just to make it more absurd.

Otherwise, I think I'm going to have to dig into the guts of FeedAPI module and/or Import/Export API module, and that looks unlikely to be achievable in just a few days.

All ideas and pointers graciously and gratefully welcomed.

apex’s picture

One of the best looking drupal sites yet. Very impressive.

Could you describe some of the modules that you used in creating this? For example, what are you using for the slideshow and the weather? Also, how did you do the profile pages? Is that custom, or just a bunch of blocks thrown together using the profile module?

Again, unbelievable theming.

http://www.ApexMediaDesigns.com

agentrickard’s picture

We had many, many people work on the design. See http://new.savannahnow.com/node/89117

The slideshow I assume refers to the Flash 8 player embed. That's custom. Works in two ways.

1) For section fronts, it's a block that pulls from NodeQueue

2) For stories, it pulls from its own menu callback within our News Story module.

Profiles are hacked together and stuffed into a generic 'functions' module that I wrote. We then say on profile_view()

if (function_exists('mdw_profile')) {
  $profile = mdw_profile();
}

Theming was the hardest part, largely because we have a complex site.

apex’s picture

How did you create the shadow effect on your boxes (for example, there is a shadow on the right-bottom of the weather box)?

http://www.ApexMediaDesigns.com

agentrickard’s picture

I didn't write the CSS (we had 6 people working on the site full time).

But, looking at the code, this looks like the part:

/******************************************************************* Column 1 */
        #main #main-content .column-1 {
            float: left;
            width: 216px;
        }
            #main #main-content .column-1 .content {
                margin: 0 9px;
                padding: 6px 8px;
            }
                #main #main-content .column-1 .content h2 {
                    background: url("http://new.savannahnow.com/images/know/misc/h2_column_1.gif") repeat-x;
                    border-top: 1px solid #484848;
                    color: #fff;
                    font-size: 14px;
                    margin: -6px -8px 0;
                    padding: 12px 16px 22px;
                }
                #main #main-content .column-1 .content ul li {
                    margin: 0 0 10px 0;
                    font-weight: bold;
                }
                #main .column-1 .hr {
                    background: url("http://new.savannahnow.com/images/know/misc/hr_column_1_index.gif") no-repeat;
                    height: 11px;
                    overflow: hidden;
                }

But hey, you already figured that out, right? CTRL-U.

http://drupal.org/node/38986#comment-72298

giorgosk’s picture

What are the classifieds on the site?
Is it a Drupal module?

--
Chios Greece sightseeings

------
GiorgosK
Web Development

agentrickard’s picture

No. The classifieds are part of our normal suite of products. They run from C++ and Oracle.

They are integrated onto the site via server-side includes and blocks.

See: http://morrisdigitalworks.com/products/mdclassifieds.shtml

LateNightDesigner’s picture

Simply fantastic! I really love all the work you did and the excellent use of modules and custom work. Truly inspiring and a great example of how amazing Drupal is. Good work!
//---------------------------------------
Latenightdesigners.com- Giving IMD a Fighting Chance

drupallinux’s picture

this is a nice website. very nice custom work. good job.

http://www.SiteLancers.NET

robert castelo’s picture

I think 160,000 users imported is a new record!

The site the user import module was written for only needed to import about 500 users, but I wanted to make it as scalable as the rest of Drupal. Glad the extra effort was worthwhile, and I hope the import went smoothly.

As for the site itself, fantastic, nice work.

Cortext Communications
Drupal Themes & Modules

------------------------------------------
Drupal Specialists: Consulting, Development & Training

Robert Castelo, CTO
Code Positive
London, United Kingdom
----

jasonwhat’s picture

User Import is a great module. Is it up for 4.7 yet? I know it was asked here (http://drupal.org/node/48202) but no answer since May.

Rosamunda’s picture

It's amazing!!!
Not only does it not look like a Drupal site, but it is one of the most impressive sites I've seen!!!
Congratulations!

Mojah’s picture

2500 users (on 4.72). 160K is fantastic! I agree user import is a much appreciated module.

Very inspirational site there Ken. We will study and learn from the pros.

One Love

agentrickard’s picture

Robert and I talked before we did the import to this new site. He was interested to know how a batch import that large would go. I forgot to tell him....

It turns out that we had a practical import limit of about 15,000 users per import. Don't know why. Even with the cron throttle in the module.

We ended up chunking our user file into 10 or 11 parts.

About 30,000 accounts (out of 190,000) were dropped. Mostly due to invalid email addresses.

The coolest part was that we were able to keep the 20 or so pieces of data from the old system and import them into Drupal.
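If anyone needs to do the same, the chunking itself is simple. Here is a rough sketch in Python (not what we actually ran) that drops rows with obviously bad email addresses and splits the rest into batches. The `name` and `mail` column names, the loose email pattern, and the `chunk_users` helper are all assumptions for illustration. With about 160,000 valid accounts and a 15,000-row ceiling you end up with 11 files, which matches what we saw.

```python
import csv
import io
import re

# Deliberately loose check: something@something.something, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def chunk_users(rows, chunk_size=15000):
    """Split user rows into lists of at most chunk_size, skipping bad emails.

    Returns (chunks, dropped_count).
    """
    chunks, current, dropped = [], [], 0
    for row in rows:
        if not EMAIL_RE.match(row.get("mail") or ""):
            dropped += 1
            continue
        current.append(row)
        if len(current) == chunk_size:
            chunks.append(current)
            current = []
    if current:  # keep the final, partially filled chunk
        chunks.append(current)
    return chunks, dropped


# Tiny made-up export; a real file would be read with open(path, newline="").
sample = io.StringIO(
    "name,mail\n"
    "ann,ann@example.com\n"
    "bob,not-an-email\n"
    "cat,cat@example.com\n"
    "dan,dan@example.com\n"
)
chunks, dropped = chunk_users(csv.DictReader(sample), chunk_size=2)
print(len(chunks), dropped)
```

Each chunk can then be written back out with `csv.DictWriter` and fed to the import one file at a time.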

bertboerland’s picture

Forgive my ignorance, but why not migrate the old user database to an LDAP server and use a drupal<->ldap module for AAA functions? That way it is more future-proof (what if you want to migrate to another CMS tomorrow?), and other services can more easily reuse the user IDs and passwords.

A question: apart from users, how much content did you migrate, and how did you do it (both the content and the metadata, such as URLs and taxonomy)? Any word on that?

--
groets
bert boerland

yelvington’s picture

Bert, that's an interesting idea. I don't think it came up during this process at all. The Morris registration system uses Oracle to store user profiles but LDAP to handle authentication. We did lose some information moving into Drupal, because Drupal doesn't store passwords (just md5 hashes).

As for content migration: Not much. And one of the performance issues the website team wrestled with is Drupal's position as a 404 error handler. The reason the site is parked on the "new.savannahnow.com" domain is that when it was launched on the "www.savannahnow.com" domain, the server tanked due to the 404 barrage. When you run a very large site for about a decade, a lot of ancient junk gets into the search engines.

agentrickard’s picture

Actually, the password is the only piece of user data we lost in translation (and we 'lost' it only in the sense that it is now irretrievable, but old passwords work).
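To illustrate the "irretrievable but still works" point: Drupal 4.x stores only an unsalted MD5 hash of each password and compares hashes at login. A rough sketch in Python follows. The function names and the sample password are made up, and it says nothing about how the old passwords were converted during our import.

```python
import hashlib


def drupal4_hash(password):
    """Return the unsalted MD5 hex digest, as Drupal 4.x keeps in users.pass."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def check_login(stored_hash, attempt):
    """A login succeeds when the attempt hashes to the stored value."""
    return drupal4_hash(attempt) == stored_hash


# Only the hash is stored; the original password cannot be read back from it.
stored = drupal4_hash("hunter2")
print(check_login(stored, "hunter2"), check_login(stored, "wrong"))
```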

And, frankly, we didn't use other methods because I'm just not that smart. Importing users into Drupal was an obvious path, so I drove us down that road.

There are some ideas floating around about content migration -- some small chunks are being moved over, but not the whole news archive.

Max Bell’s picture

Prepare to be emulated. A lot.

insomoz’s picture

wow
Truly amazing. I don't have the words to describe how good this is.
On a scale of 1000 it's 999. Well, you can never be perfect ;)

Will White’s picture

Really great site, but sharing your detailed processes with the community is even greater. Thanks!

divrom’s picture

That is a very nice site. Well done!

I wonder how long it'll take to get ripped-off? ;-)