France 24

France 24 is a public 24/7 international news channel broadcast in three languages: French, English and Arabic. Its mission is to cover international current events from a French perspective and to convey French values throughout the world. The channel provides keys to understanding complex events through in-depth analysis. France 24 also puts culture at the forefront of its programming. France 24 is part of the AEF (the "Audiovisuel Extérieur de la France" or French foreign media), along with RFI (a radio station) and TV5 (a TV station).

Launched in December, 2006, the website was originally based on a Java CMS, Magnolia. But due to stability problems, we switched to Drupal 5 in mid-2008. We have just migrated to Drupal 6 and a brand new codebase. This case study covers this migration, focusing on the technical part, and describes some of our homegrown modules to be open-sourced.

The monthly traffic of the France 24 websites is around 5 million unique visitors. A more geeky metric is that the site runs 300-400 concurrent active Apache threads at all times.

The migration scope

This was not a simple migration. Indeed, since the first migration to Drupal 5, some lessons were learned, and a few technical choices had scalability issues. So it was decided to restart the code from scratch.
Then, we wanted to add much more flexibility to frontpages and stories. The structure was quite rigid, especially in the frontpages.

Plus, we were not going to make one migration, but two migrations at the same time: France 24 and RFI. The RFI website was based on ASP.NET with a homegrown CMS. And all of that had to be done in 6 months.

This made for an interesting experiment: applying the object specialization paradigm to full-scale website development in Drupal. Can we work faster by sharing some of the code?

Developing two websites at once

Well, that works quite well. We finished the websites in time (RFI to go live in a couple of weeks), and we gained time every day by working on both websites at the same time.

Basically, we have three sets of modules: "AEF", which is shared by both sites, RFI-specific modules, and France24-specific modules. In the AEF set, we define Views, content types, basic templates, taxonomies, and so on. We create the big common features: menu, tabs, easy views, externodes, etc. Then we specialize them if necessary in the RFI and France24 modules. For example, we can add a field to a content type, a filter to a view, or an RFI/France24-specific process like the fetching of videos for the France24 video-on-demand service, or the RFI radio editions.

As a result, more than half of the code is common, with a quarter specific to each of RFI and France24.

The project development

For this project, we were a team of 9 developers/projects leaders, and 2 sysadmins.
We worked using the Scrum methodology, with short 3-weeks sprints: 2.5 weeks of development, followed by a demo to the journalists for feedback. That way, they could easily monitor the progress of the project and provide feedback *early*.

Main concepts

Multimedia Element: One thing we learned with the first version of the site is that people often ask for a new feature that already exists, hardcoded, somewhere else on the website (e.g., a carousel of images is requested for the frontpage, but that element is already available, hardcoded into the story content type).
So we created the concept of the "Multimedia Element", which is a mix of videos, sounds, slideshows, carousels, links to stories, Twitter, external links, text, quotes, etc. Everything on the website is either a story or a multimedia element that can be placed in a box anywhere in a story, on the frontpage, in a special report, etc.

CCK Formatters: The usual way to theme nodes in Drupal would be to theme the node page template and the Views item template. Unfortunately, this approach means that it's not easy to reuse the themes elsewhere (e.g. in a multimedia element). So we are making heavy use of the standard CCK formatter concept: we create node themes as CCK formatters for each content type, and we can reuse them easily.

Contrib modules and homegrown modules

We are using quite a lot of contrib modules: 35. Among the classics, the lightweight Composite is used instead of Panels for the frontpage.

We also developed quite a lot of homegrown modules that you can preview in our Drupalcon Paris presentation. This includes:

  • AEF Easy View: Views is a very powerful tool, but too complicated for journalists: can you see them creating their own view, and somehow including the result in their story? This module is a CCK field that lets you easily configure an existing view and put it in your story. You can even choose the theme of the results, and whether they are shown as a carousel. For example, say you want to show all the latest stories with the tag France. Select the tag, select the number of stories you want to show, the theme, preview it, and you're done!
    Also, you can reorder the results of the automatic list, making it a manual list. Or you can keep it automatic even if you have reordered results.

  • AEF Multimedia Element: This is the multimedia element content type as described earlier, plus the FCKEditor plugin to insert them in the body of a story. A very powerful module.
  • AEF Externodes: A module allowing one Drupal installation to access another Drupal installation's nodes remotely, using Nid/Fid Address Space abstraction, and executing Views remotely. For example, if you're looking at node 100000001 on Drupal 1, you are in fact looking at node 1 on Drupal 2. This is truly powerful, allowing you to save SQL CPU time by querying nodes on a remote Drupal installation and to share a set of nodes between installations. A good example we're going to use it for is image nodes. A collection of 20,000 image nodes will be put on a separate Drupal installation, which we will be able to plug into other Drupal installations. And the heavy SQL full-text search on these images by journalists will be done on a database other than the main one. I strongly recommend watching the videos to see it live.

  • AEF Image: A very powerful image CCK field. When you upload an image, you see it in all the different imagecache presets used on the website. And you can scale/crop each one of the presets differently using JCrop: you are overriding an automatic imagecache preset. Useful when the automatic crop is cutting off someone's head :) Plus, you can even upload and scale/crop another image for a given preset. And finally this field supports both the direct image-upload approach and the image-as-a-node approach. See it live on the video!

  • AEF Editor Toolbox: A small fixed frame where you can search stuff, manage bookmarks, history, and search your image collection.

  • AEF Embedded Edit: Do everything in a single window! Creating an image, then searching for it, inserting it in a multimedia element, saving it, then going to your article, searching for your multimedia element and inserting it... can be quite a lengthy process. With this module, you create your image directly on the same page in an iframe, and when you save, the nodereference of the multimedia element is automatically filled with the result. And you can edit/view every node referenced from a nodereference.
  • AEF Formatter Selector: Have you ever been frustrated by the fact that there is no way in Drupal to select a theme in the node edit form? And no contrib module for it? Well, this module does it! It lets you select a theme from a list of themes under each nodereference you selected.

39 more generic "AEF" modules were also developed.
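
The Nid address space idea behind AEF Externodes can be illustrated with a few lines of Python. This is only a sketch and not the module's actual code: the block size of 100,000,000 is inferred from the single example given above (node 100000001 mapping to node 1), and the function names and the idea of an "installation index" are assumptions made for illustration.

```python
from typing import Tuple

# Assumed size of the nid range reserved for each installation.
# Inferred from the example "node 100000001 is node 1 on the remote Drupal".
BLOCK_SIZE = 100_000_000


def to_global_nid(installation_index: int, local_nid: int) -> int:
    """Map a nid local to a given installation into the shared address space.

    Index 0 stands for the local installation, index 1 for the first
    remote installation, and so on (hypothetical convention).
    """
    if installation_index < 0:
        raise ValueError("installation_index must be >= 0")
    if not 0 < local_nid < BLOCK_SIZE:
        raise ValueError("local_nid does not fit in one address block")
    return installation_index * BLOCK_SIZE + local_nid


def resolve_nid(global_nid: int) -> Tuple[int, int]:
    """Return (installation_index, local_nid) for a nid seen by the site."""
    return divmod(global_nid, BLOCK_SIZE)


if __name__ == "__main__":
    # Node 100000001 resolves to node 1 on the first remote installation.
    print(resolve_nid(100_000_001))
    # An ordinary local node stays in block 0.
    print(resolve_nid(42))
```

With a scheme like this, any code that receives a nid can tell from the number alone whether to load the node locally or ask a remote installation for it.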

The server architecture

Our server infrastructure is basically laid out as follows:
First, the Akamai CDN, which acts as a giant reverse proxy and saves our servers from 90% of hits.
Then, 4 load-balanced Apache servers, each one sharing the same webroot through an NFS mount.
Finally, a replicated MySQL database linked to the Apache servers at 1 Gbit/s.

Problems we encountered, lessons we learned

Problem encountered: Before the migration, on the first version of the site, we had some MySQL slow queries with a cron we made that sometimes crashed the database.
Lesson learned: Choose your data model very carefully. Be very careful with the database. That's the only part that is not scalable. You can add as many Apache servers as you want, but there is only one database.

Problem encountered: While working on reducing the amount of traffic between the database and Apache, we found out that 80% of the traffic was due to Lightbox2, which was unnecessarily generating thousands of CCK formatters. These formatter definitions were stored in cache tables and transferred to Apache on each page load. If we hadn't found that out, the server infrastructure would probably have collapsed, with 1.2 Gbit/s of traffic on a 1 Gbit/s wire between Apache and MySQL.
Lesson learned: Take your average number of active Apache threads, multiply it by the SQL *data size* transferred for a page, and check your wire capacity.
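
As a rough illustration of this rule of thumb, here is a small Python sketch. The function name, the assumption that each active thread serves about one page per second, and the 375 KB of SQL data per page are all hypothetical: the per-page figure was picked only because, with 400 threads, it reproduces the 1.2 Gbit/s mentioned above. Measure your own values before drawing conclusions.

```python
def required_bandwidth_gbit_s(active_threads, sql_bytes_per_page,
                              seconds_per_page=1.0):
    """Estimate the Apache <-> MySQL traffic in Gbit/s.

    active_threads:      average number of active Apache threads
    sql_bytes_per_page:  SQL data transferred to build one page, in bytes
    seconds_per_page:    assumed time a thread spends on one page
    """
    bits_per_second = active_threads * sql_bytes_per_page * 8 / seconds_per_page
    return bits_per_second / 1e9


if __name__ == "__main__":
    wire_capacity = 1.0  # Gbit/s link between Apache and MySQL
    # 400 threads and 375 KB per page are illustrative values only.
    needed = required_bandwidth_gbit_s(400, 375_000)
    print(f"needed: {needed:.2f} Gbit/s, capacity: {wire_capacity:.2f} Gbit/s")
    if needed > wire_capacity:
        print("The wire would be saturated: reduce the SQL data per page.")
```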

Problem encountered: When we hit the "migration" button, all the Apache servers went crazy at 200M of load. After some investigation, we found out that they were simply swapping like hell.
Lesson learned: Take your average number of active Apache threads, multiply it by the average memory usage of a page (in our case 30M), and check that your Apache servers have enough RAM.
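
The same back-of-the-envelope check can be written down for memory. The sketch below uses the figures quoted in this case study (up to 400 active threads, 30 MB per page, 4 Apache servers); the function name and the assumption that threads are spread evenly across the servers are mine, and the result ignores the memory needed by the operating system and other processes.

```python
def ram_per_server_mb(active_threads, mb_per_page, servers):
    """Estimate the RAM (in MB) each Apache server needs for page generation.

    Assumes the active threads are spread evenly across the servers.
    """
    if servers <= 0:
        raise ValueError("servers must be a positive number")
    return active_threads * mb_per_page / servers


if __name__ == "__main__":
    # 400 threads * 30 MB = 12000 MB in total, shared by 4 servers.
    print(ram_per_server_mb(400, 30, 4), "MB per server")
```

If the result is close to or above the physical RAM of a server, that server will start swapping under normal load.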

Problem encountered: Before the migration, on the first version of the site, loading a France 24 page in a browser was quite slow. I am talking here about the user experience in the browser: the total loading time you can see in the Net tab of Firebug, once all JS, CSS and images are loaded. We were at about 6-8s, and the user experience was not that good, which may sound weird since we had Akamai caching our files.
In fact, part of this was due to quite a large number of JavaScript files, including one tracking file that was loading 9 more JavaScript files. Now, on the new version, we have a *much* better loading time, 2-4s.
Lesson learned: Aggregate your JavaScript files!! In fact, your browser can load images and CSS concurrently, but it will load JS files *sequentially*! And don't forget to also aggregate your CSS files.
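
To make the idea concrete, here is a minimal Python sketch of what aggregation means: several script files are concatenated, in order, into a single file so the browser makes one request instead of many. This is not how the France 24 site does it (Drupal 6 has its own JavaScript optimization option in the performance settings); the function name and file names are hypothetical.

```python
from pathlib import Path


def aggregate_js(paths, output):
    """Concatenate JavaScript files, in order, into a single output file.

    A ';' is added after each file so that a file missing its final
    semicolon cannot break the file that follows it.
    """
    chunks = []
    for path in paths:
        source = Path(path).read_text(encoding="utf-8")
        chunks.append(f"/* {Path(path).name} */\n{source.rstrip()}\n;")
    output_path = Path(output)
    output_path.write_text("\n".join(chunks) + "\n", encoding="utf-8")
    return output_path


if __name__ == "__main__":
    import sys

    # Usage: python aggregate.py all.js first.js second.js ...
    if len(sys.argv) >= 3:
        print(aggregate_js(sys.argv[2:], sys.argv[1]))
```

The order of the input files matters: a script must come after the libraries it depends on, exactly as with separate script tags.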

Problem encountered: As we were developing the website, the Apache 2 computing time got longer. Often, we were able to dramatically reduce the loading time by commenting out a line or two.
Lesson learned: There is always room to reduce your loading time, and slowness is usually due to simple mistakes. Put timers in your code, display the time needed to generate each part of the page, and narrow down the part eating the most CPU time.
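
The timer approach can be sketched in a few lines. The example below is in Python rather than PHP and is not the code used on the site; the names `timed`, `timings` and `slowest` are made up for illustration. The principle carries over directly: wrap each part of the page generation, accumulate the elapsed time per label, and look at the biggest numbers first.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock time per label, in seconds.
timings = {}


@contextmanager
def timed(label):
    """Measure the time spent in a block and add it to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[label] = timings.get(label, 0.0) + elapsed


def slowest(count=3):
    """Return the `count` most expensive labels, slowest first."""
    ranked = sorted(timings.items(), key=lambda item: item[1], reverse=True)
    return ranked[:count]


if __name__ == "__main__":
    with timed("menu"):
        time.sleep(0.01)   # stands in for building the menu
    with timed("frontpage boxes"):
        time.sleep(0.05)   # stands in for rendering the boxes
    for label, seconds in slowest():
        print(f"{label}: {seconds * 1000:.1f} ms")
```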

Open-sourcing

We've announced that we are going to open source this code, all 45 AEF modules. First, we need to release the RFI site and package the code, removing the bits of non-generic stuff that may remain on AEF modules.
This should be done by the end of the year; meanwhile, you can see it live in this Drupalcon Paris presentation.

Conclusion

Every development team has a given level of expertise in development. And having an excellent base such as Drupal for a project allows us to increase the final quality of this project.
By contributing (soon) these new Drupal modules, we hope to help strengthen the newspaper module base, and we want to thank the Drupal community for this wonderful product.

Comments

ndeschildre’s picture

The images that should appear on this presentation:

New France24 screenshot:
http://drupal.org/node/614000

Homegrown module "AEF Easy View":
http://drupal.org/node/614002

Homegrown module "AEF Externodes":
http://drupal.org/node/614004

Homegrown module "AEF Image":
http://drupal.org/node/614008

Homegrown module "AEF Editor Toolbox":
http://drupal.org/node/614010

Homegrown module "AEF Formatter Selector":
http://drupal.org/node/614012

xmacinfo’s picture

Congratulations for the switch to Drupal 6 and I look forward to all the AEF modules that you will release.

As for me, I would like to know more about the video streaming modules and workflow you are using.

ndeschildre’s picture

OK, concerning the workflow, we have none for France24: journalists can directly publish content. But we do use checkout ( http://drupal.org/project/checkout ) to prevent two journalists from editing the same article simultaneously.
At RFI, we are using the workflow module ( http://drupal.org/project/workflow ), which we have configured.

Concerning video streaming, we don't really use a module for that: we have a video stream from our provider yacast, and we put this stream in a Flash player in a preprocess function. So that's basically theming.
We have developed a few helper functions for that, but that's no big deal, just helper functions.
The more advanced stuff you're seeing (timeline and links back to the shows at a given time) is more France24-specific functionality.

yelvington’s picture

The workflow question is interesting. I would think France24 would need workflow support more than RFI, just based on what the RFI folks told me (when I visited last month) about the somewhat balkanized nature of their operation.

Do you have something in your data model to tie together the English, French and Arabic versions of a story?

I've been avoiding the Workflow module, but based on recent conversations with folks at some of our Morris newspapers, I think I'm going to have to get familiar with it soon.

Journalists at our newspapers quickly come to prefer working in Drupal to working in the legacy newsroom content management systems, at least for breaking news, and we haven't adequately confronted the implications of that.

ndeschildre’s picture

Concerning the data model:
We have three different databases, one for each language. While it may look like a bad design decision (an update needs to be done three times, and configurations may get out of sync), it has proven to be a lifesaver in production. An example: when MySQL decides to mark a table as crashed. It happens from time to time, and when the table concerned is node or node_revision (it happened), it's better to have only one language down. Another example: let's say the French database is too big and the server is overloaded. We can just move it to another DB server, while keeping English and Arabic on the first one.

Concerning the workflow, you have two choices, which are well illustrated by the France24 and RFI choices. Either you trust your journalists to have a minimum of self-discipline and to communicate a lot, and you go without a workflow system (what France24 chose). Or you put formal rules in place, with some people allowed to publish and others not (what RFI chose).
The former is better for a smaller team, with all members close by, where informal rules can be followed. Content is produced and published much more quickly. But some mistakes may happen (like content being unpublished by mistake).
The latter is better for a bigger team, with members spread across different places, where informal rules are more difficult to follow. Content is produced more slowly, since each workflow step takes time, but mistakes are much less common.

It really depends on the team of journalists.

larowlan’s picture

Wow.

These image handling, simple view and media element embedding modules look fantastic.
Really great features here for usability. Can't wait until these modules are released to the public.

Congratulations on a job well done.

Lee Rowlands

--author="larowlan <larowlan@395439.no-reply.drupal.org>"

dshaw’s picture

Thanks for the good news and the brief case study. In particular, the list of problems found and how you solved them is excellent. Hopefully this will save a lot of time for others down the track.

JayNL’s picture

really excellent Drupal implementation, very well done.

magnusproject’s picture

that's good

Delta Bridges’s picture

Congratulations and thanks for sharing the modules.... this is great :)
Jean-Jacques

Aniara.io’s picture

Great job and write up. France24 rocks. :)

Roi Danton’s picture

Congratulations for the project release in time and foremost thanks for sharing the results of your work with the Drupal community!

I'm wondering about the way you choose to include the multimedia elements into textfields. Do you paste the actual element HTML/code into the text or do you use a filter that replaces a tag with the corresponding multimedia element?

ndeschildre’s picture

Thanks!
You can see how multimedia elements work on this image: http://drupal.org/node/627416

It works as follows:
First, put your multimedia element in the nodereference list (first image). Then click the FCKEditor multimedia element icon, and you get the dialog in the third image. You select the multimedia element, the alignment (left, right, center) and the theme (fixed width, full width). Click OK, and you get the little square you see at the right of the first image, in the FCKEditor. You can then drag it or re-edit it.
The result is what you can see on the second image.

Example of stories with multimedia elements:
http://www.france24.com/en/20091108-iraq-parliament-approves-much-awaite...
Example of frontpage *full* of multimedia elements:
http://www.rfi.fr (each carousel is a multimedia element)

PlayfulWolf’s picture

I see a site that can hold its ground against bbc.co.uk and the like, but unlike with the British one, the Drupal community will benefit. I am anxious to see all those AEF modules, especially "AEF Externodes".

ndeschildre’s picture

Thanks!
But please be aware that the AEF Externodes module is one of the few that will require a high level of expertise to use. Indeed, it has a number of limitations, and if not used correctly, it may lead to a much higher server load.
It is at the moment used in production by RFI and France24 for images (one separate Drupal, RFI only) and RSS stories (another separate Drupal).

dstuart’s picture

Hi,

Quick question: why did you need a separate Drupal instance to handle images to help distribute load, instead of using a CDN or something similar?

ndeschildre’s picture

Hello,

We do already have a CDN, Akamai.
The rationale is mainly to be able to reuse the collection of images in the different websites to come, and also to remove the load generated by the full-text search of images by journalists (plus to limit the overcrowding of the node table).

GeorgeLitos’s picture

very good work, an inspiration to all of us!

I don't know if you did it on purpose, but instead of a 404 error page you get a link to an
"Access Denied" install.php page

EDIT: if the link is
http://www.france24.com/en/XXX you get a 404 error page
if it's not (http://www.france24.com/123) you get:
Access Denied
You don't have permission to access "http://www.france24.com/install.php" on this server.

ndeschildre’s picture

Thanks for pointing this out.
We are already aware of this and working on it, along with some other 'oopses' in our redirect rules :)

slippast’s picture

I think that releasing some of those modules (especially the image workflow) is a game changer not only for news publishing but for the Drupal system at large. Since I started working with open source CMSs a few years back, I've been continually surprised that no one has really created a useful image workflow; it seems fundamental to me.

Anyway, kudos to you for your site, which is top notch; but also a major 'thank you' for opening up those modules!

jorge’s picture

Amazing work and you should all be commended.

voipfc’s picture

Good news - Congratulations on a website which really showcases Drupal's capabilities.

Bad news - I am one of those guys who can't help finding fault, but in this case it is not a Drupal issue, but the font used.
Somehow the font doesn't look right to me. I don't know its name, but it feels somewhat ungainly, and the line spacing is higher than necessary.
I am a fan of serif fonts like Georgia and Palatino.

NYTimes.com is my favorite front page.

wetchina’s picture

hi! amazing project!
but what module did you use to create the appearing sub-menu when clicking on an arrow of the main menu item?

ndeschildre’s picture

Thanks.
The whole menu is a homemade module of the AEF pack, that also will be open-sourced.

aac’s picture

I am looking for the module to achieve the mega menu functionality.
I have not found the menu module in the list provided at
http://drupal.org/node/665756
Could you please provide any kind of help!!
Thanks

---~~~***~~~---
aac

dalin’s picture

Be very careful with the database. That's the only part that is not scalable.

With Pressflow Drupal (which you should be using anyway since you are concerned about performance) you can send read queries to the slave server while keeping the write queries on the master. This is the same technique that drupal.org uses.

________________________
Dave Hansen-Lange
Director of Technical Strategy, FourKitchens.com

ndeschildre’s picture

Hello,

I wasn't aware of this derivative of the Drupal core. I will look at that.
We are likely to use MySQL Proxy in the future, which does what you described without touching the Drupal core.

toma’s picture

Good work, thanks for sharing your modules

markus_petrux’s picture

Re: "9 developers/projects leaders, and 2 sysadmins"
Re: "with short 3-weeks sprints: 2.5 weeks of development"

Measured as man/days that's (say) 3 weeks * 11 people = 33 weeks. Is that correct? Is that all the time spent on this project?

I'm curious about this because I'm working alone on another project, and it's the first time in my life I've done something similar. I used to work at a big company, in a field not related to the internet. :-/

Thanks for the write up. It's much appreciated.

Doubt is the beginning, not the end of wisdom.

ndeschildre’s picture

Hello,

We were working in 3-week sprints, meaning we were setting goals for three-week periods, using the Scrum methodology. But the whole project lasted 6 months, so that's more like 5 man-years :)

Btw, I was quite eager to see CCK3 coming; too bad it didn't come in time. We ended up using a modified version of the multigroup module (which disables the DnD functionality, a source of problems) with CCK2.

markus_petrux’s picture

I started the project in September 2008. It's been a bit more than a year, and that was making me feel somehow bad, though I expect to get the first phase ready to launch soon. Our projects are not comparable, but I still appreciate your input. Thank you.

Oh, sad to hear that multigroup was not ready for your project, maybe next one :) It would have been nice to see a post on the Drupal front page about it. It's been a challenge for several reasons (complexity, and probably coming too late in the life cycle of CCK2).

Doubt is the beginning, not the end of wisdom.

aleksey.tk’s picture

Great! When can we expect the codebase to be opened?

yelvington’s picture

Nicolas has uploaded seven AEF modules that I've counted so far.

ndeschildre’s picture

Indeed, I'm slowly uploading them one by one, checking them for stand-alone use, then I'll have to do my favorite part... documentation, yipee!
I still expect to have the full pack online by year's end.

Starminder’s picture

Seems many of the modules require "JQ"...I haven't been able to find that one, thanks!

EmilOberg’s picture

http://drupal.org/project/jq

If you're having problems finding modules you know the name of, my recommendation is to go to http://drupal.org/project/[projectname] or do a Google search for "drupal [modulename]".

arnieswap’s picture

Which module was used for subscription form?

EmilOberg’s picture

I'm not the noble soul behind this project, ndeschildre is. But I bet he'll see your message

ndeschildre’s picture

Oops, forgot to look here for a while.

Which submission form are you talking about?

Summit’s picture

Hi,

This is absolutely brilliant! Thanks for all your hard work, and looking forward to testing the modules!

ndeschildre’s picture

Here is the list of open-sourced modules:
http://drupal.org/node/665756

tsvenson’s picture

Wow, this seems to be exactly what I have been looking for. I am working on a media site and have been researching for a long time the best way to manage images and other files. I am right now installing most of the modules and will give it a test run.

aac’s picture

Thanks for such a nice writeup and website.

---~~~***~~~---
aac

MaT972’s picture

Two thumbs up!

I'm just adding a link to the newspaper modules release article which points to all modules:

http://drupal.org/node/665756

bkraft’s picture

I think this is great work and I am glad to see it being contributed back to the community. We can all only wish for more organizations doing so.

Regarding Magnolia CMS, I would like to clarify that France24 did work with a service provider that had no Magnolia CMS training and was no official partner, so the story is one of qualification, not product. To give the impression Magnolia had technical issues is unfair. It should also be noted that France24 did in fact successfully run on Magnolia for obviously 1.5 years. Thanks.

ndeschildre’s picture

Indeed, you're right. I don't have the exact details since I was not there back then, but I heard that the core of Magnolia was modified quite a lot, which is of course not good, and that was probably the main source of the stability problems.

I don't know Magnolia, it may work quite fine, but, like Drupal, hacking the core is definitely a bad thing.