Course-Notes.Org Homepage Drupal

In March 2012, we decided that the site needed a Drupal back-end "overhaul" and front-end "facelift". We wanted to shift as much of the site's functionality as possible to core Drupal modules and move away from several modules that were unsupported in Drupal 7. We also wanted to add a bunch of new and nifty things that were the talk of the town in Drupal 7. Initial meetings yielded the following priorities:

  1. Clean up the database and organize the file structure of the existing Drupal 6 site.
  2. Upgrade to Drupal 7 using core modules whenever possible.
  3. Redesign / rebrand the website and introduce a responsive theme.
  4. Improve the user experience.
  5. Optimize site performance and speed.

We will expand on each of these items in the next section.

Why Drupal was chosen: 

Course-Notes.org provides free notes, flashcards, practice quizzes and other resources to students to help them study more effectively. Originally created as part of a class assignment for AP US History in December 2002, Course-Notes.org has grown to become one of the leading education resources for high school students taking AP classes.

The website has expanded its content offering from AP US History to 20 different subjects, ranging from chemistry and art history to psychology and calculus. As the number of subjects increased, so did the popularity of the site. The website has over 135,000 registered members and receives close to 1 million visitors per month.

Course-Notes.org had very humble beginnings: it was originally developed (if you can even call it that) using Microsoft FrontPage in December 2002. In early 2005, when Ctrl+H (Find and Replace All) to make site-wide changes was no longer doing the trick, the site was migrated to Mambo. Soon thereafter it was migrated to Joomla, but when Joomla’s three-level hierarchy limit made it impossible to properly organize content, it was time to move on.

Enter Drupal. As Course-Notes.org has been bootstrapped since day one, it has repeatedly relied upon open source, community solutions. Very few requirements could not be met by Drupal core and contributed modules. Currently, Course Notes uses over 100 contributed modules (and has probably tried hundreds more!).

We are proud of our “little” site and what Drupal has enabled us to do.

Describe the project (goals, requirements and outcome): 

1) Clean up the database and organize the file structure of the existing Drupal 6 site

When I first started working on the original Drupal 6 site, I knew it was going to be a challenge. There were essentially two main problems with the site: 1. hundreds of database errors and issues and 2. thousands and thousands of files that had been stored directly in the default/files directory.

The site had obviously seen a ton of evolutions (especially problematic were lingering issues from one of the early versions, a fused vBulletin/Drupal site). The site used db_prefixes since it was a hybrid site/database, which made it incompatible with Aegir hosting, where I do most of my development and hosting. It was also a rather big database with lots of unused tables that had lingered through each new version, even after the module had ceased to be used. A few checks with development tools revealed there were several hundred “table errors” in that initial site.

My initial goal was to see if I could get some version of the site migrated to Drupal 7. So my initial cleaning effort was done via direct SQL manipulations and by disabling and uninstalling as many modules as possible. Since we were aiming for a Drupal 6 to 7 migration, I figured we should use core and contrib to do this. (Arguably I could have (and maybe should have!) used the Migrate module.) So slowly but surely (emphasis on slowly) I got rid of enough tables and modules so that finally a version of the site landed in Drupal 7, warts and all.

Once I had gotten an initial version of the site in Drupal 7, I began recreating content types and Views structures in Drupal 7 and exporting them via Features. By this point, I had collected enough of the site’s essence in Drupal 7 that I could drop the current database and start developing the Drupal 7 site as such, including checking on ported modules and working on the site redesign. This allowed us to work on the Drupal 7 site, even though the database cleanup and true migration weren’t done yet.

While I had gotten a version of the site into Drupal 7, it had been done by removing anything that broke the migration/upgrade path. The task now was to turn the database cleanup, pre-upgrade preparation, migration and site re-launch into a series of more or less automated steps. To do this, I started creating several bash scripts that used SQL and Drush to break the whole process into several “cleansing” processes so I could debug in a methodical manner.

The initial cleansing script fixed the db prefix issue and dropped various unused tables. A subsequent script used various SQL queries to fix table errors. And a final “cleaning” script disabled and uninstalled any modules that were no longer used and removed their tables. At this point, the Drupal 6 site was still working and the database was “fixed” and in a valid format.
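
The prefix part of that first cleansing script can be pictured with a small sketch. This is an illustration in Python rather than the bash and SQL actually used, and the prefix `drupal_` and the table names are assumptions made up for the example. It only builds the MySQL RENAME TABLE statements, and it skips any rename that would collide with a table that already exists.

```python
def strip_prefix_statements(tables, prefix):
    """Build RENAME TABLE statements that drop `prefix` from table names.

    Tables without the prefix are left alone. A rename is skipped when the
    unprefixed name is empty or already taken, so nothing gets clobbered.
    """
    existing = set(tables)
    statements = []
    for name in sorted(tables):
        if not name.startswith(prefix):
            continue
        target = name[len(prefix):]
        if not target or target in existing:
            continue  # needs a manual decision, not an automatic rename
        statements.append(f"RENAME TABLE `{name}` TO `{target}`;")
    return statements


# Hypothetical table list: two prefixed Drupal tables, one leftover forum
# table, and an unprefixed `node` table that blocks one of the renames.
tables = ["drupal_node", "drupal_users", "vb_post", "node"]
for statement in strip_prefix_statements(tables, "drupal_"):
    print(statement)
```

Generating the statements first and reviewing them before running anything fits the "debug one cleansing step at a time" approach described above.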

With the database fixed, the next problem to fix was the files directory. As a notes exchange site, Course Notes gets thousands of document submissions, which unfortunately weren’t properly uploaded into any structured directories. FTP and SSH had issues even accessing the files directory.

My initial fix was to move all of the files from default/files to default/files/past and then rewrite the database accordingly. I then used a wonderful module, File (Field) Paths (http://drupal.org/project/filefield_paths), which not only allows you to use tokens to set paths but also provides a way to “retroactively update” files. I made a few adjustments and then saved the file fields. A batch process started and I watched as file after file was moved into subject-specific directories. Wow!
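
To make the idea of subject-specific directories concrete, here is a rough sketch in Python. It is not how File (Field) Paths works internally; the function names, the slug rule and the sample file are assumptions for illustration. It only plans the moves (old path, new path), which is the same information needed to rewrite the file paths stored in the database.

```python
import posixpath
import re


def slugify(text):
    """Lowercase `text` and collapse anything non-alphanumeric into '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def plan_moves(files, base="default/files"):
    """Return (old_path, new_path) pairs that sort files by subject.

    `files` is an iterable of (path, subject) pairs. Files that are
    already in the right directory produce no move.
    """
    moves = []
    for path, subject in files:
        target = posixpath.join(base, slugify(subject), posixpath.basename(path))
        if target != path:
            moves.append((path, target))
    return moves


# Hypothetical upload sitting in the temporary "past" directory.
for old, new in plan_moves([("default/files/past/ch1.pdf", "US History")]):
    print(old, "->", new)
```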

2) Upgrade to Drupal 7 using core modules whenever possible.

While the database had been cleansed, there were still several problems in migrating the site. The major ones were:

a) Modules not yet ported or fully ported to Drupal 7

When we started the migration work, we created a list of modules that were “Must Have” for migrating to Drupal 7, along with their Drupal 7 status or alternatives. As a bootstrapped project, Course Notes has tried a lot of modules and was using several rather small modules that weren’t ported. We prioritized and helped fix or port several modules, like Text Extract for Drupal using Tika and Quizlet, and helped fix a few others, like Quiz. Converting the site’s content types and Views to Features made the migration a bit easier since we could then apply changes to fields by enabling and reverting the Features modules.

Obviously this functional migration step is a delicate one and in a few cases we needed to completely disable and reinstall the module in the new site to avoid errors. In other cases, like the Premium module, which is a popular module in Drupal 6 but had no port to Drupal 7, we were forced to use an alternative module, nopremium, along with some SQL queries to update preexisting content.

b) Multiple-use file references that break upgrades

One of the biggest migration issues we had was with file path conflicts in file fields in Drupal 7 and CCK field conversions. The problem was that multiple fields were referencing the same file in Drupal 6, which wasn’t a problem there. Specific changes in Drupal 7 core require unique file references, and even though there were technical solutions and hacks to core to make this work in spite of the problem, I didn’t like the idea of a “patched” core. So I used various SQL queries to find the duplicates, then used some bash code to copy and rename the duplicate files, and then redid the database references. This is likely not a solution for all sites but for me, it fixed the integrity errors and moved the migration process along.
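
The copy-and-rename logic can be sketched roughly as follows. The original work was done with SQL queries and bash; this Python version is only an illustration, and the reference identifiers and the `_1`, `_2` naming scheme are assumptions. The first reference keeps the original file, and every additional reference gets its own uniquely named copy.

```python
import posixpath
from collections import defaultdict


def plan_duplicate_copies(references):
    """Plan one private copy for every extra reference to a shared file.

    `references` is an iterable of (ref_id, path) pairs. Returns a list of
    (ref_id, original_path, copy_path) tuples; copy names never collide
    with an existing path or with another planned copy.
    """
    by_path = defaultdict(list)
    for ref_id, path in references:
        by_path[path].append(ref_id)

    taken = set(by_path)
    plan = []
    for path, ref_ids in by_path.items():
        stem, ext = posixpath.splitext(path)
        counter = 1
        for ref_id in ref_ids[1:]:  # the first reference keeps the original
            candidate = f"{stem}_{counter}{ext}"
            while candidate in taken:
                counter += 1
                candidate = f"{stem}_{counter}{ext}"
            taken.add(candidate)
            plan.append((ref_id, path, candidate))
    return plan


# Hypothetical references: two fields point at files/a.pdf.
refs = [
    ("field_notes:1", "files/a.pdf"),
    ("field_essay:7", "files/a.pdf"),
    ("field_notes:2", "files/a_1.pdf"),
]
for ref_id, original, copy in plan_duplicate_copies(refs):
    print(f"{ref_id}: copy {original} to {copy}")
```

Each planned tuple corresponds to one file copy on disk and one updated database reference.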

These file path fixes were run as part of a final “clean up and prepare for migration” bash script. This script fixed some of the miscellaneous errors that remained but couldn’t be fixed earlier without breaking the Drupal 6 site in some way. This script also ran a bunch of Drush commands disabling and/or uninstalling all non-core modules. I also used the Variable Cleanup module to check for unused variables and deleted several hundred variables from the variable table.

At this point, the functional essence of the site could migrate to Drupal 7. For those of you that haven’t done a Drupal 6 to 7 migration, it’s important to do it in several stages when you have a lot of modules. So first migrate just the core modules, then slowly add more code and contrib back into your site and run update.php (better to use Drush!). It was a bit of trial and error to figure out which modules needed upgrading first but eventually the order worked. Afterwards, of course, I used CCK field migration to migrate all the CCK fields to Drupal 7 fields.
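
The order here was found by trial and error, but if the dependencies between modules are written down, a workable order can also be computed. The sketch below is only one possible approach, using Python's standard graphlib module (Python 3.9 or later). The dependency map is a made-up example and not the site's real module list.

```python
from graphlib import TopologicalSorter


def enable_order(dependencies):
    """Return module names so that every module comes after its dependencies.

    `dependencies` maps a module name to the set of modules it requires.
    A graphlib.CycleError is raised if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical dependency map for a handful of contrib modules.
dependencies = {
    "ctools": set(),
    "views": {"ctools"},
    "advanced_forum": {"views", "ctools"},
}

for module in enable_order(dependencies):
    # Each name would be one "enable, then run the database updates" stage.
    print(module)
```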

At this point, I used an additional bash script that applied all of the new Features and enabled the various modules and changes that had been built. A few changes were needed here and there for the UI but it was mostly seamless.

c) Converting core profile stuff to fieldable user entity

The final step, which was more about future compatibility than strictly needed for migration, was converting core profile to the fieldable user entity. Depending on your experience and history with Drupal, you’ll notice two user profile modules in Drupal 7. This is because Drupal 7 moved to a fields-everywhere approach and there were unfortunately a few stragglers, namely the core Profile module, which doesn’t use fields and provides a nearly exact equivalent to the Drupal 6 version. So while the core profile data migrated more or less fine to Drupal 7 (excluding some of the minor modules that we dropped), core profile is quite functionally limited compared to the user entity. So we decided to convert it to fields on the user entity.

There are a few approaches to how to do this and discussions around Profile2 are quite helpful. In the end, I created a new user feature with the user entity fields I wanted and used some custom code that transferred field values. Since the site has so many users, this script took some time to perfect and optimize in terms of speed.
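
The general shape of such a transfer, sketched in Python rather than the custom Drupal code actually used: read the old profile values, map them onto the new field names, and work through the users in batches so a large user base doesn't have to be handled in one pass. The profile names, field names and batch size below are assumptions for illustration.

```python
from itertools import islice


def batched(items, size):
    """Yield lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def convert_profile_rows(rows, field_map):
    """Group (uid, profile_name, value) rows into {uid: {field: value}}.

    Profile values with no target field, and empty values, are skipped.
    """
    users = {}
    for uid, profile_name, value in rows:
        target = field_map.get(profile_name)
        if target is None or value in (None, ""):
            continue
        users.setdefault(uid, {})[target] = value
    return users


# Hypothetical mapping from old profile fields to new user entity fields.
field_map = {"profile_school": "field_school", "profile_grade": "field_grade"}
rows = [
    (1, "profile_school", "Central High"),
    (1, "profile_grade", ""),
    (2, "profile_school", "East High"),
]
converted = convert_profile_rows(rows, field_map)
for uid_batch in batched(sorted(converted), 500):
    for uid in uid_batch:
        print(uid, converted[uid])  # a real script would save the user here
```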

3) Redesign / rebrand the website and introduce a responsive theme.

With the database and migration work taking so much time, we hesitated about leaving the new design for post-migration work, but finally we decided (and rightly so, I think) to include a new design and theme in the new site. After spending many hours looking at websites for inspiration, taking hundreds of screenshots of design elements, and mocking up the site, we came up with a basic wireframe of “needs” and “page elements”. These initial “page elements” were reworked into a new front page design, logo and branding. This became the inspiration and base palette used for various internal pages.

For better or worse, we didn’t take a “mobile first” approach since our design was clearly for desktop. This obviously created later problems. With the PSD design, we created an initial HTML/CSS version in order to get the basic elements working and looking correct.

We opted to use the Omega base theme from the beginning. Omega provides a very solid, responsive framework and creating multiple regions is extremely easy. The other key Omega-related module that we use in several places on the site is the Delta module, which allows you to create an alternative layout that can be triggered with the Context module. This has allowed us to create major layout changes in different sections, from the front page to the dashboard, or from book pages to forum pages.

4) Improve the user experience.

High school students are under a lot of pressure these days. There once was a time when you could simply score highly on standardized exams and earn good grades and you could get into the college of your dreams. Now, in addition to getting a perfect score on your SAT and earning straight As in all of your classes, you’re also expected to be the captain of the varsity soccer team, volunteer at the local animal shelter, and take every Advanced Placement course under the sun. Students have been forced to learn how to study more efficiently, which is where Course Notes comes in.

Course Notes utilizes Drupal’s Book module to organize content that has been curated by Course-Notes.Org administrators. This content typically consists of entire sets of chapter outlines for a textbook or other study material that needs to be grouped together. Unfortunately, this meant that students were at the mercy of the Course-Notes.org administrators for finding and adding new content and content was only added when there was a complete set of outlines available.

To solve this problem, we introduced a ‘Premium’ content section that allows students to exchange individual notes, flashcards, essays, and other materials with each other. In order to access these materials, we require that students contribute back to the community by sharing their own notes, participating on the forum, writing articles for the blog, or providing feedback on other students’ work. Each of these activities earns the user a certain number of user points, or c-note$ as we call them. The User Points module handles the granting of points and automatically promotes users into the Premium content role as soon as they meet the minimum point requirement.
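
The promotion rule boils down to "add up the points, compare against a threshold". The sketch below shows that logic in Python; it is not the User Points module's code, and the point values, activity names and threshold are invented for the example.

```python
# Hypothetical point values per activity and promotion threshold.
POINTS = {"upload_notes": 10, "forum_post": 2, "blog_article": 15, "feedback": 3}
PREMIUM_THRESHOLD = 25


def total_points(activities):
    """Sum the points for a list of activity names; unknown ones earn 0."""
    return sum(POINTS.get(activity, 0) for activity in activities)


def roles_for(activities, roles=("authenticated",)):
    """Return the user's roles, adding 'premium' once the threshold is met."""
    roles = tuple(roles)
    if total_points(activities) >= PREMIUM_THRESHOLD and "premium" not in roles:
        roles += ("premium",)
    return roles


print(roles_for(["upload_notes", "forum_post"]))    # 12 points
print(roles_for(["upload_notes", "blog_article"]))  # 25 points
```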

Once control of the content curation was relinquished to the community, a number of steps were taken to ensure that this user generated content remained well organized and easy to find, and that the user experience didn’t suffer.

a) Organization
In an ideal world, users would tag their content with the most descriptive phrases making it very easy for users to find related documents. Unfortunately, this rarely happens. Integrating the OpenCalais/Social Tags module allows user generated content to automatically be scanned and related tags inserted. For example, if a user uploaded a document related to the ‘Revolutionary War’, the document might be tagged with terms like ‘George Washington’, ‘Boston Massacre’, ‘King George III’, and ‘Declaration of Independence’.

b) ‘Searchability’
Students have the option of either pasting text from a document into the body field, or attaching the document itself. A majority of the students opt to upload the document, which results in a blank body field and leaves users able to rely only on the title field to determine how useful the attachment will be to them. To solve this problem, a custom module utilizing Tika to extract the text elements from the documents was developed. This lets unregistered users view the teaser text, lets premium users view the full text without downloading the attachment, and increases the likelihood of search engines indexing the page.

c) User Experience
Students are able to upload a variety of different file types, such as Word Docs, PDFs, PowerPoints, etc. Some of these file types play better than others when formatting the extracted text from them. To provide the best user experience possible, uploaded attachments are embedded into the page using the Google Docs Viewer module. This allows users to preview the document in its original formatting prior to downloading it to their computer.

5) Optimize site performance and speed.

From a performance perspective, we decided that post-migration we’d need to think about possibly moving servers. The previous server lacked many optimizations and did not quite have the Drupal “awesome sauce” we wanted.

For anyone thinking about Drupal site performance and hosting, it’s not an easy decision, since there are a number of great Drupal hosting choices available today. The major players are Pantheon, Acquia and Omega8.cc. Their models and methods are different. If you are interested in the details, there is a Quora discussion that lays them out.

In July, we decided to migrate servers to Omega8.cc hosting. Their high performance Aegir/Nginx hosting along with various technical suggestions have significantly improved page loads. We also switched from Drupal's default search to Apache Solr, which significantly reduced server load.

While all of the above changes improved the page speed, we hit a glass ceiling (or a floor, rather?) due to the advertising blocks blocking the loading of the remaining page elements until the ad unit had completely loaded. Unfortunately, simply removing the advertisements from the site is not an option as they fund the site’s hosting and the development efforts. By utilizing Google DFP’s asynchronous ad tags to serve our impressions, we were able to drastically reduce the page load time.

Modules/Themes/Distributions
Why these modules/theme/distribution were chosen: 

Site Maintenance (Module(s): CTools, Views, Features, Strongarm, Boxes and Context)
We use the obvious site building modules that everyone should be using, like CTools, Views, Features, Strongarm, Boxes and Context. CTools exportables, Git version control and Aegir-powered site clones have made deploying new and modified Features a lot easier. The combination of these technologies has saved us a ton of headaches during dev and deployment.

Education (Modules: Quiz, Quizlet)
For quizzes and flashcards, two of our extra education-specific functionalities, we use the Quiz suite of modules and the Quizlet module. The Quiz modules allow us to provide multiple choice quizzes. The Quizlet module allows us to search and embed thousands of flashcards and vocabulary lists from Quizlet.com.

Forums / Community (Module(s): Advanced Forum, User Stats, Signatures for Forums, Author Pane, Forum Access, Chain Menu Access API)
Forums are an important part of the "community health" of the site. We've really seen the quality of Drupal's forum capacities improve over the years. Currently we use Advanced Forum along with a custom "Advanced forum style" to give us a forum look that matches the overall site.

We also use several small but helpful modules to improve forum usage, like User Stats, Signatures for Forums and Author Pane, and we manage access control via Forum Access and Chain Menu Access API. Due to the number and diversity of forums on our site, we did some major work reworking the forum front page. In addition, we’ve enabled the Private Messages module to allow users to communicate directly with each other.

Premium Content (Module(s): User Points, Node Option Premium, Content Access, Text Extract for Drupal using Tika, Embedded Google Docs Viewer)
We use User Points, Node Option Premium and Content Access modules to handle user access to premium content. Users contribute content or posts in the forum or blogs to gain enough points to get upgraded to a premium role. There was no upgrade path for our premium access control modules, so in Drupal 7, we use Node Option Premium and Content Access to control how content is visible to non-premium members.

The content uploaded by users is one of the most important resources on the site. Unfortunately, since most of this content is uploaded as PDFs, Word docs and other media files, it isn't directly part of the site as "text" and "keywords." In order to get over this problem, we extract the text elements from all uploaded content and embed them in the body field. We created a custom module, Text Extract for Drupal Using Tika, which uses Tika, a Java jar file, to pull the text from pretty much any file type. Initially we used a Java file on the server itself but later, when we shifted to Omega8.cc hosting, we used Apache Solr's Tika file to do the text extraction. The extracted text gives users an idea of the value of the contributed files and also provides rich text for search, keywords and search engines.

Content Recommendation / Rating (Module(s): Apache Solr ‘Like this’, OpenCalais, Fivestar)
In our migration to Drupal 7, one of our main goals was to get all the book pages, uploaded content and other content organized better. One of the initial pushes was to ensure all content was properly tagged by subject. This allowed us to create various Views blocks of "most recent content in X subject." We also used Apache Solr's "like this" block to create blocks to show similar content. Finally, we use the Fivestar module to allow users to rate the quality of user generated content. Overall, this allowed us to create a very rich user experience and SEO-happy book and premium pages.

We've recently been working with OpenCalais's autotagging feature so that user generated content is properly tagged with rich and specific keyword tags. This helps in terms of semantics and site search, and also with search engine optimization.

Spam Prevention (Module(s): reCAPTCHA, Spambot, Mollom)
As a site with an active user base, tons of daily visitors and lots of shared content, we are a prime target for spam. We've tried and used tons of different spam prevention modules. The first step is to prevent spam users. For that we use Google's reCAPTCHA during registration. We also use Spambot to isolate well-known spammers on the site. For content moderation, we use Mollom text analysis to block spam postings and uploads.

Team members: 
Community contributions: 

Throughout the life of this project, Course Notes has regularly worked with various modules and, where possible, has contributed back patches, fixes and feature improvements.

The two main modules we helped develop, improve and continue to maintain are:

During the migration to Drupal 7, we contributed Drupal 7 ports, patches and/or fixes to the following modules:

Project team: 

Chris Keenan is the founder of Course-Notes.org. While he may have only gotten a 3 on the AP US History exam, he likes to think that through creating such a site, he has aided many other students in getting 5s on the exam. In addition to running Course-Notes.org, he also works full-time as a healthcare consultant and attempts to stay in shape by playing as much soccer as possible.

Mark Koester is a full-time Drupaler who is known for mixing a bit of Drupal with his coffee in the morning. He has worked on a number of Drupal sites and is passionate about open sourcing his code (https://github.com/markwk/), thoughts and dreams. He graduated twice with degrees in Philosophy in English and French and currently spends his time between the US, Europe and China. You can find his extended Drupal thoughts here: http://int3c.com/blog or his tweets @markwkoester. You can seek his professional Drupal/PHP and node.js expertise at Int3c.com.

Comments

Great example of a responsive site using Omega. Congrats to you and the team.