Drupal for the EDU Enterprise (40K users?)
lhl - October 12, 2004 - 23:14
I'm really interested in looking at Drupal's suitability for a large, campus-wide educational installation.
Has anyone scaled up to this size before? What are the bottlenecks we're dealing with?
Issues off the top of our heads:
* Shibboleth for WebISO - Apache module, passes env variable for login
* LDAP population, attributes of accounts (or dynamic creation w/ first login [see above])
* Scaling for editing
* Scaling for output - creative use of Squid, Accelerators
* DB support?
We're looking at Movable Type and Roller, but honestly, neither blows us away.

I'm setting it up for one department
It's for only one department of a large university, though. I hope to be able to port (copy) it to other departments once I am done. I hope that Drupal 4.5 will suffice, but I too need the LDAP functionality to be finished/added for complete control of user information and logging.
I would love some other replies from people who are using this for uni or schools/departments. Our site is not live yet. It will be soon though and I will post it.
interested as well
Our university is in the process of switching from WebCT to Blackboard. I was not impressed by my first look at Blackboard. The components that would allow interaction seemed either poorly implemented or of limited utility. Drupal seems to allow more possibilities for collaboration and interaction. I will be interested in other people's experiences in this regard.
scaled to this size: there is a mobile-phone-related install that is at 180K users (Tipic runs this, I believe).
env variable for login/LDAP: it should be doable to create an authentication module. We've done this Drupal-to-Drupal with a FOAF module.
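A minimal sketch of the idea above, in Python rather than Drupal's PHP and not using any Drupal API: a front-end module (Shibboleth, in the original question) has already authenticated the user and exposes the login name in an environment variable, and the application creates the account on first login. The variable name `REMOTE_USER`, the `Account` class, and `login_from_environ` are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    name: str
    roles: set = field(default_factory=set)


def login_from_environ(environ, accounts, var="REMOTE_USER"):
    """Return the account for the externally authenticated user, or None."""
    name = (environ.get(var) or "").strip()
    if not name:
        return None  # nobody was authenticated by the front-end module
    # Dynamic creation on first login: unknown names get a fresh account.
    if name not in accounts:
        accounts[name] = Account(name=name)
    return accounts[name]


accounts = {}  # stand-in for the user table
first = login_from_environ({"REMOTE_USER": "jdoe"}, accounts)
again = login_from_environ({"REMOTE_USER": "jdoe"}, accounts)
anonymous = login_from_environ({}, accounts)
```

The second call returns the same account instead of creating a duplicate, and a request without the variable stays anonymous.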
scaling for editing: not sure what you mean. Permissions control editing, and usually submitters can edit their own content.
scaling for output: caching is built in. A recent test of open source CMSs had Drupal come in second (against a less full-featured system).
DB support: MySQL and PostgreSQL. By doing some fancy horizontal scaling using replication, you should get the performance you need out of MySQL.
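To make the replication idea concrete, here is a rough sketch of read/write splitting, assuming one writable primary and several read-only replicas. It is not how Drupal's database layer works; `ReplicatedRouter` and the host names are hypothetical.

```python
import itertools


class ReplicatedRouter:
    """Send SELECTs to replicas in turn and everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over replicas; fall back to the primary if there are none.
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql):
        stripped = sql.lstrip()
        verb = stripped.split(None, 1)[0].upper() if stripped else ""
        if verb == "SELECT":
            return next(self._replicas)
        return self.primary


router = ReplicatedRouter("db-primary", ["db-replica-1", "db-replica-2"])
targets = [
    router.route(query)
    for query in (
        "SELECT * FROM node",
        "UPDATE node SET title = 'x'",
        "select nid from node",
    )
]
```

One caveat with this kind of setup: replicas can lag behind the primary, so a read issued right after a write may need to go to the primary as well.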
We have experience (and some tools) to help manage mass-hosting of sites. It probably makes more sense to not have a single instance of Drupal, but rather one umbrella site plus slave sites for each group/department/class/whatever unit makes sense. You can still use a single user base (potentially with external auth in your case).
Re: MT and Roller. Both are made for individual or (at best) group blogs. Neither has many of the community-type features that Drupal has and that fit well in an educational setting, never mind support for different content types like events.
Contact me at boris AT bryght.com if you need some more pointers or would like some assistance in your investigation.
--
Boris Mann
Bryght Guy
There is also ecademy.com with over 60k users. It runs on an old, very customized version though.
For a really large site with many editors Drupal would IMHO need some locking mechanism.
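One possible shape for such a mechanism, sketched here as an optimistic revision check rather than a held lock (the post does not say which kind is meant): a save is rejected if the content changed after the editor loaded it. `Document` and `EditConflict` are hypothetical names, not Drupal code.

```python
class EditConflict(Exception):
    """Raised when someone else saved the document in the meantime."""


class Document:
    def __init__(self, body):
        self.body = body
        self.revision = 1

    def save(self, new_body, based_on_revision):
        # Accept the save only if the editor started from the current revision.
        if based_on_revision != self.revision:
            raise EditConflict(
                f"loaded revision {based_on_revision}, current is {self.revision}"
            )
        self.body = new_body
        self.revision += 1
        return self.revision


page = Document("draft")
alice_rev = page.revision  # both editors open the same revision
bob_rev = page.revision
page.save("alice's edit", alice_rev)
try:
    page.save("bob's edit", bob_rev)
    conflict = False
except EditConflict:
    conflict = True
```

The second editor gets a conflict instead of silently overwriting the first editor's change.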
--
If you have troubles with a particular contrib project, please consider to file a support request. Thanks.
hmm...
Boris, thanks for the pointers, I'll definitely be pinging you as we're interested in generating a plan for "on up" very soon and I think it'd be great to be able to implement Drupal instead of something closed-source or (and this is a personal preference) Java-based.
So, very glad to see all these responses on the page (thanks everyone).
On your tech issues:
authX: Does Drupal's pluggable auth allow authZ (loading properties that control the permissions), or is that handled somewhere else?
scaling: for editing, there are two parts I guess. One is whether, in creating a scaling infrastructure, anything needs to be done w/ 'read' vs 'write' machines, session affinity, etc., but I guess the fancy stuff in that case would probably go to the MySQL scaling. Our production machines these days are primarily Sun V240s and V440s (2x or 4x USIIIs). We have hardware load balancers (NetScalers) available that support session affinity, load-based balancing, SSL acceleration.
The other side of the editing is creating a delegated administration infrastructure (letting depts create new group blogs, etc.), although we're more interested in doing things like dynamic aggregation to create class blogs and the like.
DB: we have a much larger Oracle (production) infrastructure than MySQL, but I'm assuming the former is flat out, right?
We're interested in having each user (student/staff/faculty) maintain continuity over all their posts. For the students, this ties into our future plans for ePortfolio and our LMS systems. We'd like to do things like class blogs as aggregations so that posts remain w/ the students.
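A small sketch of what "class blogs as aggregations" could look like, assuming posts carry course tags: each post stays attached to its author, and the class blog is just a filtered, newest-first view. The `Post` fields and the tag names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    author: str
    title: str
    tags: frozenset
    posted: int  # any sortable timestamp


def class_blog(posts, course_tag):
    """Newest-first view of every post carrying the course tag."""
    matching = [post for post in posts if course_tag in post.tags]
    return sorted(matching, key=lambda post: post.posted, reverse=True)


posts = [
    Post("ann", "Lab notes", frozenset({"bio101"}), 1),
    Post("raj", "Weekend", frozenset(), 2),
    Post("raj", "Reading response", frozenset({"bio101", "eng200"}), 3),
]
bio = class_blog(posts, "bio101")
```

Nothing is copied into the class blog, so a student's posts keep their continuity after the class ends.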
As you point out, MT and Roller are both much more limited, but we're definitely looking at pieces here. We have certain collaboration tools with both our uPortal and Blackboard frameworks, and are looking into SocialText and Jot for wiki-ish interactions. We also have both personal calendaring mostly set (we run the Sun stack) and events calendaring (we're just finishing building an events repository/multi-calendaring system as nothing met our needs).
authZ: technically the permissions are handled separately, but, well anything's possible. Off the top of my head, you might settle on a set of roles with specific permissions, and set the role(s) of the user on login.
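Roughly, the "set the role(s) on login" idea might look like the sketch below: attributes from the external directory are mapped to a fixed set of roles each time the user logs in. The affiliation values and role names are assumptions for illustration, not Drupal defaults or real LDAP attribute values.

```python
# Hypothetical mapping from directory affiliation to site roles.
ROLE_RULES = {
    "student": {"authenticated user", "blogger"},
    "faculty": {"authenticated user", "blogger", "class admin"},
    "staff": {"authenticated user", "dept editor"},
}


def roles_for(affiliations, rules=ROLE_RULES):
    """Union of the roles granted by each known affiliation."""
    roles = set()
    for affiliation in affiliations:
        roles |= rules.get(affiliation.lower(), set())
    return roles


combined = roles_for(["Faculty", "staff"])
unknown = roles_for(["alumni"])
```

Recomputing the roles at every login keeps permissions in step with the directory, at the cost of discarding any roles assigned by hand.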
As killes points out, a locking mechanism for editing might be necessary.
When you say "letting depts create new group blogs" -- that's exactly what an entire Drupal site would do (my concept of a "slave" site, instead of one monolithic Drupal install). This of course does not rule out aggregation, but the combination of control (centrally managed network of Drupal sites, some default templates, permissions etc.) as well as flexibility (customize look and feel, components, etc.) at the departmental level makes this a good way to set things up.
DB: Drupal does use a DB abstraction layer, but it is untested with Oracle, and there is no out-of-the-box Oracle-compatible SQL for table creation etc. So it *could* be done, but would likely be a lot of work.
Drupal's book module is somewhat wiki-like, and could be made more so quite simply (we've done a little bit of experimentation).
Oracle
There are only 14 functions to be replicated, so the amount of work might not be that big for somebody who knows Oracle.
Don't forget reserved terms...
Well, we still have to face problems with the Drupal database schema, which contains the words uid (600 occurrences in my install) and access (a table name), both of which are Oracle reserved words...
Bad :(
See http://drupal.org/node/4907 , http://drupal.org/node/4770
Maybe I'll try it, but it seems to be a desperate task :( Anyone interested in joining me?
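As a first step, a script along these lines could list which identifiers collide with reserved words. This is only a sketch: `ORACLE_RESERVED_SAMPLE` is a small, deliberately incomplete subset of Oracle's reserved words, and the column lists in `schema` are made up apart from the `access` table and `uid` column mentioned above.

```python
# Partial sample only; the full Oracle reserved-word list is much longer.
ORACLE_RESERVED_SAMPLE = frozenset({
    "ACCESS", "UID", "USER", "DATE", "COMMENT",
    "SESSION", "SIZE", "LEVEL", "MODE", "NUMBER",
})


def reserved_identifiers(schema, reserved=ORACLE_RESERVED_SAMPLE):
    """Return (table, column) pairs that clash; column is None for a table name."""
    hits = []
    for table, columns in schema.items():
        if table.upper() in reserved:
            hits.append((table, None))
        for column in columns:
            if column.upper() in reserved:
                hits.append((table, column))
    return hits


# Illustrative schema fragment, not the real Drupal table definitions.
schema = {
    "access": ["aid", "mask", "type"],
    "node": ["nid", "uid", "title"],
}
hits = reserved_identifiers(schema)
```

Each hit would then need either a rename or consistent quoting in every query that touches it, which is where the 600 occurrences hurt.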
works well
I have assisted in the installation of a large group of Drupal sites at the University of Vienna. It's quite an interesting configuration, with multiple Drupal sites sharing users, translations, etc.
Feel free to contact me for more details.
-moshe
User base
In less than a month, SpreadFirefox.com got 20,000 registered users. It is probably the fastest growing Drupal site, although this rate is likely to drop.
To the best of my knowledge, Tipic.com has the largest Drupal user base: in May 2004 they had more than 180,000 registered users.
It must be said that Drupal's scalability does not depend on the number of registered users, but on the number of concurrent (authenticated) users (and on many other aspects of your Drupal configuration).
Dries, thanks for the numbers. Understood about the concurrent connections. Is there anyone in particular I can talk heavy-duty numbers with (irc, aim, phone works)?
Does Drupal play well w/ mmcache or ioncube? Boris mentioned built-in caching, would it still benefit from slapping squid in front of the output?
BTW, if we go with Drupal, we'd probably like to submit patches, improvements upstream (in particular, probably some UI tweaks). Is it a get on IRC/mailing list and contribute type thing or is there more to the process? Also, there may be some things that aren't as general purpose. How are those dealt with (I'm thinking say about the 'UI tweaks for bloggers' thing a while back and CivicSpace -- hadn't followed those too closely, just wondering if that's anything I should know or worry about).
phpaccelerator
I used Drupal 4.3 with phpaccelerator when I was self-hosting and found that it offered a noticeable performance increase. Definitely worth considering for a large Drupal installation.
If those tweaks are likely to be interesting for the general Drupal public, you should create patches and submit them to our patch tracker. Have a look at the contributor's handbook.
If they are rather special, you could (if possible) do them in your theme without changing core files.
Community College
I work for a medium-sized community college. They dropped Blackboard for Educator. Educator, I believe, is open source. My college has it set up on MySQL.
The problem where I work is that none of their systems are integrated.
Online classes run on Educator; registration and payments run on Datatel's web interface (Java).
I only started work there 6 months ago so I've not had the chance to develop a concrete plan to integrate these systems. But I'm working on it!
They tried WebCT a little while ago, and chose Educator over that. The reason for this post is to let you know about Educator--it may work better with a Drupal installation than WebCT would. Just a thought.
Annie
LMS, enterprise systems on campus
Annie, we currently use Blackboard. We may make a transition, although any move would be non-trivial to say the least, and any change would most likely be to Sakai as we're aligned w/ the OKI (and now JA-SIG) and related initiatives.
We are NMI and I2-MACE participants as well. You might want to look at Shibboleth for both inter (federated) and intra campus single sign on for centralized authX. Many educational institutions are rolling this out, and I think it's now at a stage where it's ready for primetime (rollouts by mere mortals).
You might also want to look at Unicon's Academus uPortal distribution, which I believe provides Datatel integration, or if you're feeling ambitious (or is that masochistic?), the OpenEAI project.
Thanks
Thanks for the advice, I appreciate it. I've set up uPortal at work in a testing environment, and while I like it (and it does provide fairly straightforward Datatel integration), I work for a college that has been doing things a certain (same) way for 25 years. I don't think I'll change anything overnight. But I'm trying!!
Annie