Origo - Distributed Software Development

By bherlig on 18 Aug 2009 at 15:16 UTC

Origo is an open, modular and extensible software development, management and distribution platform. Aimed at software developers, Origo provides a set of services for combining, integrating and facilitating the development process over a network. Origo provides services like source control management, issue tracking, statistics, release hosting, wiki, blog and features for communicating and networking with other developers. An extensive API can be used to integrate Origo into other applications and development processes.

Origo was conceived as a research project of the Software Engineering chair at ETH Zürich and has resulted in several theses. Ownership was recently transferred to an ETH spin-off, Oriact.

Released in August 2007, the platform as of August 2009 hosts 3,000 projects with over 6,500 developers worldwide.

Origo's Features

Origo provides all the common features of a source control platform:

Source Control Management & hosting using Subversion
Release Hosting
Both open- and closed source projects
A wiki for presentation and documentation
Blogs and Forums
Bugtracker ("Issue Tracker")
Community wiki pages
Jabber server
Networking by befriending other users
Process tracking through workitems

A user can own or be a member of any number of projects. Although all projects are separate, users can conveniently track their progress in one place, because all actions (source commits, wiki edits, etc.) are recorded by Origo as "workitems". Email notification can be configured for each project and each action. Keeping projects strictly separate while still tracking them centrally posed the biggest challenge for this site.

Why Drupal?

When we looked for a front-end framework on which to build Origo, it was clear from the start that Origo had to be ready for the masses (because hosting a project on Origo is free) and that it had to meet stringent security requirements. We therefore needed a solution that

keeps the design separate from the content and is thus scalable
allows updating things in one place for all projects hosted on Origo
takes security seriously and integrates in Origo's development process

The only CMS that met all three of these requirements was Drupal. It additionally allowed us to integrate great functionality from other developers' modules: there was plenty of work to implement Origo anyway, so being able to reuse existing solutions wherever possible was most welcome. Thanks to the Drupal community for the amazing work!

Architecture & Webpage

Origo uses a two-tiered architecture: a back-end handling the business logic and most of the data, and a front-end for user interaction. An API connects both layers using XML-RPC calls. Part of the API is also open to 3rd-party applications, e.g., Origo's Eclipse plug-in. Origo's Web front-end is implemented using Drupal, including contributed and custom modules, and set up in a (huge) multi-site installation: each hosted project gets its own Drupal site with its own database, but all sites share the same modules. This enables Origo to completely separate projects from each other while building upon a common code base for its modules. It also allows for tight control of each project's data. All project-independent functions, such as user-related functions, are managed by the back-end and relayed back to the website.
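To make the XML-RPC link between the two tiers more concrete, here is a minimal sketch of what such a call looks like on the wire, using Python's standard xmlrpc.client module. The method name and parameters are hypothetical; the article does not list Origo's actual API methods.

```python
import xmlrpc.client

# Hypothetical method name and arguments; Origo's real API is not shown here.
request_xml = xmlrpc.client.dumps(
    ("example-project", 10),              # positional parameters as a tuple
    methodname="project.list_workitems",
)

# The XML document that would be sent as the HTTP POST body.
print(request_xml)

# What the receiving side decodes from that body.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

Because the payload is plain XML over HTTP, the same call can be issued by the Drupal front-end or by a 3rd-party client without either side knowing how the other is implemented.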

Drupal with a Twist

From a technical viewpoint, integrating multiple Drupal sites with a back-end is one of the main challenges of Origo. Fortunately, this is where Drupal's modules - and APIs - come into play. Using a range of contributed and custom developed modules, Origo's front-end provides the following main features:

Authentication

User authentication is handled by the back-end and replicated to Drupal. The custom module uses XML-RPC calls to authenticate users in the back-end. The authentication module works by implementing hook_form_alter, where Drupal's validation and submit functions are replaced with custom ones that authenticate against the back-end. In addition, functions from Drupal's session.inc are overridden with slightly changed versions that synchronize session information with the back-end. This interception of Drupal's login mechanism is leveraged to implement a single-sign-on system: users who switch between projects, i.e. Drupal sites, are automatically logged in at the new site, because the still valid back-end authentication recreates a Drupal session at the new site instance.

The module also provides an XML-RPC framework for other Origo-API functions, both authenticated (that require a valid user-session) and anonymous API calls.
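The single-sign-on flow described above can be sketched roughly as follows. This is a simplified in-memory model, not Origo's code: the class names, the token scheme and the credential store are all assumptions made for illustration.

```python
import secrets


class BackEnd:
    """Stand-in for the central back-end that owns authentication."""

    def __init__(self, credentials):
        self._credentials = credentials   # username -> password (illustrative only)
        self._sessions = {}               # back-end token -> username

    def login(self, username, password):
        if self._credentials.get(username) != password:
            return None
        token = secrets.token_hex(16)
        self._sessions[token] = username
        return token

    def user_for(self, token):
        return self._sessions.get(token)


class ProjectSite:
    """Stand-in for one Drupal site of the multi-site installation."""

    def __init__(self, name, backend):
        self.name = name
        self.backend = backend
        self.local_sessions = {}          # back-end token -> username

    def current_user(self, token):
        # Reuse the local session if there is one; otherwise recreate it from
        # the still valid back-end session (the single-sign-on step).
        if token not in self.local_sessions:
            username = self.backend.user_for(token)
            if username is None:
                return None               # treated as an anonymous visitor
            self.local_sessions[token] = username
        return self.local_sessions[token]


backend = BackEnd({"alice": "s3cret"})
site_a = ProjectSite("project-a", backend)
site_b = ProjectSite("project-b", backend)

token = backend.login("alice", "s3cret")
print(site_a.current_user(token))
print(site_b.current_user(token))   # no second login needed on the other site
```

The point of the sketch is that each site keeps its own session table, yet none of them ever checks a password: they only ask the back-end whether the token is still valid.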

Wiki

Origo aims to provide full MediaWiki functionality. It uses multiple modules to accomplish this:

wikitools to provide wiki nodes in Drupal
Custom module mediawiki_filter wraps MediaWiki's library for parsing and rendering MediaWiki syntax. The MediaWiki filter is enabled for all of Origo's content: issues, blogs, forum posts, etc.
Diff for comparing revisions or previewing changes
GeSHi Filter for syntax highlighting of code blocks
Image, because they speak a thousand words
Image Assist to facilitate image insertion (locally patched to use MediaWiki syntax)
Custom module section_edit which provides the ability to edit individual sections of a wiki page
Custom module developer_pages enables users to restrict access of a wiki node to project members.

All these modules - including the custom ones - interact purely with Origo's front-end. Therefore, this wiki setup could be used in other Drupal installations.
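As an illustration of what section editing involves, the following sketch splits MediaWiki-style text at its headings and replaces a single section. It is not the section_edit module's code; the function names are hypothetical and only the simple `== Heading ==` form is handled.

```python
import re

# A heading line such as "== Title ==" or "=== Subtitle ===".
HEADING = re.compile(r"^(=+)[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def split_sections(text):
    """Return (title, body) pairs; text before the first heading gets title None."""
    sections = []
    matches = list(HEADING.finditer(text))
    first = matches[0].start() if matches else len(text)
    if text[:first].strip():
        sections.append((None, text[:first]))
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections.append((match.group(2), text[match.start():end]))
    return sections


def replace_section(text, index, new_body):
    """Swap the body of one section and reassemble the page."""
    sections = split_sections(text)
    sections[index] = (sections[index][0], new_body)
    return "".join(body for _, body in sections)


page = "Lead\n== A ==\nalpha\n== B ==\nbeta\n"
print([title for title, _ in split_sections(page)])
print(replace_section(page, 1, "== A ==\nALPHA\n"))
```

An edit form would show only the body of the chosen section and pass the saved text back through something like replace_section, leaving the rest of the page untouched.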

Issue tracker

Origo uses its own bug tracking module, which integrates with its back-end. Issues are a custom node type with its own vocabulary. This way, issues can be tagged, indexed and searched using Drupal's taxonomy module. An issue's metadata (such as open/closed, assignment or resolution status) is added to the node's tags as well, and thus integrates seamlessly with other vocabulary data. The Issue Tracker's main page provides an interface to this data: besides displaying a paginated list of all issues, it lets the user search and filter by these tags. The filters can be combined freely, and the resulting combination can be saved.
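The idea of storing metadata as tags and filtering on tag combinations can be sketched as below. The tag format (`status::open`) and the filter structure are assumptions for illustration; the article does not specify how Origo encodes them.

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    title: str
    tags: set = field(default_factory=set)


def filter_issues(issues, required=(), excluded=()):
    """Keep issues that carry every required tag and none of the excluded ones."""
    required, excluded = set(required), set(excluded)
    return [
        issue for issue in issues
        if required <= issue.tags and not (excluded & issue.tags)
    ]


issues = [
    Issue("Crash on login", {"status::open", "priority::high", "login"}),
    Issue("Typo in wiki help", {"status::closed", "docs"}),
    Issue("Slow issue list", {"status::open", "performance"}),
]

# A saved filter is just the stored combination of tag constraints.
saved_filter = {"required": ["status::open"], "excluded": ["performance"]}
matching = filter_issues(issues, **saved_filter)
print([issue.title for issue in matching])
```

Since status and free-form tags live in the same set, one filter mechanism covers both, which is the benefit the paragraph above describes.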

issue_tracker implements quite a few Drupal hooks, e.g. hook_view, hook_form_alter, hook_update, hook_insert, hook_nodeapi, etc. This is necessary to intercept node processing at various stages, and synchronize issue-nodes with data from the back-end.

Origo Home: Communicating, tracking, releasing

origo_home is another custom module, providing the bulk of Origo's features. It handles all user-related functions such as profile, settings, networking (user friendships and communities), as well as workitems (including an RSS feed). It offers input masks for requesting projects and communities, and displays an aggregated view of issues reported by the user. It also provides the interface to Origo's release hosting, i.e., displaying the list of a project's releases, as well as forms for uploading and categorizing releases. Furthermore, it implements an interface for the system's administrative functions, such as mass mailings to all users.

While most of origo_home's functions are straightforward form processing, its main task is bringing the front- and back-end together. On the Drupal side this results in intercepting all processed nodes: hook_nodeapi is crucial for this. It checks which action is performed on what type of node, and registers this action accordingly with the back-end.
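The dispatch that hook_nodeapi performs here (look at the operation and the node type, then report a matching action to the back-end) can be sketched as follows. The table of tracked actions, the workitem fields and the function name are hypothetical, and the back-end call is replaced by a plain callback.

```python
# (operation, node type) -> kind of workitem to register; illustrative values only.
TRACKED = {
    ("insert", "issue"): "issue_created",
    ("update", "issue"): "issue_updated",
    ("insert", "wiki"): "wiki_page_created",
    ("update", "wiki"): "wiki_page_edited",
    ("insert", "blog"): "blog_posted",
}


def on_node_event(op, node, report):
    """Register a workitem for tracked events; ignore everything else."""
    kind = TRACKED.get((op, node["type"]))
    if kind is None:
        return False
    report({"kind": kind, "node_id": node["id"], "user": node["author"]})
    return True


sent = []   # stands in for the XML-RPC call to the back-end
on_node_event("update", {"id": 7, "type": "wiki", "author": "alice"}, sent.append)
on_node_event("view", {"id": 7, "type": "wiki", "author": "alice"}, sent.append)
print(sent)
```

Keeping the mapping in one table means a new node type only needs a new entry rather than another branch in the hook.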

Restricted content

Origo implements a hierarchical role model. The most important roles are project owners, project members and authenticated users. These roles - with their associated permissions - are used throughout Origo and give different users different possibilities of how to use Origo: posting blog entries, moderating issues, editing wiki pages, etc.
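A hierarchical role model of this kind can be sketched with ordered roles, where a higher role implies every permission of the roles below it. The role names follow the text; the concrete mapping of actions to minimum roles is an assumption for illustration.

```python
from enum import IntEnum


class Role(IntEnum):
    ANONYMOUS = 0
    AUTHENTICATED = 1
    MEMBER = 2
    OWNER = 3


# Minimum role needed per action; illustrative values, not Origo's configuration.
REQUIRED_ROLE = {
    "view wiki": Role.ANONYMOUS,
    "report issue": Role.AUTHENTICATED,
    "edit wiki": Role.MEMBER,
    "post blog entry": Role.MEMBER,
    "moderate issues": Role.OWNER,
}


def can(role, action):
    """A role may perform an action if it ranks at least as high as required."""
    return role >= REQUIRED_ROLE[action]


print(can(Role.MEMBER, "edit wiki"))
print(can(Role.AUTHENTICATED, "moderate issues"))
```

In a per-project setup, the same user would simply hold a different Role value on each project.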

Other modules

Besides the modules mentioned above, Origo uses the following contributed modules:

Captcha for blocking spambots from user-registration
Customerror for blending in errors with Origo's design
Google CSE for providing site-wide search
Pathauto and Token for generating more readable URLs
Trash for nondestructive deletion of nodes
Views for generating node lists: a browsable list of content-nodes, filterable by type, and a list of taxonomy terms.

Back-end

Integrating, coordinating and tracking all steps of the software development workflow requires a powerful back-end. Origo takes a decentralized approach with several processing units (called nodes), potentially running on separate machines, which communicate using Apache's ActiveMQ message broker system. API nodes receive XML-RPC calls, generate a message of the appropriate type and relay it to the relevant nodes, e.g. a storage node for database access or a mail node for sending emails.
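The routing pattern described here can be sketched with an in-process stand-in for the message broker. This does not use ActiveMQ; the Broker class, the routing table and the method names are assumptions that only illustrate how an API node turns a call into a typed message for another node.

```python
import queue


class Broker:
    """Tiny in-process stand-in for a message broker with one queue per type."""

    def __init__(self):
        self.queues = {}

    def subscribe(self, message_type):
        return self.queues.setdefault(message_type, queue.Queue())

    def publish(self, message_type, payload):
        self.subscribe(message_type).put(payload)


# Which node type handles which API method; hypothetical names.
ROUTES = {"issue.create": "storage", "user.notify": "mail"}


def api_node(broker, method, params):
    """Translate an incoming API call into a message for the responsible node."""
    broker.publish(ROUTES[method], {"method": method, "params": params})


broker = Broker()
storage_inbox = broker.subscribe("storage")
mail_inbox = broker.subscribe("mail")

api_node(broker, "issue.create", {"title": "Crash on login"})
api_node(broker, "user.notify", {"to": "alice", "subject": "New issue"})

storage_msg = storage_inbox.get_nowait()
mail_msg = mail_inbox.get_nowait()
print(storage_msg)
print(mail_msg)
```

With a real broker, the queues live outside the processes, so storage and mail nodes can run on other machines and be scaled independently of the API nodes.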

Challenges

Managing Origo

Managing and maintaining such a large-scale multi-site Drupal installation - Origo's web front-end currently combines nearly 3000 Drupal multi-site instances - poses some unique problems, as one needs to automate Drupal administrative functions. Origo uses a combination of shell-, Python- and PHP-scripts to solve this: some triggered by cronjobs or the back-end, some manually.

Drupal Installation

A usually simple action such as adding a new instance to the multi-site installation is a complex process in Origo. The creation of the front-end needs to be coordinated with other services (a Subversion repository is set up, databases are created, etc.), and the new instance also has to be tailored to Origo's needs. This is achieved through a custom installation profile, which enables the needed modules, activates the proper theme, creates navigation links, and adds a few default nodes to the new Drupal instance. Origo needs to automate these steps, i.e., perform command-line installations; for this, a modified version of the Drupal CLI utils (which screen-scrape Drupal's installation forms) is used.

Updates

Similar to installations of new projects, Origo orchestrates updates - security updates or releases - from the command line. For a few months now, Origo has used the wonderful Drush module (well, more of a "utility") to access Drupal functions from the command line. Drush is used extensively for installing new modules and security updates, but also for executing custom SQL scripts (e.g. for deploying bug fixes in custom modules), emptying caches, etc.
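Running one Drush command against every site of a multi-site installation could be scripted roughly like this. The sketch only builds the command lines and prints them; the Drupal root path and the site URIs are made up, and although --root, --uri and updatedb are standard Drush options and commands, their exact behaviour depends on the Drush version in use.

```python
import shlex


def drush_commands(drupal_root, site_uris, drush_args):
    """Build one Drush invocation per site of the multi-site installation."""
    return [
        ["drush", "--root=" + drupal_root, "--uri=" + uri, *drush_args]
        for uri in site_uris
    ]


# Hypothetical installation path and project sites.
commands = drush_commands(
    "/var/www/drupal",
    ["http://alpha.example.org", "http://beta.example.org"],
    ["updatedb"],
)

for command in commands:
    # Dry run: print instead of executing.
    print(shlex.join(command))
```

To actually execute the commands, each list could be passed to subprocess.run(command, check=True); with thousands of sites, logging failures per site and continuing is probably preferable to stopping at the first error.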

API Synchronization

Origo's open API gives 3rd-party applications access to most functions; e.g., we provide a Mylyn plug-in for managing a project's issues from within Eclipse. This requires that data posted to the back-end via API calls be propagated to the front-end. To this end, the custom module issue_tracker implements a small XML-RPC interface through which a project's issue nodes can be created or manipulated.
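A small XML-RPC interface of this kind can be sketched with Python's standard library, here without any network transport. The method name issue.save and the dictionary standing in for issue nodes are assumptions; the real module is a Drupal module and its interface is not documented in this article.

```python
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCDispatcher

issues = {}   # stand-in for a site's issue nodes: id -> fields


def issue_save(issue_id, fields):
    """Create the issue if it is unknown, otherwise update its fields."""
    issues.setdefault(issue_id, {}).update(fields)
    return issues[issue_id]


dispatcher = SimpleXMLRPCDispatcher()
dispatcher.register_function(issue_save, "issue.save")   # hypothetical method name


def handle(request_body):
    # _marshaled_dispatch is what the standard library's own HTTP handlers call
    # to turn a request body into a response body.
    return dispatcher._marshaled_dispatch(request_body)


# The back-end pushes a change that arrived through the open API.
request = xmlrpc.client.dumps(
    (42, {"title": "Crash on login", "status": "open"}),
    methodname="issue.save",
).encode("utf-8")

(result,), _ = xmlrpc.client.loads(handle(request))
print(result)
```

In this direction the usual roles are reversed: the back-end acts as the XML-RPC client and the front-end site as the server, so that both stores end up with the same issue data.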

Future Work

In accordance with Drupal's motto, development on Origo never stops. Here's a quick overview of future prospects:

Streamlining Drupal Databases
Currently all Drupal instances have their own, completely separated databases. Some of these could be shared between projects, e.g. a common user-table for all instances.
Sandbox
A sandbox project where users can test Origo's functionality. This not only concerns the Web- but also the back-end.
Project time management
Extend Origo with time management, where nodes (especially issues) can be assigned to a timeline, and be managed centrally.

Contact

You can contact the Origo team by either emailing info@oriact.com, or posting on the message board - or by posting comments below.

Associated Drupal users: bayt, roetzi, tario and bherlig.

Comments

Garrett Albright commented 18 August 2009 at 18:14

Origo's Web front-end is implemented using Drupal, including contributed and custom modules, and set up in a (huge) multi-site installation: Each hosted project gets its own Drupal site with its own database,

What method (if any) do you use to allow users to be members of more than one project/site?

Also, you stated this project was originally launched in August 2007; so was it initially on D5? Or even D4? What was the migration process(es) to D6 like?

bherlig commented 19 August 2009 at 08:22

What method (if any) do you use to allow users to be members of more than one project/site?

Users can be part of any number of projects; Origo's back-end holds the data about a user's associated projects. On the Drupal side, we intercept login-functions and session-creation, e.g. when a user navigates to one of his projects/sites, the back-end checks his credentials in relation to the site, and logs the user into the Drupal-site with appropriate credentials.

Also, you stated this project was originally launched in August 2007; so was it initially on D5? Or even D4? What was the migration process(es) to D6 like?

It started out with 5.x. We started the migration process in late April this year, first porting our own modules and theme, then adapting the install profile (which we make heavy use of) and the set of shell and PHP scripts that we use for automation (e.g. project creation). We used Coder to review the modules and get easy access to the relevant documentation. All in all it wasn't that hard; Drupal provides excellent documentation on how to upgrade these things. The main part was just crunching through lines of code.
For the migration of the live-system, we used a set of shell-scripts, a large part of which was calling Drush commands - an excellent and essential tool for anyone who wants to automate Drupal maintenance (try calling update.php via browser for a multi-site installation with 3000 instances).

Garrett Albright commented 19 August 2009 at 15:23

So are you saying that you somehow forego Drupal's user management system entirely and only manage users through the back-end database?

bherlig commented 19 August 2009 at 18:39

No, we just handle authentication (and session-recreation, for automatically logging users in if a valid backend-session is available) in the backend - Origo still uses Drupal user objects in the regular way (e.g. user-permissions or registration of new users).

neokrish commented 19 August 2009 at 06:52

How does Origo compare to Open Atrium, which was also released very recently? Origo seems to have a lot more features than Open Atrium but seems to lack the visuals that make Open Atrium very appealing.

Another quick thing I noticed is the use of the image and image_assist modules. I am sure that when the project started, image and image_assist were the only option, but since the project took so long, I think there are better options available now that you have completed it.

With the well defined future of cck, I am not sure if using image module is the right way to go? The last time I used image module was around the same time I discovered cck and imagefield. With FileField Insert coming quickly and nicely, there is little future for image and image_assist. What if the image and image_assist module fade out as drupal supports cck and imagefield by default?

=-=

vm commented 19 August 2009 at 07:01

note: image module heading into core = http://drupal.org/node/513096

Image module vs CCK

rötzi commented 19 August 2009 at 08:45

We decided on the image and image assist modules because they fit our needs. Also, it was clear that the image module has so many users that it will not just be abandoned. And, as already mentioned, the upgrade path will even be provided by Drupal core.

Security and CM hooks?

stodge commented 19 August 2009 at 11:26

How do you control security/access? For example say I have two developers Bill and Ted. I want to let Bill raise tickets but I don't want him to see any source code (so no read/write access to code). I want to let Ted raise tickets but I also want him to modify source code. Taking this a step further, I want to swap the access for a different project; Ted can raise tickets on Project B but he can't see any source code while Bill can raise tickets and modify code.

Oh and do you use Subversion (CM) hooks to add changesets to tickets, like Trac?

Thanks

tario commented 20 August 2009 at 17:19

Permissions (security/access) are controlled by our backend and mapped to specific Drupal user permissions. The permission system is relatively simple: it is currently not possible to specify detailed rights for everything; there are just 4 levels (anonymous, registered user, project member, project owner). In the cases we encountered, this worked very well, and it is still easy to grasp.

We have hooks in Subversion that create a workitem. A workitem can be viewed in various places and ways (mail, RSS, website, or using a client over the API), and it is also possible to link an issue in a log comment.

Drupal as a front-end to an ESB

mbutcher commented 19 August 2009 at 18:28

I'm happy to see a practical and insightful implementation of Drupal as a front-end to an enterprise service bus. This is a novel blend of Drupal and some serious enterprise-grade tools!

Drupal as a front-end to an ESB

bayt commented 21 August 2009 at 06:41

Thanks for the comment, mbutcher!

The underlying framework is Aranea. We haven't published anything on the framework so far, but it is open-source and we would be interested in hearing what others think of it and maybe even build with it.

Mylyn connector

pasqualle commented 21 August 2009 at 15:18

Could you share your thoughts about the Mylyn connector? Is Mylyn really as useful a tool as it is made out to be? Do you use it regularly in your own projects? Would it be hard to create a similar connector for the drupal.org issue queue (for the project_issue module)?

Mylyn very useful

bayt commented 21 August 2009 at 15:56

The main reason for developing Origo with an API from the start was that we wanted to be able to integrate the code hosting/software development platform into our daily work-flow. There is always a new tool or another script in another technology that development teams find useful for their work and we wanted to be able to integrate our platform with those.

Mylyn is one of the tools we found out there, that supported integration with other applications from the opposite side - namely from within an IDE to the outside world. The Mylyn community is very large and dozens of connectors to different Issue trackers exist. The documentation for writing your own connector is very helpful and for the connector we wrote for Origo, we also got inspiration by looking at other connectors.

So, to answer your question:

We think it would not be too hard to write a connector for the drupal.org issue queue and can only confirm that Mylyn is very useful in everyday use as it gets you the issue tracker right into your development environment.

Image Assist

asb commented 7 February 2010 at 00:52

'Image Assist' is used to embed images into Origo Wiki nodes, but the 'mediawiki_filter' itself doesn't support embedding images from 'Image' module's nodes, at least not with the full MediaWiki syntax (e.g. choosing derivatives or scaling an image). Thus to accomplish image handling, you're using 'Image Assist' - but with local patches to support MediaWiki syntax.

As far as I know these patches haven't made it into the 'Image Assist' module, so you're forking 'Image Assist'. Wouldn't it make sense to add MediaWiki syntax to 'Image Assist', at least as an alternative option, opposed to having to maintain a fork and local patches?

Thanks & greetings, -asb

bherlig commented 17 February 2010 at 12:48

the 'mediawiki_filter' itself doesn't support embedding images from 'Image' module's nodes, at least not with the full MediaWiki syntax

Yes it does. We provide Image Assist for users not familiar with Mediawiki syntax, who want to have a GUI for including images.

so you're forking 'Image Assist'.

Calling it a fork is a bit too strong - check the patch, it's merely hiding some of Image Assist's advanced options, and providing three derivative sizes for the user to choose from. So "maintaining" the patch isn't really a problem, although I admit that we didn't really think about (properly) integrating it into the Image Assist codebase, to enhance it with MediaWiki syntax.

Not like MediaWiki

asb commented 20 March 2010 at 18:02

>> the 'mediawiki_filter' itself doesn't support embedding images from 'Image' module's nodes, at least not with the full MediaWiki syntax
> Yes it does. We provide Image Assist for users not familiar with Mediawiki syntax, who want to have a GUI for including images.

Indeed. 'img_assist' doesn't solve our issues with embedding images into nodes with mediawiki_filter.

I cannot reproduce that mediawiki_filter supports embedding images with full MediaWiki syntax. Neither with nor without img_assist enabled are image derivatives generated (since the 'image' module doesn't support on-the-fly scaling like MediaWiki, the image would have to be browser-scaled, as 'pear wikifilter' did); I also cannot access the available image derivatives configured at ./admin/settings/image/nodes. Example: if the derivative preset is labeled "400", I get an empty image with a (working) hyperlink to the image node when using this syntax: [[Image:folksonomy|center|400|Folksonomy with Freetagging]]. This also doesn't work with image derivatives that have names like "Miniaturansicht" (I'm using localized versions of the 'image' module, so there's no derivative labeled 'thumb'). Additionally, image descriptions are not displayed below the embedded image as they are in MediaWiki. The only working way to embed images is to scale them manually and use a syntax like [[Image:folksonomy|center]], which works only sometimes for me.

However, since there is no supported release of 'mediawiki_filter', it does not make much sense to discuss this here.

Greetings, -asb