This originally started as a comment on the Predictions for 2006 thread, but after submitting it I realised it was just far too long and out of place to be a comment, so I created this node.
Someone had mentioned OpenDocument support, and others and I made some cryptic comments regarding some ideas I'd had recently.
My basic idea is to create a file access layer for Drupal that exports all content into standards-based files.
This would mean we have to fully develop our export mechanisms (i.e. RDF, RSS, iCal .. and later on OpenDocument) and also create an equal set of import mechanisms.
By doing this layer properly, we gain the ability to consume all the formats we output. That means we can cleanly import any iCal file etc. on the net onto a Drupal site, and it also gives us a very convenient place to handle our RSS aggregation.
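To make the "consume everything we output" idea concrete, here is a minimal sketch in Python (not Drupal code). It assumes a hypothetical registry in which every format is registered as a matched exporter/importer pair, so that a round trip through any format returns the original content. The names `register_format`, `export_content` and `import_content`, and the dict used as a stand-in for a node, are all assumptions made for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical registry: format name -> (exporter, importer).
FORMATS = {}


def register_format(name, exporter, importer):
    """Register a matched export/import pair for one format."""
    FORMATS[name] = (exporter, importer)


def export_content(name, node):
    """Serialise a node using the exporter registered under `name`."""
    return FORMATS[name][0](node)


def import_content(name, data):
    """Parse serialised data using the importer registered under `name`."""
    return FORMATS[name][1](data)


def rss_export(node):
    """Serialise a node (a plain dict here) as a single RSS <item>."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = node["title"]
    ET.SubElement(item, "description").text = node["body"]
    return ET.tostring(item, encoding="unicode")


def rss_import(data):
    """Rebuild the node dict from a single RSS <item>."""
    item = ET.fromstring(data)
    return {
        "title": item.findtext("title"),
        "body": item.findtext("description"),
    }


register_format("rss", rss_export, rss_import)
register_format("json", json.dumps, json.loads)

# Every registered format must survive a round trip unchanged.
node = {"title": "Hello", "body": "First post"}
for name in FORMATS:
    assert import_content(name, export_content(name, node)) == node
```

Registering the two directions together is the design choice that matters here: a format cannot be added as export-only, so anything the site emits can also be read back in.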
I am not going to try to gloss over the fact that implementing this _right_ is probably going to be fairly difficult, but the benefits to doing so are truly incredible and will very definitely be worth the trouble.
One of the final goals is that you will be able to mount your site via WebDAV (or webdave as I always mistype it *g*) and edit the content using your desktop applications (anything that properly supports the standards, really). And that's really only one of the possible benefits.
By maintaining a documented set of standards and designing a library that allows for easy export and import of these documents, we also have a very good place to implement a cross-RDBMS interchange layer. By creating a WordPress -> standards documents script, for instance, we will have the ability to import the content from a WordPress site into a Drupal site and vice versa. The OpenDocument support could also be implemented as a translation into this standard format. Similarly, we could tie microformats in feeds into the same import routines.
When I thought of it originally, I decided that I wanted to call the actual set of standards / the community that looks after it .. the Open Content Initiative (mirroring some of the basic principles of OSI). It would also tie into the semantic web movement (which had really been on my mind before the Tim Berners-Lee uses Drupal thing, although that was a pleasant surprise).
When it comes to the internet, the only thing that is really important is the content (it's the content, stupid!).
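The interchange idea can be sketched as follows, again in Python rather than Drupal code. Each system only needs one reader and one writer against a neutral document, instead of a dedicated converter for every pair of systems. The `Document` class, the function names, and the simplified field names are assumptions for illustration, not an existing API or a complete schema.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical neutral document that every converter reads or writes."""
    title: str
    body: str
    kind: str = "article"


def from_wordpress(post):
    """Map a (simplified) WordPress post record to the neutral document."""
    return Document(title=post["post_title"], body=post["post_content"])


def to_wordpress(doc):
    """Map the neutral document back to a (simplified) WordPress record."""
    return {"post_title": doc.title, "post_content": doc.body}


def from_drupal(node):
    """Map a (simplified) Drupal node record to the neutral document."""
    return Document(title=node["title"], body=node["body"], kind=node["type"])


def to_drupal(doc):
    """Map the neutral document to a (simplified) Drupal node record."""
    return {"title": doc.title, "body": doc.body, "type": doc.kind}


READERS = {"wordpress": from_wordpress, "drupal": from_drupal}
WRITERS = {"wordpress": to_wordpress, "drupal": to_drupal}


def convert(record, source, target):
    """Convert between any two systems by going through the neutral form."""
    return WRITERS[target](READERS[source](record))


wp_post = {"post_title": "Moving house", "post_content": "Goodbye, old blog."}
drupal_node = convert(wp_post, "wordpress", "drupal")
```

Adding a third system means writing one more reader and one more writer; the existing converters are untouched.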
But those are just pipe dreams. I just really think we should look at the bigger picture when developing all this, otherwise we might limit the impact this could have. I just really want people working together.
I have also only recently become aware of the structured blogging stuff, and I realize a lot of their goals are in line with what I am proposing. I am concerned about their 'everything is a blog post' methodology, however, as I believe that the architecture might very well go deeper than that.
Their work is still going to be very important to us, as what we choose for the document formats will likely mirror theirs to a big extent. If their framework is rich enough to satisfy all the dependencies we have, we should DEFINITELY use it as the basis for all this. Consuming and generating micro-formats will also be very important as more of this information becomes available.
My suspicion, however (and it isn't very thoroughly researched yet), is that their methodology will not meld with our needs, because micro-formats are implemented as a workaround for their tool of choice's lack of proper content types.
I am not trying to knock the structured blogging initiative at all. I think it's great, I really really really do. But I think they might be trying to solve a problem that doesn't really exist in Drupal.
Micro-formats, as great as they are, will also probably not get us closer to open document support and the like.
I would _love_ to be proven wrong though. If anything, this topic needs far more serious discussion than this post can ever hope to go into.
After that fairly lengthy diversion, I still need to clarify some of my earlier comments saying that I don't believe we should aim for these improvements in 4.7, but more appropriately target 4.8.
Firstly, to do this properly, in the way it really needs to be done, will take a lot longer than the 3-6 months to the next release. This is one of those things that is probably going to need a fairly lengthy research and design phase. I don't believe we can just code at this from one end and hope everything just clicks into place cohesively.
Also, the way I envision this being implemented ties very closely into what the forms API is going to become in the next release.
It is basically the first step towards implementing a model-view-controller pattern in Drupal (for much more than just forms), and each of these data exports very naturally fits into the 'view' concept.
My forms API plans also allow for much greater code re-use than is currently possible, which matters for implementing all this in the least amount of code possible (we still need to maintain all this functionality, after all).
Then there is the relationships API, which is something INCREDIBLY powerful and INCREDIBLY useful and INCREDIBLY important that we get _right_ first, as it implements the data layer for all the RDF mappings that we could hope to generate.
It is truly mind-boggling to see how many times we have rewritten this same functionality in various places. To be able to generate RDF for everything we possibly can, we really need everything to speak a central relations API. Trying to write input / output layers for all these differing re-implementations is going to be incredibly time consuming and ultimately a losing race: some other contrib module is just going to go and re-implement the exact same functionality, and then not be accounted for in the outputs / inputs.
I would also prefer that CCK handle as much as possible, because it allows for the greatest flexibility. Shipping configurations of CCK node types is much cleaner than shipping actual node modules.
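As a rough illustration of why a single relations layer helps, here is a minimal Python sketch (not Drupal code, and not a proposal for the actual API). It assumes every module records its relations as subject / predicate / object triples in one shared store, so that a single exporter covers all of them. The `RelationStore` class and the example.org URIs are hypothetical; the output uses the simple `<s> <p> <o> .` line form of N-Triples.

```python
class RelationStore:
    """Hypothetical central store that every module records relations in."""

    def __init__(self):
        self._triples = []

    def relate(self, subject, predicate, obj):
        """Record one relation, ignoring exact duplicates."""
        triple = (subject, predicate, obj)
        if triple not in self._triples:
            self._triples.append(triple)

    def about(self, subject):
        """Return every relation recorded for one subject."""
        return [t for t in self._triples if t[0] == subject]

    def to_ntriples(self):
        """Serialise all relations, one '<s> <p> <o> .' statement per line."""
        return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in self._triples)


store = RelationStore()
base = "http://example.org"

# Two unrelated "modules" use the same call instead of their own tables.
store.relate(f"{base}/node/1", f"{base}/rel/author", f"{base}/user/7")
store.relate(f"{base}/node/1", f"{base}/rel/tagged", f"{base}/term/3")
store.relate(f"{base}/node/1", f"{base}/rel/tagged", f"{base}/term/3")  # duplicate

rdf_lines = store.to_ntriples().splitlines()
```

Because both relations went through `relate`, the exporter did not need to know which module created them; a new contrib module using the same call would show up in the output without any extra export code.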
A lesser part of this is the install system (which is actually getting 'close' to being finished) which is an important piece of the puzzle, as it is the mechanism for distributing content types and very importantly, views of said content types.
I don't know whether or not all of what I discussed can, should or will be implemented. Hell, perhaps we will run into issues that mean we just can't implement it the way I intended (although I am fairly certain the logic is sound ... at least in my head, in among all the ferrets, bubbles and mounties).
Also don't think I mean that this all absolutely totally has to live in core first. A lot of the core developments I mentioned before are aimed at making as much as possible live in contrib yet still be cohesively tied to what we do in core.
I think this functionality is important enough that it needs to be developed properly, and developed in such a way that it is as easy to maintain as possible. It might just be one of those things that could be more trouble than it's worth if we don't do it as smartly and cleanly as possible.
I apologize if I rambled on for a bit; I have just been giving this kind of thing serious thought. These are really just ideas at this point, and I am really looking forward to discussing this with other developers in Vancouver in February.
Comments
POOOO, the Pet OpenOffice Odt Obtainer
You probably already saw this post, Adrian, but in case others start thinking seriously about OpenDocument support, I thought I'd link to peterx's post POOOO, the Pet OpenOffice Odt Obtainer, since he's already got some OpenDocument support working.
If someone could come up
If someone could come up with a working Wordpress import for Drupal, I'd switch tomorrow!
Very cool stuff
My import / export module proposal for the Summer of Code will, if it gets accepted, hopefully help to make this stuff happen in Drupal.
Jeremy Epstein - GreenAsh