i have just gotten out of the keynote talk at drupalcon given by tim o'reilly, and -- not surprisingly -- i find myself inspired and overflowing with ideas.

i began thinking about how we could leverage the huge community growing around drupal (and perhaps more importantly, the huge audience the sites they are building serve) to help keep the "internet operating system" and its data store open. how can drupal help keep some part of this "ownerless" and free?

i thought of a module which could be used flexibly to collate and report (to an open, distributed service) all sorts of information from the running drupal install. what sort of information? my first thought was aggregating anonymous data of a statistical nature, in a manner similar to smolt or the linux counter project. for example:

  • drupal version, (contributed/public) modules & versions
  • operating system info
  • php info (version)
  • user base info (# of users, statistics on sign-up rates, login rates)
  • content info (# of nodes published, comments left, files uploaded, etc)
  • traffic metrics, performance stats
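to make the list above a little more concrete, here is a rough sketch of what one report could look like before it is sent anywhere. it is in python purely for illustration (a real module would be php), and every name in it (`SiteReport`, `to_payload`, the field names) is my own assumption, not an existing drupal api. the values in the usage line are made-up sample data.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class SiteReport:
    """One anonymous snapshot of a site. All field names are hypothetical."""
    drupal_version: str
    php_version: str
    os_info: str
    # contributed module name -> version string
    modules: dict = field(default_factory=dict)
    user_count: int = 0
    node_count: int = 0
    comment_count: int = 0


def to_payload(report: SiteReport) -> str:
    """Serialize the report as JSON with sorted keys, so output is stable."""
    return json.dumps(asdict(report), sort_keys=True)


# usage with made-up sample values
sample = SiteReport("6.2", "5.2.5", "Linux", {"views": "6.x-2.0"}, 120, 450, 900)
payload = to_payload(sample)
```

a flat, self-describing format like json seems like a reasonable guess here, since the whole point is that others can find uses for the data later.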

what use might this serve? i suspect many. these statistics might help the drupal community analyze real-world needs and problems. but maybe more importantly, i suspect the uses will pop up later, as is often the case when data sources are opened up.

much like the smolt project and similar tracking systems, the key to such a module would be to keep it automatic (all stats collected via automatic means, not requiring intervention of the admin, other than install) and anonymous (great minds have already addressed issues of aggregating information across systems like this).
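one way the "anonymous" half might work, sketched in python under my own assumptions: give each install a random id that is not derived from the site url, drop any obviously identifying fields, and report counts only as rough orders of magnitude. i am not claiming this is how smolt does it, and it is certainly not a complete answer to anonymity. the helper names and the `IDENTIFYING_KEYS` list are hypothetical.

```python
import uuid

# fields we assume should never leave the site (hypothetical list)
IDENTIFYING_KEYS = {"site_url", "site_name", "admin_email", "ip_address"}


def new_install_id() -> str:
    """Random id generated once at install; not derived from any site data."""
    return uuid.uuid4().hex


def bucket(count: int) -> str:
    """Round a count down to its order of magnitude, e.g. 4321 -> '1000+'."""
    if count <= 0:
        return "0"
    magnitude = 10 ** (len(str(count)) - 1)
    return f"{magnitude}+"


def anonymize(raw: dict, install_id: str) -> dict:
    """Drop identifying fields and coarsen integer counts."""
    report = {"install_id": install_id}
    for key, value in raw.items():
        if key in IDENTIFYING_KEYS:
            continue
        if isinstance(value, int) and not isinstance(value, bool):
            report[key] = bucket(value)
        else:
            report[key] = value
    return report
```

coarse buckets give up some precision, but they make it harder to pick out a single site from its exact user or node counts.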

but why not take this idea further? why not offer an opt-in (on the part of both the site admin [and their legal department] and the users) for providing more information? for example, user demographics (age, gender, location, email domains, etc), and maybe content taxonomy information or other useful content-related data.
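the double opt-in could be as simple as the following python sketch: nothing is reported unless the site has opted in, and even then only users who opted in themselves are counted, and only in aggregate. the function name and the `opted_in` / `location` keys are assumptions for illustration.

```python
from collections import Counter


def demographic_summary(users: list, site_opted_in: bool) -> dict:
    """Aggregate demographics only where both site and user have consented."""
    if not site_opted_in:
        return {}
    consenting = [u for u in users if u.get("opted_in")]
    locations = Counter(u["location"] for u in consenting if u.get("location"))
    return {
        "respondents": len(consenting),
        "by_location": dict(locations),
    }
```

note that only totals leave the function; no individual user record is included in the result.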

obviously there would need to be the strongest consideration for the privacy and rights of the users and individual sites. however, tim's talk inspired me to realize we are moving towards a future where data will be king and people are already (semi-)willingly giving out personal demographic information via sites like facebook. let's not forget this data is highly monetizable as well. if these things are already happening at "the usual suspects" among big internet companies, what can we do to create an open source alternative that makes the most of the size of the growing (and diverse) drupal community?

Comments

he just posted his slides from the talk. how cool is that? obviously recommended.