Can we include third-party code, libraries, or datasets in CVS?

In practice, we do, in many modules (including several of mine). The code ranges from small JavaScript libraries to databases several megabytes in size. This is not to say it's right, just that it's done. And we have some external code in Drupal core (jquery.js).

Recently I suggested pulling various jQuery libraries into a common module to avoid having them included multiple times in different modules' CVS directories (http://groups.drupal.org/node/2787). Gerhard informed me, quite appropriately, that we restrict third-party code in CVS.

Now that I look, I find this is what we say in the Drupal CVS contributions README:

"In cases where another non-Drupal project is required DO NOT include that code in the repository. Instead provide a link where the other code can be downloaded and instructions on how to install it."

I could and presumably should delete the jQuery and other external code I have in my modules. But doing so personally won't address the general issue. I'd like first to discuss the issue and see if we can find some solutions.

The main thing is, it's not always so easy as "provide a link". External code is not necessarily available for easy download in the same form over an extended period. Taking jQuery libraries as an example, a single Drupal module might require jQuery libraries from 4 or more different locations, scattered over several private websites, readily available only in current versions (potentially incompatible with a particular version of Drupal core). The necessity of hunting down compatible versions from e.g. SVN repositories would effectively lock out most module users.

Dries, in a thread on GPL code in CVS, said last May:

"We've also decided against mirroring other projects in our CVS repositories -- unless there are good reasons to do so."
(http://lists.drupal.org/pipermail/development/2006-May/016589.html)

What would such "good reasons" be? Can we come up with guidelines that allow us to include certain types of external libraries, without contributing substantively to code bloat, or putting undue demands on our volunteer CVS maintainers?

Here are several potential approaches to third party code:

  1. Produce and follow guidelines that permit limited inclusion of third party code in CVS.

    Here are some suggestions, based on what I imagine the reasons are for including jQuery in core:

    • It's small (how small is small? <30kb?)
    • It's GPL.
    • We need to support a particular version for a relatively long period, but that version may not be readily downloadable for that whole period (without resorting to methods beyond the typical user's expertise, e.g., checking out from a versioning system).

    Other potential reasons might include:

    • Need to patch the external files.

    If they were clearly communicated, these guidelines might help keep out some of the externally produced code that currently contributes the most to CVS bloat (e.g., external databases several megabytes in size).

  2. Introduce an application process, where individual module developers may apply for permission to include a particular external library.
  3. Combine 1 and 2: we have guidelines, and if you feel you meet the guidelines you can apply. No external code allowed without explicit permission.
  4. Create a non-CVS drupal.org repository where module developers can place external code for download. Users are instructed to download separately from there.
  5. Introduce a way for our packaging scripts to include externally-hosted files.
  6. Clarify and enforce a "no third party code" rule and delete a lot of code from many existing modules. IMO, this approach only makes sense if we have the will and ability to educate about and ultimately enforce a "no third party code" rule.
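To make the suggestions under option 1 concrete, here is a minimal Python sketch of how such guidelines might be checked. It is only an illustration: the `ThirdPartyLibrary` fields, the `check_inclusion` function, and the way the criteria are combined (small and GPL required, plus either "not readily downloadable" or "needs patching") are my assumptions, and the 30 KB limit is just the figure floated above, not an agreed threshold.

```python
from dataclasses import dataclass

# Hypothetical limit taken from the "<30kb?" question above; not agreed policy.
MAX_SIZE_BYTES = 30 * 1024


@dataclass
class ThirdPartyLibrary:
    """Illustrative description of an external library (all fields are assumptions)."""
    name: str
    size_bytes: int
    license_name: str
    readily_downloadable: bool  # can a typical user still fetch this exact version?
    needs_patching: bool        # do we have to modify the upstream files?


def check_inclusion(lib, max_size=MAX_SIZE_BYTES):
    """Return a list of objections; an empty list means the library fits the guidelines."""
    objections = []
    if lib.size_bytes > max_size:
        objections.append("larger than the size limit")
    if lib.license_name != "GPL":
        objections.append("not GPL")
    # Assumed rule: there must be some reason users cannot simply get it upstream.
    if lib.readily_downloadable and not lib.needs_patching:
        objections.append("available upstream and unpatched")
    return objections


plugin = ThirdPartyLibrary("old-jquery-plugin", 12 * 1024, "GPL", False, False)
database = ThirdPartyLibrary("external-database", 5 * 1024 * 1024, "GPL", True, False)
print(plugin.name, check_inclusion(plugin))
print(database.name, check_inclusion(database))
```

Returning the list of objections rather than a plain yes/no would give a maintainer something to quote when asking for a file to be removed.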

There's no approach that's going to please everyone. I think options 1 or 3 might be an improvement over what we have now. Comments?

Nedjo

Comments

great write-up on an important problem, nedjo. i hope a sane solution can be worked out...

personally, i think #1 is really all we need.

there's no technical way to enforce our policy automatically, we have to rely on humans to notice and point out 3rd party code that gets checked into our repository. therefore, a mandatory application process for voluntary compliance with our policy seems not particularly helpful. a wiki page of "approved 3rd party commits" somewhere on g.d.o or something would suffice as a way to hold the "let's remember which exceptions we've made and why" data... so, #2 and #3 appear (at least at first glance) to involve some additional work with little or no additional gain.

#4 seems like more of a pain for everyone. i believe the primary concerns about 3rd party code are not wanting to have stale versions duplicated on our infrastructure, in case of security updates, etc. basically, in general, people should get the "upstream" source, and we don't want to be providing a stale cache of it (except for exceptions, the point of your message). ;) the fact that it's in our main contributions CVS repository vs. another CVS repository, or an SVN or git or whatever repository doesn't really matter. it's more the principle of holding onto and distributing potentially stale code, potentially wrong-licensed code, etc, etc.

#5 sounds nice on the surface, but the details of getting it right would be a nightmare.

#6 would be a shame, given how much of a pain this can be in some cases (as you described very well in your original post).

so, i vote for a clearer policy via #1, with clear criteria for exceptions. we should probably create the "allowed exceptions" page somewhere, preferably in a spot that is both easy to find for project maintainers, and easy to edit for CVS admins. i guess the drupal.org handbooks would work... we wouldn't really need g.d.o for this.

__________________________________________________________________
My professional services are available through 3281d Consulting

I think a page where developers can self-disclose and explain why they are complying with #1 would be sufficient.

---
Work: Acquia

yes, i think this is the right course of action ... for the sake of our users, we ought to loosen the ban a bit IMO.

#1 isn't really a change of policy, btw. We've always said that if the version/external stuff is not readily available to end users (ie it is specific to Drupal) that people can include it in cvs.

I should mention that people had better not try to push boundaries on this. :p
--
Drupal services
My Drupal services

I have been doing a bunch of jQuery work lately, and what Nedjo is describing is a real need. My module relies on GPL'd jQuery plugins, and now that jQuery has moved to 1.1, it is no longer easy to find 1.0 compatible versions of these files on the web. If I can't include these files for download (or, even better, refer to files in Nedjo's jquery.module), it will be quite a challenge for someone to use my work. I hope we go with option #1.

I strongly support #3 (#1+#2). For transparency, people should be willing to state their case.

IMO, they should also get approval (or proven lack of interest) from the original author - a matter of principle.

It wouldn't be too tough to monitor this, especially where code is committed as-is: like tinymce or a plugin or a bunch of icons. These would account for most of the unauthorized commits AFAICT.

For transparency, people should be willing to state their case.

why create extra work (for d.o maintainers) for an application process? how is the goal of transparency and historical memory better served by a custom application process (which must be written, maintained, ported, etc) than a simple editable page? see my comments above for more on this...

__________________________________________________________________
My professional services are available through 3281d Consulting

I agree that a wiki page would be better. Sorry to miss that detail in your post.

For the record, I would be happy to moderate a wiki page and keep a casual eye on CVS-wide commits (which I already do to a degree).

This issue, atm, pretty much solely concerns 3rd party jQuery plug-ins, due to the fact that we are locked into a certain version of jQuery for each version of Drupal. A prudent course of action would be to get jquery.com to organise their plug-ins better, in a central location, by version, license, compatibility etc. I doubt if we are the only people who have issues with their lack of a better structured plug-in repository. At the very least, could whoever in the community interfaces with John Resig and crew, please contact them and get their opinion on this?

Introducing these grey zones into the Drupal repository is counter-productive for all the reasons already covered in this thread.

-K

We should just tell them about all the advantages that the New Release System (tm) has and make them adopt it for their plugins. As they use SVN, they should just sponsor Derek to develop project_svn for them. As Derek does not like to do the same work multiple times, he develops project_vcs instead and then a project_vcs_svn bridge. When this is done, Drupal can even finally do the switch to whatever VCS is top of the pops at the moment without further difficulties :P -- all problems solved & everyone happy ;-)

...Unfortunately that's not the world we live in. ;-)

We already have a strict 'Only GPL' rule on the CVS repository. I think #1 -- noting the explicit reasons that are considered 'legitimate' for a third-party GPL library to be included -- is the only really useful solution.

--
Lullabot! | Eaton's blog | VotingAPI discussion

+1 to #1 until we have an installer that can go out and grab the 3rd party files and install them.

--
Matt
http://www.mattfarina.com

From the dev list thread: what if we make references to the 3rd party libraries in module.info files?

As adrian pointed out, this might allow us to create tarballs from the external libraries during the file download preparation.

I would also open the possibility of file hash checking by update_status.module to see that the user has the "preferred" libraries installed.
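A rough sketch of what that hash check could look like, written in Python purely for illustration (a real version would live in update_status.module). The function names, the use of SHA-256, and the idea of a mapping from file path to expected hash are all assumptions of mine; nothing like this exists in module.info files today.

```python
import hashlib
from pathlib import Path


def file_sha256(path):
    """Hash a file in chunks so large libraries do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_libraries(module_dir, expected):
    """Compare installed files against expected hashes.

    `expected` maps a path relative to the module directory to the SHA-256
    hex digest of the "preferred" version (a hypothetical format).
    Returns a dict mapping each path to "ok", "mismatch", or "missing".
    """
    results = {}
    for relative_path, wanted in expected.items():
        target = Path(module_dir) / relative_path
        if not target.is_file():
            results[relative_path] = "missing"
        elif file_sha256(target) == wanted.lower():
            results[relative_path] = "ok"
        else:
            results[relative_path] = "mismatch"
    return results
```

A "missing" or "mismatch" result could then be shown to the site administrator along with the link to the preferred download.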

--
http://ken.blufftontoday.com/
http://new.savannahnow.com/user/2
Search first, ask good questions later.

I haven't seen anyone mention CVS support for vendor branches, so here it is.

CVS supports a vendor branch, which basically maintains code from a third party ('vendor'), and allows you to change (patch) that code in a native way.

Whenever you want to update the 3rd party code, you just import the new version onto the vendor branch, and then merge it to your development branch. The last step ensures that changes you did to the code are left there (unless conflicts are discovered, which should be handled in the usual CVS way).

There is more info on this here (google: 'cvs vendor branches'):
http://cvsbook.red-bean.com/cvsbook.html#Tracking%20Third-Party%20Source...(Vendor%20Branches)

I am not sure how much this is supported, or not supported, by the current installation of cvs @ drupal.org, but this is technically possible.

--yuval

The issue is not a matter of whether it's technically possible. And the vast majority of the people involved in this discussion know how CVS works well enough to understand this.

It's a matter of policy on cvs.drupal.org and there are a whole bunch of reasons behind the policy.

You can find more details on the developer list archives at http://lists.drupal.org/pipermail/development/2007-May/023997.html. Read through the thread and you'll see both sides of the story.

--
Matt
http://www.mattfarina.com

does cvs have the equivalent of the svn:externals prop?

--
Matt J. Sorenson (emjayess)
d.o. | g.d.o. | twitter

I guess I am just thinking too technically.

Having read again the whole thread, and as I have a module with external code myself, I strongly support #1.

As a user, adding "modules" to "Drupal" is a burden as it is. Requiring the user to "go fetch" code from all over (not to mention what happens on upgrade) - is an impossible nightmare.

Things should be easy on the users - and the only way is to include the code. I think the advantages outweigh the disadvantages by far.

--yuval

I think we have to reconsider what the policy is for third party GPL code in the Drupal CVS. The GPL encourages working together with other technologies; by restricting what GPL code goes into the Drupal CVS, we're not encouraging working together. We can implement hook_requirements() and give the user instructions about how to install the third party code correctly, but this can run into installation problems, version control issues and support disasters.

I strongly suggest we rethink our strategy here, and go with what nedjo suggested in #1.

If you want to bring this up seriously, a 2 year old post in a deprecated forum is not the place for it. File an issue. :)

Michelle