This issue is part of the Distribution Packaging initiative.

Problem/Motivation

Distributions often package external JS and PHP libraries. However, many such libraries are not released under the GPL. Drupal.org generally requires all code hosted on the server to be released under GPLv2-and-later, just like Drupal. Also, not all 3rd party libraries are compatible with the GPL.

Proposed resolution

We will allow the distribution packager to automatically pull 3rd party libraries into the packaged download. However, to ensure only libraries released under GPL-compatible licenses are used, we will need to maintain a whitelist of 3rd party libraries that have been confirmed GPL-compatible.

Remaining tasks

This issue is focused solely on the technical aspects of the whitelist on Drupal.org itself. See the Distribution Packaging community initiative page for other aspects.

Regexes were chosen as the format for whitelist entries, rather than a custom wildcard syntax, to allow greater flexibility in the download links available to distribution maintainers. An entry looks something like this: ^http://www\.example\.com/downloads/download-name.+$

User interface changes

A new view showing packaging whitelist entries.

API changes

None.

Original report by alexb

Follow-up from #594704: Allow packaged install profiles on d.o to pull in code from other sources + sites.

Create and deploy a white list on drupal.org that can selectively allow download of external libraries from non drupal.org URLs.

Example:

http://download.cksource.com/CKEditor/*

Would allow the inclusion of any files available under this wildcard URL.
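To make the wildcard idea concrete, here is a minimal sketch of how such an entry could be checked. It is written in Python purely for illustration (the packaging code itself is PHP/drush), and the helper names `wildcard_to_regex` and `is_allowed`, as well as the sample download URLs, are hypothetical. It assumes `*` is the only special character in an entry.

```python
import re

def wildcard_to_regex(pattern: str) -> str:
    """Turn a wildcard URL such as 'http://host/path/*' into an anchored regex."""
    # Escape every literal piece, then join the pieces with '.*' where '*' stood.
    pieces = [re.escape(piece) for piece in pattern.split("*")]
    return "^" + ".*".join(pieces) + "$"

def is_allowed(url: str, pattern: str) -> bool:
    """Check one download URL against one wildcard whitelist entry."""
    return re.match(wildcard_to_regex(pattern), url) is not None

entry = "http://download.cksource.com/CKEditor/*"

# Hypothetical download URLs, used only to exercise the check.
print(is_allowed("http://download.cksource.com/CKEditor/ckeditor.tar.gz", entry))
print(is_allowed("http://example.com/CKEditor/ckeditor.tar.gz", entry))
```

Escaping the literal parts matters: without it, the dots in the host name would match any character.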

This white list includes

  • UI (surfaces where?) that shows which URLs are currently allowed to be included in an installation profile.
  • UI to edit the white list.
  • Simple permission to control access to edit the white list.
  • Necessary modifications to drush make to limit potential sources for external dependencies.

Comments

alex_b’s picture

This issue is part of the community initiative on Install Profile Packaging http://drupal.org/node/779440

dww’s picture

Most of the problem here is a policy one about the criteria for being on the whitelist, and identifying who actually maintains the list, etc, etc. However, let's continue to discuss *those* parts over at #594704: Allow packaged install profiles on d.o to pull in code from other sources + sites. Let's use *this* issue to discuss the actual technical side of the whitelist, as per the original post here. At DCSF alex_b, hunmonk and I discussed this in some detail...

A) One very attractive solution to this would be to just make a new node type on d.o for "packaging whitelist entry" or something.

The title of the node would be a human readable identifier for the entry and the body would be a newline separated list of URLs supporting wildcards. Something like:

Title: JQuery.com plugins and libraries
Body:
http://code.jquery.com/*
http://plugins.jquery.com/*
...
  • We'd have revisions on changes to the whitelist entry
  • We'd have permissions we could grant to various roles for creating and editing these entries
  • We'd use views to make a human-readable table of the whitelist
  • We'd use views to make an XML display of the whitelist
  • We'd be able to use a taxonomy vocabulary on the whitelist entries if we needed to classify them (e.g. License, JS vs. PHP, etc) -- not sure we care, but it'd be easy if we do.

Then, we'd just update the --drupal-org drush make plugin to fetch the XML from this view and use it to enforce the whitelist during packaging.

The only code we'd have to write is the drush make stuff, most likely. Everything else could be clicked together (and then exported to code in drupalorg module so we can have it in version control and safely deployed).

At least, that'd be one way to do it.

B) Alternately, we'd have to write a new drupalorg_package_whitelist module that has its own DB table for whitelist entries which it'd expose to views, and it'd have some kind of admin UI to create/edit entries. This would require writing more code, and I'm not sure if it'd be any better than just using nodes. Seems like we're going to duplicate a lot of effort as we add features (wouldn't it be nice to know who created this entry? who has edited it? can't we add other fields? etc).

I'm leaning towards (A), unless there's a compelling reason not to use nodes for this.

Cheers,
-Derek

alex_b’s picture

#3

A) sounds like a good plan. It may require writing a light custom field handler for Views that displays the lines of the node body as separate XML entities.
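As a rough illustration of that handler idea, the sketch below splits a node body into lines and emits one XML element per line. It is Python rather than a real Views field handler, and the element names `entry` and `url` and the function name `whitelist_to_xml` are assumptions made up for this example; no XML format is actually specified in this issue.

```python
import xml.etree.ElementTree as ET

def whitelist_to_xml(title: str, body: str) -> str:
    """Render one whitelist node as XML, one <url> element per body line."""
    # 'entry' and 'url' are hypothetical element names.
    entry = ET.Element("entry", {"title": title})
    for line in body.splitlines():
        line = line.strip()
        if line:  # skip blank lines in the node body
            ET.SubElement(entry, "url").text = line
    return ET.tostring(entry, encoding="unicode")

body = "http://code.jquery.com/*\nhttp://plugins.jquery.com/*\n"
print(whitelist_to_xml("JQuery.com plugins and libraries", body))
```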

philbar’s picture

+1 for #3A

pwolanin’s picture

So - sounds like we need to start by creating this node type?

hunmonk’s picture

probably need to start by getting official approval for this from the infra guys.

damien tournoud’s picture

Apparently dww agrees, and I was even for a blacklist myself. So unless Gerhard objects, we are good to go.

killes@www.drop.org’s picture

Status: Active » Reviewed & tested by the community

great plan

pwolanin’s picture

Since killes approves of the plan, I added the content type to drupal.org. It's at ?q=admin/content/node-type/packaging-whitelist

Machine-readable name is packaging_whitelist

pwolanin’s picture

Do we need a specific maintainer, or can it be managed via the webmaster queue?

philbar’s picture

Any update on this? It's been over a month?

alex_b’s picture

Status: Reviewed & tested by the community » Needs work

We will need a little more than the content type on d.o. In our original conversation at DrupalCon SF, hunmonk, dww and I talked about exposing the whitelist in a machine readable format (XML? plain text?) on drupal.org. The whitelist functionality for drush make will pull from there.

pwolanin’s picture

As either plain text or XML (or JSON) it could be about a ~10 line custom page callback, or there was discussion of adding Views Bonus pack to d.o so we could get a machine-readable export that way.

dixon_’s picture

Would it be possible to see what the actual whitelist content type looks like? Judging from the discussion of the technical implementation over in the Drush Make issue, the functionality will probably need to output a plain text file.

pwolanin’s picture

It's a totally generic node type at the moment- I think we were just going to dump out the body text.

Amazon’s picture

Just to bring everyone up to speed, there is already a packaging whitelist content type on Drupal.org:
http://drupal.org/node/add/packaging-whitelist

Amazon’s picture

Here's an update on the current plan after discussing with dww and dmitri01:
http://drupal.org/node/684788#comment-3750828

philbar’s picture

I can be the whitelist maintainer, as long as there are clear guidelines for me to follow and I have an RSS feed for addition requests.

Let's keep this issue focused on this portion of the functionality:

3) The whitelist maintainer (and others potentially) need to create the whitelist.
Paraphrased from dww:

the canonical storage of the whitelist will be these nodes. the nodes are elements of the list.

We want both a machine-readable feed in whatever format he thinks is best (JSON, XML, whatever) and a human-readable view so that the humans can easily see what's available for their distros.

We further discussed which issue queue to track these changes. End users will submit an issue in the packaging white list component of the drupal.org infrastructure project to get packages added to the white-list content type.

Dmitri01 will need to create the white-list content type view on drupal.org that drush make can read.

matt2000’s picture

Seems like the biggest hang-up here is just having a clearly defined policy for what's allowed to be on the whitelist and who maintains it. So let me make a proposal in an attempt to restart conversation:

For a library to be added to the whitelist, it must:
1. Be requested via a TBD issue queue by a module or profile maintainer, and seconded (RTBC) by at least one other community member, and
2. Be licensed under a GPL-compatible license listed here: http://www.gnu.org/licenses/license-list.html#GPLCompatibleLicenses
3. If a PHP library, it must be compatible with the version of PHP supported by the current Drupal Core release. (i.e., no PHP4-only libraries.)

A library may be removed from the whitelist if:
4. The Security team lead deems that the library is not receiving proper security maintenance, which poses a threat to Drupal sites.
5. The maintainer of the library declares it to be deprecated and the whitelist maintainers deem it prudent to remove it.
6. It can be established that no current release of any profile on drupal.org uses or has used the library in the last 6 months. (Not sure how feasible it is to track this, so maybe strike this point on the basis of practicality.)

The whitelist maintainer shall be nominated and confirmed by majority vote of the drupal.org infrastructure team to serve until resignation, or removal by majority vote of the infrastructure team.

rickvug’s picture

@matt2000 I think your list is a very good start. I strongly agree with point 2. While Drupal is licensed under GPL 2 or later, many libraries are not. However, that doesn't mean that these licenses are incompatible with the GPL version 3. At the end of the day, the whole of a distro downloaded from d.o. should be distributable under GPL v. 3. The license list is very clear on which licenses are compatible.

I don't think there needs to be any restriction on PHP versions, as that will be the de facto situation with D7 onwards (and almost all D6 sites in practice). I'd also say that point #6 doesn't matter. If a library is not being used but is maintained, there's no need to remove it from the whitelist IMO.

Anyone else have thoughts about this? It would be great to codify these standards in a handbook page once we are in rough agreement (and potentially have a legal thumbs up, if we don't have this already).

webchick’s picture

Subscribing.

batsonjay’s picture

Subscribing.

webchick’s picture

Ok, sat down tonight and went through this and wrote an issue summary at the top with my current understanding of where things are at, which can be loosely summarized as:

1. Setting up the whitelist capability itself, which basically just needs a couple of views. (XML is tricky, tho)
2. Packaging script needs to be adjusted to take this into account.
3. A process and team need to be put into place to manage these.

#684788: Verify Library URLs against a White-list for drupal-org.make and #594704: Allow packaged install profiles on d.o to pull in code from other sources + sites are also somehow related, but I haven't figured out how, yet.

jeff veit’s picture

subscribe

jarodms’s picture

subscribe

jarodms’s picture

Issue summary: View changes

Attempting to summarize the issue.

dww’s picture

Issue summary: View changes

added links to subtasks around process

dww’s picture

After spending quite a while reviewing all the various issues and pages about this effort, there's much confusion. ;) To help clarify, let's keep *ALL* the meta plans and roadmap stuff in the Distribution Packaging community initiative page. Let's keep this issue solely focused on the technical aspects of the whitelist on Drupal.org itself. Process and packaging stuff lives elsewhere (in smaller, more manageable sub-issues).

I should probably split this issue itself up into subtasks, but I have a bunch of other edits and comments to post tonight...

mikey_p’s picture

Attached file: status new, size 11.75 KB

Here's a patch that adds a feature to the drupalorg module. It provides a content type and fields, along with a view that provides a page and feed of packaging whitelist nodes. The module also provides a machine-readable JSON callback that could be consumed by drush, using a custom menu callback in the module file. These currently live at:

Main view: /packaging_whitelists
RSS: /packaging_whitelists/rss
JSON: /packaging_whitelist/json

If need be, I can update these paths to be more specific, or better namespaced as well.

Sample JSON output (this is from two separate nodes - note that it is flattened here for machine readability):

[ "http://jqueryui.com/download/jquery-ui-*.custom.zip", "https://github.com/jquery/jquery-ui/zipball/*", "https://github.com/twitter/bootstrap/tarball/*", "https://github.com/twitter/bootstrap/zipball/*" ]
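The flattening can be pictured with a small sketch. This is Python for illustration only, not the PHP menu callback from the patch; the function name `flatten_whitelist` is hypothetical, and the way the four URLs are divided between the two nodes below is an assumption.

```python
import json

def flatten_whitelist(node_bodies):
    """Merge the URL lines of several whitelist nodes into one flat list."""
    urls = []
    for body in node_bodies:
        urls.extend(body.splitlines())
    return urls

# Two node bodies; which URLs belong to which node is assumed here.
nodes = [
    "http://jqueryui.com/download/jquery-ui-*.custom.zip\n"
    "https://github.com/jquery/jquery-ui/zipball/*",
    "https://github.com/twitter/bootstrap/tarball/*\n"
    "https://github.com/twitter/bootstrap/zipball/*",
]

print(json.dumps(flatten_whitelist(nodes)))
```

A consumer then only has to loop over one array and never needs to know which node an entry came from.
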
mikey_p’s picture

Status: Needs work » Needs review
mikey_p’s picture

Attached file: status new, size 12.68 KB

Added some of the basic variables in hook_install() for content type settings, comments, etc.

dww’s picture

Status: Needs review » Needs work

Great start. Minor nits:

- if drupalorg_packaging_whitelist_output() assumes json output (which it does) I'd rather "json" was in the name.

- ) ; extra space in there.

I'm also not thrilled with those paths for the URLs for these things, but I'd have to think about a better alternative.

Otherwise, all looks good on a quick skim. I was a bit confused as to why it's using a text field instead of just body, but mikey_p explained that's so that it can be a plain text area instead of using an input format. That makes sense to me, and since we've already got cck + text on d.o, there's no new dependency here.

mikey_p’s picture

Attached file: status new, size 12.68 KB

Fixed the two items you mentioned. I'm not sure if the URLs should live under a path starting with project/* (probably not) or something like drupalorg/* (probably best for the JSON, not sure about the views).

Leaving at needs work for the path items.

hunmonk’s picture

this patch now lives on the 779452-packaging-whitelist branch of drupalorg module, branched from 6.x-3.x. i installed this feature and took it for a test drive. stuff fixed along the way:

  1. the paths were inconsistent, some singular, some plural, all with underscores. they are now all singular with dash separators.
  2. there's a flag you can pass to preg_split that will filter out empty results, so i added that to keep empty lines from creeping into the json output.
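the flag in question is PHP's PREG_SPLIT_NO_EMPTY. a rough python equivalent of the same behaviour, just to show the effect (the newline pattern and the function name split_entries are assumptions, not the code from the branch):

```python
import re

def split_entries(body: str):
    """Split a node body into lines and drop empty pieces.

    Mirrors what preg_split() returns with PREG_SPLIT_NO_EMPTY: pieces
    that are exactly the empty string are removed.
    """
    return [piece for piece in re.split(r"\r\n|\r|\n", body) if piece != ""]

body = "http://code.jquery.com/*\r\n\r\nhttp://plugins.jquery.com/*\n"
print(split_entries(body))
```

note that a line containing only spaces is not an empty piece, so it would survive this filter unless it is trimmed separately.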

the main view, rss feed, and json output all work well, although i do wonder if the main view should be a table -- node preview seems like a bad format given that we only have two fields.

stuff that's still broken:

  1. the search box at packaging-whitelist doesn't seem to work, every time i tried it returned an empty result.
  2. the text field doesn't convert newlines to <br />, which makes multi-line entries annoyingly hard to read.

i'm ok with the current paths as they stand, not seeing much of a path conflict potential with 'packaging-whitelist' as our top level, but if somebody comes up with a better path that's fine.

hunmonk’s picture

ok, the search 'bug' was because i hadn't run cron to index the site. ;) but, since this is a drupalorg module, and drupal is using solr for search, shouldn't we put the search stuff in solr?

gerhard killesreiter’s picture

The more solr the merrier.

dww’s picture

Views + Solr don't play well together (at least not when we last looked). At this point, ease of development, deployment and customization (views) outweighs performance (solr) for stuff like this.

hunmonk’s picture

Status: Needs work » Needs review

after discussion w/ mikey_p, i converted the list of allowed urls from a cck textfield to a normal node body. this fixes the line break display issue, but does introduce one other issue: the default input format contains a URL converter, which happily converts the whitelist URLs into clickable links. i'm not sure if this is worth worrying about -- technically they are links, it's just doubtful that they'll work because most will contain wildcards. so do we care about this? if so, it seems we'd need to either create our own filter that doesn't contain the URL converter, or go back to the CCK text field approach which of course has the line break problem. note that the JSON output isn't affected either way, since we take the raw text.

i also converted the default view display to be a paged table with 50 rows, which i think suits this use case better.

these changes have been pushed to the 779452-packaging-whitelist branch.

hunmonk’s picture

working on #684788: Verify Library URLs against a White-list for drupal-org.make, it's become pretty obvious to me that we should just use full regexes for our whitelist entries (as opposed to the custom syntax that's been suggested earlier, eg. http://code.jquery.com/*), because:

  1. it fixes the formatting problem mentioned in #38: the standard input filter won't convert regexes for URLs, and we still get the line break converter.
  2. it simplifies the drupalorg_drush side of the code at #684788: Verify Library URLs against a White-list for drupal-org.make when verifying URLs against the whitelist: no extra code for special transformations or escaping, the incoming array of regexes can be used as is.
  3. using full regexes gives the whitelist maintainers more flexibility when defining entries.

the tradeoffs are pretty minor:

  1. whitelist maintainers have to know how to write regexes. however, they are pretty simple, almost always taking the form ^http://www\.example\.com/downloads/.+$, and i provide an example of that in the help text on whitelist nodes.
  2. they're slightly less readable on the packaging-whitelist view, but again, since they're not crazy regexes, it should be apparent to most people what they mean.

i believe this addresses the last big problem here. these changes have been pushed to the 779452-packaging-whitelist branch. if we do decide against this approach, it hasn't been much wasted time to put it in, and will be trivial to adjust.
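for anyone wondering what "used as is" means in practice, here's a minimal sketch of the check, in python rather than the actual drupalorg_drush PHP. the function name url_is_whitelisted and the test URLs are made up for illustration, and python's re is assumed to behave like PCRE for patterns this simple:

```python
import re

def url_is_whitelisted(url: str, patterns) -> bool:
    """Return True if the URL matches at least one whitelist regex."""
    # No wildcard translation or escaping step: each entry is already a regex.
    return any(re.search(pattern, url) for pattern in patterns)

# One entry in the form suggested in the help text.
whitelist = [r"^http://www\.example\.com/downloads/.+$"]

# Hypothetical URLs, used only to exercise the check.
print(url_is_whitelisted("http://www.example.com/downloads/library-1.0.zip", whitelist))
print(url_is_whitelisted("http://www.example.com.evil.test/downloads/x.zip", whitelist))
```

the ^ and $ anchors and the escaped dots are what keep an entry from matching look-alike hosts, so they're worth keeping in every entry.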

geerlingguy’s picture

@philbar - Can you please post your intention to be a whitelist maintainer over on #1360460: Populate initial team of 3rd party packaging whitelist maintainers?

dww’s picture

Status: Needs review » Needs work

We're going to want a 'packaging whitelist maintainer' role defined, too (as per #1360456: Finalize the criteria and process for the whitelist for external dependencies packaged with Drupal distributions and #1360460: Populate initial team of 3rd party packaging whitelist maintainers). I think it'd be cleanest if that role was included in this feature. We can worry about assigning the users to it, but it'd be nice to have the role in here from the start.

In the near future, we're probably going to want a view of all users from this role, so it'd be best if the role was "owned" directly by the feature.

Thanks,
-Derek

hunmonk’s picture

Status: Needs work » Active

added a 'Packaging whitelist maintainer' role to the feature, and gave it CRUD on all whitelist nodes.

dww’s picture

Status: Active » Reviewed & tested by the community
Issue tags: +needs drupal.org deployment

Did a final review on that branch, merged it and pushed to 6.x-3.x.

We can deploy this now, although distro maintainers won't be able to take advantage of it until #1365536: Switch distribution packaging system to use just drush core and drupalorg_drush and #1365538: Deploy drupalorg_drush and latest drush for distribution packaging system are done...

dww’s picture

Assigned: Unassigned » dww
Status: Reviewed & tested by the community » Fixed
Issue tags: -needs drupal.org deployment

This is now deployed. So, we've got the plumbing on d.o to create and maintain the whitelist. For example:

http://drupal.org/packaging-whitelist
http://drupal.org/packaging-whitelist/json

Neither of these are particularly interesting just yet, but that can change very quickly now.

The code in drush make to enforce the whitelist exists, but that's not yet deployed. So distro maintainers can't actually include code on the whitelist just yet (although that'll be happening in the next few days). If folks want to follow issues about that, they should follow:

#1365536: Switch distribution packaging system to use just drush core and drupalorg_drush
#1365538: Deploy drupalorg_drush and latest drush for distribution packaging system

Meanwhile, we can at least get started populating the whitelist on d.o. We've even already got an initial team of maintainers care of #1360460: Populate initial team of 3rd party packaging whitelist maintainers.

So, calling this particular issue fixed.

Yay!
-Derek

geerlingguy’s picture

Thanks so much for all the work on this, Derek!

I've filled in a few whitelist nodes (see links to views in comment above), mostly to see how things work, and to populate the issue queue a bit... hopefully I'm not too bold in adding a few whitelists without a double-RTBC, since the whitelisted projects are some of the big ones.

I'm hoping we have the largest projects whitelisted and ready to rock and roll once we get the new make system working, because I know I'd like to modify an installation profile I'm building to include all necessary code asap!

webchick’s picture

Yayyyy! Exciting stuff. :D

rickvug’s picture

Very exciting! Thanks to all - this will be a massive boon for distros. I've added requests for a few more popular libraries (well, at least those that are truly open source). The list at http://drupal.org/project/issues/drupalorg_whitelist is shaping up nicely.

rickvug’s picture

Issue summary: View changes

Removing summary references to process and other stuff. Let's keep all that summary text in one place: /node/779440

geerlingguy’s picture

Issue summary: View changes

Updated to incorporate what actually went down...

juan_g’s picture

Some complementary issues (docs, devs...) remain in the Distribution Packaging initiative, but the tasks in #44 have also been marked fixed by dww. He says:

Now done and deployed! We're now supporting patches and external libraries (so long as they're at http://drupal.org/packaging-whitelist).

The start of a new era... Thanks, and congratulations all!

Status: Fixed » Closed (fixed)
Issue tags: -drupal.org distribution blockers

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary: View changes

Added link to the whitelist.