To facilitate reviews of patches from non-technical users, especially on things like themes, CSS changes, usability issues, accessibility issues, etc., it'd be really awesome if there were a nice and easy "Download this whole thing as a package and test it out" button on each issue reply that contains a patch. Kind of like this (better wording/placement welcome):

Download project+patch

I'm not really sure of the logistics of how this would work, since I don't think we'd want ftp.osuosl.org to store 500,000 copies of Drupal core + XXX.patch tarballs... I'm picturing something like compiling the tarball in "real time" when it is asked for by the reviewer, and having it expire after 24 hours or something. We'd have to block these links from Google (perhaps only making them available to authenticated users) so we didn't put undue stress on the server.
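
To make that concrete, here's a purely hypothetical sketch of what an on-demand, authenticated-only download callback might look like (Drupal 6-style hooks assumed; the patchpkg_* names are made up, and patchpkg_build() is hand-waving over the hard part):

<?php
// Hypothetical sketch only: build the tarball when a reviewer asks for
// it, instead of pre-generating 500,000 copies.
function patchpkg_menu() {
  $items['patchpkg/%/%'] = array(
    'page callback' => 'patchpkg_download',
    'page arguments' => array(1, 2),
    // Authenticated users only, which also keeps search engines out.
    'access callback' => 'user_is_logged_in',
    'type' => MENU_CALLBACK,
  );
  return $items;
}

function patchpkg_download($nid, $fid) {
  $cache = file_directory_temp() . "/patchpkg-$nid-$fid.tar.gz";
  // Reuse a cached copy for 24 hours, then rebuild on the next request.
  if (!file_exists($cache) || filemtime($cache) < time() - 86400) {
    patchpkg_build($nid, $fid, $cache); // checkout + apply patch + tar
  }
  file_transfer($cache, array(
    'Content-Type: application/x-gzip',
    'Content-Disposition: attachment; filename="' . basename($cache) . '"',
  ));
}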

I don't even know if this is really possible, or if this is the right issue queue if it is, etc. But this has come up as a feature request over at the discussion about moving drupal.org to a distributed version control system, since apparently GitHub has this feature (see the "Download source" button in the upper-right corner).


Comments

dww’s picture

I'm not sure what's the right queue to discuss this in, either, but here's fine. ;) Could also be considered part of cvs.module, versioncontrol_project (if only we were using that), project/project_release, pift/pifr, and/or d.o infra stuff... But yeah, let's talk here for now.

Yeah, there are a lot of potential issues here, but they're all fairly easy to solve. You enumerated most of them. The only other one I'd add is this:

A) What version of the project do you presume to include in these tarballs? Obviously, the version of the issue at the time the comment is posted would be a good first approximation, but what about backports, etc.? Like the test bot, we'll have to do some magic based on parsing the filename, too. That's even more complicated for contrib. What about -dev releases? Do we just grab the end of that branch when the tarball is requested? Do we always grab the state of the branch at the time the comment was posted? What about patches that don't apply? Etc. All of this is completely independent of the $(D)?VCS du jour.
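
To make "magic based on parsing the filename" concrete, I mean something like this purely illustrative parser (not the actual testbot rules, which would need to handle many more cases):

<?php
// Illustrative only: pull an issue nid, comment number, and optional
// core-version hint (e.g. "D6") out of a patch filename like
// fix-silly-bug-121343-32-D6.patch.
function patchpkg_parse_filename($filename) {
  if (preg_match('/(\d+)-(\d+)(?:[-.](D\d+))?\.patch$/', $filename, $matches)) {
    return array(
      'nid' => (int) $matches[1],
      'comment' => (int) $matches[2],
      'core' => isset($matches[3]) ? $matches[3] : NULL,
    );
  }
  return FALSE;
}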

Otherwise, yeah, the potential concerns are:

B) Do we just do this on demand or do we always create them as patches are uploaded?

C) When/how do we do garbage collection?

D) What kinds of ACLs do we want on these links for blocking search engines etc?

E) (Unimportant in the scheme of things, but perhaps worth thinking about): What's the filename convention for the resulting .tar.gz and .zip files?

Oh, and one other thought:

F) Doesn't the test bot already have to do something like this, anyway? If so, couldn't it just be responsible for this? It's already altering the display of these patch download tables to inject test results. Why doesn't it have its own place to temporarily stash tarballs of the thing it tried to test, and inject links to those, too?

webchick’s picture

Project: Project issue tracking » Project Issue File Review
Component: Issues » Code

Actually, given the questions you brought up in A), this does indeed sound like pifr's territory (I hope I got that right...)

This would allow us to:
a) Re-use the same issue metadata logic that testbot uses to determine which branch to test against, rather than re-writing that logic.
b) Re-use the existing filename exemption rules of testbot (now that testbot works for contrib, I figure we must've figured those out, though I'm not sure what they are or where that is documented...)
c) Only generate download links for patches that are actually "needs review" and both apply and pass testbot.

I'm not sure of B, C, and D, and would need to defer to more "infrastructurey" people. For E, I was thinking something like:

[project_short_name]-[patch_file_name].tar.gz / .zip

Ideally, we'd love to have more metadata in there, like the issue node ID and the reply #, but a lot of us put that in the patch filename anyway, so it'd end up getting repeated. And I know there's a feature request "somewhere" to auto-generate this meta-data in the patch filename on upload, so it seems like that'd be a better way to handle it.

dman’s picture

After lecturing at length about why such a feature is damn hard to do (why does everyone have to patch? [2006]), the testbot project has proven that at least the technical part of this is now possible in theory...
I still have concerns about the social part: "if you're not comfortable with patches, this code may hurt you".

Methinks [project_short_name]-[issue_num]-[reply_num].tar.gz shows provenance more clearly than misc filenames (I still don't have a good naming scheme for those). As these bundles are auto-generated from a very specific point and place, that is a consistent UID.

I think the demand vs benefit vs server load vs storage and spidering issues point at generation-on-demand rather than build-and-store.
As per point A, this also means that a patch is always fresh, not a snapshot in time... which cuts both ways.

webchick’s picture

Well, as with testbot, the really important thing about a patch is whether it applies to the latest version of the code. "In theory," testbot is re-testing patches every 24 hours to verify this fact (this has been broken since the migration to PIFR 2, though...), so I think it would make sense to compile and link the tarball each time a "Patch passed" message is sent back to the issue queue.

My only concern with that is how many GB of storage that ends up being, when we take into account every patch in every issue queue against every project. Maybe if we made it only the latest patch in the issue (whose filename matches the issue metadata)? Kind of a pain though, as we do sometimes play around with two different approaches in the same issue, or there are back-ports or whatever.

This is why I was thinking "on-demand" since that will make the problem mostly take care of itself. No (reasonable) human is going to click to download patch #1 in a 300-reply thread with 70 patches in it. If we cleared these out after 24 hours (or whatever the threshold is for when testbot should be re-testing patches to confirm they still work) hopefully this would take up a fairly reasonable amount of space.

Your filename convention is what I originally wrote, but it's nice for testers to have some context other than pure numbers, so if they grabbed a few of these before getting on an airplane, they could have some idea what they were testing. That's when I added the patch filename in as a 'trigger', but then you'd sometimes end up with file names like drupal-fix-silly-bug-121343-32-121343-32.tar.gz. Better too much info than not enough, though, I guess. :)

webchick’s picture

Incidentally, I want that 20 minutes of my life back. ;) What an awful car-wreck of a thread.

boombatower’s picture

Project: Project Issue File Review » PIFR Demo
Assigned: Unassigned » boombatower

I've had a long-standing idea to just build a live demo site on demand (for quick review, especially when away from a dev environment, and for UX people), and given another 10-20 hours I should be able to make it happen. I already have a start at pifr_demo, and it should be trivial to allow a download of the demo environment that will have already been built.

dww’s picture

PIFR Demo sounds cool, but it seems like an easy first step is at least the tarball for the tested code. Part of the benefit of decoupling these is that we can get folks with different environments to test the same code that way...

boombatower’s picture

I'll probably just make it two links, and/or a link in the setup site.

webchick’s picture

While I don't want to discourage progress here (this has been my life's dream since I took the spot as core maintainer), I am a bit concerned about this approach also. What happens if someone embeds some XSS in a patch to steal a tester's drupal.org login credentials? This is a real risk, since qa.drupal.org shares the same domain as drupal.org (this is how we ended up with http://drupalcode.org/ as our source browser).

Someone could still embed nasty code in a patch that I could download and end up running, but then they're going to steal my fake admin pass on my localhost test site, which is far less damaging.

boombatower’s picture

The demo server will definitely not be running on qa.drupal.org. It may not even have a domain name associated with it. We were thinking of the old t.d.o server... and we can easily use a different domain or a straight IP.

The current design involves a separate computer... I never want patched code running on the main qa.d.o server for fear of bringing the central server down and such (since it cannot be firewalled and such). I was also planning on ensuring each person gets their own site with unique login credentials (most likely auto-login with a crazy password they can change if they want to test login or whatnot). Not sure if we want to run some sort of demo module or something, but the site will also only be around for an hour or so before it gets fried.

webchick’s picture

Ah, should've known you'd already have thought of all that. :)

dww's concerns about the lack of testing on multiple systems are valid, but this would still be insanely valuable for bringing non-techie reviewers (who are often the best kind) into the core (and contrib!) team.

boombatower’s picture

My thought at the moment is to just provide the tarballs during the build always, and then I just need to figure out how to link to them.

boombatower’s picture

Status: Active » Needs review

Best filename I could come up with, given the limited information on the client, that would still be unique:

date('Ymd') . '-' . $this->test['test_id']
example:
20100209-2.tar.gz
20100209-2.zip

Upon completion of generating the environment, the user is presented with a download prompt, or a button to log in.
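
For the curious, the archive step is roughly this shape (a simplified sketch, not the exact patch; pifr_demo_package() is a made-up name, and $directory is the built environment):

<?php
// Simplified sketch: shell out to tar and zip to package the built
// environment under the date('Ymd') . '-' . $test_id filename.
function pifr_demo_package($directory, $filename) {
  $parent = dirname($directory);
  $base = basename($directory);
  exec('tar -czf ' . escapeshellarg("$parent/$filename.tar.gz") .
    ' -C ' . escapeshellarg($parent) . ' ' . escapeshellarg($base));
  exec('cd ' . escapeshellarg($parent) . ' && zip -qr ' .
    escapeshellarg("$filename.zip") . ' ' . escapeshellarg($base));
}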

boombatower’s picture

Status: Needs review » Fixed

Committed.

Now we just need to deploy this baby.

dww’s picture

Status: Fixed » Needs work

Nice to see this progressing here! A few concerns:

A) Excuse me if I don't understand the PIFT + PIFR architecture since 2.0, but PIFR Demo is part of the PIFR client which runs on the test clients, not on the test server, right? If so, looking at the diff you committed (http://drupal.org/cvs?commit=326234 is what I found), it appears that all that's happening is some .tar.gz and .zip files are getting created on the test clients. Those files need to be transferred back to the test server for them to be useful. Otherwise, what about opportunistic test clients "in the cloud" that come and go? These tarballs need to persist longer than a given test client is alive (although not forever).

B) Speaking of not forever -- what about garbage collection?

C) Re:

Best filename I could come up with given limited information on client and that would be unique:
date('Ymd') . '-' . $this->test['test_id']
example:
20100209-2.tar.gz

Really? Ugh. ;) Can't this be based on issue nid and comment id (and project shortname, for that matter)? Surely *something* knows the correlation between those ids and the patch file being tested, which is what makes updating the results at the issue possible. If the clients don't currently get any of that info, can we easily pass it along to them? Or, when the files are moved back to the test server, can't we rename them based on the things that the humans using this feature are going to want to know?

Thanks,
-Derek

boombatower’s picture

A) The clients can run in the cloud, but I originally intended (and still do) for users to be able to just live-demo the site... this client (pifr_demo) will need to have a visible IP, and thus the user can just download from there.

B) hook_cron() fries instances older than 1 hour (rough sketch after this list).

C) It would of course be possible, but as of 2.x I have removed all the Drupalisms from PIFR core, so only the plugins are Drupal-specific. As such it doesn't keep track of the issue nid and such. Since these downloads are created on demand, it shouldn't be too bad, but as of now even qa.drupal.org doesn't "directly" know the nid... although it stores a "link" field which is an absolute URL to the issue node that will also contain the cid. So I may be able to hack that a bit.
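
For B), the cron hook is roughly this (illustrative only; the {pifr_demo_environment} table and destroy helper here are simplified stand-ins, not exactly what's in pifr_demo):

<?php
// Fry demo instances older than an hour on each cron run.
function pifr_demo_cron() {
  $expire = time() - 3600;
  $result = db_query('SELECT environment_id FROM {pifr_demo_environment} WHERE created < %d', $expire);
  while ($environment = db_fetch_object($result)) {
    // Hypothetical helper that tears down the site and its database.
    pifr_demo_environment_destroy($environment->environment_id);
  }
}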

Assuming everyone is good with these details I'll just take a look at the file naming when I have a time block again.

dww’s picture

A) I don't think that'll do. If this were built on top of a real batch scheduling system, transferring output from the execute clients back to the server would be an integral part of the process. This feature is much less useful if it depends on *which* test client happened to test a given patch. I see no reason why the clients can't have a mechanism to send files back to the server and we host all the tarballs there.

B) Woah. For this to be useful, we need these tarballs to last O(days), not 1 hour. ;)

C) Again, even if the plumbing isn't Drupal-specific, the plumbing should have a way to send metadata along with the jobs... Anyway, short of that, yes, I think a hack is in order here, since filenames like 20100209-2.tar.gz are going to suck for the non-technical audience that's trying to use them.

boombatower’s picture

Status: Needs work » Needs review

A) They could easily have a mechanism for sending back files. However, I see the more useful side of this to be the live demo, as that makes it infinitely easier for a UX person or whoever to just see it without even having to set up Drupal or anything.

This also doesn't depend on which client, as this will only occur on-demand and be sent to our demo client.

B) We can leave them around longer, but it shouldn't take long at all to gin one up. Personally I don't see any reason to leave them hanging around, as they may become outdated at any time (any commit), and I don't want to build something to attempt to update them after commits... easier to just rebuild when needed.

C) It does have a method for sending metadata, and I've added the information we need in the patch. In IRC we decided that including all the necessary information in the filename is too much, so we'll just use a standard filename and add the verbose stuff to a text file (BUILD.txt).
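
To give an idea, BUILD.txt carries something along these lines (field names and values here are purely illustrative placeholders, not the exact committed format):

Project:  drupal
Branch:   HEAD
Issue:    http://drupal.org/node/123456#comment-654321
Patch:    fix-silly-bug-123456-78.patch
Built:    2010-02-09 14:32 UTC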

boombatower’s picture

Status: Needs review » Fixed

Committed. (as usual re-open if necessary)

dasjo’s picture

Status: Fixed » Active

boombatower, this seems like a great improvement to me, and I just wanted to ask for a quick explanation of the impact of your final commit.

From my understanding, this targets providing test links for patches on the PIFR demo platform. What about the original idea of providing a download link for the patched module? Is it included yet / planned?

I had just outlined a similar proposal at #719882: Enable patches to the average drupal user, including further thoughts on a patch_watcher module which could keep track of installed patches and therefore assist the user with upgrading such modules and provide feedback to the issue queue. Thanks to dww for pointing me here...

boombatower’s picture

We can rework this some if we want to generate them and store them... but this method generates the environment on the fly and installs Drupal so they can play with it. Upon initially viewing the site, they are prompted to download as tar.gz or zip if they so desire.

Since the branch the patch applies to can change at any time, I think it is more useful to build them on demand... not sure if we want a link that skips the install part (the slowest part) and just generates download links.

To summarize:
1) Link on patch
2) Wait for build
3) Prompt for download
4) Play with site if desired

dasjo’s picture

I believe that there are two user stories related to this issue:
1) I want to test this patch on my existing infrastructure that I am familiar with - download the patched module.
2) I want to test this patch online - on-the-fly generation of a test environment.

From my personal point of view, 1) is more appealing because, as a site builder, I encounter bugs while already having everything set up.
On the other hand, option 2) might come in handy for non-technical users who just want to help out in the issue queue.

boombatower’s picture

Which this already supports; it's merely a matter of implementation details.

Although, as I believe I have mentioned, #1 seems far less likely, since people who are building sites will most likely: a) be using a CVS checkout, and/or b) be able to patch easily.