as in http://drupal.org/node/109143 and in other cases, sometimes the packaging script fails to create a tarball and publish a release. often it's because of an invalid translation. it'd be nice if the packaging script somehow updated the release node to inform the owner about this, since most of the time, they wouldn't have access to the watchdog logs to see what's wrong.

Comments

dww’s picture

Priority: Normal » Critical

more and more support requests are coming in about this. particularly with 5.0 just out, there's going to be a wave of translation updates, and some of them will have errors resulting in broken packages. we desperately need to provide feedback and make this process not suck so much for both the admins trying to figure out what's going wrong and the translators themselves.

related issues:
http://drupal.org/node/111323
http://drupal.org/node/111902
http://drupal.org/node/92915

dww’s picture

in case i don't fix this before i leave for february (which is quite likely, i'm totally running out of time), here's how translators can try to find out why their translations aren't being packaged and published:

  1. make sure you have the GNU gettext-tools installed (msgcat, msgattrib and msgfmt)
  2. checkout a temporary copy of your translation from CVS (not the workspace you care about, something you're going to modify and throw away). the rest of these instructions assume 'XX' is the country code for your translation. so, for example:
    % mkdir /tmp/tmptrans
    % cd /tmp/tmptrans
    % cvs -d:pserver:anonymous:anonymous@cvs.drupal.org:/cvs/drupal-contrib co  -r DRUPAL-5 -d XX contributions/translations/XX
    
  3. cd into your tmp translation directory: cd /tmp/tmptrans/XX
  4. rm installer.po XX.po (if you have either of them).
  5. msgcat *.po > everything.po (this is likely the step where you'll find errors)
  6. msgattrib everything.po --no-fuzzy -o XX.po
  7. msgfmt --statistics XX.po

if either msgcat or msgattrib fail in your local copy, the package isn't going to be built on drupal.org, and your release node will not be published. the output (to stderr) of the final step (msgfmt) should be getting saved into the README.txt for your translation when it's packaged, but apparently that's not happening (http://drupal.org/node/100652).

that's about all the wisdom i have time to document right now.

thanhhai’s picture

Is there any tool to do this work instead of command line?

dww’s picture

no, sorry. if there was an easier way, that's what i would have documented.

hba’s picture

thanks dww, this helped me solve http://drupal.org/node/124630

dww’s picture

Since this recently came up on the devel list, I figure I should at least write down my previous thoughts on how this should work:

A) Make a new table called something like {project_release_package_errors}. It'd have nid (release node), and a text field for the error, and probably a datestamp. If we got really fancy, we could also have an int error code for the kinds of errors we face (invalid translation, invalid .info file, whatever else we come up with).

B) During package-release-nodes.php, if anything goes wrong, we generate record(s) in this table about it.

C) Also in package-release-nodes.php, if everything goes right (whatever problems used to exist have been corrected), we delete all the records from this table.

D) Perhaps we should also have logic in the packaging script that doesn't keep regenerating more records in the table if we already know something's broken. So, either we just update the timestamps if we're about to insert another record about an error for this release node we've already recorded, or just forget about the timestamps entirely and make nid, error_code the primary key for the table (thereby enforcing that we prevent duplicates).

E) When viewing an (unpublished) release node (which release owners and project admins can do), hook_nodeapi('view') would also check for records in this table in the DB, and format a nice little HTML table that's appended after the summary info about the release node, before the body. Of course, if the DB table was empty for a given release nid (as is frequently the case), there'd be no HTML table when viewing the release node.

Something like that... ;)

Anonymous’s picture

This looks doable. I haven't played with project yet but I'm game for at least trying. For the {project_release_package_errors} table I see nid, rid, errno, error and timestamp as the columns with PRIMARY being nid, rid. I also think emptying the errors for nid, rid as the first step of package-release-nodes.php makes more sense than updating or deleting an individual row.

dww’s picture

a) what's "rid" ? releases are nodes, so nid *is* a "release id", if that's what you have in mind.

b) the reason i was over-engineering how to preserve errors is in case errors change or something. but, sure, we can always just clear them all out and record the errors of the most recent packaging attempt -- that simplifies things, and is probably just as good for end-users.

certainly any error propagation is better than none. ;)

Anonymous’s picture

I was looking at http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/project/pro... for clues. Rid is the id if the project_release table. I'll take further look later today.

drewish’s picture

Status: Active » Needs work
StatusFileSize
new4.21 KB

here's a patch that adds a table and displays it on the node. i didn't quite know where to start hacking on package-release-nodes.php.

dww’s picture

@earnie: that's a very stale version of project.install. please take a look at the code in CVS HEAD right now. in particular, you want the project/release/project_release.* files.

@drewish: thanks for getting this started. I think "errno" is more self-documenting than "number" for the schema and variable names. Not sure about "message" -- i guess that's ok, in a table with "error" in the name. ;)

@all: it's still not clear if it's even worth storing timestamps for this. if we're just always going to clear out the old errors before we try to re-package, it's not obvious what good they serve. i guess it's a way to let the release owners know the last time we tried to package something (though, honestly, that'd probably be useful to just store in the {project_release_nodes} table and display for all release nodes, not just the ones with errors).

in terms of package-release-nodes.php -- you just want to look at 2 functions: package_release_core() and package_release_contrib(). basically, at the start of that, delete all records in the new table with the matching nid, and then if anything goes wrong in either of those functions (or the functions they call), we want to generate a reasonable entry in the errors table...

dww’s picture

On the devel list, killes suggested just sending an email to the release node owner for any errors. Here are some of the reasons I think this is a bad idea:

A) There might be multiple maintainers of a project, any of whom should be able to see these errors and try to correct them.

B) d.o project admins wouldn't see those error messages as easily (or not at all) when trying to help deal with support questions. I sure as hell wouldn't want to add infrastructure@d.o to the list of recipients of these emails.

C) Unless we added a DB table to remember who/when we send email, we'd be re-sending the same emails every 5 minutes indefinitely. We shouldn't be e-mail-bombing a project maintainer like that, even if they did something stupid. ;) Yet another good reason infra@d.o shouldn't be getting blasted with these messages, too...

D) The packaging script is already doing a lot of work -- I'm not sure we should add "blocking on sending emails" to its load. It already connects to the DB and updates some records, so I don't see the small cost of a few more queries in the same negative light as sending off emails (though maybe this is bogus and I shouldn't care about this point).

Even if (D) is invalid, (A), (B), and (C) are good enough reasons for me to say we should continue with the method I outlined above and that drewish has started implementing.

killes@www.drop.org’s picture

A) It would be also easy to mail all approved cvs committers.

B) The project maintainers could forward their mail in case they need help.

C) It should be easy to add a timestamp for "last time mail sent" and only send a mail every 24h.

D) I think D is indeed invalid.

We really shouldn't try to re-invent the wheel, email is an easily understood "push" concept.

dww’s picture

B) is bound to introduce delays. People won't forward these on their own, we'll have to ask, etc, etc. I'd rather just be able to look at a release node and see if something's wrong, instead of asking the owner to forward me an email.

C) is just about as much work as our other plan, which is already in progress.

i don't think this is re-inventing the wheel. if i create a release node, i expect to look at that release node and see information about that release node. i wouldn't automatically think "gee, this node still doesn't have a tarball, maybe i should check my e-mail in case something was pushed there"...

Anonymous’s picture

@dww: I was looking at CVS HEAD and the file I pointed to in #9 was updated only two weeks ago.

@drewish: I agree that number needs to be errno. We also should have error column for our description of errno. We can use message to contain the text from the process if there is any.

@killes: I don't think email is good enough. If a casual observer of the package sees the error he can help fix it. So displaying the error in the release node has valuable side effects. Besides what happens in the email bounce situation; there is then no notice by which we can achieve the desired effect.

Anonymous’s picture

Looking at package_release_nodes.php this morning I'm thinking that wd_err function will need to change to add the errno to the parameter list. Also I think we can use POSIX errors such as ENOTDIR, ENOENT, etc. If we decide that an error column is required that can be easily assigned via an array or it may be overkill. The wd_err function will also change to populate the table. Comments? Is there any time that we need to populate the error table outside of wd_err?

@dww: I apologize. I now realize link @ #9 wasn't a view of HEAD.

Anonymous’s picture

I'm attaching a patch for package-release-nodes.php incorporating my idea at #16. I've not populated a timestamp or error column. Are these needed?

drewish’s picture

<rant>grrr... all you people with errno and other cryptic variable names. you're the reason we've got aid bid pid and no one knows what they mean where. it's not 1970, we aren't waiting for 300 baud modems and we can fit more than 80-characters on a line. we can afford the luxury of saying what we mean. can't we just spell it out?</rant>

dww’s picture

@drewish -- i totally share your general point (see my similar rants about "pid" showing up dozens of times in the drupal schema and the $node object, with utterly different meanings, etc). however, in this case, i think you're mistaken. "errno" isn't exactly cryptic, and is certainly more clear than "number" in terms of what it's for. if you really want to bend over backwards to coddle the newbies, "error_code" would be ok, but "number" is totally vague.

All that said, it's not 100% clear we even need an errno anymore. ;) If we're just going to nuke all the records each time we're about to try to package something, and the table just displays the error messages directly, there's not really any reason for an errno. Let's just store and display the full error text and be done with it. Makes it easier to not have to worry about adding errno's to wd_err(), too. ;)

Re: timestamps -- see my comments in #11 above. Basically -- no, we don't need them in this table, but it might be worth adding a "last packaging run" timestamp to the {project_release_nodes} table -- see http://drupal.org/node/174819

So, unfortunately, Earnie, that means your latest patch at #17 isn't really going to help -- sorry you spent time on that. :( All we really need from there is the code in wd_err() to generate the new record in the table.

We also need to fix how we're keeping track of which nid we're working on -- your patch gets things wrong since you're relying on the $project_id handed to package_releases(). That's optional, and only specified when running the script manually. As I said above, you really want to look at the $nid we pass to either package_release_core() or package_release_contrib().

Finally, we need to clear out the records in the {project_release_package_errors} table when we're starting to process another release node.

All that said, this is actually pretty close to being ready. Thanks, all.

Anonymous’s picture

it's not 100% clear we even need an errno anymore Do we need a sequence number to keep the order straight?

sorry you spent time on that. :( Not a problem, I submitted the patch to spark communication.

We also need to fix how we're keeping track of which nid we're working on -- your patch gets things wrong since you're relying on the $project_id handed to package_releases(). I figured this was wrong. We need nid before we start populating errors though.

As I said above, you really want to look at the $nid we pass to either package_release_core() or
package_release_contrib().
Ah ha! I was wondering about that.

Finally, we need to clear out the records in the {project_release_package_errors} table when we're starting to process another release node. I ran out of time but wanted to submit what I had for discussion.

grrr... all you people with errno and other cryptic variable names. you're the reason we've got aid bid pid and no one knows what they mean where. it's not 1970, we aren't waiting for 300 baud modems and we can fit more than 80-characters on a line. we can afford the luxury of saying what we mean. can't we just spell it out?

I wasn't around for aid, bid, pid, etc; but respect their use as long as confusion isn't added by using the same mnemonic for more than one thing. Errno is a documented, long standing and known variable name; however, no need to worry we won't be using it. Number now needs to be removed or maybe needs to become SEQ or SEQUENCE.

I'll supply another patch my tomorrow morning, EDT, to not use ERRNO and to add the removal of errors.

dww’s picture

I see no need for a sequence number. We should just print all the errors. In the rather unlikely case that there are multiple errors for a given release and the order is really important for it to make sense, we can add a weight or sequence number or something. for now, let's keep it simple: nid, error_message. ;)

drewish’s picture

dww, i think the point of the sequence was for use as a primary key. if we're using nid as the PK then there can only be one error... which actually makes some sense, as it is now the script skips the package after the first error anyway. so maybe we should just have one error message for each release? if there's more than one error we could just concatenate lines together.

Anonymous’s picture

@drewish: There is no need for a primary key; just an index. We could update a global array that is written serialized to DB when done if it contains data. This would keep the rows to one and the sequence correct.

Anonymous’s picture

Now that was easier. :) The replacement patch for package-release-nodes.php is attached.

dww’s picture

Indeed, much easier. ;) A few minor gripes:

a) If we're now storing a serialized array of error messages in the DB, i'd rather the column was named "messages" instead of just "message". Either that or "message_array", though we don't normally do that in the Drupal schema, it seems...

b) Drewish's original patch and this one should be merged such that the schema completely agrees with the code, and that the display code matches how the packaging script stores the messages.

Almost there... thanks, folks.

Anonymous’s picture

Status: Needs work » Needs review
StatusFileSize
new5.69 KB

Combined patch attached. I'm not setup to test it.

drewish’s picture

Status: Needs review » Needs work

Took a quick look:

  1. I don't think we should suppress query warnings with @.
  2. In one of the queries you've got %s without quotes.
  3. I don't think pgsql likes the ` characters around fields.
Anonymous’s picture

Status: Needs work » Needs review
StatusFileSize
new5.68 KB

I've taken care of the issues pointed to by drewish.

dww’s picture

Status: Needs review » Needs work

Update and fresh install both work fine on MySQL. However, theme('table') doesn't work that way. The rows must be an array of arrays, not an array of values. So, when I tried to test this on a local site (manually populating the {project_release_package_errors} table with a record), I got a slew of php errors when I tried to view the release node...

dww’s picture

Status: Needs work » Needs review
StatusFileSize
new5.71 KB

Since the errors are only single values, semantically a table doesn't even make sense here. Seems like this is really an item list. So, here's a new version that uses theme('item_list') instead of theme('table'). Everything works fine for me, and I don't care *that* much about how this looks. ;) It'd just be nice to get a little confirmation the schema changes work fine on pgsql. Otherwise, I think this is RTBC.

dww’s picture

StatusFileSize
new111.51 KB

Screenshot of a release node with some bogus packaging errors manually inserted into the DB.

drewish’s picture

i'm really not a fan of the way the global $wd_err_msg is used in package-release-nodes.php. i can't find a program flow that would put two errors into the array. every call to drupal_exec(), drupal_chdir() or wd_err() that fails is followed by a return. it just seems kludgy the way it's setup now. i'd rather see wd_err() take an optional $nid parameter, if it's provided the error is logged into the table.

dww’s picture

drewish: i agree that the global array is a kludge. however, the problem with your proposed approach is that either we're back to a more complicated schema using sequence numbers (in which case we still need a global var for the sequence), or we make this rigid so that you can only ever store a single error for a given release. even if the code doesn't currently generate multiple errors for the same release, i'd like to keep it flexible so that it's easy to add more error reporting to this script in the future.

Anonymous’s picture

Status: Needs review » Reviewed & tested by the community

I have to admit that I don't much like it either. I used it because 1) it is used throughout Drupal and 2) it made the case of multiple errors without a sequence in the database possible.

dww’s picture

Status: Reviewed & tested by the community » Needs review
StatusFileSize
new5.8 KB

Re-rolled to resolve conflicts now that http://drupal.org/node/64100 was committed.

Anonymous’s picture

Status: Needs review » Reviewed & tested by the community

Looks the same except for a couple of the locations near the end of the patch. Go for it.

hunmonk’s picture

Status: Reviewed & tested by the community » Needs work
  • i don't see a need for a global -- we should be able to handle persistence in function wd_err() if we expand the function a little.
  • i really don't like just slapping the messages in the node body. these are fairly serious errors, so looping through the messages and setting error messages via drupal_set_message() seems like the right way to go. at the very least, if we're going to keep the item list, we should get some simple CSS class on there, and theme it appropriately.
dww’s picture

Assigned: Unassigned » dww
Status: Needs work » Needs review
StatusFileSize
new5.88 KB

- This is too important to hold up over the global in wd_err().

- drupal_set_message() for the errors gets weird when viewing a list of releases. i'd rather just keep the errors directly in the node body. however, i agree, some more urgent theming is required, so i just slapped a <div class="messages error"> around them, so that we'd re-use the theme's CSS for error messages. Works like a charm in local testing.

hunmonk’s picture

Status: Needs review » Reviewed & tested by the community

code style is good. i trust that it works as advertized :)

dww’s picture

Status: Reviewed & tested by the community » Fixed

Committed to HEAD and installed on d.o. For example:

http://drupal.org/node/128493
http://drupal.org/node/127836
http://drupal.org/node/129779
...

Woot! ;)

Anonymous’s picture

Status: Fixed » Closed (fixed)