This is an offshoot of Re-number (core) updates according to major version (of Drupal).

We encourage module maintainers to join forces with others, but the code of many contrib modules is in an out-dated state. So the rule to continuously count module updates beginning from 1 turns upgrading and co-maintaining of contrib modules into a pain.

Changing the numbering scheme for contrib modules allows new co-maintainers of a module to learn which updates were needed in the past and more importantly, for which version. You don't know that by looking at e.g. module_update_123.

Like the new numbering scheme for Drupal core updates, the numbering scheme for contrib modules should be changed to:

512#
||||
|||└ Continuous counter, i.e. starting from '0'.
||└ Minor version number of this module, i.e. '2' for module version '1.2'.
|└ Major version number of this module, i.e. '1' for module version '1.2'.
└ Major version number of Drupal this module's release is compatible with.

IMHO a contrib module won't need more than 10 database updates within a minor version. However, if we want to allow more, we certainly can append another # and start counting from '01'.

Examples:

  • module_update_4110 would indicate the first database update for version 1.1 that is compatible with Drupal 4.x.
  • module_update_5172 would indicate the third database update for version 1.7 that is compatible with Drupal 5.x.

After there is an agreement on a new numbering scheme for contrib modules updates, the documentation (http://api.drupal.org/api/5/function/hook_update_N) has to be updated accordingly.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

webchick’s picture

Personally this strikes me as way too complicated and also a bit brittle (more than 10 updates in a minor version, while not likely to happen, definitely could)...

I think a better approach would be to add in the ability to the update system for developers to associate meta data with their updates. I've no idea how this would work, but maybe something like:

module_update_47_info() {
  return array(
    'version' => 1.4,
    'description' => t('Adds a new column to the foo table so that we can track bar.'),
  );
}

module_update_47() {
  // Do the update stuff.
}

Then in the UI we could see something like:

foo.module:
[ Update 47: Adds a new column to the foo table so that we can track bar. [V] ]

dww’s picture

i'm in favor of a convention, and have been using one myself for my own contribs. it's similar to core, and sort of like what's proposed here, but i think more simple and easy to understand/use:

3 parts:
1 digit for core compatibility (jah be praised, only need 1 digit for that anymore)
1 digit for major release version (e.g. is this the 5.x-1.* (1) or 5.x-2.* (2) series of your module?)
2 digits for counting (10 is way too low of a limit, but 100 should be more than enough).

so, for example:

signup_update_5200()

this is the first update to get your DB ready to run signup 5.x-2.*

also, my current convention says that the 2nd digit should be 0 for initial porting to the new core API. so, when porting to 6.x, you'd start with:

hook_update_6000()

i.e. everything that's running 6.x of your module needs these updates. only updates that were specific to the 6.x-1.* series would need to happen as hook_update_61XX(), so that, for example, people could jump directly from 5.x-2.* to 6.x-2.*, and they'd get 60XX and 62XX, but not 61XX.

make sense?

that said, i'm not inherently opposed to the kind of meta data stuff webchick is talking about, but i think that's a totally separate issue. the reason core has these updates numbered in the 1000s is to handle CVS branching. if everything is totally sequential, and then we decide we need to update the schema to fix a security problem in 5.1 core, we're screwed. however, since all the 6.x updates are 1000 updates away, we've still got room to make some updates specific to 5.x that don't break everything when you eventually upgrade to 6.x.

the exact same problem exists in contrib (only, it's even more of a concern, since you can have N branches of your module that are all compatibile with 5.x core). that's the real reason we need a convention like this for contrib. the self-documenting clarity about what version a given set of updates is for is just a bonus.

thanks,
-derek

sun’s picture

+1 for the numbering scheme proposed by dww. It's clean, easy understandable and allows implementing updates to a major version at any time.

webchick’s picture

dww's approach sounds sane. It provides the advantages of sun's without requiring too much thinking to know what the next number should be. :)

And yes, my comment was off-topic. I hadn't looked at this issue closely enough to see that it was a documentation issue and not a core issue; please disregard.

kbahey’s picture

Structured codes (compound keys) are generally a bad idea: in the future, you renumber things, and everything breaks down.

Can we just have it simple? Major API + 3 digit sequence?

webchick’s picture

@kbahey: That's how it works for core; because each core version is a major version.. 5, 6, 7, 8... so update hooks are (from now on) numbered 6000, 6001, 6002.... until Drupal 7.x when it becomes 7000, 7001, 7002....

However, in contrib, Project module 5.x-1.x and 5.x-2.x could be *completely different* schema-wise. Therefore, you need the extra number to tell which version it's associated with:

5100, 5101, 5102... for 5.x-1.x. ... which is all most contributed modules will ever use.
5200, 5201, 5202... for 5.x-2.x.

webchick’s picture

Pardon me; I don't want to confuse this... 5.x schema numbers should NOT change. So please amend the last part of my reply to:

However, in contrib, Project module 6.x-1.x and 6.x-2.x could be *completely different* schema-wise. Therefore, you need the extra number to tell which version it's associated with:

6100, 6101, 6102... for 6.x-1.x. ... which is all most contributed modules will ever use.
6200, 6201, 6202... for 6.x-2.x.

sun’s picture

To clarify this a bit more: Existing schema numbers won't ever change. However, as soon as we have found an agreement here, all new schema updates may use the new numbering scheme. Please bear in mind that there are still modules that are not yet updated/branched for Drupal 5.

sun’s picture

Status: Active » Needs review
FileSize
1.89 KB

I've reworded dww's instructions a bit and updated the documentation of hook_update_N. Please review.

dww’s picture

Status: Needs review » Needs work

cool, thanks. however, you've got some typos:

+ * - mymodule_update_6100()
+ *   - This is the first update to get the database ready to run mymodule 5.x-1.*.
+ * - mymodule_update_6200()
+ *   - This is the first update to get the database ready to run mymodule 5.x-2.*. Users can directly update from 5.x-2.* to 6.x-2.* and they get all 60XX and 62XX updates, but not 61XX.,

update 6100 == 6.x-1.*, not 5.x-1.*...

sun’s picture

Status: Needs work » Needs review
FileSize
1.89 KB

Sorry.

Q: Are those <strong> HTML tags allowed? Regarding #5-#8 I thought it would be better to emphasize this a bit.

webchick’s picture

"Users can directly update from 5.x-2.* to 6.x-2.* and they get all 60XX and 62XX updates, but not 61XX."

Is this true? How would Drupal not hit all of the updates higher than the current number..?

sun’s picture

FileSize
1.95 KB

Enhanced description:
Users can directly update from 5.x-2.* to 6.x-2.* and they get all 60XX and 62XX updates, but not 61XX updates, because those reside in the 6.x-1.x branch only.

webchick’s picture

Ah, got it. :) Duh, sorry. :)

Now, two comments:

1. I'd link somewhere off to http://drupal.org/node/93999 for a "for more information" link about contrib version numbers... there might be a better page, but that seemed to be the best from what I could find.

2. I would reserve X0XX to core, and have contrib modules use only the X1XX, X2XX, etc. namespace. The text as it reads now seems to imply that for my first initial port from 5.x-1.x to 6.x-1.x I would number it 6000, but for any subsequent updates it would be 6101, 6102... which seems confusing.

webchick’s picture

Ah, got it. :) Duh, sorry. :)

Now, two comments:

1. I'd link somewhere off to http://drupal.org/node/93999 for a "for more information" link about contrib version numbers... there might be a better page, but that seemed to be the best from what I could find.

2. I would reserve X0XX to core, and have contrib modules use only the X1XX, X2XX, etc. namespace. The text as it reads now seems to imply that for my first initial port from 5.x-1.x to 6.x-1.x I would number it 6000, but for any subsequent updates it would be 6101, 6102... which seems confusing.

sun’s picture

FileSize
2.17 KB

I've just tested this with a local installation of API. It looks like API currently doesn't support embedded HTML (<strong>), equivalent Doxygen syntax (@b or \b) and links (<a> or simple URIs). So I've removed those tags and added a plain link list to related handbook pages.

dww’s picture

Status: Needs review » Reviewed & tested by the community

@webchick: "I would reserve X0XX to core, and have contrib modules use only the X1XX, X2XX, etc. namespace. The text as it reads now seems to imply that for my first initial port from 5.x-1.x to 6.x-1.x I would number it 6000, but for any subsequent updates it would be 6101, 6102... which seems confusing."

that misses the point that i made in #2 which sun clarified in #13.

here's a scenario that describes in more detail the case i'm worried about:

  • updates 600X live in HEAD while you're porting to 6.x core.
  • when you branch DRUPAL-6, you've got 600X in foo.install already.
  • say you create a DRUPAL-6--2 branch and start working on 6.x-2.*. you start adding 620X updates for new features.
  • meanwhile, you discover some critical security problem that can only be solved by means of a schema change (could happen). on DRUPAL-6 branch, that'd have to be update 6100.
  • however, sites that are already running 6.x-2.0 have schema_version == 6201 in {system} already, so they're not going to run 6100 again. furthermore, update 6100 only lives on DRUPAL-6 branch. so, on DRUPAL-6--2 branch, you'd have to add, say 6202 for this security update. plus, for all you know, the schema change would have to be different on this branch, since updates 6200 and 6201 might have already changed something related to what you'd try to do for the security problem. in fact, maybe the security problem doesn't exist in DRUPAL-6--2 due to the changes you already made there, and you *only* need the new update on DRUPAL-6 branch for 6.x-1.* users
  • now, someone who hadn't updated to 6.x yet decides to use this module. they look at the release history and decide they want the new features from 6.x-2.* branch. so, they need to upgrade from 5.x-1.* directly to 6.x-2.*. they want updates 600X (initial changes for migrating to 6.x that are shared by all versions), and 620X (updates specific to 6.x-2.* releases), but not update 6100, since that'd screw up their schema.

does that all make sense? it's kind of a weird case, but it certainly could happen, and the possibility of something like this is exactly why we need branch-specific update numbers in the first place. end users don't have to think about this crap, only module maintainers, since the schema_version in {system} will automagically do the right thing so long as maintainers understand and follow the convention.

IMHO #16 is now RTBC. however, if folks think it needs further clarification based on this explaination, feel free to set this back to CNW. perhaps this little scenario belongs in a handbook page somewhere, too.

sun’s picture

I also think this RTBC, but I'd like to see some feedback from more core developers.

sun’s picture

Assigned: Unassigned » sun
FileSize
2 KB

New patch against latest HEAD. We should definitely have this committed for D6.

bdragon’s picture

Status: Reviewed & tested by the community » Fixed

Committed to HEAD (finally) with slight modifications (I reflowed the looooong line.)

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.

sun’s picture

Status: Closed (fixed) » Active

Re-opening b/c of latest discussion on devel list.

To allow module maintainers to permanently use a module's major version as key identifier for a given API/feature-set, and to not create confusion among maintainers, we should add a second digit for the major version to the official docs.

Hence,
5.x-1.x becomes 50100 (was 5100)
8.x-12.x becomes 81200

Personally, I hope this will also encourage maintainers to not renumber their module's major version for each Drupal core compatibility (information), which is probably very confusing for end-users.

dww’s picture

As I wrote on the devel list:

What other software has the very first digit go up and the feature set go down? I'm surprised to hear you have this expectation, and I'm curious how wide-spread it is.

One example out of thousands: PHP went from 4.4.* to 5.0.* and no one thought "0 is smaller than 4, this must be an older feature set -- where can I find 5.4.*?"

catch’s picture

That seems reasonable to me.

dww’s picture

Furthermore: people advocating the "keep your major versions the same" approach keep referring to "API/feature-set" compatibility. Porting to core by definition involves sweeping API changes. I don't see how anyone can possibly claim that 5.x-2.x and 6.x-2.x of any contrib "have the same API". Modules that themselves provide an API, perhaps, but even then it doesn't really matter, because the plugin I wrote that works with your 5.x-2.x API module won't work in 6.x, either. I've got to port to D6, too, so we might as well all start over at 6.x-1.* once we do that.

aaron’s picture

How will this pan out in 2012 with Drupal 10? (Or maybe the mayan calendar reboot will happen and it'll all be moot...)

dww’s picture

@aaron: As I wrote on the devel list:

"B) Nothing about this prevents Drupal from going to version 10. At that point, we go past 9000 and start numbering things 10000. No problem, except a 1 sentence change to the docs. We don't need a delimiter for this."

redndahead’s picture

Here's my reply to dww on the list:

Here are my thoughts. 6.x just tells me what version of drupal it works on. I don't take that number into account as to what features that module has. The module version number tells me what version the module is on. Let's use CiviCRM as an example. It has it's own version numbers because it is not dependent on Drupal. Their next version is 2.1. I don't expect it to be 1.1 for D6 and 2.1 for D5.

Let's use Pear as an example of a module for PHP. I don't expect pear to be version 1.x for PHP5 and 5.x for PHP4.

Anonymous’s picture

What other software has the very first digit go up and the feature set go down? I'm surprised to hear you have this expectation, and I'm curious how wide-spread it is.

DRUPAL has a version which isn't dependent on the version of PHP. Module FOO has a version which isn't dependent on the version of DRUPAL. FOO-5.x-2.x is a convention that tells me that module FOO has released its second feature set and that feature set is compatible with version 5.x of DRUPAL. When FOO becomes compatible with DRUPAL version 6 the package is released as FOO-6.x-2.x to indicate that its second feature set is compatible with DRUPAL version 6. If it were to name the package FOO-6.x-1.x then we are indicating that the first feature set is compatible with DRUPAL version 6.

There are two version numbers, DRUPALs and FOOs.

One example out of thousands: PHP went from 4.4.* to 5.0.* and no one thought "0 is smaller than 4, this must be an older feature set -- where can I find 5.4.*?"

There is ONE version number and not TWO. The DRUPAL versioning is similar to PHPs with drupal-5.0, drupal-5.1, ... drupal-5.9, drupal-6.0, etc. The version of PHP doesn't affect the version of DRUPAL and neither should the version of DRUPAL affect the version of a module.

sun’s picture

Porting to core by definition involves sweeping API changes. I don't see how anyone can possibly claim that 5.x-2.x and 6.x-2.x of any contrib "have the same API". Modules that themselves provide an API, perhaps, but even then it doesn't really matter, because the plugin I wrote that works with your 5.x-2.x API module won't work in 6.x, either. I've got to port to D6, too, so we might as well all start over at 6.x-1.* once we do that.

Sorry, but this might be true for your module(s) and your API(s), but it is just the case for them then. For example, Wysiwyg API implements an API that (until now) did not change between D5 and D6 (and probably D7). Contrib modules implementing v0.x of this API do not have to change their code when porting to a new major version of Drupal core.

Still, 5.x- or 6.x- is just a Drupal core compatibility information. Most often, the API/feature-set of a module does not change during a port, so what has been a feature of "Awesome module 2.x" will still be available in "Awesome module 2.x". In the end, that is what end-users would expect, IMHO.

Also, those PHP examples are invalid - they would be valid, if PHP would renumber their major version for each new major version of Apache, for example. But AFAIK, they don't do that ;)

Anonymous’s picture

Furthermore: people advocating the "keep your major versions the same" approach keep referring to "API/feature-set" compatibility. Porting to core by definition involves sweeping API changes. I don't see how anyone can possibly claim that 5.x-2.x and 6.x-2.x of any contrib "have the same API".

The version of the API I use doesn't dictate the version of the module I provide. 5.x and 6.x are the version of the API I use and version 2.x is the feature set of the module I provide. There are two version numbers not one, Drupals and the modules.

dww’s picture

More text I sent to devel...

On Aug 7, 2008, at 9:57 AM, Morbus Iff wrote:

I want the latest version of the *module*, not the latest version of Drupal core and then the module.

That's the point of embedding the core version in the version strings of contrib in the first place. What you want is fallacy. You can't possibly not care what version of core you're using, since Drupal is inherently non-compatible across versions of core. In Drupal contrib, the world revolves around core. I just chose to embed that indisputable fact into the version strings themselves.

Re views: There's no 5.x-2.*, nor a 6.x-1.*. So "Views 2" is synonymous with "D6 views". It's irrelevant. Panels would be a better example, since at least there's a 5.x-1.* and a 5.x-2.* (though no 6.x-* yet).

OG is a good counter-example. Moshe is at 5.x-7.3 now, and (thank god) he chose to call the first 6.x compatible version 6.x-1.0. Otherwise, at his current rate, he would have been at 7.x-21.0 in 2 years.

Part of the difference is how Earl vs. Moshe chose to make use of major versions in their own development workflows. Roughly speaking, every coherent set of new features and bug fixes resulted in a new official release of views (5.x-1.5 -> 5.x-1.6), and resulted in a new branch/major version for OG (5.x-6.* -> 5.x-7.0). Given those development styles, it makes perfect sense that Earl puts a lot more weight into his major versions, and wanted to preserve them, while Moshe does/did not.

The main point is this: no matter what we do, the version strings alone will confuse people. No matter what convention anyone follows, some people will think "ahh, just as I expected" and others will think "WTF? Where did xxxxxx go?" Hence the need for clearly stated intentions in your release notes and on your project pages.

Plus, the tools can't and shouldn't enforce a certain development/release process on all developers (I'm sure you're all saying "amen" at that). I'm trying to document what I think are reasonable practices and try to make sure that the tools don't get in the way of people doing sane things, but ultimately, it's totally fine with me that Earl and Moshe are using the tools in completely different ways.

I don't think telling Earl: "No, you *MUST* call it views 6.x-1.0" is any more helpful than telling Moshe: "No, you *MUST* call it OG 6.x-8.0".

Can we please forget about trying to force one model on both of them?

Thanks,
-Derek

catch’s picture

I don't see how anyone can possibly claim that 5.x-2.x and 6.x-2.x of any contrib "have the same API"

This is probably the best example I know of: http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/porterstemmer/ - afaik, the D6 port changed one line. I'd imagine a lot of filter modules like bbcode etc. have very few changes as well.

sun’s picture

I've just found out that Inline is a much better example than originally thought:

- Inline v1.x is a rather "dumb" filter module.
- Inline v2.x becomes an enhanced macro processing and filtering API that will ship functionality of 1.x in an optional sub-module.

...and while 2.x is still under development, some users want to see 1.x ported to Drupal 6 (#193676: Port Inline to 6.x), which means that there will be 1.x AND 2.x, compatible to different Drupal versions, but providing different features. Re-numbering the major version would not be possible here.

However, to come straight to the point: Does anyone disagree with adding another digit to module update numbers for the major version of a module, as proposed in #22?

Please note that the sole purpose of this change is to reduce confusion among developers.

Anonymous’s picture

However, to come straight to the point: Does anyone disagree with adding another digit to module update numbers for the major version of a module, as proposed in #22?

Please note that the sole purpose of this change is to reduce confusion among developers.

I don't see any confusion, nor do I see any removed. ;p

Any confusion is that the documentation doesn't explain that the N is merely a sequence and the suggested format isn't enforced by code. All that is required is that subsequent updates have an N sequence higher than the previous number. So I as a module maintainer may simply decide to number N beginning with 1 and increasing by 1 for subsequent values regardless of any version of Drupal or the module; which of course is what the documentation used to say.

To clear the confusion a description of why the suggestion to use versions to number the sequence number makes good sense should be made. Something like:

The N is a simple sequence number whose subsequent value must be least one higher than the previous update sequence. In order to help you keep track of what changes go with which versions we suggest the following scheme.

webchick’s picture

@earnie, you are now on the docs team. Feel free to edit that page so that it makes sense to someone coming into it new like you.

dww’s picture

Note: the purpose of this convention is not just to "help you keep track of what changes go with which versions". The point of the convention is to make it possible to safely have DB updates on different branches of development. See #64212: upgrade system needs to support branches. This isn't about convenience and readability, it's about correctness.

Other than that, I agree with Earnie. I don't see how adding 2 digits helps "reduce confusion" at all. IMHO, all of the energy spent on this topic really boils down to a 2 sentence change to the docs, something like:

If your contributed module reaches major version 10 for some reason, you should move to a 5 digit numbering scheme: 1 for core version, 2 for your major version, 2 for a sequence number. Once you do this, you must ensure that all future updates always use the 5 digit convention, even if you reset your major version for a later version of core.

It could use a little polish, but that's basically all we have to say.

salvis’s picture

@dww: Your post #17 was illuminating, but there's one situation that I haven't been able to figure out: if I maintain a module for D5 and D6 in parallel, and I want to make the same database change for both versions — if I name this update 5117 and 6105 and someone updates the D5 version and then later upgrades to D6, then they're in trouble.

It seems like sticking to equal (low, old-style D5) numbers is the better way as long as the versions are maintained in synch, or am I missing something?

In the scenario you exposed in #17, you conveniently assumed that your user would "upgrade from 5.x-1.* directly to 6.x-2.*", but the same issue arises when he decides to go to 6.x-1.* first and then to 6.x-2.*...

Anonymous’s picture

I was researching more and found this tidbit http://drupal.org/node/150220 which needs some work on its examples to bring the examples in line with suggested N use. I also would like to state OT that schema API needs some work to help with the issue. I.E.: Why can't db_drop_column handle the instance of the missing column as in the example given in the document?

@salvis: N from hook_update_N isn't the same as the DB schema version and while hook_update_N was supposed to help take care of that, as in the examples given in the above link you still have to know what version the DB exists while updating the schema. I propose that use of a variable_get of mymod_schema_version be used to filter the actual execution of the code within the hook_update_N function.

function MYMOD_update_5117() {
  if (variable_get('mymod_schema_version', 0) < 2)) {
    $ret = array();
    db_drop_field($ret, 'T', 'b');
    variable_set('mymod_schema_version', 2);
    return $ret;
  }
}

function MYMOD_update_6105() {
  if (variable_get('mymod_schema_version', 0) < 2)) {
    $ret = array();
    db_drop_field($ret, 'T', 'b');
    variable_set('mymod_schema_version', 2);
    return $ret;
  }
}

Otherwise the user must upgrade in order as is the suggested best method.

Anonymous’s picture

@dww: I'm sorry but I'm having trouble wrapping my brain around

The point of the convention is to make it possible to safely have DB updates on different branches of development.

I am not understanding how the convention helps resolve "DB updates on different branches of development". The only method I know to really resolve this is to have a version id for the schema change. Now that we have a Schema API we should allow the Schema API to resolve the issues of what schema patches need to happen to the DB. The Schema API would save the current version of the schema on install/update and during the update be able to resolve the differences between the two. This way a developer or a user could upgrade/degrade/regrade gracefully without having to deal with incompatible upgrades between branch differences. We would still need hook_update_N to take care of things like data conversions, insertions of configuration data, etc.

That said, I don't disagree with the convention for correctness. I just don't see the avoidance of collision with it.

betz’s picture

Project: Documentation » Drupal core
Version: » 7.x-dev
Component: Documentation in CVS » documentation
sun’s picture

Priority: Normal » Critical

#282084: Module update numbering scheme question has been marked as duplicate of this issue, resp. follow-up.
http://drupal.org/node/257743#comment-967306 contains a patch for Image module that tries to use Drupal's VERSION constant in an intelligent fashion, but after tinkering about this some more, even the VERSION constant does not help us out.

If a module is maintained for 5.x and 6.x in parallel, it contains module_update_5123() and module_update_6123(). If users will be upgrading from 5.x to 6.x, all of those updates that have already been executed for 5.x, will be executed again. For Drupal's schema system, anything above 5123 is higher and needs to be executed. Now, let's imagine a module update that performs no additional checks on the data:

/**
 * Change variable name from foo_values to foo_settings.
 */
function foo_update_5123() {
  $ret = array();
  $myvalues = variable_get('foo_values');
  variable_set('foo_settings', $myvalues);
  variable_del('foo_values');
  return $ret;
}

The same function has been implemented in 6.x as foo_update_6123().

Am I the only one who thinks this can seriously break Drupal sites during an upgrade?

kbahey’s picture

This is one of the reasons why structured numbering schemes are bad.

I just had this in a module that has a 5.x and 6.x version.

So, I did this:

function foo_update_2() {
 ... some changes to the database schema
}

Both the 5.x branch and 6.x branch have the same function, with the same number (2).

If someone is at update 1, and runs the update while on 5.x before upgrading to 6.x, or only runs the update after they upgrade to 6.x, the result is a good schema, regardless.

If we go for 5xxx and 6xxx then we can either do an update twice or miss one, and each can have dire consequences.

Down with structured numbering.

sun’s picture

@kbahey: It seems you are not yet aware of http://api.drupal.org/api/function/hook_update_N ... we are already encouraging and propagating this module update numbering scheme, and at least all of my modules implement it already. Of course, all of those are still maintained for 5.x and 6.x in parallel. We cannot re-number existing module updates.

kbahey’s picture

I am aware of that, and object to it as well. However, the issue in core is not the same in contrib. Core is under the scrutiny of a lot of eyes, gets a lot more testing, and we don't go and add hook_update_X() to stable versions that overlap with the current development version and clash with it in the future (at least not very often). So, I can reluctantly live with it in core.

In contrib, it is totally different, given the parallel updates for 2 or more stable versions, and hence more probability for errors. A sequential numbering scheme is far safer than the structured numbers here, since we can never the same update twice on a particular site.

Hope this is clearer.

Anonymous’s picture

I understand @dww's point of "convention of correctness" as I said in #40. I also give an example in #39 to overcome the collision @sun gave as an example in #42 using variables.

This "convention of correctness" is just that and I support it. But we don't overcome collisions because of it. And we must control the updates by some other means than the sequence number of a function. You must know if the change for the update has been completed or not. That means controlling the update with data in the variable table and using that data to filter the action for the update.

Using @sun's example foo_update_5123 and foo_update_6123:

/**
* Change variable name from foo_values to foo_settings.
*/
function foo_update_5123() {
  $ret = array();
  $myvalues = variable_get('foo_values');
  variable_set('foo_settings', $myvalues);
  variable_del('foo_values');
  variable_set('foo_update_23', TRUE);
  return $ret;
}
/**
* Change variable name from foo_values to foo_settings.
*/
function foo_update_6123() {
  $ret = array();
  if (!variable_get('foo_update_23', FALSE)) {
    $myvalues = variable_get('foo_values');
    variable_set('foo_settings', $myvalues);
    variable_del('foo_values');
    variable_set('foo_update_23', TRUE);
  }
  return $ret;
}

The action filter isn't needed for foo_update_5123 because it is earlier in the sequence.

sun’s picture

Hm. This is very dirty, but could be a way to circumvent the set-in-stone-problem we have now.

function fu_update_5123() {
  $ret = array();
  // Schema / variable update stuff goes here...
  // Note: If not removed in 6.x, this will be executed for 5.x users as well as
  // 6.x users who did not update to the latest 5.x release first.

  variable_set('fu_schema', 123);
  return $ret;
}

function fu_update_6123() {
  $ret = array();
  // Prevent this update if it was executed already.
  if (variable_get('fu_schema', 0) >= 123) {
    return $ret;
  }
  // Schema / variable update stuff goes here...

  variable_set('fu_schema', 123);
  return $ret;
}

So in other words, we would additionally store a major module version specific schema version that is unrelated to Drupal core compatibility and check this schema version in front of executing updates.

We might want to check whether we could implement this core-independent schema version as an additional $ret array value, too.

Anonymous’s picture

@sun: That's a great method for schema changes but if your changing data only, as in the previous example, you're not changing the schema.

sun’s picture

@earnie: Whether an update changes data or the schema of data is unimportant. Important is that it must not be executed twice, just because the update system does not know that update no. 5123 == 6123.

Thought some more about fixing this regression by extending hook_update_N() and will try to come up with a patch - shall I create a new issue for it?

Anonymous’s picture

Thought some more about fixing this regression by extending hook_update_N() and will try to come up with a patch - shall I create a new issue for it?

Since this issue is about ``documentation'', IMO, yes.

sun’s picture

Assigned: sun » Unassigned

I have tried to come up with a fix for update.php/install.inc in core, but after further investigation it turned out that the envisioned workaround cannot be implemented.

In the meanwhile, I received the first complaint about updates being run duplicate: #315426: Incorrect update function names
Luckily, those Journal module updates did not perform major changes and just lead to PHP/database warnings.

To be honest, I'm running out of ideas how we could fix this bug without touching all modules in contrib that implemented the new update numbering scheme.

I'm pretty sure that we would not have this problem now, if we had not added the Drupal core version compatibility as prefix, basically as outlined above. Adding this prefix like in Drupal core was the primary cause for this issue and change, but it completely broke hook_update() for contrib modules that are concurrently maintained for multiple versions of Drupal core. We added this bit to allow for fundamental schema changes between versions of Drupal core (i.e. update_6000()) as well as intuitive correlation of module updates to Drupal core versions (f.e. to quickly identify obsolete updates). If the prefix was not present, we would not have those additional benefits, but the update system would still work, AFAICS. However, we cannot revert this change anymore. Note that I supported this change in the first place, and this should be treated as analysis only; it is definitely not meant to insult anyone.

salvis’s picture

It seems that those of us in contrib who stuck to low, old-style hook_update_N() numbers are better off...

Anonymous’s picture

@salvis: Yes that works especially in light of the new update screen for D7. However, it can be confusing to those who use your module as an example of how to code and try to follow along with the documentation.

@sun: The idea is good other than it causes another issue.

kbahey’s picture

@salvis "It seems that those of us in contrib who stuck to low, old-style hook_update_N() numbers are better off..."

I agree, and I am sticking with this, because it allows updates to span versions and hence less probability of missing updates, or running them twice.

Simplicity is a virtue ...

dww’s picture

@salvis: "It seems that those of us in contrib who stuck to low, old-style hook_update_N() numbers are better off..."

Until you run into the problem that this numbering scheme solves. :(

Frankly, the root of the problem here is that we were lazy in core when we first discovered this problem (#64212: upgrade system needs to support branches) and we never fully thought through the potential problems since core's own schema was a special (trivial) case of what update.php has to potentially deal with. The appeal of a "solution" that didn't change the hook_update_N() API was so great that we just went with that. Then we tried to apply this scheme in contrib to avoid the same problem, and yet it appears that by solving 1 problem we introduced another.

<claim class="bold unsubstantiated">
I need to think through the case that's currently failing more, but part of me thinks that it's being caused by people committing hook_update_N() implementations to the wrong branches at the wrong times.
</claim>

Anyway, I'm not yet going to agree that going to old-style N numbering is a good idea.

Either way, we should probably bump #64212 to 7.x-dev and propose an API change that will solve all cases of this problem in the future, based on our experiences here.

salvis’s picture

@dww: I need to think through the case that's currently failing more, but part of me thinks that it's being caused by people committing hook_update_N() implementations to the wrong branches at the wrong times.

Not at all. I'm maintaining a D5 version in the DRUPAL-5 branch and a D6 version in HEAD, and I want to keep their databases synchronized. IOW, I always create a hook_update_N+1() function for both versions at the same time. They have the same N+1, they do the same thing, and it is essential that they're run exactly once, either in the context of the D5 version (if the site is still on D5) or in the context of the D6 version (if the site has been upgraded to D6 before running the update). If the site is upgraded afterwards, the hook_update_N+1() must NOT run again!

The issue that was addressed with the structured numbering involves maintaining versions with dissimilar schemas in parallel, either under different Drupal versions or within the same version, but that is probably the rarer case within contrib.

dww’s picture

No, the issue that was addressed with the structured numbering involves the case where an older branch needs *ANY* schema changes at all. I don't think that's particularly rare.

Anonymous’s picture

So with this guideline A good rule of thumb is to remove updates older than two major releases of Drupal. in the documentation we could simply say, don't repeat the changes for more than one Drupal version. If I add a column for version 5200 of my schema then I don't repeat the hook_update_N for 6200.

sun’s picture

@earnie: You're missing the whole point. The maintainer of a contrib module needs to implement 520x and 620x if he/she maintains the module concurrently for Drupal 5.x and 6.x. What you are proposing would be equal to adding a intro notice "Don't use this numbering scheme, if you concurrently maintain a module for multiple branches." to the docs.

Anyway, as long as we do not have hook_upgrade(), hook_update_V_N(), or any other solution, the only workaround for this issue is to add sanity checks to each update function, i.e.

$ret = array();
if (!is_null(variable_get('new_variable', NULL))) {
  return $ret;
}
// Perform updates.

On another thought, one could even argue that forcing developers to have such validations in update functions is an improvement... ;)

Thoughts?

Anonymous’s picture

No, I'm not really misunderstanding. I understand that if you leave 5200 in the .install file for your D6 branch it will be executed if it hasn't already. There is then no need for 6200 because it should never be executed based on the filtering you provided it. I understand the perception here for needing 6200 but we've now removed the update numbers from the update.php UI and just execute the updates that need to be updated. So if the change to version 2 of your module is maintained in more than one branch the N for the lowest Drupal version supported should be the same for the versions upward.

So mymod_update_5200 is also included in the D6 and D7 branches because that is the lowest version of Drupal I support and I don't need mymod_update_6200 because the mymod_update_5200 already takes care of it. I know that you are thinking that 5200 belongs in the D5 branch and 6200 belongs in the D6 branch but no, there is only one change and that one change needs only one hook_update_N. We complicate the this issue because we believe that the N should match the branch; which isn't a bad thought, it just doesn't work that way.

dww’s picture

/me sighs -- much confusion abounds.

@earnie: The problem isn't that 5200 is still in the .install file in your example. The problem is that a user *already* ran 5200 on their site when they upgrade from 5.x-1.3 to 5.x-2.0. Now, they want to upgrade to 6.x-2.0, and update.php is going to think it needs to run 6200, but that's not what we want. You're saying the solution to this is "just don't have 6200, they already get that from 5200". However, what about users at 6.x-1.2 who need to upgrade to 6.x-2.0? They're screwed if there is no update 6200. Does that make sense?

@sun: Yes, the more different possible branch + schema scenarios we dream up in here that break one "solution" or another, the more I agree that without support from update.php in core, there's no complete solution to all cases that doesn't involve making some of the updates protect themselves from trying to do the same schema change twice.

salvis’s picture

@dww: No, the issue that was addressed with the structured numbering involves the case where an older branch needs *ANY* schema changes at all. I don't think that's particularly rare.

This issue only occurs when you fork your module, i.e. when you actively maintain more than one branch with different schemas (like 5.x-1.3 going to 5.x-2.0 with a big schema change, and continuing on to 5.x-1.4 with a smaller schema change (the *ANY* that you mention)). I agree that it's not particularly rare, but it's not the common case.

  1. I'd guess that the large majority of maintainers would want their users to upgrade to 5.x-2.x rather than having to indefinitely maintain a competing, incompatible 5.x-1.x branch in parallel with the 5.x-2.x branch.
  2. If the schemas of 5.x-1.x and 5.x-2.x evolve independently, then the task of providing an upgrade path from each 5.x-1.x to the then-current 5.x-2.x soon becomes overwhelmingly complex — it can't possibly be solved by a clever numbering scheme, but it will always need inside knowledge in the form of module-specific logic.
  3. Maintaining 5.x-1.x and 6.x-1.x with identical schemas (evolving in lock-step) is a very common case in contrib — this is the one that needs to be addressed first and foremost, in contrib.
  4. #3 is probably the most desirable way (from the community pov and maintainer pov) to maintain a module, but at some point the maintainer may decide to leave 5.x-1.x behind. At that point (but not sooner!) it would be a good idea to skip some numbers in the N+1 numbering sequence of 6.x-1.x for emergency surgery in 5.x-1.x, but when that becomes necessary, all the later update_N() functions need to be adjusted to check whether the emergency surgery was applied or not. This cannot be automated.

#3 should not be encumbered by attempting to solve #2, which is unsolvable in a generic way.

Note that I'm talking about contrib only here, core is a different matter, but even core resists evolving the 6.x schema from the day the 7.x version was opened up, because maintaining upgrade paths is difficult when the old schema evolves asynchronously.

Anonymous’s picture

However, what about users at 6.x-1.2 who need to upgrade to 6.x-2.0? They're screwed if there is no update 6200. Does that make sense?

No. If the upgrade sequence 5200 hasn't been run they would upgrade when they move to 6.x-2.0 would they not? The problem with my logic as I see it is if there are changes in 6.x-1.2 that don't appear in 5.x-1.2 and the sequence number then becomes greater than 5200.

What a mess. This can only be cured by a registry of named actions and if the action has already occurred once it doesn't happen again. Then the hook_update_N would just register the action to execute.

dww’s picture

@earnie: "If the upgrade sequence 5200 hasn't been run they would upgrade when they move to 6.x-2.0 would they not?"

Not if their {system} table thinks that schema_version for this module is already 6105 from when they installed 6.x-1.2. update.php just looks at the current schema number in the system table (C), and tries to run any hook_update_N() it can find where N > C.

@salvis: Yes, there are different use-cases in contrib. But, no, not everyone just has one version of their module, and that's not necessarily a bad thing. Some number (I don't have hard stats) of maintainers are doing exactly what the release system was designed to support (which we could never do at all previously): having both a stable, bug-fix only branch that most people depend on, and a new feature development branch where they can add experimental stuff without a) only doing their new features as they port to a new version of core nor b) adding instability to the version most of their end-users are depending on. This is desirable for both maintainers and end users. The bummer is that core's update.php is not a sufficiently powerful tool on its own to handle all the possible schema migration paths that might result from this situation.

sun’s picture

Status: Active » Closed (fixed)

After having dealt with many more module updates in contrib, I believe we can revert the status of this issue.

All updates I wrote (or rewrote due to this issue) were able to test/validate the module's data and conditionally run only. After all this escalation, I actually tend to like those tests. If a module tries to alter or upgrade data that does already, or does not, or does no longer exist, the update will fail. So now, we are forcing developers to double-check that their updates actually make sense, which can only be good.

There are only a few cases remaining where developers cannot do this. We need to

- either add db_primary_key_exists($table, $key) and db_index_exists($table, $index) to Drupal core, so modules can properly test whether a primary key or index update was already run.

- or add db_get_primary_keys($table) and db_get_indexes($table) to Drupal core, so modules can properly deal with the actual data of a table in the database.

In all other cases, modules are able to run updates conditionally already.

babbage’s picture

Given this thread is recommended in Writing .install files, but contains debate and a potential later rejection of the renumbering proposal after its initial implementation, before finally coming to a conclusion that apparently we probably should stick with this, it is somewhat unhelpful that the issue summary at the top doesn't actually summarize the issue, but rather is the original proposal—which wasn't the accepted one. (Rather, #2 was accepted.) I've read the thread, and don't think I'm the best person to update the summary... but I suggest someone probably should.

Edit: Well, just realized that the writing .install files version I was reading was the Drupal 5.x branch version, so perhaps this isn't such an issue. :)