I fixed some other minor issues with Corsix' line change retrieval patch, there's just this one remaining, not very pressing, but would be nice to have.
The issue is that the function signature of the newly introduced versioncontrol_get_commit_statistics($commits, $commit_actions = NULL) is not very flexible. That means if a backend provides multiple kinds of statistics (e.g. line change counts and file sizes) and those are differently expensive (e.g. one is loaded from the database while the other needs direct VCS binary calls), then get_commit_statistics() always needs to load both of them as it doesn't know which statistics the caller needs. This causes a potential performance issue which wouldn't be necessary if get_commit_statistics() could selectively choose which statistics to retrieve.
Possible solutions:
a) get_commit_statistics() gets a third (or rather, first) variable that specifies which statistics are needed by the caller. Probably an array with a list of strings that identify each required metric. Potential problem: more complicated API.
b) get_commit_statistics() is renamed get_line_change_counts() and thus isn't expected to potentially retrieve multiple metrics at once. Problem: inefficient - multiple kinds of statistics might need more independent calls when they could be combined in one get_commit_statistics() function.
Also, with multiple independent metrics being retrieved, there should be semantics defined for the case when a backend doesn't support it (but another might). Maybe we should introduce an information function that tells the caller which statistics are provided by each backend and maybe also how expensive those are.
Or maybe that's all just overengineering (note the number of conditional qualifiers :P ), and there will be no problems with the way it works now? Feedback, please!
Comments
Comment #1
jpetso commentedAs described in comment #10 of #216371: MI#1: Move item handling and commit branches into the API module (bullet point 6), the lines-changed information is now in the API module, and therefore it's possible to fetch this information already on fetching source items. Being known to the API module also makes it possible to get rid of this function in the first place, and have an (optional) $item['line_changes'] array which does the same thing. If someone wants to calculate line changes for a whole commit, that needs to be done manually by the caller.