Who thinks it would be great if the Drupal Coding Standards included checks for Cyclomatic Complexity and maximum nesting levels?
Cyclomatic complexity is a metric used to check the complexity of a function by checking the number of paths through it. There are those that think this should be limited at 10, any more and the function is too complex and should be refactored into a number of smaller, more easy to read, functions.
A good place this has highlighted faults is with people's use of the hook_form_alter() which ends up with a switching if of doom when they should really be using hook_form_FORM_ID_alter in most cases to just work on the form they actually want to change.
Nesting levels is the number of indentations caused by a new code block.
The PHPCS Generic CoderSniffer rules include the logic for this check already and I've supplied a patch to the coder module to include if it gets included in the DCS:
http://drupal.org/node/1934594#comment-7171972
Questions:
- Should this be included in the Drupal coding standards?
- What should the cyclomatic complexity limit be? - PHPCS sets this at 10 for a warning, 20 for an error
- What should the maximum nesting level be? - PHPCS sets this at 5 for a warning and 10 for an error (quite lenient)
For more details on this metric, see: http://en.wikipedia.org/wiki/Cyclomatic_complexity
Comments
Comment #1
skyredwangI see measuring Cyclomatic Complexity in coder module is quite useful and educational as you have already proposed #1934594: Add checks for Cyclomatic Complexity. However, requiring a limit of Cyclomatic Complexity in coding standard might not be suitable at the moment for the 2 reasons below:
1. I tried to follow the wiki as well as "Structured Testing: A Testing Methodology Using the Cyclomatic Complexity Metric" (The link is point to a private file), I have not found any scientific or statistical evidence to suggest that the limit 10 (or any other number) is appropriate to apply in general.
2. In practice, there would be too many existing modules on Drupal.org that were not meeting the potential limit if introduced. If it got in coding standard now, we would be telling most people to re-do much of the our work over. But, we don't intend to send such message, do we?
Comment #2
johnennew commentedHi @skyredwang, Thanks for your comments!
1. The limits would need to be set as appropriate by the community. Personally, I found the 10 (warning) 20 (error) limit that PHPCS highlights functions that do look like they need a refactor. As an example of a 15 - https://api.drupal.org/api/drupal/modules!comment!comment.module/functio...
2. The DCS are often more aspirational and setting the standard high gives developers a target to aim at. As a case in point, run the current coding standards on the Drupal 7 node module, you'll get 400+ violations of the current standards! Often I see attempts at getting web agencies to set some baseline metric of the separation of concerns or clean, supportable code, usually something along the lines of no function longer than 10 lines. This is quite a subjective measure but I think that Cyclomatic Complexity via coding standards is the best suggestion I've seen to date.
Comment #3
fuzzy76 commentedJust my two cents: Most modules already break the existing code standards, so I doubt there would be many re-dos.
Comment #4
Angry Dan commentedI wasn't aware that there was a requirement to re-write code should the coding standards change? In this case you should be free to maintain existing code and failing functions, but new code should be aware of this measure. I don't think you can argue that it's not beneficial to have simpler code, and this is a good way of interpreting the readability and conciseness of code without setting difficult rules like "no function may be longer than 10 lines"
FWIW I'd like to see this in the standards in time for D8 contrib module developers.
Comment #5
Steve.C commentedDefinitely agree with Cyclomatic Complexity being a better measure than lines of code. Only concern would be when to warn and when to error.. My understanding is that a comprehensive switch statement could easily burn a few of your paths and require a refactor, but this can be a very neat, readable way of laying out an Abstract Factory Pattern, for example. I'd be happy to take a slapped wrist and consider the design but should it be an error?
I understand switches are the easy way out, but occasionally Drupal pushes you down that way, and when writing with a view to various levels of developer being able to extend and support, they can be the "right" way. Perhaps the more OO nature of D8 will make the temptation less, but taking the boundaries down at all might stifle the flow of new, DCS compliant modules, when they're not going to break and make up for readability and support in other ways.
Comment #6
johnennew commented@Steve.C - current codebase often fails Cyclomatic Complexity for the overuse of switch statements. In OO design, it's generally considered preferable to use different message types or classes that inherit from a common message class rather than trying to make choices within a function which makes the function more complex and means it behaves very differently depending on it's inputs. Clearly there are a number of different strategies to move away from switch statements (strategy and command patterns come to mind).
The point is that the Cyclomatic Complexity warning highlights that there is an issue and forces you to consider an alternative solution (if you want to be compliant). Perhaps it should remain as a warning and errors set much higher. I'm not sure DCS differentiates between warnings and errors - are warnings allowable but errors a definite no?
Comment #7
damien tournoud commentedUsing the cyclomatic complexity as a metric for code quality is really far from being a silver bullet.
The main drawback in the case of PHP code is that the only available tools evaluate the cyclomatic complexity by static analysis at the function/method level, which has two really nasty implications:
A common fallacy is to think that by breaking up a function/method into smaller chunks (and hence reducing its cyclomatic complexity) you reduce the complexity of its tests. That's really never true, and is just an effect of the above.
I fear that in the current situation, having cyclomatic complexity as code quality metric would just increase the overall complexity of Drupal, making the code paths less obvious and at the end of the day making the whole thing less maintainable.
Comment #8
johnennew commented@Damien Tournoud - what you say is of course true. However, we must remember the purpose of the DCS is to improve quality and provide some standard for compliance - it is not a provable metric for the software actually working, nor does compliance anywhere near guarantee a non-complex end product. This suggestion is for another tool to help capture compliancy issues where they can be captured. I do not think it makes the current situation worse but provides a mechanism to help make contrib authors think a bit deeper about the code they produce.
Comment #9
sunI wanted to comment, but @Damien Tournoud literally covered all aspects in #7 already.
Statistical tools like this may help you to locate and identify some offending code, but they are not sufficient and suitable as a silver bullet and as a hard/enforced coding standard.
Comment #10
Crell commentedI agree that we can't/shouldn't enforce a "if you have a CC higher than X you get an error" rule in coding standards or in coder module. However, simply reporting on CC may help to point people toward the concept of complexity, and thereby become a learning experience. So +1 on calculating and reporting it, -1 for "enforcing" it.
Comment #11
johnennew commentedThe PHPCS "Coder Sniffer" plugin for coder allows problems to be reported as either warning or errors. The Drupal Coding Standards, as I understand them, does not set these two thresholds; it's either in the standard (and therefore enforceable?) or not in the standard (and therefore ignored?). From @Crell's comments - is there a middle ground where DCS makes some recommendations, including CC, which are not part of the enforceable standard but used to point to potential code issues?
Comment #11.0
johnennew commentedAdded li elements to questions list
Comment #12
jhedstromI'm marking this postponed.
Since it won't be enforceable, it doesn't really make sense to add a limit to the DCS. Once patches are automatically tested for coding standards, this could be unpostponed to also report on CC. Alternatively, this could be moved into the coder queue as a feature request.
Comment #26
quietone commentedThere hasn't been interest in pursuing this in 10 years. And the related issue in the Coder project was closed as won't fix 9 years ago with no further discussion. #1934594: Add checks for Cyclomatic Complexity. Due to the lack of interest I am closing this as won't fix. If anyone disagrees, re-open adding an explanatory comment.