Tests for cache.inc need expected fails

cwgordon7 - June 29, 2008 - 07:25
Project:Drupal
Version:7.x-dev
Component:tests
Category:bug report
Priority:critical
Assigned:Unassigned
Status:postponed
Description

This information is from the code coverage report (see http://coverage.cwgordon.com/coverage).

We need to test:

1) Cache flushing.
2) Minimum cache lifetimes.
3) Cache clearing with minimum cache lifetimes.

#1

cwgordon7 - June 29, 2008 - 08:04
Category:task» bug report
Priority:normal» critical
Assigned to:cwgordon7» Anonymous

Bumping to critical on behalf of chx.

#2

cwgordon7 - June 29, 2008 - 08:05

Bumping to critical on behalf of chx.

#3

R.Muilwijk - July 1, 2008 - 06:58

I'm making tests for the expiry part

#4

R.Muilwijk - July 1, 2008 - 22:32

Here my first attempt for the tests. Please comment on how to improve them. By writing these tests I found 2 bugs making 3 assertions fail.

For two a easy patch is provided:
http://drupal.org/node/277440

For one we have to decide what to do:
http://drupal.org/node/277448

AttachmentSize
add_cache_tests.patch14.17 KB

#5

R.Muilwijk - July 1, 2008 - 22:37
Status:active» patch (code needs review)

#6

R.Muilwijk - July 2, 2008 - 07:14
Assigned to:Anonymous» R.Muilwijk

#7

Dries - July 2, 2008 - 20:40
Status:patch (code needs review)» fixed

I've made some minor changes to this patch and committed it to CVS HEAD. It might still be worth a review from someone else.

#8

R.Muilwijk - July 2, 2008 - 20:51

Hmmz Dries you should know I don't have real much experience with testing untill Drupal.. so I'd really like some people to give some feedback about the tests.

#9

catch - July 3, 2008 - 09:11
Title:Tests needed: cache.inc» cache.inc test failures
Status:fixed» patch (code needs review)

There was a discussion on whether tests known to fail should be committed to core at http://groups.drupal.org/node/12051. Fairly inconclusive, but the general view was they should stay in the issue queue until the bug is fixed, or need some special handling to make it clear what's going on. Since there's no special handling for known test failures in place, it makes it near impossible for anyone to keep track of what's supposed to pass or fail - which means a lot of people aren't running all tests when reviewing patches, and when they do, they have to ask one of about three people which are known failures and which aren't.

So here's a revert for that hunk - which could then be attached to http://drupal.org/node/277448 to demonstrate the bug.

AttachmentSize
revert-cache.patch6.92 KB

#10

pwolanin - July 3, 2008 - 17:54
Status:patch (code needs review)» patch (reviewed & tested by the community)

yes please - much better to have the bug fix and the test in the same issue

#11

Dries - July 4, 2008 - 07:09

I'm not sure. It isn't necessarily a bad thing to see that something is broken. It actually encourages us to go and fix it. Why would we start hiding important bugs?

#12

catch - July 4, 2008 - 09:22

Dries, that's what I thought too, but I was persuaded - at least the way things are set up at the moment.

There's two immediate wins for testing in terms of development:

1. Knowing if your own code broke something
2. testing.drupal.org helping to stop anyone's code breaking stuff.

For #1, developers need to know that the testing framework is reliable - this should mean 100% passes all the time unless they broke something.

At the moment there, are many reasons why a test might fail:

* Because there's an existing bug in core
* Because there's a bug in a test
* Because there's a bug in the test framework
* Because my particular development environment runs PHP as CGI, doesn't have clean URLs, has a bad cURL version
* Because I've CVS upped and Dries just committed some broken code
* Because I've CVS upped and Dries just committed a test which exposes some broken code
* Because I've written some code which exposes a deficiency in the test framework
* Because I've written some code which exposes a latent bug in a test (bad variable_set in a test which wasn't causing failures before - i.e. recent comment module patch)
* Because I've written some code which requires a test to be updated (string change or whatever)
* Because I've written some broken code and this is causing test failures

Anyone trying to write patches for Drupal core - especially new developers or people new to testing, needs to know that 99% of the time, fails are down to the latter two reasons.

Otherwise, I can be writing a patch, cvs up, and suddenly poll module's got 7 failures, or cache.inc's got 1 failure - and I have no way to know why it's broken. Even if I know where to look for core commit messages and how to apply patches, I'll have to go through each recent commit individually, find the patches, revert them one by one and run all tests to compare whether it broke or not (as I did for poll a couple of days ago to find out it was the content generator header which ought to have nothing to do with poll module). If they don't know where cvs logs and the issue queue are, or that we have a policy of committing tests that fail (and no-one knows if we do or not at the moment), they have no way to know at all. None of the failing tests have any kind of pointer back to the issues they relate to - one new failure (poll.test) had no issue associated with it until I spent about an hour tracking down the commit that triggered it.

Additionally, as I understand it, it'll be impossible to run testing.drupal.org unless core has 100% passes - and testing.drupal.org is essential for filtering out patches which break tests. There's over 350 patches at CNR or RTBC in the queue, many don't even apply any more. We have no way of knowing this without someone manually downloading the patch and trying it themselves and a massive shortage of people who review patches. That's before we even get to checking whether they break tests or not. testing.drupal.org definitely can't know why tests fail - so it won't be able to accurately report back to the issue queue.

Since we have an issue tracker, with a list of critical bugs, this ought to be enough encouragement to fix something. If we could have a special group for 'known issues' - write test cases for that, and ensure the testing module (and testing.drupal.org) doesn't run them as 'run all tests' - then that might be a good option - especially if they also prominently point back to related issues. But that doesn't exist yet, so at the moment, unless you spend all day on the issue queue, watching commit logs, running all tests, then passes and fails appear seemingly at random, and it's going to be very hard for people to 'embrace testing' when it appears so unpredictable.

#13

boombatower - July 4, 2008 - 15:59

Agree with catch about trying to get/maintain 100% pass.

Not sure if we want a separate "group" of test for know issues as that would create a rather odd facet of the testing framework. Having tests that are intended not to pass just seems odd. Rather I would vote for keeping the tests in the issue queue with the bug. Anyone interested in fixing the bug can apply the test patch and work until it passes. That also works for t.d.o which will apply the test when it checks any patches for that bug.

The point of the tests is to ensure bugs are not introduced and to do that they need to pass so although having failing tests points out bugs it isn't the point of the tests.

</mytwocents>

#14

Dries - August 2, 2008 - 20:15
Status:patch (reviewed & tested by the community)» patch (code needs work)

Sorry, I think we should optimize for making Drupal work, rather than for making the tests pass.

#15

catch - August 2, 2008 - 21:15
Status:patch (code needs work)» postponed

Alright then, postponed until http://drupal.org/node/277448 is fixed

#16

pwolanin - August 2, 2008 - 21:29
Status:postponed» patch (code needs work)

@Dries - I have to agree strongly with catch on this - we need to normally have all tests pass so the existing tests can meaningfully and reliably detect regression and so that developers accept them as a normal part of development.

#17

webchick - August 2, 2008 - 22:11

Rant mode: Engaged. ;)

Paraphrased from http://webchick.net/itch-of-the-week/fix-testing-crisis ... go read that for the whole long-winded thing:

Our #1 priority is to fix the existing tests so that they all pass. Without this, our testing framework is basically useless. If a developer new to testing can't trust that what's there now is working, how can they possibly know if the code they just wrote breaks because they did something wrong, or because something was broken before they got there? Every time there's a bug in the testing framework, and every time an existing core test fails to run, this serves to completely destroy developers' confidence. If we're lucky, these people will go work on other things for a little while, and then check back periodically to see if things have been sorted things out yet. If we're unlucky, they start to develop animosity and resentment about the very idea of testing, and then start to distance themselves from doing development and encourage others to as well. Either way, the end result is stalling development on D7, and fewer people to spread the work around of writing further tests because now not only do you have to grok testing, you also have to be up on the current state of passing tests. You also don't know if your cache-related code broke something new or if it's ok to disregard the result because cache tests are known not to pass.

Additionally, this is a pre-requisite for the #1 (and realistically, only) way to keep the existing tests passing: testing.drupal.org.

When I originally wrote that post on May 24, we had more than half of our tests failing. A month later, on June 24th, we were down to 3. Joy, enthusiasm, and other such things ensued. Now, I see we're suddenly back up to 26. :( This makes me tear my hair out in chunks. And I already have too little of it as it is! ;)

We already have a means for indicating that things are broken and must be fixed before release. It's called category: bug report, priority: critical. Put the tests that knowingly fail inside the bug reports marked thus, and they'll get rolled in with the patches that fix them, so our tests stay at 100% passing at all times.

I've just been informed by chx that the awesome testing party has been accepted as a session at Drupalcon Szeged. There's two ways this could go:

#1: All tests pass. People who attend write their tests, instantly discover if they've introduced a bug, and are able to fix it. Testing experts remain on-hand for trickier questions related to testing, or for giving newbies extra guidance. You know, the stuff testing experts are meant to do. :)

#2: Some tests pass, some tests don't. People who attend write their tests, but are mystified whether the failures they get are because of something they did or something that's already broken, and there's no way for them to know without flagging down a testing expert. Testing experts then become inundated with requests to look at each test and see if it's because of a known failure or not. Mentoring doesn't happen, because all of our time is spent on assessment.

I know which one I'd prefer. ;)

#18

catch - August 2, 2008 - 22:20
Status:patch (code needs work)» patch (code needs review)

While I covered most of my views on this in #12, I should note that this particular failing test is also based on an as yet undecided behaviour of the menu cache - see R.Muilwijk's initial posts in this issue, along with #277448: Cache system isn't doing CACHE_TEMPORARY like documented and #277448: Cache system isn't doing CACHE_TEMPORARY like documented - until the former is resolved, we don't even know if the test is testing desired behaviour or not yet (since the latter patch doesn't cause the test to pass - so both the code and the test might be wrong).

edit: there are also almost 1,000 bugs in the queue against Drupal 6 and 7 (with and without patches) - I'd rather see the tests for those in their respective issues than in core. Tests are great for preventing regressions, but as webchick pointed out, not a replacement for the issue queue.

#19

boombatower - August 5, 2008 - 00:12

Once again, I agree with catch and webchick.

#20

cwgordon7 - August 5, 2008 - 03:11

I do not believe that new failing tests need to be committed in order to expose the bug - the bug has already been exposed by the test, let's fix the bug, and then commit the test, in order to uphold the image of our testing framework as reliable.

So, basically I agree with catch, pwolanin, webchick, and boombatower that this should be reverted in the absence of any other immediate way to make the tests pass. The proper procedure should be: 1) Submit patch containing test that fails. 2) Open core issue to fix cause of test failures. 3) Cross link between the issues. 4) Get the underlying fix committed. 5) Get the the test itself committed.

#21

webchick - August 5, 2008 - 03:22

Simpler workflow:

1) Submit patch containing test that fails.
2) Roll test into patch that fixes bug.
3) Commit fix and test at the same time.

#22

cwgordon7 - August 5, 2008 - 03:34
Status:patch (code needs review)» patch (code needs work)

That works too :)

Unfortunately, this reversion no longer applies, so cnw until it's rerolled...

#23

cwgordon7 - August 5, 2008 - 03:57
Status:patch (code needs work)» patch (code needs review)

Rerolled.

AttachmentSize
revert-cache.patch6.84 KB

#24

boombatower - August 5, 2008 - 04:13
Status:patch (code needs review)» patch (reviewed & tested by the community)

Applies, passes. Beautiful.

#25

Dries - August 5, 2008 - 05:28
Status:patch (reviewed & tested by the community)» fixed

I still disagree with this direction but I've committed the patch on popular demand.

#26

R.Muilwijk - August 5, 2008 - 06:34
Status:fixed» patch (code needs work)

I hope you guys make the same effort now trying to get these tests working right and in again.

#27

catch - August 5, 2008 - 08:05

Thanks Dries, this'll make it a lot easier to keep track of regressions. R.Muilwijk - do you think the test should be moved over to #277448: Cache system isn't doing CACHE_TEMPORARY like documented ?

#28

R.Muilwijk - August 5, 2008 - 08:14

It's fine by me... though I don't understand the reason of all the 'Tests needed: xxx' issues then.

#29

catch - August 5, 2008 - 08:52

There's two kinds of "Tests needed: xxx" - issues.

1. Issues for completely untested bits of core discovered via the code coverage report. These were opened up en masse so we can track what needs to be done easily, and point people to one of them if they say "I want to write a test, where do I start?". The assumption is that the tests for these issues will pass in 99% of cases.

2. Tests for bugs which have already been fixed, but for which we want posthumous tests for the bug so it doesn't come back to life. These tests also ought to pass once written since a bug fix already went in.

What we're missing is a policy decision on Dries as to whether we commit tests which fail due to bugs in core without the associated bug fix - this issue is as close to one as we've got at the moment. Ideally - failing tests from 1., get their own critical issue for the bug itself, or attached to an existing bug report that doesn't have a test for it yet so we can track the bug and the test together.

#30

pwolanin - August 5, 2008 - 13:37

@catch - I think the ideal (and perhaps informal policy) should be that the test and bugfix should be part of the same issue - so that we don't add failing tests and can keep the test and patch in sync.

#31

boombatower - August 5, 2008 - 16:13

@pwolanin: That's been my hope.

#32

BioALIEN - August 5, 2008 - 17:00

Fully agree with #30. Anything less would make it very confusing for us all.

#33

R.Muilwijk - August 5, 2008 - 19:09

@catch: so if I'm not mistaking at the moment you write a test for one of the Tests needed: xxx series and you discover a bug. You post the tests that are working there and the one which discover a bug you post a new issue with the fix and test in the patch.

#34

catch - August 5, 2008 - 19:33
Title:cache.inc test failures» Tests for cache.inc
Status:patch (code needs work)» postponed

R.Muilwijk: yep, I'd add 'search for possible duplicates' before posting the new issue to that ;) but yeah I reckon that's the best workflow. Bit of a cross-post, but #280972: Add a new project issue status option: active (reproducible) would help those issues get more notice than just 'active' if there's no obvious patch.

I'm going to mark this postponed pending the other issue - then we can close it when the test + fix get in from there.

#35

chx - August 27, 2008 - 20:27
Title:Tests for cache.inc» Tests for cache.inc need expected fails
Assigned to:R.Muilwijk» chx
Status:postponed» patch (code needs review)

So Barry Jaspan whined how it sucks that we can't commit tests that have known failures. #9 above says "Since there's no special handling for known test failures in place" -- welll now we have. See $this->expectedFail()->assertCacheRemoved(t('Temporary cache with lifetime does not exists after wipe.')); -- easy! We can add expected fails which still make a test pass but later, like during code freeze we can focus on it and fix it. Just dont ship with expectedFail in place. That's one grep to run on the code or a select on the the simpletest table for some utility module.

AttachmentSize
expected_fail.patch15.34 KB

#36

boombatower - August 27, 2008 - 23:56

I don't really see the usefulness. Having tests in issues avoids the extra bloat and maintains a simple workflow.

#37

chx - August 28, 2008 - 05:17

I added doxygen and changed the expected fail icon to a warning. But it's still a pass so green. As it's so much easier to write a test which reveals a bug than to actually fix a bug this is a very useful feature IMO. And I hope Barry will chime in protecting his feature request :)

AttachmentSize
expected_fail.patch15.53 KB

#38

cwgordon7 - August 28, 2008 - 05:30

Boombatower - that was my initial thought as well, but the more I thought about this, the more sense it made. It's an easy way for us to distinguish assertions that actually fail in core from the assertions that you may have broken with your patch; it also means that tests that expose bugs in core can be committed while still maintaining the 100% test pass rate. However, in contrast to simply attaching the tests to the issues that fix the core bugs, we can now commit tests to core, where everyone will be aware of them (as they are marked 'expected fail') but not be confused by them (as these fails are 'expected', and not your fault).

I don't understand your complaint about extra bloat. What extra bloat? This patch? It is only a few extra lines of code. If you are referring to the tests themselves that are now getting committed, isn't that just what we want, many many more tests?

I disagree that the current workflow is any simpler than the proposed one here. "Attach failing tests to issues where the fails are corrected and don't commit them to core" is no simpler and, I would argue, much more unintuitive then "Just mark the tests that fail due to core bugs as an expected failure."

I am very for this patch.

#39

bjaspan - August 28, 2008 - 07:19

The ability to declare expected failures is a standard property of mature testing systems. It's basically essential.

For the record, an expected failure that *passes* must be considered a full failure. Suppose that a bug exists. Someone writes a test and marks it as expected failure. Later, someone fixes the bug. If the expected failure test continues to pass, it is likely that no one will ever removed the "expected failure." Now, if the bug is re-introduced, the test will continue to pass because it is still an expected failure. Instead, if the expected failure fails when the underlying test actually passes, someone will remove the expected failure indicator, and if the bug is re-introduced the test will correctly fail.

I'll try to review the patch now.

#40

bjaspan - August 28, 2008 - 08:34

FYI, I've found a few bugs in the expected failures patch. Also, at Karoly's suggest, I will be re-posting the expected failures patch as a separate issue instead of tying it to the cache.inc tests. Bu I'm not done yet. I'll post that URL here when it exists.

#41

bjaspan - August 28, 2008 - 09:30
Status:patch (code needs review)» patch (code needs work)

I've posted my new version of the expected failures patch at http://drupal.org/node/301005. The version in this issue's patch should be removed.

#42

chx - August 28, 2008 - 12:19
Assigned to:chx» Anonymous
Status:patch (code needs work)» postponed

http://drupal.org/node/301048 moved the expected fail functionality there.

#43

bjaspan - August 28, 2008 - 12:38
 
 

Drupal is a registered trademark of Dries Buytaert.