CI times

Mike Hearn
 

Hi team,

 

James started a discussion about the time it takes our test runs to complete. When we first set up TeamCity it took 20 minutes to complete CI gates, now it takes more like 80 minutes. Nobody likes this but there’s disagreement on what to do about it.

 

I’ve filed a couple of tickets against the core team to make some small quick fixes:

 

https://r3-cev.atlassian.net/browse/CORDA-1508 - set the --fail-fast switch on Gradle to ensure test runs abort ASAP

https://r3-cev.atlassian.net/browse/CORDA-1509 - investigate Gradle’s build cache

 

The main reason test times has gone up is that the product got bigger. We can optimize and improve things but ultimately the more thorough our testing gets, the longer it will take.

 

I’ve seen lots of solutions brought up. Some sound plausible to me (green) and others I find kind of questionable (red). I’ll go down the list, then we can debate them or introduce new ideas.

 

 

 

Parallel execution. Gradle can run tests in parallel. We’ve tried it in the past and disabled it because it was flaky, but we should make another attempt. One issue that crops up regularly when trying to parallelise tests in any distributed system is port conflicts. Perhaps we can find a way to make Gradle run each parallel test worker in a separate kernel container, which would isolate port conflicts and filesystems.

 

There have also been bugs in Gradle in the past, but many times this is caused by bugs in plugins rather than the core. Also Gradle does release quite regularly so we should re-test regularly too, to see how things progress.

 

 

Deleting code. Mixed. If we’re testing code that’s dead or useless then of course let’s get rid of it. But I’m not yet convinced there are major simplifications that will actually make anything faster.

 

In particular people like to rag on MockNetwork. That’s unfair – MockNetwork is slow because we didn’t keep it as a proper mock. It starts Hibernate these days! The intention was to provide a tool for testing apps that didn’t start the entire node.

 

 

Optimizations. It’s not clear to me we need to start HikariPool and Hibernate for every mock node, just to pick one clear offender – in test context we’re running against an in-memory H2 database and the generated SQL will be the same each time if the code doesn’t change. Do we really need Hibernate for this or could we find/build a simpler ORM that pre-generates SQL for H2 and doesn’t have a multi-second startup time. Our unit tests are slow partly because it takes seconds to start them up, when it used to be instant.

 

Likewise, somewhere along the line MockNetwork started doing classpath scanning, zipping and extraction of zips. It doesn’t have to do this I’m sure. It does it only because MockNode was designed to share code with the real node by default – that’s good for realism and makes it easier to add things to the node, but it hurts test time.

 

 

Modularity. Lots of people assert that more modules or more repositories are the answer. The only way this can work is if each CI run tests less. Testing less, or being smarter about what you test, is absolutely a valid way to improve things. But the same people who assert this are usually the first to assert that not running “gradle clean” between CI runs (the quickest way to only test the modules that are affected by a change) is ‘dangerous’ due to shared state. Well, we can’t have it both ways. Either we have some way of figuring out what modules need to be tested given a change, or we don’t.

 

Multiple repos is a fake fix; it means your brain acts as a replacement for Gradle’s dependency resolver, with lots of manual effort needed to break PRs up, figure out the right commit and thus CI ordering, reviewing PRs in parallel when changes can’t be reviewed in isolation e.g. because one PR is introduced a new API and another PR is using it, and so on. I’m unsure why this is better than just debugging Gradle until it can sort things out itself safely (assuming it can’t already). Additionally multiple repositories introduce atomicity issues, and are often used as ways to try and evade code review requirements.

 

If you’d like to argue for testing less in order to ungate a PR then do so, but please admit that’s what you’re doing - don’t hide it behind unrelated software engineering practices.

 

I’d like to see some evidence that we do get a lot of broken builds if we don’t force Gradle to redo everything each time. I suspect we will hit problems with agents that switch clones between branches regularly. If we preserve workspaces for open PRs, then it means re-runs as a PR evolves will go a lot faster though, and that’s normally pretty safe in my experience. Likewise for testing master – if Gradle breaks on a fast forward merge, let’s figure out why and report it as a bug to the Gradle folks, as ultimately running less is a key way to speed up CI.

 

 

Smaller PRs. Same as modularity. Smaller PRs would make test times worse not better, because you’d end up rerunning the whole integration test suite more times (which is why people don’t always do it).

 

It’s a good idea to make PRs smaller because it makes code review better, but it doesn’t seem related to CI time to me. It’s also tempting to not bother as the costs are externalized onto your reviewers. I do it, we all do. But.

 

One trick I like to use is to put small tweaks and cleanups into separate commits, as you work. The latest version of IntelliJ has a cool feature to help with this – you can commit specific chunks in a file without manually separating them out.  To do this, make sure your gutter is highlighting changed lines. Make your change, then click the coloured bar at the left. It’ll open a toolbar that lets you individually revert changes, copy them etc – this has been there for a while. But in the newest version there’s also a dropdown that’ll say “Default”. Click it and create a new changelist called “Small fixes”. It’ll appear in your version control area (Cmd-9 on a Mac to show it or click the tiny box at the bottom left). If you diff the Small fixes vs Default changelist you’ll discover you can now commit those marked changes independently:

 

 

Once you build up a bunch of commits that are small and easy to review, use “git stash” to stash your work in progress, then use “use “git checkout -b myname-small-fixes” to create a branch with your small fixes commits and upload it to github for review. Then switch back to your original branch, unstash and continue. You should be able to get a quick review of the small fixes because each commit will be logically described, independent and because the PR itself will probably not be changing anything important. This will make the review of the final change easier on the reviewer.

 

This is one of the rare times I actually follow my own advice – if you check my PRs then you’ll see several that are described as minor, tweaks, small fixes etc. I have done real work too 😊 it’s just that most things I do result in multiple PRs of tweaks and one PR of real upgrade. So do it!

 

 

www.r3.com/email-disclaimer

Join corda-dev@groups.io to automatically receive all group messages.