Shard out OSX CI.
Review Request #1873 - Created March 6, 2015 and submitted
Reviewers: benjyw, nhoward_tw, zundel
This manually mirrors most of the sharding in the main .travis.yml but keeps the integration tests all in one shard to start. If 4 shards prove stably schedulable, the next step will be to programmatically mirror the matrix env from the main .travis.yml build. Also fix further issues this re-sharding exposed in test_test_builder.py. Force caching of just the current python interpreter by restricting cache paths, and simplify the current-interpreter check. Finally, set up a sane python for OSX CI: the stock 2.7.5 on OSX 10.9 encounters issues with absolute imports in the coverage module.

 .travis.osx.yml                                             | 19 +++++++++++++++++--
 tests/python/pants_test/backend/python/test_test_builder.py |  7 ++-----
 2 files changed, 19 insertions(+), 7 deletions(-)
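The "simplify current interpreter check" part of the description can be sketched like this; the function name and shape are illustrative, not the actual test_test_builder.py code. The idea is that instead of enumerating every interpreter the test environment might discover, you only compare candidates against the interpreter actually running the tests:

```python
import sys

# Hypothetical simplified "current interpreter" check of the kind the
# description alludes to: a candidate path is "current" exactly when it
# is the interpreter executing this process.
def is_current_interpreter(path):
    return path == sys.executable

print(is_current_interpreter(sys.executable))        # True
print(is_current_interpreter('/no/such/python'))     # False
```

A check this direct also pairs naturally with restricting the CI cache paths to just that one interpreter's install location.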
I applied this patch over on a pantsbuild/pants-for-travis-osx-ci clone and pushed for a manual test here: https://travis-ci.org/pantsbuild/pants-for-travis-osx-ci/jobs/53298890 Linux CI went green here: https://travis-ci.org/pantsbuild/pants/builds/53300252
The motivation for this change is that almost all single-shard OSX CI runs time out at 50 minutes, and as the sharding here shows, the integration tests alone take ~30 minutes. I'm not sure how this will shake out during peak hours (off hours, the 4 OSX shards get assigned ~concurrently), but even if single-file and slow, hopefully we at least get one green Travis OSX CI run for every ~4-5 green Linux runs.
I saw 37 minutes for this on https://travis-ci.org/pantsbuild/pants-for-travis-osx-ci/jobs/53298890
I don't watch this build, so I'm not terribly concerned about its speed as long as it isn't failing on timeouts, but I did have a few thoughts about a follow-on to minimize this build's turnaround time:
If VMs are scarce for OSX, maybe we should just split the integration tests into 2 shards instead of doing the full sharding. In theory that cuts the build time in half, giving us the best percentage-wise speedup. We could also schedule those shards first to make sure the long jobs get kicked off right away; the other 3 jobs might then run serially and still finish before the long integration tests.
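Splitting the integration tests into 2 shards can be done with a simple deterministic bucketing scheme. This is an illustrative sketch (not pants code): sort the target names for stability, then assign by index modulo the shard count, so every shard gets a stable, disjoint subset:

```python
# Toy shard assignment: shard_id in [0, total_shards) receives every
# total_shards-th target from the sorted list. The target names below
# are made up for illustration.
def shard(targets, shard_id, total_shards):
    return [t for i, t in enumerate(sorted(targets)) if i % total_shards == shard_id]

targets = ['it_a', 'it_b', 'it_c', 'it_d', 'it_e']
print(shard(targets, 0, 2))  # ['it_a', 'it_c', 'it_e']
print(shard(targets, 1, 2))  # ['it_b', 'it_d']
```

Because the assignment depends only on the sorted target list and the shard parameters, each Travis shard can compute its own subset independently with no coordination.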
The other 3 test shards run for a total of 22 mins, the longest being about 10 minutes. So this is the timeline we might see:
-------------------> Integration test 1 (18 mins)
-------------------> Integration test 2 (18 mins)
--> Run the shortest test next
-----> Schedule the 10 minute test
----> Last shard gets a chance to run after our short test finishes
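The timeline above can be checked with a toy greedy-scheduling sketch. Only the two 18-minute IT shards, the 10-minute longest unit shard, and the 22-minute unit-shard total come from this thread; the 7- and 5-minute durations and the 4-VM assumption are made up to fill in the gaps:

```python
import heapq

# Greedy scheduling sketch: jobs are assigned, in the given order, to
# whichever VM frees up first; returns the overall finish time (makespan).
def makespan(jobs, vms):
    finish_times = [0] * vms      # per-VM finish times, as a min-heap
    heapq.heapify(finish_times)
    for duration in jobs:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + duration)
    return max(finish_times)

# ITs scheduled first, then the three (partly assumed) unit shards.
print(makespan([18, 18, 10, 7, 5], vms=4))  # 18
```

Under these assumptions the short shards all drain behind the integration tests, so the 18-minute IT shards set the overall build time, which matches the intuition in the timeline.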
Too bad we have to do this; it eats up around 45s to 1 minute.