Shard out OSX CI.

Review Request #1873 — Created March 6, 2015 and submitted

benjyw, nhoward_tw, zundel
This manually mirrors most of the sharding in the main .travis.yml but
keeps ITs all in one shard to start.  If 4 shards proves stably
schedulable the next step will be just programatically mirroring the
matrix env from the main .travis.yml build.

Also fix further issues this re-sharding exposed in  Force caching of just the current python
interpreter by restricting paths and simplify current interpreter

Furthermore, setup a sane python for OSX CI - the stock 2.7.5 on OSX
10.9 encounters issues with absolute imports in the coverage module.

 .travis.osx.yml                                             | 19 +++++++++++++++++--
 tests/python/pants_test/backend/python/ |  7 ++-----
 2 files changed, 19 insertions(+), 7 deletions(-)
I applied this patch over on a pantsbuild/pants-for-travis-osx-ci clone
and pushed for a manual test here:

Linux CI went green here:
  1. The motivation for this change is the fact we get almost all OSX
    single-shard CIs timing out at 50 minutes and as the sharding here
    shows, ITs alone take ~30 minutes.
    I'm not sure how this will shake-out
    during peak hours (off hours gets 4 osx shards assigned ~concurrently),
    but even if single-file and slow, hopefully we at least get 1 travis osx
    ci green run for every ~4-5 linux green runs.
  2. .travis.osx.yml (Diff revision 1)

    I saw 37 minutes for this on

    I don't watch this build, so I'm not terribly concerned about its speed as long as it isn't failing on timeouts, but I did have a few thoughts about a follow-on trying to minimize the turnaround time from this build:

    If vms are scarce for osx, maybe we should just split integration tests into 2 shards instead of the full sharding. This will in theory cut the build time in half giving us the best percentage-wise speedup. We could also schedule these shards first to make sure that the long jobs get kicked off right away, then the other 3 jobs might run serially and still finish before the long integration tests.

    The other 3 tests run for a total of 22 mins, the longest being about 10 minutes. So this is what we might see in a timeline:

    -------------------> Integration test 1 (18 mins)
    -------------------> Integration test 2 (18 mins)
       -->               Run the shortest test next
          ----->         Schedule the 10 minute test
             ---->       Last shard gets a chance to run after our short test finishes.
    1. Definitely to your analysis if 4 shards do not schedule well, but if 4 shards are in fact stable and schedule well, I will definitely just mirror the linux sharding.
  3. .travis.osx.yml (Diff revision 1)

    Too bad we have to do this, it eats up around 45s to 1 minute.

    1. Indeed.  The OSX hacked up default pythons have historically been and are still irksome.
  1. Thanks Eric - submitted @
Review request changed

Status: Closed (submitted)