Zinc patches to improve roundtrip time

Review Request #2178 - Created May 8, 2015 and submitted

Information
Stu Hood
pants
e07088e...
Reviewers
pants-reviews
benjyw, fkorotkov, ity, jsirois, nhoward_tw, zundel

(applies atop https://rbcommons.com/s/twitter/r/2177/)

This is an import of two changes we made to improve the roundtrip time from pants to zinc during an incremental compile with the isolated strategy (and parallelism, although that's orthogonal.)

The worst case that we've found for incremental compile is many unchanged targets downstream of a changed target. In that case, zinc is invoked for each downstream target, but usually just needs to do enough inspection of inputs to decide that it doesn't need to compile anything.

With a graph of 900+ upstream targets, detecting upstream changes was initially filesystem bound.

We did a few things to fix this. First, we determined that we weren't actually providing all relevant analysis to zinc, so it was recomputing in some cases. After fixing that, the next two hot things in the profile were:

  1. In order to hit the in-memory analysis cache, zinc uses a SHA1 hash of upstream analysis files in order to decide that it could use them. This was admirably thorough, but almost certainly overkill. Instead, we use the analysis filename and timestamp here. It would be nice to upstream a patch to include a UUID or something in the header of each analysis file to make this bullet proof though.
  2. To decide where classes have been defined, sbt uses a definesClass function that (for a given directory/jar) returns a lookup table from String=>Boolean which indicates whether the relevant directory/jar defines the class. Oddly enough, it was not using the upstream analysis to decide whether directories contained particular classes, and was instead stat'ing files and listing jars in all cases. So, instead, when we have valid upstream analysis this patch uses it to execute class lookups.

Together, these two changes took the roundtrip time for a noop zinc compile with 900+ upstream analysis files from 5+ seconds to <0.5 seconds.

deployed internally for java and scala. could definitely use tests though.

Stu Hood
Nick Howard (Twitter)
Benjy Weinberger
Benjy Weinberger
Stu Hood
Nick Howard (Twitter)
Stu Hood
Review request changed

Status: Closed (submitted)

Loading...