Optimize goal changed

Review Request #1470 - Created Dec. 11, 2014 and submitted

David Turner
ity, jinfeng, jsirois, peiyu, zundel

Make goal changed approximately O(paths + targets), instead of something much worse.

Correctness: https://travis-ci.org/pantsbuild/pants/builds/43764453

Performance: tested on a megachange branch on Twitter's monorepo; previous time was on the order of hours; time with patch is 3 minutes.


  • 0
  • 1
  • 0
  • 1
Description From Last Updated
David Turner
David Turner
John Sirois
David Turner
Jin Feng
Stu Hood
Ity Kaul
David Turner
Review request changed

Status: Closed (submitted)

Patrick Lawson


The AddressMapper has all of this caching built in. You should probably discard this local optimization and just use the BuildFileAddressMapper (which is also on context and already bound to the BuildGraph on context) to do this candidate mapping.

  1. I'll submit another RB for that, since I just merged this one.

  2. Thanks!

  3. I'm a bit confused, actually. I'm trying to find (and inject) the set of addresses for each build file, parsing each build file at most once. How does anything in BuildFileAddressMapper help here?

  4. _addresses_in_spec_path is precisely what you want, with caching baked in. The only difference is that it works on logical BUILD files (one per directory) rather than individual sibling BUILD files in a directory. But that is semantically exactly what you want here, since searching for candidate targets is an operation on logical BUILD files and not individual ones.

    For reference, here is our internal implementation of this. We're in the process of open sourcing a lot of these internal utilities we've written, but hadn't gotten to this one yet: https://gist.github.com/anonymous/2d3fb59c1d548ce7e7c5. Feel free to use this, particularly _compute_source_addresses.

  5. https://rbcommons.com/s/twitter/r/1488/

This is a massive optimization, I'm surprised this wasn't caught when I added from_cache.

Tejal Desai

Ship It!