Incremental project import for controlled indexing

Review Request #4032 — Created June 29, 2016 and submitted

4181, 4186, 4030
benjyw, nhoward_tw, peiyu, stuhood, zundel

This change allows user to incrementally import a large project to minimize the time blocked by indexing before iterating on code or running tests.

1. At import stage, "Enable Incremental Project Import" option is added.
2. If selected, GUI will ask user how many levels of build graph to import given the max level. level 0 means target roots, level 1 means up to direct deps, so on and so forth.
3. If user wants to incrementally import more of the project, simply click 'Pants -> refresh project', then the dialog in step 2 will reappear once pants export is done.

level 0: shows Distances is not resolved.
level 1: shows Distances is resolved.

Sample stats for two internal projects regarding time saved before user can iterate:
Project 1: The indexing time is down from 590s to 346s (41% saving) with level 2 (total 8 levels)
Project 2: The indexing time is down from 350s to 160s (52% saving) with level 1 (total 6 levels)

Next step:
Currently IntelliJ does not recognize some of the test classes because their junit dependency is too far, so user has to trial and error to figure out the ideal depth to import, but we can optimize that down the road given

  • Junit will be injected into junit_test() target
  • Include all transitive junit dependees during import

Other minor changes:
* Pants cache invalidate action and Refresh project are now DumbAware, meaning they can be clicked while project is still indexing.

Loading file attachments...

  • 0
  • 0
  • 1
  • 0
  • 1
Description From Last Updated
  1. Is PantsIncrementalImportManager missing from this review? Or did it already exist for some reason?

    1. PantsIncrementalImportManager was on the 2nd page of the review.

      Speaking of which, I need to rename it as ExportCacheManager. It was used to trigger subsequent imports (say user wants to adjust the depth from 1 to 2), but now it is covered by project refresh which uses the cached export result by default if project imports incrementally to start with. If user wants to do a pristine refresh, they can invalidate the pants plugin caches first. I am going to demo this feature in a gif.

    2. derp. Thanks! Reviewing page 2.

  2. resources/META-INF/plugin.xml (Diff revision 6)

    Maybe "Import more of the transitive deps of the project"?

    1. Action removed. "Refresh" will do the job.

  3. Should probably just say that you already know it isn't recent enough.

    1. Rephrased to No target root found for constructing the build graph to support incremental import. Please upgrade Pants or disable this feature. Message moved to

      Added pants requirement right next to the check box. Enable Incremental Project Import (Pants 1.2.0-dev0 or above required)

  4. processTargets is actually adding targets to the graph, yea?

    1. removed, and refactored so there's only one constructor public BuildGraph(Map<String, TargetInfo> targets)

  5. This should be a TODO or FIXME I think... it will be N^2*M for N nodes and M average deps.

    Maybe just store allNodes as a map from an address to a Node?

    1. refactored maxDepth(), so it is N*M

  6. This doesn't return nodes by level at the moment. But it seems like it would be pretty easy to make it do that by only inserting it (with the current level) the first time you see it?

    1. Renamed to getNodesUpToLevel, hope it will be more self explanatory.

        // level 0 - target roots
        // level 1 - target roots + direct deps
  7. It doesn't look like this needs to be an inner class.

    Maybe you could make it a file-private class by leaving it in this file, but moving it out to the top of the file as class Node.

    1. moved out.

  8. This is an Entry, not an entrySet.

  9. depth might be a clearer name for this here.

  2. Would be good to include some explanation here as to why being able to fetch the results of old exports is necessary.

    Also, is the export call itself to expensive to just re-run? That seems like it would be less error-prone.

    1. too*

    2. It was made partially because I wanted to iterate faster on the code. However, looking at possible confusion it may cause to developers and the support work, it is probably better to remove export cache.

    3. removed.

    1. removed along with

Review request changed

Status: Closed (submitted)

Change Summary:

3027a708a87f993590354383d90a1f6872e92dd3 thanks!