Added source root tests, updated the internal data structure to include a tree.

Review Request #1003 — Created Sept. 8, 2014 and submitted

zundel
pants
zundel/refactor-source-root
556
30bb55a...
pants-reviews
jsirois, patricklaw

Added source root tests, updated the internal data structure to include a tree to speed up the find() operation.

With 1000+ source roots in our environment, this appears to shave about 5 seconds off of a no-op build.

Retrofitted tests for SourceRoot.
Added new unit tests for SourceRootTree data structure

CI passed at: https://travis-ci.org/pantsbuild/pants/builds/34713735

  • 0
  • 0
  • 3
  • 0
  • 3
Description From Last Updated
ZU
  1. 
      
  2. src/python/pants/base/source_root.py (Diff revision 1)
     
     

    In theory, I think I could remove this dictionary and use the Tree datastructure to replace it entirely, but I didn't do that.

  3. src/python/pants/base/source_root.py (Diff revision 1)
     
     

    I'm not sure what _SEARCHED is for. It seems to be dead.

  4. FYI I wrote these tests first, then went back and changed the implementation. I could land this as 2 changes of you want to be better assured that this doesn't introduce a regression.

  5. 
      
ZU
JS
  1. LGTM - Thanks for finding and fixing this perf trap.
    1 small bug marked as an issue below, otherwise this is all small stuff.

  2. src/python/pants/base/source_root.py (Diff revision 1)
     
     

    It seems a bit odd to use a set. You guard uniqueness so this could just be a list - but then again you need to search in get so it could be a dict.

  3. src/python/pants/base/source_root.py (Diff revision 1)
     
     

    Genrally its summary sentence, blank line, body doc, blank line, param block.

  4. src/python/pants/base/source_root.py (Diff revision 1)
     
     

    sphinx accepts the type as :param string source_root: and handles specially. The source/render look like:
    https://github.com/pantsbuild/pants/blob/master/src/python/pants/backend/android/targets/android_binary.py#L20
    http://pantsbuild.github.io/build_dictionary.html#android_binary

  5. src/python/pants/base/source_root.py (Diff revision 1)
     
     

    perhaps directory or path, dir shadows the builtin

  6. src/python/pants/base/source_root.py (Diff revision 1)
     
     

    call the function: root_to_types.iteritems()
    That said - stick to .items() unless you know materializing the list of items is actually expensive:
    $ python3 -c '{}.iteritems'
    Traceback (most recent call last):
    File "<string>", line 1, in <module>
    AttributeError: 'dict' object has no attribute 'iteritems'

    In 3 .items() is lazy. If this does benefit from not materializing the list, a helper that forked on Compatibility.PY2 would be good: https://github.com/twitter/commons/blob/master/src/python/twitter/common/lang/init.py#L102

    1. OK. I thought thtat items() croaked at about 200 elements. There are > 1000 items in my repo, but normally they are added one root at a time. This call is only used for testing. I'll just mark it as such.

  7. src/python/pants/base/source_root.py (Diff revision 1)
     
     

    I read this wrong at first, maybe ":returns: a tuple of ..."

  8. src/python/pants/base/source_root.py (Diff revision 1)
     
     

    Maybe extract an iterate_path helper and use here and above:

    for component in self._iterate_path(path):
      ...
    
  9. You could just Address(spec_path, target_name) in these 2 Target types to avoid a re-parse.

  10. 2 blank lines between top-level types: http://legacy.python.org/dev/peps/pep-0008/#blank-lines

  11. kill extra blank line

  12. kill extra blank line

  13. 
      
ZU
  1. 
      
  2. src/python/pants/base/target.py (Diff revision 2)
     
     

    This cache reduced the number of calls to find() from about 100 to about 20 in a typical project, but since the find() operation is now a lot fiaster it didn't improve performance. I'm going to drop it unless someone has an objection.

  3. 
      
ZU
JS
  1. 
      
  2. src/python/pants/base/source_root.py (Diff revisions 2 - 3)
     
     

    You could just return self.children.get(key) which returns None on miss.

  3. src/python/pants/base/source_root.py (Diff revisions 2 - 3)
     
     

    kill dup statement

  4. src/python/pants/base/source_root.py (Diff revisions 2 - 3)
     
     

    Still broken - s/.items/.items()/

    1. How did that pass testing? Oh, because I never used it. I just killed the method.

  5. 
      
ZU
JS
  1. Ship It!

  2. 
      
ZU
Review request changed

Status: Closed (submitted)

Change Summary:

commit ec7e133
PA
  1. Could you please add a test that validates that this works when source roots are registered without target constraints? Internally here all of our source roots are registered with zero constraints, and we've seen regressions in SourceRoot before in this tolerance.

    Other than that lgtm!

    1. Oops, I didn't realize how late to this CR I was! Sorry!

    2. I added this unit test in https://rbcommons.com/s/twitter/r/1027/ (it passes)

  2. 
      
Loading...