Repair depmap --graph

Review Request #2345 — Created June 11, 2015 and submitted

areitz, fkorotkov, ity, jsirois, nhoward_tw, stuhood, tejal, zundel
  • Simplify depmap text and graph handlers
  • Make depmap --graph work again
  • Add tests for depmap --graph
  • Eliminate extraneous/duplicated code
  • Convert to generators for immediate output/memory efficiency on larger projects

+ manual testing in birdcage, reviewing graphviz generated graphs, etc

  • 0
  • 0
  • 4
  • 1
  • 5
Description From Last Updated
  1. Looks like a good initial patch. It is definitely easier to follow than the existing implementation. I think some of the simplifications may have introduced some regressions, particularly around --external-only because there are no tests of that flag that I can see. I'd also like to see tests for the new flags.

    1. thanks for the review!

  2. Could you add tests for the new tree format?

  3. Also for show-types

  4. I think rev can be '', which is used to imply no rev. In that case, I think it'd look nicer for the external dep to not have a trailing sep.

    How about setting a is_internal_dep value to return in the isinstance if clauses?

    1. sgtm, refactored.

  5. It seems like this simplification will cause issuse with generating graphs with --external-only because internal graph edges won't be traversed. In that case, external dependencies will only be displayed if the target is a JarLibrary.

    I could be wrong, but I don't see any tests for that case, so it's hard for me to tell by inspection.

    1. I definitely see what you mean. however, I believe this is the way that depmap --external-only already works in pants today and was what I used for comparison testing:

      [illuminati birdcage (master)]$ ./pants depmap --external-only commerce/commerce-loadtest/src/main/scala:scala --print-exception-stacktrace | wc -l
      [illuminati birdcage (master)]$ PANTS_DEV=1 $PANTS_INTERNAL/pants depmap --graph --external-only commerce/commerce-loadtest/src/main/scala:scala --print-exception-stacktrace | grep "ellipse" | wc -l

      currently this seems to only enumerate the jar_dependencies attribute of a given first level target - but you're right it does not properly recurse the full tree even tho it likely should.

      ultimately '--graph --external-only' is kind of tricky tho. for the text output, --external-only only emits external items (e.g. jars) without printing their parent nodes. for the graph itself, if we don't emit internal parent nodes there will be nothing to draw edges to. I almost considered removing --external-only support for --graph for this reason - as the text output is actually just as useful as the graph output in this case because the graph output doesn't reveal any additional structure beyond a flat list without changing how internal nodes are emitted in this case.

      given this matches how --external-only works today, I'm kind of inclined to ship as-is and then follow-up with a fix+tests for --external-only in a smaller, future iteration. does that sound reasonable or would you rather I fix this as part of this change?

    2. While I was checking this I noticed that the text output for --external-only has changed--it now prints the internal target, and prints the external ones indented. It's possible that that could break scripts that depend on the existing behavior.

    3. yeah - I could see that. fixed:

      [illuminati birdcage (master)]$ PANTS_DEV=1 $PANTS_INTERNAL/pants depmap --external-only commerce/commerce-loadtest/src/main/scala:scala --print-exception-stacktrace > a
      [illuminati birdcage (master)]$ ./pants depmap --external-only commerce/commerce-loadtest/src/main/scala:scala --print-exception-stacktrace > b
      [illuminati birdcage (master)]$ sort a b | uniq -u
      [illuminati birdcage (master)]$

      the fix I have in mind for traversal can also break anything that depends on existing behavior as well, so think a separate review for that makes more sense - and possibly some improvements therein with a --display-leaf-nodes option etc.

  6. this could just be ''.join(...), then you wouldn't have to worry about the number of rhs elements.

    1. did that initially, but for some reason felt the .format() was a bit more readable at the time. switched back to ''.join.

  7. Could we bring back the dashed edge styling for internal edges?

    1. sure - readded, but I'd propose we style /external/ edges as dashed since there are usually more internal edges and the graph looks strange with a majority of dashed lines vs hard lines.

      I had initially removed this because in the original implementation I noticed the only edges that ever got dotted styling seemed to be self referential (i.e. was broken) and internal vs external dep nodes are already distinctly shaped as rectangles vs ellipses.

    2. That makes sense to me.

  8. It looks like even if a node has already been outputted, we'll retraverse its dependencies again. Maybe we could short circuit after printing the edge to the parent and save a retraversal of the nodes child graph.

  2. Hmm. I think isinstance(_, JarDependency) implies hasattr(dependency, 'rev') since JarDependency's __init__ sets it.

    Maybe something like

    is_external_dep = isinstance(...)
    if is_external_dep and dependency.rev:

    Right now, I don't think this behavior is unit tested for JarDependencys, it'd be nice if the depmap test suite included a JarLibrary in its fixtures.

    1. ah my bad, this was just a literal translation of the prior logic of getattr(...) is not None. fixed and added a test case for jar_library + the rev='' case.

  1. LGTM, now let's widen it to pants-reviews.

  1. Thanks for fixing this.
  2. s/=/ = /.  I prefer python kwarg style too, but the precedent is set in this file.
  2. FYI, you'll want to rebase to master. I removed these deprecated options and quite a bit of code from this file this morning. I don't think you'll find any major conflicts though.

  3. This should handle 'classifier' and 'type' too if you want it to be correct. You can mark this TODO() if you want, there's a lot of work to be done like this.

    1. thx, added a TODO to address this in a second iteration.

  1. Ship It!
  1. In master @
    Please mark this review submitted.

    A few things to note / fix going forward:
    + The bugs field just wants the issue number and no more
    + Your account fails to rbt patch - I'm pretty sure its because its lacking an email address or full name

    For the latter, I get this:

    $ rbt patch --debug -c 2345
    >>> RBTools 0.7.4
    >>> Python 2.7.8 (default, Oct 12 2014, 00:37:26) 
    [GCC 4.9.1 20140903 (prerelease)]
    >>> Running on Linux-4.0.5-1-ARCH-x86_64-with-glibc2.2.5
    >>> Home = /home/jsirois
    >>> Current directory = /home/jsirois/dev/3rdparty/pants
    >>> Running: git version
    >>> Checking for a Subversion repository...
    >>> Running: svn --non-interactive info
    >>> Command exited with rc 1: ['svn', '--non-interactive', u'info']
    svn: E155007: '/home/jsirois/dev/3rdparty/pants' is not a working copy
    >>> Checking for a Git repository...
    >>> Running: git rev-parse --git-dir
    >>> Running: git config core.bare
    >>> Running: git rev-parse --show-toplevel
    >>> Running: git symbolic-ref -q HEAD
    >>> Running: git config --get branch.master.merge
    >>> Running: git config --get branch.master.remote
    >>> Running: git config --get remote.origin.url
    >>> repository info: Path:, Base path: , Supports changesets: False
    >>> Making HTTP GET request to
    >>> Making HTTP GET request to
    >>> Cached response for HTTP GET expired and was modified
    >>> Making HTTP GET request to
    >>> Cached response for HTTP GET expired and was not modified
    >>> Making HTTP GET request to
    >>> Cached response for HTTP GET expired and was not modified
    >>> Running: git status --porcelain --untracked-files=no
    Patch is being applied from request 2345 with diff revision 4.
    >>> Running: git apply -3 /tmp/tmpgauPsy
    Successfully applied patch.
    >>> Making HTTP GET request to
    >>> Cached response for HTTP GET expired and was not modified
    >>> Making HTTP GET request to
    >>> Cached response for HTTP GET expired and was not modified
    Traceback (most recent call last):
      File "/home/jsirois/.pyenv/versions/rbt/bin/rbt", line 11, in <module>
      File "/home/jsirois/.pyenv/versions/rbt/lib/python2.7/site-packages/rbtools/commands/", line 133, in main
        command.run_from_argv([RB_MAIN, command_name] + args)
      File "/home/jsirois/.pyenv/versions/rbt/lib/python2.7/site-packages/rbtools/commands/", line 612, in run_from_argv
        exit_code = self.main(*args) or 0
      File "/home/jsirois/.pyenv/versions/rbt/lib/python2.7/site-packages/rbtools/commands/", line 206, in main
        not self.options.commit_no_edit)
      File "/home/jsirois/.pyenv/versions/rbt/lib/python2.7/site-packages/rbtools/clients/", line 782, in create_commit
        '--author="%s <%s>"' % (author.fullname,])
      File "/home/jsirois/.pyenv/versions/rbt/lib/python2.7/site-packages/rbtools/api/", line 301, in __getattr__
        raise AttributeError

    You should be able to repro using the same command on a clean new branch of your own.
    If your RBCommons account has a name and email set up, this may be an RBCommons bug.
    They have been prompt in resolving issues:

    1. hmm, strange - the same rbt patch command runs cleanly for me. I think it might've been a permissions issue - my rbcommons profile was marked private; I've just unset that, so hopefully should be good going forward.
    2. Confirmed, I can now patch this RB in normally.  Thanks for fixing that.
Review request changed

Status: Closed (submitted)