Avoid redundant traversal in classpath calculation
Review Request #1714 - Created Feb. 5, 2015 and submitted
|fkorotkov, ity, jsirois, nhoward_tw, zundel|
There is no need to look at the jar_dependencies of jvm targets that are not jar_libraries, because we are already walking them, and the walk will visit their jar_library dependencies. Also, there is no need to visit a target more than once, so keep a set of targets we have already processed, and do not allow our walk to descend into those targets.
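A minimal sketch of the idea, not the actual patch. `Target` and `collect_jars` here are hypothetical stand-ins for the pants target graph: each target is modelled as a plain object with a `dependencies` list, a `jars` list, and an `is_jar_library` flag. Jars are read only from jar_library targets, and a `visited` set stops the walk from descending into a target twice.

```python
class Target(object):
    """Hypothetical stand-in for a build target (not the pants Target class)."""

    def __init__(self, name, dependencies=(), jars=(), is_jar_library=False):
        self.name = name
        self.dependencies = list(dependencies)
        self.jars = list(jars)
        self.is_jar_library = is_jar_library


def collect_jars(roots):
    """Collect jars from jar_library targets, visiting each target at most once."""
    visited = set()
    jars = []
    visit_count = {}

    def walk(target):
        if target in visited:
            return  # Already processed: do not descend again.
        visited.add(target)
        visit_count[target.name] = visit_count.get(target.name, 0) + 1
        # Only jar_library targets contribute jars directly; other jvm
        # targets reach their jars through the walk itself.
        if target.is_jar_library:
            jars.extend(target.jars)
        for dep in target.dependencies:
            walk(dep)

    for root in roots:
        walk(root)
    return jars, visit_count


# A small diamond-shaped graph: both apps share `common`, which depends on `guava`.
guava = Target('guava', jars=['guava-18.0'], is_jar_library=True)
common = Target('common', dependencies=[guava])
app_a = Target('app_a', dependencies=[common, guava])
app_b = Target('app_b', dependencies=[common])

jars, visit_count = collect_jars([app_a, app_b])
```

In this toy graph `guava` is reachable by three paths, but it is processed once and its jar is collected once.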
For performance testing, I ran ./pants goal resolve [some large directory]:: on a Twitter internal repo (from clean); the total time saved was 40s.
Please note the override of this in jvm_binary.py. There might also be external users (though not 4sq) who are relying on the behavior of overriding
We're about to eliminate our use of ivy internally, so I don't have an opinion on interface changes here, but you should definitely add Eric Ayers to this review.
While looking at this, I noticed that it depends on get_jar_dependencies, which also does an uncached walk. https://github.com/pantsbuild/pants/blob/0dba5f50fb8052dffeed314addc35a731980ecb4/src/python/pants/backend/jvm/targets/jvm_target.py#L70 Maybe that should be updated as well?
I think that means that for each target as yet unvisited, we traverse all its dependencies to gather jars, then gather revs from those jars, then traverse each of its dependencies again to gather their jars' revs.
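To make the cost pattern concrete, here is a rough sketch. It is not the pants code: `transitive_walk` and the `deps` mapping are hypothetical, and the graph is a simple four-target chain. It compares running a full uncached walk from every target against a single walk from the root.

```python
def transitive_walk(target, deps, counter):
    """Uncached walk: visits every target reachable from `target`.

    `counter` is a one-element list used to tally visits across calls.
    """
    seen = set()
    stack = [target]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        counter[0] += 1
        stack.extend(deps[current])
    return seen


# Hypothetical dependency chain: a -> b -> c -> d.
deps = {'a': ['b'], 'b': ['c'], 'c': ['d'], 'd': []}

# Nested pattern: the outer walk reaches each target once, but each target
# then runs its own full walk over its dependencies.
nested_visits = [0]
for target in transitive_walk('a', deps, [0]):
    transitive_walk(target, deps, nested_visits)

# Single pass: one walk from the root.
single_visits = [0]
transitive_walk('a', deps, single_visits)
```

On this chain the nested pattern performs 4 + 3 + 2 + 1 = 10 visits against 4 for the single pass, and the gap grows with the depth of the graph.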