Introduce a jvm binary shader.

Review Request #2050 — Created April 9, 2015 and submitted

jsirois
pants
jsirois/issues/663/introduce_shader
663, 1362, 1388
2052
cb3adf8...
pants-reviews
nhoward_tw, patricklaw, zundel
This tool initially handles just binary jar auto-shading to support
shading of bootstrapped jvm tools.  It should be easy to extend to add
support for fully customizable non-binary jar shading though.  The tool
uses jarjar to perform the shading but the interface steers clear of
jarjarisms so this should be a stable base to move forward with even if
we end up needing to swap out the shading backend.

 BUILD.tools                                     |   5 ++
 src/python/pants/java/executor.py               |  19 ++--
 src/python/pants/java/jar/BUILD                 |  15 +++-
 src/python/pants/java/jar/shader.py             | 236 ++++++++++++++++++++++++++++++++++++++++++++++++++
 src/python/pants/java/nailgun_executor.py       |   4 +-
 tests/python/pants_test/java/jar/BUILD          |  11 +++
 tests/python/pants_test/java/jar/test_shader.py |  94 ++++++++++++++++++++
 7 files changed, 374 insertions(+), 10 deletions(-)

The jarjar jar (unused in this RB, used in next) was synced to bintray from
this commit: https://github.com/pantsbuild/maven-repo/commit/782e0cb4
Tested the sync with a successful resolve:

$ PANTS_DEV=1 ./pants resolve.ivy --open //:jarjar

And coverage is good for the new code:

$ PANTS_PY_COVERAGE=modules:pants.java.jar.shader PANTS_DEV=1 ./pants test tests/python/pants_test/java/jar/:shader
...
21:53:58 00:00   [test]
21:53:58 00:00     [run_prep_command]
21:53:58 00:00     [test]
21:53:58 00:00     [pytest]
21:53:58 00:00       [run]
                     ============== test session starts ===============
                     platform linux2 -- Python 2.7.8 -- py-1.4.26 -- pytest-2.7.0
                     rootdir: /tmp, inifile: 
                     plugins: timeout
                     collected 4 items 

                     ../../../../../tmp ....
                      coverage: platform linux2, python 2.7.8-final-0 -
                     Name                               Stmts   Miss Branch BrMiss  Cover
                     --------------------------------------------------------------------
                     src/python/pants/java/jar/shader      86      8     30      8    86%

                     ============ 4 passed in 5.17 seconds ============
                     Name                               Stmts   Miss Branch BrMiss  Cover
                     --------------------------------------------------------------------
                     src/python/pants/java/jar/shader      86      8     30      8    86%

21:54:04 00:06     [junit]
21:54:04 00:06     [specs]
               SUCCESS

CI went green here:
https://travis-ci.org/pantsbuild/pants/builds/57977570

  • 0
  • 0
  • 0
  • 1
  • 1
Description From Last Updated
JS
JS
  1. NB: The `Shader` class introduced here is used by the jvm tool bootstrapper to auto-shade tool classpaths here: https://rbcommons.com/s/twitter/r/2052/
  2. 
      
BE
  1. Ship It!
  2. src/python/pants/java/jar/shader.py (Diff revision 2)
     
     

    Do these not represent inclusion and exclusion rules?

    1. Yup - re-worded.
  3. src/python/pants/java/jar/shader.py (Diff revision 2)
     
     

    s/its/it's/

  4. src/python/pants/java/jar/shader.py (Diff revision 2)
     
     

    s/both// or something? The sentence as it stands is confusing.

    1. It was confusing.  Still is, but the words are in the right places at least.
  5. 
      
JS
BE
  1. Ship It!
  2. 
      
JS
ZU
  1. There is no actual invocation of the shader in these tests, which would concern me, but I know you did submit such a test in the followon for the JVM tool task shading.

    1. Yeah - the split into 2 RB's was intended to be a tradeoff in your favor as RB reader, but it's a tradeoff.
  2. BUILD.tools (Diff revision 3)
     
     

    Why did you pick this non-release version and where is it published?

    1. This info is buried up in the testing done but I've added this info to BUILD.tools now where it really belongs:

      The jarjar jar (unused in this RB, used in next) was synced to bintray from
      this commit: https://github.com/pantsbuild/maven-repo/commit/782e0cb4

  3. src/python/pants/java/jar/shader.py (Diff revision 3)
     
     

    I find this to be a surprising default.

    1. Noted but I've left as-is since you weren't more demonstrative.
      
      I think its surprise-factor is directly related to java's packages actually being a flat namespace but in most folks in practice thinking of them or laying them out them with some semantic hierarchy in mind.  I think the surprise pivots on which of those 2 contradictory bit about packages is closer to the top of your mind.
    2. I believe the instances of wanting to shade only part of a library are rare. Your case where you don't want to shade the main class is the first example I've seen of the "non-recursive" behavior.

      If you are shading a library, you probably also want to shade the subpackages.
      They may not be published as public API, so it would be easy to overlook them. For example, if you want to shade package com.google.common.base, you don't want to accidentally overlook com.google.common.base.internal.

      The kinds of problems that shading works around are usually subtle. Forgetting to add the recursive=True parameter could solve the initial problems but leave behind some different subtle problems.

      For precedent, I hate to rely on maven, but it happens to be that the maven shade plugin by default 'recursively' shades. The concept for specifying your shading is by using a string prefix for symbol names. It doesn't even assume that the prefix ends with a '.' so we have been explicit with our shading prefixes when you mean for it to apply to a specific package. Here is a snippet from one of our pom.xml files that shows our typical use:

          <plugin>
              <groupId>com.squareup.maven.plugins</groupId>
              <artifactId>shade-plugin</artifactId>
              <version>${shade-plugin.version}</version>
              <executions>
                <execution>
                  <phase>package</phase>
                  <goals>
                    <goal>shade</goal>
                  </goals>
                  <configuration>
                    <shadedArtifactAttached>true</shadedArtifactAttached>
                    <shadedClassifierName>shaded</shadedClassifierName>
                    <transformers>
                      <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                      <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                        <manifestEntries>
                          <Class-Path>lib-signed/bcprov-jdk15on.jar lib-signed/ncipherkm.jar</Class-Path>
                        </manifestEntries>
                      </transformer>
                    </transformers>
                    <artifactSet>
                      <excludes>
                        <exclude>org.bouncycastle:bcprov-jdk15on</exclude>
                        <exclude>com.squareup.nonmaven:ncipherkm</exclude>
                        <exclude>*:*:tar.gz:*</exclude>
                        <exclude>org.apache.hadoop:*</exclude>
                      </excludes>
                    </artifactSet>
                    <relocations>
                      <relocation>
                        <!-- hadoop uses guava 11, but our java repo uses guava 14 -->
                        <pattern>com.google.common.</pattern>
                        <shadedPattern>shaded_for_hadoop.com.google.common.</shadedPattern>
                      </relocation>
                      <relocation>
                        <!-- hadoop uses guice 3.x but our java repo uses guice 4.x -->
                        <pattern>com.google.inject.</pattern>
                        <shadedPattern>shaded_for_hadoop.com.google.inject.</shadedPattern>
                      </relocation>
                      <relocation>
                        <!-- CDH 4.3.0 uses hsqldb 1.8.0.7, but our java repo uses 2.2.4 -->
                        <pattern>org.hsqldb.</pattern>
                        <shadedPattern>shaded_for_hadoop.org.hsqldb.</shadedPattern>
                      </relocation>
                      <relocation>
                        <!-- CDH 4.3.0 uses protobuf 2.4.0a, but our java repo uses 2.4.1.square.1.3 -->
                        <pattern>com.google.protobuf.</pattern>
                        <shadedPattern>shaded_for_hadoop.com.google.protobuf.</shadedPattern>
                      </relocation>
                    </relocations>
                  </configuration>
                </execution>
              </executions>
            </plugin>
      
    3. OK - this sounds like a motivation from the not-yet-supported use case of shading libraries. It does make sense for that use case, so when that's implemented, flipping the default may make sense if the primitive rules exposed to targets are in fact these methods.

      They may be exposed as even higher level things though, like java_library(name=..., shade_rules=[shade_depenendency('3rdparty:guava')]) (taking the packages to recursively shade to be the transtive closure of orgs for the dep spec by default). If so, then I don't thing a sense flip on the bool is needed since exactly one piece of code implementing the shade_depenendency object aliased into BUILD files will need to know.

  4. src/python/pants/java/jar/shader.py (Diff revision 3)
     
     

    Knowing you, I'm sure these are working, but if I were the author I would have added more unit testing for these methods.

    1. Hrm, the coverage view of the source file isn't available on Coveralls: https://coveralls.io/builds/2306859/source?filename=src%2Fpython%2Fpants%2Fjava%2Fjar%2Fshader.py
      This must be true of any totally new file.

      At any rate - if your run PANTS_PY_COVERAGE=modules:pants.java.jar.shader PANTS_DEV=1 ./pants test tests/python/pants_test/java/jar/:shader and then open dist/coverage/tests.python.pants_test.java.jar.shader/index.html The only uncovered bit is the entirety of _iter_dir_packages and the conditional branch that calls it from _calculate_system_packages

      Both the javax.annotation check (comes from boot classpath scan), and the 'com.google.common.base' shade check (comes from the input_jar scan) are aimed at testing this - granted a bit inobvious.
      So this is a well covered black box test. I don't always agree that whitebox is necessarily a good thing, its necessarily more brittle for sure.

    2. I agree that the functionality is getting exercised, it just makes me feel more confident in my own code to add more whitebox testing. For example, I didn't see any cases that would have covered package_name set == None in _package_rule() and that would make me paranoid that I might have done something stupid like mess up a formatting string. I am not asking you to change, but just to understand why I seemingly go overboard in my own tests.

    3. All the cases that use 'main'/'main.class' as the main cover this.  I tried to make that clear with the test names, like `test_assemble_default_rules` vs. `test_assemble_default_rules_default_package`
  5. 
      
JS
JS
ZU
  1. Still LGTM

  2. 
      
PA
  1. Ship It!
  2. 
      
JS
  1. Thanks folks - submitted @ https://github.com/pantsbuild/pants/commit/b60fe25298df1799fdfbe89bbf254a77a0cc4290
  2. 
      
JS
Review request changed

Status: Closed (submitted)

ST
  1. 
      
  2. src/python/pants/java/jar/shader.py (Diff revision 4)
     
     
     

    I think one of the goals of shading is to allow all of the tools to run in the same nailgun instance? In that case, it would probably be good for the shade prefix to be unique/private for each tool.

    1. That wasn't a goal I had in mind, just to isolate the tool from the code it operates over, not the tool from another tool.
      
      The case you mention sortof makes sense, but it brings to mind other cases that definitely make sense / will happen.
      
      I'm dropping this RB issue since the RB is in master.
      Instead I've opened a github issue: https://github.com/pantsbuild/pants/issues/1420
  3. 
      
Loading...