Advance JVM bundle options, and enable them in jvm_app target as well

Review Request #3910 — Created May 19, 2016 and submitted

yujiec
pants
3454, 3471
pants-reviews
benjyw, peiyu, stuhood

This review does following:
1. JVM bundle options defined in BundleCreate become advanced options
2. Options bundle-jvm-deployjar and bundle-jvm-archive are enabled in jvm_app target, which means you can specify them in BUILD file.
3. Deprecate bundle-jvm-archive-prefix option.
4. The precedence is CLI option > target option > pants.ini option
5. Add test cases in both test_jvm_app.py and test_bundle_create.py

ci green:
https://travis-ci.org/pantsbuild/pants/builds/134520336

  • 0
  • 0
  • 9
  • 0
  • 9
Description From Last Updated
  1. 
      
  2. Please document the new fields using descriptions similar to the original flags. Alternatively, move the descriptions from the original flags, and change those help='..' strings to reference the JvmApp docs instead.

    1. ...it's a good idea in general, but also: this information ends up on the docsite: http://www.pantsbuild.org/build_dictionary.html

    2. So the alternative is to reference JvmApp __init__ docstring in "help=..." string in bundle_create?
      
      __init__ docstring here will end up on docsite?
    3. help message is user facing right? should we make "help=" refer to pants website (http://www.pantsbuild.org/build_dictionary.html)?

    4. That's probably fine. But just referring to "the documentation for jvm_app" would probably be fine too.

  3. Please document this method.

  4. You can't assume that all/any of the target roots will be jvm_app targets.

  5. src/python/pants/backend/jvm/tasks/bundle_create.py (Diff revision 1)
     
     
     
     
     
     
     
     
     
     
     
     
     

    This is probably not the right way to resolve this issue... please talk with Peiyu about whether there is a better way to resolve this issue in the context of these parameters being per-target, rather than being global to the run.

    1. By giving basename already implies they want to use basedname as opposed to pants supplied target.id based naming. In other words, having basename and use_basename_prefix at the same time is redundant.

      use_basename_prefix is a workaround that's better be hidden from user, whether or not we want to kill it and move forward to a single naming convention is a different discussion.

    2. Discussed this logic with Yujie. I proposed to modify the operation under "use_basename_prefix = True" as follows:
      1. If a target is a JvmApp, bundle it no matter it is in root or not;
      2. If a target is not a JvmApp, bundle it only if it is in root.

    3. We still do not need to worry about naming conflicts because the underline JvmBinary target is not bundled.

    4. sgtm, probably a separate review.

    5. I think I add it to my review of "exposing products" - https://github.com/pantsbuild/pants/pull/3501.

      Let's land this one first.

    6. Sure. Thanks @peiyu @yujie. Just a discussion here. Better to implement it in a new review.

  6. Each app in apps might have a different deployjar setting... this _resolved_option call should be for app.

  7. 
      
  1. 
      
  2. i suspect this one is rarely used and can be killed.

    When giving a basename, people are already used to basename prefix-ed archive names, I searched our repo and confirmed
    1) there are only 2 instances of --bundle-jvm-archive-prefix which is the same as not using it
    2) there is no instance of --no-bundle-archive-prefix

    1. Is there a deprecation process for cli options? Also should I ask in Slack room since other companies may use this flag?

    2. Yes and sortof.

      To deprecate an option, mark a removal version: https://github.com/pantsbuild/pants/blob/master/src/python/pants/backend/jvm/tasks/junit_run.py#L82-L84

      And to see whether it is critical to any other users, mailing pants-devel@ is probably more thorough.

  3. 
      
  1. 
      
  2. are we going to deprecate this? in your description says

    3. Deprecate bundle-jvm-archive-prefix option.
    
    1. in bundle_create.py register_option, I set a deprecate version. Are you saying since we will deprecate it we don't need to put it in jvm_app target?

    2. yea so we can simplify.

    3. ok I have remove all usage od archive-prefix (just one place actually) to prepare for deprecation.

  3. mention when used from cli will take precedence over target config

  4. archive and deployjar can live here, and provide a factory method to create App from options.

  5. 
      
  1. nitpicks

  2. feels better to keep the redundant doc copy here as opposed to prompting user to look up.

  3. options is too big, replace with just deployjar boolean and archive and call resolved_option outside.

    1. resolved_option also uses options.is_flagged methods, that's why I passed options.
      If we do not want to do that, maybe add these 2 flagged boolean to paramater list as well?

    2. I think I will move _resolved_option back to bundlecreate class again, and pass resolved option values to App.create_app.
      See my next review.

  4. minor: see if not doing list comprehension is more readable

  5. remove, default is None

  6. 
      
  1. 
      
  2. these two parameters are not doc-ed, can you add?

    1. I think it is intentional. doc for name and payload is already in doc string of base class (Target).

  3. 
      
  1. Thanks, looks good to me.

    @Yujie: please double check that this satisfies all the common flags that are used internally.

  2. Also a string.

    Would be good to validate this as being one of the choices from the option: archive.TYPE_NAMES

  3. The first sentence of a method comment should be on its own line, with a blank line before the rest of the description:

    def example(self):
      """This is an example method.
    
      This is a longer description.
      """
      return
    
  4. The not isinstance(target, JvmApp) is worrying. This method should probably not be running for things that aren't JvmApps... otherwise filtering that ran early probably didn't work?

  5. 
      
  1. So. I am not crazy about this change. We basically don't allow deployjars and we are not using the basename option. But this makes this task incredibly hard to reason about.

    And there are Pants shops that are actively using these flags. In Slack today someone mentioned they run:

    ./pants bundle —bundle-jvm-deployjar —bundle-jvm-use-basename-prefix src/scala/com/...

    1. The current usage of those options will not be affected. Now we can specify "archive" and "deploy" in jvm_app target, which means we don't have to give command line options every time (and get rid of bash scripts that are used to start pants with those options).

    2. @Mateo: As Yujie said: that usecase is still 100% supported, and is not deprecated so much as "lightly discouraged".

      The reason to discourage passing them on the CLI is that instead you want to encourage defining defaults for a repo or per-target values. This avoids having each ./pants bundle command be its own unique flower, which improves build reproducibility.

      As background: there are currently many thousands of scripts at Twitter which pass various archive types and bundle settings for their binaries. This means that it is much, much harder than it needs to be to reproduce someone's deploy binary... you'd need to parse scripts to figure out what people are deploying. We would like to encourage moving those parameters into BUILD files so that we can have automation to build all bundles without teams needing their own scripts.

    3. My understanding is that 'advanced' options were things that you might set in the pants.ini file once and forget about. Marking it advanced is basically okay with me though - I don't use and have no real opionions past generally feeling it a bad fit.

      I pushed back against the things I see as issues. I don't want to block you on them, so drop any remaining issues as #wontfix as you like.

  2. Has 'Class-Path' always been a thing? That is weird to me, even though I see it elsewhere in the codebase.

    1. Sorry I don't know much about this. Here I was simply copying from corresponding cli option's help message. You may want to bring it up in Slack room?

    2. @Mateo, yea Class-path has always been a thing for bundle, essentially it's just a list of all jars under lib/

    3. Yeah, I am familiar with what it is, I just hadn't seen that variation ('Class-Path' instead of the standard 'classpath') before now. No big deal, just was curious - it is elsewhere in the code base already.

    4. ah, this is the Class-Path entry in the manifest: https://docs.oracle.com/javase/7/docs/technotes/guides/extensions/spec.html#bundled.

      Maybe we could say Class-Path header in synthetic jar's manifest to make it more clear.

    5. Ah ha - TIL. Thanks peiyu!

  3. Considering your changes, the App inner class is not providing a lot of value, and is closer to one name for two very discrete items. In fact, given the custom option parsing and if-elses in the instantiation, the task has become really hard to reason about.

    If you think App is worth keeping, how about converting basename and bundles into properties that can encapsulate the logic?

    1. App is a wrapper around both jvm_binary and jvm_app targets. They share a lot of common fields, and can both be executed in this task. I believe that is why we have this class in the first place.

      But what is the benefit of making 'basename' and 'bundles' as properties over having 'basename' and 'bundles' logic in factory method?

    2. I don't really like packing a bunch of inlined if-else statements into the method calls, although I wasn't super clear in what I was asking for. I was thinking something like:

      apps.append(self.App(target, deployjar, archive, use_basename_prefix=False):
      
      class App(object):
        def __init__(self, target, deployjar, archive, use_basename):
          self.target = target
          self.deployjar = deployjar
          self.archive = archive
      
       @property
       def bundles(self)
         return self.target.payload.bundles if isinstance(self.target, JvmApp) else []
      
       @property
       def binary(self):
         etc.
      

      This is a lot easier to understand, imo, because it loses the create_app factory and just passes the target instead of expressions operating over the target. I am sure reasonable people could disagree.

    3. Matt, I changed to this factory method approach based on Peiyu's suggestion and I personally liked it. But I understand your concern about having if-else in method calls as well.
      How about I put those if-else logic before the "return cls(...)" statement but still inside create_app method?

    4. My opinion is that this class is too case-ridden. But there have been plenty of eyes on this code and if they say Ship It I will not block on it. I will drop the issue.

  4. I am not so sure about this. If you need custom or advanced option handling, I think that should go in the options system, not in the bundling code.

    Could you please add benjyw to this review and get his thoughts? He spent a lot of time making the options logically consistent, it would be nice to get his Ship It here.

    1. this logic is local to this task only. Only jvm_app target will have options in target level, which requires the special handling in bundle_create task.

      added Benjy.

    2. @Yujie: It is likely that other targets will want to do this same thing at some point.

      But @Mateo: it's likely a bit premature to try to generalize it until we see more cases where folks would like to do this.

    3. It boils down to an ad-hoc options syntax that is exclusive to this task, that is a can of worms imo.

    4. Basically the same thing applies here, I will drop the issue. I would much prefer that tasks not invent their own option handling. But you reached out to Benjy like I asked so I consider your diligence done.

    5. Thank you, Mateo. As Benjy says, I will create a new ticket and assign to him.

  5. I think this is an error in waiting even if it isn't triggered yet. You conceivably have every target in context as targets_to_bundle being passed to the consolidate classpath actions.

    1. I don't quite understand the issue here. what is the problem in this line? If you are talking about the consolidation logic below, maybe mark those lines?

    2. I will drop this issue. Having all the targets in context be called targets_to_bundle isn't great, but it wasn't introduced by you. My fear was that someday someone will pass targets_to_bundle to the consolidate_classpath methods, which then mutate the classpath for every target in context.

      Of course, at the moment there are no consumers of BundleCreate. But I have internal tasks that would greatly benefit from consuming the Bundle products so it isn't out of the question.

  6. 
      
  1. 
      
  2. Remove stray double-quote at the end of the sentence.

  3. I agree with Mateo. This is logic we may want elsewhere, especially in the new engine.

    But for now I'm fine if you add a TODO here saying that the CLI > target > config logic should be implemented in the options system. And then open a GitHub issue and assign it to me.

    1. It seems I don't have the permission to assign the issue to you.
      https://github.com/pantsbuild/pants/issues/3538

  4. 
      
Review request changed

Status: Closed (submitted)

Change Summary:

Merged as 02f625228457bf1f402231cc41b754c8a7e04692

Loading...