Make :API: public types useable.

Review Request #3752 — Created April 24, 2016 and submitted

jsirois
pants
jsirois/API/public/hide_teasers
3264
3551, 3655, 3520, 3515
pants-reviews
benjyw, gmalmquist, kwlzn, mateor, molsen, nhoward_tw, patricklaw, stuhood, zundel
These types were all marked :API: public but were unuseable since none
of their methods or types were likewise marked.  Using guidance from
users of the types, expose a minimal set of public nested types and
methods.

 src/python/pants/backend/jvm/subsystems/jvm.py | 4 +---
 src/python/pants/ivy/ivy_subsystem.py          | 4 +---
 src/python/pants/java/executor.py              | 5 +----
 src/python/pants/scm/scm.py                    | 5 +----
 src/python/pants/task/repl_task_mixin.py       | 2 --
 5 files changed, 4 insertions(+), 16 deletions(-)
CI went green here:
  https://travis-ci.org/pantsbuild/pants/builds/125645112
  1. I came across this problem in https://rbcommons.com/s/twitter/r/3750/ for DistributionLocator and Distribution.  Since I'm working on those to close https://github.com/pantsbuild/pants/issues/2894, I've filed a specific issue to tackle Distribution: https://github.com/pantsbuild/pants/issues/3263  This RB addresses the rest I could find easily, guided by `git grep ":API: public" | sed -r -e "s|\s+| |" | sort | uniq -c | sort -n | grep "      1 " | grep -v targets/ | grep -v tasks/ | grep ".py:"`: ie, what files contain just 1 `:API: public` - they are likely to be wrong unless the marker is applied to a top-level function or top-level exception.  I also ignored targets and tasks for now.  Targets presumably should have their constructors marked public and tasks are complex - not sure if a type-level public marker on a task is just meant to say this type can be used in goal installs and uninstalls, but not subclassed.
    
    An alternative to this RB would be for me to mark all "public" methods in the exposed types public, or to pick the public and documented ones.  Instead of guessing though, I took the simpler tack of eliminating the current tease state.
  2. 
      
  1. 
      
  2. I think the rationale here is that we want to mark a Subsystem is a public API. You are free to add it to a custom class and use it.

    The original need for susbystems is to share common option values. We don't have a way to go through and mark all the individual options as public/private but regardless, they must follow the same deprecation policy. I take this marker means that you can't change the way this class behaves in general.

    Maybe there is something we could do to make this clearer in:

    https://pantsbuild.github.io/deprecation_policy.html

    1. I think for subsystems to be API public we should require plain old property accessors for flag values that should be exposed, those property methods can then be tagged public.  That would seem to fit with the spirit of the system as I read it.
    2. That's not the way subsystems are currently used and it seems onerous to require that. The field 'options' is reused constantly without adding a separate property method.

      The definition for public APIs at https://pantsbuild.github.io/deprecation_policy.html include the following disallowed changes:

      • Options cannot be removed.

      So it does make sense to mark this as API public even if no methods are accessible. You can add this subsystem to your task and thereby count on the fact that no one is going to change one of these options.

    3. But that would break the current API folks use to access the flag values ... so the specification of special rules for subsystems in the doc makes sense.  I'll post a diff with a doc change and re-en-statement of the subsystems in this RB as `:API: public`.
      I'm not a fan of exposing subsystems even to code in pants core itself beyond a dedicated intermediary, but that pattern is not in universal or even wide use yet.
    4. I missed your response, mostly ACK.  I disagree with the onerous (but that's moot) - for a reasonably testable / composable software system, the subsystems should just be factories for "real" types that can be used without a special (flag-provided) environment in my strong opinion.  We have been using a convenient anti-pattern in my estimation.  My work on https://github.com/pantsbuild/pants/issues/2894 will be an example of moving away from the antipattern; hopefully it will be convincing.
    5. I agree that if options are supposed to exposed as the public API of a subsystem that that should be declared as properties or be consumed by methods of the subsystem. I think that the idea of a subsystem is to be more than just an options bucket, and to actually encapsulate shared logic.

      So I don't think the special case of "all fields of a subsystem are public" makes much sense here.

    6. It sounds like we agree, but I relented because this style of use actually exists and so changing it breaks the API for those cases.  The only sensible path I can see to thread this needle is to further modify the doc to mention this set of special cases exists but is discouraged from further use.  If you have some other solution, I'm all ears.
    7. Since I see no other solution, I'll proceed to update the docs.
    8. See what you think.
    9. The example - though not perfect since DistributionLocator.cached must be retained for API compatibility reasons, is here: https://rbcommons.com/s/twitter/r/3778/
      That RB basically attempts to turn DistributionLocator - a subsystem, into a pure factory for Locator objects, which have no dependency on options at all.  It's only stymied by API compat concerns for 1 method as noted.
  3. 
      
  1. 
      
  2. src/python/pants/java/executor.py (Diff revision 2)
     
     

    In this case, I think we want to keep the API marking on this and mark this as an API method. We use it at Square.

    1. OK - will do.
    2. Fixed.  NB: this required a bit more be exposed since there is a (previously undocumented) custom exception type involved.
  3. Once again, we use this Mixin in a plugin at Square. If you feel it is inconsistent to have no methods marked public, then launch_repl() should be marked as the public interface.

    1. K - will do.
      
      To be clear - this is not "if you feel", its how it is globally.  From my consult with Matt and as backed up by docs, if the method isn't marked public, its not public.  I personally think it would have been more sane to only expose the type explicitly, and have all nested "public" members (no underscore prefix), be implicitly public.  But that's not the case.
    2. Fixed.  I also exposed the other abstract method - doesn't seem to make sense to me to say folks can subclass but they can't subclass both/either of the hook methods.
    3. Thanks for that.

      The problem we had with defining "API:Public" the same as all members that don't have an underscore is that doesn't allow pants modules to share methods between each other that aren't explicitly exposed to the world as a public API. So marking all methods expicitly that are supposed to be an official API method is what we came up with.

  4. 
      
  1. 
      
  2. src/python/pants/scm/scm.py (Diff revision 5)
     
     

    I doubt anyone has implemented another class derived from Scm yet, so whether this is public or not is kind of moot. Still, if that's what was intended, it seems like you would have to mark every abstract property and method as public.

    1. Agreed.  Instead of guessing intent I'll leave as-is until I get an explicit list from someone using this code outside of pants.
      
      As an aside, I think a tactical error was having Matt do most or all of these API markings.  After one small example RB I think folks who wanted stuff to stay stable because they actually used it, should have been asked to step up and do the RBs.  In that way - no guessing, and the folks with the vested interest would naturally pay closer attention to the details of exactly what they needed.
    2. I thank Matt for doing this. You've been out a long time and there has been a ton of discussion about how to do this that probably is lost in the wind. He gathered information from us and correctly marked the classes we were using. For all I know there was a discussion on the marking / non-marking of @abstractXXX and it just didn't get recorded in the deprecation policy. Garrett and I are extremely busy these days (I manage several projects now, Garrett is busy trying to remove the last remnants of Maven from our build.) If this process had blocked waiting on us this would have never gotten done, and 1.0.0 would have been hopelessly blocked.

    3. All no doubt.
      
      I will say though, that classes being marked API public with no methods being marked public are as good as it never having happened.  Anyone can go change those methods and break you.  So the perceived speed win of having a third party do this is not really as much of a speed win as it may have seemed!  This was my surprise on looking at the :API: stuff for the 1st time.  Delineating an API is a great thing to do, but its a hard, detailed thing to do as well.  I think there are no shortcuts there.
      
      Since we're trying to do the hard thing, I want to do it right.
  3. 
      
  1. This patch's name/description no longer really matches the content - this mostly adds new Public markers and only removes the one from Scm.
    But this was enlightening, I previously misunderstood some nuances of public tags. In general I agree with John's point - I am not sure that it is currently possible to create a working plugin using only public API. As an API provider, it would be best if a plugin developer did not need to reach for a "hidden" API.

    But I am okay with public tags on classes with no current public members- it gives a better guarantee than no tag, in that you can count on the module being available for the entire 3 point deprecation cycle.

    In reference to one of John's comments - I'd personally like to see all targets be marked public by default. Future task and plugin developers will hopefully surprise use with the way they leverage the build graph. Does tagging a Target's docstring transitively include the constructor?

    From Foursquare's perspective - we would love some guarantees of stability for our plugins. But we use so many private APIs that we have already punted on that guarantee. Buildgen alone uses 10+ classes that are considered private. So the fact that we use Scm is not a blocker.

    1. I'll wait until the very end to change the description - we'll see how this shakes out.

      But I am okay with public tags on classes with no current public members- it gives a better guarantee than no tag, in that you can count on the module being available for the entire 3 point deprecation cycle.

      I'll agree to disagree on that one. These class-only markings were almost surely simple oversights, but nonetheless they are confusing and misleading - they guarantee nothing except your ability to:

      1. import the type
      2. isinstance check a value against the type
        ... nothing else!

      Basically, as I said before, communicating a tight stable API is a great goal, but its an anal goal, and I think we need to actually be anal about supporting it if we want to support it at all.

      I think there is something to be said for the one-time cost of the API consumer putting their money where their mouth is and spending the time to grab the guarantees they want with RBs that fix :API: public markings to actually work for their case, or else being fine with deferring that cost to break-fixes when they occur.

      Does tagging a Target's docstring transitively include the constructor?

      Not according to the docs. The docs do not specially distringuish Targets or Tasks, so a strightforward reading says constructors must be marked public. That said, I think the docs could be clarified to say the Target case is special and the constructor is always implicitly public if the target type is marked as such. I'm going to continue to keep the scope of this RB away from Target and Task though just to keep it scoped. I think settling these questions is important.

  2. src/python/pants/scm/scm.py (Diff revision 5)
     
     

    Believe it or not, we subclass Scm internally. We implement a FakeScm for times when the build directory is not under source control. This FakeScm overrides every method in the Scm class.

  3. 
      
  1. For the most I agree with John. If we take the docs at their word then we have little public code - not enough to credibly support plugin development. I am willing to volunteer to mark Foursquare's usage more completely.

    Is API public inherited by subclasses? If not, then no product_types are public, which means at the moment it is impossible to extend the build graph without using private API. And even then, the TaskBase class is not marked public, so according to the docs none of those methods qualify. Some general agreement is probably required on pants-devel regarding subsystem, target and tasks availability.

  2. 
      
  1. Ship It!
  2. 
      
  1. I'm going to submit with this set of ship-its but I'll circle back if there are further comments.
  2. 
      
Review request changed

Status: Closed (submitted)

Change Summary:

Now on master:

git log -1 origin/master
commit 56ea26bae6fd00eecf3f3f7d9e9d0577439d76aa
Author: John Sirois <john.sirois@gmail.com>
Date:   Mon Apr 25 14:55:55 2016 -0600

    Make :API: public types useable.
    
    These types were all marked :API: public but were unuseable since none
    of their methods or types were likewise marked.  Using guidance from
    users of the types, expose a minimal set of public nested types and
    methods.
    
    Testing Done:
    CI went green here:
      https://travis-ci.org/pantsbuild/pants/builds/125645112
    
    Bugs closed: 3264
    
    Reviewed at https://rbcommons.com/s/twitter/r/3752/
Loading...