Export: support export of sources and globs

Review Request #2082 — Created April 15, 2015 and submitted

fkorotkov, zundel

Add information about sources and/or source globs to the information
returned by pants export. This will be useful for generating
sparse-checkout files for git. The glob information is,
unfortunately, somewhat imprecise; it loses information about
excludes. That's because that data is not always possible to
linearize into a list of (globs and negative globs), which is what
sparse-checkout requires.

Also add the ability to skip library information, so that this
information can be generated more quickly.

https://travis-ci.org/pantsbuild/pants/builds/58939408 green

  • 0
  • 0
  • 1
  • 0
  • 1
Description From Last Updated
  1. My general feeling about export is that it should accurately model the data in pants. We shouldn't be massaging the export to fit a particular external program, we should let that downstream app take the raw data from pants and massage it as it needs fit. So what I'm saying is that we sould make the new export format accurately model the data that we have, excludes and all. If yo have to ignore excludes in your sparse checkout code then so be it, but another use of the export file might need an accurate representation, then we'll have to do something really ugly to satisfy new users of the export file and existing apps.

    1. OK, I guess that's straightforward.

  2. Do we need to make these optional? Can we just add new fields to the output and let tools ignore the information if they don't want it?

    1. sources, at least, has the potential to seriously bloat the output, which might slow down some consumers. So I'm inclined to keep that one as optional. globs is probably fine to make non-optional.

  2. src/python/pants/base/target.py (Diff revision 2)

    Can you add a comment about what is going on with this attribute lookup? What type of object are we matching, and why isn't this an isinstance check?

    1. If an isinstance check would work here, I'd prefer that because I think it would fit better with the existing control flow.


      • if is addr, return ...
      • if is filesetwithspec, return ...
      • otherwise, convert list of filenames; return ...
  3. src/python/pants/reporting/BUILD (Diff revision 2)

    Do we have a deprecation warning on fileset math yet?

    1. We don't, because fileset math comes from twitter.commons, so we can't easily add our own deprecation warning.

    2. We could with a wrapper maybe?... I think we need to deprecate this or else we're asking for a broken window where export breaks in any cases involving math.

  1. I've got a few suggestions, but I wouldn't call them blockers. This is an exciting change because I think it'll move things towards being able to rm some of the weirdnesses around detecting deleted files.

    My only concern with this change is that the term spec is a bit overloaded already. How would you feel about picking a slightly more specific name, maybe filespec?

  2. How about renaming this to to_filespec? If you aren't serializing to json, I feel like it's a little misleading.

  3. src/python/pants/backend/project_info/tasks/export.py (Diff revision 2)

    You could also guard this against library checking since ivy_info is unused if libraries is false.

  4. If you are skipping libraries, would you want to ignore JarLibrary's completely?

    1. Not necessarily? Mostly, I'm interested in the ability to skip the ivy resolve step.

  5. src/python/pants/backend/project_info/tasks/export.py (Diff revision 2)

    You could also guard this with the libraries flag.

    1. I could, but since that guard is already in get_transitive_jars, all I save is a function call at the cost of duplicating the check, since get_transitive_jars is called from two places.

  6. Maybe I'm missing something, but I think build_graph is unused by create_sources_field so I don't think it's necessary to add here.

  7. src/python/pants/base/payload_field.py (Diff revision 2)

    Document spec.

  8. src/python/pants/base/target.py (Diff revision 2)

    Do you want to special case DeferredSourceFields here or is it ok that they have a spec that's None?

    1. That's ok for my use case; if someone else feels different, we can always change it later.

  1. Can make the name of the field a bit more consistent, but otherwise looks good. Thanks!

    1. Unfortunately, a previous patch added exclude=[list], which is a bit confusing for a list of excludes. So we're going to have to stick with some of this inconsistency.

  2. src/python/pants/backend/core/wrapped_globs.py (Diff revisions 2 - 5)

    excludes or exclude?

  3. src/python/pants/backend/core/wrapped_globs.py (Diff revisions 2 - 5)

    exclude or excludes

Review request changed

Status: Closed (submitted)