Add support to Fetcher for `file:` URLs.

Review Request #4099 — Created July 20, 2016 and submitted

3324, 3697
This also supports bare absolute path names.

 src/python/pants/net/http/             | 169 +++++++++++++++++++++++++++++++++++++++++---------
 tests/python/pants_test/net/http/ |  53 +++++++++++++++-
 2 files changed, 192 insertions(+), 30 deletions(-)

Locally green: ./pants test tests/python/pants_test/net/http

CI went green here:

  2. src/python/pants/net/http/ (Diff revision 2)

    You probably want to sub :// for '' because someone might inadvertently have a URL path that contains double slashes.

    1. See the added test cases and tell me what you think.
    2. The potential problem I'm thinking of is "file://not/normalized//url" looks like it would be transformed to "file://not/normalizedurl"

    3. Aha - well what do you say to letting the filesystem handle that however it does?
    4. Oh oh oh, missed the ^ anchor on the // - gotcha, will fix.
    5. Added a failing test and fixed.
  1. Thanks John: looks great.

  2. src/python/pants/net/http/ (Diff revision 2)

    Making the root required, and passing this in explicitly would drop the build_environment dep, which might be worthwhile?

    Also, is the assumption that this will usually/always be located inside of the repo? Seems like it might not be... folks might set up NFS paths, etc. I ask because it would be neat to use ProjectTree here, but seems like a stretch.

    1. I'll take a look at the tendrils of plumbing the root everywhere Fetcher is used, if it looks big I'll circle back with a new RB to do just that.
      The value though is only used to resolve relative paths, which I think only make sense as relative to the buildroot. If there is an NFS mount, just file:/this/is/the/absolute/mount/path.

    2. I pushed the dep out to the immediate users since there were only 3.  Its not clear to me that's much better, but pushing the dep even farther out towards the tasks that eventually consume would be too big a ripple for this RB at least.
  3. src/python/pants/net/http/ (Diff revision 2)

    I'm unclear on python convention, but: should "private" inner classes be marked with _? ie, class _Response

    1. Not sure there is one, but I agree and tend to _Response, thanks for calling that out.

    2. Fixed.
  4. src/python/pants/net/http/ (Diff revision 2)

    Can use fstat here I think:

    1. I think you're right, passing self._fp._fileno(), I'll try that out.

    2. Fixed.
  1. Ship It!
Review request changed

Status: Closed (submitted)

Change Summary:

Now on master:

git log -1 origin/master
commit 0450732d8144e122282c1cfd256c12ae3e7abc2a
Author: John Sirois <>
Date:   Wed Jul 20 19:30:07 2016 -0600

    Add support to Fetcher for `file:` URLs.
    This also supports bare absolute path names.
    Testing Done:
    Locally green: `./pants test tests/python/pants_test/net/http`
    CI went green here:
    Bugs closed: 3324, 3697
    Reviewed at
  1. Checking in with a success story: was just able to use this to test a new watchman binary using the following:

    baseurls: +[
    version: 4.5.0-stuhood-1

    Great timing. Thanks John!