Fix unicode string on stdout causing taskerror

Review Request #3944 - Created May 28, 2016 and updated

Information
Ity Kaul
pants
3517, 3532
Reviewers
pants-reviews
jsirois, kwlzn, stuhood

- No encoding was set on payload which caused the encoding error detailed in the ticket
- Encode payload string with utf-8 before it gets written to stdout

  • Added a new test which fails without this change
  • Ran ci.sh locally

Issues

  • 0
  • 3
  • 0
  • 3
Description From Last Updated
Stu Hood
Kris Wilson
Ity Kaul
Ity Kaul
Review request changed
Kris Wilson
Ship It!
John Sirois

   

I'd be a fan of explicitly passing return_bytes=False here with an explanation; ie: the payload, which becomes working_dir or command can contain unicode chars as parts of path names or command argument values.
That said, there is 1 caller, and that caller doesn't use these tupe slots at all:

$ git grep -n "\.parse_request" -- src/
src/python/pants/pantsd/pailgun_server.py:62:    _, _, arguments, environment = NailgunProtocol.parse_request(self.request)

So can the new return_bytes parameter just be dropped wholesale and the bytes never decoded?

  1. I think the decode() is failproof for any future cases of this being used. Having said that, since that future is not here currently - happy to remove the decode altogether with a TODO of adding it back in if there was a caller in the future that needed it.

  2. The TODO seems like the best path to me.  At this low layer there is no telling what constitutes the byte stream and its really only the particular nailgun application riding above that knows its safe to decode the bytes in a particular way.
Loading...