Fixup jenkins-slave-connect.service pre-reqs.

Review Request #3849 — Created May 9, 2016 and submitted

jsirois
pants
jsirois/issues/3356/fix_prerequisites
3356, 3373
dd0734a...
pants-reviews
patricklaw
build-support/aws/ec2/packer/jenkins-slave.customize.sh | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

I noticed the slaves were still needing a retry to connect up JNLP.
Looked at the logs more closely, and the re-mount of dev/xvdb was
being performed by cloud-init (which should have been a known!) and
found a target reached by completion of cloud-init in the journalctl
logs:

May 09 15:31:10 ip-172-31-15-110 systemd[1]: Starting Starts Jenkins Master JNLP connector....
May 09 15:31:10 ip-172-31-15-110 systemd[1]: Starting /etc/rc.local Compatibility...
May 09 15:31:10 ip-172-31-15-110 systemd[1]: Reached target Cloud-config availability.
May 09 15:31:10 ip-172-31-15-110 systemd[1]: Starting Apply the settings specified in cloud-config...
May 09 15:31:10 ip-172-31-15-110 systemd[1]: Started /etc/rc.local Compatibility.
May 09 15:31:10 ip-172-31-15-110 systemd[1]: Started Starts Jenkins Master JNLP connector..
...
May 09 15:31:11 ip-172-31-15-110 cloud-init[1103]: [CLOUDINIT] util.py[DEBUG]: Running command ('mount', '-a') with allowed return codes [0] (shell=False, capture=True)
May 09 15:31:11 ip-172-31-15-110 systemd[1]: Starting LSB: Set the CPU Frequency Scaling governor to "ondemand"...
May 09 15:31:11 ip-172-31-15-110 systemd[1]: Starting LSB: automatic crash report generation...
May 09 15:31:11 ip-172-31-15-110 systemd[1]: Starting Permit User Sessions...
May 09 15:31:11 ip-172-31-15-110 kernel: EXT4-fs (xvdb): mounted filesystem with ordered data mode. Opts: discard
May 09 15:31:11 ip-172-31-15-110 cloud-init[1103]: [CLOUDINIT] handlers.py[DEBUG]: finish: modules-config/config-mounts: SUCCESS: config-mounts ran successfully
...
May 09 15:31:17 ip-172-31-15-110 jenkins-slave-connect[1113]: Traceback (most recent call last):
May 09 15:31:17 ip-172-31-15-110 jenkins-slave-connect[1113]:   File "/home/jenkins/bin/jenkins-slave-connect", line 60, in <module>
May 09 15:31:17 ip-172-31-15-110 jenkins-slave-connect[1113]:     with open('{workdir}/slave.jar'.format(**params), 'wb') as fp:
May 09 15:31:17 ip-172-31-15-110 jenkins-slave-connect[1113]: IOError: [Errno 2] No such file or directory: '/mnt/xvdb/jenkins/slave.jar'
May 09 15:31:17 ip-172-31-15-110 systemd[1]: jenkins-slave-connect.service: Main process exited, code=exited, status=1/FAILURE
May 09 15:31:17 ip-172-31-15-110 systemd[1]: jenkins-slave-connect.service: Unit entered failed state.
May 09 15:31:17 ip-172-31-15-110 systemd[1]: jenkins-slave-connect.service: Failed with result 'exit-code'.
May 09 15:31:17 ip-172-31-15-110 systemd[1]: jenkins-slave-connect.service: Service hold-off time over, scheduling restart.
May 09 15:31:17 ip-172-31-15-110 systemd[1]: Stopped Starts Jenkins Master JNLP connector..
May 09 15:31:17 ip-172-31-15-110 systemd[1]: Starting Starts Jenkins Master JNLP connector....
May 09 15:31:17 ip-172-31-15-110 systemd[1]: Started Starts Jenkins Master JNLP connector..
...
May 09 15:31:18 ip-172-31-15-110 systemd[1]: Reached target Cloud-init target.

Now waiting on this target which appears to work.
Slaves now look like so (no retries):

ubuntu@ip-172-31-10-99:~$ journalctl -u jenkins-slave-connect.service 
-- Logs begin at Mon 2016-05-09 19:49:39 UTC, end at Mon 2016-05-09 19:50:44 UTC. --
May 09 19:50:25 ip-172-31-10-99 systemd[1]: Starting Starts Jenkins Master JNLP connector....
May 09 19:50:25 ip-172-31-10-99 systemd[1]: Started Starts Jenkins Master JNLP connector..
May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:43 PM hudson.remoting.jnlp.Main createEngine
May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Setting up slave: 3e20dad1-5532-4f72-b946-27279dc6bbdf
May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:43 PM hudson.remoting.jnlp.Main$CuiListener <init>
May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Jenkins agent is running in headless mode.
May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:43 PM hudson.remoting.jnlp.Main$CuiListener status
May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Locating server among [http://jenkins.pantsbuild.org/]
May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:43 PM hudson.remoting.jnlp.Main$CuiListener status
May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Handshaking
May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:43 PM hudson.remoting.jnlp.Main$CuiListener status
May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Connecting to jenkins.pantsbuild.org:48985
May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:43 PM hudson.remoting.jnlp.Main$CuiListener status
May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Trying protocol: JNLP3-connect
May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:44 PM hudson.remoting.jnlp.Main$CuiListener status
May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: JNLP3-connect: Incorrect challenge response from master
May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:44 PM hudson.remoting.jnlp.Main$CuiListener status
May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Connecting to jenkins.pantsbuild.org:48985
May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:44 PM hudson.remoting.jnlp.Main$CuiListener status
May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Trying protocol: JNLP2-connect
May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:44 PM hudson.remoting.jnlp.Main$CuiListener status
May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Connected

CI went green using this image here:
http://jenkins.pantsbuild.org/job/pantsbuild/job/pants/branch/PR-3373/1/

  1. 
      
  2. This is gratuitous, but AMI builds are very slow, so threw in debugging for this operation which appears to fail most of the time (404).  I'll need to come up with some other means to read labels that is not fraught with timing issues.
  3. 
      
  1. Ship It!
  2. 
      
Review request changed

Status: Closed (submitted)

Change Summary:

Now on master:

git log -1 origin/master
commit c34c39096731f73fbde757beb2cbd024bf9ec3a2
Author: John Sirois <john.sirois@gmail.com>
Date:   Mon May 9 18:48:52 2016 -0600

    Fixup jenkins-slave-connect.service pre-reqs.
    
    Testing Done:
    I noticed the slaves were still needing a retry to connect up JNLP.
    Looked at the logs more closely, and the re-mount of `dev/xvdb` was
    being performed by `cloud-init` (which should have been a known!) and
    found a target reached by completion of `cloud-init` in the `journalctl`
    logs:
    ```
    May 09 15:31:10 ip-172-31-15-110 systemd[1]: Starting Starts Jenkins Master JNLP connector....
    May 09 15:31:10 ip-172-31-15-110 systemd[1]: Starting /etc/rc.local Compatibility...
    May 09 15:31:10 ip-172-31-15-110 systemd[1]: Reached target Cloud-config availability.
    May 09 15:31:10 ip-172-31-15-110 systemd[1]: Starting Apply the settings specified in cloud-config...
    May 09 15:31:10 ip-172-31-15-110 systemd[1]: Started /etc/rc.local Compatibility.
    May 09 15:31:10 ip-172-31-15-110 systemd[1]: Started Starts Jenkins Master JNLP connector..
    ...
    May 09 15:31:11 ip-172-31-15-110 cloud-init[1103]: [CLOUDINIT] util.py[DEBUG]: Running command ('mount', '-a') with allowed return codes [0] (shell=False, capture=True)
    May 09 15:31:11 ip-172-31-15-110 systemd[1]: Starting LSB: Set the CPU Frequency Scaling governor to "ondemand"...
    May 09 15:31:11 ip-172-31-15-110 systemd[1]: Starting LSB: automatic crash report generation...
    May 09 15:31:11 ip-172-31-15-110 systemd[1]: Starting Permit User Sessions...
    May 09 15:31:11 ip-172-31-15-110 kernel: EXT4-fs (xvdb): mounted filesystem with ordered data mode. Opts: discard
    May 09 15:31:11 ip-172-31-15-110 cloud-init[1103]: [CLOUDINIT] handlers.py[DEBUG]: finish: modules-config/config-mounts: SUCCESS: config-mounts ran successfully
    ...
    May 09 15:31:17 ip-172-31-15-110 jenkins-slave-connect[1113]: Traceback (most recent call last):
    May 09 15:31:17 ip-172-31-15-110 jenkins-slave-connect[1113]:   File "/home/jenkins/bin/jenkins-slave-connect", line 60, in <module>
    May 09 15:31:17 ip-172-31-15-110 jenkins-slave-connect[1113]:     with open('{workdir}/slave.jar'.format(**params), 'wb') as fp:
    May 09 15:31:17 ip-172-31-15-110 jenkins-slave-connect[1113]: IOError: [Errno 2] No such file or directory: '/mnt/xvdb/jenkins/slave.jar'
    May 09 15:31:17 ip-172-31-15-110 systemd[1]: jenkins-slave-connect.service: Main process exited, code=exited, status=1/FAILURE
    May 09 15:31:17 ip-172-31-15-110 systemd[1]: jenkins-slave-connect.service: Unit entered failed state.
    May 09 15:31:17 ip-172-31-15-110 systemd[1]: jenkins-slave-connect.service: Failed with result 'exit-code'.
    May 09 15:31:17 ip-172-31-15-110 systemd[1]: jenkins-slave-connect.service: Service hold-off time over, scheduling restart.
    May 09 15:31:17 ip-172-31-15-110 systemd[1]: Stopped Starts Jenkins Master JNLP connector..
    May 09 15:31:17 ip-172-31-15-110 systemd[1]: Starting Starts Jenkins Master JNLP connector....
    May 09 15:31:17 ip-172-31-15-110 systemd[1]: Started Starts Jenkins Master JNLP connector..
    ...
    May 09 15:31:18 ip-172-31-15-110 systemd[1]: Reached target Cloud-init target.
    ```
    
    Now waiting on this target which appears to work.
    Slaves now look like so (no retries):
    ```
    ubuntu@ip-172-31-10-99:~$ journalctl -u jenkins-slave-connect.service
    -- Logs begin at Mon 2016-05-09 19:49:39 UTC, end at Mon 2016-05-09 19:50:44 UTC. --
    May 09 19:50:25 ip-172-31-10-99 systemd[1]: Starting Starts Jenkins Master JNLP connector....
    May 09 19:50:25 ip-172-31-10-99 systemd[1]: Started Starts Jenkins Master JNLP connector..
    May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:43 PM hudson.remoting.jnlp.Main createEngine
    May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Setting up slave: 3e20dad1-5532-4f72-b946-27279dc6bbdf
    May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:43 PM hudson.remoting.jnlp.Main$CuiListener <init>
    May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Jenkins agent is running in headless mode.
    May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:43 PM hudson.remoting.jnlp.Main$CuiListener status
    May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Locating server among [http://jenkins.pantsbuild.org/]
    May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:43 PM hudson.remoting.jnlp.Main$CuiListener status
    May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Handshaking
    May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:43 PM hudson.remoting.jnlp.Main$CuiListener status
    May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Connecting to jenkins.pantsbuild.org:48985
    May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:43 PM hudson.remoting.jnlp.Main$CuiListener status
    May 09 19:50:43 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Trying protocol: JNLP3-connect
    May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:44 PM hudson.remoting.jnlp.Main$CuiListener status
    May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: JNLP3-connect: Incorrect challenge response from master
    May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:44 PM hudson.remoting.jnlp.Main$CuiListener status
    May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Connecting to jenkins.pantsbuild.org:48985
    May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:44 PM hudson.remoting.jnlp.Main$CuiListener status
    May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Trying protocol: JNLP2-connect
    May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: May 09, 2016 7:50:44 PM hudson.remoting.jnlp.Main$CuiListener status
    May 09 19:50:44 ip-172-31-10-99 jenkins-slave-connect[1298]: INFO: Connected
    ```
    
    CI went green using this image here:
      http://jenkins.pantsbuild.org/job/pantsbuild/job/pants/branch/PR-3373/1/
    
    Bugs closed: 3356, 3373
    
    Reviewed at https://rbcommons.com/s/twitter/r/3849/
Loading...