Initial round of pantsd + new engine + watchman integration.

Review Request #3524 — Created March 2, 2016 and submitted

Submitter: kwlzn
Repository: pants
Branch: kwlzn/pantsd/engine_integration
Bugs: 2989
Groups: pants-reviews
People: benjyw, ity, patricklaw, peiyu, stuhood
  • Implement a new SchedulerService for managing the online scheduler instance.
  • Revive FSEventService and friends for Watchman integration.
  • Stop resetting Subsystem options, to prevent uninitialized subsystem errors when attempting to use the WatchmanLauncher subsystem in pantsd - the options get stomped on run over run anyhow.
  • Make use of FSEventService and SchedulerService optionable - defaulting to off.
  • Implement a test target in exp to launch pantsd with a copy of the scheduler and listen for file events (currently all this does is log the events - graph invalidation coming soon!).
  • Extend the kill-pantsd goal to also shutdown Watchman when so configured.

CI is away @ https://travis-ci.org/pantsbuild/pants/builds/113015980


while running this in a separate window:

$ tail -F .pants.d/pantsd/pantsd.log .pants.d/watchman/watchman.log

spin up an unconnected pantsd instance using the test target w/ options to enable fs-event-detection:

[illuminati pants (kwlzn/pantsd/engine_integration)]$ ./pants run src/python/pants/engine/exp/legacy:pantsd -q -- src/python/pants/pantsd:: -ldebug --pantsd-fs-event-detection
*** pantsd launched ***
DEBUG] acquiring lock: <OwnerPrintingPIDLockFile: u'/Users/kwilson/dev/pants/.pantsd.startup' -- u'/Users/kwilson/dev/pants/.pantsd.startup'>
DEBUG] launching pantsd
DEBUG] purging metadata directory: /Users/kwilson/dev/pants/.pids/pantsd
DEBUG] released lock: <OwnerPrintingPIDLockFile: u'/Users/kwilson/dev/pants/.pantsd.startup' -- u'/Users/kwilson/dev/pants/.pantsd.startup'>
DEBUG] pantsd is running at pid 78563

see log output in the tail window:

D0301 17:45:57.109353 78563 pants_daemon.py:121] logging initialized
I0301 17:45:57.109931 78563 pants_daemon.py:170] pantsd starting, log level is DEBUG
I0301 17:45:57.111968 78563 pants_daemon.py:144] starting service <pants.pantsd.service.scheduler_service.SchedulerService object at 0x106902510>
I0301 17:45:57.112404 78563 pants_daemon.py:144] starting service <pants.pantsd.service.pailgun_service.PailgunService object at 0x1069021d0>
I0301 17:45:57.112715 78563 pailgun_service.py:51] starting pailgun server on port 56898
I0301 17:45:57.112966 78563 pants_daemon.py:144] starting service <pants.pantsd.service.fs_event_service.FSEventService object at 0x106902450>
I0301 17:45:57.116261 78563 watchman_launcher.py:56] watchman is running, pid=70857 socket=/Users/kwilson/dev/pants/.pants.d/watchman/watchman.sock
D0301 17:46:02.122070 78563 watchman.py:147] watchman command_list is: [[u'subscribe', '/Users/kwilson/dev/pants', u'all_files', {'fields': [u'name'], 'expression': [u'allof', [u'type', u'f'], [u'not', [u'dirname', u'dist', [u'depth', u'eq', 0]]], [u'not', [u'match', u'.*', u'wholename']], [u'not', [u'match', u'*.pyc']]]}]]
I0301 17:46:02.265883 78563 watchman.py:153] confirmed watchman subscription: {'subscribe': 'all_files', 'version': '3.1.0', 'clock': 'c:1456869416:70857:1:4433'}
I0301 17:46:02.267790 78563 scheduler_service.py:44] enqueuing 4968 changes for subscription all_files
D0301 17:46:02.292135 78563 scheduler_service.py:75] processing 4968 files for subscription all_files (first_event=True)
D0301 17:46:03.268616 78563 fs_event_service.py:143] callback ID 1 for all_files succeeded
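
For reference, the subscription logged above can be rebuilt as plain data. This is an illustrative sketch only: `build_subscribe_command` is a hypothetical helper name, not a function from the pants codebase, and the expression terms are copied from the `watchman command_list` debug line rather than from the source.

```python
def build_subscribe_command(build_root, subscription_name="all_files"):
    """Return a watchman `subscribe` command shaped like the one in the log above.

    Hypothetical helper for illustration; the terms mirror the logged command.
    """
    expression = [
        "allof",                                           # every term below must hold
        ["type", "f"],                                     # regular files only
        ["not", ["dirname", "dist", ["depth", "eq", 0]]],  # exclude the `dist` dirname term, as logged
        ["not", ["match", ".*", "wholename"]],             # exclude the glob `.*` matched on the whole name
        ["not", ["match", "*.pyc"]],                       # exclude compiled Python files
    ]
    return [
        "subscribe",
        build_root,
        subscription_name,
        {"fields": ["name"], "expression": expression},
    ]
```

Only the `name` field is requested, which is consistent with the per-file log lines later in this transcript carrying nothing but a path.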

save various src files in my editor, which incur new log events:

I0301 17:49:30.558384 78563 scheduler_service.py:44] enqueuing 1 changes for subscription all_files
D0301 17:49:30.575396 78563 scheduler_service.py:75] processing 1 files for subscription all_files (first_event=False)
D0301 17:49:30.575889 78563 scheduler_service.py:56] file src/python/pants/pantsd/service/fs_event_service.py changed!
D0301 17:49:31.562772 78563 fs_event_service.py:143] callback ID 2 for all_files succeeded
I0301 17:49:33.095844 78563 scheduler_service.py:44] enqueuing 1 changes for subscription all_files
D0301 17:49:33.112721 78563 scheduler_service.py:75] processing 1 files for subscription all_files (first_event=False)
D0301 17:49:33.113214 78563 scheduler_service.py:56] file src/python/pants/pantsd/service/pants_daemon_launcher.py changed!
D0301 17:49:34.098309 78563 fs_event_service.py:143] callback ID 3 for all_files succeeded
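
The enqueue/process pattern visible in these log lines can be sketched roughly as follows. This is not the actual SchedulerService code: the class name, the event dict keys (`subscription`, `files`, `first_event`) and the `changed` list are all assumptions made for illustration. The one behavior taken from the logs is that the initial 4968-file event produces no per-file "changed!" lines.

```python
import logging
import queue

logger = logging.getLogger(__name__)


class EventQueueSketch:
    """Hypothetical stand-in for a service that queues file events and drains them."""

    def __init__(self):
        self._events = queue.Queue()
        self.changed = []  # files seen in non-initial events

    def enqueue(self, event):
        # Called from the file-event side; only records the event for later.
        logger.info("enqueuing %d changes for subscription %s",
                    len(event["files"]), event["subscription"])
        self._events.put(event)

    def process_one(self, timeout=1.0):
        """Process a single queued event; return False if none arrived in time."""
        try:
            event = self._events.get(timeout=timeout)
        except queue.Empty:
            return False
        files, first_event = event["files"], event["first_event"]
        logger.debug("processing %d files for subscription %s (first_event=%s)",
                     len(files), event["subscription"], first_event)
        if not first_event:
            # The first event lists every watched file, so only later
            # events are treated as changes.
            for path in files:
                logger.debug("file %s changed!", path)
                self.changed.append(path)
        self._events.task_done()
        return True
```

Decoupling the two sides with a queue lets the file-event callback return quickly while the processing loop works through events at its own pace.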

shut it all down with:

[illuminati pants (kwlzn/pantsd/engine_integration)]$ ./pants kill-pantsd -ldebug --pantsd-fs-event-detection
18:03:39 00:00 [main]
               (To run a reporting server: ./pants server)
18:03:39 00:00   [setup]
18:03:39 00:00     [parse]
               Executing tasks in goals: kill-pantsd
18:03:39 00:00   [kill-pantsd]
18:03:39 00:00     [kill-pantsd]DEBUG] terminating pantsd
DEBUG] sending signal 15 to pid 82783
DEBUG] successfully terminated pid 82783
DEBUG] purging metadata directory: /Users/kwilson/dev/pants/.pids/pantsd
DEBUG] terminating watchman
DEBUG] sending signal 15 to pid 82696
DEBUG] successfully terminated pid 82696
DEBUG] purging metadata directory: /Users/kwilson/dev/pants/.pids/watchman

18:03:39 00:00   [complete]
               SUCCESS
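
The kill-pantsd output above comes down to two steps per process: send signal 15 (SIGTERM) to the recorded pid, then purge that process's metadata directory. A rough, POSIX-only sketch of those two steps follows. `terminate_and_purge` is a hypothetical name and is not the ProcessManager API.

```python
import os
import shutil
import signal


def terminate_and_purge(pid, metadata_dir):
    """Send SIGTERM (signal 15) to `pid`, then remove `metadata_dir`.

    Returns True if the signal was delivered, False if no such process existed.
    """
    try:
        os.kill(pid, signal.SIGTERM)
        delivered = True
    except ProcessLookupError:
        delivered = False  # the process was already gone
    # Purge the metadata directory whether or not the process was still alive.
    shutil.rmtree(metadata_dir, ignore_errors=True)
    return delivered
```

Unlike the logged run, this sketch does not wait to confirm that the process actually exited before purging.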
ST
  1. 
      
  2. Related: https://github.com/pantsbuild/pants/issues/2956

    Currently the exp.fs just ignores dotfiles.

    1. added a note pointing to #2956.

  3. Can't remember where we settled, but it seems like from a scalability perspective, having a single watch registered for the whole buildroot (subject to --ignore-patterns) would make the most sense. So maybe this method is only useful for tests?

    1. not really - it just already existed and I thought it might be handy in the future. rm'd.

  4. Hm... that bears more explanation. Is this just the pywatchman API in action?

    1. no, the 'truthy results' bit is just the if result: check one line above in the case of a non-exceptional future fetch - the idea being that callbacks could simply return something (e.g. an error msg) instead of raising some arbitrary exception to indicate failure (with the default case being no return or an implicit return None).

      if that seems weird, I'm completely fine with dropping it tho.

    2. Since returning an empty string or empty collection might be a non-error, explicitly looking for None would make more sense to me.

    3. sure, added an explicit is not None check.

  5. Probably capitalize Scheduler... don't think that name is going anywhere (although the class will certainly move).

    Can you link this to a https://github.com/pantsbuild/pants/labels/engine ticket that explains the followup bits to get invalidation happening? https://github.com/pantsbuild/pants/issues/2970 I guess?

    1. capitalized and added a link to #2970.

  6. Should this happen in def run instead?

    1. the setup of service<->service interaction all happens prior to any services getting started so they have a chance to interact/register/etc once up front and then run without subsequent changes. FSEventService won't be started at this point or incur live event subscription until services are started.

      when we drop in the HttpService, it'll require this notion of upfront registration - so I thought it made sense to apply it uniformly across the board.

    2. Ok. I think an explicit lifecycle would make more sense then. Constructors having side effects smells funny to me.

      1) open
      2) run
      3) close

      ... for example.

    3. sgtm - there's already an explicit run and terminate. added a setup phase so now it's setup->run->terminate.

  7. Mark experimental? or is that already obvious from other docs?

    1. I think the line numbers are off in the RB commentary, but assuming you mean the fs-event options - marked them experimental.

  8. Should FSEventService hide its ThreadPoolExecutor dep, and just take a max_workers arg to construct it?

    1. eh - seems to make more sense for this subsystem to manage construction of Executor pools since there will be at least one more. it used to internalize this in the initial RB - but iirc, you were the one who suggested moving the Executor construction outside for more control (e.g. to hand it an executor of threads/processes/etc).

    2. Haha, irony.

    3. rethinking this, in line with the thinking around lifecycle, this probably does make more sense internalized since the service's teardown is what actually terminates this. reverted to internalized setup.

  9. There is strange symmetry here... I don't see this getting launched anywhere in this file.

    ...but now I see that it is probably launched as a singleton via the subsystem_dependencies call. Which begs the question: should Subsystem have a lifecycle?

    1. yeah - the FSEventService service uses WatchmanLauncher to launch watchman. our only entrypoint-with-options back to terminating that running instance is at the Subsystem level.

      in this case, the lifecycle belongs to ProcessManager for both PantsDaemon and Watchman - the Subsystem framework just gives us an options scope.

  10. Needs a TODO.
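
Three outcomes from the threads above fit together: the explicit `is not None` check on callback results (item 4), the setup->run->terminate lifecycle (item 6), and the service owning its own ThreadPoolExecutor (item 8). The sketch below is one hedged reading of that design, not the submitted code. `ServiceSketch`, `register_handler`, `failures` and `max_workers` are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ServiceSketch:
    """Hypothetical service with an explicit setup -> run -> terminate lifecycle."""

    def __init__(self, max_workers=2):
        # The constructor only records configuration; it has no side effects.
        self._max_workers = max_workers
        self._executor = None
        self._handlers = []
        self._stopped = threading.Event()
        self.failures = []

    def register_handler(self, callback):
        # Up-front registration, done before any service is started.
        self._handlers.append(callback)

    def setup(self):
        # The service builds its own executor, so terminate() can shut it down.
        self._executor = ThreadPoolExecutor(max_workers=self._max_workers)

    def run(self, events):
        for event in events:
            if self._stopped.is_set():
                break
            self._dispatch(event)

    def _dispatch(self, event):
        futures = [self._executor.submit(handler, event) for handler in self._handlers]
        for future in futures:
            try:
                result = future.result()
            except Exception as e:
                self.failures.append("raised: {!r}".format(e))
            else:
                # Explicit check: any non-None return value is recorded as a
                # failure payload; returning None (or nothing) means success.
                if result is not None:
                    self.failures.append(result)

    def terminate(self):
        self._stopped.set()
        if self._executor is not None:
            self._executor.shutdown(wait=True)
            self._executor = None
```

A caller would construct the service, register handlers, then call `setup()`, `run(...)` and `terminate()` in that order. Note that under this reading even an empty string returned by a callback is recorded as a failure, because only `None` counts as success.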

KW
ST
  1. Ship It!
    1. Some notes though: when I try to reproduce your example above, I get various errors.

      When trying to startup:

      $ ./pants run src/python/pants/engine/exp/legacy:pantsd -q -- src/python/pants/pantsd:: -ldebug --pantsd-fs-event-detection
      ...
      $ cat .pants.d/pantsd/pantsd.log
      D0302 11:38:24.090219 26755 pants_daemon.py:121] logging initialized
      I0302 11:38:24.091131 26755 pants_daemon.py:170] pantsd starting, log level is DEBUG
      I0302 11:38:24.093492 26755 pants_daemon.py:144] starting service <pants.pantsd.service.scheduler_service.SchedulerService object at 0x1043050d0>
      I0302 11:38:24.094182 26755 pants_daemon.py:144] starting service <pants.pantsd.service.pailgun_service.PailgunService object at 0x1042f1d50>
      I0302 11:38:24.094588 26755 pailgun_service.py:51] starting pailgun server on port 61665
      I0302 11:38:24.094798 26755 pants_daemon.py:144] starting service <pants.pantsd.service.fs_event_service.FSEventService object at 0x1042f1fd0>
      W0302 11:38:24.098351 26755 pants_daemon.py:35] Exception in thread Thread-5:
      W0302 11:38:24.098560 26755 pants_daemon.py:35] Traceback (most recent call last):
      W0302 11:38:24.098731 26755 pants_daemon.py:35]   File "/opt/twitter_mde/package/python2.7/current/lib/python2.7/threading.py", line 810, in __bootstrap_inner
      W0302 11:38:24.098922 26755 pants_daemon.py:35]     self.run()
      W0302 11:38:24.099121 26755 pants_daemon.py:35]   File "/opt/twitter_mde/package/python2.7/current/lib/python2.7/threading.py", line 763, in run
      W0302 11:38:24.099281 26755 pants_daemon.py:35]     self.__target(*self.__args, **self.__kwargs)
      W0302 11:38:24.099406 26755 pants_daemon.py:35]   File "/Users/stuhood/src/pants/src/python/pants/pantsd/service/fs_event_service.py", line 88, in run
      W0302 11:38:24.099567 26755 pants_daemon.py:35]     watchman = WatchmanLauncher.global_instance().maybe_launch()
      W0302 11:38:24.099714 26755 pants_daemon.py:35]   File "/Users/stuhood/src/pants/src/python/pants/pantsd/subsystem/watchman_launcher.py", line 47, in maybe_launch
      W0302 11:38:24.099850 26755 pants_daemon.py:35]     if not self.watchman.is_alive():
      W0302 11:38:24.099973 26755 pants_daemon.py:35]   File "/Users/stuhood/src/pants/src/python/pants/pantsd/subsystem/watchman_launcher.py", line 43, in watchman
      W0302 11:38:24.100112 26755 pants_daemon.py:35]     watchman_path=self._watchman_path)
      W0302 11:38:24.100233 26755 pants_daemon.py:35]   File "/Users/stuhood/src/pants/src/python/pants/pantsd/watchman.py", line 31, in __init__
      W0302 11:38:24.100352 26755 pants_daemon.py:35]     self.watchman_path = self._resolve_watchman_path(watchman_path)
      W0302 11:38:24.100471 26755 pants_daemon.py:35]   File "/Users/stuhood/src/pants/src/python/pants/pantsd/watchman.py", line 67, in _resolve_watchman_path
      W0302 11:38:24.100605 26755 pants_daemon.py:35]     raise self.ExecutionError('could not locate watchman in $PATH!')
      W0302 11:38:24.100725 26755 pants_daemon.py:35] ExecutionError: could not locate watchman in $PATH!
      I0302 11:38:26.098202 26755 pants_daemon.py:82] terminating pantsd service: <pants.pantsd.service.scheduler_service.SchedulerService object at 0x1043050d0>
      I0302 11:38:27.096240 26755 pants_daemon.py:82] terminating pantsd service: <pants.pantsd.service.pailgun_service.PailgunService object at 0x1042f1d50>
      W0302 11:38:27.097388 26755 pants_daemon.py:35] Exception in thread Thread-4:
      W0302 11:38:27.097577 26755 pants_daemon.py:35] Traceback (most recent call last):
      W0302 11:38:27.097726 26755 pants_daemon.py:35]   File "/opt/twitter_mde/package/python2.7/current/lib/python2.7/threading.py", line 810, in __bootstrap_inner
      W0302 11:38:27.097870 26755 pants_daemon.py:35]     self.run()
      W0302 11:38:27.098009 26755 pants_daemon.py:35]   File "/opt/twitter_mde/package/python2.7/current/lib/python2.7/threading.py", line 763, in run
      W0302 11:38:27.098150 26755 pants_daemon.py:35]     self.__target(*self.__args, **self.__kwargs)
      W0302 11:38:27.098288 26755 pants_daemon.py:35]   File "/Users/stuhood/src/pants/src/python/pants/pantsd/service/pailgun_service.py", line 55, in run
      W0302 11:38:27.098424 26755 pants_daemon.py:35]     self.pailgun.handle_request()
      W0302 11:38:27.098560 26755 pants_daemon.py:35]   File "/opt/twitter_mde/package/python2.7/current/lib/python2.7/SocketServer.py", line 276, in handle_request
      W0302 11:38:27.098696 26755 pants_daemon.py:35]     fd_sets = _eintr_retry(select.select, [self], [], [], timeout)
      W0302 11:38:27.098846 26755 pants_daemon.py:35]   File "/opt/twitter_mde/package/python2.7/current/lib/python2.7/SocketServer.py", line 155, in _eintr_retry
      W0302 11:38:27.098988 26755 pants_daemon.py:35]     return func(*args)
      W0302 11:38:27.099123 26755 pants_daemon.py:35] error: (9, 'Bad file descriptor')
      I0302 11:38:27.099322 26755 pants_daemon.py:82] terminating pantsd service: <pants.pantsd.service.fs_event_service.FSEventService object at 0x1042f1fd0>
      I0302 11:38:27.099531 26755 fs_event_service.py:38] shutting down threadpool
      I0302 11:38:27.099934 26755 pants_daemon.py:85] terminating pantsd
      F0302 11:38:27.100698 26755 process_manager.py:403] Traceback (most recent call last):
        File "/Users/stuhood/src/pants/src/python/pants/pantsd/process_manager.py", line 401, in daemonize
          self.post_fork_child(**post_fork_child_opts or {})
        File "/Users/stuhood/src/pants/src/python/pants/pantsd/pants_daemon.py", line 193, in post_fork_child
          self._run()
        File "/Users/stuhood/src/pants/src/python/pants/pantsd/pants_daemon.py", line 182, in _run
          self._run_services(self._services)
        File "/Users/stuhood/src/pants/src/python/pants/pantsd/pants_daemon.py", line 156, in _run_services
          raise self.RuntimeFailure('service failure for {}, shutting down!'.format(service))
      RuntimeFailure: service failure for <pants.pantsd.service.fs_event_service.FSEventService object at 0x1042f1fd0>, shutting down!
      

      When trying to kill (although it does seem to successfully kill):

      $ ./pants kill-pantsd -ldebug --pantsd-fs-event-detection
      ...
      File "/Users/stuhood/src/pants/src/python/pants/pantsd/watchman.py", line 67, in _resolve_watchman_path
        raise self.ExecutionError('could not locate watchman in $PATH!')
      

      Both seem to be because I don't have watchman installed; the plan is to ship a private copy, yea? Can do with binary_utils if need be.

    2. yeah - you either need to add the srcgit/mde watchman to your $PATH or brew install watchman at the moment. I'll get an RB out that uses binary_utils et al soon.
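
Both tracebacks above end in a failed `$PATH` lookup. A minimal sketch of that kind of lookup follows, assuming an explicit path takes priority over a `$PATH` search. It is not the actual `_resolve_watchman_path` implementation: the function signature, the `env_path` parameter and the local `ExecutionError` class are assumptions, and only the error message is taken from the log.

```python
import os


class ExecutionError(Exception):
    """Local stand-in for the error type seen in the traceback."""


def resolve_watchman_path(watchman_path=None, env_path=None):
    """Return an absolute path to an executable `watchman`, or raise ExecutionError.

    `env_path` overrides $PATH for the search (hypothetical parameter, handy for tests).
    """
    if watchman_path is not None:
        candidates = [watchman_path]
    else:
        search_path = os.environ.get("PATH", "") if env_path is None else env_path
        candidates = [
            os.path.join(directory, "watchman")
            for directory in search_path.split(os.pathsep)
            if directory
        ]
    for candidate in candidates:
        # Accept only existing files that the current user can execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return os.path.abspath(candidate)
    raise ExecutionError("could not locate watchman in $PATH!")
```

Running a check like this before the daemon forks, rather than inside a service thread, would surface a missing binary to the user directly instead of only in pantsd.log.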

PE
  1. 
      
  2. and self._scheduler ?

    1. good call - added!

KW
Review request changed

Status: Closed (submitted)

Change Summary:

thanks gents! submitted @ e28fe8614a849d0de90af5b6844cf47614a2bb27
