[engine] using memoized to avoid repeated keying and unpickling

Review Request #3597 — Created March 23, 2016 and discarded

peiyu
pants
3066, 3078
pants-reviews
kwlzn, patricklaw, stuhood

This is yet another low hanging fruit but effective performance
improvement in Storage.

In my manual testing list 3rdparty::, improvement for cold cache is
about 50%, for fully warmed-up cache, it's sightly slower because of the
extra memoized check cost.

Using memoized is only for now, there are various lru_cache
implementations for py2 that we could use later, like [1] and [2]
or just switch to py3.

[1] https://github.com/repoze/repoze.lru
[2] https://pypi.python.org/pypi/functools32

http://jenkins.pantsbuild.org/job/pants_ci.unit_tests/140/ passed

  • 3
  • 0
  • 0
  • 0
  • 3
Description From Last Updated
It should not be necessary to check the type in equality, because the digested blob includes the type (or does, ... stuhood
Rather than using memoized here, you can catch a few more cases by doing this manually with a dict, and ... stuhood
This method will key by the entire step, rather than keying by the keyable fields. But memoizing this method probably ... stuhood
  1. 
      
  2. src/python/pants/engine/exp/storage.py (Diff revision 2)
     
     

    It should not be necessary to check the type in equality, because the digested blob includes the type (or does, if it is relevant).

    1. because... we have redundant information? that is ._type, still needed for state_key.type by scheduler.

  3. src/python/pants/engine/exp/storage.py (Diff revision 2)
     
     
     
     
     

    This method will key by the entire step, rather than keying by the keyable fields.

    But memoizing this method probably doesn't make any sense... we know that a particular Step runs exactly once per execution.

    1. This is to avoid making keys twice once in cache.get(step_request) the other for cache.put(step_request, step_result), for the same request, with step_id plus all the keyable fields.

    2. That should be accomplished via an API change, IMO... explicitly split out the creation of the key from the use of the key for these methods.

  4. 
      
  1. 
      
  2. src/python/pants/engine/exp/storage.py (Diff revision 2)
     
     

    Rather than using memoized here, you can catch a few more cases by doing this manually with a dict, and storing the computed key immediately after a put.

  3. 
      
Review request changed

Status: Discarded

Loading...