-
-
src/python/pants/backend/codegen/tasks/antlr_gen.py (Diff revision 1) AntlrGen does grammar -> .java so the product type is more like 'sources'. That said - sources are special currently as products and we map them by injecting synthetic targets owning them into the build graph instead of using the products registry. I'd stick to only advertising product_type for Tasks that register items in the context.products in the execute call flow today. Making this a uniform way to communicate products is for follow up.
-
src/python/pants/backend/codegen/tasks/antlr_gen.py (Diff revision 1) This is too much detail - no where else are these product types accessed in this class. Up above self.register_jvm_tool(compiler, tools) is called and down below self.tool_classpath(compiler) is called indicating the actual interaction with the 'jvm_build_tools*' products happens in a base class. This prepare should move there.
-
src/python/pants/backend/core/tasks/check_exclusives.py (Diff revision 1) This task does not access 'classes_by_source' in the context.products - kill.
-
src/python/pants/backend/core/tasks/prepare_resources.py (Diff revision 1) This call should not be needed. So this RB adds RoundManager.schedule which stands to the side of context.products; however the idea is they become one - or 1 interface. It would be good to combine this call and schedule for 1 way to require a product mapping be made available in execute. It may make sense to change round_manager.schedule to hand a method-per-products.require* for ease of transition. So here this line would go and in prepare: round_manager.require_data('exclusives_groups')
-
src/python/pants/backend/core/tasks/task.py (Diff revision 1) This implies invalidation/caching is per Task and not per Task per product type. That's how things are done today anyway, but this stands out as arbitrary. No suggestions, just talking out loud.
-
src/python/pants/backend/core/tasks/task.py (Diff revision 1) Its worth noting the new part of the contract - that this is a well-known name other Tasks can bind to to get product mapping - thus its a public api. The doc as it stands is only focused on the jvm group case where javac and scalac produce classes both IIUC.
-
src/python/pants/backend/jvm/tasks/bootstrap_jvm_tools.py (Diff revision 1) 'jvm_build_tools' is required_data below so it can't be a product of this Task.
-
src/python/pants/backend/jvm/tasks/bootstrap_jvm_tools.py (Diff revision 1) and context.products.require* call implies a prepare is needed where the requirement moves to a schedule call on the round_manager for the product mapping.
-
src/python/pants/backend/jvm/tasks/checkstyle.py (Diff revision 1) Not true - thats provided by check_exclusives.py
Proposed refactor of the engine: RoundEngine
Review Request #545 — Created June 16, 2014 and submitted
Information | |
---|---|
ity | |
pants | |
247 | |
Reviewers | |
pants-reviews | |
benjyw, jsirois, patricklaw |
Proposed refactor of the engine: RoundEngine This proposed refactor of the Engine is driven by: https://github.com/pantsbuild/pants/issues/152 The "idea" goal is special since it requires a compile of a sub-graph of the overall Target graph followed by a gen on the entire graph. Currently, there is no non-hacky way of specifying a specific section of the Target graph to be passed to a specific task. This refactor, aims to extract out the scheduling of Tasks and their dependencies from register.py (where its been traditionally stored) and be able to pass through a Target graph if/when required. No matter what order goals are specified in the register.py, their order is overwritten by the schedule() method. This change is for building the phase graph recursively via requesting products in prepare(..) and advertising products via a Task classmethod product_type(). I will follow up with another change to be able to request products on a sub-graph (in order to actually fix the idea goal). Beware, this change is not completely integrated. For now, I wanted to send out this isolated change to get everyone's thoughts on it and make sure we are all on the same page. Especially John, Patrick and Benjy - since you folks have been heavily involved in the recent refactor.
./build-support/bin/ci.sh
JS
JS
-
Forgot - it would also be great to do a clean switch over to RoundEngine. So make round_manager a required non-None arg to prepare now and fixup the RoundEngine or just the test to ensure constructors are called in the same order prepare is. Today most tasks rely on __init__ being called in prepare order. IE: I introduced prepare but did not convert Tasks to it, instead LinearEngine calls (__init__, prepare), (__init__, prepare), ... Keep that semantic for the 1st change and the risk is low here since the key guaranty of orderings in the Task lifecycle will be maintained. The only risky bit is getting all the product_type lists and prepare round_manager schedules faithfully reflecting the context.products uses.
ZU
-
I some high level questions: - Why does it run 'gen' over the entire graph. Shouldn't we just need to run 'gen' over the graph that the target or targets depend on? - When 'pants goal idea <target>' was working, it seemed to do more work than I thought it had to. What is the purpose of the 'compile' step? When you are authoring new functionality or trying to bring up the IDE on a broken tree, the fail of compile seemed to preclude the ide config being generated correctly. This is a minor annoyance when all of your java code is under /src/java as in the Pants repo, but not really recoverable when you have a tree with hundreds of java source roots (maven style layout).
ZU
-
-
src/python/pants/backend/codegen/tasks/antlr_gen.py (Diff revision 1) Does every code generator need updating like this? I didn't see jaxb_gen.py (we recently added this) in the list.
BE
-
First of all, I am super-excited for this change. This will be a significant improvement over manually registering phase deps. I have several comments, with the caveat that I may be misunderstanding the design, and where it's heading. See below for details inline. Cheers!
-
src/python/pants/backend/codegen/tasks/code_gen.py (Diff revision 1) For the apparently common case where the scheduled products are hard-coded per type, we could have a classmethod that returns them, and the Task base class could use that as its default implementation of prepare(). It would save a little bit of boilerplate.
-
src/python/pants/engine/round_engine.py (Diff revision 1) OrderedSet? Since you test for membership below.
-
src/python/pants/engine/round_engine.py (Diff revision 1) This comment seems stale - there's no way to specify a graph, and there are no phases here, just products.
-
src/python/pants/engine/round_engine.py (Diff revision 1) You don't need to construct an instance here. goal.task_type.product_type() will work, as it's a classmethod.
-
src/python/pants/engine/round_engine.py (Diff revision 1) phases_by_product can be a defaultdict(set) or (defaultdict(OrderedSet) if order matters) and then you won't need this check.
-
src/python/pants/engine/round_engine.py (Diff revision 1) It's a bit weird to modify a mutable argument, that would need to be well documented. Or you could nest this method inside _prepare(), so it can access phases and phases_by_product from its closure. That would be more intuitive I think.
-
src/python/pants/engine/round_engine.py (Diff revision 1) round_manager could be local to this method, right?
-
src/python/pants/engine/round_engine.py (Diff revision 1) Is phase the right resolution for rounds, or should it be goals? In other words, must we execute all goals in a phase when the product-based selection is happening at the goal level anyway? Say we allow each task to look at the target graph when scheduling products (and isn't that the plan?). Then, for example, junit_test could see that it has no targets to act on, it would not schedule any products, and then none of the jvm machinery would be invoked, which seems like a good thing. Or, am I missing some obvious reason why this is wrong?
-
src/python/pants/engine/round_engine.py (Diff revision 1) Iterating over a collection that you're mutating is bug-prone. Maybe better to let update_phases return the new phases?
-
src/python/pants/engine/round_engine.py (Diff revision 1) This will update for all products ever seen, not just for new products, as the signature of update_phases implies. There's no harm in the end because update_phases won't add the same phase twice, but it's nice to be precise.
IT
Change Summary:
Addressed most comments. Adding final firing to be able to do a clean switch over to RoundEngine.
PA
-
-
src/python/pants/backend/codegen/tasks/code_gen.py (Diff revision 2) I'm a little confused by this being the product type for codegen, given that codegen generally just injects synthetic targets into the graph. So its type is kind of "targets by source", though I don't know if that's a meaningful concept. Could you elaborate in a comment why this is the type, given that codegen is a bit tricky already?
-
src/python/pants/backend/core/tasks/build_lint.py (Diff revision 2) I really like that we're starting to separate out the preparation phases of goals into meaningful pieces.
-
src/python/pants/engine/round_engine.py (Diff revision 2) I might be in the minority here, but I'm not a big fan of nested classes that aren't trivial (like exception aliases). I feel like it makes the methods difficult to distinguish from the outer class's methods, and doesn't particularly buy us much versus just defining them at the top level in the module. Very minor style nit though.
-
-
BE
-
-
src/python/pants/engine/round_engine.py (Diff revision 2) We rely on modifying a mutable argument. That's not awful but it's a bit unintuitive so document it well! Also, the round_manager knows the context, so there's no need to pass both around everywhere. But even better, these classmethods could be instance methods and get the context and the round manager from the instance. In fact, could they be methods *on* roundmanager? One way or another, chains of static methods that call each other with non-trivial argument lists are a code smell, implying that they probably should be instance methods.
-
-
src/python/pants/engine/round_engine.py (Diff revision 2) Probably good to set outer_lock_holder to None here, so we don't release lock again in the finally block.
JS
-
-
src/python/pants/engine/round_engine.py (Diff revision 2) check 1st, construct the round_manager after
-
tests/python/pants_test/engine/test_round_engine.py (Diff revision 2) Now that the actions are all in a specified order, 1 list should be used to collect them. The result should be: construct_action('bootstrap-jvm-tools') prepare_action('bootstrap-jvm-tools') construct_action('thrift') prepare_action('thrift') ... then the execute actions in expected order.
IT
Change Summary:
Addressed all comments.
Diff: |
Revision 3 (+389 -78)
|
---|
JS
-
I only see 1 big issue - JvmBinaryTask _should_ be blowing up. Fixup that and triple check the prepare request vs execute uses - I think I found all the misses remaining, but... and this should be ready to go.
-
src/python/pants/engine/round_engine.py (Diff revisions 2 - 3) I missed this before, I think context.acquire_lock() and the other 2 lines below at the same indent level all need a +1 - they have only a +1 indent now.
-
tests/python/pants_test/engine/test_round_engine.py (Diff revisions 2 - 3) And this is safe because all product.require* are moved out of constructors now - that's great!
-
src/python/pants/backend/core/tasks/build_lint.py (Diff revision 3) 'missing_intransitive_deps' is asked for below as well.
-
src/python/pants/backend/core/tasks/task.py (Diff revision 3) Its slighty odd to have prepare ordered before init, generally constructor then instance methods
-
src/python/pants/backend/core/tasks/task.py (Diff revision 3) self.product_type() was used on the LHS and that defaulted to self.__class__.__name__ - so a non-empty string unique to the Task subclass. I think you can just inline self.__class__.__name__ for suffix_type and then all tasks get an unambiguous single invalidator dir. Right now the '' is degenerate and 2 tasks could also become degenerate if they happen to produce the same product_type.
-
src/python/pants/backend/jvm/tasks/benchmark_run.py (Diff revision 3) Kill - this task doesn't actually populate a 'benchmark-run' product fwict.
-
src/python/pants/backend/jvm/tasks/jar_create.py (Diff revision 3) This has product_types of ['jars', 'javadoc_jars', 'source_jars']
-
-
src/python/pants/backend/jvm/tasks/junit_run.py (Diff revision 3) It would be good to keep the same grouping as in the constructor - the comment goes with the class_by_*
-
src/python/pants/engine/round_manager.py (Diff revision 3) This will need the utf-8 header line of a unit test should fail that checks for this.
-
src/python/pants/engine/round_manager.py (Diff revision 3) I think you need a 1-1 mapping here from RoundManager call to products call. schedule -> products.require require_data -> products.require_data That said - its probably a good time to align the names and s/schedule/require/ Also, you need to reflect the underlying signatures too; for example: JvmBinaryTask passes the optional predicate: https://github.com/pantsbuild/pants/blob/master/src/python/pants/backend/jvm/tasks/jvm_binary_task.py#L36
-
src/python/pants/engine/round_manager.py (Diff revision 3) _get_phases_by_product - this looks like its private
-
src/python/pants/engine/round_manager.py (Diff revision 3) pydoc - the ordering is key here and it would be good to document what it is.
IT
Change Summary:
Addressed all final comments and triple checking all products/require mappings using - https://docs.google.com/spreadsheets/d/1R7-SqcKrQt5407pJL5KKgV_Ng2ZWB12ocJWuoCE6Wy4/edit#gid=0
Diff: |
Revision 4 (+426 -78)
|
---|
JS
-
-
src/python/pants/backend/jvm/tasks/jar_create.py (Diff revision 4) This is missing a call to super and JarTask has some prep to do too. This problem is semi-widespread fwict.
IT
Change Summary:
All tests are passing locally. I will follow up with a change to register.py and extraneous require declarations.
Diff: |
Revision 5 (+434 -296)
|
---|
JS
-
This is looking plausible to me. I'll `rbt patch` your next diff and work it hard and I think it might just be gold.
-
src/python/pants/backend/jvm/tasks/ivy_resolve.py (Diff revisions 4 - 5) You don't actually provide this or have people hack depending on it yet, so I'd kill
IT
Change Summary:
@John - sg, thanks. the more testing the better. Running here too - https://travis-ci.org/pantsbuild/pants/builds/28642239
Diff: |
Revision 6 (+436 -296)
|
---|
ZU
-
I was just looking over the spreadsheet https://docs.google.com/spreadsheets/d/1R7-SqcKrQt5407pJL5KKgV_Ng2ZWB12ocJWuoCE6Wy4/edit#gid=0. I just wanted you to be aware that we are refactoring pants goal idea and at this point have added a consumer of 'ivy_jar_products' in ide_gen.py https://github.com/pantsbuild/pants/pull/283
IT
Change Summary:
Skipped scaladoc/javadoc tests if corresponding binaries are not on PATH (like for Travis ci). And CI just went green - https://travis-ci.org/pantsbuild/pants/builds/28814087
Diff: |
Revision 7 (+452 -298)
|
---|
JS
-
-
-
tests/python/pants_test/tasks/test_jar_publish_integration.py (Diff revision 7) I think the return code is all you need. It's which's job to check the exists & is executable part.
JS
-
This is still a problem though: $ PANTS_DEV=1 ./pants goal clean-all killserver server *** Running pants in dev mode from /home/jsirois/dev/3rdparty/pantsbuild-pants/src/python/pants/bin/pants_exe.py *** (using pantsrc expansion: pants goal clean-all killserver server --server-open) Launching server with pid 30208 at http://localhost:43116 Killed server with pid 30208 at http://localhost:43116 Created new window in existing browser session. It would be good to fix or circle back real fast and fix + add unit for unconstrained CLI phases maintaining CLI order.